Wednesday, September 2, 2009

Bring on Extensions for Chrome!

I wish Chrome or Safari had a tool as good as Firefox's Firebug... I would much rather use Chrome to do my editing, but I need Firebug and FirePHP... come on Google, get going on extensions for Chrome...

Tuesday, September 1, 2009

My Python Journey

Today I was reading this article from Inabow, and realized something.

There is a lot about Python that I haven't got the foggiest about.

So to rectify this problem, I am going to learn something new about the Python "universe" every day until I run out of things to learn.  I am also going to document my findings here, since teaching something is supposed to be the best way to learn it.

For my first installment, I am going to talk about the yield statement (and generators, since it's impossible to talk about one without the other), because I have never been able to figure out exactly what it does just from the contexts I usually see it in.

Having a basic understanding of iterators definitely helped me understand generators and yield a little faster.  My understanding is that iterators are what make lists, strings, files, etc. able to be iterated over.  Sorry for using a word to define itself, but basically it means that this is possible:
myList = [1, 2, 3]
for i in myList:
    print i    # 1, 2, 3 

Using a list comprehension creates a list, which is another iterable:
mySquares = [x * x for x in range(3)]
for i in mySquares:
    print i    # 0, 1, 4

(code excerpt taken from http://bit.ly/2GOT8x, which is also a great yield tutorial)

So now we know what iteration is, but what is a generator?  Basically, since a list stores all of its values in memory, you can use it more than once.  After you run
for i in mySquares:
    print i

the data doesn't go anywhere.  You can loop over it again, and the data will be right there.

However, what if the list you are iterating over is huge?  And I mean by-today's-hardware-capabilities-huge?  Do you really want to keep all that data in memory?  If you are going to use it more than once, then the answer might be yes.  But if you are only going to use the data and then throw it away, what is the point of wasting memory like that?  That is where generators come in.  A simple way to make a generator sure looks familiar:
myGenerator = (x * x for x in range(3))
for i in myGenerator:
    print i    # 0, 1, 4

It's almost the same as a list comprehension, except with parentheses instead of brackets, and no list gets built.  The difference is that a generator produces each value on the fly and never stores any of them: once you have finished iterating over it, it's spent, and if you want the values again you have to create a new generator and generate them all over again.  Not only that, but once 1 is generated, 0 is gone.  Kaput.  As in, it ain't there no mo'.  Bad for cases where the data needs to be reused (more CPU cycles), but good for cases where the data is used once and then thrown away (less memory usage).
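
A quick way to see this for yourself (my own little check, not from the tutorial):
myGenerator = (x * x for x in range(3))
print list(myGenerator)    # [0, 1, 4]
print list(myGenerator)    # [] -- the generator is spent, the values are gone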

So now we get to the real meat of this long-winded shite of a post...
The Python 2.6.2 docs say this about the yield statement:
The yield statement is only used when defining a generator function, and is only used in the body of the generator function. Using a yield statement in a function definition is sufficient to cause that definition to create a generator function instead of a normal function.
Ok, cool.  Sounds good.
But what's a generator function?  We know what a generator is, so it's not too hard to figure this one out.  Here's what the docs have to say:
When a generator function is called, it returns an iterator known as a generator iterator, or more commonly, a generator.
So a generator function returns a generator.  Makes sense.  One more thing from the docs:
The body of the generator function is executed by calling the generator’s next() method repeatedly until it raises an exception.

And that, my friends, is where the yield statement comes in handy.
You see, when yield is used within a function, that function becomes a generator function, which returns a generator.  The yield statement is used almost like the return statement in a regular function, but with one caveat: yield hands back a value and then suspends the function, saving its state so that execution picks up right where it left off the next time next() is called.

So: calling the generator function gives you a generator; calling the generator's next() method runs the body until it hits a yield, which produces a value and freezes the function with its state saved; the next call to next() resumes right after that yield, and the process repeats until the body finishes and raises StopIteration.  Sounds great.
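
A bare-bones example I cooked up (not from the docs) makes the dance easier to see:
def countToThree():
    # the yield statements are what make this a generator function
    yield 1    # hand back 1 and freeze right here
    yield 2    # the next next() call resumes here
    yield 3

counter = countToThree()    # no body code runs yet; we just get a generator back
print counter.next()    # 1
print counter.next()    # 2
print counter.next()    # 3
# one more counter.next() would raise StopIteration, the exception the docs mention
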
For a more substantial example of this great tool, here's a code snippet from http://bit.ly/yF87Q:
class Permutation: 
    def __init__(self, justalist): 
        self._data = justalist[:] 
        self._sofar = []
    def __iter__(self): 
        return self.next() 
    def next(self): 
        for elem in self._data: 
            if elem not in self._sofar: 
                self._sofar.append(elem) 
                if len(self._sofar) == len(self._data): 
                    yield self._sofar[:] 
                else: 
                    for v in self.next(): 
                        yield v 
                self._sofar.pop()

a = [1, 2, 3]
for i in Permutation(a):
    print i
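
For a = [1, 2, 3], that loop prints all six permutations, one list at a time, starting with [1, 2, 3] and [1, 3, 2].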

Monday, August 31, 2009

Interesting, Ayman...

I really enjoyed Ayman Hourieh's latest post, Python Debugging Techniques.  As a self-proclaimed Python up-and-comer, I need to break out of the bad habits I seem to fall into every time I start learning a language.  It's not that I'm lazy, per se; it's that when I really get into a program or problem, I don't really want to stop and think about the most efficient way to set up debugging.
alert(debugMsg);
was quick and dirty when I was learning Javascript,
cout << debugMsg << endl;
for C++, and until reading that post, having
print debugMsg
statements scattered about my Python programs was the extent of my debugging helpers.

Immediately after reading his post, I incorporated file logging into my current Python project, a data parser for a Physics student friend of mine.  I had originally written his application in C#, and a couple of weeks ago he came and asked for a couple of new features.  I was in a pickle, as I don't even have a copy of Visual Studio installed anymore, and in fact don't even have a full-fledged Windows machine running anymore (only VMs).  I decided to use my newfound Python chops to rewrite his program.  In an hour, I had a functional replica of a program that had taken at least twice that long in C#.  The last few days I have been going back to it occasionally, adding more features, error handling, etc., all in the hopes of making it a bit more robust.
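
For the curious, the file logging I bolted on looks roughly like this (a stripped-down sketch with made-up names, not the actual parser code):
import logging

# send everything DEBUG and up to a file instead of scattering print statements around
logging.basicConfig(filename='parser.log',
                    level=logging.DEBUG,
                    format='%(asctime)s %(levelname)s %(message)s')

def parseLine(line):
    logging.debug('raw line: %r', line)
    try:
        return float(line.strip())
    except ValueError:
        logging.warning('could not parse %r, skipping it', line)
        return None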

I am excited to put some of these wonderful debugging tools to work tomorrow, and will keep this page updated.

In the works:
  • A wxWidgets GUI for the batch image resizing program I wrote a while back
  • I plan on redoing my genetic algorithm implementation in a more functional style...map(), filter(), and reduce() are going to be my friends...
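
Just to give a flavor of the functional style I'm after (a toy fitness example, nothing like the real GA yet):
# score a population of bit strings, keep the fit ones, and total their scores
population = [[1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 1, 1]]

fitnesses = map(sum, population)                     # [3, 1, 4]
survivors = filter(lambda f: f >= 2, fitnesses)      # [3, 4]
total = reduce(lambda a, b: a + b, survivors, 0)     # 7
print fitnesses, survivors, total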