To Infinity

In the programming language Scheme the concept of infinite lists can be implemented using streams. A stream is really nothing more than a cons cell that contains a value in the cdr field and a pointer to some code to generate the cdr field. While simple in design, the concept is powerful, allowing the programmer to create infinite sequences by simply defining a function to generate the next value in the sequence.

Python supports a similar concept in the form of generator functions. The implementation here is a little more complex, but it can be thought of as a function that returns a result, but maintains its current state for later execution.

Both can be used to write rather elegant code. Take for instance the Sieve of Eratosthenes. This can be implemented in Scheme like so (borrowed from SICP):

In this code, a new stream is created to filter out the unwanted multiples from the previous stream.

Similarly, it can be expressed in Python using generators:

The expression is elegant and theoretically allows us to represent an infinite list that is evaluated lazily. However, the reality strays from the theory. While this code works just fine to find small primes, it breaks for larger values:

In both cases, we've exhausted the language implementation's ability to hold the data structures necessary to represent the lazy lists. The result is elegant code that fail rather quickly. Now, we could always buy ourselves a little more execution time by increasing the heap for Scheme and the recursion depth for Python, but that will only take us so far.

So, that leaves me wondering: is there a similarly elegant way to write code of this sort that won't expire so quickly? To clarify, I'd like to keep using streams and generators, but find some way to limit the overhead incurred from the creation of so many filters (1 for each prime!) without resorting to a different algorithm.

I want to address the alternative primes calculation listed in SICP:

This procedure is still recursive (primes uses prime? which uses primes), but has much less memory overhead -- I managed to generate primes much larger than the 1000th without hitting memory limits. However, this solution strikes me as not quite the same in spirit as the first. Here, a given integer is tested by comparing it to the current list of generated primes while the first definition is building up a function to generate the next prime without having to explicitly scan the list of the already generated primes.