Here’s what I learned.
Sometimes a new language feature comes along and I jump all over it and start using it all the time. That’s what happened with several other ES6 features. I made a list of my favorite ES6 features, and called it the ROAD MAP:
- R Rest & Spread
- O Object Literal Shortcuts (compact literals)
- A Arrow Functions
- D Destructuring & Default Parameters
- M Modules
- AP Asynchronous Programming (promises & generators)
When I made the list, I thought these were the ES6 features I would use most. Initially, I was very excited about generators, but now that I’ve been living with them for a while, I haven’t found many good use cases for them in my real application code. For most of the jobs I might use generators for, I reach for RxJS instead because of its much richer API.
That doesn’t mean that generators don’t have lots of good use cases. I tell myself I’ve been waiting for better JS engine support before I really go crazy with them, but it could be that my mind simply isn’t thinking in terms of generators yet. The best way to fix that is to get more practice with them.
One of the use cases that immediately jumped into my mind when I heard about generators is that we might use them to grab values from any infinite series. That could have many applications, such as generative algorithms for graphics, computer game levels, music sequences, etc…
Fibonacci is a simple, canonical example that most of you are probably already familiar with. Here are the basics:
The fibonacci sequence is the series of numbers:
0, 1, 1, 2, 3, 5, 8, 13, 21, 34…
After the seed numbers, 0 and 1, each successive number is the sum of the previous two numbers. One of the interesting properties of the sequence is that the ratio of the current number to the previous number converges toward the golden ratio, 1.61803398875…
You can use the Fibonacci sequence to generate all sorts of interesting things, like the Golden Spiral, which occurs in nature.
Generator functions are a new feature in ES6 that allow a function to generate many values over time by returning an object which can be iterated over to pull values from the function one value at a time.
When the generator function is called, instead of directly returning a value, it returns an iterator object.
The Iterator Protocol
The iterator object has a `.next()` method. When `.next()` is called, the function body resumes where it paused the last time `.next()` was called, and continues until a `yield` is reached. At that point, it returns an object with two properties: `value` and `done`.
The`value`property contains the yielded value, and`done`indicates whether or not the generator has yielded its last value.
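A minimal sketch makes this concrete (the `greet` generator here is just an illustration):

```javascript
// A trivial generator used to demonstrate the iterator protocol.
function* greet () {
  yield 'hello';
  yield 'world';
}

const iterator = greet();

console.log(iterator.next()); // { value: 'hello', done: false }
console.log(iterator.next()); // { value: 'world', done: false }
console.log(iterator.next()); // { value: undefined, done: true }
```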
A stack overflow is like the cops raiding your party and telling all your friends to go home. Total buzz kill.
I was very excited when ES6 introduced tail call optimization, which lets a recursive function reuse the same stack frame for every iteration — but it only works when the recursive call is in the tail position. A call in tail position means that the function returns the result of the recursive call without any further computation.
Great! My first naive implementation uses a pretty straightforward twist on the canonical mathematical fibonacci definition:
Using the seed values 0 and 1 to start the sequence, and moving the addition into the function call signature, it looks something like this:
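A sketch of that recursive generator (details may differ slightly from the original):

```javascript
// Seed values 0 and 1 live in the default parameters; the addition
// happens in the recursive call signature.
function* fib (n, current = 0, next = 1) {
  if (n === 0) {
    return current;
  }
  yield current;
  yield* fib(n - 1, next, current + next);
}

console.log([...fib(7)]); // [0, 1, 1, 2, 3, 5, 8]
```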
I love how clean this looks. The seed values are obvious in the function signature and the formula is expressed quite clearly in the recursive call.
The `if` condition allows the loop to terminate by using `return` instead of `yield` when `n` reaches `0`. If you don’t pass `n`, it will be `undefined` and evaluate to `NaN` when we try to subtract `1` from it, so the function will never terminate.
This implementation is very straightforward… and naive. When I tested it on large values, it exploded.
Sadly, tail call optimization does not apply to generators. In the specification, under Function Calls, Runtime Semantics: Evaluation:
7. Let tailCall be IsInTailPosition(thisCall).
8. Return EvaluateDirectCall(func, thisValue, Arguments, tailCall).
IsInTailPosition returns false for generators (see 14.6.1):
5. If body is the FunctionBody of a GeneratorBody, return false.
In other words, avoid recursion for infinite generators. You need to use the iterative form instead if you want to avoid stack overflows.
With a little modification, we can remove the recursion and use an iterative form instead:
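Something along these lines:

```javascript
function* fib (n) {
  const isInfinite = n === undefined;
  let current = 0;
  let next = 1;

  // Run forever when no limit is passed; otherwise count n down.
  while (isInfinite || n--) {
    yield current;
    // Same variable swap as before, via destructuring assignment.
    [current, next] = [next, current + next];
  }
}

console.log([...fib(7)]); // [0, 1, 1, 2, 3, 5, 8]
```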
As you can see, we’re still doing the same variable swap that was in the original function call signature, but this time we’re using destructuring assignment to accomplish it inside a while loop. We need `isInfinite` in the generator in case we don’t pass a limit.
It’s possible to extract an array from your generator using a combination of destructuring assignment and the `...rest` syntax:
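For example, using an iterative `fib` generator like the one described above (redefined here so the snippet stands alone):

```javascript
function* fib (n) {
  let current = 0;
  let next = 1;
  while (n--) {
    yield current;
    [current, next] = [next, current + next];
  }
}

// Destructuring assignment with ...rest pulls every yielded
// value into an array.
const [...first10] = fib(10);

console.log(first10); // [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```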
But if your generator is an infinite series and there’s no way to describe a limit by passing a parameter, the resulting array will never stop filling.
┻━┻ ︵ヽ(`Д´)ﾉ︵ ┻━┻
In both of the Fibonacci implementations above, we allow the caller to pass `n`, which limits the sequence to the first `n` numbers. All good!
┬─┬ ノ( ゜-゜ノ)
It’s very tempting to memoize something like the Fibonacci sequence, because doing so can dramatically decrease the number of required iterations. In other words, it makes it a lot faster.
For functions which always produce the same output given the same arguments, you can record the results in a memo for future calls so that the work of calculating the results doesn’t have to be repeated. Instead, the result is looked up in the memo and returned without repeating the calculation. The Fibonacci algorithm repeats lots of calculations to come up with results, which means that if we memoize the function, we can save a lot of time.
Let’s look at how we can memoize the iterative form of the Fibonacci generator:
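A sketch of one way to do it (not necessarily the original code): the memo array lives outside the generator so it survives between calls, and the loop index doubles as an index into it.

```javascript
// Shared memo: fib(i) lives at memo[i] and persists across calls.
const memo = [];

function* fib (n) {
  for (let i = 0; i < n; i++) {
    if (memo[i] === undefined) {
      // Seed values at indices 0 and 1, then the sum of the
      // previous two memoized values.
      memo[i] = i < 2 ? i : memo[i - 1] + memo[i - 2];
    }
    yield memo[i];
  }
}

console.log([...fib(7)]); // [0, 1, 1, 2, 3, 5, 8]
console.log([...fib(7)]); // second run is pure memo lookups
```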
Because `n` essentially represents an index into an array of numbers, we can use it as a literal array index. Subsequent calls will just look up that index and return the corresponding result value.
The original version of this code contained a bug. The first time you ran the function, everything would work just fine, but the memo was written incorrectly, because you can’t just yield a value when you find a memo hit: unlike `return`, `yield` does not stop the rest of the function from running. It simply pauses execution until `.next()` gets called again.
This has been the hardest point for me to wrap my head around: `yield` is not just `return` for generators. You also have to think carefully about how resuming the function with `.next()` impacts the way you write the logic.
In this case, I was able to get the logic working using `yield`, but it made the control flow hard to read.
It occurred to me that for something that can be memoized this way, it’s much easier for me to read when I separate the generator function from the calculation logic.
As you can see, the new generator function is extremely simple: it calculates the memo array by calling the memoized `fib()`, and then delegates to the resulting array iterable.
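Sketched out, the split might look like this (a memoized plain function plus a thin generator; names are illustrative):

```javascript
const memo = [];

// All the calculation logic lives in an ordinary memoized function.
const fib = n => {
  if (memo[n] !== undefined) return memo[n];

  let current = 0;
  let next = 1;

  for (let i = 0; i < n; i++) {
    memo[i] = current;
    [current, next] = [next, current + next];
  }

  return (memo[n] = current);
};

// The generator just fills the memo up to n, then delegates
// to the array iterable.
function* gen (n) {
  fib(n);
  yield* memo.slice(0, n);
}

console.log([...gen(7)]); // [0, 1, 1, 2, 3, 5, 8]
```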
`yield*` is a special form of `yield` that delegates to another generator or iterable. For example:
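A quick illustration:

```javascript
function* inner () {
  yield 'b';
  yield 'c';
}

function* outer () {
  yield 'a';
  yield* inner();    // delegate to another generator...
  yield* ['d', 'e']; // ...or to any iterable, such as an array
}

console.log([...outer()]); // ['a', 'b', 'c', 'd', 'e']
```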
Whenever I’m playing with competing algorithm implementations, I usually write a simple benchmark script to compare the performance.
For this test, each implementation generated 79 numbers. I used Node’s `process.hrtime()` to record nanosecond-accurate timings for both implementations, ran the test three times, and averaged the results:
As you can see, that’s quite a significant difference. If you’re generating a lot of numbers and you want it to be fast, the memoized solution is clearly a wise choice.
There’s just one problem: With an infinite series, the memo array will have unbounded growth. Eventually, you’re going to run into heap size limits, and that will crash the JS engine.
Every time I write a simple benchmark script, I wish for a precision timing API that works in both browsers and Node, but there isn’t one. The closest we can get is a library that provides a facade wrapping both the browser’s `performance.now()` API and Node’s `process.hrtime()` API to present a unified API. Realistically, though, Node-only benchmarks are enough for this test.
My only complaint is that Node’s `process.hrtime()` returns an array instead of a straightforward value in nanoseconds. This is easily remedied, though:
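A small conversion helper does the trick (the name `toNs` is mine):

```javascript
// process.hrtime() returns [seconds, nanoseconds].
// Collapse the pair into a single nanosecond count.
const toNs = ([seconds, nanoseconds]) => seconds * 1e9 + nanoseconds;

console.log(toNs([1, 500])); // 1000000500
```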
Just pass the array returned from `process.hrtime()` to this function and you’ll get human-friendly nanoseconds back. Let’s take a look at the benchmark script I used to compare the iterative Fibonacci generator to the memoized version:
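The script below is a reconstruction of the idea rather than the exact original; `iterativeFib`, `memoizedFib`, and `profile` are stand-in names, and the helper converts `hrtime` pairs to nanoseconds:

```javascript
// Iterative implementation.
function* iterativeFib (n) {
  let current = 0;
  let next = 1;
  while (n--) {
    yield current;
    [current, next] = [next, current + next];
  }
}

// Memoized implementation.
const memo = [];
function* memoizedFib (n) {
  for (let i = 0; i < n; i++) {
    if (memo[i] === undefined) memo[i] = i < 2 ? i : memo[i - 1] + memo[i - 2];
    yield memo[i];
  }
}

const toNs = ([seconds, nanoseconds]) => seconds * 1e9 + nanoseconds;

const profile = (label, gen, count = 79) => {
  const start = process.hrtime();
  [...gen(count)]; // pull `count` numbers from the generator
  // Passing the start time yields the elapsed [seconds, nanoseconds].
  const elapsed = toNs(process.hrtime(start));
  console.log(`${ label }: ${ elapsed }ns`);
  return elapsed;
};

profile('iterative', iterativeFib);
profile('memoized', memoizedFib);
```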
My favorite feature of`hrtime()`is that you can pass the start time into the function to get the time elapsed since the start time — exactly what you need for profiling.
Sometimes, processes can encounter some terrible luck with the OS task scheduler, so I like to run scripts like this multiple times and average the results.
I’m certain you could come up with much more accurate ways to benchmark your code, but something like this should be good enough for most situations — especially when there is such a clear winner like the memoized Fibonacci implementation.
I don’t want to bore you with too much crazy math, but did you know that there’s a very efficient way to calculate Fibonacci without iterations or recursion? It looks like this:
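A sketch of that closed form (a common shortcut based on Binet’s formula, using the golden ratio φ):

```javascript
// Closed-form Fibonacci via the golden ratio. The rounding is
// there to absorb floating point drift.
const sqrt5 = Math.sqrt(5);
const phi = (1 + sqrt5) / 2;

const fibCalc = n => Math.round(Math.pow(phi, n) / sqrt5);

console.log(fibCalc(10)); // 55
```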
The only problem is the limitation of floating point precision. The actual formula does not include any rounding. I added it because floating point errors start to cause the results to drift after `n = 11`. Not very impressive.
So, as long as we don’t need values higher than `n = 75`, this faster formula will work great! Let’s turn it into a generator:
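One way to wrap it:

```javascript
function* fib (n) {
  const sqrt5 = Math.sqrt(5);
  const phi = (1 + sqrt5) / 2;

  for (let i = 0; i < n; i++) {
    // Each value is computed directly from the closed-form formula.
    yield Math.round(Math.pow(phi, i) / sqrt5);
  }
}

console.log([...fib(7)]); // [0, 1, 1, 2, 3, 5, 8]
```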
Looks good. Let’s look at a benchmark run:
Profile with 79 numbers
Faster, yes, but we’ve lost our last few accurate numbers. Worth the tradeoff?
Before I started:
- I had no idea how many accurate values I could produce with the formula version.
- I had no idea how many recursive calls I’d make to produce those accurate values.
But now that I know all these limits, the best implementation so far is one I haven’t shown you yet:
Most of the times I’ve used an infinite series in a real application, I actually needed a limited number of values for a specific purpose (usually generating graphics). Most of the time it was faster to grab values from a lookup table than to compute them. In fact, this was a frequent optimization in computer games in the ’80s and ’90s. It probably still is.
Since arrays are iterables in ES6 and can be consumed just like generator output, we can simply delegate to the lookup table using `yield*`.
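A sketch, with a shortened table for readability (a real one would hold all 79 accurate values):

```javascript
// Pre-generated lookup table (truncated here; a production table
// would contain every accurate value the application needs).
const lookupTable = [0, 1, 1, 2, 3, 5, 8, 13, 21, 34];

function* fib (n = lookupTable.length) {
  // Arrays are iterable, so yield* can delegate straight to a slice.
  yield* lookupTable.slice(0, n);
}

console.log([...fib(5)]); // [0, 1, 1, 2, 3]
```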
Not surprisingly, this is the fastest implementation of the bunch, by a large margin:
Profile with 79 numbers
Looking back, I’m pretty sure that as long as we limited the series to accurate values, the callstack would be no problem… a slightly modified recursive version would probably do just fine:
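Something like this (a sketch):

```javascript
// Seed values go straight into the memo, so the recursive case is
// just the recurrence relation itself.
const memo = [0, 1];

const fib = n =>
  memo[n] !== undefined
    ? memo[n]
    : (memo[n] = fib(n - 1) + fib(n - 2));

// The generator delegates to the memo array again.
function* gen (n) {
  fib(n);
  yield* memo.slice(0, n);
}

console.log([...gen(7)]); // [0, 1, 1, 2, 3, 5, 8]
```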
This one is my favorite of the bunch. The seed values can go in the memo, leaving the actual calculation about as close as you can get to the mathematical recurrence relation: F(n) = F(n-1) + F(n-2).
For the generator we’re just delegating to the memo array again.
- If you use a formula that utilizes floating point math, you should definitely test the limits of its accuracy.
- If you’re using a series that grows exponentially, you should figure out how much of the series you can produce before you run into the limitations of the JS `Number` type.
- If your limits are small enough, consider pre-generating a lookup table to speed up your production app.
When generator functions were introduced in ES6, a lot of other builtin things also implemented the iterator protocol (the thing that gets returned from the generator and can be iterated over).
More precisely, they implemented the iterable protocol. `String`, `Array`, `TypedArray`, `Map`, and `Set` are all builtin iterables, which means they all have a `[Symbol.iterator]` property that is not enumerable.
In other words, you can now iterate over any array-like builtin object using the iterator`.next()`method.
Here’s how you can access an array iterator. The technique is the same for anything that implements the iterable protocol:
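For an array, it looks like this:

```javascript
const arr = ['a', 'b', 'c'];

// Ask the array for its iterator via the well-known symbol.
const iterator = arr[Symbol.iterator]();

console.log(iterator.next()); // { value: 'a', done: false }
console.log(iterator.next()); // { value: 'b', done: false }
console.log(iterator.next()); // { value: 'c', done: false }
console.log(iterator.next()); // { value: undefined, done: true }
```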
You can even build your own custom iterables:
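For instance, here’s a hypothetical `countdown` object made iterable by hand:

```javascript
// Any object with a [Symbol.iterator] method that returns an
// iterator can be spread, destructured, and used in for...of.
const countdown = {
  from: 3,
  [Symbol.iterator] () {
    let current = this.from;
    return {
      next: () =>
        current >= 0
          ? { value: current--, done: false }
          : { value: undefined, done: true }
    };
  }
};

console.log([...countdown]); // [3, 2, 1, 0]
```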
And even redefine builtin iterable behaviors, but beware — I’m seeing inconsistent behavior between Babel and V8:
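For example, shadowing an array’s own iterator (results may vary across engines and transpilers, as noted; this is what Node/V8 does):

```javascript
const arr = [1, 2, 3];

// Shadow the prototype's iterator with an own property that
// yields doubled values.
arr[Symbol.iterator] = function* () {
  for (let i = 0; i < this.length; i++) {
    yield this[i] * 2;
  }
};

console.log([...arr]); // [2, 4, 6] in Node/V8
```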
I thought it might be convenient to write a function that’s basically a shortcut for `arr[Symbol.iterator]()`, so I made one and gave it a fun slicing API so you can easily grab chunks of an array and turn them into iterators. I called it arraygen. You can browse arraygen on GitHub.
Hopefully I hit on some stuff you might not have known about generators. I went off on a couple interesting tangents, too:
- Avoid recursion. Generators don’t get optimized tail calls.
- Allow parameters to limit the length of your generators, and you can use the `...rest` syntax to destructure their values into an array.
- Memoized infinite generators can blow the heap size limits.
- Floating point precision errors can trip up formula-based infinite generators. Be careful.
- Know your limits. Does your generator have enough runway to satisfy the needs of your application? Is it accurate enough over the span of that runway? Are you going to run into limitations of the data type you’re using? Will the JS engine have enough memory to keep your generator running as long as you want it to?
- Most of the builtins behave a bit like generators with the iterable protocol, and you can define your own custom iterables.
If you want to play with the Fibonacci examples, you can clone the full source from GitHub.