A Case of Mistaken Iterator

Note: This article was originally published at Planet PHP on 30 July 2010.

Earlier this week, I spent most of a day tracing through code in search of the source of a bug that was causing part of our application to fail in strange ways.

In the back end, we have models that connect to CouchDB. These models implement the Iterator pattern to allow easy traversal of a record's keys.

When I wrote the code to implement Iterator several months ago, I dutifully checked the PHP Manual and adapted the reference example that I found there:

    [...]_data); }
    public function current() { return current($this->_data); }
    public function key() { return key($this->_data); }
    public function next() { return next($this->_data); }
    public function valid() { return (current($this->_data) !== false); }
    }

Little did I realize that this implementation is very broken. I'll explain why, below.

Over the past few years, I've implemented many iterators in this way, using PHP's implicit array manipulation functions (reset(), current(), key(), next()). These functions are very convenient because PHP arrays are so powerful: arrays in PHP work like ordered hash tables in other languages.

PHP's implicit management of an array's iteration index (the value that is incremented by next() and referenced by key()) is indeed convenient, but the convenience can sometimes be offset by its very implicitness: the value is hidden from you, the PHP programmer.
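As a quick illustration of that hidden state, here's a minimal sketch that steps an array's internal pointer by hand. The array contents and variable names are made up for the example; only reset(), key(), current(), and next() come from the text above.

```php
<?php
$data = array('apple', 'cow' => 'moo', 'orange');

reset($data);                                 // move the hidden pointer to the first element
$first = array(key($data), current($data));   // key 0, value 'apple'

next($data);                                  // advance; nothing visible in $data changes
$second = array(key($data), current($data));  // key 'cow', value 'moo'

next($data);                                  // now at 'orange' (key 1)
next($data);                                  // now past the last element
$end = array(key($data), current($data));     // key() gives null, current() gives false

var_dump($first, $second, $end);
```

Nothing in $data itself tells you where the pointer currently sits; the only way to find out is to ask key() or current().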

In PHP, generic array iteration (without the implicit iterator) isn't actually as simple as it sounds. Remember that arrays aren't arrays in the traditional sense, but ordered hash tables. Consider this:

    $data = array('zero','one','two','three');
    for ($i=0; $i [...]

Output:

zero one two three

This first example is easy to iterate: the array contains sequential, numeric, zero-based keys. It gets more complicated when using non-sequential and non-numeric keys:

    $data = array('apple', 'cow' => 'moo', 'pig' => 'oink', 'orange');
    for ($i=0; $i [...]

Output:

    apple
    orange
    Notice: Undefined offset: 2 in - on line 10
    Notice: Undefined offset: 3 in - on line 10
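The notices make sense once you look at which keys the array actually has. The sketch below is not the truncated listing; it's a small stand-in that checks each numeric offset with array_key_exists() instead of reading it blindly. The '(undefined offset)' placeholder string is my own. (On PHP 8.0 and later, a blind read produces a warning worded "Undefined array key" rather than the notice shown above.)

```php
<?php
$data = array('apple', 'cow' => 'moo', 'pig' => 'oink', 'orange');

// Unkeyed values get the next free integer key, so 'apple' is 0 and
// 'orange' is 1. There is no key 2 or 3, even though count($data) is 4.
$keys = array_keys($data);

$found = array();
for ($i = 0; $i < count($data); $i++) {
    $found[$i] = array_key_exists($i, $data) ? $data[$i] : '(undefined offset)';
}

print_r($keys);
print_r($found);
```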

I could use foreach, but because a numeric loop illustrates the point more clearly, here's how I might implement the above code so that it works:

    $data = array('apple', 'cow' => 'moo', 'pig' => 'oink', 'orange');
    $k = array_keys($data);
    for ($i=0; $i [...]

Output:

apple moo oink orange
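Since the loop in the listing is cut off in this copy, here's a minimal runnable sketch of the same idea. It assumes the loop body simply collects and prints each value; the $lines variable is mine, while $data and $k follow the listing.

```php
<?php
$data = array('apple', 'cow' => 'moo', 'pig' => 'oink', 'orange');
$k = array_keys($data);   // array(0, 'cow', 'pig', 1): the keys in insertion order

// Walk the list of keys by position, then use each key to look up the value.
$lines = array();
for ($i = 0; $i < count($k); $i++) {
    $lines[] = $data[$k[$i]];
}

echo implode("\n", $lines), "\n";
```

The numeric counter now indexes the list of keys rather than the array itself, so it no longer matters what the keys are.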

This brings us back to the Iterator implementation from the start of this article. Why isn't it correct? Take a closer look at valid():

    public function valid() { return (current($this->_data) !== false); }

current() returns false when the internal pointer has moved past the end of the array, so a value of false stored in the array is indistinguishable from that end-of-array signal. Using the above implementation with the following array would cause it to bail after orange (and subsequently might cause you to waste a day tracking down the cause):

array('apple', 'orange', false, 'banana',);
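To see the failure concretely, here's a minimal sketch of an iterator built on the internal pointer, run against that array. The class name PointerIterator, the constructor, and the property name are assumptions for the example, and the return types assume PHP 8.0 or later (code from 2010 would not have had them).

```php
<?php
// Hypothetical class: mirrors the internal-pointer approach discussed above.
final class PointerIterator implements Iterator
{
    private array $data;

    public function __construct(array $data)
    {
        $this->data = $data;
    }

    public function rewind(): void   { reset($this->data); }
    public function current(): mixed { return current($this->data); }
    public function key(): mixed     { return key($this->data); }
    public function next(): void     { next($this->data); }

    // The bug: a stored false looks exactly like "past the end".
    public function valid(): bool
    {
        return current($this->data) !== false;
    }
}

$seen = array();
foreach (new PointerIterator(array('apple', 'orange', false, 'banana')) as $value) {
    $seen[] = $value;
}

var_dump($seen);   // only 'apple' and 'orange'; 'banana' is never reached
```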

On Tuesday night, I updated the manual to use an improved Iterator implementation. It's probably a bit slower (so you can use the internal-indexing implementation if you're sure your arrays will never contain false), but my implementation is more robust.

    [...]_index = 0; }
    public function current() { $k = array_keys($this[...]
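The listing is cut off at this point, so what follows is not the code that went into the manual. It's a minimal sketch of an index-based Iterator in the spirit the text describes, assuming an explicit integer position plus array_keys() lookups. The class name IndexedIterator, the constructor, and the property names are hypothetical, and the return types assume PHP 8.0 or later.

```php
<?php
// Hypothetical class: tracks its own position instead of relying on
// the array's internal pointer.
final class IndexedIterator implements Iterator
{
    private array $data;
    private int $index = 0;

    public function __construct(array $data)
    {
        $this->data = $data;
    }

    public function rewind(): void
    {
        $this->index = 0;
    }

    public function current(): mixed
    {
        $keys = array_keys($this->data);
        return $this->data[$keys[$this->index]];
    }

    public function key(): mixed
    {
        $keys = array_keys($this->data);
        return $keys[$this->index];
    }

    public function next(): void
    {
        $this->index++;
    }

    // Validity depends only on the position, never on the stored value.
    public function valid(): bool
    {
        return $this->index < count($this->data);
    }
}

$seen = array();
foreach (new IndexedIterator(array('apple', 'orange', false, 'banana')) as $value) {
    $seen[] = $value;
}

$mixed = array('apple', 'cow' => 'moo', 'pig' => 'oink', 'orange');
$roundTrip = iterator_to_array(new IndexedIterator($mixed));

var_dump($seen, $roundTrip);
```

In this sketch, calling array_keys() on every current() and key() is the likely source of the slowdown mentioned above; caching the key list once would trade a little memory for speed.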

Truncated by Planet PHP, read more at the original (another 765 bytes)