MongoDB Cursors with PHP

Note: This article was originally published at Planet PHP on 22 May 2012.

London, UK Tuesday, May 22nd 2012, 09:15 BST

Recently I was asked to improve the MongoCursor::batchSize documentation. This led to an in-depth investigation into how the PHP driver for MongoDB pulls queried data from the MongoDB server. Here are my findings.

A MongoCursor is created as soon as you call the find() method on a MongoCollection object, as in:

$m = new Mongo();
$collection = $m->demoDb->demoCollection;
$cursor = $collection->find();

Just calling find() only creates a cursor object; it does not immediately send the query to the server for processing. That happens only when you start reading from the cursor for the first time. Because of this, you can call additional methods on the newly created cursor object that still influence how the query is run on the server. One such example is the sort() method, which sorts the result according to its arguments (in this example, by name):

$cursor->sort(array('name' => 1));
$result = $cursor->getNext();

When you then call getNext() on $cursor, the driver sends the query to the server and asks it to return a default number of documents in the first batch. The default batch size is 101. Let's look at what gets sent on the wire for our simple query for all documents, sorted by name:

The Number to Return is 0, which means that the default is used. So even though we only want to fetch one result (getNext() asks the cursor for the next document only), the server returns 101 documents:

The driver stores all 101 documents locally, and during the next 100 calls to getNext() it simply returns documents from local memory. Once getNext() is called for the 102nd time, the driver goes back to the server to request more documents:

// skip the other 100 docs
for ($i = 0; $i < 100; $i++) {
    $cursor->getNext();
}
// request document 102:
$result = $cursor->getNext();
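The buffering behaviour described above can be illustrated without a server. The following is a minimal sketch only: SketchCursor, its properties, and the follow-up batch of 1000 documents are hypothetical names and values invented for illustration, not part of the driver (the real server sizes follow-up batches by bytes, not by a fixed count).

```php
<?php
// Hypothetical in-memory model of the batching described above. It is NOT
// the driver's code and never talks to a MongoDB server.
class SketchCursor
{
    private $docs;             // stand-in for the server-side result set
    private $firstBatch;       // documents returned with the initial query
    private $moreBatch;        // documents per follow-up request (assumed fixed)
    private $sent = 0;         // documents that have left the "server" so far
    private $buffer = array(); // documents held locally by the "driver"
    public $roundTrips = 0;    // number of simulated server requests

    public function __construct(array $docs, $firstBatch = 101, $moreBatch = 1000)
    {
        // Constructing the cursor sends nothing, mirroring find().
        $this->docs = array_values($docs);
        $this->firstBatch = $firstBatch;
        $this->moreBatch = $moreBatch;
    }

    public function getNext()
    {
        // Only go back to the "server" when the local buffer is empty.
        if (count($this->buffer) === 0) {
            $this->fetch();
        }
        return array_shift($this->buffer); // null once everything is consumed
    }

    private function fetch()
    {
        if ($this->sent >= count($this->docs)) {
            return; // nothing left on the "server"
        }
        $size = ($this->roundTrips === 0) ? $this->firstBatch : $this->moreBatch;
        $this->buffer = array_slice($this->docs, $this->sent, $size);
        $this->sent += count($this->buffer);
        $this->roundTrips++;
    }
}

$cursor = new SketchCursor(range(1, 500));
$tripsBeforeRead = $cursor->roundTrips;   // no request yet

for ($i = 0; $i < 101; $i++) {
    $cursor->getNext();                   // all served by the first batch
}
$tripsAfter101 = $cursor->roundTrips;

$doc102 = $cursor->getNext();             // buffer is empty: second request
$tripsAfter102 = $cursor->roundTrips;

printf("%d, %d, %d\n", $tripsBeforeRead, $tripsAfter101, $tripsAfter102);
```

In this model, passing the same value for both batch arguments approximates a fixed batch size for every round trip.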

When the driver asks for more documents separately (i.e., not at the same time as it issues a query) without a specific batch size, the server fills the reply with about 4MB of documents. On the wire, the Get More request looks like:

and the reply like:

As you can see, the returned data is 4194378 bytes, and the Number Returned is 34673.
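As a quick sanity check on those two figures, the arithmetic below derives the average number of bytes per returned document and how far the reply exceeds exactly 4MB. It is only a sketch: the variable names are illustrative, and the average includes whatever message overhead is counted in the byte total.

```php
<?php
// Figures quoted above from the Get More reply.
$bytesReturned  = 4194378;
$numberReturned = 34673;
$fourMb         = 4 * 1024 * 1024; // 4194304 bytes

// Average size per document, including any overhead counted in the total.
$avgDocSize = $bytesReturned / $numberReturned;

// How many bytes the reply is beyond exactly 4MB.
$overFourMb = $bytesReturned - $fourMb;

printf("average bytes per document: %.1f\n", $avgDocSize);
printf("bytes beyond 4MB: %d\n", $overFourMb);
```

So the documents in this test collection average roughly 121 bytes each, and the reply is 74 bytes larger than 4MB.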

Setting your own batch size

You can instruct the driver to use a different batch size by calling the batchSize() method on $cursor. In this new example, we use batchSize() to request 25 documents per round trip to the server:

$cursor = $collection->find()->sort(array('name' => 1));
$cursor->batchSize(25);
$result = $cursor->getNext();

When we run this script, we will see the following on the wire:

As expected, the Number to Return is now 25. During iteration, all query results are returned from the server to the driver in batches of 25 documents:

// retrieve another 25 documents to trigger the getMore
for ($i = 0; $i < 25; $i++) {
    $cursor->getNext();
}

This creates the following query:

And this r

Truncated by Planet PHP, read more at the original (another 7915 bytes)