64-bit integers in MongoDB

Note: This article was originally published at Planet PHP on 9 August 2010.

London, UK Monday, August 9th 2010, 14:23 BST

The current project that I'm working on relies heavily on MongoDB, a bridge between key-value stores and traditional RDBMSs. Users in this project are identified by their Facebook UserID, which is a "64-bit int datatype". Unfortunately, the MongoDB PHP driver only supported 32-bit integers, which caused problems for newer Facebook users: their nice long UserID was truncated to 32 bits, which didn't quite make the application work.

MongoDB stores documents internally in something called BSON (Binary JSON). BSON has two integer types: a 32-bit signed integer type called INT and a 64-bit signed integer type called LONG. The documentation of the MongoDB PHP driver on types says (or used to say, depending on when you're reading this) that only the 32-bit signed integer type is supported because "PHP does not support 8 byte integers". That's not quite true. PHP's integer type is 64 bits wide on platforms where the C data type long is 64 bits. That is generally the case on every 64-bit platform (where PHP is compiled for 64 bits), except on Windows, where the C data type long is always only 32 bits.
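A quick way to see which case applies to a given PHP build is to look at the PHP_INT_SIZE and PHP_INT_MAX constants. This is only a minimal sketch; the helper name describeIntegerWidth() is made up for illustration and is not part of PHP or of the MongoDB driver.

```php
<?php
// Hypothetical helper (not part of PHP or the driver): reports the width of
// PHP's native integer type on the running build.
function describeIntegerWidth()
{
    // PHP_INT_SIZE is the size of a PHP integer in bytes: 8 on builds with
    // 64-bit integers, 4 on builds with 32-bit integers.
    return sprintf(
        '%d-bit integers, PHP_INT_MAX = %d',
        PHP_INT_SIZE * 8,
        PHP_INT_MAX
    );
}

echo describeIntegerWidth(), "\n";
```

If this reports 32-bit integers, a Facebook-sized UserID cannot be held in a native PHP integer at all, whatever the driver does.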

Whenever a PHP integer is sent to MongoDB, the driver would use only the 32 least significant bits to store the number as part of the document. The example here shows what happens (on a 64-bit platform):

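<?php
$m = new Mongo();
$c = $m->selectCollection('test', 'inttest');
$c->remove(array());                               // start from an empty collection
$c->insert(array('number' => 1234567890123456));  // needs more than 32 bits
$r = $c->findOne();
var_dump($r['number']);
?>

This script shows: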

int(1015724736)

In binary:

1234567890123456 = 100011000101101010100111100100010101011101011000000
      1015724736 =                      111100100010101011101011000000
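The same arithmetic can be reproduced in plain PHP without a database. The sketch below models the effect described above rather than the driver's actual C code: truncateToInt32() is a hypothetical helper, and it assumes a PHP build with 64-bit integers.

```php
<?php
// Hypothetical helper: keeps only the 32 least significant bits of an
// integer and reads them as a signed 32-bit value, like a BSON INT.
// Assumes a PHP build with 64-bit integers.
function truncateToInt32($n)
{
    $low = $n & 0xFFFFFFFF;      // keep the low 32 bits
    if ($low & 0x80000000) {     // bit 31 set: negative as a signed int32
        $low -= 0x100000000;
    }
    return $low;
}

var_dump(truncateToInt32(1234567890123456)); // int(1015724736)
var_dump(truncateToInt32(4294967295));       // int(-1)
```

The second call shows that truncation can change the sign as well as the magnitude, because the stored type is signed.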

Truncating data is obviously not a very good idea. In order to address this issue we could just allow for the native PHP integer type to be used when storing data from PHP into MongoDB. But instead of changing how the MongoDB driver works by default I've added the new setting mongo.native_long - simply because otherwise we might be breaking applications. With the mongo.native_long setting enabled, we see the following result instead of the outcome of the script above:

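<?php
// $c is the 'inttest' collection from the first example, emptied beforehand
ini_set('mongo.native_long', 1);
$c->insert(array('number' => 1234567890123456));
$r = $c->findOne();
var_dump($r['number']);
?>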

This script shows:

int(1234567890123456)

On 64-bit platforms, the mongo.native_long setting allows 64-bit integers to be stored in MongoDB. The MongoDB data type used in this case is the BSON LONG, instead of the BSON INT that is used if this setting is turned off. The setting also changes how BSON LONGs behave when they are read back from MongoDB. Without mongo.native_long enabled, the driver would convert every BSON LONG to a PHP double, which results in a loss of precision. You can see that in the following example:

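<?php
// $c is the 'inttest' collection from the first example, emptied beforehand
ini_set('mongo.native_long', 1);                   // store the value as a BSON LONG
$c->insert(array('number' => 12345678901234567));
ini_set('mongo.native_long', 0);                   // read it back the old way
$r = $c->findOne();
var_dump($r['number']);
?>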

This script shows:

float(1.2345678901235E+16)
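The precision loss comes from the double itself, not from MongoDB: a double has a 53-bit significand, so integers above 2^53 do not all have an exact representation. A minimal sketch in plain PHP, assuming a build with 64-bit integers; survivesFloatRoundTrip() is a hypothetical helper name:

```php
<?php
// Hypothetical helper: checks whether an integer comes back unchanged after
// being converted to a float (double) and back.
// Assumes a PHP build with 64-bit integers.
function survivesFloatRoundTrip($n)
{
    return (int) (float) $n === $n;
}

var_dump(survivesFloatRoundTrip(1234567890123456));  // bool(true), below 2^53
var_dump(survivesFloatRoundTrip(12345678901234567)); // bool(false), above 2^53
```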

On 32-bit platforms, the mongo.native_long setting changes nothing for storing integers in MongoDB: the integer is stored as a BSON INT as before. However, when the setting is enabled and a BSON LONG is read from MongoDB, a MongoCursorException is thrown, alerting you that the data could not be read back without losing precision:

MongoCursorException: Can not natively represent the long 1234567890123456 on this platform

If the setting is not enabled, a BSON LONG is converted to a PHP float in order to avoid breaking backwards compatibility with the current behaviour.
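The read-side rules for BSON LONGs described so far can be summarised in a small model. This is a sketch of the behaviour as explained in this article, not the driver's code: readBsonLong() is a hypothetical function, its $intSize parameter stands in for PHP_INT_SIZE so that both platform types can be tried on one machine, and RuntimeException stands in for MongoCursorException. The sample calls assume they are run on a PHP build with 64-bit integers.

```php
<?php
// Hypothetical model of how a BSON LONG is handed back to PHP.
function readBsonLong($value, $nativeLong, $intSize = PHP_INT_SIZE)
{
    if (!$nativeLong) {
        return (float) $value;   // setting off: convert to a PHP float
    }
    if ($intSize >= 8) {
        return $value;           // setting on, 64-bit: native PHP integer
    }
    // setting on, 32-bit: refuse rather than lose precision
    throw new RuntimeException(
        "Can not natively represent the long $value on this platform"
    );
}

var_dump(readBsonLong(1234567890123456, true, 8));            // int(1234567890123456)
var_dump(is_float(readBsonLong(1234567890123456, false, 4))); // bool(true)

try {
    readBsonLong(1234567890123456, true, 4);
} catch (RuntimeException $e) {
    echo $e->getMessage(), "\n";
}
```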

Although the mongo.native_long setting allows for 64-bit support on 64-bit platforms, it doesn't provide much for 32-bit platforms except preventing loss of precision

Truncated by Planet PHP, read more at the original (another 11857 bytes)