Doctrine2 CouchDB support! But Why?
As I blogged recently, we started working on CouchDB support on top of the Doctrine2 infrastructure back at FrOSCamp. For me it's been a while since I have taken a leading role in the development of an OSS component. Ever since I left PEAR, I have mainly been helping out here and there, and with php.net I was doing organizational stuff. So it's kind of fun to be back working on some OSS code with significant contributions in terms of code. There is still a lot of work ahead, so I wouldn't mind a few helping hands. But you may wonder why you should even bother. I was in #couchdb on freenode the other day and David asked a fairly legitimate question: "can you enlighten me as to why you'd need an ORM on top of native json objects?" In this blog post I will try to explain why it makes sense to add a model-based infrastructure on top of a NoSQL database.
To me the first advantage of using model classes managed by an ODM is that they centralize the data structures for the different pieces of information. This is one of the concerns I raised earlier with NoSQL: to avoid needless differences in stored content (isDeleted vs. removed etc.), the code needs to manage the schema. Plus you get convenience features like lazy loading, as well as bulk loading of relations for an entire collection. Switching queries between these two loading approaches is very easy and can also be done at the model level.
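To illustrate the centralization point, here is a minimal sketch of what such a model class could look like. The annotation names follow Doctrine's general mapping conventions and may well differ in the final CouchDB ODM; the field names are made up for illustration.

```php
<?php
// Hypothetical document class: one canonical schema definition
// instead of ad-hoc JSON structures scattered across the codebase.
// Annotation names are assumptions based on Doctrine2 conventions.

namespace Entities;

/** @Document */
class User
{
    /** @Id */
    private $id;

    /** @Field(type="string") */
    private $name;

    // One agreed-upon flag, so we never end up with both
    // "isDeleted" and "removed" in stored documents.
    /** @Field(type="boolean") */
    private $isDeleted = false;

    public function setName($name)
    {
        $this->name = $name;
    }

    public function getName()
    {
        return $this->name;
    }
}
```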
Model classes also allow you to validate the structures on the client side. Obviously CouchDB, for example, has the ability to do server-side validation. But with model classes you can move this to the client entirely, or at the very least get rid of round trips for failed validation. Maybe we will eventually support some way to sync validation rules between the model classes and the CouchDB server, i.e. import and export of these rules in both directions.
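A trivial sketch of such client-side validation: the setter rejects bad data before a document ever gets sent to the server, so an invalid write never costs a round trip. The rule itself is made up for illustration.

```php
<?php
// Client-side validation in a model class (illustrative rule):
// invalid data fails fast in PHP instead of triggering a
// validate_doc_update rejection on the CouchDB server.

class User
{
    private $name;

    public function setName($name)
    {
        if (!is_string($name) || trim($name) === '') {
            throw new \InvalidArgumentException(
                'name must be a non-empty string'
            );
        }
        $this->name = $name;
    }

    public function getName()
    {
        return $this->name;
    }
}
```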
Obviously the great advantage of NoSQL is that you can more easily change your schema. But what happens to the data already stored? Either you have to migrate it all at once, or you use something that Jonathan and I came up with back when he started working on MongoDB support in Doctrine2: eventual migration! The idea is that instead of having to migrate old data to the new format in one go, you place rules into the model for how the data is migrated when read, so that when a document is stored again it is written in the new structure.
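The eventual-migration idea could look roughly like this. The lifecycle hook name here is hypothetical; the point is only that a rule in the model translates the old field on read, and the next flush writes the new structure.

```php
<?php
// Sketch of "eventual migration" (hypothetical hook name).
// When a document still using the legacy "removed" field is
// loaded, it is translated to the new "isDeleted" field; the
// next time the document is saved, only the new structure
// is written.

class User
{
    public $isDeleted = false;

    /** @PostLoad — hypothetical lifecycle callback */
    public function migrateLegacyFields(array $rawData)
    {
        if (isset($rawData['removed'])) {
            $this->isDeleted = (bool) $rawData['removed'];
        }
    }
}
```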
Also, just like with an ORM, the advantage of such an ODM is that you can do some pretty neat performance optimizations at a higher level. Doctrine2 is basically a persistence manager, forgoing the old ActiveRecord-style approach of Doctrine1:

<?php
$user = new Entities\User();
$user->setName('Garfield');
$em->persist($user);
$em->flush();
?>
So the idea is that you mess around with your models, and once you are done you flush the changes to the database. This enables quite intelligent use of transactions. If available, you can also make use of bulk change APIs by simply introspecting the changes to be flushed and then deciding which approach to use. In the MongoDB ODM, for example, Jonathan implemented in-place updates which can be used during a flush() operation. Furthermore, during the flush() operation you can trigger events that take the entire change set into account, which can be useful for managing an audit log, for example.
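As a sketch of the audit-log idea: a listener on the flush event inspects the computed change set and records old/new values. The event and method names below follow the Doctrine2 ORM (onFlush, UnitOfWork change sets); the CouchDB ODM equivalents may be named differently.

```php
<?php
// Sketch of an audit log driven by the flush() change set.
// Names follow the Doctrine2 ORM; treat them as assumptions
// for the CouchDB ODM.

class AuditLogListener
{
    public function onFlush($eventArgs)
    {
        $uow = $eventArgs->getEntityManager()->getUnitOfWork();

        foreach ($uow->getScheduledEntityUpdates() as $entity) {
            // Array of field => array(oldValue, newValue)
            $changeSet = $uow->getEntityChangeSet($entity);

            // ... write $changeSet to the audit log here ...
        }
    }
}
```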
One concern with all of this is how much overhead this adds, not only in terms of raw speed, but also in terms of code on