Magic has a price, but it doesn't need to be your life
So I heard rumors that one of the teams at Liip was struggling with the performance of a symfony app, which was casting doubt on symfony in general. That team solved their performance issues by using a filter to do some aggressive caching. This fit nicely into the content workflow, since updates do not come in all that frequently. So they just cached everything and added a feature to prime the caches on content changes. Problem solved, but doubts remained for other projects. So I wanted to figure out what was going on.
The problem was that I couldn't easily get a good set of test data and that their staging environment wasn't set up with XHProf yet. So I began with some static code analysis. I saw a few cases where they were using "include_component" instead of just "include_partial". I noticed that they had a few cross-application links, but then again they were using memcache to cache those. I wondered if they should have used pgpool to handle the connections to PostgreSQL more efficiently. Also, they didn't seem to be caching Doctrine-generated SQL queries yet, because they had run into some issues with that enabled during development.
Now the other day we finally had XHProf on the staging server, with a current set of data and the same setup as production. Jordi and I began disabling all the various caches and performance was indeed super slow, 4-5 seconds slow! We checked XHProf and found that some Doctrine methods were being called 20k-80k times! The cause seemed to be their custom tree algorithm.
I guess they could have used nested sets here, since as I mentioned earlier their data doesn't change that often. Looking at the tree algorithm, we noticed that getId() and getParentId() were being called multiple times in one of the key methods. So we stored the method return values in local variables and shaved off 2 seconds. Jordi then continued with some more optimization and got the response time down to 600ms without any output caching.
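To make the local variable trick concrete, here is a rough sketch. It is in Python rather than PHP and it is not the team's actual code: the Node class, its call counter and both children_* functions are made up for illustration. The getters stand in for Doctrine record accessors, which cost a lot more per call than a plain variable read.

```python
class Node:
    """Hypothetical stand-in for a Doctrine record; counts getter calls."""

    calls = 0  # shared counter so the saving is visible

    def __init__(self, node_id, parent_id):
        self._id = node_id
        self._parent_id = parent_id

    def get_id(self):
        Node.calls += 1
        return self._id

    def get_parent_id(self):
        Node.calls += 1
        return self._parent_id


def children_naive(nodes, parent):
    # parent.get_id() is evaluated again for every candidate node.
    return [n for n in nodes if n.get_parent_id() == parent.get_id()]


def children_hoisted(nodes, parent):
    # Read the parent's id once and reuse the local variable in the loop.
    parent_id = parent.get_id()
    return [n for n in nodes if n.get_parent_id() == parent_id]
```

Both functions return the same children. The naive one makes two getter calls per node it looks at, the hoisted one makes one per node plus a single call up front. That difference looks tiny on its own, but it multiplies quickly once the method is called recursively over a whole tree.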
I guess that by enabling the cache for the header and footer, which almost never change but require a bit of work to render, this number will drop even more. Database access now makes up about 120ms per request, and this could be tweaked some more. Of course there are now more areas worth optimizing, and maybe the things I spotted during the static analysis are starting to matter at this stage.
Now we asked the team why they didn't notice that their tree algorithm was this slow. They actually did notice, but that was earlier in the development of the project, and with less data in the test environment the slowdown was noticeable but not so severe. At that stage things were hectic and not everybody was totally familiar with symfony, so it ended up being considered a framework overhead thing. And thanks to aggressive caching there wasn't really an issue for this particular project, and the site was launched with great performance.
Now the lessons learned are:
- don't waste your time with static code optimization analysis
- make sure that XHProf is installed on all servers
- nicely layered OO code is awesomely powerful, but especially when doing recursions, make sure you cache method output as much as possible
- symfony 1.x and Doctrine 1.x can of course perform well
- better tools that visualize performance of the test suite together with commits would have made it easier to spot the underperforming code without manual investigation
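For the point about caching method output in recursions, here is a minimal sketch of what that can look like, again in Python and again purely illustrative. The make_depth_lookup helper and the parent_of mapping are assumptions of mine, not anything from the project. It computes the depth of a node in a parent-pointer tree and remembers every result, so each node is only ever resolved once.

```python
def make_depth_lookup(parent_of):
    """Build a cached depth function.

    parent_of maps node id -> parent id (None for the root).
    This mapping is a hypothetical stand-in for the real tree data.
    """
    cache = {}

    def depth(node_id):
        # Return the remembered result if this node was already resolved.
        if node_id in cache:
            return cache[node_id]
        parent_id = parent_of[node_id]
        result = 0 if parent_id is None else depth(parent_id) + 1
        cache[node_id] = result
        return result

    return depth, cache
```

A quick usage example: with parent_of = {1: None, 2: 1, 3: 2, 4: 3}, calling depth(4) walks up to the root once and fills the cache for all four nodes, so a later depth(2) is just a dictionary lookup. The cache has to be thrown away whenever the tree changes, which is cheap to live with when the data rarely changes.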
BTW: If you have never heard of XHProf, learn about it in this excellent article written by Lorenzo.