Egomaniacal and Scalable Apps
The rise of cloud computing over the past few years has been incredible. Unfortunately, many still do not see its appeal or understand its importance.
This is not a post about cloud computing and its benefits. Rather, it is a refresher on some basic rules of scalability and on how to get started building scalable apps, keeping in mind that most people and companies have limited budgets.
The concept of shared-nothing architectures is well known in the scalability community (if such a thing even exists), but it seems to be frequently forgotten when it comes to building web apps.
Before we dive into scalability theory, a reminder is in order: fast apps and infrastructures don't necessarily scale.
Although scalability is a complex and arduous concept to explain, the generally accepted meaning of load scalability is the ability of a system to handle a growing amount of work in a capable manner. In other words, it's the ability to handle more load by adding more computing power. Will you be able to handle more users if you add more servers to your cluster?
Load scalability is about the ability to adjust and adapt. As Darwin once said (or meant to say):
“It is not the fastest web app that survives peaks and popularity; it is the one that is most adaptable to change.”
For years, the PHP world (most of it) was deploying to a single server that contained the web server, the databases, the uploaded files, the session files, &c.
This outdated setup looks like this:
This was fine until web sites started going down because of a mention on Digg or Slashdot (for those of you who remember Slashdot). Apps were fast, but they couldn't surpass a certain threshold of users.
This is about the time the concept of shared-nothing architectures began to take hold. Infrastructures are now decoupled, and every component can be easily replaced. This improved setup looks like this:
By sharing nothing, every server that powers your app becomes egomaniacal and does not care about the rest of the infrastructure. Like modular object-oriented code, parts of your infrastructure become independent. For instance, your web server should no longer save sessions in local files, because at any given point in time, the number of web servers can change.
A system that is tightly linked to its filesystem for file uploads, databases, sessions, &c. is not scalable. Luckily, in the PHP world, we have quite a few tools to help us attain a high degree of selfishness.
Storing sessions on a web server's local filesystem means that when the system scales by adding or removing servers, some sessions will be lost. There are a few solutions to the problem.
Memcached has been used for many years in the PHP world, and it is a great way to store objects and sessions across a cluster of servers. For an even more scalable Memcached infrastructure, I recommend Membase, which is open source and provides elasticity as well as persistence for Memcached.
Once you set up your Memcached infrastructure (and the PHP extension), you simply modify your session handler to point to your Memcached server (or pool) as follows:
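A minimal sketch of that change, assuming the PECL memcached extension is loaded. The two hostnames are placeholders, not real servers. If you use the older memcache extension instead, the handler name is memcache and each entry takes the form tcp://host:port.

```php
<?php
// Assumption: the PECL "memcached" extension is installed and enabled.
// The hostnames below are hypothetical placeholders.
ini_set('session.save_handler', 'memcached');

// One "host:port" entry per server; separate pool members with commas.
ini_set('session.save_path', 'cache1.example.com:11211,cache2.example.com:11211');

session_start();

// Session data now lives in the Memcached pool, not in a local file,
// so any web server in the cluster can read it.
$_SESSION['user_id'] = 42;
```

The same two settings can also go in php.ini so that every app on the server picks them up without code changes.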
Many people dislike using Memcached for session storage because the data is ephemeral: if your server dies, all of your sessions disappear, and your users are logged out.
Membase avoids this problem and is fully compliant with the Memcached protocol.
Redis is an advanced key-value store (data structure server) that can contain strings, hashes, lists, sets, and sorted sets. Redis can be naturally replicated, and, despite being an in-memory system, it can also fall back to storing data on disk.
After installing the aforementioned PHP extension, you have to modify your session handler to use Redis:
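A minimal sketch, assuming the extension in question is phpredis (the PECL redis extension), which registers a session handler named redis. The hostname is a placeholder.

```php
<?php
// Assumption: the phpredis (PECL "redis") extension is installed and enabled.
// The hostname below is a hypothetical placeholder.
ini_set('session.save_handler', 'redis');
ini_set('session.save_path', 'tcp://redis1.example.com:6379');

session_start();

// Session data is now written to Redis instead of the local filesystem.
$_SESSION['user_id'] = 42;
```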
If you want to use a cluster of Redis instances, you can specify multiple servers:
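A sketch of the multi-server form, again assuming phpredis and placeholder hostnames. Entries are separated by commas, and the optional weight parameter controls how sessions are distributed across the servers.

```php
<?php
// Assumption: phpredis is installed; both hostnames are hypothetical.
ini_set('session.save_handler', 'redis');

// Comma-separated list of servers. A higher weight means that server
// receives a larger share of the sessions.
ini_set(
    'session.save_path',
    'tcp://redis1.example.com:6379?weight=1, tcp://redis2.example.com:6379?weight=2'
);

session_start();
```

Note that this spreads sessions across the listed servers; it does not copy each session to all of them, so replication still has to be configured on the Redis side.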