Memcache Bundle for Symfony updated!

memcache_v2

We have released version 2 of our Symfony Memcache Bundle. The 3 biggest improvements are:

  • Added Windows support
  • PHP 5.4+ without notices and/or warnings
  • High availability support with redundancy configuration

This actually was a major refactoring. We switched from the PECL “memcached” to the PECL “memcache” (without ‘d’) extension. Only version 3 (which is still in beta) of this extension is supported currently, but this version is quite stable actually. Most of the features of the bundle have not been altered. The Memcache(Pool) object is deliberately not abstracted. This is done to allow for unrestricted access to the underlying implementation. The flags parameter is now supported and this is the third argument on a “set” instruction. You may have to alter your code for this, which may be annoying, but we promise to keep the API stable in this major version that may again last for a few years.

Some articles we have written about the bundle are:

Let us know your experience with this new version.

Download LswMemcacheBundle from Github

Limit concurrent PHP requests using Memcache

When you run a website you may want to use nginx reverse proxy to cache some of your static assets and also to limit the amount of connections per client IP to each of your applications. Some good modules for nginx are:

Many people are not running a webfarm, but they still want to protect themselves against scrapers and hackers that may slow the website (or even make it unavailable). The following script allows you to protect your PHP application from too many concurrent connections per IP address. You need to have Memcache installed and you need to be running a PHP web application that uses a front controller.

Installing Memcache for PHP

Run the following command to install Memcache for PHP on a Debian based Linux machine (e.g. Ubuntu):

sudo apt-get install php5-memcache memcached

This is easy. You can flush your Memcache data by running:

telnet 0 11211
flush_all

You may have to restart apache for the Memcache extension to become active.

sudo service apache2 restart

Modifying your front controller

It is as simple as opening up your “index.php” or “app.php” (Symfony) and then pasting in the following code in the top of the file:

<?php
function firewall($concurrency,$spinLock,$interval,$cachePrefix,$reverseProxy)
{
  $start = microtime(true);
  if ($reverseProxy && isset($_SERVER['HTTP_X_FORWARDED_FOR'])) {
    $ip = array_pop(explode(',',$_SERVER['HTTP_X_FORWARDED_FOR']));
  }
  else {
    $ip = $_SERVER['REMOTE_ADDR'];
  }
  $memcache=new Memcache();
  $memcache->connect('127.0.0.1', 11211);
  $key=$cachePrefix.'_'.$ip;
  $memcache->add($key,0,false,$interval);
  register_shutdown_function(function() use ($memcache,$key){ $memcache->decrement($key); });
  while ($memcache->increment($key)>$concurrency) {
    $memcache->decrement($key);
    if (!$spinLock || microtime(true)-$start>$interval) {
      http_response_code(429);
      die('429: Too Many Requests');
    }
    usleep($spinLock*1000000);
  }
}
firewall(10,0.15,300,'fw_concurrency_',false);

Add these lines if you want to test the script in stand-alone mode:

session_start();
session_write_close();
usleep(3000000);

With the default setting you can protect a small WordPress blog as it limits your visitors to do 10 concurrent(!) requests per IP address. Note that this is a lot more than 10 visitors per IP address. A normal visitor does not do concurrent requests to PHP as your browser tends to send only one request at a time. Even multiple users may not do concurrent requests (if you are lucky). In case concurrent requests do happen they will be delayed for “x” times 150 ms until the concurrency level (from that specific IP) is below 10. Other IP addresses are not affected/slowed down.

If you use a reverse proxy you can configure this (to get the correct IP address from the “X-Forwarded-For” header). Also if you set “$spinLock” to “false” then you will serve “429: Too Many Requests” if there are too many concurrent requests instead of stalling the connection.

This functionality is included as the “Firewall” feature of the new MindaPHP framework and also as the firewall functionality in the LeaseWeb Memcache Bundle for Symfony. Let me know what you think about it using the comments below.

MindaPHP now has Memcache support

The PHP framework I am building (MindaPHP) already contains support for MySQL and cURL. Today I have added Memcache support. Memcache can be used for two main purposes in PHP: session storage and application caching. In the framework we only support debugging Memcache for application caching. This is how the debugger looks when the Cache class is used:

mindaphp_cache

The cache is used to store the results from the Bing query. You can try this on: http://maurits.server.nlware.com/hello/bing (click the debugger link in the bottom bar after searching).

Memcache for application caching

Now you can speed up your application using the following commands:

$var = Cache::get($key)
$success = Cache::set($key,$var,$expire=0)
$success = Cache::delete($key)
$success = Cache::add($key,$var,$expire=0)
$success = Cache::replace($key,$var,$expire=0)
$var = Cache::increment($key,$value=1)
$var = Cache::decrement($key,$value=1)

The commands “get” and “set” do retrieval and storage of values in the cache based on the “key” parameter. The commands “add” and “replace” are comparable to “set”, but either fail when the key does (in case of add) or does not (in case of replace) exist. The “increment” and “decrement” commands can be used for counters, but beware that “increment” fails when the key does not exist. This is why you may want to call “add” before you increment.

Memcache for session storage

If you want to use Memcache for session storage in PHP (with any framework), you configured this in “php.ini” with the following statements:

session.save_handler = memcache
session.save_path = "tcp://localhost:11211"

Note that you need the Memcache daemon and the php Memcache extension installed. The following command installs the required software on a Debian based Linux (like Ubuntu):

sudo apt-get install php5-memcache memcached

Have fun accelerating you application!

Symfony2 Memcache session locking

In one of the previous posts we wrote about session reliability. Today we will talk about “locking session data”. This is another session reliability topic and we will look at the problems that may occur in Symfony2 and how to solve them.

Session locking

Session locking is when the web server thread acquires an exclusive lock on the session data to avoid concurrent access. Browsers use HTTP 1.1 keep-alive and would normally just use one open TCP connection and reuse that to get all dynamic content. When loading images (and other static content) the browser may decide to use multiple TCP connections (concurrent) to get the data as fast as possible. This also happens when using AJAX. This may (and will most likely) lead to different workers (threads) on the web server answering these concurrent requests concurrently.

Each of the requests may read the session data update and write it back. The last write wins, so some writes may get lost. This can be countered by applying session locking. The session lock will prevent race conditions from occurring and prevent any corrupted data appearing in the session. This can easily be understood by looking at the following two images.

session-access-without-lockingsession-access-with-locking

The left image shows concurrent requests without session locking and the right shows concurrent requests with session locking. This is very well described in this post by Andy Bakun. Note that the above images are also from that post. Reading the Andy Bakun post allows you to truly understand the session locking problem (and the performance problems that AJAX may cause).

Symfony2 sessions

In Symfony2 one would normally use the NativeFileSessionHandler, which will just use the default PHP session handler. This works flawless in most cases. PHP uses “flock” to acquire an exclusive lock on the local filesystem. But when you scale out and run a server farm with multiple web servers you cannot use the local filesystem. You might be using a shared (NFS) filesystem and run into problems with “flock” (see the Linux NFS FAQs). If you use the database you may run into performance problems, escpecially when applying locking.

This leaves Memcache or Redis as options for session storage. These are fast key/value stores that can be used for session storage. Unfortunately the Symfony2 session storage implementations for Memcache (in Symfony) and Redis (in phpredis) do not implement session locking. This potentially leads to problems, especially when relying on AJAX calls as explained above. Note that other frameworks (like CakePHP) also do not implement session locking when using Memcache as session storage. Edit: This post has inspired the guys from SncRedisBundle and this Symfony2 bundle now supports session locking, which is totally awesome!

Custom save handlers

One can write “Custom Save Handlers” as described by the Symfony2 documentation:

Custom handlers are those which completely replace PHP’s built in session save handlers by providing six callback functions which PHP calls internally at various points in the session workflow. Symfony2 HttpFoundation provides some by default and these can easily serve as examples if you wish to write your own. — Symfony2 documentation

But you should be careful, since the examples do not implement session locking.

LswMemcacheBundle to the rescue

At LeaseWeb we love (to use) Memcache. Therefore, we have built session locking into our LswMemcacheBundle. It actually implements acquiring a “spin lock” with the timeout set to PHP’s “max_execution_time” (defaults to 30 seconds). The spin lock tries to acquire the lock every 150 ms (configurable). It will also hold the lock for a maximum time of the PHP “max_execution_time”. By using Memcache’s built-in key expire mechanism, we can ensure the lock is not held indefinitely.

This (spin-lock) implementation is a port of the session locking code from the memcached PECL module (written in C). Our bundle enables locking by default. If you want, you can disable the locking by setting the “locking” configuration parameter to “false” as described in the documentation.

This session locking code was also ported to SncRedisBundle and submitted as PR #109. LswMemcacheBundle is open-source and can be found on our GitHub account:

https://github.com/LeaseWeb/LswMemcacheBundle

Avoiding the Memcache ‘dog pile’ effect

Normally when you use Memcache you will do something like this:

<?php
// get memcache object
$memcache = $this->get('memcache.default');
// get value from cache
$value = $memcache->get('key')
// if value is not in cache
if ($value===false) {
 // calculate the value
 $value = $this->calculateValue();
 // store the value in the cache for 1 hour
 $value = $memcache->set('key', $value, 3600);
}
// do something with the value
print "value: $value\n";

Now let us examine a high traffic website case and see how that will work:

Your cache is stored for 90 minutes. It takes about 3 second to calculate the cache value and 1 ms second to read from cache the cache value. You have about 5000 requests per second and that the value is cached. You get 5000 requests per second taking about 5000 ms to read the values from cache. You might think that that is not possible since 5000 > 1000, but that depends on the number of worker processes on your web server  Let’s say it is about 100 workers (under high load) with 75 threads each. Your web requests take about 20 ms each. Whenever the cache invalidates (after 90 minutes), during 3 seconds, there will be 15000 requests getting a cache miss. All the threads getting a miss will start to calculate the cache value (because they don’t know the other threads are doing the same). This means that during (almost) 3 seconds the server wont answer a single request, but the requests keep coming in. Since each worker has 75 threads (holding 100 x 75 connections), the amount of workers has to go up to be able to process them.

The heavy forking will cause extra CPU usage and the each worker will use extra RAM. This unexpected increase in RAM and CPU is called a “cache stampede” or “dog-piling” and is very unwelcome during peek hours on a web service. This problem is also explained in this “Memcache dog pile” article. Another good explanation of the problem can be found on github in a SIFO issues called “Cache: Avoid the dogpile effect”.

There is a solution: we serve the old cache entries while calculating the new value and by using an atomic read and write operation we can make sure only one thread will receive a cache miss when the content is invalidated. The algorithm is similar to Nginx’s “proxy_cache_use_stale” with “updating” mechanism and is implemented in AntiDogPileMemcache class in LswMemcacheBundle.

Below you see a test showing how it is working. Since phpunit does not support forking I had to go PHP native to make it work. First you see how to flush the Memcache from command line and after that you see the output when we run the test.

maurits@pc:~/src/Lsw/MemcacheBundle/Tests/Cache$ telnet 0 11211
Trying 0.0.0.0...
Connected to 0.
Escape character is '^]'.
flush_all
OK
quit
Connection closed by foreign host.
maurits@pc:~/src/Lsw/MemcacheBundle/Tests/Cache$ php DogPileTest.php
THREAD | SECOND | STATUS
1 |      0 | STALE!
3 |      0 | STALE!
2 |      0 | STALE!
1 |      0 | SET 1363565115
3 |      0 | SET 1363565115
2 |      0 | SET 1363565115
1 |      1 | 1363565115
3 |      1 | 1363565115
2 |      1 | 1363565115
1 |      2 | 1363565115
3 |      2 | 1363565115
2 |      2 | 1363565115
1 |      3 | STALE!
3 |      3 | 1363565115
2 |      3 | 1363565115
1 |      3 | SET 1363565119
2 |      4 | 1363565119
3 |      4 | 1363565119
1 |      4 | 1363565119
2 |      5 | 1363565119
3 |      5 | 1363565119
2 |      6 | 1363565119
3 |      6 | 1363565119
1 |      5 | 1363565119
1 |      6 | 1363565119
3 |      7 | STALE!
2 |      7 | 1363565119
3 |      7 | SET 1363565123
1 |      7 | 1363565123
2 |      8 | 1363565123
3 |      8 | 1363565123
1 |      8 | 1363565123
2 |      9 | 1363565123
1 |      9 | 1363565123
3 |      9 | 1363565123
maurits@pc:~/src/Lsw/MemcacheBundle/Tests/Cache$

Before we get to the code of the test and you are going to apply this everywhere, I want to stress the following considerations:

  • ADP might not be needed if you have low amount of hits or when calculating the new value goes relatively fast.
  • ADP might not be needed if you can break up the big calculation into smaller, maybe even with different timeouts for each part.
  • ADP might get you older data than the invalidation that is specified. Especially when a thread/worker gets “false” for “get” request, but fails to “set” the new calculated value afterwards.
  • ADP “get” and ADP “set” are more expensive than the normal “get” and “set”, slowing down all cache hits.
  • ADP does not guarantee that the dog pile will not occur. Restarting Memcache, flushing data or not enough RAM will also get keys evicted and you will run into the problem anyway.

Now this is the code used to run the test:

<?php
namespace Lsw\MemcacheBundle\Tests\Cache;

use Lsw\MemcacheBundle\Cache\AntiDogPileMemcache;

require_once "../../Cache/LoggingMemcacheInterface.php";
require_once "../../Cache/LoggingMemcache.php";
require_once "../../Cache/AntiDogPileMemcache.php";

class DogPileTest //extends \PHPUnit_Framework_TestCase
{
  public function testDogPile()
  {
    for ($t=1; $t<3; $t++) {
      $pid = pcntl_fork();
      if ($pid == -1) {
        die('could not fork');
      }
      if ($pid==0) {
        break;
      }
    }
    $c = 10;
    $m = new AntiDogPileMemcache(false);
    $m->addServer('localhost', 11211);
    if ($t==1) {
      echo "THREAD | SECOND | STATUS\n";
    }
    for ($i=0; $i<$c; $i++) {
      sleep(1);
      if (false === ($v = $m->getAdp('key'))) {
        echo sprintf("%6s | %6s | %s\n", $t, $i, "STALE!");
        sleep(1);
        $v = time();
        $m->setAdp('key', $v, 2);
        echo sprintf("%6s | %6s | %s\n", $t, $i, "SET $v");
      } else {
        echo sprintf("%6s | %6s | %s\n", $t, $i, $v);
      }
    }
    sleep(3);
  }
}

$t = new DogPileTest();
$t->testDogPile();

For further reading on applying memcache I highly recommend this article from IBM.