Avoiding the Memcache ‘dog pile’ effect

Normally when you use Memcache you will do something like this:

<?php
// get memcache object
$memcache = $this->get('memcache.default');
// get value from cache
$value = $memcache->get('key')
// if value is not in cache
if ($value===false) {
 // calculate the value
 $value = $this->calculateValue();
 // store the value in the cache for 1 hour
 $value = $memcache->set('key', $value, 3600);
}
// do something with the value
print "value: $value\n";

Now let us examine a high traffic website case and see how that will work:

Your cache is stored for 90 minutes. It takes about 3 second to calculate the cache value and 1 ms second to read from cache the cache value. You have about 5000 requests per second and that the value is cached. You get 5000 requests per second taking about 5000 ms to read the values from cache. You might think that that is not possible since 5000 > 1000, but that depends on the number of worker processes on your web server  Let’s say it is about 100 workers (under high load) with 75 threads each. Your web requests take about 20 ms each. Whenever the cache invalidates (after 90 minutes), during 3 seconds, there will be 15000 requests getting a cache miss. All the threads getting a miss will start to calculate the cache value (because they don’t know the other threads are doing the same). This means that during (almost) 3 seconds the server wont answer a single request, but the requests keep coming in. Since each worker has 75 threads (holding 100 x 75 connections), the amount of workers has to go up to be able to process them.

The heavy forking will cause extra CPU usage and the each worker will use extra RAM. This unexpected increase in RAM and CPU is called a “cache stampede” or “dog-piling” and is very unwelcome during peek hours on a web service. This problem is also explained in this “Memcache dog pile” article. Another good explanation of the problem can be found on github in a SIFO issues called “Cache: Avoid the dogpile effect”.

There is a solution: we serve the old cache entries while calculating the new value and by using an atomic read and write operation we can make sure only one thread will receive a cache miss when the content is invalidated. The algorithm is similar to Nginx’s “proxy_cache_use_stale” with “updating” mechanism and is implemented in AntiDogPileMemcache class in LswMemcacheBundle.

Below you see a test showing how it is working. Since phpunit does not support forking I had to go PHP native to make it work. First you see how to flush the Memcache from command line and after that you see the output when we run the test.

maurits@pc:~/src/Lsw/MemcacheBundle/Tests/Cache$ telnet 0 11211
Trying 0.0.0.0...
Connected to 0.
Escape character is '^]'.
flush_all
OK
quit
Connection closed by foreign host.
maurits@pc:~/src/Lsw/MemcacheBundle/Tests/Cache$ php DogPileTest.php
THREAD | SECOND | STATUS
1 |      0 | STALE!
3 |      0 | STALE!
2 |      0 | STALE!
1 |      0 | SET 1363565115
3 |      0 | SET 1363565115
2 |      0 | SET 1363565115
1 |      1 | 1363565115
3 |      1 | 1363565115
2 |      1 | 1363565115
1 |      2 | 1363565115
3 |      2 | 1363565115
2 |      2 | 1363565115
1 |      3 | STALE!
3 |      3 | 1363565115
2 |      3 | 1363565115
1 |      3 | SET 1363565119
2 |      4 | 1363565119
3 |      4 | 1363565119
1 |      4 | 1363565119
2 |      5 | 1363565119
3 |      5 | 1363565119
2 |      6 | 1363565119
3 |      6 | 1363565119
1 |      5 | 1363565119
1 |      6 | 1363565119
3 |      7 | STALE!
2 |      7 | 1363565119
3 |      7 | SET 1363565123
1 |      7 | 1363565123
2 |      8 | 1363565123
3 |      8 | 1363565123
1 |      8 | 1363565123
2 |      9 | 1363565123
1 |      9 | 1363565123
3 |      9 | 1363565123
maurits@pc:~/src/Lsw/MemcacheBundle/Tests/Cache$

Before we get to the code of the test and you are going to apply this everywhere, I want to stress the following considerations:

  • ADP might not be needed if you have low amount of hits or when calculating the new value goes relatively fast.
  • ADP might not be needed if you can break up the big calculation into smaller, maybe even with different timeouts for each part.
  • ADP might get you older data than the invalidation that is specified. Especially when a thread/worker gets “false” for “get” request, but fails to “set” the new calculated value afterwards.
  • ADP “get” and ADP “set” are more expensive than the normal “get” and “set”, slowing down all cache hits.
  • ADP does not guarantee that the dog pile will not occur. Restarting Memcache, flushing data or not enough RAM will also get keys evicted and you will run into the problem anyway.

Now this is the code used to run the test:

<?php
namespace Lsw\MemcacheBundle\Tests\Cache;

use Lsw\MemcacheBundle\Cache\AntiDogPileMemcache;

require_once "../../Cache/LoggingMemcacheInterface.php";
require_once "../../Cache/LoggingMemcache.php";
require_once "../../Cache/AntiDogPileMemcache.php";

class DogPileTest //extends \PHPUnit_Framework_TestCase
{
  public function testDogPile()
  {
    for ($t=1; $t<3; $t++) {
      $pid = pcntl_fork();
      if ($pid == -1) {
        die('could not fork');
      }
      if ($pid==0) {
        break;
      }
    }
    $c = 10;
    $m = new AntiDogPileMemcache(false);
    $m->addServer('localhost', 11211);
    if ($t==1) {
      echo "THREAD | SECOND | STATUS\n";
    }
    for ($i=0; $i<$c; $i++) {
      sleep(1);
      if (false === ($v = $m->getAdp('key'))) {
        echo sprintf("%6s | %6s | %s\n", $t, $i, "STALE!");
        sleep(1);
        $v = time();
        $m->setAdp('key', $v, 2);
        echo sprintf("%6s | %6s | %s\n", $t, $i, "SET $v");
      } else {
        echo sprintf("%6s | %6s | %s\n", $t, $i, $v);
      }
    }
    sleep(3);
  }
}

$t = new DogPileTest();
$t->testDogPile();

For further reading on applying memcache I highly recommend this article from IBM.

Share

Leave a Reply

Your email address will not be published. Required fields are marked *