A High-Performance Logger with PHP

Recently, at Leaseweb, another successful Hackathon came to an end. There was a lot of fun, a lot of coding, a lot of coffee, and a lot of cool ideas arose during these two days. In this blog post we want to share one of these ideas that made us proud and that was really fun to work on.

1. The motivation

At Leaseweb, we strive to know our customers better so that we can actively empower them. For that, we need to start logging events that have value for the business. So basically we want to implement a simple technical service (let’s call it a ‘Logging Service’) that accepts any kind of event with some rich data (in other words, a payload) and then logs it, without interfering with the execution of the client.

On each request, our Logging Service would immediately return an “OK” so that the client can continue with its own execution, while the service keeps logging the events asynchronously.

2. Our tech stack and our tech choice for POC

Typically at Leaseweb, we have a pretty standardized technology stack that revolves around PHP+Apache or PHP+Nginx. If a server or an application is built on this kind of stack, we are bound to typical synchronous execution: the client sends a request to our application (in this case our Logging Service) and has to wait until our application sends back a response after it finishes all its tasks. This is not an ideal scenario. We need a service that runs asynchronously, a service that receives the request, says “OK” so that the client can continue with its own execution, and then does its job.

In the market, there are several tools that would enable us to do this, such as NodeJS or Golang, or even message-queuing services that could be wired into our PHP stack. But as we are PHP enthusiasts at Leaseweb, we wanted to use our language of choice without adding dependencies. That is how we discovered ReactPHP (no relation to the front-end framework React). ReactPHP is a pure PHP library that allows the developer to do some cool reactive programming in PHP by running the code on an endless event loop :).

3. Implementation of the POC

With some ReactPHP libraries, we can handle HTTP requests in PHP itself. We no longer need a web server that handles the requests and creates PHP processes for us. We can just create a pure PHP process that handles everything for us, and if we want to make full use of our machine's hardware, we can create several PHP processes and put them behind a load balancer.
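
To make this a bit more concrete, here is a minimal sketch of such a pure PHP HTTP server. It assumes a recent ReactPHP setup (the reactphp/http and reactphp/socket packages installed via Composer); the port number is arbitrary.

<?php
require 'vendor/autoload.php';

use Psr\Http\Message\ServerRequestInterface;
use React\Http\HttpServer;
use React\Http\Message\Response;
use React\Socket\SocketServer;

// One long-running PHP process handles every incoming request in this callback.
$http = new HttpServer(function (ServerRequestInterface $request) {
  return new Response(200, ['Content-Type' => 'text/plain'], "Hello World!\n");
});

// Listen on a port; start several of these processes (e.g. on ports 8081-8084)
// and put them behind a load balancer to use all CPU cores.
$http->listen(new SocketServer('0.0.0.0:8081'));

// The event loop keeps the process alive and serving requests.
React\EventLoop\Loop::run();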

After choosing our technology we started playing around and implementing our idea. In order to make sure that asynchronous PHP would solve the majority of our concerns, we started benchmarking.

First we needed to script something to benchmark. Therefore we implemented 3 different scenarios:

  1. The all-time-favorite: An endpoint that prints Hello World.
  2. An API endpoint that calls a 3rd party API that takes 2 seconds to reply and proxies its response.
  3. An API endpoint that accepts a payload via POST and logs it via HTTP to Elasticsearch (this is what we really want).

We implemented these 3 scenarios in two different stacks:

  • Stack A
    Traditional PHP+FPM+Nginx
  • Stack X
    Four PHP processes running on a ReactPHP event loop behind an Nginx load balancer. We chose four processes solely because the number looks good :). There are some theories about how many such processes should run on a single machine, but we will not go into further detail here. Note that Stack A and Stack X ran on hardware servers with the exact same specs; both had an 8-core CPU.

Then we ran some stress tests with Locust:

3.1 1st Benchmark – Hello world!

For the first benchmark, we just wanted to see how the implementation of Scenario 1 would behave in both of our stacks.

Figure 1: Hello World with Stack X
Figure 2: Hello World with Stack A

We can see that both stacks perform very similarly. The reason is that the computation needed to print a “Hello World!” is minimal, so both stacks can handle a large number of requests reliably.

The real power of asynchronous code shows when we need to deal with input/output (I/O: reading from a database, an API, the filesystem, etc.), because these operations are time-consuming (see [zhuk:2026:event-driven-with-php]). I/O is slow and CPU computation is fast. By going with an asynchronous approach, our program can execute other computations while waiting for I/O.
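
As a rough sketch of what this looks like in code (assuming recent versions of the reactphp/http and reactphp/event-loop packages; the slow URL is made up), a non-blocking HTTP call returns a promise immediately, and the event loop is free to run other work, such as timers, while the response is still on its way:

<?php
require 'vendor/autoload.php';

use Psr\Http\Message\ResponseInterface;
use React\EventLoop\Loop;
use React\Http\Browser;

$browser = new Browser();

// Starts the request and returns a promise right away instead of blocking.
$browser->get('https://slow-api.example.com/report')->then(
  function (ResponseInterface $response) {
    echo 'API answered with status ' . $response->getStatusCode() . "\n";
  },
  function (Exception $error) {
    echo 'API call failed: ' . $error->getMessage() . "\n";
  }
);

// These timers fire while the HTTP request above is still in flight.
Loop::addTimer(0.1, function () { echo "Doing other work...\n"; });
Loop::addTimer(0.5, function () { echo "...and even more work.\n"; });

Loop::run();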

It is time to try this theory with the next benchmark:

3.2 2nd Benchmark – Response from a 3rd-party API

Following what we described in the previous section, the power of asynchronous code comes when we deal with input and output. So we were curious to find out how the APIs would behave if they needed to call a 3rd-party API that takes 2 seconds to respond and then forward its response.
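
Before looking at the numbers, here is roughly how Scenario 2 can be expressed on Stack X. This is a sketch assuming reactphp/http, with a fictional 3rd-party URL: the handler returns a promise, so the process stays free to accept other requests while the slow API is still responding.

<?php
require 'vendor/autoload.php';

use Psr\Http\Message\ResponseInterface;
use Psr\Http\Message\ServerRequestInterface;
use React\Http\Browser;
use React\Http\HttpServer;
use React\Http\Message\Response;
use React\Socket\SocketServer;

$browser = new Browser();

$http = new HttpServer(function (ServerRequestInterface $request) use ($browser) {
  // The promise resolves when the slow API answers (about 2 seconds later);
  // only then is our response sent, without ever blocking the event loop.
  return $browser->get('http://slow-api.example.com/endpoint')->then(
    function (ResponseInterface $apiResponse) {
      return new Response(200, ['Content-Type' => 'application/json'], (string) $apiResponse->getBody());
    }
  );
});

$http->listen(new SocketServer('0.0.0.0:8082'));
React\EventLoop\Loop::run();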

Let’s run the tests:

Figure 3: Response from a 3rd-party API with Stack X
Figure 4: Response from a 3rd-party API with Stack A

The benchmarks illustrated in figures 3 and 4 already show very interesting results. We ran a stress test that gradually spawns 100 concurrent users, each sending one request per second, and we can clearly see that as the number of concurrent users grows, Stack A becomes less and less responsive. Stack A ends up with an average response time of 40 seconds, while Stack X maintains an average response time of 2 seconds (the time the 3rd-party API takes to respond).

With Stack A, each request creates a process that stays idle until the 3rd-party API responds. The implication is that, at some point, we will have hundreds of idle processes waiting for a reply, which puts an immense load on our machine’s resources.

Stack X performs exceedingly well. This is because the same processes that wait for the reply from the 3rd-party API keep doing other work in the meantime, for example handling other HTTP requests and incoming responses from the 3rd-party API. With this, we achieve much more efficiency in our stack.

After observing these results we wanted to push it a bit harder – we wanted to see whether we could break Stack A entirely. So we decided to run the same stress test for this scenario but this time with 1000 concurrent users.

Figure 5: Response from a 3rd-party API with Stack X with 1000 users
Figure 6: Response from a 3rd-party API with Stack A with 1000 users

We did it! We can see that at some point Stack A is unable to handle the requests anymore so it stops responding completely after reaching an average response time of 60 seconds. Stack X remains perfectly smooth with an average response time of 2 seconds :).

3.3 3rd Benchmark – Logging payloads to Elasticsearch

It was fun to see how the stacks behave in the previous scenarios, but we also wanted to see how they behave in a real-world scenario. Next, we wanted our API to accept a JSON payload via an HTTP POST and, to keep it simple, log it to an Elasticsearch cluster via HTTP (Scenario 3).

How the stacks work in a nutshell:

  • Stack X receives an HTTP POST request with the payload, sends a response to the client saying OK and then logs it to Elasticsearch asynchronously (sketched in code below).
  • Stack A receives an HTTP POST request with the payload, logs it to Elasticsearch and only then sends a response to the client saying OK.
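
A sketch of the Stack X handler (again assuming reactphp/http; the Elasticsearch URL and index are made up) could look like this:

<?php
require 'vendor/autoload.php';

use Psr\Http\Message\ServerRequestInterface;
use React\Http\Browser;
use React\Http\HttpServer;
use React\Http\Message\Response;
use React\Socket\SocketServer;

$browser = new Browser();

$http = new HttpServer(function (ServerRequestInterface $request) use ($browser) {
  $payload = (string) $request->getBody();

  // Fire-and-forget: the POST to Elasticsearch happens asynchronously...
  $browser->post(
    'http://elasticsearch:9200/events/_doc',
    ['Content-Type' => 'application/json'],
    $payload
  )->then(null, function ($error) {
    fwrite(STDERR, 'Failed to log event: ' . $error->getMessage() . "\n");
  });

  // ...while the client immediately gets its "OK" and can move on.
  return new Response(200, ['Content-Type' => 'text/plain'], "OK\n");
});

$http->listen(new SocketServer('0.0.0.0:8083'));
React\EventLoop\Loop::run();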

Let’s bombard it with Locust again, and why not with 1000 concurrent connections right away:

Figure 7: Logging payloads with Stack X
Figure 8: Logging payloads with Stack A

We can see that we can achieve a pretty reliable, high-performance logger, and all of this with pure PHP code!

Because our intention was always to push the limits, we chose to run this benchmark with 1000 concurrent users being spawned gradually. Stack A at some point stops handling the requests, while Stack X consistently keeps a pretty good response time of around 10 ms.

4. What can we use it for now?

With this experiment, we pretty much built a central logging service! Coming back to our main motivation, we can use it to log whatever we want, and we want to start logging meaningful domain events from any application within our system with a simple non-blocking HTTP request. By logging meaningful events, we get to know our customers better, and if we log all of this into Elasticsearch we can also start building cool graphs from it. For example:

Figure 9: Graph with Business Events

Figure 10: Graph with End-user Actions

Since this approach is so responsive, we can even start using it to log anything and maybe everything through one central endpoint: system monitoring, near-real-time analytics, domain events, trends, and so on. And all of this with pure PHP :).

5. Cons of the approach and future work

When using ReactPHP there are some important considerations, which in some scenarios can be seen as cons, and which usually do not apply to projects that follow an architecture similar to Stack A:

  • ReactPHP uses reactive/event-driven programming, a paradigm that can have a steep learning curve.
  • Long-running PHP processes can lead to memory leaks, and in case of failure they can affect all current connections to the server.
  • These processes need to be monitored carefully and constantly in order to predict and avoid fatal failures.
  • Using blocking functions (functions that block the execution of the code) will massively affect performance for all connections to the server.

Also, some extra work on the operational/infra side is needed to continuously check the processes’ health and automatically spawn new ones if something goes wrong. We also need to adapt the way we deploy code: processes should be restarted sequentially when new code goes out, so that our service can finish serving the requests it has queued at that moment.
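
A minimal sketch of this kind of in-process housekeeping, assuming reactphp/event-loop and the pcntl extension for signal handling (the interval and the log target are arbitrary):

<?php
require 'vendor/autoload.php';

use React\EventLoop\Loop;

// Periodically report memory usage so an external monitor can watch a
// long-running worker for leaks.
Loop::addPeriodicTimer(60, function () {
  fwrite(STDERR, sprintf("memory: %.1f MiB\n", memory_get_usage(true) / 1048576));
});

// On SIGTERM (e.g. during a rolling deploy) stop the event loop; a real worker
// would first close its listening socket and finish the requests in flight.
Loop::addSignal(SIGTERM, function () {
  Loop::stop();
});

Loop::run();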

6. Conclusion

Pushing the limits of our preferred technology is one of the most fun things to do. PHP is not a usual choice when a high-performing application is needed, but thanks to a lot of great work from the community, solutions like ReactPHP are starting to emerge. This opens a new path to discovering new programming paradigms, introduces different mindsets on how to approach a problem and challenges the knowledge we have of the technology.

Challenging what we already know is one of the most interesting things we can do, because it takes us out of our comfort zone and helps us become more mature. It is really fun, and it makes us realise that we can never know everything.

We would like to thank and acknowledge everybody in the communities around the tools we used in this fun experiment 🙂


by Joao Castro and Elrich Faul


PHP-CRUD-API now supports authorization and validation

Another milestone has been reached for the PHP-CRUD-API project: a project that aims to provide a high-performance, consistent data API over REST that is easy to deploy (it is a single PHP file!) and requires minimal configuration. By popular demand we have added four important new features:

  1. Tables and the actions on them can be restricted with custom rules.
  2. Access to specific columns can be restricted using your own algorithm.
  3. You can specify “sanitizers” to, for example, strip HTML tags from input.
  4. You can specify “validator” functions to show errors on invalid input.

These features are built by allowing you to define callback functions in your configuration. These functions can then contain your application-specific logic. How these functions work and how you can load them is explained below.

Table authorizer

The following function can be used to authorize access to specific tables:

/**
 * @param action    'create','read','update','delete','list'
 * @param database  name of your database (e.g. 'northwind')
 * @param table     name of the table (e.g. 'customers')
 * @returns bool    indicates that access is granted  
 **/
  
$f1=function($action,$database,$table){
  return true; 
};

Column authorizer

The following function can be used to authorize access to specific columns:

/**
 * @param action    'create','read','update','delete','list'
 * @param database  name of your database (e.g. 'northwind')
 * @param table     name of the table (e.g. 'customers')
 * @param column    name of the column (e.g. 'password')
 * @returns bool    indicates that access is granted  
 **/
  
$f2=function($action,$database,$table,$column){
  return true; 
};

Input sanitizer

The following function can be used to sanitize input for specific columns:

/**
 * @param action    'create','read','update','delete','list'
 * @param database  name of your database (e.g. 'northwind')
 * @param table     name of the table (e.g. 'customers')
 * @param column    name of the column (e.g. 'username')
 * @param type      type of the column (depends on engine)
 * @param value     input from the user (e.g. 'johndoe88')
 * @returns string  sanitized value
 **/
  
$f3=function($action,$database,$table,$column,$type,$value){
  return $value; 
};

Input validator

The following function can be used to validate input for specific columns:

/**
 * @param action    'create','read','update','delete','list'
 * @param database  name of your database (e.g. 'northwind')
 * @param table     name of the table (e.g. 'customers')
 * @param column    name of the column (e.g. 'username')
 * @param type      type of the column (depends on engine)
 * @param value     input from the user (e.g. 'johndoe88')
 * @param context   all input fields in this action
 * @returns string  validation error (if any) or null
 **/
  
$f4=function($action,$database,$table,$column,$type,$value,$context){
  return null;
};

Configuration

This is an example configuration that requires the above snippets to be defined.

$api = new MySQL_CRUD_API(array(
  'hostname'=>'localhost',
  'username'=>'xxx',
  'password'=>'xxx',
  'database'=>'xxx',
  'charset'=>'utf8',
  'table_authorizer'=>$f1,
  'column_authorizer'=>$f2,
  'input_sanitizer'=>$f3,
  'input_validator'=>$f4
));
$api->executeCommand();
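
For illustration, here is a hypothetical, more restrictive set of callbacks using the same signatures (the table, column and field names are made up):

$f1=function($action,$database,$table){
  // hide the 'users' table completely
  return $table!='users';
};
$f2=function($action,$database,$table,$column){
  // never expose a 'password' column
  return $column!='password';
};
$f3=function($action,$database,$table,$column,$type,$value){
  // strip HTML tags from all string input
  return is_string($value)?strip_tags($value):$value;
};
$f4=function($action,$database,$table,$column,$type,$value,$context){
  // require the 'age' field to contain a number
  if ($column=='age' && !is_numeric($value)) return 'must be numeric';
  return null;
};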

You can find the project on Github.


PHP script to tail a log file using telnet


Why would you need a PHP script to tail a log file using telnet? You don’t! But the script is cool anyway. It allows you to connect to your web server over telnet, talk some HTTP to it, and run a PHP script that shows a tail of a log file. It uses ANSI sequences (colors!) to provide a nice user interface specifically for tailing a log file with the “follow” option (like “tail -f”). Below you find the PHP script that you have to put on the web server:

<?php
// configuration
$file = '/var/log/apache2/access.log';
$ip = '127.'; // only allow clients whose IP address starts with this prefix
// start of script
$title = "\033[H\033[2K$file"; // ANSI: move the cursor home, clear the line, print the file name
if (strpos($_SERVER['REMOTE_ADDR'],$ip)!==0) die('Access Denied');
$stream = fopen($file, 'r');
if (!$stream) die("Could not open file: $file\n");
echo "\033[m\033[2J"; // ANSI: reset attributes and clear the screen
fseek($stream, 0, SEEK_END); // start tailing from the end of the file
echo str_repeat("\n",4500)."\033[s$title"; // padding to force a flush, save the cursor position, show the title
flush();
while(true){
  $data = stream_get_contents($stream); // read whatever was appended since the last read
  if ($data) {
    // ANSI: switch to green, restore the cursor, print the new lines, save the cursor,
    // then pad with resets to force a flush and redraw the title
    echo "\033[32m\033[u".$data."\033[s".str_repeat("\033[m",1500)."$title";
    flush();
  }
  usleep(100000); // poll every 100 ms, like "tail -f"
}
fclose($stream); // never reached; the loop runs until the connection is closed

To tail (and follow) a remote file you need to talk HTTP to the web server using telnet and request the PHP tail script. First you connect using telnet:

$ telnet localhost 80
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.

After connecting you have to “speak” some HTTP (just type this):

GET /tail.php HTTP/1.1
Host: localhost

NB: Make sure you end the above telnet commands with an empty line! After this, the screen should clear and any new log lines will appear in real-time, in green, in the telnet window.

You can use Ctrl + ‘]’ to get to the telnet prompt and type “quit” to exit.

If you don’t want to copy the code above, then you can also find the latest version of tail.php on Github.


Creating a simple REST API in PHP

I’m the author of php-crud-api and I want to share the core of the application with you. It covers routing a JSON REST request, converting it into SQL, executing it and giving a meaningful response. I tried to keep the application as short as possible and came up with these 65 lines of code:

<?php

// get the HTTP method, path and body of the request
$method = $_SERVER['REQUEST_METHOD'];
$request = explode('/', trim($_SERVER['PATH_INFO'],'/'));
$input = json_decode(file_get_contents('php://input'),true);

// connect to the mysql database
$link = mysqli_connect('localhost', 'user', 'pass', 'dbname');
mysqli_set_charset($link,'utf8');

// retrieve the table and key from the path
$table = preg_replace('/[^a-z0-9_]+/i','',array_shift($request));
$key = array_shift($request)+0;

// escape the columns and values from the input object
$columns = preg_replace('/[^a-z0-9_]+/i','',array_keys($input));
$values = array_map(function ($value) use ($link) {
  if ($value===null) return null;
  return mysqli_real_escape_string($link,(string)$value);
},array_values($input));

// build the SET part of the SQL command
$set = '';
for ($i=0;$i<count($columns);$i++) {
  $set.=($i>0?',':'').'`'.$columns[$i].'`=';
  $set.=($values[$i]===null?'NULL':'"'.$values[$i].'"');
}

// create SQL based on HTTP method
switch ($method) {
  case 'GET':
    $sql = "select * from `$table`".($key?" WHERE id=$key":''); break;
  case 'PUT':
    $sql = "update `$table` set $set where id=$key"; break;
  case 'POST':
    $sql = "insert into `$table` set $set"; break;
  case 'DELETE':
    $sql = "delete from `$table` where id=$key"; break;
}

// execute SQL statement
$result = mysqli_query($link,$sql);

// die if SQL statement failed
if (!$result) {
  http_response_code(404);
  die(mysqli_error($link));
}

// print results, insert id or affected row count
if ($method == 'GET') {
  if (!$key) echo '[';
  for ($i=0;$i<mysqli_num_rows($result);$i++) {
    echo ($i>0?',':'').json_encode(mysqli_fetch_object($result));
  }
  if (!$key) echo ']';
} elseif ($method == 'POST') {
  echo mysqli_insert_id($link);
} else {
  echo mysqli_affected_rows($link);
}

// close mysql connection
mysqli_close($link);

This code is written to show you how simple it is to make a fully operational REST API in PHP.

Running

Save this file as “api.php” in your (Apache) document root and call it using:

http://localhost/api.php/{$table}/{$id}

Or you can use the PHP built-in webserver from the command line using:

$ php -S localhost:8888 api.php

The URL when running it from the command line is:

http://localhost:8888/api.php/{$table}/{$id}

NB: Don’t forget to adjust the ‘mysqli_connect’ parameters in the above script!

REST API in a single PHP file

Although the above code is not perfect, it actually does 3 important things:

  1. Support HTTP verbs GET, POST, PUT and DELETE
  2. Escape all data properly to avoid SQL injection
  3. Handle null values correctly

One could thus say that the REST API is fully functional. You may run into missing features of the code, such as:

  1. No related data (automatic joins) supported
  2. No condensed JSON output supported
  3. No support for PostgreSQL or SQL Server
  4. No POST parameter support
  5. No JSONP/CORS cross domain support
  6. No base64 binary column support
  7. No permission system
  8. No search/filter support
  9. No pagination or sorting supported
  10. No column selection supported

Don’t worry, all these features are available in php-crud-api, which you can get from Github. On the other hand, now that you have the essence of the application, you may also write your own!


PHP-CRUD-API now supports PostgreSQL 9

For the past months I have been building PHP-CRUD-API (formerly MySQL-CRUD-API). It is a single PHP file that provides an instant, powerful and consistent REST API for a MySQL, PostgreSQL or MS SQL Server database. The application uses reflection to “detect” the table structure and then provides an API in only a few hundred lines of PHP code. It is quite comparable to the REST functionality of the experimental HTTP plugin in MySQL 5.7.

Production performance < 10 ms

I recently finished the test suites and started using it in production. I’ve added some simple .htaccess based firewalling to ensure that only trusted applications can talk to it. Most REST API calls are handled well under 10 ms, so the performance impact on the consuming web application is acceptable. It manages to keep page loads under 100 ms even when doing several REST API calls.

PostgreSQL support and more

Recently PostgreSQL was added as a supported database. With the addition of this third database backend I also changed the name from MySQL-CRUD-API to PHP-CRUD-API. Other features that were recently added are support for “CORS pre-flight requests” (mainly for AngularJS) and “JSON batch insert”. Feature requests are very welcome on the php-crud-api Github page.

Contributions / Future

If you feel like contributing, then maybe these topics inspire you:

  1. Set up Travis automated tests
  2. Add an API documentation generator
  3. Create a plugin system for authentication, authorization and accounting
  4. Port to NodeJS, Java or C#

If you like the project, please give it a star on Github!
