PHP script to tail a log file using telnet

tail

Why would you need a PHP script to tail a log file using telnet? You don’t! But it the script is cool anyway. It allows you to connect to your web server over telnet, talk some HTTP to your web server, and run a PHP script that shows a tail of a log file. It uses ANSI sequences (colors!) to provide a nice user interface specifically to tail a log file with the “follow” option (like “tail -f”). Below you find the PHP script that you have to put on the web server:

<?php
// configuration
$file = '/var/log/apache2/access.log';
$ip = '127.';
// start of script
$title = "\033[H\033[2K$file";
if (strpos($_SERVER['REMOTE_ADDR'],$ip)!==0) die('Access Denied');
$stream = fopen($file, 'r');
if (!$stream) die("Could not open file: $file\n");
echo "\033[m\033[2J";
fseek($stream, 0, SEEK_END);
echo str_repeat("\n",4500)."\033[s$title";
flush();
while(true){
  $data = stream_get_contents($stream);
  if ($data) {
    echo "\033[32m\033[u".$data."\033[s".str_repeat("\033[m",1500)."$title";
    flush();
  }
  usleep(100000);
}
fclose($stream);

To tail (and follow) a remote file you need to talk HTTP to the web server using telnet and request to load the PHP tail script. First you connect using telnet:

$ telnet localhost 80
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.

After connecting you have to “speak” some HTTP (just type this):

GET /tail.php HTTP/1.1
Host: localhost

NB: Make sure you end the above telnet commands with an empty line! After this the screen should be empty showing any new log lines in real-time in green on the telnet window.

You can use Ctrl + ‘]’ to get to the telnet prompt and type “quit” to exit.

If you don’t want to copy the code above, then you can also find the latest version of tail.php on Github.

Simple web application firewall using .htaccess

Apache provides a simple web application firewall by a allowing for a “.htaccess” file with certain rules in it. This is a file you put in your document root and may restrict or allow access from certain specific IP addresses. NB: These commands may also be put directly in the virtual host configuration file in “/etc/apache2/sites-available/”.

Use Case #1: Test environment

Sometimes you may want to lock down a site and only grant access from a limited set of IP addresses. The following example (for Apache 2.2) only allows access from the IP address “127.0.0.1” and blocks any other request:

Order Allow,Deny
Deny from all
Allow from 127.0.0.1

In Apache 2.4 the syntax has slightly changed:

Require all denied
Require ip 127.0.0.1

You can find your IP address on: whatismyipaddress.com

Use Case #2: Application level firewall

If you run a production server and somebody is abusing your system with a lot of requests then you may want to block a specific IP address. The following example (for Apache 2.2) only blocks access from the IP address “172.28.255.2” and allows any other request:

Order deny,allow
Allow from all
Deny from 172.28.255.2

In Apache 2.4 the syntax has slightly changed:

Require all granted
Require not ip 172.28.255.2

If you want to block an entire range you may also specify CIDR notation:

Require all granted
Require not ip 10.0.0.0/8
Require not ip 172.16.0.0/12
Require not ip 192.168.0.0/16

NB: Not only IPv4, but also IPv6 addresses may be used.

RewriteCond and RewriteRule tricks for .htaccess

The Apache web server has a module called “mod_rewrite”. It allows for redirecting and modifying the requested URL. Below are some of the most popular modifications and redirects that can be executed. Put these commands in a  “.htaccess” file in the document root of the web site. In order of popularity:

#1. Redirect everything the www subdomain

With or without “www”? Not such a hard question considering that you can answer on both and redirect one. This snippet, which can be combined with the previous one, redirects all non-www requests to the www subdomain:

<IfModule mod_rewrite.c>
 RewriteEngine On
 RewriteCond %{HTTP_HOST} !^www\. [NC]
 RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]
</IfModule>

Having your website on the “www” subdomain may be beneficial when dealing with CDN or security services.

#2. Redirect everything to HTTPS

There is hardly any reason not to run your website over SSL (HTTPS). Also it is very important that you answer to both HTTP and HTTPS. This snippet redirects all non-HTTPS traffic to HTTPS:

<IfModule mod_rewrite.c>
 RewriteEngine On
 RewriteCond %{HTTPS} !on [NC]
 RewriteRule ^(.*)$ https://%{HTTP_HOST}/$1 [R=301,L]
</IfModule>

It is important to choose one true (canonical) URL for SEO reasons.

#3. PHP file to handle all non-static requests

Also known as the front controller pattern. This mechanism is the basis for any web framework. In PHP it allows you to read the actual requested path in the $_SERVER[‘REQUEST_URI’] global variable. The rewrite looks like this:

<IfModule mod_rewrite.c>
 RewriteEngine On
 RewriteCond %{REQUEST_FILENAME} !-d
 RewriteCond %{REQUEST_FILENAME} !-f
 RewriteRule ^(.*)$ index.php [QSA,L]
</IfModule>

Note that the PHP file is bypassed for existing files (static content).

#4. Rewrite GET parameter to URL part

If you have an URL that you should be calling with “GET /orders?id=13” and you want it to respond as if “GET /orders/13” was called, then you may use the following:

<IfModule mod_rewrite.c>
 RewriteEngine On
 RewriteCond %{REQUEST_URI} ^/orders [NC]
 RewriteCond %{QUERY_STRING} ^id=([0-9]+)$ [NC]
 RewriteRule ^(.*)$ %{REQUEST_URI}/%1\? [R,L]
</IfModule>

This is especially useful when migrating URL schemes and need legacy support. Note how the escaped question mark at the end of the “RewriteRule” removes the GET parameter(s).

 

GoAccess is a top for Nginx or Apache

goaccess-dashboard

GoAccess is an open source real-time web log analyser and interactive viewer that runs in a terminal on Linux systems. The name suggest that is written in the Go language, but it actually is not (it is written in C).

Monitoring

Effectively it is like the “top” command, but instead of showing processes it is giving you insight in the traffic on your webserver. The tool provides all kinds of top lists useful for spotting irregularities. Great features include the listing of IP addresses by percentage of hits and showing the status codes of the served responses. Typically you would use this tool if your webserver is reporting high load or high bandwidth usage and you want to find out what is going on. The outcomes of the analysis with this tool can then be used to adjust your firewall or caching rules.

Installing

On my Ubuntu 15.04 I can install version 0.8.3 with:

sudo apt-get install goaccess

If you are on any Debian based Linux and you want to run the latest (0.9.2) version you can simply run:

wget http://tar.goaccess.io/goaccess-0.9.2.tar.gz
tar -xvzf goaccess-0.9.2.tar.gz
cd goaccess-0.9.2/
sudo apt-get install build-essential
sudo apt-get install libglib2.0-dev
sudo apt-get install libgeoip-dev
sudo apt-get install libncursesw5-dev
./configure --enable-geoip --enable-utf8
make
sudo make install

This will install the software in “/usr/local/bin/” and the manual in “/usr/local/man/”.

Running

Running the software is as easy as:

man goaccess
goaccess -f /var/log/apache2/access.log

The software will prompt you the log format. For me the “NCSA Combined Log Format” log format worked best. For nginx I just ran:

goaccess -f /var/log/nginx/access.log

It is really awesome, try it!

Tutorial: Apache 2.4 as reverse proxy

This post explains how to configure Apache 2.4 (the version that comes with Ubuntu 14.04) as a fully transparent reverse proxy. If you have a single website that has multiple paths that are actually run by different web applications then this tutorial may be for you.

reverse_proxy

The proxy will serve both web applications from their own virtual host configuration. These may be on the same machine as shown below using the loop-back addresses 127.0.0.1 and 127.0.0.2 or on different machines if you use their (internal) IP addresses.

Site: http://www.yourwebsite.com/
App1: http://www.yourwebsite.com/app1 = http://127.0.0.1/app1
App2: http://www.yourwebsite.com/app2 = http://127.0.0.2/app2

This is the directory structure in which I want to load the various web apps:

maurits@nuc:/var/www/html$ ll
total 28
drwxr-xr-x 4 root root  4096 Dec  1 21:43 ./
drwxr-xr-x 3 root root  4096 Apr 21  2014 ../
-rw-r--r-- 1 root root 11510 Apr 21  2014 index.html
drwxr-xr-x 2 root root  4096 Dec  1 21:45 app1/
drwxr-xr-x 2 root root  4096 Dec  1 21:45 app2/

In this tutorial we run the web applications on the same paths as on the proxy. This means that the web apps run in a subdirectory, even on the machines behind the proxy. This avoids the need of rewriting and thus keeps this setup simple and easy to debug.

Setting up the reverse proxy in Apache 2.4

What we are going to do is setup a reverse proxy. First we load the “proxy_http” module in Apache 2.4 using:

sudo a2enmod proxy_http
sudo service apache2 restart

Let’s setup the reverse proxy virtual host configuration in “/etc/apache2/sites-available/yourwebsite-proxy.conf” like this:

<VirtualHost *:80>
ServerName www.yourwebsite.com
DocumentRoot /var/www/html
ProxyPreserveHost On
ProxyPass /app1 http://127.0.0.1/app1
ProxyPass /app2 http://127.0.0.2/app2
</VirtualHost>

The virtual host configuration of app1 in “/etc/apache2/sites-available/yourwebsite-app1.conf” looks like this:

<VirtualHost 127.0.0.1:80>
ServerName www.yourwebsite.com
DocumentRoot /var/www/html
...
</VirtualHost>

And the virtual host configuration of app2 in “/etc/apache2/sites-available/yourwebsite-app2.conf” looks like this:

<VirtualHost 127.0.0.2:80>
ServerName www.yourwebsite.com
DocumentRoot /var/www/html
...
</VirtualHost>

Lets enable all sites and reload Apache using:

sudo a2ensite yourwebsite-proxy yourwebsite-app1 yourwebsite-app2
sudo service apache2 reload

Note that this works as the virtual host configurations with a specified IP address will be matched first. The “ProxyPreserveHost” will make sure the “Host” header in the request is not rewritten. The lack of a “ProxyPassReverse” will make sure that there is no rewriting done on the response.

Showing the correct remote IP address

It is important to understand that in the above setup, the proxied web application will only see a different “REMOTE_ADDR” environment variable, since there is absolutely no rewriting going on. The real visitor address is passed along in “X-Forwarded-For” header. This is a comma separated list and the last entry holds the real client IP address.

If you are on Apache 2.4, like in Ubuntu 14.04, you can correct the reported remote address by loading the “remoteip” module like this:

sudo a2enmod remoteip
sudo service apache2 restart

Add the “RemoteIPHeader” and “RemoteIPInternalProxy” directives to the virtual host configurations:

<VirtualHost 127.0.0.1:80>
ServerName www.yourwebsite.com
DocumentRoot /var/www/html
RemoteIPHeader X-Forwarded-For
RemoteIPInternalProxy 127.0.0.0/8
...
</VirtualHost>

Note that the “RemoteIPInternalProxy” you must specify the internal IP address of the proxy. To test if you did it right you can run a PHP script that calls “phpinfo()”. If you see that the “REMOTE_ADDR” value is not set to the proxy, then it is working.

Adding headers to the upstream request

We want to make Apache2 add upstream headers and therefor we need to load the “headers” module in Apache 2.4 using:

sudo a2enmod headers
sudo service apache2 restart

Next, we have to adjust the reverse proxy virtual host configuration in “/etc/apache2/sites-available/yourwebsite-proxy.conf” like this:

<VirtualHost *:80>
ServerName www.yourwebsite.com
DocumentRoot /var/www/html
ProxyPreserveHost On
RewriteEngine On
RequestHeader add X-SSL off
RewriteRule ^/app1/(.*) http://127.0.0.1/app1/$1 [P,L]
RewriteRule ^/app2/(.*) http://127.0.0.2/app2/$1 [P,L]
</VirtualHost>

In this example we add a “X-SSL” header with the value “off” to the proxied request. If you want to add headers to the response you can use the “Header” directive.

If you have any questions, please use the comments below.