ZFS – One File System to Rule Them All

ZFS [1] is one of the few enterprise-grade file systems with advanced storage features, such as in-line deduplication, in-line compression, copy-on-write, and snapshotting. These features are handy in a variety of scenarios, from backups to virtual machine image storage. A native port of ZFS is also available for Linux [2]. Here we take a look at the ZFS compression and deduplication features using some examples.

Setting ZFS up

ZFS handles disks very much like operating systems handle memory. This way, ZFS creates a logical separation between the file system and the physical disks. This logical separation is called a “pool” in ZFS terms.

Here we simply create a large file to mimic a disk via a loopback device and we create a pool on top:

# fallocate -l10G test1.img
# losetup /dev/loop0 test1.img
# zpool create testpool /dev/loop0
# zpool list
NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT
testpool 9.94G 124K 9.94G 0% 1.00x ONLINE -

Let’s create a file. Note that the pool gets mounted on /$POOLNAME:

# cd /testpool
# dd if=/dev/urandom of=randfile bs=1M count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 15.3166 s, 6.8 MB/s
# zpool list
NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT
testpool 9.94G 100M 9.84G 0% 1.00x ONLINE -

Deduplication/compression

ZFS supports in-line deduplication and compression. This means that if these features are enabled, the file system automatically detects duplicated data and stores it only once, and compresses any data that is compressible. Here we show how deduplication can help save disk space:

# zpool list
NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT
testpool 9.94G 100M 9.84G 0% 1.00x ONLINE -
# cp randfile randfile2
# zpool list
NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT
testpool 9.94G 200M 9.74G 1% 1.00x ONLINE -
# zfs create -o dedup=on testpool/deduplicated
# ls
deduplicated randfile randfile2
# mv randfile deduplicated/
# mv randfile2 deduplicated/
# zpool list
NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT
testpool 9.94G 101M 9.84G 0% 2.00x ONLINE -
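
To see where the 2.00x in the DEDUP column comes from, here is a minimal Python sketch of block-level deduplication. It is an illustration and not how ZFS implements it: the 128 KiB block size and the SHA-256 hash are assumptions chosen for the example, and dedup_ratio is a made-up helper name.

```python
import hashlib
import os

BLOCK_SIZE = 128 * 1024  # assumed block size for this sketch


def dedup_ratio(files, block_size=BLOCK_SIZE):
    """Return (logical_blocks, unique_blocks, ratio) for an iterable of byte strings."""
    seen = set()   # stands in for the deduplication table: one hash per unique block
    logical = 0    # number of blocks the application wrote
    for data in files:
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            logical += 1
            seen.add(hashlib.sha256(block).digest())
    unique = len(seen)
    ratio = (logical / unique) if unique else 1.0
    return logical, unique, ratio


randfile = os.urandom(1024 * 1024)   # 1 MiB of random data
randfile2 = bytes(randfile)          # an exact copy, like "cp randfile randfile2"
logical, unique, ratio = dedup_ratio([randfile, randfile2])
print(f"{logical} logical blocks, {unique} unique blocks, ratio {ratio:.2f}x")
```

Every block of the copy hashes to a value that is already in the table, so only half of the blocks need to be stored.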

Here we show how compression can help save disk space with the gzip algorithm:

# zfs create -o compression=gzip testpool/compressed
# ls
compressed deduplicated linux-3.12.6
# zpool list
NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT
testpool 9.94G 532M 9.42G 5% 1.00x ONLINE -
# mv linux-3.12.6 compressed/
# zpool list
NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT
testpool 9.94G 155M 9.79G 1% 1.00x ONLINE -
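
Whether compression pays off depends on the data. The Python sketch below uses zlib (the same DEFLATE algorithm that gzip uses) to compare repetitive, source-code-like text with random bytes. The sample line and compression level 6 are assumptions for the illustration, not taken from the ZFS session above.

```python
import os
import zlib

# Repetitive, source-code-like data versus random data of the same size.
text = b"static int example(void) { return 0; }\n" * 4096
noise = os.urandom(len(text))

for name, data in (("text", text), ("random", noise)):
    packed = zlib.compress(data, 6)
    print(f"{name}: {len(data)} -> {len(packed)} bytes "
          f"({len(data) / len(packed):.1f}x)")
```

The text shrinks a lot, while the random bytes do not shrink at all. This is why a source tree is a good candidate for a compressed dataset and a file filled from /dev/urandom is not.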

Discussion

As you can see, deduplication and compression can save you some serious disk space. You can also enable both deduplication and compression together according to your needs. Deduplication is especially useful when there is a lot of identical data within or across files (e.g. virtual machine images). Compression is useful when there is redundancy within files (e.g. text, source code). Benefits aside, deduplication needs a hash table for detecting duplicate blocks. Depending on the data, you may need a couple of GBs of memory per TB of data. De/compression, on the other hand, burns a lot of your CPU cycles.
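
The memory estimate can be checked with a bit of arithmetic. The sketch below assumes one table entry per unique block and roughly 320 bytes per entry. That figure is a commonly quoted rule of thumb and is an assumption here, not a measured value. The helper name dedup_table_bytes is made up for the example.

```python
def dedup_table_bytes(pool_bytes, block_bytes=128 * 1024, entry_bytes=320):
    """Estimate the dedup table size: one entry per unique block (assumed sizes)."""
    blocks = pool_bytes // block_bytes
    return blocks * entry_bytes


TIB = 1024 ** 4
GIB = 1024 ** 3

for block_kib in (128, 64, 8):
    size = dedup_table_bytes(TIB, block_bytes=block_kib * 1024)
    print(f"{block_kib:>3} KiB blocks: {size / GIB:.1f} GiB of table per TiB of unique data")
```

With 128 KiB blocks this gives 2.5 GiB per TiB, which matches the "couple of GBs" estimate. Smaller blocks mean more entries, so the table grows quickly for data written in small records.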

[1] http://en.wikipedia.org/wiki/ZFS
[2] http://zfsonlinux.org

Symfony2 Memcache session locking

In one of the previous posts we wrote about session reliability. Today we will talk about “locking session data”. This is another session reliability topic and we will look at the problems that may occur in Symfony2 and how to solve them.

Session locking

Session locking is when the web server thread acquires an exclusive lock on the session data to avoid concurrent access. Browsers use HTTP/1.1 keep-alive and would normally just use one open TCP connection and reuse that to get all dynamic content. When loading images (and other static content) the browser may decide to use multiple concurrent TCP connections to get the data as fast as possible. This also happens when using AJAX. This may (and most likely will) lead to different workers (threads) on the web server answering these requests concurrently.

Each of the requests may read the session data, update it, and write it back. The last write wins, so some writes may get lost. This can be countered by applying session locking. The session lock will prevent race conditions from occurring and prevent any corrupted data appearing in the session. This can easily be understood by looking at the following two images.

[Images: session access without locking (left), session access with locking (right)]

The left image shows concurrent requests without session locking and the right shows concurrent requests with session locking. This is very well described in this post by Andy Bakun. Note that the above images are also from that post. Reading the Andy Bakun post allows you to truly understand the session locking problem (and the performance problems that AJAX may cause).
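
The lost update can also be reproduced in a few lines of Python. This is a simulation under assumptions: threads stand in for web server workers, a dictionary stands in for the stored session, and the short sleep stands in for the work a request does between reading and writing. The name run_requests is made up for the example.

```python
import threading
import time


def run_requests(use_lock, requests=5):
    """Run concurrent 'requests' that each increment a counter in the session."""
    session = {"counter": 0}   # stands in for the stored session data
    lock = threading.Lock()    # stands in for the session lock

    def handle_request():
        if use_lock:
            lock.acquire()
        try:
            value = session["counter"]       # read the session
            time.sleep(0.02)                 # simulate work done by the request
            session["counter"] = value + 1   # write the session back
        finally:
            if use_lock:
                lock.release()

    threads = [threading.Thread(target=handle_request) for _ in range(requests)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return session["counter"]


print("without locking:", run_requests(use_lock=False))
print("with locking:   ", run_requests(use_lock=True))
```

With locking the counter always ends at 5. Without locking it typically ends at 1, because every request read the old value before any of them wrote. The price of the lock is that the requests run one after another.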

Symfony2 sessions

In Symfony2 one would normally use the NativeFileSessionHandler, which will just use the default PHP session handler. This works flawlessly in most cases. PHP uses “flock” to acquire an exclusive lock on the local filesystem. But when you scale out and run a server farm with multiple web servers you cannot use the local filesystem. You might be using a shared (NFS) filesystem and run into problems with “flock” (see the Linux NFS FAQs). If you use the database you may run into performance problems, especially when applying locking.

This leaves Memcache or Redis as options for session storage. These are fast key/value stores that can be used for session storage. Unfortunately the Symfony2 session storage implementations for Memcache (in Symfony) and Redis (in phpredis) do not implement session locking. This potentially leads to problems, especially when relying on AJAX calls as explained above. Note that other frameworks (like CakePHP) also do not implement session locking when using Memcache as session storage. Edit: This post has inspired the guys from SncRedisBundle and this Symfony2 bundle now supports session locking, which is totally awesome!

Custom save handlers

One can write “Custom Save Handlers” as described by the Symfony2 documentation:

Custom handlers are those which completely replace PHP’s built in session save handlers by providing six callback functions which PHP calls internally at various points in the session workflow. Symfony2 HttpFoundation provides some by default and these can easily serve as examples if you wish to write your own. — Symfony2 documentation

But you should be careful, since the examples do not implement session locking.

LswMemcacheBundle to the rescue

At LeaseWeb we love (to use) Memcache. Therefore, we have built session locking into our LswMemcacheBundle. It actually implements acquiring a “spin lock” with the timeout set to PHP’s “max_execution_time” (defaults to 30 seconds). The spin lock tries to acquire the lock every 150 ms (configurable). It will also hold the lock for a maximum time of the PHP “max_execution_time”. By using Memcache’s built-in key expire mechanism, we can ensure the lock is not held indefinitely.
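
The idea can be sketched in Python as follows. This is not the bundle's code: FakeMemcache is a hypothetical in-memory stand-in for a Memcache client, and the ".lock" key suffix is an assumption. The sketch relies on the Memcache "add" command, which only stores a key when it does not exist yet. On a real server that check is atomic, while the stand-in below is only safe for this single-threaded demonstration.

```python
import time


class FakeMemcache:
    """Hypothetical in-memory stand-in for a Memcache client (sketch only)."""

    def __init__(self):
        self._items = {}  # key -> (value, expiry timestamp)

    def add(self, key, value, ttl):
        """Store the key only if it is absent or expired, like Memcache's 'add'."""
        now = time.monotonic()
        item = self._items.get(key)
        if item is not None and item[1] > now:
            return False
        self._items[key] = (value, now + ttl)
        return True

    def delete(self, key):
        self._items.pop(key, None)


def acquire_lock(cache, session_id, timeout=30.0, interval=0.15):
    """Spin until the lock key can be added, or give up after 'timeout' seconds."""
    key = session_id + ".lock"  # assumed key naming
    deadline = time.monotonic() + timeout
    while True:
        # The expiry makes sure a crashed lock holder cannot block forever.
        if cache.add(key, 1, ttl=timeout):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def release_lock(cache, session_id):
    cache.delete(session_id + ".lock")


cache = FakeMemcache()
print(acquire_lock(cache, "sess123"))                             # True: lock is free
print(acquire_lock(cache, "sess123", timeout=0.3, interval=0.1))  # False: still held
release_lock(cache, "sess123")
print(acquire_lock(cache, "sess123"))                             # True: free again
```

The defaults mirror the values mentioned above: a 30 second timeout and a retry every 150 ms.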

This (spin-lock) implementation is a port of the session locking code from the memcached PECL module (written in C). Our bundle enables locking by default. If you want, you can disable the locking by setting the “locking” configuration parameter to “false” as described in the documentation.

This session locking code was also ported to SncRedisBundle and submitted as PR #109. LswMemcacheBundle is open-source and can be found on our GitHub account:

https://github.com/LeaseWeb/LswMemcacheBundle

MindaPHP: a new PHP framework optimized for learning

When people talk about Web Application Frameworks (WAF), they often refer to web frameworks with a model–view–controller (MVC) architecture. MVC is a software architecture pattern that separates the representation of information from the user’s interaction with it. Most popular frameworks actually follow the model–view–adapter (MVA) pattern, which decouples the model and the view as described below:

Traditional MVC arranges model (e.g., data structures and storage), view (e.g., user interface), and controller (e.g., business logic) in a triangle, with model, view, and controller as vertices, so that some information flows between the model and views outside of the controller’s direct control. The model–view–adapter solves this rather differently than the model–view–controller does by arranging model, adapter or mediating controller, and view linearly without any connections whatsoever directly between model and view. — Wikipedia
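
A minimal Python sketch of the linear MVA arrangement from the quote above is shown here. The class and method names are made up for the illustration. The point is only that the model and the view never reference each other, and the adapter is the single place where they meet.

```python
class Model:
    """Data access: knows nothing about presentation."""

    def __init__(self):
        self._customers = ["Alice", "Bob"]

    def customers(self):
        return list(self._customers)


class View:
    """Presentation: renders whatever rows it is handed."""

    def render(self, rows):
        return "\n".join(f"- {row}" for row in rows)


class Adapter:
    """Mediating controller: the only class that talks to both model and view."""

    def __init__(self, model, view):
        self.model = model
        self.view = view

    def show_customers(self):
        return self.view.render(self.model.customers())


print(Adapter(Model(), View()).show_customers())
```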

More and more frameworks consist of a set of components (e.g. Zend). This is why people start to talk about “full-stack” vs. “glue” frameworks. A “glue” framework allows the programmer to create a tailor-made framework by gluing the needed components together. Full-stack frameworks, on the other hand, do not require you to do this.

Others talk about the difference between “push-based” vs. “pull-based” frameworks. This difference essentially is whether the framework pushes data towards the view or pulls the data in from the view. Most frameworks use the “push” approach.

Separation of concerns

What everybody seems to agree on is that we need some form of “separation of concerns” or an n-tier architectural model. This means that we “divide the system cleanly into three tiers: the presentation tier, the business-logic tier, and the data-access or resource tier”, like MVC or MVA do.

What many framework architects do not seem to optimize for are these three important things:

  1. Cost of learning – maximize documentation reuse & minimize innovation
  2. Cost of scaling – maximize compatibility & minimize lines of code executed
  3. Cost of defects – maximize best practices & minimize complexity

They seem obsessed with optimizing separation of concerns, a.k.a. reducing the “Cost of spaghetti”. In their efforts they create hard-to-grasp concepts, like “Dependency Injection” and “Aspect-oriented programming”. Do not get me wrong: I am not saying that these methods do not help you to fight cross-cutting concerns, but IMHO the complexity problems they cause outweigh their benefits of keeping things organized.

MindaPHP to the rescue

So, it may be clear that I believe that simple is better. With that “vision” I wrote MindaPHP. Whether you like it or not you may decide for yourself, but it certainly is easy to learn and about 10-20 times faster than CakePHP or Symfony, while providing the same abstraction layers to keep things organized.

MindaPHP aims to be a full-stack framework that is:

  1. Easy to learn
  2. Secure by design
  3. Light-weight

By design, it does:

  1. … have one variable scope for all layers.
  2. … require you to write SQL queries (no ORM).
  3. … use PHP as a templating language.

These choices are mainly there to make it easy to learn for PHP developers. Check it out!

Code: https://github.com/mevdschee/MindaPHP
Demo: http://maurits.server.nlware.com/

Replacing SOA API calls by EDA request/replies using Redis

This post aims at simplifying and explaining Enterprise Application Integration (EAI) concepts using more pictures and examples than words.

[Image: SOA vs. EDA]

The main difference between Event Driven Architecture (EDA) and Service Oriented Architecture (SOA) is that EDA is asynchronous and SOA is synchronous.

[Image: SOA]

In the picture above you see how SOA typically connects anything to anything directly.

[Image: EDA]

The EDA pattern, on the other hand, uses a single communication channel (message bus or queueing server).

Ideally we stop thinking about API calls (request/response) and we move to “events”. “Events” are things that happen in your application that might be of interest to other applications. When a customer needs to be created by your website you can make an API call and wait for an “OK”, or you could rewrite your application to “fire and forget” and handle errors separately. But what if an application has a question that needs to be answered? For example: What if you have a “customer reference” and want to find the corresponding “email address”?

The solution is to use the Request/Reply Enterprise Integration Pattern (EIP). Most Enterprise Messaging Systems (EMS) offer an Enterprise Service Bus (ESB) on which the message format (set of optional and required headers) is specified, but the data format is free. It is up to the applications to run their own integration engine, which has adapters for different data types for the various applications it wants to talk with.

A SugarCRM example

The following message could be in the “SugarCRM/Requests” queue:

Timestamp:  1382055093876 ms
MessageId:  345781632423
ReplyQueue: SugarCRM/Replies
Data:       {"method":"GET","url":"/Accounts/f1eeca5f-c0eb-891e-db92-52323b958a87"}

The SugarCRM application has an integration engine that monitors the “SugarCRM/Requests” queue. It will receive the above message, transform it into an API request and execute it. The result that it receives will be transformed into the following message and put in the “SugarCRM/Replies” queue:

Timestamp:  1382055096943 ms
MessageId:  345781632567
ReplyTo:    345781632423
Data:       {"id":"f1eeca5f-c0eb-891e-db92-52323b958a87","name":"RR. Talker Co","date_entered":"2013-09-12T18:09:00-04:00","date_modified":"2013-09-12T18:09:00-04:00","modified_user_id":"1","modified_by_name":"Administrator","created_by":"1","created_by_name":"Administrator","description":"","deleted":false,"assigned_user_id":"seed_jim_id","assigned_user_name":"Jim Brennan","team_count":"","team_name":[{"id":"East","name":"East","name_2":"","primary":true}],"linkedin":"","facebook":"","twitter":"","googleplus":"","account_type":"Customer","industry":"Electronics","annual_revenue":"","phone_fax":"","billing_address_street":"67321 West Siam St.","billing_address_street_2":"","billing_address_street_3":"","billing_address_street_4":"","billing_address_city":"Santa Fe","billing_address_state":"NY","billing_address_postalcode":"44150","billing_address_country":"USA","rating":"","phone_office":"(949) 400-8060","phone_alternate":"","website":"www.imim.name","ownership":"","employees":"","ticker_symbol":"","shipping_address_street":"67321 West Siam St.","shipping_address_street_2":"","shipping_address_street_3":"","shipping_address_street_4":"","shipping_address_city":"Santa Fe","shipping_address_state":"NY","shipping_address_postalcode":"44150","shipping_address_country":"USA","email":[{"email_address":"phone95@example.it","invalid_email":false,"opt_out":false,"primary_address":true,"reply_to_address":false},{"email_address":"qa.qa@example.tv","invalid_email":false,"opt_out":false,"primary_address":false,"reply_to_address":false}],"email1":"phone95@example.it","parent_id":"","sic_code":"","parent_name":"","email_opt_out":false,"invalid_email":false,"campaign_id":"","campaign_name":"","my_favorite":false,"_acl":{"fields":{}},"following":true,"_module":"Accounts"}

Tutorial: EDA with Redis

One of the more KISS – but less conventional – ways of implementing the above scheme would be to use Redis to store queues. The “SugarCRM/Requests” queue can be implemented as a “List” of “MessageId” values. The messages can be stored with “Hash” data types in the global namespace using the Redis “HMSET” command, where the headers can be individual keys. The unique “MessageId” values can be generated using the Redis “INCR” command. The “SugarCRM/Replies” queue can be implemented as a “Hash” data type as well, where the “ReplyTo” can be used as the key and the “MessageId” as the value.
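
The scheme can be sketched in Python as follows. To keep the example runnable without a server, FakeRedis is a hypothetical in-memory stand-in that only mimics the handful of commands needed. The function names, the use of the bare MessageId as the hash key, and the "MessageId" counter key are all assumptions made for the illustration. HSET with several fields is used where the text mentions HMSET, since newer Redis versions deprecate HMSET in favor of it.

```python
import json
import time
from collections import defaultdict, deque


class FakeRedis:
    """Hypothetical in-memory stand-in for a few Redis commands (sketch only)."""

    def __init__(self):
        self.counters = defaultdict(int)
        self.hashes = defaultdict(dict)
        self.lists = defaultdict(deque)

    def incr(self, key):                 # INCR
        self.counters[key] += 1
        return self.counters[key]

    def hset(self, key, mapping):        # HSET key field value [field value ...]
        self.hashes[key].update(mapping)

    def hget(self, key, field):          # HGET
        return self.hashes[key].get(field)

    def hgetall(self, key):              # HGETALL
        return dict(self.hashes[key])

    def rpush(self, key, value):         # RPUSH
        self.lists[key].append(value)

    def lpop(self, key):                 # LPOP
        return self.lists[key].popleft() if self.lists[key] else None


def store_message(r, headers, data):
    """Store a message as a hash keyed by a fresh MessageId and return that id."""
    message_id = str(r.incr("MessageId"))
    fields = {"Timestamp": str(int(time.time() * 1000)),
              "MessageId": message_id,
              "Data": json.dumps(data)}
    fields.update(headers)
    r.hset(message_id, mapping=fields)
    return message_id


def send_request(r, data):
    message_id = store_message(r, {"ReplyQueue": "SugarCRM/Replies"}, data)
    r.rpush("SugarCRM/Requests", message_id)   # the queue is a list of MessageId values
    return message_id


def serve_one(r, handler):
    """Integration engine: pop one request, execute it, and publish the reply."""
    request_id = r.lpop("SugarCRM/Requests")
    if request_id is None:
        return False
    message = r.hgetall(request_id)
    result = handler(json.loads(message["Data"]))
    reply_id = store_message(r, {"ReplyTo": request_id}, result)
    r.hset(message["ReplyQueue"], mapping={request_id: reply_id})  # ReplyTo -> MessageId
    return True


def fetch_reply(r, request_id):
    reply_id = r.hget("SugarCRM/Replies", request_id)
    if reply_id is None:
        return None
    return json.loads(r.hgetall(reply_id)["Data"])


r = FakeRedis()
rid = send_request(r, {"method": "GET",
                       "url": "/Accounts/f1eeca5f-c0eb-891e-db92-52323b958a87"})
serve_one(r, lambda req: {"id": req["url"].rsplit("/", 1)[-1], "name": "RR. Talker Co"})
print(fetch_reply(r, rid))
```

The requester keeps the MessageId it got back from send_request and uses it to look up the reply, which is what the “ReplyTo” header is for. Against a real Redis server the requester would have to poll for the reply or block on some notification, which this sketch leaves out.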

API first architecture or the fat vs thin server debate

API first architecture is an architecture that treats the API user as the primary user of the application. This means that the API is not an alternative view in the MVC paradigm, but has the highest priority. The main differentiator is that in “API first” the architecture enforces a complete, responsive, and well-documented API. This is especially important when targeting mobile (apps connect to the API), resellers (their presentation layer uses the API) and highly integrated, but decoupled, multi-product environments.

MVC reuse

The MVC architecture has been popular for a long time already. In 2004, its popularity skyrocketed when Ruby on Rails was released. MVC allows for high reuse when you have a front-end / back-end application (in the CMS sense), where customers use the front-end and employees use the back-end. This does require that you choose the same software stack for both the front-end and back-end, and make those applications as similar as possible. When the MVC strategy is executed properly, many parts of the application can be reused. Some of the parts that can be reused are: DBAL/ORM, Business Logic, Presentation and AAA. Specifically AAA (Authentication, Authorization and Accounting) can be reused by allowing employees to impersonate customers, use the same login screen and share logging facilities.

Mobile views for MVC

In 2007, Apple introduced the iPhone and from that time on the importance of web applications (and websites) on small screens quickly grew. MVC applications were, and still are, very well suited to small screens. All that is needed is a separate or adjusted set of views that is usable on a smartphone or tablet. The strategy of creating a single set of views that is suitable for mobile and also for desktop is called “mobile first”. This is the most cost-effective and radical approach, which requires strong leadership and decision making, because all the software needs dramatic change, as all the views need to be adjusted. The alternative is to maintain two sets of views: one for mobile and one for desktop. The alternative views are often hosted on an “m.” subdomain. This is a simple and transparent approach.

Adding API to MVC

The dreadful “HTML5 vs Native” app development debate is going on right now and I quote Danny Brown:

Any company creating mobile apps today faces an important decision, Native or HTML5? Each one has its advantages, but choosing the wrong one could be costly. – Danny Brown

Choosing Native requires you to build a complete, responsive, and well-documented API, while choosing HTML5 requires you to redesign the views. There are arguments to defend either path and it depends on the situation what choice is best. There is one approach that will always fail: building an API as views on top of MVC. Let me explain why that fails and why so many people do it anyway.

Typically the (server-side) MVC approach leads to pages with 200 ms of load time. In this approach the server does three things: database abstraction, business logic, and presentation. This is why it is also referred to as “fat server”. An API is not responsible for presentation, executes smaller business logic per request and is therefore named “thin server”. A good API is highly optimized for speed and has typical load times under 20 ms. This means that when a mobile page is constructed, multiple (up to 10) calls can be made and the full page still renders within 300 ms.
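
The numbers above can be turned into a small latency budget. The Python sketch below assumes the worst case in which the client issues its calls one after another. The helper name sequential_page_time is made up, and the second scenario (API calls that are each as slow as a 200 ms MVC page) is an assumption added for comparison.

```python
def sequential_page_time(calls, api_ms, overhead_ms=0):
    """Total time in ms when the client issues API calls one after another."""
    return calls * api_ms + overhead_ms


budget_ms = 300

thin = sequential_page_time(calls=10, api_ms=20)
print(f"10 calls at 20 ms each:  {thin} ms, "
      f"{budget_ms - thin} ms left for network and rendering")

fat = sequential_page_time(calls=10, api_ms=200)
print(f"10 calls at 200 ms each: {fat} ms")
```

Ten thin calls leave headroom inside the 300 ms budget. Ten calls at MVC page speed add up to two seconds.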

Still, when one took the MVC approach and now lacks an API, the easiest thing to do is to add a few views that output JSON and call it a “RESTful API”. All you need to do is write some documentation and the boss will be happy. The fact that this API is totally unusable in real life, because it does not scale and is horribly slow, will only be noticed when the API is actually used and there is no way back.

[Image: architecture_v]

Twitter & API first

In 2010, Twitter announced their “API first” strategy. They call this a JavaScript architecture, since they created a web application in JavaScript with an architecture similar to that of the mobile “apps”. This allows them to have full reuse of the API they build. Where initially the API was something “extra”, next to their web application, it then became the foundation of all other development. Their API is focused on delivering optimal integration for JavaScript programmers by using a RESTful JSON API. But they are also serving their application using traditional pages:

In order to support crawlers and users without JavaScript, we needed a rendering system that runs on both server and client. – Twitter

This approach of delivering traditional pages, while still using the “API first” strategy is what I call “Hybrid”. In the diagram below I’ve tried to enumerate the different approaches.

[Diagram: architecture_h]

Conclusion

Optimal reuse brings down costs, but optimal reuse can only be achieved when there is a strong architectural strategy to follow. Refactoring code to increase architectural compliance does not bring value to the business directly. It will bring down the cost of change eventually, but the level of trust needed for these decisions is not easily gained.

Further reading

This is not the first post about “API first”. Check out the following links if you want to learn more about it:

Slides

I created a deck of 11 slides about “api first” for a presentation I gave on the 26th of November at LeaseWeb.
