Monday 30 June 2014

Enterprise integration - PHP and Java over Thrift

Sometimes integration between different platforms is a must. In an ideal world you would roll the same platform across all your infrastructure, but in reality that is not always possible. The first thing that springs to mind when you have different platforms is web services such as REST, SOAP or XML-RPC (if you have lived in a cave for the past 10 years; no offense to those who still run it for "legacy" reasons). It's natural to think of these solutions, since most people nowadays publish their APIs over HTTP. Communicating over HTTP is slow, but that is perfectly fine for most third-party integrations, because there are no great expectations performance-wise. But what if the integration is internal, and two pieces of your infrastructure that run on different platforms need to talk to each other? Is it acceptable that for every HTTP request to your application you make another HTTP request? If it is, then please stop reading this, I am wasting your time. If it's not, then read on.
There is a better way for integration and that is what giants such as Google or Facebook are doing. Google has Protocol Buffers and Facebook has Apache Thrift. Let's have a quick look at both.

What Are Protocol Buffers?

Protocol buffers are Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler. You define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages – Java, C++, or Python.

What is Apache Thrift?

The Apache Thrift software framework, for scalable cross-language services development, combines a software stack with a code generation engine to build services that work efficiently and seamlessly between C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, OCaml and Delphi and other languages.

Java+PHP, Protocol Buffers or Thrift?

The answer is obvious: of the two, only Thrift supports PHP. If you look at the list of languages supported by Thrift, it will be the natural choice for many. Facebook developed it internally and it works great for them at a massive scale, which suggests it should work well for you too.
In order to use Thrift, you must download and install it. Please follow the instructions on the official site. Please note that you don't need Thrift installed to run my example. Just clone the repository from GitHub: git clone https://github.com/bogdanalbei/KrunchyKreme.git

Java Server, PHP client

Let's say that we want to consume the services of a Java application from PHP. The reverse is also possible and quite easy to accomplish, as you will see. The Java application is a doughnut vending machine, and the client can look at what's available and place orders.
The most essential part of understanding Thrift is understanding its Interface Definition Language (IDL). Based on the IDL file(s), Thrift will generate the code for your server and your client in any of the supported languages. Let's have a look at our IDL file:

The above defines the KrunchyKreme service and two methods: getMenu and order. getMenu takes no parameters and returns a list of Doughnut (the definition of Doughnut begins with "struct Doughnut"). The "order" method allows the client to place an order for a specific doughnutId and a quantity. Please note that the data types are platform independent and will be translated into the appropriate types for each target language. You can read more about Thrift IDL files in the official Thrift documentation later if you wish.

Now the fun bit, let's compile our Java server and PHP client:

The Java files will be generated into the gen-java folder and the PHP files will be generated into the gen-php folder. In my sample project I have conveniently created generate-java and generate-php scripts that also copy the files into the right folder structure for the Java and PHP projects.
You can generate your clients and servers in any of the supported languages, which is pretty good if you ask me.

 

The Java server

My sample project contains the Java server. It uses Maven and you can build it by running the build script bin/build.sh, which just executes "mvn package". This is the server:

All this does is start a single-threaded server that handles requests by using the KrunchyKremeHandler class:

If you are wondering where KrunchyKreme.Iface came from, the answer is easy: it is generated by Thrift. So you get an interface for your server, and all you need to do is write the actual code. Easy, right?
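To make the split between generated code and hand-written code concrete, here is a minimal sketch in plain Java that runs without the Thrift library. It is not the code from the sample project: KrunchyKremeIface and Doughnut are hand-written stand-ins for what Thrift would generate, and the Doughnut fields (id, name), the String return type of order, and the menu contents are all assumptions made for illustration. The methods of the real generated interface also declare throws org.apache.thrift.TException.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Stand-in for the class Thrift would generate from "struct Doughnut".
// The fields (id, name) are assumptions; the real IDL may define others.
final class Doughnut {
    final int id;
    final String name;

    Doughnut(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Stand-in for the generated KrunchyKreme.Iface.
// The String return type of order is an assumption.
interface KrunchyKremeIface {
    List<Doughnut> getMenu();

    String order(int doughnutId, int quantity);
}

// Hypothetical handler: this is the only part you would write by hand.
class KrunchyKremeHandlerSketch implements KrunchyKremeIface {
    private final List<Doughnut> menu = new ArrayList<>();

    KrunchyKremeHandlerSketch() {
        // Made-up menu entries for the sketch.
        menu.add(new Doughnut(1, "Glazed"));
        menu.add(new Doughnut(2, "Chocolate"));
    }

    @Override
    public List<Doughnut> getMenu() {
        // Read-only view so callers cannot modify the menu.
        return Collections.unmodifiableList(menu);
    }

    @Override
    public String order(int doughnutId, int quantity) {
        if (quantity <= 0) {
            return "Rejected: quantity must be positive";
        }
        for (Doughnut d : menu) {
            if (d.id == doughnutId) {
                return "Ordered " + quantity + " x " + d.name;
            }
        }
        return "Rejected: unknown doughnut " + doughnutId;
    }
}

class HandlerDemo {
    public static void main(String[] args) {
        KrunchyKremeIface handler = new KrunchyKremeHandlerSketch();
        for (Doughnut d : handler.getMenu()) {
            System.out.println(d.id + ": " + d.name);
        }
        System.out.println(handler.order(1, 3));
    }
}
```

With real Thrift, the handler would implement the generated KrunchyKreme.Iface instead of the stand-in, and the server would route incoming calls to it.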

Once the server is built, you can start it by running ./bin/server.sh in the server's folder. It will start listening on port 9090 and will output debug information when clients connect and call its services.

The PHP client

The sample client is included in the sample project. The project is based on Composer, so you will have to install the dependencies by running php composer.phar install.

The getClient() method initializes the client. Once that is done, getMenu() is called and three successive orders are placed by calling order(..). Please note the KrunchyKremeClient class, which was generated by Thrift. It gives us all the methods available to call on the server, so all we have to do is use it.
You can run the client with php client.php in the client folder. This is the output:

As expected, we display the menu and then output the results of the orders. Please note the response time: 3ms to connect, 3ms to get the menu and 1ms for each order. I am running this on a basic DigitalOcean machine with 1GB of RAM, 1 processor and an SSD. Fair enough, this is a simple example, but try this over HTTP. I don't have numbers for this, but I would expect RESTful web service calls to be about ten times slower. Not to mention the resources used: a web server that serves web services will be much more expensive than a simple standalone server. This is why Google and Facebook rely on Protocol Buffers and Thrift internally rather than on RESTful services.
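If you want to take similar measurements from a Java client, a rough sketch could look like the one below. CallTimer and timeMillis are hypothetical names, and the timed workload is only a placeholder for a real remote call such as getMenu(). A single measurement like this is noisy, so treat the result as an indication rather than a benchmark.

```java
import java.util.function.Supplier;

class CallTimer {
    // Runs the call once and returns the elapsed wall-clock time in milliseconds.
    static <T> double timeMillis(Supplier<T> call) {
        long start = System.nanoTime();
        call.get();
        return (System.nanoTime() - start) / 1_000_000.0;
    }

    public static void main(String[] args) {
        // Placeholder workload; substitute a real client call here.
        double ms = timeMillis(() -> String.valueOf(Math.sqrt(42.0)));
        System.out.printf("call took %.3f ms%n", ms);
    }
}
```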

In conclusion, is the complexity worth it? It really depends on your needs, but I believe we will see Thrift more and more.