Thursday, 12 March 2015

Reduce your git repository size in minutes

Every git repository accumulates history, and with it size. Even though git was not built for binary files, people do store them in repositories, and that contributes to the growth. At some point you may have removed those binary files from the working tree, yet every old version still lives in history, and looking back at the history of an image is not something that you do every day. So why not remove all the massive blobs from history? I know, it sounds like you need to rewrite history, and that is dangerous, isn't it? Not quite, with a nice tool called BFG Repo-Cleaner (bfg).

Right, let's start:

1. Download bfg or install it via brew, yum, etc.

2. Create a bare clone of your git repository:
git clone --mirror git://

3. Create a backup of the repository (just in case):
cp -r big-repo.git big-repo.git_bak

4. Run this:
bfg -b 100K big-repo.git

This will remove all blobs larger than 100K from history, but don't worry, the files in your latest commit (HEAD) are protected. There are many other options (including protecting other branches); have a look at the BFG documentation or just run bfg with no arguments to see them.

5. Run git gc to actually remove the files:

cd big-repo.git
git reflog expire --expire=now --all && git gc --prune=now --aggressive

6. [Optional] Create a new repo where you push the changes. I like to push the changes into a new repo to be 100% sure that the repository is in a good state. Before pushing, change the url for remote "origin" inside big-repo.git/config.

7. Push the changes:
git push

8. Done. Enjoy your lean repository!
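
Before choosing a threshold for -b, it helps to know which blobs are actually big, and after the cleanup you will want to confirm that the repository shrank. The sketch below uses only stock git commands; the function names repo_size and largest_blobs are my own and are not part of git or bfg.

```shell
#!/bin/sh
# Print the size of the packed objects of the repository at path $1.
repo_size() {
    git -C "$1" count-objects -vH | awk -F': ' '/^size-pack/ {print $2}'
}

# List the biggest blobs found anywhere in history as "<bytes> <path>".
# $1: repository path, $2: how many entries to show (default 10)
largest_blobs() {
    repo=$1
    count=${2:-10}
    git -C "$repo" rev-list --objects --all |
        git -C "$repo" cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        awk '$1 == "blob"' |
        cut -d' ' -f2- |
        sort -rn |
        head -n "$count"
}
```

For example, run largest_blobs big-repo.git 20 before step 4 to pick a sensible size limit, and run repo_size big-repo.git before step 4 and again after step 5 to compare.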

Monday, 30 June 2014

Enterprise integration - PHP and Java over Thrift

Sometimes integration between different platforms is a must. In an ideal world you would roll out the same platform across all your infrastructure, but in reality that is not always possible. The first thing that springs to mind when you have different platforms is web services such as REST, SOAP or XML-RPC (if you have lived in a cave for the past 10 years; no offense to those who still run it for "legacy" reasons). It's natural to think of these solutions, since most people nowadays publish their APIs over HTTP. Communicating over HTTP is slow, but that is perfectly fine for most third-party integrations, because there are no great performance expectations. But what if the integration is internal, and two pieces of your infrastructure that run on different platforms need to talk to each other? Is it acceptable that for every HTTP request to your application you make another HTTP request? If it is, then please stop reading, I am wasting your time. If it's not, then read on.
There is a better way to integrate, and that is what giants such as Google and Facebook are doing. Google has Protocol Buffers and Facebook has Apache Thrift. Let's have a quick look at both.

What Are Protocol Buffers?

Protocol buffers are Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler. You define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages – Java, C++, or Python.

What is Apache Thrift?

The Apache Thrift software framework, for scalable cross-language services development, combines a software stack with a code generation engine to build services that work efficiently and seamlessly between C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, OCaml and Delphi and other languages.

Java+PHP, Protocol Buffers or Thrift?

The answer is obvious: of the two, only Thrift supports PHP. If you look at the list of languages supported by Thrift, it will be the natural choice for many. Facebook developed it internally and it works great for them at a massive scale, which suggests it should work well for you too.
In order to be able to use Thrift, you must download it and install it. Please follow the instructions on the official site. Please note that you don't need Thrift installed to run my example, just clone the repository from Github: git clone

Java Server, PHP client

Let's say that we want to consume the services of a Java application from PHP. The reverse is also possible and quite easy to accomplish, as you will see. The Java application is a doughnut vending machine and the client has the possibility of having a look at what's available and ordering.
The most essential part of understanding Thrift is understanding its Interface Definition Language (IDL). Based on the IDL file(s), Thrift will generate your server and your client code in any of the supported languages. Let's have a look at our IDL file:

The above defines the KrunchyKreme service and two methods: getMenu and order. getMenu takes no parameters and returns a list of Doughnut (the definition of Doughnut begins with "struct Doughnut"). The "order" method allows the client to place an order for a specific doughnutId and a quantity. Please note that the data types are platform independent and get translated into the appropriate types for the target language. You can read about Thrift IDL files here later if you wish.

Now the fun bit, let's compile our Java server and PHP client:

The Java files will be generated into the gen-java folder and the PHP files will be generated into the gen-php folder. In my sample project I have conveniently created a generate-java and generate-php script that also copies the files into the right folder structure for the Java and PHP projects.
You can generate your clients and servers in any of the supported languages, which is pretty good if you ask me. 
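
The compile step boils down to one thrift --gen call per target language. Here is a minimal sketch, assuming the Thrift compiler is on your PATH; the function name generate_bindings and the file name krunchykreme.thrift are hypothetical and not taken from the sample project.

```shell
#!/bin/sh
# Generate Java and PHP code from a Thrift IDL file.
# THRIFT can point at a specific compiler binary; it defaults to "thrift".
# Output lands in gen-java/ and gen-php/ under the current directory.
generate_bindings() {
    idl=$1
    "${THRIFT:-thrift}" --gen java "$idl" &&
        "${THRIFT:-thrift}" --gen php "$idl"
}

# Example (krunchykreme.thrift is a hypothetical file name):
#   generate_bindings krunchykreme.thrift
#   ls gen-java gen-php
```

Swapping java or php for another supported generator name is all it takes to target a different language.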


The Java server

My sample project contains the Java server. It uses Maven and you can build it by running the build script bin/ that just executes "mvn package". This is the server:

All this does is start a single-threaded server that handles requests by using the KrunchyKremeHandler class:

If you are wondering where KrunchyKreme.Iface came from, the answer is easy: it is generated by Thrift. So you will get an interface for your server, all you need to do now is write the actual code.  Easy, right?

Once the server is built, you can start it by running ./bin/ in the server's folder. It will start listening on port 9090 and will output debug information when clients connect and call its services.

The PHP client

You can find the sample client here. The project is based on Composer, so you will have to install the dependencies by running php composer.phar install.

The getClient() method initializes the client. Once that is done, getMenu() is called and the three successive orders are placed by calling order(..). Please note the KrunchyKremeClient class, which was generated by Thrift. This gives us all the methods available to call on the server, all we have to do is use it.
You can run the client with php client.php in the client folder. This is the output:

As expected, we display the menu and then output the results of the orders. Please note the response times: 3ms to connect, 3ms to get the menu and 1ms for each order. I am running this on a basic DigitalOcean machine with 1GB of RAM, 1 processor and an SSD. Fair enough, this is a simple example, but try this over HTTP. I don't have numbers for this, but I would expect it to be about ten times worse for RESTful web service calls. Not to mention the resources used: a web server that serves web services will be much more expensive than a simple standalone server. This is why Google and Facebook rely on Protocol Buffers and Thrift internally rather than on RESTful services.

In conclusion, is the complexity worth it? It really depends on your needs, but I believe we will see Thrift more and more.

Friday, 31 May 2013

Updating InfusionSoft Custom Fields

This is a quick post, and I'm writing it mainly because I wasted a lot of time trying to programmatically update custom fields in Infusionsoft. So here goes: to update a custom field you need to know the id of the field and the new value:

 updateCustomField($fieldId, $fieldValues)  

You can find the id of the field by going to Admin -> Settings -> Set up custom fields for -> Go; when you hover over the field you are interested in, you will see the id.
Getting the value is trickier. What works for any type of field is setting a value manually and then querying DataFormField for the value like this:
 $returnFields = array('DataType', 'Id', 'FormId', 'GroupId', 'Name', 'Label', 'DefaultValue', 'Values', 'ListRows');   
 $query = array('Id' => custom_field_id);   
 $res = $sdk->dsQuery("DataFormField", 10, 0, $query, $returnFields);   
In the result above, have a look at the "Values" field; this is what the value should look like.
So to conclude, if you need to update a listbox custom field, this will do it:

 $values = array(
     'Values' => "\naaa\nbbb\nddd"
 );
 $result = $sdk->updateCustomField(custom_field_id, $values);

Monday, 19 November 2012

Couchbase smart clients

While doing some research on replacing Memcached with Couchbase for the company I work for, I came across the term "Couchbase smart client". The problem is that I couldn't find a decent explanation (in my opinion) of what exactly it does. As you will find out next, the smart client is perhaps the most important aspect of connecting to a Couchbase cluster.
Let's have a look first at what the Couchbase documentation says about smart clients:
When using a smart client, the client library provides an interface to the cluster, and performs server selection directly via the vBucket mechanism. The clients communicate with the cluster using a custom Couchbase protocol which enables the sharing of the vBucket map, and the selection within the client of the required vBucket when obtaining and storing information

I couldn't figure out how that affects me as a client of the cluster, so I decided to have a look at the low-level calls that a simple client makes in order to connect to the cluster and write a key. Below is the source code (PHP):

$servers = array('', '');
foreach ($servers as $server) {
        $couch = @(new Couchbase( $server )); // uses the default bucket
        if($couch->getResultCode() === Couchbase::SUCCESS) {
                break; // stop at the first server that accepts the connection
        }
}

$couch->set('kk', 1);

echo "Done";

Let's run strace  for this script:
strace php ./bin/Temp/couchbase_connect.php   2> /tmp/couch_connect

The output is rather verbose, but we are only interested in the following bits:

  • #1 connect(6, {sa_family=AF_INET, sin_port=htons(8091), sin_addr=inet_addr("")}, 16) = 0
  • #2 sendto(6, "GET /pools/default/bucketsStream"..., 56, 0, NULL, 0) = 56
  • #3 recvfrom(6, "HTTP/1.1 200 OK\r\nTransfer-Encodi"..., 2048, 0, NULL, NULL) = 225
  • #4 connect(7, {sa_family=AF_INET, sin_port=htons(11210), sin_addr=inet_addr("")}, 16) = -1 EINPROGRESS (Operation now in progress)

The above is a massive simplification of what is happening, but it's enough to give us a clue of what the client is doing:

  • at line 1 the client is connecting to the first server in the pool on port 8091
  • at line 2 the client is requesting information about the servers in the cluster
  • at line 3 the client starts receiving that information
  • at line 4 the client is initiating a connection to the server in the pool that is responsible for the key that we want to write, port 11210

To make this short and sweet, the client is connecting to one of the servers in the pool and by receiving information about the cluster, it is able to decide which of the servers is handling the keys it needs. So what makes the client "smart" is the fact that it has knowledge about the cluster and it makes decisions based on the current state of the cluster.
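
To make the idea concrete, here is a toy sketch of the decision a smart client makes: hash the key, turn the hash into a vBucket id, and look up which node owns that vBucket. Everything in it is an assumption for illustration only: cksum stands in for the client's real hash function, the node names and the vBucket count are made up, and ownership is simple round-robin, whereas a real client uses the vBucket map it received from the cluster on port 8091.

```shell
#!/bin/sh
# Toy model only, not the real Couchbase algorithm.
NUM_VBUCKETS=1024                      # assumed number of vBuckets
SERVERS="node-a:11210 node-b:11210"    # hypothetical nodes from the cluster map

# Map a key to a vBucket id (cksum is a stand-in for the real key hash).
vbucket_for_key() {
    crc=$(printf '%s' "$1" | cksum | cut -d' ' -f1)
    echo $(( $crc % $NUM_VBUCKETS ))
}

# Map a vBucket id to the node that owns it (round-robin in this toy map).
server_for_vbucket() {
    vb=$1
    set -- $SERVERS
    shift $(( $vb % $# ))
    echo "$1"
}

# Full lookup: key -> vBucket -> node.
server_for_key() {
    server_for_vbucket "$(vbucket_for_key "$1")"
}
```

Calling server_for_key kk always prints the same node for the same key, which is the whole point: the client can go straight to the right server without asking anyone. When the cluster changes, only the map needs to be refreshed.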

You can start seeing the advantages of such a smart client when you think about not so smart clients like memcached, where you would have to implement all the logic about the state of the cluster on the client side, which is no trivial task by most people's standards.

Saturday, 19 June 2010

PHP remote debugging with Xdebug and Eclipse PDT

Debugging is an invaluable part of software development. I find it very useful in a variety of situations, for instance when I want to understand how a routine works or I need to get rid of a bug that is not exactly easy to fix just by reading the code.

There are several ways to perform debugging in PHP:
  • The most straightforward technique is to use print_r() and var_dump(). This alters the output; it's quick but very dirty. If you're using this there's nothing to be ashamed of, everyone is doing it.
  • Logging into files/database tables at specific points in the code. This is cleaner than the previous method, but it requires additional effort and usually pollutes the code with logging routines. Also, this is not exactly debugging, it's logging and analysing the logs.
  • Using proper debugging tools like Xdebug or the Zend Debugger, integrated into your PHP IDE. This is the clean way to do it: it provides much better insight into the source code, as you can run it interactively, step by step.
My main goal in this post is to show you how to set up your debugging environment with Eclipse PDT and Xdebug. If you're not already using it, get Eclipse PDT and install it. Next you will have to get and install Xdebug on the machine where PHP runs (it can be the same machine or some remote machine). You should be able to get it through PECL with the following command:

pecl install xdebug

If the above does not work, check the Xdebug installation instructions.

Once the Xdebug extension has been installed, you will have to enable it in php.ini. Add the following lines to php.ini:


On Windows+PHP 5.2.14 I had to replace zend_extension with zend_extension_ts:

Be extra careful with xdebug.remote_host: this is the host where you develop and run Eclipse, and PHP will try to connect to Eclipse on it when debugging is enabled. Also make sure that the zend_extension line was not added automatically by the installation; if it was, don't add it again.
If there is any mention of the Zend Debugger in your php.ini file, you will have to comment it out. Restart Apache or whatever web server you're using and make sure the Xdebug installation is correct by running a simple PHP script that contains phpinfo() and searching the output for "xdebug".
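
For reference, a typical Xdebug 2 remote debugging block for php.ini looks like what the small helper below prints. This is a sketch, not a copy of my own configuration: the helper name xdebug_ini is made up, the extension path and the Eclipse host are values you must replace with your own, and 9000 is the Xdebug 2 default port. Xdebug 3 renamed these settings, so this applies to Xdebug 2 only.

```shell
#!/bin/sh
# Print Xdebug 2 remote debugging settings for php.ini.
# $1: full path to the xdebug extension (varies per system)
# $2: host where Eclipse runs (the value for xdebug.remote_host)
xdebug_ini() {
    cat <<EOF
zend_extension=$1
xdebug.remote_enable=1
xdebug.remote_handler=dbgp
xdebug.remote_host=$2
xdebug.remote_port=9000
EOF
}

# Example with made-up values; review the output before pasting it into php.ini:
#   xdebug_ini /usr/lib/php/modules/xdebug.so 192.168.1.10
```

After restarting the web server, php -m | grep -i xdebug is a quick command-line check that the extension is loaded for the CLI; the phpinfo() page remains the check that matters for the web server.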

Now the tricky part, Eclipse has to be configured to accept debugging sessions from XDebug. Follow the steps below:
  1. Open your project in Eclipse PDT
  2. In the main menu select Project->Properties
  3. On the left side of the window select "PHP Debug" and then click on "Configure Workspace Settings"
  4. On the "PHP Debugger" dropdown select Xdebug and click "Apply"
  5. Click "Configure" to the right of Xdebug in the same window.
  6. Select Xdebug and click "Configure".
  7. For "Accept remote session (JIT)" select "any" and click "OK". This is extremely important and this is where most people get stuck.
That's it, Eclipse is now configured. All we need now is to be in control of our debugging sessions. For this we will need to install a Firefox extension called "easy Xdebug" (yes Firefox, you're not developing PHP in IE, are you?).

Install the extension (if you can't find it, just google "firefox xdebug") and restart Firefox. After that you will notice a little green bug at the bottom right of Firefox, and if you hover over it, it says "Start xdebug session".

As a side note, you might have used Zend IDEs where the debug process starts from the IDE. In Eclipse PDT the process is reversed: you start from the page that you want to debug and PHP connects to Eclipse in order to establish a debug session. That is why we have installed the Firefox extension: the debug session starts from the browser.
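
The browser extension is only a convenience. Xdebug 2 also starts a session when the request carries an XDEBUG_SESSION_START parameter, which is handy for pages you call with curl or from another script. A small sketch follows; the helper name debug_url is my own, and ECLIPSE_DBGP is assumed as the session name, so adjust it if your Eclipse uses a different IDE key.

```shell
#!/bin/sh
# Append Xdebug's session-start parameter to a URL.
# $1: page URL, $2: session name (defaults to the assumed ECLIPSE_DBGP)
debug_url() {
    url=$1
    session=${2:-ECLIPSE_DBGP}
    case "$url" in
        *\?*) echo "${url}&XDEBUG_SESSION_START=${session}" ;;  # URL already has a query string
        *)    echo "${url}?XDEBUG_SESSION_START=${session}" ;;
    esac
}

# Example: trigger a debug session without a browser
#   curl "$(debug_url http://localhost/index.php)"
```

Eclipse reacts exactly as it does for a browser request: the "Select the local resource" window pops up as soon as PHP connects.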

Now open the page that you want to debug, on the server where you have just configured PHP with the XDebug extension of course. Click on the green bug I just mentioned to enable debugging and then reload the page. After this you will have to go to Eclipse and see that a new window has just popped up, asking you to "Select the local resource that matches the following server path". In a simple setup you will have just a single option, select the PHP file in that window and click "OK". Eclipse will ask you if you want to change to "PHP Debug perspective" and obviously you have to say "Yes". Optionally you can also check "Remember my decision". After this you should be in the debugging perspective, with Eclipse stopped on the first line of your code, meaning that you can now step through your code.

As a simple guideline you can use the following keys:
  • F5(Step Into) - steps into everything including function or method calls
  • F6(Step Over) - walks through but does not step into function or method calls
  • F8(Resume) - runs until the first breakpoint or end of the program
Breakpoints can be placed by double-clicking in the margin to the left of the line where you need the breakpoint. Try playing with the above keys to get a better idea of how they work.

That's it, I'm sure you'll realise that you can't live without debugging once you start using it.

Wednesday, 20 May 2009

Quickstart web services with SOAP and Zend Framework

Web services are software systems designed to support interoperable machine-to-machine interaction over a network. Nowadays if you want to connect external systems, you probably want or have to use web services. What I will discuss here is how to get your own SOAP web service up in minutes.

SOAP (Simple Object Access Protocol) is probably the most used web service protocol today. It relies on XML as its message format, and it uses HTTP for message transmission. The SOAP server uses WSDL (Web Services Description Language) to describe its services to external clients. WSDL is simply an XML-based language that provides a model for describing web services.

Back in the old days you had to know a lot about SOAP and WSDL to create a web service. Have a look at a raw WSDL file to see what I mean. Definitely not very good looking. Luckily Zend Framework has a nice component, Zend_Soap, that handles all the SOAP hard work you would otherwise have to do.

So without further ado, here's the code (since this is about a Zend Framework component, the code uses the Zend MVC, but Zend_Soap works without it too):

This is the source code for the controller:
require_once realpath(APPLICATION_PATH .

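class SoapController extends Zend_Controller_Action
{
    //change this to your WSDL URI!
    private $_WSDL_URI = 'http://URL_TO_WEB_SERVICE/soap?wsdl';

    public function indexAction()
    {
        //disable view rendering, only the SOAP/WSDL XML should be output
        $this->_helper->viewRenderer->setNoRender();

        if(isset($_GET['wsdl'])) {
            //return the WSDL
            $this->hadleWSDL();
        } else {
            //handle SOAP request
            $this->handleSOAP();
        }
    }

    private function hadleWSDL() {
        $autodiscover = new Zend_Soap_AutoDiscover();
        $autodiscover->setClass('Soaptest');
        $autodiscover->handle();
    }

    private function handleSOAP() {
        $soap = new Zend_Soap_Server($this->_WSDL_URI);
        $soap->setClass('Soaptest');
        $soap->handle();
    }

    public function clientAction() {
        $client = new Zend_Soap_Client($this->_WSDL_URI);

        $this->view->add_result = $client->math_add(11, 55);
        $this->view->not_result = $client->logical_not(true);
        $this->view->sort_result = $client->simple_sort(
            array("d" => "lemon", "a" => "orange",
                  "b" => "banana", "c" => "apple"));
    }
}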



And the code for the Soaptest.php class:

class Soaptest {
    /**
     * Add method
     * @param Int $param1
     * @param Int $param2
     * @return Int
     */
    public function math_add($param1, $param2) {
        return $param1+$param2;
    }

    /**
     * Logical not method
     * @param boolean $param1
     * @return boolean
     */
    public function logical_not($param1) {
        return !$param1;
    }

    /**
     * Simple array sort
     * @param Array $array
     * @return Array
     */
    public function simple_sort($array) {
        sort($array); //sorts the values in place
        return $array;
    }
}


You can also download the full project here.

As you can see you don't have to write a lot of code to back up the web service.

Let's discuss the controller first, because that's where the “magic” happens. The index action handles two types of requests: the request for the WSDL, handled by the hadleWSDL() method, and the actual SOAP request, handled by the handleSOAP() method.

You can go ahead and see what your WSDL looks like by accessing http://URL_TO_WEB_SERVICE/soap?wsdl, where URL_TO_WEB_SERVICE is the URL where you have deployed the example. Now imagine that you had to construct and maintain this yourself, by hand, as old school bearded guys would. Well you don't, because this is handled by Zend_Soap_AutoDiscover, which will create the WSDL file for you. The only thing that Zend_Soap_AutoDiscover needs to know is the class you want to use for the web service. Also, because PHP is not strongly typed, you will have to write PHPDoc blocks, because SOAP needs to know what types you are using as parameters and what types you are returning. Have a look here if PHPDoc does not ring a bell.

The SOAP server is handled by the Zend_Soap_Server class, and all it needs is the class you intend to use for the web service, and the URI to your WSDL file. Remember when you checked out how the WSDL file looks? That's exactly the URI you will have to use. In the example you will have to put that into the $_WSDL_URI variable, defined in the SoapController.

That was the SOAP server. Simple, right? Now let's run some tests against the server by implementing a simple SOAP client. The client is handled by the Zend_Soap_Client class, which is constructed in the same manner as the server class: it needs just the URI to the WSDL file. After you have constructed the client, you can access the methods defined by the SOAP server in the same way you would access the methods of an object. In the example above you have a simple class, called Soaptest, that defines three very simple methods. Feel free to change the class and test your own methods. While you are playing with the server, you might notice that the WSDL file is cached, so if you change something in the Soaptest.php file, you might not get the expected result. Just delete the cached WSDL file from /tmp/wsdl-* while you do your tests.
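
If you find yourself deleting the cached WSDL over and over, a tiny helper saves some typing. A minimal sketch, assuming the cache lives in /tmp as described above; the function name clear_wsdl_cache is made up for this example.

```shell
#!/bin/sh
# Remove cached WSDL files (wsdl-*) from the given directory.
# $1: cache directory, defaults to /tmp
clear_wsdl_cache() {
    cache_dir=${1:-/tmp}
    rm -f "$cache_dir"/wsdl-*
}

# Example: call it right after editing Soaptest.php
#   clear_wsdl_cache
```

Alternatively, PHP's soap.wsdl_cache_enabled ini setting can be set to 0 on your development machine, which disables the WSDL cache altogether.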

You definitely want to have a look at the Zend Framework documentation located here.

That was it, as promised: your SOAP web service up in minutes.

Tuesday, 5 May 2009

Create your perfect virtualised PHP development environment

The development environment is one of the most important factors to be taken into consideration for successful projects. The environment has three basic elements: the operating system, the text editor and the test environment. The operating system is not really important as long as you're used to it, and this also applies to the text editor. The test environment is something else, because you want to avoid “surprises” when your code goes to the production or company test environment. Your test environment should be very similar, if not identical, to the production environment. Unfortunately a lot of organisations have rules upon rules that stop you from having your perfect environment. Those of you who are allowed to develop on Linux, consider yourselves lucky, because most companies will provide a Windows/MacOS environment.

How are you supposed to have a PHP test environment under Windows/MacOS that is similar to your production Linux? The answer is quite simple: virtualisation. You can have a Linux virtual machine that runs your test environment, and the best part is that you have total control of the test environment, with root access. If that sounds good, read on, and follow these easy steps:
  • Start by downloading VMware Server for your operating system. You have to register, but the product is free.
  • Install VMware Server on your machine.
  • Go to the VMware Infrastructure Web Access and log in with your operating system credentials. If you have problems with that, check out this post.
  • Next, you have to create the virtual machine. In the VMware Infrastructure Web Access main page, go to Virtual Machine->Create Virtual Machine.
  • In the “Name and location” screen, just give a name to your virtual machine, and select the standard datastore. Click next after each step to move forward.
  • In the “Guest and Operating System” screen, you have to choose the right operating system you intend to install. For Linux, select “Linux operating system”, and the version you want.
  • In the “Memory and Processors” screen, how much memory you give the virtual machine depends on your own machine. I would recommend at least 512MB for a simple Linux installation. If you're not sure, start with 512MB; you can increase the memory later if you need to. The default setting for Processors should be ok, usually 1 processor is fine for testing.
  • In the “Hard Disk” screen, select “Create a new virtual disk” and input a proper size for your Linux installation. Disk space is really cheap these days, so put enough space.
  • In the “Network adapter” screen, you definitely want to add networking capabilities to your virtual machine. Select “Add a network adapter”, and choose from the three possible connections. I would recommend bridged, so that your virtual machine will be visible not only from your computer, but from your network.
  • In the “CD/DVD drive” screen, select “Use a physical drive” if you have your Linux distribution written on a CD/DVD, or “Use an ISO image” if you have just the .iso file.
  • Don't add a floppy drive on the “Floppy drive” screen, unless you have a good reason to add one.
  • If you need USB, add a USB controller from the “USB controller” screen.
  • After completing all these steps, the virtual machine is ready to be created, click “Finish” to do that. Wait for the machine to be created, you can see the progress at the bottom of the screen. You have your virtual machine now, next you have to install the operating system.
  • Select your newly created virtual machine from the left side of the screen and go to the “Console” tab. In order to access the virtual machine, you will need to install a browser plugin that allows you to see the virtual operating system inside your browser. Click on the install plugin link; the plugin will be installed and the browser will restart. Log in again to the VMware Infrastructure Web Access and go to the Console tab of your virtual machine.
  • Now you can start the machine by clicking in the middle of the console. You will have to click again the “Open console in new window” link to open the console.
  • Finally you can see the console. Depending on what you have selected on the “CD/DVD drive” screen, your Linux installation will start automatically if you have selected an existing iso image, or you will have to provide the CD/DVD for the installation to start. From this point it is up to you what you install and how you install it. Try to avoid installing graphical Linux, because it's pointless and it will consume a lot of memory.
After installing Linux, if you want to quickly install PHP and Apache, you might have a look at Zend Server Community Edition. I detailed the installation in my last post; you might want to have a look.

Now you have your own Linux machine inside your Windows and you can tune it as closely as possible to your production server. It's not necessary to access your machine through the console. As long as it is started you can access it with your favourite SSH client.

Don't forget that you will have to start the virtual machine every time you start your operating system.