Saturday, November 29, 2008

Thrift + HBase + PHP

Thrift is a framework initially developed by Facebook that allows RPC communication between a client and a server written in C++, Java, Python, PHP or Ruby. Using a definition file, it generates data types and service interfaces in the programming language you specify.

The HBase database has a Thrift interface, so it can be called from any language supported by Thrift. Right now, this is the easiest way to call HBase from PHP. Here is a little tutorial on how to install Thrift on Ubuntu and how to generate the PHP client files used for HBase access.

First of all, for any information about Thrift, you can have a look at the Thrift wiki. This tutorial also assumes that you have Apache and PHP 5 installed and working on your Ubuntu system.

Installing requirements
(From: http://wiki.apache.org/thrift/GettingUbuntuPackages)
You need to install the following packages in order to compile Thrift:
  • Automake
  • LibTool
  • Flex
  • Bison
  • Boost Libraries
In a shell, type:
sudo apt-get install build-essential automake libtool flex bison libboost*
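Before moving on to the build, it can be handy to confirm that the tools really are on your PATH. This is only a minimal sketch: missing_tools is a helper name made up for this example, and the list of binaries (for instance libtoolize for the LibTool package) is my assumption about what the packages above install.

```shell
#!/bin/sh
# missing_tools (hypothetical helper): print the names of the given
# commands that cannot be found on the PATH, separated by spaces.
# Empty output means every command was found.
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Strip the leading space left by the concatenation above
    echo "${missing# }"
}

# Assumed binary names for the packages listed above
missing_tools gcc g++ make automake libtoolize flex bison
```

If the last line prints nothing, the build tools were all found. The Boost libraries are not commands, so this check does not cover them.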
Installing Thrift
(From: http://wiki.apache.org/thrift/ThriftInstallation)
Get the latest Thrift sources, unpack them and go into the directory:
wget -O thrift.tgz "http://gitweb.thrift-rpc.org/?p=thrift.git;a=snapshot;h=HEAD;sf=tgz"
tar -xzf thrift.tgz
cd thrift
Note that the above snapshot did not compile on Ubuntu 8.10. I had to use a specific snapshot from http://gitweb.thrift-rpc.org/?p=thrift.git;a=snapshot;h=1c8c4bb279578cb76bfcaa419d5b06fb7a187614;sf=tgz

Let's now configure, compile and install Thrift:
./bootstrap.sh
./configure
make
sudo make install
Generating Thrift client libraries
You should now have a fresh Thrift installation. We now need to generate the PHP files that your application will include in order to access HBase. The HBase Thrift definition file ships with the HBase sources, so we do not have to write it. If you have installed HBase into /usr/local/hbase/, you can copy the Thrift definition file into your home directory:
cp -r /usr/local/hbase/src/java/org/apache/hadoop/hbase/thrift ~/thrift_src
cd ~/thrift_src
Otherwise, you need to modify the above command to match your HBase installation path. We can now generate the PHP files:
thrift -php Hbase.thrift
If you have followed all the steps above correctly, Thrift should have generated a directory named gen_php which contains two PHP files. These two files contain the classes you will use to access HBase, but they also depend on the Thrift base files that you can find in the Thrift source directory. The following steps assume that your Apache document root is /var/www. Let's copy the Thrift base files and create a "packages" directory which will contain the previously generated files:
cp -r ~/thrift/lib/php/src /var/www/thrift
mkdir /var/www/thrift/packages
cp -r ~/thrift_src/gen_php /var/www/thrift/packages/Hbase

Let's now start the HBase Thrift server:
/usr/local/hbase/bin/hbase thrift start
Everything you need to access HBase from PHP is now in place. To test the installation, let's use the demo client from the HBase sources:
cp /usr/local/hbase/src/examples/thrift/DemoClient.php /var/www/DemoClient.php
You need to modify the above file (/var/www/DemoClient.php) in order to change the Thrift root path. Simply change the value of $GLOBALS['THRIFT_ROOT'] to /var/www/thrift. You should now be able to access the file through Apache. The script simply tests your HBase installation by creating a table, inserting data into it, etc.
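If you would rather not open an editor, the THRIFT_ROOT change can probably be scripted. Here is a rough sketch, assuming the demo client sets the value with a single-line assignment of the form $GLOBALS['THRIFT_ROOT'] = '...'; (check your copy first). The function name set_thrift_root is made up for this example.

```shell
#!/bin/sh
# set_thrift_root (hypothetical helper): rewrite the
# $GLOBALS['THRIFT_ROOT'] assignment in a PHP file.
# usage: set_thrift_root FILE NEW_ROOT
# Assumes the assignment sits on one line and that NEW_ROOT
# contains no '|', '&' or backslash characters.
set_thrift_root() {
    file="$1"
    root="$2"
    # '.' stands in for the quote characters around THRIFT_ROOT
    pattern='\$GLOBALS\[.THRIFT_ROOT.] *=.*;'
    replacement="\$GLOBALS['THRIFT_ROOT'] = '$root';"
    # Write to a temporary file, then replace the original
    sed "s|$pattern|$replacement|" "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}
```

With the paths used in this tutorial, the call would be set_thrift_root /var/www/DemoClient.php /var/www/thrift (you may need sudo to write into /var/www).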

That's it. If you have any questions, contact me!

1 comment:

Vivi said...

Hi,
Is it possible to read a float from HBase?
I have one application that writes floats and a PHP one that tries to read that data, but all I get is screwed-up output.