Saturday, November 29, 2008

Thrift + HBase + PHP

Thrift is a framework initially developed by Facebook that allows RPC communication between a client and a server written in C++, Java, Python, PHP or Ruby. Using a definition file, it generates data types and service interfaces in the programming language you specify.

The HBase database has a Thrift interface, so it can be called from any language supported by Thrift. Right now, it's the easiest way to call HBase from PHP. Here is a little tutorial on how to install Thrift on Ubuntu and how to generate the PHP client files used for HBase access.

First of all, for any information about Thrift, you can have a look at the Thrift wiki. This tutorial also assumes that you have Apache and PHP 5 installed and working on your Ubuntu system.

Installing requirements
(From: http://wiki.apache.org/thrift/GettingUbuntuPackages)
You need to install the following packages in order to compile Thrift:
  • Automake
  • LibTool
  • Flex
  • Bison
  • Boost Libraries
In a shell, type:
sudo apt-get install build-essential automake libtool flex bison libboost*
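
Once the packages are installed, a quick check like the one below can confirm that the build tools are reachable from your shell. It is only a sketch: the helper name missing_tools is made up for this post, and the executable names (g++, make, automake, libtool, flex, bison) are my assumption about what the packages above provide. Boost is a set of libraries rather than a command, so it is not covered here.

```shell
#!/bin/sh
# Hypothetical helper: print every command from the argument list that
# cannot be found on PATH, one per line. Returns non-zero if any is missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return $status
}

# Executable names are assumptions about what the packages above install.
missing_tools g++ make automake libtool flex bison \
    || echo "Some build tools are missing, see the list above." >&2
```

If nothing is printed, every tool in the list was found.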
Installing Thrift
(From: http://wiki.apache.org/thrift/ThriftInstallation)
Get the latest Thrift sources, unpack them and go into the directory:
wget -O thrift.tgz "http://gitweb.thrift-rpc.org/?p=thrift.git;a=snapshot;h=HEAD;sf=tgz"
tar -xzf thrift.tgz
cd thrift
Take note that the above source did not compile for me on Ubuntu 8.10. I had to use a specific snapshot from http://gitweb.thrift-rpc.org/?p=thrift.git;a=snapshot;h=1c8c4bb279578cb76bfcaa419d5b06fb7a187614;sf=tgz
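
The two download URLs above differ only in the value of the h parameter (HEAD or a commit hash). As a small sketch, a helper can build the URL for whichever snapshot you want; the function name snapshot_url is hypothetical, and the URL pattern is simply the one used in this post.

```shell
#!/bin/sh
# Hypothetical helper: build the gitweb snapshot URL for a given commit.
# With no argument it falls back to HEAD, like the first wget command above.
snapshot_url() {
    printf 'http://gitweb.thrift-rpc.org/?p=thrift.git;a=snapshot;h=%s;sf=tgz\n' "${1:-HEAD}"
}

# Print the URL for the latest sources.
snapshot_url

# Example usage (not run here): fetch the snapshot that compiled on Ubuntu 8.10.
#   wget -O thrift.tgz "$(snapshot_url 1c8c4bb279578cb76bfcaa419d5b06fb7a187614)"
```

Keep the double quotes around the URL when you pass it to wget, since it contains ; and ? characters that the shell would otherwise interpret.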

Let's now configure, compile and install Thrift:
./bootstrap.sh
./configure
make
sudo make install
Generating Thrift client libraries
You should now have a fresh Thrift installation. We now need to generate the PHP files that will be included in your application in order to access HBase. The HBase definition file is included in the HBase sources, so we do not have to write it. If you have installed HBase into /usr/local/hbase/, you can copy the Thrift definition file into your home directory:
cp -r /usr/local/hbase/src/java/org/apache/hadoop/hbase/thrift ~/thrift_src
cd ~/thrift_src
Otherwise, you need to modify the above command to match your HBase installation path. We can now generate the PHP files:
thrift -php Hbase.thrift
If you have followed all the steps above correctly, Thrift should have generated a directory named gen_php which contains two PHP files. Those two files contain the classes you will use to access HBase. They also depend on the Thrift base files, which you can find in the Thrift source directory. The following steps assume that your Apache document root is /var/www. Let's copy the Thrift base files and create a "packages" directory which will contain the previously generated files.
cp -r ~/thrift/lib/php/src /var/www/thrift
mkdir /var/www/thrift/packages
cp -r ~/thrift_src/gen_php /var/www/thrift/packages/Hbase
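
Before going further, it can help to verify that the copy produced the layout the PHP client expects. The sketch below rests on two assumptions of mine: that the Thrift base files include a file named Thrift.php, and that one of the generated files is named Hbase.php. Adjust the paths if your copy differs. The function name check_layout is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: check that a Thrift root directory contains the files
# the PHP client is assumed to include. Prints each missing path and returns
# non-zero if at least one is missing.
check_layout() {
    root=$1
    status=0
    # File names below are assumptions, not taken from the steps above.
    for path in "$root/Thrift.php" "$root/packages/Hbase/Hbase.php"; do
        if [ ! -f "$path" ]; then
            echo "missing: $path"
            status=1
        fi
    done
    return $status
}

check_layout /var/www/thrift || echo "The Thrift layout looks incomplete." >&2
```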

Let's now start the HBase Thrift server:
/usr/local/hbase/bin/hbase thrift start
Everything you need to access HBase from PHP is now ready to be used. To test the installation, let's use the demo client from the HBase sources.
cp /usr/local/hbase/src/examples/thrift/DemoClient.php /var/www/DemoClient.php
You need to modify the above file (/var/www/DemoClient.php) in order to change the Thrift root path. Simply change the value of $GLOBALS['THRIFT_ROOT'] to /var/www/thrift. You should now be able to access the file through Apache. The script simply tests your HBase installation by creating a table, inserting data into it, etc.
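
If you would rather not edit the file by hand, the change can be scripted with sed. This is a minimal sketch that assumes the root path is set on a single line of the form $GLOBALS['THRIFT_ROOT'] = '...'; and that the new path contains no |, & or backslash characters. The function name set_thrift_root is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the content of a PHP file with the value assigned
# to $GLOBALS['THRIFT_ROOT'] replaced by the given path. The input file itself
# is left untouched. Lines that only read THRIFT_ROOT are not modified, because
# the pattern requires an "=" right after the closing bracket.
set_thrift_root() {
    file=$1
    new_root=$2
    sed "s|^\(.*THRIFT_ROOT'] *= *\).*|\1'$new_root';|" "$file"
}

# Example usage (not run here): write the result to a new file, then review it
# and move it over the original.
#   set_thrift_root /var/www/DemoClient.php /var/www/thrift > /tmp/DemoClient.php
```

Writing to a separate file first makes it easy to compare the result with the original before replacing it.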

That's it. If you have any questions, contact me!

Wednesday, November 26, 2008

Welcome!

Welcome to my brand new blog. First of all, I need to tell you that my English is not perfect since my primary language is French, but I'll do the best I can to write quality posts here.

Why am I starting a new blog among all the others? I had a personal blog a couple of years ago (it's still online, but I'm not posting on it anymore), and now I want to switch to something more professional: somewhere I can share my professional experiences and discoveries. As I try to learn new technologies as they are created, this blog may help people more easily find solutions to problems I have stumbled over too. So don't expect a flood of posts here, but I'll do my best to maintain this blog.

So what is my background? I'm an undergraduate software engineering student at the École de technologie supérieure (ÉTS) in Montréal, Québec, Canada. I also graduated in computer science with what we call here in Québec a DEC, a college diploma we can get instead of going to university. But I had a feeling that it was missing a lot of things, so I decided to go to university. I have done many personal and professional projects using LAMP (Linux + Apache + MySQL + PHP) technologies, but also with Microsoft technologies (C#, VB, etc.). I also had the opportunity to work with older technologies in a mainframe environment (Cobol, Easytrieve, Natural). So I have broad experience, and this is why I can and want to share my discoveries here.

I hope you'll find what you need here. Feel free to contact me if you have any questions or comments.