Friday, October 12, 2007

FOSS - Made in India !

While returning home from work today, I went to the neighboring super-market and bought the LFY (Linux For You) magazine for the month on an impulse. I have not been a regular reader of this magazine since Feb 2006 when I stopped writing for them. So it was just pure impulse on my part...

I was going through the articles lazily in bed a few minutes back. I started with the review of SLES desktop and then read the article on autopackage. The project, started by a group of young enthusiasts based in the U.S. and Sweden, makes package installation across Linux distros easy (say "InstallShield for Linux"). I then flipped casually to the next page where there was a guest column by our very own Kenneth Gonsalves, the Chennai based open source enthusiast and activist.

He was asked the typical question which every FOSS related magazine in India loves to repeat in every alternate edition - "Why are Indians not contributing to open source ?" I glanced through his reply. Little did I realize his comments would make my day...

Here I quote his reply verbatim...

"There are a large number of indians in practically all major FOSS projects. However if you look closely, you will find that the vast majority are not resident in India. Yes, there are several hundred people in India actively contributing to projects big and small. But genuine 'Made in India' projects of international repute ? I can see only four or five: Anjuta (Naba Kumar has left the country); HarvestMan by Anand Pillai; Deepofix by Abhas Abinav; IndLinux by Karunakar and team and Coppermine by Tarique Sani..."

Boy, was I not taken aback by surprise and blushing with genuine pride!... :) I have used Anjuta and have great respect for its original developer (Naba Kumar), though I did not know which country he currently lives in. I think IndLinux and Coppermine are great projects and frankly I have not heard about Deepofix (my bad...). I have always considered my own contribution (HarvestMan) as a David among the Goliaths. I have a sense of my place among the international open source developers. I have made a contribution worth mentioning, but I have never considered it accomplished enough to figure among "the list of original contributions" by India to FOSS. It is surely a matter of pride to see that a well known and widely acknowledged FOSS community member thinks about the project like that.

Kenneth, you clearly made my day. Thanks for the kind words. It felt really nice to see the word "HarvestMan" staring back at me from a page in an IT magazine which I was casually flipping through at midnight. Surely for a born-again techie like me, there is no better recognition than something like this.

I consider this the best compliment I have ever received in my life for something I have done. It makes me all the more excited about open source and FOSS and the spirit of sharing code and having fun at the same time.

Monday, October 08, 2007 updated

Since I completed the port of Rbnarcissus to Python last week, I have been working on getting the bugs fixed. A lot of work has been done on this during the past week, and the code now correctly parses a set of more than 30 Javascript input samples of varying size and complexity. Also, the behaviour is very close to that of Rbnarcissus. Both parsers now seem to fail at the same places for the samples they cannot parse, which is a good sign that the code now approximates Rbnarcissus very well.

Since the code is now stable and somewhat usable, I have made it publicly available. The code can be browsed in the folder. There won't be any packages or formal documentation till I feel that the code is beta quality and can be made available as a Python package.

Sunday, September 30, 2007

Today I finished porting Rbnarcissus to Python.

I managed to finish the porting in a total of 7 days, spending approximately
2-3 hours per day. The test parse script has also been ported. With this
I managed to produce the JS parse tree for the following simple Javascript function:

function test() {
    var a = 10
    var b = 20
    var c = a + b
}

As a result of parsing, the following function dictionary was printed.

{test: []}

This library will be made available as part of HarvestMan and the EIAO projects.
This is perhaps going to be the first open source pure Python parser for Javascript.

I need to do a bit more testing on more complex Javascript code before I make the code publicly available. This might take another week or so, depending upon how much time I get to spend on this in the coming days...

Tuesday, September 18, 2007

Rbnarcissus Porting - Day 2

On the second marathon day of Rbnarcissus porting I completed porting another 5 functions in the Parser.rb module. What remains are two huge functions which parse statements and expressions. That should be taken care of by another day or two of hectic hacking...

Monday, September 17, 2007

BangPypers move in to

The entire Python family of BangPypers moved en masse to their new home at the website three days back, on Sep 14 2007.

Jeff Rush of Python Advocacy Blog was instrumental in creating the new mailing list hosted at, after I sent him a request regarding the same. Jeff was immensely kind and helpful during the whole process, which was completed by the end of the day. Thanks a lot Jeff!

The move came after a slew of discussions which started with this thread in early August by Anand. C (strandpyper). Quickly a kind of agreement was reached among the participants of the thread about moving out to a better place from the existing Y! group, preferably at the website itself.

A lot of people participated in the discussion, giving valuable suggestions, which finally helped us reach an agreement and made the task of moving the members to the new list a painless process.

The new mailing list is public and open to anyone. This should hopefully expose the BangPypers members to the larger Python community in the international scene and give the group more visibility. It will be nice to see if any kind of larger group activities can be arranged as a part of such an exposure.

The group still lacks a coherent theme. It needs one so that it can function as a tight group with shared interests, rather than the current fragmented one. One way of doing this is to execute open source Python projects as a group, by forming small interest groups inside the larger group, which can then focus on a particular project. The monthly meetings also need to be revived, which can bring more thought into what can be done in the coming months.

If you are interested in Python and/or the BangPypers group, feel free to comment with your thoughts.

Rbnarcissus Porting - Day 1

Today I completed almost 40% of the porting of Rbnarcissus to Python. All the data structures and regular expressions have been ported along with the "Tokenizer" class.
There still remain around 13 classes to be ported, which form the bulk (60%) of the code.
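
To give a feel for what a regular-expression driven tokenizer does, here is a minimal sketch in Python. It is not the ported Rbnarcissus code: the token table, the `TOKEN_SPEC` and `MASTER` names and the `tokenize` function are all hypothetical, and they cover only a tiny subset of Javascript.

```python
import re

# Hypothetical token table for a tiny Javascript subset.
# This is an illustration only, not the Rbnarcissus token definitions.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("KEYWORD", r"\b(?:function|var|return)\b"),
    ("IDENTIFIER", r"[A-Za-z_$][A-Za-z0-9_$]*"),
    ("OPERATOR", r"[+\-*/=]"),
    ("PUNCT", r"[(){};,]"),
    ("SKIP", r"\s+"),
]

# One combined pattern with a named group per token type.
MASTER = re.compile("|".join("(?P<%s>%s)" % pair for pair in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (token_type, text) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(
                "unexpected character %r at offset %d" % (source[pos], pos)
            )
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("var c = a + b"))
```

A real tokenizer also has to deal with strings, comments, regular expression literals and multi-character operators, which is where most of the effort goes.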

If I keep up the same pace, I should be done with this in another 3-4 days. This could become a useful tool for Python programmers, having a pure Python parser for Javascript.

Watch this space for more updates.

Saturday, September 08, 2007

The Ruby Way - A Python programmer learns Ruby

Ruby is the language I am always putting off learning until the next day. I came across Ruby almost at the same time I started learning Python. However due to its similarities with Perl, I was never able to develop an affinity for the language.

I knew Ruby needed to be in my toolbox of languages and it was only a matter of time before I got to it. This happened last week, while I was trying to solve a very practical programming problem.

I was trying to develop a Javascript parser/tokenizer for HarvestMan so that HarvestMan can crawl pages which define the DOM dynamically using Javascript. I have been at this problem for some time now, but never came across a pure Python or even a C/C++ extension Javascript parser I could use. Last week I came across Rbnarcissus, a pure Ruby port of Narcissus, the open source Javascript engine written in Javascript.

I have set myself the task of porting this code to pure Python. I figured I knew enough Ruby to do this without any additional help, but one look at the code and I realized I needed help. I bought Hal Fulton's excellent Ruby book The Ruby Way from a book shop. (The book is a bit pricey for a low priced Indian edition, but it is worth the money.)

I have been spending the last two days with the book. I have realized a few things about The Ruby Way when compared to the Zen of Python.

1. Ruby follows the Perl paradigm of There is more than one obvious way to do it, when compared to the Pythonic There should be one-- and preferably only one --obvious way to do it. (An import this in a Python interpreter prompt gives you the Zen of Python).

2. Ruby is a more complete object oriented language than Python and empowers its types and objects much more than Python does. However this also makes Ruby slightly harder to learn than Python.
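
For readers who have not tried it, the Zen of Python mentioned in the first point can also be read from a script. The sketch below assumes CPython, where the `this` module keeps its text ROT13-encoded in an attribute named `s`; that is an implementation detail rather than a documented API.

```python
import codecs
import contextlib
import io

# Importing `this` prints the Zen of Python as a side effect. The output is
# captured here only to keep the example quiet.
with contextlib.redirect_stdout(io.StringIO()):
    import this

# Assumption: CPython's this.py stores the text ROT13-encoded in `this.s`.
zen = codecs.decode(this.s, "rot13")

# Pick out the aphorism that the comparison with Ruby and Perl refers to.
one_way = [line for line in zen.splitlines() if "obvious way" in line]
print(one_way[0])
```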

I feel Python is still the ideal language for a newbie who wants to learn a very high level programming language. However Ruby is much more powerful and suited for the expert programmer who expects more power out of his objects and types.

I have not got completely into the Ruby Way yet, but I am on my way. I am hoping that combining the Ruby Way with the Zen of Python will lead me to the Tao of Programming...

Friday, August 31, 2007

Some open (source) clarifications...

Of late, I have been getting a few queries from people asking questions as follows:

"Have you dropped open source programming altogether ?"
"Will you stop coding in Python anymore ?"
"Have you lost interest in opensource ?"

When I realized that this had caused confusion and some slight tension with a friend who commands a lot of respect and trust and is a major facilitator of open source, I thought it was time to put it in perspective by blogging about it.

I think I am myself to blame for some of this confusion, which resulted from a recent post in my blog in which I mentioned I was "taking a break from open source".

I did not intend a complete break at all...rather the post was an impulsive reaction. Let me explain.

First of all, I have not dropped open source development. I just love it too much to drop it and it has now become second nature to me after being an open source developer for nearly four years. I *cannot* stop being one overnight.

Regarding Python, it is my favorite language. I can never stop coding in Python. I continue to write open source code in Python, with most of that effort going towards my open source project HarvestMan, whose 2.0 version is under active development. Even otherwise, my default reaction is to start a Python prompt if I have to perform some simple computation or even an arithmetic calculation!

I have not lost interest in open source at all. What happened was that I lost interest in "commercializing open source" or rather working for entities which focus on commercializing open source without contributing anything back. I had some (what appears to be now) utopian dreams in this respect, and imagined a scenario where such predator companies coexist with the people who spent their time doing open source development and contribute to a greater goal. I have realized that such dreams are pipe dreams and that most of these new "open source companies" are in it to try and make a fast buck. They have no real intentions of playing it long term or making a difference.

However when you associate yourself with such entities, you sometimes tend to confuse your concept of open source with theirs. This was what happened to me. I spent enough time at such a place for my concepts of open source and community development to get polluted and corrupted, which ended up confusing and frustrating me a lot. When I quit, the natural reaction was a general apathy towards everything which was labeled "open source" for a while. This is something like those allergic reactions you get when you are exposed to a change in weather or surroundings; however the good thing is that an allergic reaction is not a permanent disease :)

Thankfully such reactions are not long lasting and I have finally come out of my black reaction. One good lesson I learned in the whole process was to keep my ideology separate from the ideology of the place where I work and not to mix both. If you do that you can avoid feeling frustrated when things do not work out the way you thought they would. An investment of time can be fruitful or fruitless and you may not be greatly affected; however an investment of ideology and principles can be quite frustrating if it does not bear fruit the way you thought it would.

So, I am back to my good old ways and feeling better about it all. I will continue to be active in open source and help the community (and myself) by contributing any little code and effort I can in terms of my small projects.

I think that renaming my blog to what it was originally might be a good start and that is what I have just done. Thanks to everyone who inquired about this and well, I am being true to my good old ways.

Sunday, August 19, 2007

Random title

Looks like I am changing the title of my blog too randomly these days. I kind of like the latest one, and I might just settle on that, for a random, arbitrary period of time !

HarvestMan 2.0

I have kickstarted the release process of HarvestMan 2.0, the next big version update of HarvestMan, after a gap of nearly two years. A lot of development has gone into the program during the last two years, with countless bugs getting fixed and numerous new features getting added.

As part of the process, I am making 2.0 alpha package drops available on the website. These drops are of the complete HarvestMan package with source code and documentation. You can check out the latest news on the website for more information.

If you are an existing user of HarvestMan, you should check out the alpha drops.
On the other hand, if you are new to HarvestMan, you are also welcome to check these out, since HarvestMan-2.0 is much more robust and feature-rich than the current HarvestMan release, namely 1.4.6, which was released two years back.

Monday, August 06, 2007

Chasing dollars...?

Apparently, the cat is out of the bag. Read this.

Wednesday, August 01, 2007

A self-demotion

I just demoted myself from owner of BangPypers to moderator. Truth is that I have been getting a bit bored with open source & community in general. Moreover, over the last year I have not really been able to do as much Python community work as I did in the first year after the inception of BangPypers. Since I felt I was not doing justice to the role of owner of the group, I decided to give it a break.

I shall most likely demote myself to just member pretty soon. Taking a break from Python might allow me to look at other languages (open source or others) out there and allow me to do something new...

Good bye BangPypers-owner. It was nice being you...

Tuesday, July 17, 2007

A break from open source

I am taking a break from open source, in terms of writing open source software and spending my time working with an "open source" company. My recent experiences in working full time in open source have not been very positive or pleasant. The focus has shifted - in my new avatar I won't have to think about open source as a problem to solve.

Does it mean I will stop looking into open source altogether ? No, for a couple of reasons - nowadays open source is all pervasive and you need to keep in touch with it if you want to be in sync with the changes happening in the software landscape around you, of which you are a part. It still remains the best way for a developer to publish his original ideas and then take them to a larger audience with almost zero effort; also, a lot of quality projects and products are open source, which means that even if your product is proprietary it is influenced by open source, directly (through borrowing code) or indirectly (through borrowing ideas/algorithms).

I will also continue working on HarvestMan, which will remain an open source project.

In fact, the main changes are in two things.

1. The name of this blog has changed - It no longer has open source in it :)
2. I won't be paid to write tools for open source integration or for making use of open source in commercial enterprises - I am out of that business - entirely.

Perhaps one day I will be back to that business. Of course the software landscape keeps changing daily and the "open source companies" of the future will most likely have totally different business models from the current ones - an interesting future to watch out for.

Friday, July 13, 2007

HarvestMan crawls up in Google rankings

About 2 months after HarvestMan moved from freezope to the new site, it is nice to note that it is back up in the top 10 Google results for "HarvestMan". In fact, I note it is the 10th result as of writing this post.

Thanks Google! :)

Tuesday, July 10, 2007

Stacked up or hyped up ?

There is a rather old article on the Computerworld website. The article talks about what it calls the recent "scramble" to create application stacks for enterprises using open source components. The criticism is that the market does not really exist and that the term is a rather hyped up buzzword. One tends to draw subtle parallels with the push technology of the heady dotcom days. The technology existed, but it was trying to solve a problem that did not exist.

Open source stack vendors seem to be in a similar situation, developing solutions in search of a problem. When will this hype cycle burst ?

Wednesday, July 04, 2007

Barcamp Bangalore is happening

And this time, many participants from BangPypers will be there. Go to the BangPypers wiki on the Barcamp website to catch all the action.

Barcamp is happening on July 28 and 29 at IIM Bangalore.

Monday, July 02, 2007

Adios Amigas

I submitted my resignation to my employer today. I will be seeing the last of them on the 20th of this month.

Friday, June 22, 2007

A new look

My blog has a new look after Blogger "forced" me to change the template.

Tuesday, June 19, 2007

Power of Twisted

I have been planning to learn Twisted for a long time, but never had a real problem to try it on. Today, I was writing an XML-RPC server, and felt that this was a good time to learn Twisted, since I was not feeling very comfortable with the rather simple XML-RPC server provided by the Python standard library.

I read the Twisted HOWTO on XML-RPC and was off and going within 30 minutes. Within minutes I had the same XML-RPC server rewritten to use Twisted's powerful reactor framework. It took some time to figure out how to add basic HTTP authentication, but with some googling I was able to do this also in a couple of hours!

Now, I have a small framework consisting of a module which provides an extensible XML-RPC server using Twisted with basic HTTP authentication. Since Twisted also supports SOAP out-of-the-box, it is quite simple to extend this to support SOAP as well.
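
The general idea of putting basic HTTP authentication in front of an XML-RPC server can be sketched as follows. Note that this is not the Twisted module described above: to stay free of extra dependencies it uses the standard library's `xmlrpc.server` instead, and the credentials, the `AuthHandler` class and the `add` procedure are all made up for illustration.

```python
import base64
import hmac
from xmlrpc.server import SimpleXMLRPCRequestHandler, SimpleXMLRPCServer

# Hypothetical credentials, for illustration only.
USERNAME = "harvest"
PASSWORD = "secret"


def is_authorized(header):
    """Return True if an Authorization header carries the expected Basic credentials."""
    if not header or not header.startswith("Basic "):
        return False
    try:
        decoded = base64.b64decode(header[len("Basic "):]).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return False
    user, _, password = decoded.partition(":")
    # compare_digest avoids leaking information through comparison timing.
    return (hmac.compare_digest(user.encode(), USERNAME.encode())
            and hmac.compare_digest(password.encode(), PASSWORD.encode()))


class AuthHandler(SimpleXMLRPCRequestHandler):
    """Request handler that rejects calls lacking valid Basic credentials."""

    def do_POST(self):
        if is_authorized(self.headers.get("Authorization")):
            return super().do_POST()
        # Ask the client to authenticate instead of dispatching the call.
        self.send_response(401)
        self.send_header("WWW-Authenticate", 'Basic realm="xmlrpc"')
        self.send_header("Content-Length", "0")
        self.end_headers()


def add(a, b):
    """Example remote procedure."""
    return a + b


def make_server(host="127.0.0.1", port=8000):
    """Build (but do not start) a server exposing `add` behind Basic auth."""
    server = SimpleXMLRPCServer((host, port), requestHandler=AuthHandler,
                                logRequests=False)
    server.register_function(add, "add")
    return server
```

Calling `make_server().serve_forever()` would start the server; a client can then pass the credentials in the URL it gives to `xmlrpc.client.ServerProxy`. Basic authentication sends the password almost in the clear, so a real deployment would want HTTPS on top.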

The power of Twisted is quite amazing. I am thinking of writing a version of HarvestMan which runs on top of Twisted...

A scene from Dilbert

Dilbert and his friends Alice, Asok and Ratbert in the cafeteria.

Ratbert: Hiya Dilbert, your face is beaming; What is news ?
Dilbert: The Boss has moved on.
Asok the intern: How does it make a difference ? These managers are the same everywhere!
Ratbert: No, you are underestimating the black powers of The Boss. He can make it or break it. I have heard, he breaks it most of the time.
Alice: I heard Catbert might also get a move on.
Ratbert: Really ?
Alice: I heard something.
Ratbert: Good riddance.
Dilbert: Yup.

Monday, May 28, 2007

HarvestMan moves to

The HarvestMan project has moved to a new address from today. It was earlier hosted gratis on freezope. However the freezope virtual hosts have not been working properly for more than a month, which has affected the availability of the HarvestMan project website. Google also reacted promptly, taking the HarvestMan project off its top results for "harvestman". Whereas it was the number one result, say, two months back, it is nowhere to be seen in the top 20 results now!

The new website has the same look and feel as the original one, since I simply copied the pages over. I hope this will help the website and the project to recapture some lost page views in the coming days.

Monday, May 21, 2007

Art of Innovation

I am currently reading the book of the same name by Tom Kelley. Tom Kelley is the general manager of IDEO, a legendary design firm based in Palo Alto, California. IDEO is well-known as a product design firm which has created a number of innovative products and designs over the years since 1991. Some of the best examples are the original Apple mouse, the Palm V PDA and the first external module for the Handspring Visor (Eyemodule). The company has won 48 IDEA (Industrial Design Excellence Award) awards, far more than any other firm in history.

The book explains the various aspects of the work culture at IDEO, giving the reader a very vivid picture of the elements that give the company an edge in innovation when compared to similar firms. The title "Art of Innovation" is a bit misleading since the book is all about IDEO and how it is able to achieve excellence and success in its design. It focuses on the practical aspects of weaving the "innovation factor" into the business processes in a company, and does a very good and thorough job of it. A must-read for anybody who is working in industries which involve creative work.

I shall be posting a more complete review of the book after I finish reading it.

Wednesday, April 25, 2007

What I am doing these days

These are the two things I am doing these days, apart from regular work at office.

1. Developing the next version of HarvestMan
2. Developing the next version of HarvestMan

Well, that is right... It was not a typing mistake :-)

HarvestMan is going to get a major update coming May, and it will be the result of more than 1.5 years of work. In fact the program has changed so much that I am changing the major version. It is going to be HarvestMan-2.0.

There are certain surprises with this HarvestMan release. One of the interesting changes for NLP and Computational Linguistics programmers will be the addition of a plugin API that makes developing extensions for HarvestMan a breeze. In fact, the current CVS of HarvestMan already features an extension which binds it to an existing open source indexing engine. Apart from that, the program features a number of changes from the earlier version (1.4.6) so that it is almost on its way to becoming a "platform for web crawler software development", which is where I envision it to be.

Apart from that, this time HarvestMan will consist of two apps in one - that's right. There will be two applications using the same codebase. The crawler application (HarvestMan, of course) and a brand new web downloader application which supports multipart downloads. Let the name of this application be a mystery for the time being. The application just might change the command-line download experience of Unix/Linux users from the typical wget one. I will write more about it in the coming weeks.
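
Multipart downloading is commonly done by splitting a file into byte ranges and fetching each one with its own HTTP `Range` request. This post does not say how the new application does it, so the sketch below is only an assumption about the general technique, and the `split_ranges` and `range_header` helpers are hypothetical names.

```python
def split_ranges(total_size, parts):
    """Split bytes 0..total_size-1 into `parts` contiguous inclusive ranges."""
    if total_size <= 0 or parts <= 0:
        raise ValueError("total_size and parts must be positive")
    parts = min(parts, total_size)  # never create empty ranges
    base, extra = divmod(total_size, parts)
    ranges = []
    start = 0
    for index in range(parts):
        # The first `extra` parts take one additional byte each.
        length = base + (1 if index < extra else 0)
        end = start + length - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def range_header(start, end):
    """Build the HTTP request header asking for one inclusive byte range."""
    return {"Range": "bytes=%d-%d" % (start, end)}


for start, end in split_ranges(10, 3):
    print(range_header(start, end))
```

Each range can then be fetched on its own connection and the pieces written back at their offsets, provided the server honours range requests.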

In fact all of this is already in CVS. Anyone interested can check out the latest code from the berlios repository using anonymous CVS. There is not much documentation apart from the documentation in the code, but the code is pretty stable at the moment.

This should get released by mid May.

Tuesday, April 24, 2007

Open source and Innovation

What makes a successful open source project ? What makes a successful open source business ? How do successful open source projects make the transition to a good business which still keeps with the spirit of open source and open standards ?

These are questions any developer who is a serious contributor to open source would be interested in. Even if you are not an open source contributor, you will be interested in these questions, if your company is working in open source.

According to my experience so far, with my own projects and with some of the international projects I have participated in, a successful open source project brings some new ideas to the table. New ideas need not be confused with a new way of doing things or new I.P. It can be a new implementation of an existing protocol, it can be an open implementation of a proprietary standard, or it can be a project that uses existing open source components or applications to solve an existing problem (or a new one) in interesting and innovative ways. These need not always generate new kinds of intellectual property.

Yes, the key here is innovation. Good open source projects bring a fresh way of solving existing problems; they give a fresh perspective to existing ways of doing things. Sometimes they are able to rewrite the rules by capturing the imagination of many hundreds of developers and thousands of supporting community members - a good example is the Firefox community. Sometimes, it will be a rather closeted group of skilled people in a rather niche area who find a void in the experience of open source applications/operating systems and try to fill the gap - a good example is the Beryl/Compiz projects which are working hard to bring display compositing to the Linux and open source crowd.

However, a common thread to all these projects is this - they innovate. They innovate in fresh ideas, in simplifying user experience and sometimes in performance. They often open up an entirely new facet to an existing problem which makes programming a joy.

What do successful companies in open source have in common ? They understand the importance of keeping the developer crowd happy. They are keen to become good citizens of the open source community and contribute either their manpower or projects to the community - some do both. They understand that it is important not to just become consumers of open source but also stakeholders and participants.

When a company fails to understand this, or fails to create a working, effective developer policy towards open sourcing, it is prone to be assigned the category of a second or third rate citizen in the open source community. By just becoming a consumer of open source and not contributing enough, it risks alienating the coding crowd, which tends to think of the company as a predator, not as an ally.

Most often, such companies never learn to use open source the right way either. By not participating enough, they fail to understand the driving force behind open source and the people working in such projects. This in turn makes them less effective users of such software. For example, a company that brands itself as an open source integrator can never be quite effective if it does not understand the open source projects it is integrating and does not contribute developer resources to such projects; in fact, it is not even necessary to contribute directly most of the time. Indirect participation such as hosting meetings, contributing tools and toolchains, and providing a platform for discussion and creation of new ideas is also a good way of contributing.

A company not doing any of these and still claiming to work in open source is somehow not doing the right thing. Such strategies are doomed to fail in the long term and even prove counterproductive. In the long run a company like this is bound to move away from open source or bound to fail.