Wednesday, November 23, 2005

Workshop in Norway

I am leaving tonight (Nov 24, 2005, 2:15 am) for Grimstad, Norway, to take part in the 2nd international EIAO conference and workshop. This conference is organised by AUC as part of the EIAO project.

I will be making a small presentation on D-HarvestMan. I am returning to India on 1st Dec.

Here is a brief abstract of the proceedings in Grimstad.

2nd international EIAO conference and workshop

24.11.05
1. Presentation of overall timeplan
2. Presentation of preliminary Observatory anatomy
3. WAM coordination (D3.1.1, UWEM, D5.1.1.1 and D5.1.1.2 )
4. Coordination of WP3, WP4 and WP5 release 1.0 time plans
5. Overall release 1.0 planning and actions

25.11.05

1. Welcome - Mikael Snaprud
2. D-Crawler - Anand B. Pillai
3. Presentation of anatomy methodology - Alf Fredvik
4. SW development process (design and test) - Parastoo Mohagheghi, Per Wollebæk

26.11.05 and 27.11.05

1. Presentation of overall timeplan
2. Presentation of preliminary Observatory anatomy
3. Development of a DW anatomy
4. Description of each anatomy
5. Scheduling of each anatomy
6. SW development process (Design and test)
7. Coordination of WP6 and WP5 release 1.0 time plans
8. Overall release 1.0 planning and actions

28.11.05
Follow ups and documentation
1. Inspection and adjustment of plans for each anatomy
2. Strategy for outreach and dissemination towards and after release 1.0
3. Summary and outlook - HarvestMan as a vehicle for research based teaching

Thursday, November 17, 2005

Internet Summit in Tunis

The World Summit on the Information Society (WSIS), dubbed the "Internet Showdown" by CNET, is currently being held under the aegis of the U.N. in Tunis, the capital of Tunisia.

Dr. Mikael Snaprud of the EIAO project presented a paper at the conference associated with this summit (Past, Present and Future of Research in the Information Society) on Nov 15. The paper talks about the role open source plays in ICT education and research, with the EIAO project as the background. The presentation associated with this paper is available at the EIAO Publications website.

The paper is authored by Mikael Snaprud, Agata Sawicka, Anand Pillai (myself), Nina Olsen, Morten G. Olsen, Vidar Laupsa and Terje Gjøsæter.

Monday, November 14, 2005

D-HarvestMan prototype is born

It is 2:30 am right now. I am in a good mood. The reason is that I just finished coding and testing the basic D-HarvestMan prototype for a single master, single slave configuration. And it works! The master was able to successfully bootstrap the slave crawler with a new domain and let it start downloading files from it. Hip hip hooray!

D-HarvestMan is a project to write a distributed crawler on top of the existing HarvestMan. Distributed programming is always exciting, and a distributed crawler is even more so :-)

Friday, November 11, 2005

The fox is one year old

Firefox turned one year old on Nov 9, two days back. The wily fox has set the web on fire ever since it debuted on Nov 9, 2004. With the 1.5 final release on the way, it is well on course to capture further market share from the beleaguered IE.

Three cheers to the Firefox team, and all the best to the fox for its second year!

Monday, November 07, 2005

OOo plugin for Firefox

In the context of my previous post, it is interesting to see that people have already talked about how an OOo plugin for Firefox can be a killer collaboration application on par with MS Sharepoint.

Apparently a Mozilla plugin is already available in OOo 2.0. The problem with it is that you enable the plugin from inside OOo, which requires that OOo is already installed on your machine.

From the Mozilla OOo plugin specification,


"Plug-in usability

The plug-in works only if a working OpenOffice.org installation is found on the system."



This reduces the usability of the plugin, I think. Developing a light-weight OOo plugin for Mozilla/Firefox which can be installed on the fly could be pretty useful.

Saturday, November 05, 2005

Some random thoughts on a web-based office suite

Ever since Sun and Google announced their partnership early in October, speculation has been rife on the possibility of a web-based office suite, aptly titled "GoogleOffice". However, there has not been any strong indication of such an effort underway. In fact a number of industry watchers were disappointed when Google and Sun announced that the initial collaboration would be on bundling the Google toolbar with Java runtime downloads.

Now that Microsoft is aligning itself as a services provider and trying to offer integrated solutions by bundling its diverse product portfolio (Office, MSN, Messenger etc.) along with its new service initiatives (Windows Live, Office Live), it is probably time that Google looked into countering these overtures with appropriate answers. I think there is nothing more fitting here than an office solution which integrates Gmail, Google Talk, Google Desktop and the Firefox web browser.

Perhaps such an effort is already underway in Google Labs. However, here is my vision for such a solution, a kind of blueprint for a future GoogleOffice on the web.

GoogleOffice would be a collaborative suite integrating Gmail, Google Desktop, Google Talk and Firefox. Ideally it should have the following components:

1. An OpenOffice plugin for Firefox.
2. An extension to Google Desktop that allows Gmail attachments to be searched and accessed.
3. An extension to Google Talk that allows members to access attachments in their Gmail account and also share files and folders through Google Desktop.

Let me explain:

1. The Firefox plugin will allow one to view OpenOffice documents inside the Firefox web browser. Initially this needs to support only the OOo native file formats and the OpenDocument format, but MS Office support would be preferable.
2. Right now the contents of attachments cannot be searched in Gmail. Attachments can be searched by name, but they also need to be searchable by content. Google needs to add this capability to Gmail.
3. Google Desktop integrates with Gmail, but again attachments are not searchable or accessible. This capability needs to be added to Google Desktop so that one can search and access documents stored as attachments in one's Gmail account through Google Desktop.
4. The same capability should be added to Google Talk so that one can search for attached documents from Google Talk.
5. Integration between Google Talk and Google Desktop, so that chatters can share documents with each other and search each other's desktops, given sufficient access control privileges.

All these pieces will allow for a basic GoogleOffice over the web. Let us look at some common scenarios:

1. Someone is browsing in an Internet cafe. He wants to view an OOo document sent to his Gmail id as an email attachment. Typically Internet cafes do not have OOo, so he is at a loss (this happens to me quite often). This is where an OOo plugin for Firefox can help. Gmail need not know anything about this plugin; it can be a regular Firefox plugin. In this case, Firefox will detect that the plugin is not installed, then download and install it automatically. Voila, you can view your OOo document inside Firefox.

I don't think it will be too difficult to develop such a plugin, considering that the source code for OOo is open and the OOo file formats are well documented. This will also help in large scale acceptance of OOo file formats in a way similar to PDF. Editing capabilities would also be nice, but this could be tough to implement in a browser plugin.

I would like to see plugins for all OOo file formats but specifically OOo writer (.sxw), OOo impress (.sxi) and the OpenDocument formats.

2. Adding capability to search and find attachments in Gmail will allow people to use Gmail as a sort of virtual storage for their documents (read office documents). I tend to do this even now, but because the attachments are not searchable, the experience is crippled. Initially these can be added for well known and open formats such as PDF and the OOo file formats.

3. Once Gmail is enhanced with advanced capabilities to search attachments, this capability should be integrated with Google Desktop, which can then index the attachments and make them searchable from the desktop. Considering the work involved in indexing large attachments, it actually makes sense to add this capability only to Google Desktop rather than to Gmail directly.

4. This opens up the possibility of integrating Firefox, Gmail and Google Desktop for searching, accessing and modifying office documents. One can use Google Desktop to search office documents stored in one's Gmail account, then open and edit them on the desktop inside Firefox using the OOo plugin. If OOo is installed on the machine already, it can be used instead.

5. The last missing piece is Google Talk. Google Talk should integrate with Google Desktop, allowing querying of documents (Gmail, desktop, photos etc.) through Google Talk. Google should add capabilities to Google Talk which will make this possible not only on one's own desktop but across desktops.

That is, you can give privileges in Google Talk to selected Google Talk users (your friends, co-workers, family) to search and access documents from inside your desktop and also your Gmail id. These can be controlled by access at various levels: read-only, read-write etc.

If one finds documents of interest they can be shared and edited online. Google should provide something like a shared space for two interested parties to share documents as a part of Google talk. This space can either be part of Gmail or separate from it. However, it should allow for saving documents by multiple Google talk users, acting as a kind of collaborative space.


Thus Google Talk and Google Desktop, along with Gmail and Firefox, can be used to build a virtual collaborative office suite on the web which, if done well, can probably pose some competition to Microsoft's office solutions and its recent office collaboration initiatives. It would also take the OOo efforts to the web and provide OOo as part of a services offering instead of the current stand-alone product.

Perhaps this vision is a bit grand, but I don't think it is a difficult one for Google and the open source (read OpenOffice) community. All the pieces are already there; they just require some additional capabilities and some plumbing to work as a unified, web-based solution.

I have not thought too deeply about the technical aspects of such a solution, but a very interesting question is the role Java and the Google toolbar can play in this integrated approach. Perhaps Java can be used to develop the Firefox OOo plugin as well.

I hope we can expect to see a web-based office solution from Google, Sun and the OOo community within the next 12 months.

Wednesday, November 02, 2005

FOSS.in

FOSS.in has published the second list of speakers of the event. Some notables who are speaking include the legendary Alan Cox, Danese Cooper, David Fetter, Jeremy Zawodny, Jonathan Corbet, Andrew Cowie, Harald Welte and Brian Behlendorf.

Murugan Pal is giving a talk on "Open source Alternatives".

Also the talk by Zaheda Bhorat of Google on Google and open source seems interesting.

FOSS.in is scheduled from 29th Nov to 2nd Dec 2005 at Bangalore.

On Open Voting

An article published in a local daily of Granite Bay, CA, talks about accountability and the Open Voting Consortium.

Read the article.

What is my interest in this? Well, I happen to be part of the team that originally developed the OVC prototype, which was demonstrated on April 1, 2004. The OVC project was my first experience of working on an international open source project. The architects of the system decided to use Python for the project, which was how I got interested in it.

I am also a founding member of the OVC.

Saturday, October 22, 2005

International Conference on Digital Inclusion and Open Source

The annual conference on "Digital Inclusion and Open Source" took place in Oslo, Norway from Oct 20-21, 2005.

A paper co-authored by me, Parastoo Mohagheghi, Mikael Snaprud & Nils Ulltveit-Moe was presented at the conference. The paper highlights the activities of the EIAO project in bringing together users and external contributors from different parts of the world in an EU sponsored project.

Read the abstract of the paper.

Tuesday, October 04, 2005

HarvestMan web-site redesign

I have finally re-designed the HarvestMan web-site! It is something I have been planning since the beginning of this year! When I finally did it, it took me only two days. Surprising how much time one can save by really focusing on the task at hand.

Wednesday, September 28, 2005

Microsoft and JBoss shake hands

Probably bad news for LAMP and other open source stacks. Complete news is here.

Tuesday, September 13, 2005

Pyrex

I am currently learning Pyrex, a language designed to take the pain out of writing C-extension modules for Python.

I plan to rewrite performance intensive portions of HarvestMan using Pyrex. These enhancements will be available as part of the next major HarvestMan release, version 1.5.

Saturday, September 10, 2005

Cathedral tries to recruit Bazaar!

No kidding :-). Microsoft apparently tried to recruit Eric Raymond. If you don't know who Eric Raymond is, spend your afternoon reading up the excellent Cathedral and Bazaar essays, some of the best essays written on the open source model. He also happens to be one of the co-founders of OSI. Talk about Bush trying to recruit Osama Bin Laden for homeland security!

Read more about Microsoft's Mea Culpa in this article posted on ESR's blog.

Friday, September 09, 2005

Standalone Executables on Windows using py2exe

The latest release of py2exe, namely py2exe 0.6.1, allows you to create single-file executables on Windows. This is an improvement over the earlier versions, which used to create a host of files around the main executable. I think the new feature is a welcome one, especially for a project like HarvestMan which has a number of dependencies. If you try to create an executable for HarvestMan with existing versions of py2exe, you get quite a lot of .pyd files which are, well, a bit confusing.

I am looking forward to creating standalone executables for HarvestMan using the latest py2exe and providing downloads for them. I think this should boost the popularity of HarvestMan, since many Windows users I know could not be bothered with going through all the steps to install a pure Python package such as HarvestMan. Downloading and installing a single-file executable is much easier.

Expect win32 downloads of HarvestMan soon. Three cheers to py2exe and Thomas Heller.

Wednesday, August 31, 2005

Uraga is dead - Long live Uraga!

Uraga is dead. Swaroop has confirmed this in a post to BangPypers. I had long suspected this might be the case, since there was no mention of Uraga in Swaroop's blog for quite some time. The reason he cites is job pressure; as if people who contribute to open source do not do justice to their day jobs! It sounds ironic, to say the least.

I have a principle which I apply in any open source or professional work I undertake; that is, if I propose an idea, I will try at least to do a basic prototype implementation of the same. More so, if I am talking and letting the world know about it. It is a basic contract that one should have to the community with which one interacts, especially when one tries to market the community with the tag of his idea. In this case, the community is BangPypers and the idea is Uraga of course.

If you don't fulfill this basic social contract, then you are not fit to be an open source contributor, let alone an open source project initiator. I hope some of the new-age geeks who look to open source for quick stardom realize this.

Wednesday, August 24, 2005

Design Patterns - Modelling the Singleton in Python

Singleton is a design pattern that seems to interest everyone, especially in the Python world.

I was doing a Google search on the ways in which Python implements the Singleton design pattern.

The results showed that in doing this in Python, you are limited only by your imagination. Unlike C++ or Java, you are not limited to a certain strategy of modeling the Singleton in Python.

I thought it was a good idea to gather the different Singleton solutions in Python and put them in a single post (pun intended) here. In this post, I list seven ways of modelling the Singleton in Python which look elegant to me. I am not including some overly verbose or cryptic solutions which you will find if you perform the Google search above.

Though I have no preference for any particular solution, I have ordered them in the order of what I think is the least elegant solution, to the most elegant one. Of course this is purely personal! :-)

A word of caution: Except for the Borg solution, the rest of them will work only with new style classes. Also note that some solutions are exactly the same, though they look different. I have explained them as we go along.

The most basic solution overrides the __new__ method in an outer class, returning an instance of an inner class as the Singleton. Here it is:

class Singleton1(object):
    """ Singleton by overriding __new__ and using an inner class,
    using new style classes """

    class __Singleton(object):
        pass

    __instance = None

    def __new__(cls):
        if not Singleton1.__instance:
            Singleton1.__instance = Singleton1.__Singleton()
        return Singleton1.__instance

Here is this solution in action:

s1 = Singleton1()
print s1
s2 = Singleton1()
print s2
s3 = Singleton1()
print s3

<__main__.__Singleton object at 0x009F1390>
<__main__.__Singleton object at 0x009F1390>
<__main__.__Singleton object at 0x009F1390>

Clearly the drawback with this solution is that it is not a Singleton in its true sense. The Singleton class does not return an instance of itself, but an instance of an inner class. In other words, what appears to be the Singleton class is actually a class wrapper around an inner class, which is the actual Singleton. Not very elegant.
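The drawback described above can be checked directly. The following is a small sketch, written to run on Python 3 (an assumption; the listings in this post use Python 2 syntax), that repeats the Singleton1 class and inspects what it hands back. The variable names at the bottom are only for illustration.

```python
class Singleton1(object):
    """Outer class whose __new__ hands back an inner-class instance."""

    class __Singleton(object):
        pass

    __instance = None

    def __new__(cls):
        if not Singleton1.__instance:
            Singleton1.__instance = Singleton1.__Singleton()
        return Singleton1.__instance


s1 = Singleton1()
s2 = Singleton1()

# Both calls return the very same object ...
same_object = s1 is s2
# ... but that object is not an instance of Singleton1 itself.
is_outer_instance = isinstance(s1, Singleton1)

print(same_object, is_outer_instance)
```

So identity is preserved, but any isinstance check against the outer class fails, which is what makes this variant only a wrapper around the real Singleton.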

The next solution fixes this problem. It works directly with the guts of the class by accessing the class's dictionary.

class Singleton2(object):
    """ Singleton by using new style classes """

    def __new__(cls):
        if '_the_instance' not in cls.__dict__:
            cls._the_instance = object.__new__(cls)
        return cls._the_instance

All right. Is there something magical about the _the_instance attribute? Nothing. So why can't we replace it with a direct class level attribute? Yes, you can, though technically there is not much difference between the two as long as the class is not subclassed. However, it looks like a kind of combination of the first solution with the second one (which it is not), so here it is for illustration purposes.

class Singleton3(object):
    """ Singleton by using direct class attribute access
    without using cls.__dict__. This might look different
    from Singleton2, but for a class without subclasses
    it behaves the same. """

    __instance = None

    def __new__(cls):
        if not Singleton3.__instance:
            Singleton3.__instance = object.__new__(cls)
        return Singleton3.__instance
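One place where Singleton2 and Singleton3 can be told apart is subclassing. The sketch below, written for Python 3 (an assumption; the post's own listings are Python 2), repeats both classes and adds two hypothetical subclasses, Child2 and Child3, which are not part of the original solutions.

```python
class Singleton2(object):
    def __new__(cls):
        # Looks only at the dictionary of the class being instantiated.
        if '_the_instance' not in cls.__dict__:
            cls._the_instance = object.__new__(cls)
        return cls._the_instance


class Singleton3(object):
    __instance = None

    def __new__(cls):
        # Always looks at the attribute stored on Singleton3 itself.
        if not Singleton3.__instance:
            Singleton3.__instance = object.__new__(cls)
        return Singleton3.__instance


class Child2(Singleton2):   # hypothetical subclass for illustration
    pass


class Child3(Singleton3):   # hypothetical subclass for illustration
    pass


parent2, child2 = Singleton2(), Child2()
parent3, child3 = Singleton3(), Child3()

print(type(child2).__name__)   # Child2 gets its own instance
print(child3 is parent3)       # Child3() hands back the parent's instance
```

With the cls.__dict__ check every subclass gets its own single instance, whereas the hard-coded class attribute shares one object across the whole hierarchy. Which of the two is wanted depends on the application.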
All right. Enough of fooling around with classes directly. Can't we do this by using metaclass magic? Looks like you can. And with most metaclass solutions, it seems to be somehow more elegant than directly putting the logic inside the class!

Here is the solution from Bruce Eckel's Thinking in Python.

class SingletonMetaClass(type):
    def __init__(cls, name, bases, dict):
        super(SingletonMetaClass, cls).__init__(name, bases, dict)
        original_new = cls.__new__

        def my_new(cls, *args, **kwds):
            if cls.instance is None:
                cls.instance = original_new(cls, *args, **kwds)
            return cls.instance

        cls.instance = None
        cls.__new__ = staticmethod(my_new)

class Singleton4(object):
    __metaclass__ = SingletonMetaClass

The idea is to override the __new__ method of the class right in its metaclass's __init__ method. This happens once, when the class itself is created: the metaclass's __init__ runs at class definition time, not when objects are instantiated. The overridden __new__ works much like the ones in Singleton1, Singleton2 and Singleton3. Thereafter, every time you create an object of Singleton4, the replaced __new__ is called and returns the stored instance.

The elegance of this solution comes from the fact that the extra code for the class is just one line, where we assign the __metaclass__ attribute. All the magic resides in the metaclass, which allows one to quickly convert a class to a Singleton by adding just this one line.

NOTE: In fact the metaclass solutions for Singletons (or any other patterns for that matter) are not class scoped solutions, but type scoped ones. It might take some time to wrap your head around that, if you come from a C++ background.
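To see when the metaclass code for Singleton4 actually runs, here is a minimal sketch of the same idea for Python 3. Two things are assumptions that differ from the listing in this post: Python 3 spells the hook as a metaclass keyword argument instead of a __metaclass__ attribute, and object.__new__ is called without extra arguments because Python 3 rejects them. The init_calls list is a hypothetical helper added only to record the calls.

```python
init_calls = []   # hypothetical bookkeeping, not part of the pattern


class SingletonMetaClass(type):
    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        init_calls.append(name)          # runs once per class definition
        original_new = cls.__new__

        def my_new(cls, *args, **kwds):
            if cls.instance is None:
                # Python 3: object.__new__ accepts no extra arguments here.
                cls.instance = original_new(cls)
            return cls.instance

        cls.instance = None
        cls.__new__ = staticmethod(my_new)


class Singleton4(metaclass=SingletonMetaClass):
    pass


calls_after_definition = list(init_calls)
a = Singleton4()
b = Singleton4()
calls_after_instantiation = list(init_calls)

print(calls_after_definition, calls_after_instantiation, a is b)
```

The recorded list does not grow when instances are created, which shows that only the replaced __new__ is involved at instantiation time.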

But does this look a bit cryptic? Well, I should say yes since it took some time for me to figure out what is happening here. Apparently, you don't need to take all that trouble to get it right. Here is the above solution re-written, but without the inner function and all that.

class SingletonMetaClass(type):
    """ Singleton using metaclasses by overriding
    the __init__ method, 2nd version.
    """

    def my_new(cls, *args, **kwargs):
        if not cls.instance:
            cls.instance = object.__new__(cls)
        return cls.instance

    def __init__(cls, name, bases, dct):
        super(SingletonMetaClass, cls).__init__(name, bases, dct)
        cls.instance = None
        cls.__new__ = cls.my_new

class Singleton5(object):
    __metaclass__ = SingletonMetaClass

The above solution is a re-write of Singleton4, but less cryptic and more readable.

Is that all you can do with metaclasses and the Singleton in Python? The answer is no. Looks like there is a much more elegant way of doing this using metaclasses. It is done by overriding the __call__ method in the metaclass, instead of the __init__ method. This solution is the ASPN Python Cookbook Recipe #412551 by Daniel Brodie. I am just copying it here. Note that it keeps one instance per distinct tuple of constructor arguments.

class SingletonMetaClass(type):
    """ Singleton using metaclasses by overriding the __call__ method.
    Original code courtesy of ASPN Python Cookbook recipe number
    412551 """

    def __init__(self, *args):
        type.__init__(self, *args)
        self._instances = {}

    def __call__(self, *args):
        if args not in self._instances:
            self._instances[args] = type.__call__(self, *args)
        return self._instances[args]

class Singleton6(object):
    __metaclass__ = SingletonMetaClass
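Because the __call__ version stores instances in a dictionary keyed by the constructor arguments, it is worth seeing what that means in practice. This is a sketch for Python 3 (an assumption, hence the metaclass keyword instead of __metaclass__); the Config class and its name parameter are hypothetical and exist only to pass different arguments.

```python
class SingletonMetaClass(type):
    def __init__(self, *args):
        type.__init__(self, *args)
        self._instances = {}

    def __call__(self, *args):
        # One cached instance per distinct tuple of positional arguments.
        if args not in self._instances:
            self._instances[args] = type.__call__(self, *args)
        return self._instances[args]


class Config(metaclass=SingletonMetaClass):   # hypothetical example class
    def __init__(self, name="default"):
        self.name = name


a = Config()
b = Config()
c = Config("other")

print(a is b)        # same arguments, same object
print(c is a)        # different arguments, different object
```

For a class that is always created without arguments this behaves as a plain Singleton. With arguments it acts more like a per-argument cache, and the arguments must be hashable.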
Before I conclude, I should include one of the most ingenious methods of doing the Singleton in Python, created by Alex Martelli. This is the so-called Borg non-pattern. This and the concept of non-patterns is discussed in detail here.

The Borg is unique in that it re-defines the problem Singleton is trying to solve. Instead of trying to ensure that the unique instance maps to a unique memory location, Borg ensures that the state of the various instances is shared, and hence the various instances are, in effect, the same. In other words, Borg focuses on object equivalence instead of object identity, which is what Singleton offers.

Here is the Borg non-pattern, applied to the Singleton problem.

class Singleton7:
    """ Alex Martelli's Borg non-pattern. Not exactly
    a singleton. Focus on equivalence of state rather than the
    uniqueness of the Singleton instance """

    __shared_state = {}

    def __init__(self):
        self.__dict__ = self.__shared_state

Here is the Borg non-pattern in action.

s1 = Singleton7()
# Set s1's state
s1.x = 100
print s1.__dict__
s2 = Singleton7()
print s2.__dict__
s3 = Singleton7()
print s3.__dict__

{'x': 100}
{'x': 100}
{'x': 100}

Well, in my opinion, that is the most elegant one. Instead of solving the Singleton problem directly, it solves the problem that the Singleton is trying to solve, which is that of ensuring unique state across instances.
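The difference between identity and equivalence in the Borg can be made explicit with a short check. This sketch is written for Python 3 (an assumption; under Python 2 the class in the post is an old style class, but the idea is the same).

```python
class Singleton7:
    """Borg: every instance shares one attribute dictionary."""

    __shared_state = {}

    def __init__(self):
        self.__dict__ = self.__shared_state


s1 = Singleton7()
s2 = Singleton7()
s1.x = 100

print(s1 is s2)                     # two distinct objects
print(s2.x)                         # state set through s1 is visible on s2
print(s1.__dict__ is s2.__dict__)   # because the dictionary is shared
```

The instances are different objects, so identity checks fail, yet every attribute set on one is visible on all the others.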

That is the end of my first post on Python and Design patterns. I hope to add more in the future, on the rest of the Gang of four Design Patterns.

Tuesday, August 02, 2005

Freezope is down

Freezope has been down since yesterday. I was planning to make the 1.4.5 beta 1 release of HarvestMan today, but it looks like I cannot provide any updates on the HarvestMan site till freezope comes back up.

I might still make the files public at BerliOS and make the freshmeat and PyPI announcement today or tomorrow. There won't be any update at the site of course, but that can wait till freezope is back on track.

Wednesday, July 27, 2005

Apple? - Not on BillG's map

The latest offering from Redmond, MSN Virtual Earth beta, is creating a buzz for all the wrong reasons.

The Register reports that Apple HQ is nowhere to be found on MSN Virtual Earth. Apparently, MSN Virtual Earth has chosen to rebuild the World Trade Center towers too. Looks like the images Microsoft used are way too old.

Clearly, MS has pushed out, as a beta, hastily built software that has not undergone enough testing to make it even to the alpha stage. The stunt is an apparent knee-jerk reaction to the buzz Google is making with its map service and Google Earth. This is supposed to compete with Google Earth, but it clearly lacks the features and stability of a credible competitor. Meanwhile, Google has raised its sights and started to map the moon!

My gripes about MSN Virtual Earth.

1. The images are way too old. The image quality and resolution are far behind Google Earth's. Since MS seems to have used U.S.G.S. archives which are 10 years old, this is to be expected. But hey, we are talking about one of the richest companies in the world. Surely they could have done better.

2. The map is not as responsive to panning as Google maps. The tiling does not work very well, often leaving large patches of yellow rectangles without resolving the underlying geography. Also, the maps load pretty slowly and often you are left staring at blank patches of screen for seconds before the tiled images start loading. You almost end up thinking that the server has stopped responding. Yeah, I am on 256 kbps broadband, not dial-up :-) .

3. Screws up Firefox. After some panning and zooming, some buttons on Firefox (I am using 1.0.5) do not work well. And hey, why does it disable my "Back" button?

4. "Locate me" feature. I don't understand the actual need for such a feature except for generating buzz. The feature requires ActiveX, which I think is a step backwards for Microsoft in making this technology work with other browsers such as Firefox. The IP address based location does not work very well. It located me in Bombay, India whereas I am actually in Bangalore.

5. Sticky mouse buttons: The left mouse click tends to "stick" on the page. So even after I release the mouse button and move the mouse, the image keeps panning. I have to click the left mouse button once more to "release" the "stickiness".

6. The stupid compass: Why do you need that? It just eats up space on the map and is really not very useful.

7. Interface quality: The interface seems hastily thrown together and not well thought out. It is counter-intuitive if you are used to Google maps, since the search and address box is on the left rather than on the right as in Google. To me, the right position seems more intuitive and user friendly, since that means the map does not overlap the right edge of my screen and looks solid. Also, MS seems not to have done much research on how surrounding colors affect the user's ability to view the maps effectively. The choice of colors and color contrast leaves a lot to be desired. If I were the engineer, I would replace that stupid blue overlay on the top with something that increases the contrast and is less straining on the eyes.

The pluses.

1. It occupies more screen area than Google maps. Clearly this is an advantage.
2. Seems to show more places in India than Google. Hell, it even shows some obscure towns and villages in Kerala, my native state. Google maps suck in that respect.


Conclusion: Not enough reasons for anyone to switch to Virtual Earth if he/she is already using Google earth/google maps. I think MS should apologize, rename the release to MSN VE alpha, work on it a bit more and release a much better, user-friendly and fast beta. Then I would agree that it is competition.

Saturday, July 23, 2005

Apple - An "i" for Innovation

iMac, iPod, iTunes - Popular brand names from a company known for its innovative products - Apple.

Now it has become official. In a Businessweek poll conducted among top executives from across the planet, Apple won the top slot for the most innovative company, with nearly 25% of the votes.

3M came second with nearly 12% of the votes, followed by Microsoft. Big Blue came a distant 7th after GE, Sony and Dell in that order.

Another feather in the innovative cap of CEO Steve Jobs.

Way to go Apple! Well done.

Hasta-la-Vista Windows!

Microsoft has re-christened its unborn poster child, "Windows Longhorn", as "Windows Vista".

The marketing speak from Microsoft is that Windows Vista brings clarity to the connected world.

Personally, I think "Longhorn" was much better than "Vista" though it gave rise to subtle bovine references.

It will be no wonder if the naming turns out to be prescient and puts an end to Microsoft's monopoly on the desktop, considering the rate at which features are getting axed from Longhorn aka Vista. (Also read Steven J Vaughan Nichols' article on the axing of Monad from Longhorn.)

Is it time yet to say "Hasta-la-Vista Microsoft"? Let us wait and watch.