Tuesday, July 04, 2006

Some ordered thoughts on Gnosis enhancements and XPath

I had dropped out of the blogging habit after February this year, mainly due to work pressures at the office. I think it is time for me to dust off my blogging brush and start painting my blog a little bit, at least to touch up some of the fading sheen here and there :-)

This is about a nice little library called Gnosis Utils, which provides powerful XML parsing in very little code. The library is written by Dr. David Mertz, who is a well-known authority on Python programming matters and has written some charming articles in his "Charming Python" series for IBM developerWorks.

Some time in March I was looking around for an XML API that provides powerful XPath capabilities out of the box, for a project I am doing at Spikesource. PyXML is woefully lacking in this department. ElementTree, though a very good API for generic XML processing, falls short in its XPath support (no attribute predicates, for example).

Gnosis provides decent XPath support: it handles attributes and text searches, but it does not support matching on attribute values, the [@attr] predicate syntax, and so on.

During my work, I enhanced Gnosis XML to support some of the parts of the XPath 1.0 specification that it does not support. These include:

o Support for //elem[@attr] syntax
o Search attributes by value, i.e. support the //elem[@attr=value] syntax
o Support //elem[last()]
o Support XPath searches starting at the root node
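
To make the four items above concrete, here is a minimal sketch of what each expression is expected to select. Note the assumptions: it does not use Gnosis or the enhanced module described in this post. It uses xml.etree.ElementTree from the standard library of a current Python release as a stand-in, since that module's limited XPath subset accepts these particular predicates. The sample document and its element and attribute names (library, book, id, lang) are made up purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
DOC = """
<library>
  <book id="b1" lang="en"><title>First</title></book>
  <book id="b2"><title>Second</title></book>
  <book id="b3" lang="fr"><title>Third</title></book>
</library>
"""

root = ET.fromstring(DOC)

# //book[@lang] : every book that carries a lang attribute, whatever its value.
# ElementTree paths are relative to the element, hence the ".//" prefix.
with_lang = [b.get("id") for b in root.findall(".//book[@lang]")]

# //book[@lang='fr'] : only books whose lang attribute equals 'fr'.
french = [b.get("id") for b in root.findall(".//book[@lang='fr']")]

# //book[last()] : the last book among the book children of each parent.
last_book = [b.get("id") for b in root.findall(".//book[last()]")]

# A search issued directly on the root node, walking down child by child.
titles = [t.text for t in root.findall("./book/title")]

print(with_lang)
print(french)
print(last_book)
print(titles)
```

The attribute-value predicate needs the value quoted, as in [@lang='fr']. The expressions are the same ones listed above, only written with a leading "." because ElementTree evaluates paths relative to the element on which findall() is called.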

The good thing is that I have been able to extract the Gnosis XML processing parts into a single module and add these enhancements on top of it. I plan to extend this to cover the full XPath 1.0 specification within a month or so and release it into the public domain.

I hope this will fill the existing void of lightweight, pure-Python APIs with full XPath support. The whole thing will fit inside a single module, which should make it easy to use and extend.

Looking forward to completing this work!