2006-02-17

Mapping XML to Python objects

I need to process XML in my Python code and would prefer to work with the data in a Pythonic way. Of the many options available, I chose to try three:

* generateDS
* gnosis.xml.objectify
* amara

Because I am a fan of XMLBeans, generateDS seemed like the natural first choice. However, when I ran its code-generating script on my schema file, it hit a recursion problem and failed with:

    ....
        parent.collectElementNames(elementNames)
      File "C:\Python\Lib\site-packages\generateDS.py", line 478, in collectElementNames
        parent.collectElementNames(elementNames)
      File "C:\Python\Lib\site-packages\generateDS.py", line 475, in collectElementNames
        base = self.getBase()
    RuntimeError: maximum recursion depth exceeded

The same schema file works fine with XMLBeans, however.

gnosis.xml.objectify and amara are similar in that they don't require a schema file and will turn any well-formed XML file into Python objects. I had no problem parsing my XML files with either module. However, the gnosis.xml.objectify module does not seem to be aware of namespace prefixes. Given the following XML element:

<ns:book xmlns:ns="http://mycompany/bookproject">..

It generates an object called ns_book. It does not treat ns as a namespace prefix; it just blindly translates the : to _. I searched for a solution but could not find anything.
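To show why this matters, here is a minimal sketch that contrasts the two naming strategies. It uses the standard library's xml.etree.ElementTree, not gnosis.xml.objectify or amara, and the function names naive_name and resolved_name are my own, invented for illustration. The second document, with the bk prefix, is also a made-up example.

```python
import xml.etree.ElementTree as ET

# Two documents that bind different prefixes to the same namespace URI.
# DOC_B is a hypothetical variant added only for this comparison.
DOC_A = '<ns:book xmlns:ns="http://mycompany/bookproject"/>'
DOC_B = '<bk:book xmlns:bk="http://mycompany/bookproject"/>'


def naive_name(qualified_name):
    """Prefix-unaware naming: just replace the colon with an underscore."""
    return qualified_name.replace(":", "_")


def resolved_name(xml_text):
    """Namespace-aware naming: return (namespace URI, local name) of the root."""
    tag = ET.fromstring(xml_text).tag  # ElementTree reports '{uri}local'
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return None, tag


# The naive names differ even though both elements mean the same thing.
print(naive_name("ns:book"), naive_name("bk:book"))

# The resolved names are identical, because the prefix itself is irrelevant.
print(resolved_name(DOC_A) == resolved_name(DOC_B))
```

With the naive approach, code written against ns_book breaks as soon as a document author picks a different prefix, even though the XML is equivalent.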

The amara module, however, correctly translates the object name to book and also keeps the namespace information. It looks like this is the module I will use for now.
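For readers who want a feel for what such a namespace-aware binding does, the sketch below builds a tiny one on top of the standard library's xml.etree.ElementTree. It is not amara's implementation or API. The Node class, its attribute names, and the title and author child elements in the sample are all assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET


def split_tag(tag):
    """Split ElementTree's '{uri}local' form into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return None, tag


class Node:
    """Hypothetical bound object: children are reachable as attributes."""

    def __init__(self, element):
        self._children = {}
        self.namespace_uri, self.local_name = split_tag(element.tag)
        self.text = (element.text or "").strip()
        for child in element:
            _, child_local = split_tag(child.tag)
            # Repeated child elements are collected into one list per name.
            self._children.setdefault(child_local, []).append(Node(child))

    def __getattr__(self, name):
        # Only called when normal lookup fails, so real attributes win.
        children = self.__dict__.get("_children", {})
        if name in children:
            return children[name]
        raise AttributeError(name)


# Sample content is invented; the original element body was elided.
SAMPLE = """
<ns:book xmlns:ns="http://mycompany/bookproject">
  <ns:title>Example title</ns:title>
  <ns:author>First Author</ns:author>
  <ns:author>Second Author</ns:author>
</ns:book>
"""

book = Node(ET.fromstring(SAMPLE))
print(book.local_name, book.namespace_uri)
print(book.title[0].text, [a.text for a in book.author])
```

The object is named by its local name, book, while the namespace URI stays available on the side, which is the behavior I was looking for.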