I wrote earlier about having written a Bloglines OPML (Outline Processor Markup Language) extractor in Python. It was a fun little project, but someone took the topic more seriously and wrote a generic library to access the Bloglines API.
Go check out PyBloglines over at www.josephson.org.
The part of the code that fascinates me most is their use of expat, the fast XML parser for Python. I've never used it, but the syntax is so easy it only took me a few seconds to see what they did and realize how superior it was to what I did. Check out this code from PyBloglines:
class OpmlParser:

    def __init__(self):
        self.parser = xml.parsers.expat.ParserCreate()
        self.parser.StartElementHandler = self.start_element
        self.parser.EndElementHandler = self.end_element

    def parse(self, opml):
        self.feedlist = []
        self.parser.Parse(opml)
        return self.feedlist

    def start_element(self, name, attrs):
        if name == "outline":
            if attrs.has_key('title') and attrs.has_key('xmlUrl'):
                sub = Subscription()
                sub.title = attrs["title"]
                sub.htmlUrl = attrs["htmlUrl"]
                sub.type = attrs["type"]
                sub.xmlUrl = attrs["xmlUrl"]
                sub.bloglinesSubId = int(attrs["BloglinesSubId"])
                sub.bloglinesIgnore = int(attrs["BloglinesIgnore"])
                sub.bloglinesUnread = int(attrs["BloglinesUnread"])
                self.feedlist.append(sub)

    def end_element(self, name):
        pass
When parsing is done, parse() hands back the list of feeds it collected, one Subscription per outline element that has both a title and an xmlUrl.
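To try the same handler idea on its own, here is a minimal sketch of my own, not PyBloglines code. It assumes Python 3 (where `has_key` no longer exists, so it uses `in`), stores each feed in a plain dict instead of a `Subscription` object, and runs against a made-up OPML snippet. The names `parse_opml` and `SAMPLE_OPML` are hypothetical.

```python
import xml.parsers.expat

# Made-up OPML for illustration only; real Bloglines exports may differ.
SAMPLE_OPML = """<opml version="1.0">
  <body>
    <outline title="Folder">
      <outline title="Example Feed" type="rss"
               htmlUrl="http://example.com/"
               xmlUrl="http://example.com/rss.xml"
               BloglinesSubId="42" BloglinesUnread="3"/>
    </outline>
  </body>
</opml>"""


def parse_opml(opml):
    """Return one dict per <outline> element that has a title and an xmlUrl."""
    feeds = []

    def start_element(name, attrs):
        # expat calls this for every opening tag; attrs is a plain dict.
        if name == "outline" and "title" in attrs and "xmlUrl" in attrs:
            feeds.append({
                "title": attrs["title"],
                "xmlUrl": attrs["xmlUrl"],
                # Optional attributes: fall back to a default when absent.
                "htmlUrl": attrs.get("htmlUrl"),
                "unread": int(attrs.get("BloglinesUnread", 0)),
            })

    # An expat parser handles a single document, so build a fresh one per call.
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.Parse(opml, True)  # True marks this as the final chunk of input
    return feeds


feeds = parse_opml(SAMPLE_OPML)
for feed in feeds:
    print(feed["title"], feed["xmlUrl"], feed["unread"])
```

The folder-level outline in the sample has a title but no xmlUrl, so the handler skips it and only the actual feed ends up in the list.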
Posted by Nick Codignotto at November 7, 2004 08:02 AM