I have been looking at XML and HTML libraries on RubyForge for a simple way to pull data out of a web page. For example, if I want to parse a user page on Stack Overflow how
I always really like what Ilya Grigorik writes, and he wrote up a nice post about using Hpricot.
I also read this post a while back and it looks like it would be useful for you.
I haven't done either myself, so YMMV, but these seem pretty useful.
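
For what it's worth, here is a rough sketch of the general idea of pulling a couple of fields out of a page. It does not use Hpricot. It uses REXML, which ships with the standard Ruby distribution, on a small inline snippet. The markup, the class names (`name`, `reputation`) and the helper `text_for_class` are all made up for illustration and are not Stack Overflow's real page structure. REXML also expects well-formed XML, so a real page would probably need an HTML-tolerant parser such as the one from the linked post.

```ruby
require 'rexml/document'

# Hypothetical, well-formed markup standing in for a downloaded user page.
SAMPLE_PAGE = <<~HTML
  <html>
    <body>
      <div class="user">
        <h1 class="name">Example User</h1>
        <span class="reputation">1234</span>
      </div>
    </body>
  </html>
HTML

# Hypothetical helper: returns the stripped text of the first element whose
# class attribute equals css_class, or nil when nothing matches.
def text_for_class(doc, css_class)
  node = REXML::XPath.first(doc, "//*[@class='#{css_class}']")
  node && node.text.to_s.strip
end

doc = REXML::Document.new(SAMPLE_PAGE)
name = text_for_class(doc, 'name')
reputation = text_for_class(doc, 'reputation').to_i

puts "#{name}: #{reputation}"
```

For a live page you would fetch the HTML first (for example with `open-uri`) and swap the XPath expressions for whatever matches the real markup.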