I want to build a Java-based web crawler for an experiment. I heard that Java is a good choice if this is your first web crawler. However, I have two impo
Though mainly used for unit testing web applications, HttpUnit can traverse a website, click links, analyze tables and form elements, and give you metadata about each page. I use it for web crawling, not just for unit testing. - http://httpunit.sourceforge.net/
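To make the traversal part concrete, here is a minimal sketch of the crawl loop itself, written with only the JDK so it runs without any library on the classpath. The class name `CrawlSketch`, the `crawl` method, and the `linkFetcher` parameter are all made up for illustration and are not part of HttpUnit. The idea is that you plug in whatever fetches a page and returns its links; with HttpUnit that would probably be a small wrapper around `WebConversation.getResponse(url)` and `WebResponse.getLinks()`, but treat that mapping as an assumption and check it against the HttpUnit docs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical helper class for illustration; not part of HttpUnit.
class CrawlSketch {

    /**
     * Breadth-first crawl starting at seed.
     *
     * linkFetcher is an assumed plug-in point: given a URL, it returns the
     * URLs linked from that page. With HttpUnit it could wrap
     * WebConversation.getResponse(url).getLinks() and map each WebLink
     * through getURLString() (an assumption to verify against the docs).
     */
    static List<String> crawl(String seed,
                              Function<String, List<String>> linkFetcher,
                              int maxPages) {
        List<String> visited = new ArrayList<>();   // pages in visit order
        Set<String> seen = new HashSet<>();         // URLs already queued
        Deque<String> frontier = new ArrayDeque<>(); // URLs waiting to be fetched

        frontier.add(seed);
        seen.add(seed);

        // Stop when there is nothing left to fetch or the page budget is spent.
        while (!frontier.isEmpty() && visited.size() < maxPages) {
            String url = frontier.poll();
            visited.add(url);
            for (String link : linkFetcher.apply(url)) {
                // Set.add returns false for duplicates, so each URL is queued once.
                if (seen.add(link)) {
                    frontier.add(link);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        // In-memory stand-in for a website, so the demo needs no network.
        Map<String, List<String>> fakeSite = Map.of(
                "/", List.of("/a", "/b"),
                "/a", List.of("/b", "/"),
                "/b", List.of("/c"),
                "/c", List.of());
        System.out.println(crawl("/", url -> fakeSite.getOrDefault(url, List.of()), 10));
    }
}
```

The `seen` set is what keeps the crawler from looping forever on pages that link back to each other, and `maxPages` is a simple safety cap for an experiment. A real crawler would also want to respect robots.txt and add a delay between requests.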