nutch + mysql integration
问题 When nutch finishes its cycle (that is crawl - fetch- parse - index) during index phase, I do not want nutch to index (lucene index), but I want nutch to place all the crawled data (I believe he keeps them as NutchDocument object) into mysql using my code. Is there any way to do this? Thanks 回答1: Create your own java class that manage the Nutch cycle. It should be similar to org.apache.nutch.crawl.Crawl but you will have to replace the call to the indexer by a call to your Mysql connector. Or