I want to know how to parse a robots.txt file in Java.
Is there existing code or a library for this?
There's also the jrobotx library, hosted on SourceForge.
(Full disclosure: I spun off the code that forms that library.)
anastluc
There is also a new release of crawler-commons:
https://github.com/crawler-commons/crawler-commons
The library aims to implement functionality common to any web crawler, and this includes a very handy robots.txt parser.
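If you just want to see what such a parser does under the hood, here is a minimal sketch using only the JDK. To be clear, this is not the API of jrobotx or crawler-commons: the class `RobotsTxtSketch` and its methods `parse` and `isAllowed` are hypothetical names made up for illustration. It assumes plain path-prefix matching (no `*` or `$` wildcards), picks the group that names your agent or falls back to the `*` group, and lets the longest matching rule win.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration; not the jrobotx or crawler-commons API.
class RobotsTxtSketch {

    // One Allow/Disallow line: a path prefix plus whether it permits access.
    private static final class Rule {
        final String prefix;
        final boolean allow;
        Rule(String prefix, boolean allow) { this.prefix = prefix; this.allow = allow; }
    }

    private final List<Rule> rules;

    private RobotsTxtSketch(List<Rule> rules) { this.rules = rules; }

    // Collects the rules that apply to `agent`; falls back to the "*" group
    // when no group names the agent explicitly.
    static RobotsTxtSketch parse(String content, String agent) {
        String agentLc = agent.toLowerCase(Locale.ROOT);
        List<Rule> specific = new ArrayList<>();
        List<Rule> wildcard = new ArrayList<>();
        boolean sawSpecific = false;
        boolean inSpecific = false, inWildcard = false, lastWasUserAgent = false;

        for (String raw : content.split("\\r?\\n")) {
            int hash = raw.indexOf('#');                       // strip comments
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            int colon = line.indexOf(':');
            if (colon < 0) continue;                           // blank or malformed line
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();

            if (field.equals("user-agent")) {
                // A User-agent line that follows rule lines starts a new group.
                if (!lastWasUserAgent) { inSpecific = false; inWildcard = false; }
                if (value.equals("*")) {
                    inWildcard = true;
                } else if (!value.isEmpty() && agentLc.contains(value.toLowerCase(Locale.ROOT))) {
                    inSpecific = true;
                    sawSpecific = true;
                }
                lastWasUserAgent = true;
            } else if (field.equals("allow") || field.equals("disallow")) {
                lastWasUserAgent = false;
                if (value.isEmpty()) continue;                 // empty Disallow restricts nothing
                Rule rule = new Rule(value, field.equals("allow"));
                if (inSpecific) specific.add(rule);
                if (inWildcard) wildcard.add(rule);
            }
            // Other fields (Sitemap, Crawl-delay, ...) are ignored in this sketch.
        }
        return new RobotsTxtSketch(sawSpecific ? specific : wildcard);
    }

    // Longest matching prefix wins; Allow wins a tie; no match means allowed.
    boolean isAllowed(String path) {
        int bestLen = -1;
        boolean allowed = true;
        for (Rule r : rules) {
            if (!path.startsWith(r.prefix)) continue;
            int len = r.prefix.length();
            if (len > bestLen || (len == bestLen && r.allow)) {
                bestLen = len;
                allowed = r.allow;
            }
        }
        return allowed;
    }

    public static void main(String[] args) {
        String txt = String.join("\n",
                "User-agent: *",
                "Disallow: /private/",
                "Allow: /private/public-info.html",
                "",
                "User-agent: badbot",
                "Disallow: /");
        RobotsTxtSketch mine = parse(txt, "mybot");
        System.out.println(mine.isAllowed("/index.html"));               // true
        System.out.println(mine.isAllowed("/private/data.html"));        // false
        System.out.println(mine.isAllowed("/private/public-info.html")); // true
        System.out.println(parse(txt, "badbot").isAllowed("/index.html")); // false
    }
}
```

For a real crawler you'd still want one of the libraries mentioned above, since they deal with wildcards, encoding quirks, and the many malformed robots.txt files found in the wild.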
Source: https://stackoverflow.com/questions/3141031/robots-txt-parser-java