I'm looking into building a content site with possibly thousands of different entries, accessible by index and by search.
What are the measures I can take to prevent malicious crawlers from ripping
Between this:
What are the measures I can take to prevent malicious crawlers from ripping
and this:
I wouldn't want to block legitimate crawlers altogether.
you're asking for a lot. Fact is, if you're going to try and block malicious scrapers, you're going to end up blocking all the "good" crawlers too.
You have to remember that if people want to scrape your content, they're going to put in a lot more manual effort than a search engine bot will... So get your priorities right. You've two choices: