I'm looking into building a content site with possibly thousands of different entries, accessible by index and by search.
What are the measures I can take to prevent malicious crawlers from copying my content?
Realistically, you can't stop malicious crawlers, and most measures you put in place to deter them are likely to harm your legitimate users as well (aside from perhaps adding honeypot entries to robots.txt so that bad bots can at least be detected).
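As a minimal sketch of that detection idea: list a trap URL under `Disallow:` in robots.txt that no compliant crawler should ever fetch, then scan your access log for clients that requested it anyway. The trap path and the log format (Apache/Nginx common log format) are assumptions here, not anything your site already has.

```python
# Hypothetical honeypot path, listed in robots.txt as:
#   User-agent: *
#   Disallow: /trap-do-not-crawl/
TRAP_PATH = "/trap-do-not-crawl/"

def flag_bad_bots(access_log_lines):
    """Return the set of client IPs that requested the trap URL.

    Assumes common log format, e.g.:
    1.2.3.4 - - [10/Oct/2000:13:55:36 -0700] "GET /page HTTP/1.0" 200 2326
    """
    bad = set()
    for line in access_log_lines:
        parts = line.split()
        if len(parts) < 7:
            continue  # skip malformed lines
        ip, path = parts[0], parts[6]  # field 0 is the client IP, field 6 the path
        if path.startswith(TRAP_PATH):
            bad.add(ip)
    return bad
```

Any IP this returns ignored robots.txt, which is a strong signal of a scraper; you could then rate-limit or block it at the server level.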
So what you have to do is plan on the content being stolen - it's more than likely to happen in one form or another - and decide in advance how you will deal with unauthorized copying.
Prevention isn't possible, and trying to achieve it will be a waste of your time.
The only sure way of making sure that the content on a website isn't vulnerable to copying is to unplug the network cable...
To detect copying, a service like http://www.copyscape.com/ may help.