Prevent site data from being crawled and ripped

终归单人心 2020-12-15 06:32

I'm looking into building a content site with possibly thousands of different entries, accessible by index and by search.

What are the measures I can take to preven

12 answers
  •  半阙折子戏
    2020-12-15 06:38

    If the content is public and freely available, there is nothing you can do, even with page-view throttling or similar measures. If you require registration and/or payment to access the data, you can restrict it a bit, and at least you can see who reads what and identify the users who seem to be scraping your entire database.
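To make the "see who reads what" idea concrete, here is a minimal sketch of per-user view tracking. It assumes registered users with a stable ID, and the class name `ScrapeMonitor`, its parameters, and the thresholds are all hypothetical, not something from the question. It only flags accounts for review. It does not stop a determined scraper, which is the point made above.

```python
import time
from collections import defaultdict, deque


class ScrapeMonitor:
    """Hypothetical helper: flags users who view many distinct entries quickly."""

    def __init__(self, max_views, window_seconds):
        self.max_views = max_views            # distinct entries allowed per window
        self.window_seconds = window_seconds  # length of the sliding window
        self._views = defaultdict(deque)      # user_id -> deque of (timestamp, entry_id)

    def record_view(self, user_id, entry_id, now=None):
        """Record one page view; return True if the user looks like a scraper."""
        now = time.time() if now is None else now
        views = self._views[user_id]
        views.append((now, entry_id))
        # Drop views that have fallen out of the sliding window.
        while views and now - views[0][0] > self.window_seconds:
            views.popleft()
        # Count distinct entries so that re-reading one page is not penalised.
        distinct_entries = {entry for _, entry in views}
        return len(distinct_entries) > self.max_views


monitor = ScrapeMonitor(max_views=3, window_seconds=60)
flags = [monitor.record_view("alice", entry, now=t) for t, entry in enumerate([1, 2, 3, 4])]
```

In this run `flags` ends with `True` only for the fourth distinct entry. A real site would tune the limits against normal reading behaviour and would probably log flagged accounts for a human to look at rather than block them outright.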

    However, I think you should face the fact that this is how the net works: there are not many ways to prevent a machine from reading what a human can. Outputting all your content as images would of course discourage most scrapers, but then the site is no longer accessible, and even non-disabled users would be unable to copy and paste anything, which can be really annoying.

    All in all, this sounds like DRM/game protection systems: pissing the hell out of your legit users only to prevent some bad behavior that you can't really prevent anyway.
