A product I'm helping to develop will basically work like this:
You could try scraping the site yourself first: if you get an HTTP 200 response that includes your script, just use that scrape. If not, fall back to the information from your "client proxy". That way the problem is reduced to only the sites you can't scrape directly.
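A minimal sketch of that scrape-first, proxy-fallback flow. The `direct_fetch` and `proxy_fetch` callables and the `marker` string are placeholders for however your product actually fetches pages and detects its script; they are assumptions, not your real API:

```python
def fetch_page(url, direct_fetch, proxy_fetch, marker):
    """Try a direct scrape first; fall back to a client proxy.

    direct_fetch(url) -> (status_code, body), e.g. a thin wrapper
    around an HTTP client.  proxy_fetch(url) -> body as returned by
    a user's client acting as proxy.  marker is whatever text proves
    your script was served (all three are hypothetical names).
    """
    try:
        status, body = direct_fetch(url)
    except OSError:
        # network failure: treat it the same as a failed scrape
        status, body = 0, ""
    if status == 200 and marker in body:
        return body, "direct"
    return proxy_fetch(url), "proxy"
```

Injecting the fetchers keeps the decision logic testable without any network access.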
To raise security in those cases, you could have multiple users send you the page and filter out any content that isn't present in a minimum number of the responses. That has the added benefit of filtering out user-specific content. Also record which users you asked to do the proxy work and verify that you only accept pages from users you actually asked. Finally, try to make sure that very active users don't get a higher chance of being picked for a job; that makes it harder to "fish" for jobs.
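Those three ideas can be sketched roughly like this. The majority vote here works line-by-line, which is a simplifying assumption (a real page would need smarter diffing), and all the function names are made up for illustration:

```python
import random
from collections import Counter

def consensus_lines(responses, min_count):
    """Keep only lines present in at least min_count of the submitted
    copies.  Drops user-specific content and lone tampered responses
    alike.  Line-level comparison is a simplification."""
    seen = Counter()
    for body in responses:
        for line in set(body.splitlines()):  # count each line once per copy
            seen[line] += 1
    # preserve the first copy's ordering for lines that pass the threshold
    return [ln for ln in responses[0].splitlines() if seen[ln] >= min_count]

def pick_proxies(users, k, rng=random):
    """Choose k users uniformly at random, deliberately ignoring how
    active they are, so an attacker can't 'fish' for jobs by being
    hyperactive."""
    return rng.sample(users, k)

def accept_response(assigned, user, body, inbox):
    """Only accept a page from a user we actually asked, and only once."""
    if user in assigned:
        assigned.discard(user)
        inbox.append(body)
```

With three copies where two agree, `consensus_lines(copies, 2)` keeps the shared lines and drops each user's unique ones.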