Question
I have a C# program which worked fine until a day or two ago. I use the following snippet to grab a page:
string strSiteListPath = @"http://www.ngs.noaa.gov/CORS/dates_sites.txt";
Uri uriSiteListPath = new Uri(strSiteListPath);
System.Net.WebClient oWebClient = new System.Net.WebClient();
string strStationList = oWebClient.DownloadString(uriSiteListPath);
But it consistently returns a 404 Not Found error. The page definitely exists; you are welcome to try it yourself. Because it worked a few days ago and nothing in my code has changed, I suspect the web server changed in some way. That's fine, it happens, but what exactly has changed here?
Why can I browse to the file manually, but DownloadString fails to get the file?
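One way to see what the server is actually sending back is to catch the WebException that DownloadString throws and inspect the attached response; a minimal sketch, reusing the variables from the snippet above (the exception handling is not part of the original question):

try
{
    strStationList = oWebClient.DownloadString(uriSiteListPath);
}
catch (System.Net.WebException ex)
{
    // On a 404 the server's status code and response body are attached to the exception
    var oResponse = ex.Response as System.Net.HttpWebResponse;
    if (oResponse != null)
    {
        System.Console.WriteLine("Status: " + oResponse.StatusCode);
        using (var oReader = new System.IO.StreamReader(oResponse.GetResponseStream()))
        {
            System.Console.WriteLine(oReader.ReadToEnd());
        }
    }
}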
EDIT:
For completeness, the code now looks like:
string strSiteListPath = @"http://www.ngs.noaa.gov/CORS/dates_sites.txt";
Uri uriSiteListPath = new Uri(strSiteListPath);
System.Net.WebClient oWebClient = new System.Net.WebClient();
// Send a browser-like User-Agent header so the server accepts the request
oWebClient.Headers.Add("User-Agent", "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:25.0) Gecko/20100101 Firefox/25.0");
string strStationList = oWebClient.DownloadString(uriSiteListPath);
Thanks again, Thomas Levesque!
Answer 1:
Apparently the site requires a valid User-Agent header. If you set that header to something like this:
Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:25.0) Gecko/20100101 Firefox/25.0
Then the request works fine.
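If you prefer HttpClient over WebClient, the same idea applies; a minimal sketch, not from the original answer, where the User-Agent value is simply the one quoted above:

using System;
using System.Net.Http;
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        using (var client = new HttpClient())
        {
            // Send a browser-like User-Agent header; without one the site rejects the request
            client.DefaultRequestHeaders.TryAddWithoutValidation(
                "User-Agent",
                "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:25.0) Gecko/20100101 Firefox/25.0");

            string stationList = await client.GetStringAsync(
                "http://www.ngs.noaa.gov/CORS/dates_sites.txt");
            Console.WriteLine(stationList);
        }
    }
}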
Source: https://stackoverflow.com/questions/23001265/downloadstring-returns-a-404-error-site-needs-a-user-agent-header