Question
I’m trying to retrieve the content of a webpage in C#. The problem is that the webpage uses Ajax and JavaScript to dynamically create and populate the HTML elements.
The webpage I’m talking about is: http://diseases.jensenlab.org/Entity?order=textmining,knowledge,experiments&textmining=10&knowledge=10&experiments=10&type1=9606&type2=-26&id1=ENSP00000317985
If you use HttpWebRequest to get the HTML of the page, only the JavaScript calls are visible, not the content they generate. So how can I get the results that the JavaScript renders on the page from a C# console program? I have tried using the WebBrowser class but can’t get it to work.
How do you use the WebBrowser class in a new thread to collect the dynamically created table’s results into an ArrayList? Further, how do you access the relevant HTML element if you do not know its name? Can you use the id attribute? This assumes the WebBrowser class is the best way to go about this. Or is there a better way?
The relevant HTML code part is:
<div class="ajax_table" id="53c2583b1f204464d7fa9387e2ac1868"><script>blackmamba_pager('Textmining', 'type1=9606id1=ENSP00000317985type2=-26title=Text+mining',
10, 1, '53c2583b1f204464d7fa9387e2ac1868');</script></div>
Could you please provide an example of how this is done?
Answer 1:
Here, then, is an example, also taken from Stack Overflow:
// Requires: using System; using System.Windows.Forms;
WebBrowser mywebBrowser;

private void Form1_Load(object sender, EventArgs e)
{
    mywebBrowser = new WebBrowser();
    mywebBrowser.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(mywebBrowser_DocumentCompleted);
    // Example address; substitute the URL of the page you want to read.
    Uri address = new Uri("http://www.cnn.com/");
    mywebBrowser.Navigate(address);
}

private void mywebBrowser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    // The document is not completely loaded until this event fires.
    HtmlDocument doc = mywebBrowser.Document;
    // GetElementById returns a single HtmlElement, or null if no element has that id.
    HtmlElement tableDiv = doc.GetElementById("53c2583b1f204464d7fa9387e2ac1868");
}
There's no direct way to get elements by class name as there is with jQuery. If the id of your table div isn't stable, you can call GetElementsByTagName and iterate through the results, using GetAttribute("className") on each element to match your "ajax_table" class.
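Since the question asks about a console program, here is a minimal sketch of how the tag-name iteration might look there. It assumes a project that references System.Windows.Forms and runs the WebBrowser on its own STA thread with a message loop. The names AjaxTableDump and HasClass are made up for illustration. Note also that DocumentCompleted can fire before the page's Ajax calls have returned, so the div may still be empty at that point and a real program may have to wait or poll before reading it.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Windows.Forms;

static class AjaxTableDump
{
    // Hypothetical helper: true if the element's space-separated class list contains 'wanted'.
    static bool HasClass(HtmlElement element, string wanted)
    {
        string classes = element.GetAttribute("className") ?? string.Empty;
        string[] parts = classes.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        return Array.IndexOf(parts, wanted) >= 0;
    }

    static void Main()
    {
        var results = new List<string>();

        var worker = new Thread(() =>
        {
            // WebBrowser is an ActiveX wrapper, so it is created on the STA thread itself.
            var browser = new WebBrowser { ScriptErrorsSuppressed = true };

            browser.DocumentCompleted += (sender, e) =>
            {
                // The event can fire once per frame; skip until the whole document is ready.
                if (browser.ReadyState != WebBrowserReadyState.Complete) return;

                foreach (HtmlElement div in browser.Document.GetElementsByTagName("div"))
                {
                    if (HasClass(div, "ajax_table"))
                    {
                        // Assumption: the Ajax content has arrived; it may not have yet.
                        results.Add(div.InnerText ?? string.Empty);
                    }
                }

                Application.ExitThread(); // end the message loop started below
            };

            browser.Navigate("http://diseases.jensenlab.org/Entity?order=textmining,knowledge,experiments&textmining=10&knowledge=10&experiments=10&type1=9606&type2=-26&id1=ENSP00000317985");
            Application.Run(); // pump messages so the browser can load and raise events
        });

        worker.SetApartmentState(ApartmentState.STA);
        worker.Start();
        worker.Join();

        foreach (string text in results)
        {
            Console.WriteLine(text);
        }
    }
}
```

Matching on the class rather than the id avoids depending on the hash-like id from the question, which may change between page loads.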
Source: https://stackoverflow.com/questions/30007821/retrieve-ajax-javascript-return-results-from-webpage-in-c-sharp