WebBrowser control throws seemingly random NullReferenceException

孤者浪人 submitted on 2020-01-09 10:36:30

Question


For a couple of days I have been working on a WebBrowser-based web scraper. After a couple of prototypes that used Threads and DocumentCompleted events, I decided to see if I could make a simple, easy-to-understand web scraper.

The goal is to create a web scraper that doesn't involve actual Thread objects. I want it to work in sequential steps (i.e. go to url, perform action, go to other url, etc.).

This is what I have so far:

public static class Webscraper
{
    private static WebBrowser _wb;
    public static string URL;

    //WebBrowser objects have to run in a Single-Threaded Apartment for some reason.
    [STAThread] 
    public static void Init_Browser()
    { 
        _wb = new WebBrowser();
    }


    public static void Navigate_And_Wait(string url)
    {
        //Navigate to a specific url.
        _wb.Navigate(url);

        //Wait till the url is loaded.
        while (_wb.IsBusy) ;

        //Loop until current url == target url. (In case a website loads urls in steps)
        while (!_wb.Url.ToString().Contains(url))
        {
            //Wait till next url is loaded
            while (_wb.IsBusy) ;
        }

        //Place URL
        URL = _wb.Url.ToString();
    }
}

I am a novice programmer, but I think this is pretty straightforward code. That's why I detest the fact that, for some reason, the program throws a NullReferenceException at this piece of code:

 _wb.Url.ToString().Contains(url)

I just called the _wb.Navigate() method, so the null reference can't be the _wb object itself. The only thing I can imagine is that _wb.Url is null. But the while (_wb.IsBusy) loop should prevent that.

So what is going on and how can I fix it?


Answer 1:


Busy waiting (while (_wb.IsBusy) ;) on the UI thread isn't advisable. If you use the async/await features of .NET 4.5, you can get the sequential effect you want (i.e. go to url, perform action, go to other url, etc.):

using System.Threading.Tasks;
using System.Windows.Forms;

public static class SOExtensions
{
    public static Task NavigateAsync(this WebBrowser wb, string url)
    {
        TaskCompletionSource<object> tcs = new TaskCompletionSource<object>();
        WebBrowserDocumentCompletedEventHandler completedEvent = null;
        completedEvent = (sender, e) =>
        {
            wb.DocumentCompleted -= completedEvent;
            tcs.SetResult(null);
        };
        wb.DocumentCompleted += completedEvent;

        wb.ScriptErrorsSuppressed = true;
        wb.Navigate(url);

        return tcs.Task;
    }
}



async void ProcessButtonClick()
{
    await webBrowser1.NavigateAsync("http://www.stackoverflow.com");
    MessageBox.Show(webBrowser1.DocumentTitle);

    await webBrowser1.NavigateAsync("http://www.google.com");
    MessageBox.Show(webBrowser1.DocumentTitle);
}
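
The question also wanted to keep waiting while a site loads URLs in steps. A minimal sketch of how the same TaskCompletionSource idea could cover that case follows. NavigateUntilAsync and its isTarget parameter are hypothetical names that are not part of the answer above, and the sketch assumes DocumentCompleted is raised once for each intermediate page or frame, with e.Url identifying the document that finished loading.

```csharp
using System;
using System.Threading.Tasks;
using System.Windows.Forms;

public static class WebBrowserWaitExtensions
{
    // Hypothetical helper (an assumption, not from the original answer):
    // completes only when a loaded document's URL satisfies isTarget.
    public static Task<Uri> NavigateUntilAsync(
        this WebBrowser wb, string url, Func<Uri, bool> isTarget)
    {
        var tcs = new TaskCompletionSource<Uri>();
        WebBrowserDocumentCompletedEventHandler handler = null;
        handler = (sender, e) =>
        {
            // Ignore intermediate pages and frames; e.Url is checked for
            // null explicitly instead of being dereferenced blindly.
            if (e.Url == null || !isTarget(e.Url))
                return;

            wb.DocumentCompleted -= handler;
            tcs.TrySetResult(e.Url);
        };
        wb.DocumentCompleted += handler;

        wb.Navigate(url);
        return tcs.Task;
    }
}
```

It would be awaited the same way as NavigateAsync, passing a predicate such as u => u.ToString().Contains("stackoverflow.com"). Note that if no loaded document ever matches the predicate, the task never completes, so a real scraper would probably want a timeout as well.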


Source: https://stackoverflow.com/questions/16193084/webbrowser-control-throws-seemingly-random-nullreferenceexception
