c# headless browser with javascript support for crawler

前端 未结 2 1513
耶瑟儿~
耶瑟儿~ 2020-12-16 20:58

Could anyone suggest headless browser for .NET that supports cookies and authomatically javascript execution?

2条回答
  •  离开以前
    2020-12-16 21:27

    Selenium+HtmlUnitDriver/GhostDriver is exactly what you are looking for. Oversimplified, Selenium is library for using variety of browsers for automation purposes - testing, scraping, task automation.

    There are different WebDriver classes with which you can operate an actual browser. HtmlUnitDriver is a headless one. GhostDriver is a WebDriver for PhantomJS, so you can write C# while actually PhantomJS will do the heavy lifting.

    Code snippet from Selenium docs for Firefox, but code with GhostDriver (PhantomJS) or HtmlUnitDriver is almost identical.

    using OpenQA.Selenium;
    using OpenQA.Selenium.Firefox;
    using OpenQA.Selenium.Support.UI;
    
    class GoogleSuggest
    {
        static void Main(string[] args)
        {
            // driver initialization varies across different drivers
            // but they all support parameter-less constructors
            IWebDriver driver = new FirefoxDriver();
            driver.Navigate().GoToUrl("http://www.google.com/");
    
    
            IWebElement query = driver.FindElement(By.Name("q"));
            query.SendKeys("Cheese");
            query.Submit();
    
            WebDriverWait wait = new WebDriverWait(driver, TimeSpan.FromSeconds(10));
            wait.Until((d) => { return d.Title.ToLower().StartsWith("cheese"); });
    
            System.Console.WriteLine("Page title is: " + driver.Title);
    
            driver.Quit();
        }
    }
    

    If you run this on Windows machine you can use actual Firefox/Chrome driver because it will open an actual browser window which will operate as programmed in your C#. HtmlUnitDriver is the most lightweight and fast.

    I have successfully ran Selenium for C# (FirefoxDriver) on Linux using Mono. I suppose HtmlUnitDriver will also work as fine as the others, so if you require speed - I suggest you go for Mono (you can develop, test and compile with Visual Studio on Windows, no problem) + Selenium HtmlUnitDriver running on Linux host without desktop.

提交回复
热议问题