HtmlAgilityPack.HtmlDocument Cookies

执念已碎 2021-01-18 17:17

This pertains to cookies set inside a script (maybe inside a script tag).

System.Windows.Forms.HtmlDocument executes those scripts and the cookies set (

2 answers
  •  耶瑟儿~
    2021-01-18 17:59

    I also used Rohit Agarwal's BrowserSession class together with HtmlAgilityPack. For me, though, subsequent calls to the Get function didn't work, because new cookies were set on every call. That's why I added some functions of my own. (My solution is far from perfect; it's just a quick and dirty fix.) But it worked for me, and if you don't want to spend a lot of time investigating the BrowserSession class, here is what I did:

    The added/modified functions are the following:

    class BrowserSession{
       private bool _isPost;
       private HtmlDocument _htmlDoc;
       public CookieContainer cookiePot = new CookieContainer();   //<- This is the new CookieContainer (initialized so Add() cannot hit a null reference)
    
     ...
    
        public string Get2(string url)
        {
            HtmlWeb web = new HtmlWeb();
            web.UseCookies = true;
            web.PreRequest = new HtmlWeb.PreRequestHandler(OnPreRequest2);
            web.PostResponse = new HtmlWeb.PostResponseHandler(OnAfterResponse2);
            HtmlDocument doc = web.Load(url);
            return doc.DocumentNode.InnerHtml;
        }
        public bool OnPreRequest2(HttpWebRequest request)
        {
            request.CookieContainer = cookiePot;
            return true;
        }
        protected void OnAfterResponse2(HttpWebRequest request, HttpWebResponse response)
        {
            //do nothing
        }
        private void SaveCookiesFrom(HttpWebResponse response)
        {
            if ((response.Cookies.Count > 0))
            {
                if (Cookies == null)
                {
                    Cookies = new CookieCollection();
                }    
                Cookies.Add(response.Cookies);
                cookiePot.Add(Cookies);     //-> add the Cookies to the cookiePot
            }
        }
    }
    

    What it does: it saves the cookies from the initial POST response into cookiePot and attaches that same CookieContainer to every request made later. I do not fully understand why the original version did not work, because its AddCookiesTo function does something similar (if (Cookies != null && Cookies.Count > 0) request.CookieContainer.Add(Cookies);). Anyhow, with these added functions it should work.
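
    To illustrate the idea in isolation, here is a minimal sketch that needs no network access. It assumes a made-up host (www.example.com) and a made-up cookie (sessionid); the class name CookiePotDemo and the method HeaderFor are hypothetical and are not part of BrowserSession or HtmlAgilityPack. It only shows that cookies added once to a shared CookieContainer are offered again for later URLs on the same host.

    ```csharp
    using System;
    using System.Net;

    public static class CookiePotDemo
    {
        // Returns the Cookie header a shared container would send for the given URL.
        public static string HeaderFor(string url)
        {
            // One container shared by every request in the session.
            var cookiePot = new CookieContainer();

            // Stand-in for response.Cookies after the login POST (hypothetical values).
            // Each cookie needs a Domain, otherwise Add(CookieCollection) throws.
            var fromLogin = new CookieCollection();
            fromLogin.Add(new Cookie("sessionid", "abc123", "/", "www.example.com"));
            cookiePot.Add(fromLogin);

            // In Get2 the container is assigned to request.CookieContainer;
            // here we just ask it which cookies match the URL.
            return cookiePot.GetCookieHeader(new Uri(url));
        }

        public static void Main()
        {
            Console.WriteLine(HeaderFor("http://www.example.com/secondpage"));
        }
    }
    ```

    Because the container matches cookies by domain and path, a URL on a different host would get an empty header from the same cookiePot.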

    It can be used like this:

    //initial "Login-procedure"
    BrowserSession b = new BrowserSession();
    b.Get("http://www.blablubb/login.php");
    b.FormElements["username"] = "yourusername";
    b.FormElements["password"] = "yourpass";
    string response = b.Post("http://www.blablubb/login.php");
    

    All subsequent calls should use:

    response = b.Get2("http://www.blablubb/secondpageyouwannabrowseto");
    response = b.Get2("http://www.blablubb/thirdpageyouwannabrowseto");
    ...
    

    I hope it helps when you're facing the same problem.
