Await AJAX with HtmlAgilityPack in Xamarin

与世无争的帅哥 提交于 2019-12-05 18:50:37

Solution where you call the ajax method using a webrequest.

So I got bored and figured most of it out. What is missing below is how to identify the Klase by id. The below example will fetch the klase '1GLD'. The reason why we need cookies is in order for the request to know which school we are fetching the Klase from. Also the below code only returns JSON - and not HTML since it is an ajax method we call.

CookieContainer cookies = new CookieContainer();
try
{
    string webAddr = "https://roosters.windesheim.nl/";
    var httpWebRequest = (HttpWebRequest)WebRequest.Create(webAddr);
    httpWebRequest.ContentType = "application/json; charset=utf-8";
    httpWebRequest.Method = "POST";
    httpWebRequest.CookieContainer = cookies;        
    httpWebRequest.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
    httpWebRequest.Headers.Add("X-Requested-With", "XMLHttpRequest");

    var httpResponse = (HttpWebResponse)httpWebRequest.GetResponse();
    using (var streamReader = new StreamReader(httpResponse.GetResponseStream()))
    {
        cookies.Add(httpWebRequest.CookieContainer.GetCookies(httpWebRequest.RequestUri));
    }
}
catch (WebException ex)
{
    Console.WriteLine(ex.Message);
}

//According to my web debugger the cookie will last until the 10th of December. So need to fix a new cookie until then.
//I noticed the url used unixtimestamps at the end of the url. So we just add the unixtimestamp at the end for each request.
long unixTimeStamp = new DateTimeOffset(DateTime.Now).ToUnixTimeMilliseconds() - 100;

//we are now ready to call the ajax method and get the JSON.
try
{
    string webAddr = "https://roosters.windesheim.nl/WebUntis/Timetable.do?request.preventCache="+unixTimeStamp.ToString();
    var httpWebRequest = (HttpWebRequest)WebRequest.Create(webAddr);
    httpWebRequest.ContentType = "application/x-www-form-urlencoded; charset=utf-8";
    httpWebRequest.Method = "POST";
    httpWebRequest.CookieContainer = cookies;
    httpWebRequest.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
    httpWebRequest.Headers.Add("X-Requested-With", "XMLHttpRequest");

    using (var streamWriter = new StreamWriter(httpWebRequest.GetRequestStream()))
    {
        string json = "ajaxCommand=getWeeklyTimetable&elementType=1&elementId=13090&date=20171126&formatId=7&departmentId=0&filterId=-2";

        //The command below will return a JSON datastructure containing all the klases and their relevant ID.
        //string otherJson = "ajaxCommand=getPageConfig&type=1&filter=-2"


        streamWriter.Write(json);
        streamWriter.Flush();
    }


    var httpResponse = (HttpWebResponse)httpWebRequest.GetResponse();
    using (var streamReader = new StreamReader(httpResponse.GetResponseStream()))
    {
        var responseText = streamReader.ReadToEnd();
        //THE RESULTS GETS PRINTED HERE.
        Console.Write(responseText);
    }
}
catch (WebException ex)
{
    Console.WriteLine(ex.Message);
}

Other solution with Selenium with Firefox driver.

This is way easier to do. but it also takes some time. Not all the thread sleeps are necessary. This will give an HTML to work with isntead just like you requested. But I found it necessary in the last foreach loop.

public static void Main(string[] args)
{
    HtmlDocument doc = new HtmlDocument();
    //According to my web debugger the cookie will last until the 10th of December. So need to fix a new cookie until then.
    //I noticed the url used unixtimestamps at the end of the url. So we just add the unixtimestamp at the end for each request.
    long unixTimeStamp = new DateTimeOffset(DateTime.Now).ToUnixTimeMilliseconds() - 100;
    string webAddr = "https://roosters.windesheim.nl/WebUntis/Timetable.do?request.preventCache="+unixTimeStamp.ToString();
    var ffOptions = new FirefoxOptions();
    ffOptions.BrowserExecutableLocation = @"C:\Program Files (x86)\Mozilla Firefox\firefox.exe";
    ffOptions.LogLevel = FirefoxDriverLogLevel.Default;
    ffOptions.Profile = new FirefoxProfile { AcceptUntrustedCertificates = true };
    var service = FirefoxDriverService.CreateDefaultService();

    var driver = new FirefoxDriver(service, ffOptions, TimeSpan.FromSeconds(120));


    driver.Navigate().GoToUrl(webAddr);


    driver.FindElement(By.XPath("//input[@id='school']")).SendKeys("Windesheim"+Keys.Enter);
    Thread.Sleep(2000);
    driver.FindElement(By.XPath("//span[@id='dijit_PopupMenuBarItem_0_text' and text() ='Lesrooster']")).Click();

    driver.FindElement(By.XPath("//td[@id='dijit_MenuItem_0_text' and text() ='Klassen']")).Click();
    Thread.Sleep(2000);

    driver.FindElement(By.XPath("//div[@id='widget_Timetable_toolbar_elementSelect']//input[@class='dijitReset dijitInputField dijitArrowButtonInner']")).Click();

    //we get all the options for Klase
    doc.LoadHtml(driver.PageSource);
    HtmlNodeCollection nodes = doc.DocumentNode.SelectNodes("//div[@id='Timetable_toolbar_elementSelect_popup']/div[@item]");
    List<String> options = new List<String>();
    foreach (HtmlNode n in nodes)
    {
        options.Add(n.InnerText);
    }

    foreach(string s in options)
    {
        driver.FindElement(By.XPath("//input[@id='Timetable_toolbar_elementSelect']")).Clear();
        driver.FindElement(By.XPath("//input[@id='Timetable_toolbar_elementSelect']")).SendKeys(s);
        Thread.Sleep(2000);
        driver.FindElement(By.XPath("//body")).SendKeys(Keys.Enter);
        Thread.Sleep(2000);
        doc.LoadHtml(driver.PageSource);
        //Console.WriteLine(driver.Url); //Now we can see the id of the current Klase
    }

    Console.WriteLine(doc.DocumentNode.InnerHtml);

    Console.ReadKey();
}

Last update

Using the Selenium solution I was able to get the ID's for all courses. I have included the file here so you can use it with your ajax and web requests.

I was going to leave this as a comment. But it got too big and too badly formatted. So here we go.

Firstly. The site is updated dynamically using javascript that is called with an ajaxcommand.

If you can open up a session and store the cookie containing the SESSIONID and the now "encrypted" schoolname then you can call the ajax commands as such.

    https://roosters.windesheim.nl/ajaxCommand=getWeeklyTimetable&elementType=1&elementId=13090&date=20171126&formatId=7&departmentId=0&filterId=-2

This does however require you to know what elementType is and what elementId is.

In this case elementId refers to Klas when it is equal to 1GLD. And formatID(7) refers Roosterformaat when it is equal to "Beknopt". You have to figure out what the remaining variables does. Even more important is that if you succeed in being able to make valid ajax commands to the server then you wont get html back as a response you will receive the data in JSON.

The easiest way to do what you want is to have all the classes in a separate file. And use that as reference point. Same goes for the other options.

And then use a headless browser like phantomjs.org with Selenium. This way you can find and click on the classes you want to scrape. Load the html into a HtmlAgilityPack.HtmlDocument and then do what you need to do. Selenium/PhantomJS till keep track of your cookies. This method is slower - but a lot easier to do.

EDIT Storing cookies from a webrequest - the easy way.

I am not keen on this subject. But OP asked. If anybody has a better way of doing it please edit.

CookieContainer cookies = new CookieContainer();
try
{
    string webAddr = "https://roosters.windesheim.nl/WebUntis/";

    var httpWebRequest = (HttpWebRequest)WebRequest.Create(webAddr);
    httpWebRequest.ContentType = "application/json; charset=utf-8";
    httpWebRequest.Method = "POST";
    httpWebRequest.CookieContainer = cookies;

    httpWebRequest.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
    httpWebRequest.Headers.Add("X-Requested-With", "XMLHttpRequest");
    using (var streamWriter = new StreamWriter(httpWebRequest.GetRequestStream()))
    {
        string json = "ajaxCommand=getWeeklyTimetable&elementType=1&elementId=13092&date=20171126&formatId=7&departmentId=0&filterId=-2";

        streamWriter.Write(json);
        streamWriter.Flush();
    }


    var httpResponse = (HttpWebResponse)httpWebRequest.GetResponse();
    using (var streamReader = new StreamReader(httpResponse.GetResponseStream()))
    {
        cookies.Add(httpWebRequest.CookieContainer.GetCookies(httpWebRequest.RequestUri));
        //cookies.Add(httpResponse.Cookies);
        var responseText = streamReader.ReadToEnd();
        doc.LoadHtml(responseText);
        foreach(Cookie c in httpResponse.Cookies)
        {
            Console.WriteLine(c.ToString());
        } 
    }
}
catch (WebException ex)
{
    Console.WriteLine(ex.Message);
}
    Console.WriteLine(doc.DocumentNode.InnerHtml);

    Console.ReadKey();
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!