How to get html source code after javascript transformation?

时光说笑 2020-12-12 07:40

For a school project I am trying to make a website that shows your grades in a prettier way than it's being done now. I have been able to log in to the site using cUR

1 answer
  • 2020-12-12 08:22

    First you have to understand a subtle but very important difference between using cURL to get a web page and visiting that same page with your browser.

    1. Loading a page with a browser

    When you enter an address in the location bar, the browser first resolves the host name in the URL to an IP address (a DNS lookup). It then contacts the web server at that address and asks for a web page. From this point on, the browser speaks only HTTP with the web server. HTTP is a protocol designed for transferring documents over a network. The browser is really asking for an HTML document (a bunch of text), and the web server answers by sending that document back. If the page is static, the web server simply picks an HTML file and sends it over the network. If the page is dynamic, the web server runs some server-side code (PHP, for example) to generate the page and then sends it over.
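    To make the exchange above concrete, here is a minimal Python sketch of one HTTP request and response written out as plain text. Everything in it is an assumption for illustration: the page content, the names PageHandler, raw_get and demo_exchange, and the throwaway local server that stands in for the real web server.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical static page; a real server would read it from a file
# or generate it with server-side code.
PAGE = b"<html><body><h1>Grades</h1></body></html>"


class PageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The server's whole job here: answer the request with a document.
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def raw_get(host, port, path="/"):
    """Send a hand-written HTTP request and return the reply as text."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    with socket.create_connection((host, port)) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # the server closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks).decode("utf-8")


def demo_exchange():
    """Start a local server, request "/" once, and return the raw reply."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return raw_get("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    # Prints the status line, the headers, a blank line, then the HTML.
    print(demo_exchange())
```

    The reply is nothing more than a status line, some headers and the document itself, which is all the server ever hands over.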

    Once the page has been downloaded, the browser parses it and interprets the HTML inside, which produces the actual web page you see. During parsing, when the browser finds script tags it interprets their content as JavaScript, a language that runs in the browser to manipulate the page and do other work on the client side.
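    As a rough illustration of that parsing step, the sketch below uses Python's standard html.parser to find the script tags in a document. The sample page and the names ScriptCollector and find_scripts are made up for this example. Note that it only locates the script text; actually running it is the part that needs a JavaScript engine, which a browser has and this parser does not.

```python
from html.parser import HTMLParser

# Hypothetical page: in a browser, the script would rewrite the paragraph.
PAGE = """<html><body>
<p id="grade">loading...</p>
<script>document.getElementById("grade").textContent = "A+";</script>
</body></html>"""


class ScriptCollector(HTMLParser):
    """Collects the text found inside every <script> element."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Script content may arrive in several pieces, so append to it.
        if self.in_script:
            self.scripts[-1] += data


def find_scripts(html):
    """Return the source text of each script block in the document."""
    collector = ScriptCollector()
    collector.feed(html)
    collector.close()
    return collector.scripts


if __name__ == "__main__":
    for source in find_scripts(PAGE):
        print(source)
```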

    Remember, the web server only sent a document containing HTML. Any JavaScript in it is just text as far as the server is concerned; the server does not run it.

    So when you load a web page in a browser, the JavaScript is interpreted ONLY after it has been downloaded by the browser.

    2. What is cURL

    If you take a look at the curl man page, you'll learn that curl is a tool for transferring data from or to servers over a number of supported protocols, HTTP being one of them. When you download a page with curl, it requests the page the same way your browser does, but it does not parse or interpret anything. cURL does not understand JavaScript or HTML; all it knows is how to talk to web servers.
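    You can see the same behaviour with any plain HTTP client. The Python sketch below stands in for cURL (it uses urllib, not cURL itself) and downloads a page from a throwaway local server. The page, the "loading..." placeholder and the names ScriptedPageHandler, download and demo_download are assumptions made up for the example.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page: a browser would run the script and show "A+".
PAGE = (
    b'<html><body><p id="grade">loading...</p>'
    b'<script>document.getElementById("grade").textContent = "A+";</script>'
    b"</body></html>"
)


class ScriptedPageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def download(url):
    """Fetch a URL the way a plain HTTP client does: bytes in, nothing run."""
    # An empty ProxyHandler ignores any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url) as response:
        return response.read().decode("utf-8")


def demo_download():
    """Serve PAGE from a local server and return what the client receives."""
    server = HTTPServer(("127.0.0.1", 0), ScriptedPageHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return download(f"http://127.0.0.1:{port}/")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    # The paragraph still holds the placeholder and the script is still
    # plain text, because nothing on this side executed it.
    print(demo_download())
```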

    3. Solution

    So what you need in your case is to download the page the way cURL does and also have its JavaScript interpreted as if it were running inside a browser.

    If you have followed me up to here, then you're ready to take a look at CasperJS.
