How can I get a web page into a string using JavaScript?

Submitted by 旧街凉风 on 2019-12-08 04:24:15

Question


I need to get the HTML content of a page using JavaScript. The page could also be on another domain, similar to what wget does, but in JavaScript. I want to use it for a kind of web crawler.

Using JavaScript, how can I get the content of a page, given a URL, into a string?


Answer 1:


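Try this (it requires jQuery, and the variable url must hold the address of the page to fetch):

// JSONP callback: the service calls cbfunc with an object whose results array holds the page markup.
function cbfunc(html) { alert(html.results[0]); }
$.getScript('http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20html%20where%20url%3D%22' +
    encodeURIComponent(url) + '%22&format=xml&diagnostics=true&callback=cbfunc');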





Answer 2:


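The general way to load content over HTTP via JavaScript is to use the XMLHttpRequest object. It is subject to the same-origin policy, so to access content on other domains you have to work around that restriction, for example by having the target server send CORS headers or by routing the request through a server-side proxy.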
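As a rough illustration of the XMLHttpRequest approach, here is a minimal sketch that resolves a promise with the page body as a string. The function name getPageAsString and its optional XHR parameter are assumptions made for this example (the parameter only exists so that a stand-in constructor can be supplied outside a browser); neither comes from the original answer.

```javascript
// Minimal sketch: load a URL into a string with XMLHttpRequest.
// getPageAsString and the XHR parameter are illustrative names (assumptions).
function getPageAsString(url, XHR) {
  // Fall back to the browser's global XMLHttpRequest when no constructor is passed.
  const Ctor = XHR || XMLHttpRequest;
  return new Promise(function (resolve, reject) {
    const xhr = new Ctor();
    xhr.open('GET', url);
    xhr.onload = function () {
      if (xhr.status >= 200 && xhr.status < 300) {
        resolve(xhr.responseText); // the page content as a string
      } else {
        reject(new Error('HTTP ' + xhr.status));
      }
    };
    // Fires on network failures and on cross-origin requests the browser blocks.
    xhr.onerror = function () {
      reject(new Error('Request failed (network error or blocked cross-origin request)'));
    };
    xhr.send();
  });
}
```

In a browser, calling getPageAsString('/some/page.html') for a same-origin path should resolve with that page's markup; a cross-origin URL is expected to reject unless the remote server allows it via CORS.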

This assumes you are running JS in a web browser (implied by the page possibly being on another domain). If you were not, then other options would be open to you. For example, with Node.js you could use its built-in HTTP client.
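For the Node.js case, a minimal sketch using the built-in http and https modules might look like the following. The helper name fetchPage is hypothetical, and this version deliberately does not follow redirects or handle non-UTF-8 encodings.

```javascript
const http = require('http');
const https = require('https');

// Hypothetical helper: resolves with the response body as a string.
function fetchPage(url) {
  // Pick the client module that matches the URL scheme.
  const client = url.startsWith('https:') ? https : http;
  return new Promise((resolve, reject) => {
    client.get(url, (res) => {
      if (res.statusCode < 200 || res.statusCode >= 300) {
        res.resume(); // discard the body so the socket is released
        reject(new Error('HTTP ' + res.statusCode));
        return;
      }
      res.setEncoding('utf8'); // decode chunks as UTF-8 text
      let body = '';
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => resolve(body));
    }).on('error', reject);
  });
}
```

Because the same-origin policy is enforced by browsers, a script run this way can request pages from any domain, which is closer to the wget-like behaviour the question asks for.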




Answer 3:


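If you also want to capture the html tags of the current page, you could concatenate them onto the inner HTML like this (requires jQuery):

function getPageHTML() {
    // .html() returns only the contents of <html>, so the tags are added back by hand.
    return "<html>" + $("html").html() + "</html>";
}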
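As a jQuery-free alternative, the sketch below reads documentElement.outerHTML, which already includes the html tag together with its attributes. The name getPageHTMLString and its optional doc parameter are assumptions for illustration; the doctype line it emits is simplified and drops any public or system identifiers.

```javascript
// Hypothetical helper; doc defaults to the current page's document in a browser.
function getPageHTMLString(doc) {
  const d = doc || document;
  // Prepend a simplified doctype line if the page declared one.
  const doctype = d.doctype ? '<!DOCTYPE ' + d.doctype.name + '>\n' : '';
  // outerHTML serializes the <html> element itself, not just its children.
  return doctype + d.documentElement.outerHTML;
}
```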

See also: How do I get the entire page's HTML with jQuery?



Source: https://stackoverflow.com/questions/13029811/how-can-i-get-a-web-page-into-a-string-using-javascript
