Get HTML document as string before it has loaded

Submitted by 怎甘沉沦 on 2019-12-24 10:57:21

Problem


I looked at two answers to a similar question, but they only return the HTML contents of the page up to the <script> that executes the code.

For example, in this snippet:

<!DOCTYPE html>
<html>

<head>
  <title>Test</title>

  <script type="text/javascript">
    console.log(new XMLSerializer().serializeToString(document));
  </script>

  <link type="text/css" rel="stylesheet" href="style.css">
</head>

<body>
  <script type="text/javascript" src="testscript1.js"></script>
  <script type="text/javascript" src="testscript2.js"></script>
  <script type="text/javascript" src="testscript3.js"></script>
</body>

</html>

if you look at the console.log() output and scroll past the Stack Overflow markup, you'll see:

<script type="text/javascript">
    console.log(new XMLSerializer().serializeToString(document));
</script></body></html>

The <script> with src="testscript1.js" and the other two <script> tags are not present, i.e. the logged string does not contain all the HTML.

If you put the logging script at the bottom like this:

<!DOCTYPE html>
<html>

<head>
  <title>Test</title>

  <link type="text/css" rel="stylesheet" href="style.css">
</head>

<body>
  <script type="text/javascript" src="testscript1.js"></script>
  <script type="text/javascript" src="testscript2.js"></script>
  <script type="text/javascript" src="testscript3.js"></script>
  
  <script type="text/javascript">
    console.log(new XMLSerializer().serializeToString(document));
  </script>
</body>

</html>

it logs the other <script> tags.

Question

My guess is that since my scripts are loaded synchronously, the log outputs whatever has been loaded up to this point. How could I avoid that? I want my logging <script> to be as close to the top of the HTML as possible, while having access to all the HTML content.

What I've tried

If I put this script in the <head>:

var req = new XMLHttpRequest();
req.open("GET", document.location.href + "index.html", false);
req.onreadystatechange = function () {
    if (req.readyState === 4) {
        if (req.status === 200 || req.status == 0) {
            console.log(req.responseText);
        }
    }
}

req.send(null);

I get the desired result, but I don't like how easily it could fail. For example, if I paste this code as a snippet here on Stack Overflow, it doesn't work because the requested file doesn't exist. If the document were named notindex.html, it would fail too.

Are there any alternatives or a reliable way to request the opened HTML document via an XMLHttpRequest?

Edit

I want to have access to all the HTML content before all stylesheets, scripts and images have loaded. That's the reason I want the logging script to be at the top. The XMLHttpRequest does it, but is unreliable.
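One way to make the request above less brittle is to ask for the page's own address instead of appending "index.html" to it. The following is only a sketch under assumptions: sourceUrlOf and requestSource are hypothetical helper names, not part of the original code, and the request is made asynchronous. It also assumes the server returns the same markup for a second GET of the same URL, which may not hold for dynamically generated pages or for pages opened from file:// URLs.

```javascript
// Hypothetical helper: derive the URL to re-request from the page's own address.
// The fragment ("#...") is never sent to the server, so it is dropped.
function sourceUrlOf(href) {
  const url = new URL(href);
  url.hash = "";
  return url.href;
}

// Hypothetical helper: fetch the raw markup of the page at `href`
// and pass it to `onText` as a string.
function requestSource(href, onText) {
  const req = new XMLHttpRequest();
  req.open("GET", sourceUrlOf(href)); // asynchronous by default
  req.onload = function () {
    // Status 0 is kept from the original code (some non-HTTP schemes report it).
    if (req.status === 200 || req.status === 0) {
      onText(req.responseText);
    }
  };
  req.onerror = function () {
    console.error("Could not request " + href);
  };
  req.send();
}

// Only runs in a browser; does nothing in other environments.
if (typeof document !== "undefined" && typeof XMLHttpRequest !== "undefined") {
  requestSource(document.location.href, function (html) {
    console.log(html);
  });
}
```

Because the URL comes from document.location.href, the file name no longer matters (notindex.html works as well as index.html). The string logged is the response to a second request, not the bytes the parser is currently reading.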


Answer 1:


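You can use the DOMContentLoaded event to run the function once the whole document has been parsed. The event does not wait for images or subframes, but synchronous scripts such as the three in the <body> are still fetched and executed before it fires: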

<!DOCTYPE html>
<html>

<head>
  <title>Test</title>

  <script type="text/javascript">
    document.addEventListener("DOMContentLoaded", function() {
        console.log(new XMLSerializer().serializeToString(document));
    });
  </script>

  <link type="text/css" rel="stylesheet" href="style.css">
</head>

<body>
  <script type="text/javascript" src="testscript1.js"></script>
  <script type="text/javascript" src="testscript2.js"></script>
  <script type="text/javascript" src="testscript3.js"></script>
</body>

</html>
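If the logging script might itself run after parsing has finished (for example when it is injected later), the DOMContentLoaded listener would never fire. A common guard is to check document.readyState first. This is a minimal sketch; whenParsed is a hypothetical helper name, not something from the answer above.

```javascript
// Hypothetical helper: call `callback(doc)` as soon as `doc` is fully parsed.
function whenParsed(doc, callback) {
  if (doc.readyState === "loading") {
    // Still parsing: wait for the event, and listen only once.
    doc.addEventListener("DOMContentLoaded", function () {
      callback(doc);
    }, { once: true });
  } else {
    // "interactive" or "complete": parsing is already done.
    callback(doc);
  }
}

// Only runs in a browser; does nothing in other environments.
if (typeof document !== "undefined" && typeof XMLSerializer !== "undefined") {
  whenParsed(document, function (doc) {
    console.log(new XMLSerializer().serializeToString(doc));
  });
}
```

Passing the document in as a parameter keeps the helper independent of the global document, which also makes it easy to exercise with a stand-in object.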


Source: https://stackoverflow.com/questions/43829832/get-html-document-as-string-before-it-has-loaded
