Include only <script> tag from page source

Posted by 人盡茶涼 on 2020-04-18 05:43:33

Question


I have a string, say headData, which is a combination of <script>, <style> and <link> tags. For example (with dummy data):

let headData = `<style>
        @font-face {
            font-family: 'Roboto';
            font-style: normal;
            font-weight: 300;
            src: local('Roboto Light'), local('Roboto-Light'), url(path-to.woff) format('woff');
        }</style>
    <link rel="dns-prefetch" href="//assets.adobedtm.com">
    <script>var isPresent = false;</script>
    <script>var isContent = true;</script>
    <style>@font-face {
            font-family: 'Courgette';
            font-style: normal;
            font-weight: 400;
            src: local('Courgette Regular'), local('Courgette-Regular'), url(path-to.woff2) format('woff2');}</style>`;

I inject the whole of headData into a tag, like this:

<script dangerouslySetInnerHTML={{__html: headData}} />

I don't want to inject the <style> and <link> tags; I only want the <script> tags and their contents to be injected. Is there a way to achieve this with a regex that selects only the <script> tags?

So what I finally want to inject is similar to:

let headData = `<script>var isPresent = false;</script>
        <script>var isContent = true;</script>`;

What is the right way to achieve this in JavaScript?


Answer 1:


You can find the wanted tags with a regular expression and match(). The pattern below matches only bare <script> tags whose contents include no < or > characters:

/(<script>)[^<>]*(<\/script>)/g

Demo:

let headData = `<style>
        @font-face {
            font-family: 'Roboto';
            font-style: normal;
            font-weight: 300;
            src: local('Roboto Light'), local('Roboto-Light'), url(path-to.woff) format('woff');
        }</style>
    <link rel="dns-prefetch" href="//assets.adobedtm.com" />
    <script>var isPresent = false;<\/script>
    <script>var isContent = true;<\/script>
    <style>@font-face {
            font-family: 'Courgette';
            font-style: normal;
            font-weight: 400;
            src: local('Courgette Regular'), local('Courgette-Regular'), url(path-to.woff2) format('woff2');}</style>`;
            
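// Match each <script>...</script> block; the body may not contain < or >.
const re = /(<script>)[^<>]*(<\/script>)/g;
// match() returns null when nothing matches, so fall back to an empty array.
headData = (headData.match(re) || []).join('\n');
console.log(headData);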
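
If the scripts may carry attributes or contain comparison operators, a more tolerant pattern could look like the sketch below. The helper name extractScriptTags and the sample input are made up for illustration, and the pattern still assumes that no attribute value contains a > character, so it shares the usual limits of matching HTML with a regex.

```javascript
// Hypothetical helper (not from the original answer): collects every
// <script ...>...</script> block found in an HTML string.
function extractScriptTags(html) {
  // \b[^>]*  allows attributes on the opening tag
  // [\s\S]*? lazily matches any body, including newlines and < or >
  // i flag   accepts upper-case tag names as well
  const re = /<script\b[^>]*>[\s\S]*?<\/script\s*>/gi;
  // match() returns null when nothing matches, so fall back to an empty array.
  return (html.match(re) || []).join('\n');
}

// Made-up sample input for illustration.
const sample = [
  '<style>p { color: red; }</style>',
  '<script type="text/javascript">if (1 < 2) { var ok = true; }</script>',
  '<link rel="dns-prefetch" href="//assets.adobedtm.com">',
  '<script>var isContent = true;</script>',
].join('\n');

console.log(extractScriptTags(sample));
```

The lazy quantifier stops at the first closing </script> it meets, so a script whose body contains that text inside a string literal would be cut short.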
 



Answer 2:


I am not familiar with React, but it is generally not a good idea to try to parse HTML using regular expressions.

You could run into all kinds of problems with regular expressions. (For example, some of the script tags could contain code like this: <script> const myString='<script></script>'; </script>).
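
A quick check illustrates the concern. It is a minimal sketch that runs the pattern from Answer 1 against made-up inputs; the countMatches helper is hypothetical. A script body with a comparison operator, or an opening tag with an attribute, is silently skipped.

```javascript
// Pattern from Answer 1: bare <script> tags with no < or > in the body.
const strictPattern = /(<script>)[^<>]*(<\/script>)/g;

// Illustrative inputs (made up for this example).
const plain = '<script>var isPresent = false;</script>';
const withOperator = '<script>if (a < b) { run(); }</script>';
const withAttribute = '<script type="module">var x = 1;</script>';

// Hypothetical helper: counts how many script blocks the pattern finds.
function countMatches(html) {
  return (html.match(strictPattern) || []).length;
}

console.log(countMatches(plain));         // 1
console.log(countMatches(withOperator));  // 0, the "<" in the body blocks the match
console.log(countMatches(withAttribute)); // 0, the attribute blocks the match
```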

I would suggest using the browser's built-in parser rather than regular expressions to extract the script tags and their content.

function getScriptsString(headString) {
  // Let the browser parse the markup into a detached <head> element.
  const head = document.createElement('head');
  head.innerHTML = headString;
  // Keep only the <script> elements and concatenate their markup.
  return Array.from(head.children)
    .filter((el) => el.tagName === 'SCRIPT')
    .map((el) => el.outerHTML)
    .join('');
}

// Usage: headData = getScriptsString(headData);


Source: https://stackoverflow.com/questions/60150137/include-only-script-tag-from-page-source
