JavaScript Regex: How to split html string into array of html elements and text nodes?

Submitted by 大兔子大兔子 on 2021-01-29 16:08:17

Question


For example, this html string:

Lorem <b>ipsum</b> dolor <span class="abc">sit</span> amet,<br/>consectetur <input value="ok"/> adipiscing elit.

into this array:

[ 
  'Lorem ',
  '<b>ipsum</b>',
  ' dolor ', 
  '<span class="abc">sit</span>', 
  ' amet,', 
  '<br/>', 
  'consectetur ', 
  '<input value="ok"/>', 
  ' adipiscing elit.'
]

Here is the example of html elements match:

const pattern = /<([A-Z][A-Z0-9]*)\b[^>]*>(.*?)<\/\1>|<([A-Z][A-Z0-9]*).*?\/>/gi;
let html = 'Lorem <b>ipsum</b> dolor <span class="abc">sit</span> amet,<br/>consectetur <input value="ok"/> adipiscing elit.'
let nodes = html.match(pattern);

console.log(nodes)

How to add the text nodes as well?


Answer 1:


If the HTML is well-formed, consider using DOMParser instead of a regex: parse the string, iterate over the child nodes of the resulting body, and take each node's .outerHTML (for element nodes) or .textContent (for text nodes):

const str = `Lorem <b>ipsum</b> dolor <span class="abc">sit</span> amet,<br/>consectetur <input value="ok"/> adipiscing elit.`;

const doc = new DOMParser().parseFromString(str, 'text/html');
const arr = [...doc.body.childNodes]
  .map(child => child.outerHTML || child.textContent);
console.log(arr);

Because .outerHTML re-serializes the parsed nodes, void elements come back without the self-closing slash: '<br/>' becomes '<br>' and '<input value="ok"/>' becomes '<input value="ok">'.

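You don't have to use DOMParser: you could also assign the string to the innerHTML of an ordinary element on the page and then read that element's childNodes. However, that lets markup in the string run code (for example, through an inline handler such as onerror on an <img>), so it should be avoided for untrusted input.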
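
If a regex-only approach is still needed (for example in Node.js, where DOMParser is not available by default), one possible sketch is to reuse the pattern from the question unchanged and collect the text between consecutive matches. The helper name splitHtml is hypothetical and not part of the original answer, and the sketch assumes flat, well-formed markup like the example string: nested tags of the same name, void elements written without the trailing slash (such as <br>), or a '>' inside an attribute value will break it.

```javascript
// Hypothetical helper (not from the original answer): splits an HTML string
// into element strings and the text runs between them, using the question's regex.
function splitHtml(html) {
  // Pattern copied verbatim from the question.
  const pattern = /<([A-Z][A-Z0-9]*)\b[^>]*>(.*?)<\/\1>|<([A-Z][A-Z0-9]*).*?\/>/gi;
  const parts = [];
  let last = 0; // index just past the previous element match

  for (const m of html.matchAll(pattern)) {
    // Text between the previous element and this one is a text node.
    if (m.index > last) parts.push(html.slice(last, m.index));
    parts.push(m[0]); // the whole matched element
    last = m.index + m[0].length;
  }

  // Trailing text after the last element, if any.
  if (last < html.length) parts.push(html.slice(last));
  return parts;
}

const html = 'Lorem <b>ipsum</b> dolor <span class="abc">sit</span> amet,<br/>consectetur <input value="ok"/> adipiscing elit.';
console.log(splitHtml(html));
```

Unlike the DOMParser version, this keeps each element exactly as written in the input (including '<br/>'), but it is only as reliable as the regex itself, so the parser remains the safer choice whenever a DOM is available.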



Source: https://stackoverflow.com/questions/61049576/javascript-regex-how-to-split-html-string-into-array-of-html-elements-and-text
