I want to create a scraper using Google Spreadsheets with Google Apps Script. I know it is possible and I have seen some tutorials and threads about it.
The main ide
I had some good luck today just by massaging the HTML before parsing it:
// close unclosed tags
html = html.replace(/(<(?=link|meta|br|input)[^>]*)(?<!\/)>/ig, '$1/>')
// force script / style content into cdata
html = html.replace(/(<(script|style)[^>]*>)/ig, '$1<![CDATA[').replace(/(<\/(script|style)[^>]*>)/ig, ']]>$1')
// change & to &amp;
html = html.replace(/&(?!amp;)/g, '&amp;')
// now it works! (tested with original url)
let document = XmlService.parse(html)
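
For what it's worth, here is a rough sketch of the same three clean-up steps wrapped in one helper, so the effect on a small input is easy to see. The function name `sanitizeHtmlForXml` and the sample markup are made up for illustration, and it assumes the V8 runtime, since the first regex uses a lookbehind. The ampersand step is as blunt as above: it also re-escapes entities like `&lt;` and any `&` inside script content, so it may need tightening for some pages.

```javascript
// Hypothetical helper: applies the three regex clean-ups in order.
function sanitizeHtmlForXml(html) {
  // 1. Self-close link/meta/br/input tags that are not already closed (<br> becomes <br/>)
  html = html.replace(/(<(?=link|meta|br|input)[^>]*)(?<!\/)>/ig, '$1/>');
  // 2. Wrap script and style bodies in CDATA so "<" and ">" inside them are not parsed
  html = html
    .replace(/(<(script|style)[^>]*>)/ig, '$1<![CDATA[')
    .replace(/(<\/(script|style)[^>]*>)/ig, ']]>$1');
  // 3. Escape every ampersand that is not already the start of "&amp;"
  html = html.replace(/&(?!amp;)/g, '&amp;');
  return html;
}

// Made-up sample input, only to show the transformation.
const sample =
  '<html><head><meta charset="utf-8"><style>a > b {}</style></head>' +
  '<body>Tom & Jerry<br></body></html>';
const cleaned = sanitizeHtmlForXml(sample);

// In Apps Script the result would then go to the parser, roughly like this
// (url is a placeholder):
//   const html = UrlFetchApp.fetch(url).getContentText();
//   const doc = XmlService.parse(sanitizeHtmlForXml(html));
```

Real pages can still fail to parse after this (unclosed `<img>` or `<hr>` tags, unquoted attributes, and so on), so treat it as a starting point and add tags to the first regex as needed.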