I want to create a scraper using Google Spreadsheets with Google Apps Script. I know it is possible and I have seen some tutorials and threads about it.
The main ide
Please be aware that certain websites may not permit automated scraping of their content, so consult their terms of service before using Apps Script to extract it.
The XmlService only works against valid XML documents, and most HTML (especially HTML5) is not valid XML. A previous version of the XmlService, simply called Xml, allowed for "lenient" parsing, which let it parse HTML as well. This service was sunset in 2013, but for the time being it still functions. The reference docs are no longer available, but this old tutorial shows its usage.
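To illustrate the limitation, here is a minimal sketch that tries XmlService first and falls back to a regular expression when the page is not well-formed XML. The function names `extractTitle` and `fetchPageTitle` are hypothetical, and the XmlService branch assumes an XHTML-style page with a `head` and `title` element. A regex is only a rough substitute for a real HTML parser, so treat this as a starting point rather than a robust solution.

```javascript
// Pure helper: return the text of the first <title> element, or null.
// Whitespace runs (including newlines) are collapsed to single spaces.
function extractTitle(html) {
  var match = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html);
  return match ? match[1].replace(/\s+/g, ' ').trim() : null;
}

// Apps Script entry point (hypothetical name). Runs only inside Apps Script,
// where UrlFetchApp and XmlService are available.
function fetchPageTitle(url) {
  var html = UrlFetchApp.fetch(url).getContentText();
  try {
    // Succeeds only if the page is well-formed XML (for example XHTML).
    var root = XmlService.parse(html).getRootElement();
    var ns = root.getNamespace();
    return root.getChild('head', ns).getChild('title', ns).getText();
  } catch (e) {
    // Typical HTML5 ends up here because XmlService.parse() throws.
    return extractTitle(html);
  }
}
```

From a spreadsheet-bound script you could call `fetchPageTitle('https://example.com')` and write the result into a cell.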
Another alternative is to use a service like Kimono, which handles the scraping and parsing parts and provides a simple API you can call via UrlFetchApp to retrieve the structured data.
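As a rough sketch of that approach, the code below fetches JSON with UrlFetchApp and writes it to the active sheet. The URL is a placeholder, not a real endpoint of any particular service, and the response is assumed to be a JSON array of flat objects; `toRows` and `importFromApi` are hypothetical names. A real API will likely need a key and a different response shape.

```javascript
// Pure helper: turn an array of flat objects into a 2D array with a header row.
// Column order follows the keys of the first record; missing values become ''.
function toRows(records) {
  if (!records.length) return [];
  var headers = Object.keys(records[0]);
  var rows = records.map(function (record) {
    return headers.map(function (key) {
      return record[key] !== undefined ? record[key] : '';
    });
  });
  return [headers].concat(rows);
}

// Apps Script entry point (hypothetical name). Runs only inside Apps Script.
function importFromApi() {
  var API_URL = 'https://example.com/api/items.json'; // placeholder URL (assumption)
  var response = UrlFetchApp.fetch(API_URL);
  var records = JSON.parse(response.getContentText()); // assumed: array of flat objects
  var rows = toRows(records);
  if (rows.length) {
    SpreadsheetApp.getActiveSheet()
      .getRange(1, 1, rows.length, rows[0].length)
      .setValues(rows);
  }
}
```

Keeping the row conversion in its own function means it can be checked without network access, while `importFromApi` stays a thin wrapper around the Apps Script services.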