Question
I'm trying to scrape the number of posts for a given hashtag (#castles) and populate a Google Sheets cell using ImportXML.
I tried copying the XPath from Chrome and pasting it into the ImportXML parameter in the cell, like this:
=ImportXML("https://www.instagram.com/explore/tags/castels/", "//*[@id="react-root"]/section/main/header/div[2]/div/div[2]/span/span")
I noticed a problem with the nested double quotation marks, so I also tried:
=ImportXML("https://www.instagram.com/explore/tags/castels/", "//*[@id='react-root']/section/main/header/div[2]/div/div[2]/span/span")
Nevertheless, both return an error.
What am I doing wrong?
P.S. I am aware of the XPath to the meta description tag, "//meta[@name='description']/@content",
but I would like to scrape the exact number of posts, not an abbreviated number.
Answer 1:
Try this:
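function hashCount() {
  // Use a full URL, including the scheme.
  var url = 'https://www.instagram.com/explore/tags/cats/';
  var response = UrlFetchApp.fetch(url, {muteHttpExceptions: true}).getContentText();
  // The exact count is embedded in the page source as JSON.
  var regex = /(edge_hashtag_to_media":{"count":)(\d+)(,"page_info":)/;
  var match = regex.exec(response);
  // match is null if the expected JSON is not in the response.
  var count = match ? match[2] : null;
  Logger.log(count);
}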
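I've added muteHttpExceptions: true, which was missing from my comment above. With it, the fetch returns the response instead of throwing when the server replies with an HTTP error status. Hope this helps.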
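To get the count into a cell rather than the log, the same idea might be wrapped as a custom function. This is only a sketch: the names extractHashtagCount and HASHTAGCOUNT are made up for illustration, and it assumes the page source still embeds the "edge_hashtag_to_media" JSON that the regex above relies on, which Instagram may change or hide behind a login page.

```javascript
// Pull the exact post count out of the raw page source.
// Assumption: the source contains JSON of the form
// "edge_hashtag_to_media":{"count":12345,...}
function extractHashtagCount(pageSource) {
  var match = /"edge_hashtag_to_media":\{"count":(\d+)/.exec(pageSource);
  // Return null rather than throwing when the pattern is absent.
  return match ? parseInt(match[1], 10) : null;
}

// Hypothetical custom function for Sheets, e.g. =HASHTAGCOUNT("castles")
// UrlFetchApp is only available inside Google Apps Script.
function HASHTAGCOUNT(tag) {
  var url = 'https://www.instagram.com/explore/tags/' + encodeURIComponent(tag) + '/';
  var response = UrlFetchApp.fetch(url, {muteHttpExceptions: true});
  var count = extractHashtagCount(response.getContentText());
  if (count === null) {
    throw new Error('Post count not found in the response for #' + tag);
  }
  return count;
}
```

Keeping the regex in its own function means the parsing can be checked against a saved copy of the page source without calling Instagram each time.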
Source: https://stackoverflow.com/questions/58063800/scrape-instagram-web-hashtag-posts