Scrape Instagram Web Hashtag Posts

做~自己de王妃 submitted on 2020-01-14 14:11:58

Question


I'm trying to scrape the number of posts for a given hashtag (#castles) and populate a Google Sheet cell using ImportXML.

I tried copying the XPath from Chrome and pasting it into the ImportXML parameter in the cell like this:

=ImportXML("https://www.instagram.com/explore/tags/castels/", "//*[@id="react-root"]/section/main/header/div[2]/div/div[2]/span/span")

I saw there is a problem with the quotation marks so I also tried:

=ImportXML("https://www.instagram.com/explore/tags/castels/", "//*[@id='react-root']/section/main/header/div[2]/div/div[2]/span/span")

Nevertheless, both return an error.

What am I doing wrong?

P.S. I am aware of the XPath for the meta description tag, "//meta[@name='description']/@content"; however, I would like to scrape the exact number of posts, not an abbreviated number.


Answer 1:


Try this -

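function hashCount() {
  // Use the full URL, including the scheme.
  var url = 'https://www.instagram.com/explore/tags/cats/';
  // muteHttpExceptions makes fetch return error responses instead of throwing.
  var response = UrlFetchApp.fetch(url, {muteHttpExceptions: true}).getContentText();
  // Group 2 captures the exact post count from the JSON embedded in the page.
  var regex = /(edge_hashtag_to_media":{"count":)(\d+)(,"page_info":)/;
  var match = regex.exec(response);
  // exec returns null when the pattern is absent from the response.
  var count = match ? match[2] : null;
  Logger.log(count);
}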

I've added muteHttpExceptions: true, which was missing from my comment above. Hope this helps.
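
Since the question asks for the count in a sheet cell rather than in the log, the same regex idea could be wrapped in a custom function. This is only a minimal sketch: it assumes the page source still embeds the edge_hashtag_to_media JSON fragment and that Instagram serves the page without a login. The names extractHashtagCount and HASHTAGCOUNT are hypothetical and not part of the original answer.

```javascript
// Hypothetical helper: pulls the post count out of the page source.
// Returns null when the expected JSON fragment is not present.
function extractHashtagCount(html) {
  var match = /"edge_hashtag_to_media":\{"count":(\d+)/.exec(html);
  return match ? Number(match[1]) : null;
}

// Hypothetical custom function for Google Sheets.
// UrlFetchApp exists only inside Google Apps Script.
function HASHTAGCOUNT(tag) {
  var url = 'https://www.instagram.com/explore/tags/' + encodeURIComponent(tag) + '/';
  var html = UrlFetchApp.fetch(url, {muteHttpExceptions: true}).getContentText();
  var count = extractHashtagCount(html);
  if (count === null) {
    // Assumption: a missing fragment means the layout changed or a login page came back.
    throw new Error('Post count not found in the response.');
  }
  return count;
}
```

Once saved in the script editor bound to the sheet, it could be called from a cell as =HASHTAGCOUNT("castles"). Keeping the parsing in its own function means the regex can be checked against a saved copy of the page without making a request.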



Source: https://stackoverflow.com/questions/58063800/scrape-instagram-web-hashtag-posts
