Google Sheets IMPORTXML XPath help (understanding how to read a page source)

Posted by 魔方 西西 on 2020-02-06 08:35:31

Question


I am trying to write a function that returns the annual dividend payout for a given stock. The website I am using is www.seekingalpha.com.

I understand that the function is =IMPORTXML(URL, xpath_query). In my case, the URL is https://seekingalpha.com/symbol/VOO/dividends/scorecard, but I am having trouble figuring out the correct XPath to extract the dividend value.

I currently have this as my function:

 =IMPORTXML(CONCATENATE("https://www.seekingalpha.com/symbol/", $B2, "/dividends/scorecard"), "//body")

$B2 is the cell that holds the ticker symbol, in case you are wondering. I right-clicked the number I wanted on the website, inspected it, and tried to trace which elements it is nested under, but I keep ending up with the wrong path, because I am usually left with an "Empty" error.

I have also tried copying the XPath directly:

/html/body/div[2]/div[1]/div/div[1]/div/div/div[2]/section/section[1]/table/tbody/tr/td[1] 

but I am greeted with another empty-result error.

Could anyone point me in the right direction? I've been researching this for a while and figured this would be a great way to learn. Thank you in advance.


Answer 1:


You need a different source. Google Sheets' IMPORTXML cannot scrape content that is rendered by JavaScript. You can test whether a page depends on JavaScript by disabling JavaScript for that site in your browser; whatever is still displayed can be scraped. In your case, nothing is left.
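The same check can be roughly approximated outside the browser by looking only at the text that is present in the raw HTML and ignoring anything inside script tags. The following is a minimal sketch under assumptions: the helper names (`static_text`, `is_in_static_html`), both sample pages, and the value "1.23" are made up for illustration and are not taken from Seeking Alpha.

```python
from html.parser import HTMLParser


class StaticTextExtractor(HTMLParser):
    """Collects text found in the raw HTML, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # > 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def static_text(html):
    """Return the text a non-JavaScript client would see in this markup."""
    parser = StaticTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def is_in_static_html(html, needle):
    """True if `needle` appears in the page without running any JavaScript."""
    return needle in static_text(html)


# Hypothetical page whose value is only produced by a script at runtime.
js_rendered = (
    "<html><body><div id='root'></div>"
    "<script>render({dividend: '1.23'})</script></body></html>"
)
# Hypothetical page whose value is already in the served markup.
server_rendered = "<html><body><table><tr><td>1.23</td></tr></table></body></html>"

print(is_in_static_html(js_rendered, "1.23"))
print(is_in_static_html(server_rendered, "1.23"))
```

To try this on a real page, the markup could be fetched first (for example with `urllib.request.urlopen`) and passed to `is_in_static_html`; some sites reject non-browser clients, so an empty or blocked response is not by itself proof that the page is rendered by JavaScript.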


UPDATE:

=INDEX(IMPORTXML("https://stocknews.com/stock/"&A15&"/dividends/", 
 "//div[@class='grade-cat-ytd']"), 2)

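To illustrate how the formula above reads: the XPath selects every div whose class attribute is exactly grade-cat-ytd, and INDEX(..., 2) then keeps the second match. Below is a minimal Python sketch of that selection, assuming well-formed markup; the `SAMPLE` snippet and the `nth_match` helper are hypothetical and do not reflect the real structure of stocknews.com.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup (not copied from any real site).
SAMPLE = """<html><body>
<div class="grade-cat-ytd">first</div>
<div class="other">skip</div>
<div class="grade-cat-ytd">second</div>
</body></html>"""


def nth_match(markup, path, n):
    """Return the text of the n-th element (1-based, like INDEX) matching `path`."""
    root = ET.fromstring(markup)
    matches = root.findall(path)          # all matches, in document order
    return "".join(matches[n - 1].itertext()).strip()


# ElementTree needs a relative path, so "//div[...]" is written as ".//div[...]".
print(nth_match(SAMPLE, ".//div[@class='grade-cat-ytd']", 2))
```

Note that `@class='grade-cat-ytd'` is an exact comparison: an element with additional class names would not match, in this sketch or in the XPath passed to IMPORTXML.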



Source: https://stackoverflow.com/questions/59310459/google-sheets-importxml-xpath-help-understanding-how-to-read-a-page-source
