Cheerio, axios, reactjs to web scrape a table off a webpage returning empty list

非 Y 不嫁゛ submitted on 2021-01-29 08:53:50

Question


I'm trying to scrape the table on this page: https://www.investing.com/commodities/real-time-futures

But for some reason when I try to get the data, I keep getting an empty list.

This is what I'm doing to get the data and parse it:

// Assumes axios and cheerio are already imported in this module.
componentDidMount() {
  axios.get('https://www.investing.com/commodities/real-time-futures')
    .then((response) => {
      if (response.status === 200) {
        const html = response.data;
        const $ = cheerio.load(html);
        const data = [];
        // Collect one entry per row of the table with id "cross_rate_1".
        $('#cross_rate_1 tr').each((i, elem) => {
          data.push({
            Month: $(elem).find('td#left noWrap').text()
          });
        });
        console.log(data);
      }
    }, (error) => console.log('err', error)); // log the actual error, not just a label
}

This is a screenshot of the particular part of the source code I'm trying to scrape.

Any help is much appreciated.


Answer 1:


As already mentioned, the table in question is constantly updated via a websocket connection. You can try getting the data by either 1) connecting to the websocket or 2) scraping the dynamically generated HTML.
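Before choosing between the two options, it can help to confirm whether the rows are present at all in the HTML that axios receives. The following is a minimal sketch of such a check using only plain JavaScript. The function name `countStaticRows` and the sample markup are hypothetical and not taken from the real page, and the regular expressions are a rough diagnostic, not a substitute for an HTML parser.

```javascript
// Rough check: count <tr> tags inside the first <table> with the given id.
// Assumes tableId contains no regex metacharacters (true for "cross_rate_1").
function countStaticRows(html, tableId) {
  const tablePattern = new RegExp(
    `<table[^>]*\\bid=["']${tableId}["'][^>]*>([\\s\\S]*?)</table>`,
    'i'
  );
  const match = html.match(tablePattern);
  if (!match) return -1; // the table element itself is missing from the markup
  const rows = match[1].match(/<tr[\s>]/gi);
  return rows ? rows.length : 0;
}

// Hypothetical sample markup for illustration only.
const staticHtml = '<table id="cross_rate_1"><tbody></tbody></table>';
const renderedHtml =
  '<table id="cross_rate_1"><tbody>' +
  '<tr><td class="left noWrap">Aug 20</td></tr>' +
  '<tr><td class="left noWrap">Sep 20</td></tr>' +
  '</tbody></table>';

console.log(countStaticRows(staticHtml, 'cross_rate_1'));       // 0
console.log(countStaticRows(renderedHtml, 'cross_rate_1'));     // 2
console.log(countStaticRows('<p>blocked</p>', 'cross_rate_1')); // -1
```

In the question's code you could call `countStaticRows(response.data, 'cross_rate_1')` before loading the HTML into cheerio. A result of 0 or -1 would suggest the rows are added in the browser after the page loads, in which case no cheerio selector can find them in the raw response.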

If you only need a snapshot of the data rather than a continuous time series, you can use a browser scraping extension. That way you don't have to deal with the websocket implementation.

I've identified the CSS selectors for the price data and created a scraping configuration for the open-source browser extension https://github.com/get-set-fetch/extension.

"eLtI4gnapZTLDsIgEEV/hejGLrC+F25N3OrCpUlD6FhIWmiY0f6+1Hd9EJsuSEguGRg4h8fSlS0Km/r3ZesjHR0g2zrtKzL2IYg1wOqLZ2hEicrSwxhFVOIyjquqGmpzAiRtsqG0RSxv5TVg7EDkvC7AD9etmqJlQBz9ONRW8HvgJ06UwD2HpCV/gtpFylFnC39A/s51A3qphMlg94ruBbtNCe5iMr5/EP/S3ICZf4H5myP/0tv3rSIm/oiQjBmlS0OKS6XzdDCJ9iYQT8PxLBzPw/Ei6rWwpZ0dZ2cMF5M="

Inside the extension do: new project > config hash > paste the above hash (without the quotes) > save, scrape, view results > export as csv.
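Once the snapshot is exported as CSV, it still has to be turned into objects before a React component can render it. The following is a minimal sketch of that step in plain JavaScript. The helper names `parseCsvLine` and `csvToObjects`, the column names, and the sample values are all hypothetical, since the real columns depend on the scraping configuration. The parser handles quoted fields containing commas but not line breaks inside a quoted field.

```javascript
// Parse one CSV line, honoring double-quoted fields and "" escapes.
function parseCsvLine(line) {
  const fields = [];
  let current = '';
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        current += '"'; // escaped quote inside a quoted field
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === ',') {
      fields.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

// Turn CSV text (header row first) into an array of objects keyed by header.
function csvToObjects(text) {
  const lines = text.split(/\r?\n/).filter((l) => l.trim() !== '');
  if (lines.length === 0) return [];
  const header = parseCsvLine(lines[0]);
  return lines.slice(1).map((line) => {
    const cells = parseCsvLine(line);
    const row = {};
    header.forEach((name, i) => { row[name] = cells[i] ?? ''; });
    return row;
  });
}

// Made-up sample export; not real market data.
const sample = 'Commodity,Month,Last\nGold,Aug 20,"1,812.30"\nCopper,Sep 20,2.901\n';
const rows = csvToObjects(sample);
console.log(rows.length);  // 2
console.log(rows[0].Last); // 1,812.30
```

For larger or messier exports, a dedicated CSV library would likely be more robust than this hand-rolled parser.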

Disclaimer: I'm the extension author.



Source: https://stackoverflow.com/questions/62905427/cheerio-axios-reactjs-to-web-scrape-a-table-off-a-webpage-returning-empty-list
