How to find the REST API parameters to a site that doesn't have the data within the HTML?

為{幸葍}努か submitted on 2021-02-11 17:36:09

Question


This is a follow-up to my previous question, linked here for reference: Webscraping Blockchain data seemingly embedded in Javascript through Python, is this even the right approach?

Basically, I receive an output and want to scrape some more features from it. In this case, the final link is https://tracker.icon.foundation/address/hx4ae18d8f72200dc564673a0ae7206d862992753c

where I'm trying to retrieve the balance of 3,570.5434 ICX shown in the middle of the page. I'm clearly not calling the correct methods (I couldn't find any documentation for this), and I was wondering where in the source code I could find them. My attempt in Python:

import requests
url = "https://tracker.icon.foundation/v3/block/txList"
params = {
    "from": 'hx4ae18d8f72200dc564673a0ae7206d862992753c',
}
response = requests.get(url, params=params)
response.json()["data"]

Answer 1:


The value you're trying to scrape, the total ICX balance, appears to be the sum of the "available" ICX and the "staked" ICX. (In the screenshot that accompanied this answer, both values were marked with red lines for emphasis.)

Again, if you log your browser's network traffic, you'll find that these two values come from requests made to different REST APIs. One is an HTTP GET request, the other an HTTP POST request. You can work out what the POST payload is supposed to look like from the same network traffic logs.

If you need a little guidance on how to approach this kind of network-traffic-sniffing solution, I recommend you read this answer I posted for a different question, where someone needed help scraping information from a page about different wines and vineyards, and that page also happened to make XHR requests to a REST API. In it, I go more in-depth about each step of logging your network traffic and finding the information you're looking for.

The functions get_available_icx and get_staked_icx, defined further down, reproduce the two requests:
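If clicking through the network tab by hand gets tedious, one option is to export the log as a HAR file from your browser's developer tools and filter it in Python. The following is only a sketch: list_json_api_calls is a name I made up, and SAMPLE_HAR is a hypothetical, heavily trimmed stand-in for a real export (a real one has many more fields per entry). It keeps only the requests whose responses were JSON, which is usually where the REST API calls are hiding.

```python
import json


def list_json_api_calls(har):
    """Return (method, url, post_body) for every request that got a JSON response."""
    calls = []
    for entry in har["log"]["entries"]:
        request = entry["request"]
        mime_type = entry["response"]["content"].get("mimeType", "")
        if "json" not in mime_type:
            continue  # skip HTML, CSS, images, scripts, ...
        # postData is only present on requests that carry a body (e.g. POST)
        post_body = request.get("postData", {}).get("text")
        calls.append((request["method"], request["url"], post_body))
    return calls


# Hypothetical, trimmed HAR data for illustration only
SAMPLE_HAR = {
    "log": {
        "entries": [
            {
                "request": {"method": "GET", "url": "https://tracker.icon.foundation/main.css"},
                "response": {"content": {"mimeType": "text/css"}},
            },
            {
                "request": {
                    "method": "GET",
                    "url": "https://tracker.icon.foundation/v3/address/info?address=hx...",
                },
                "response": {"content": {"mimeType": "application/json"}},
            },
            {
                "request": {
                    "method": "POST",
                    "url": "https://wallet.icon.foundation/api/v3",
                    "postData": {
                        "mimeType": "application/json",
                        "text": json.dumps({"jsonrpc": "2.0", "id": 0, "method": "icx_call"}),
                    },
                },
                "response": {"content": {"mimeType": "application/json"}},
            },
        ]
    }
}

if __name__ == "__main__":
    for method, url, body in list_json_api_calls(SAMPLE_HAR):
        print(method, url, body)
```

With a real export you would load the file with json.load and pass the resulting dict to the same function; the printed POST bodies show you what payload the page sends.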

import sys

import requests


def get_available_icx(address):
    # The "available" balance comes from a plain HTTP GET request to the tracker API.
    url = "https://tracker.icon.foundation/v3/address/info"

    params = {
        "address": address
    }

    response = requests.get(url, params=params)
    response.raise_for_status()

    return float(response.json()["data"]["balance"])


def get_staked_icx(address):
    # The "staked" amount comes from a JSON-RPC call sent as an HTTP POST request.
    url = "https://wallet.icon.foundation/api/v3"

    # Payload copied from the request seen in the browser's network traffic log.
    payload = {
        "jsonrpc": "2.0",
        "id": 0,
        "method": "icx_call",
        "params": {
            "from": "hx0000000000000000000000000000000000000000",
            "to": "cx0000000000000000000000000000000000000000",
            "dataType": "call",
            "data": {
                "method": "getDelegation",
                "params": {
                    "address": address
                }
            }
        }
    }

    response = requests.post(url, json=payload)
    response.raise_for_status()

    # "totalDelegated" is a hex string; parse it and scale it down by 10 ** 18 to get ICX.
    return int(response.json()["result"]["totalDelegated"], 16) / (10 ** 18)


def main():

    address = "hx4ae18d8f72200dc564673a0ae7206d862992753c"

    total_icx = get_available_icx(address) + get_staked_icx(address)
    print(total_icx)

    return 0


if __name__ == "__main__":
    sys.exit(main())
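One detail worth spelling out is the last line of get_staked_icx: the API returns "totalDelegated" as a hex string, and the script divides the parsed integer by 10 ** 18 to get ICX. Below is a small offline sketch of that conversion. It assumes, as that division implies, that the hex value counts units of 10^-18 ICX; hex_to_icx and UNITS_PER_ICX are names I made up for illustration. It uses Decimal instead of float so that large balances don't pick up rounding noise.

```python
from decimal import Decimal

# Scaling factor taken from the division in get_staked_icx (assumed: 10**18 units per ICX)
UNITS_PER_ICX = 10 ** 18


def hex_to_icx(hex_value):
    """Convert a hex string such as "0xde0b6b3a7640000" to an ICX amount."""
    # int() with base 16 accepts the "0x" prefix
    return Decimal(int(hex_value, 16)) / Decimal(UNITS_PER_ICX)


if __name__ == "__main__":
    print(hex_to_icx("0xde0b6b3a7640000"))  # 0xde0b6b3a7640000 == 10**18, so this prints 1
    print(hex_to_icx(hex(5 * 10 ** 17)))    # half of 10**18, so this prints 0.5
```

If you'd rather keep the whole sum exact, you could also wrap the tracker's "balance" value in Decimal(str(...)) instead of float before adding the two numbers.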


Source: https://stackoverflow.com/questions/65838245/how-to-find-the-rest-api-parameters-to-a-site-that-doesnt-have-the-data-within
