Python CURL output different from original html

Submitted by 我怕爱的太早我们不能终老 on 2021-01-07 01:42:31

Question


I am trying to fetch the HTML body of a Spotify web page. But when I write the output to a file, the result is for some reason different from the original HTML (it is a completely different page).

curl https://open.spotify.com/artist/4npEfmQ6YuiwW1GpUmaq3F > test.html

Eventually I will do this in Python, so if anyone knows how to get around this page redirect, please help.


Answer 1:


Spotify recognizes that you are using an unsupported "browser". curl is not a browser, so don't expect it to behave like one. You will need to "fake" a real browser by sending the right headers, something like:

curl 'https://open.spotify.com/artist/4npEfmQ6YuiwW1GpUmaq3F' \
-X 'GET' \
-H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' \
-H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0.2 Safari/605.1.15'
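Since the question asks about doing this from Python, here is a minimal sketch of the same idea using only the standard library's urllib.request. The header values are copied from the curl command above; the names BROWSER_HEADERS, build_request and fetch_html are hypothetical helpers introduced for illustration, and nothing here guarantees that Spotify will accept these headers.

```python
import urllib.request

# Header values copied from the curl command above.
BROWSER_HEADERS = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "User-Agent": (
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) "
        "AppleWebKit/605.1.15 (KHTML, like Gecko) "
        "Version/14.0.2 Safari/605.1.15"
    ),
}


def build_request(url):
    """Build a GET request carrying the browser-like headers (hypothetical helper)."""
    return urllib.request.Request(url, headers=BROWSER_HEADERS, method="GET")


def fetch_html(url, timeout=10):
    """Send the request and return the body decoded as text (hypothetical helper)."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        # Fall back to UTF-8 if the server does not declare a charset.
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


# Example (performs a network request, so it is left commented out):
# html = fetch_html("https://open.spotify.com/artist/4npEfmQ6YuiwW1GpUmaq3F")
# with open("test.html", "w", encoding="utf-8") as f:
#     f.write(html)
```

Building the request separately from sending it makes it easy to inspect exactly which headers will go out before touching the network.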



Answer 2:


First, install requests by running "pip install requests" in cmd.

Then you can use it (you can also add a header with a User-Agent if needed).

import requests

url = "https://www.google.com"  # for example; requests requires the scheme (https://)
headers = {"User-Agent": "Apple Web Kit"}  # add a User-Agent header
html = requests.get(url, headers=headers).content.decode()  # body decoded as UTF-8
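The question mentions a page redirect. The thread does not say whether Spotify issues a real HTTP redirect or simply serves different content at the same URL, so the following is only a sketch of how one might check. It relies on the history, status_code and url attributes of a requests response; redirect_chain and fetch are hypothetical helper names, and the example.com URLs are made up for an offline illustration.

```python
from types import SimpleNamespace


def redirect_chain(response):
    """Return (status_code, url) for every hop, ending with the final response."""
    hops = [(r.status_code, r.url) for r in response.history]
    hops.append((response.status_code, response.url))
    return hops


def fetch(url, user_agent):
    """Fetch url with the given User-Agent (hypothetical helper)."""
    # Imported here so redirect_chain() can be tried without requests installed.
    import requests
    return requests.get(url, headers={"User-Agent": user_agent}, timeout=10)


# Offline illustration with a stand-in object that mimics the three
# attributes used above (history, status_code, url); the URLs are made up.
fake = SimpleNamespace(
    status_code=200,
    url="https://example.com/unsupported",
    history=[SimpleNamespace(status_code=302, url="https://example.com/page")],
)
print(redirect_chain(fake))
```

With a real response, a chain of length one would suggest that no HTTP redirect happened and that the server chose the page content based on the request headers instead.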


Source: https://stackoverflow.com/questions/65392491/python-curl-output-different-from-original-html
