Question
I have been trying to scrape arrival and departure data for domestic flights from the New Delhi International Airport website. I have tried almost everything, but I cannot extract the data: when I run the code, it returns nothing. Similar code worked on another airport's website. Here is the code I wrote.
import requests
from bs4 import BeautifulSoup
res = requests.get("https://m.newdelhiairport.in/live-flight-information-all.aspx?FLMode=A&FLType=D")
soup = BeautifulSoup(res.content, 'html5lib')
table = soup.find_all('tbody', {'class': 'arr_dep_table_body'})
print(table)
Here is the link to the website: https://m.newdelhiairport.in/live-flight-information-all.aspx?FLMode=A&FLType=D
Answer 1:
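As mentioned, you can use the alternative URL that the data is sourced from. The page loads the flight table through a separate XHR request, which would explain why parsing the page URL returns nothing. You will also need to send a User-Agent header.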
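from io import StringIO
import requests
import pandas as pd
url = 'https://m.newdelhiairport.in/get-all-Fids-FlightInfo.aspx?FltType=D&FltWay=A&FltNum=&FltFrom=&rn=0.992638793938065'
# Send a browser-like User-Agent header with the request
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
res = requests.get(url, headers=headers)
# read_html returns a list of DataFrames, one per table found in the HTML
# (StringIO avoids the deprecation warning newer pandas versions raise for raw HTML strings)
tables = pd.read_html(StringIO(res.text))
print(tables)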
I found this URL in the browser's network tab: I opened the tab, reloaded the page, and inspected the XHR traffic.
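Since the question asks for departures as well as arrivals, it may help to see the XHR URL broken into its query parameters. The following is only a rough sketch using the standard library. The helper name `build_fids_url` is made up for illustration, and the meaning of each parameter is my guess from the names, not something the site documents. In particular, passing `"D"` as `FltWay` to get departures is an unverified assumption, so check the network tab on the departures page before relying on it.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Endpoint and query string copied from the answer above.
XHR_URL = (
    "https://m.newdelhiairport.in/get-all-Fids-FlightInfo.aspx"
    "?FltType=D&FltWay=A&FltNum=&FltFrom=&rn=0.992638793938065"
)

# Split the URL into its base and its query parameters.
parts = urlsplit(XHR_URL)
BASE_URL = f"{parts.scheme}://{parts.netloc}{parts.path}"
# keep_blank_values=True preserves the empty FltNum and FltFrom parameters.
params = dict(parse_qsl(parts.query, keep_blank_values=True))


def build_fids_url(flt_way="A", flt_type="D"):
    """Hypothetical helper: rebuild the XHR URL with a different
    FltWay / FltType, leaving the other parameters as captured."""
    query = dict(params)           # copy so the captured values stay intact
    query["FltType"] = flt_type    # assumed: "D" = domestic
    query["FltWay"] = flt_way      # assumed: "A" = arrivals, "D" = departures
    return f"{BASE_URL}?{urlencode(query)}"


arrivals_url = build_fids_url("A")    # identical to the URL in the answer
departures_url = build_fids_url("D")  # unverified guess for departures
print(params)
print(departures_url)
```

Either URL can then be passed to `requests.get` with the same User-Agent header as in the answer's code.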
Source: https://stackoverflow.com/questions/53755324/failure-in-scraping-the-flight-data-table-from-airport-website