Failure in scraping the flight data table from an airport website

你离开我真会死。 submitted on 2019-12-24 11:55:11

Question


I have been trying to scrape arrival and departure data for domestic flights from the website of New Delhi International Airport. I have tried almost everything, but I cannot extract the data: when I run the code, it returns nothing. Similar code worked on another airport's website. Here is the code I wrote.

import requests
from bs4 import BeautifulSoup

res = requests.get("https://m.newdelhiairport.in/live-flight-information-all.aspx?FLMode=A&FLType=D")
soup = BeautifulSoup(res.content, 'html5lib')
table = soup.find_all('tbody', {'class': 'arr_dep_table_body'})
print(table)

Here is the link to the website: https://m.newdelhiairport.in/live-flight-information-all.aspx?FLMode=A&FLType=D

A screenshot of the website


Answer 1:


As mentioned, you can use the alternative URL that the data is sourced from. You will need to add a User-Agent header to the request.

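from io import StringIO

import pandas as pd
import requests

# Endpoint that the page calls via XHR to load the flight table
url = 'https://m.newdelhiairport.in/get-all-Fids-FlightInfo.aspx?FltType=D&FltWay=A&FltNum=&FltFrom=&rn=0.992638793938065'
# The request needs a browser-like User-Agent header
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
res = requests.get(url, headers=headers)
# read_html returns a list of DataFrames, one per <table> in the response
tables = pd.read_html(StringIO(res.text))
print(tables)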
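Since the question asks for both arrivals and departures, the query string above could be built from parameters instead of being hard-coded. The following is only a sketch: `build_fids_url` is a hypothetical helper, the parameter names are copied from the URL above, and two things are assumptions that are not confirmed anywhere in this thread: that `FltWay=D` selects departures (by analogy with `FltWay=A` for arrivals), and that `rn` is just a random cache-busting value.

```python
import random
from urllib.parse import urlencode

# Endpoint taken from the answer above
BASE_URL = "https://m.newdelhiairport.in/get-all-Fids-FlightInfo.aspx"


def build_fids_url(flt_way, flt_type="D", flt_num="", flt_from="", rn=None):
    """Build the flight-info URL (hypothetical helper, not part of the site).

    flt_way:  "A" for arrivals, as in the answer; "D" for departures is a guess.
    flt_type: "D" for domestic, as in the answer.
    rn:       assumed to be a cache-busting random number.
    """
    if rn is None:
        rn = random.random()
    params = {
        "FltType": flt_type,
        "FltWay": flt_way,
        "FltNum": flt_num,
        "FltFrom": flt_from,
        "rn": rn,
    }
    # urlencode keeps the dict order and emits empty values as "FltNum="
    return f"{BASE_URL}?{urlencode(params)}"


arrivals_url = build_fids_url("A")
departures_url = build_fids_url("D")  # assumption: "D" means departures
print(arrivals_url)
print(departures_url)
```

Either URL can then be passed to `requests.get` together with the same User-Agent header as in the answer's code.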

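I found the URL in the browser's Network tab: I opened the tab, reloaded the page, and inspected the XHR traffic. The page appears to fill the table through that separate request, so fetching the page itself returns no flight rows.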
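One way to check from code whether a table is filled in after the page loads is to count the rows present in the raw HTML. The sketch below uses the standard library's `html.parser` rather than the BeautifulSoup call from the question, purely to keep it dependency-free. The two HTML strings and the flight numbers in them are made up for illustration and are not real responses from the airport site, and `RowCounter` and `count_rows` are hypothetical names.

```python
from html.parser import HTMLParser


class RowCounter(HTMLParser):
    """Count <tr> tags nested inside a <tbody> that carries a given class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.depth = 0  # greater than 0 while inside the target <tbody>
        self.rows = 0

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "tbody" and (self.depth or self.target_class in classes):
            self.depth += 1
        elif tag == "tr" and self.depth:
            self.rows += 1

    def handle_endtag(self, tag):
        if tag == "tbody" and self.depth:
            self.depth -= 1


def count_rows(html, target_class="arr_dep_table_body"):
    parser = RowCounter(target_class)
    parser.feed(html)
    return parser.rows


# Made-up stand-ins: a page whose table body is empty until JavaScript runs,
# and a fragment that already contains the rows.
static_page = '<table><tbody class="arr_dep_table_body"></tbody></table>'
filled_fragment = (
    '<table><tbody class="arr_dep_table_body">'
    "<tr><td>AI 101</td></tr><tr><td>6E 202</td></tr>"
    "</tbody></table>"
)

print(count_rows(static_page))      # 0
print(count_rows(filled_fragment))  # 2
```

A count of zero on `res.text` from the original page, combined with rows being visible in the browser, would suggest that the data is loaded by a later request such as the XHR call described above.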



Source: https://stackoverflow.com/questions/53755324/failure-in-scraping-the-flight-data-table-from-airport-website
