Question
I've been trying for the last 3 hours to scrape this website and get the rank, name, wins, and losses of each team.
When I run this code:
import requests
from bs4 import BeautifulSoup
halo = requests.get("https://www.halowaypoint.com/en-us/esports/standings")
page = BeautifulSoup(halo.content, "html.parser")
final = page.encode('utf-8')
print(final.find_all("div"))
I keep getting this error:
AttributeError: 'bytes' object has no attribute 'find_all'
If anyone can help me out then it would be much appreciated!
Thanks!
Answer 1:
You are calling the method on the wrong variable. Use the BeautifulSoup object page, not the byte string final:
print(page.find_all("div"))
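To see why the original call fails, here is a minimal sketch that uses a small inline HTML string (an assumption made so the example runs without network access) instead of the real page. Calling encode on a BeautifulSoup object serializes it to bytes, and bytes has none of the parsing methods:

```python
from bs4 import BeautifulSoup

# Tiny inline document (hypothetical) standing in for the downloaded page.
html = "<div><span>a</span></div><div>b</div>"

page = BeautifulSoup(html, "html.parser")  # parsed tree with find_all, select, ...
final = page.encode("utf-8")               # serialized markup as plain bytes

print(type(page).__name__)         # BeautifulSoup
print(type(final).__name__)        # bytes
print(hasattr(final, "find_all"))  # False: bytes only has str-like methods such as find
print(len(page.find_all("div")))   # 2
```

So search on page first, and only encode the text you eventually want to write out.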
Getting the table data is pretty straightforward: all of it is inside the div that matches the CSS selector "div.table.table--hcs":
import requests
from bs4 import BeautifulSoup

halo = requests.get("https://www.halowaypoint.com/en-us/esports/standings")
page = BeautifulSoup(halo.content, "html.parser")
# The standings live in the div that has both the "table" and "table--hcs" classes.
table = page.select_one("div.table.table--hcs")
# Print the column headings from the header row.
print(",".join([td.text for td in table.select("header div.td")]))
# Each div.tr holds one team's rank, name, wins, and losses.
for row in table.select("div.tr"):
    rank, team = row.select_one("span.numeric--medium.hcs-trend-neutral").text, row.select_one("div.td.hcs-title").span.a.text
    wins, losses = [div.span.text for div in row.select("div.td.em-7")]
    print(rank, team, wins, losses)
If we run the code, you can see the data matches the table:
In [4]: print(",".join([td.text for td in table.select("header div.td")]))
Rank,Team,Wins,Losses
In [5]: for row in table.select("div.tr"):
   ...:     rank,team = row.select_one("span.numeric--medium.hcs-trend-neutral").text,row.select_one("div.td.hcs-title").span.a.text
   ...:     wins, losses = [div.span.text for div in row.select("div.td.em-7")]
   ...:     print(rank,team, wins, losses)
   ...:
1 Counter Logic Gaming 10 1
2 Team EnVyUs 8 3
3 Enigma6 8 3
4 Renegades 6 5
5 Team Allegiance 5 6
6 Evil Geniuses 4 7
7 OpTic Gaming 2 9
8 Team Liquid 1 10
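If you want the rows as data rather than printed text, the same selectors can be wrapped in a function that skips rows missing an expected element. This is only a sketch: parse_standings and the SAMPLE markup are hypothetical and merely imitate the class names used above, and the real page's structure may differ. Matching the rank on span.numeric--medium alone (without the trend class) is a choice made here, not something the page guarantees.

```python
from bs4 import BeautifulSoup

# Hypothetical markup imitating the class names used by the selectors above.
SAMPLE = """
<div class="table table--hcs">
  <header>
    <div class="td">Rank</div><div class="td">Team</div>
    <div class="td">Wins</div><div class="td">Losses</div>
  </header>
  <div class="tr">
    <div class="td"><span class="numeric--medium hcs-trend-neutral">1</span></div>
    <div class="td hcs-title"><span><a href="#">Counter Logic Gaming</a></span></div>
    <div class="td em-7"><span>10</span></div>
    <div class="td em-7"><span>1</span></div>
  </div>
  <div class="tr">
    <div class="td">malformed row</div>
  </div>
</div>
"""


def parse_standings(html):
    """Return a list of dicts, one per well-formed team row."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.select_one("div.table.table--hcs")
    if table is None:  # table not found: return nothing instead of raising
        return []
    rows = []
    for row in table.select("div.tr"):
        rank = row.select_one("span.numeric--medium")
        team = row.select_one("div.td.hcs-title a")
        scores = [d.get_text(strip=True) for d in row.select("div.td.em-7")]
        # Skip rows that lack any of the expected pieces.
        if rank is None or team is None or len(scores) != 2:
            continue
        rows.append({
            "rank": int(rank.get_text(strip=True)),
            "team": team.get_text(strip=True),
            "wins": int(scores[0]),
            "losses": int(scores[1]),
        })
    return rows


standings = parse_standings(SAMPLE)
print(standings)
```

With the live site you would pass halo.content to the function instead of SAMPLE.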
Source: https://stackoverflow.com/questions/38260853/bytes-object-has-no-attribute-find-all