I'm trying to scrape Year & Winners (first & second columns) from the "List of finals matches" table (second table) from http://en.wikipedia.org/wiki/List_of_FIFA_W
If you are looking at the page through the browser's inspect tool, the browser will have inserted the tbody tags.
The source code may or may not contain them. I suggest looking at the source view if you really want to know.
Either way, you do not need to traverse to the tbody, simply:
soup.findAll('table')[0].findAll('tr')
should work.
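A minimal sketch of why skipping the tbody works, assuming BeautifulSoup 4 (the bs4 package) and its built-in "html.parser". The two HTML strings and the row_texts helper are made up for illustration. find_all (the newer spelling of findAll) searches all descendants, so both tables give the same rows:

```python
from bs4 import BeautifulSoup

# Two hypothetical tables: one written without <tbody>, one with it.
html_without = "<table><tr><td>1930</td></tr><tr><td>1934</td></tr></table>"
html_with = (
    "<table><tbody><tr><td>1930</td></tr><tr><td>1934</td></tr></tbody></table>"
)

def row_texts(html):
    """Return the text of every <tr> in the first table (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all looks through all descendants, so an intervening <tbody> is harmless.
    return [tr.get_text(strip=True) for tr in soup.find_all("table")[0].find_all("tr")]

rows_without = row_texts(html_without)
rows_with = row_texts(html_with)
print(rows_without == rows_with)
```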
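import urllib2  # Python 2; on Python 3 use urllib.request instead
from bs4 import BeautifulSoup  # assumes BeautifulSoup 4

url = "http://en.wikipedia.org/wiki/List_of_FIFA_World_Cup_finals"
soup = BeautifulSoup(urllib2.urlopen(url).read())
for tr in soup.findAll('table')[2].findAll('tr'):
    pass  # get data from each row here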
And then search what you need in the table :)
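For the "search what you need" step, here is a rough sketch of pulling the first two columns out of each row. It assumes BeautifulSoup 4 and uses a small made-up table in place of the downloaded page; the real Wikipedia markup may differ (for example, the year may sit in a th cell), so treat the cell selection as an assumption to adjust:

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page; the real markup may differ.
sample_html = """
<table>
  <tr><th>Year</th><th>Winners</th><th>Score</th></tr>
  <tr><td>1930</td><td>Uruguay</td><td>4-2</td></tr>
  <tr><td>1934</td><td>Italy</td><td>2-1</td></tr>
</table>
"""

soup = BeautifulSoup(sample_html, "html.parser")
results = []
for tr in soup.find_all("table")[0].find_all("tr"):
    cells = tr.find_all("td")
    if len(cells) < 2:
        continue  # the header row here uses <th>, so it has no <td> cells
    year = cells[0].get_text(strip=True)
    winner = cells[1].get_text(strip=True)
    results.append((year, winner))

print(results)
```

On the real page you would use the table index from the snippet above instead of [0].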
Run the code below directly.
tr_elements = soup.find_all('table')[2].find_all('tr')
This gives you access to all the <tr> elements. You will have to use a for loop to go through them (there are other possible ways to iterate too). Don't try to find the tbody; it gets added by default.
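As an illustration of the iteration, here is a small sketch assuming BeautifulSoup 4 and a made-up page with three tables, so that index 2 picks the third one. It shows a plain for loop and, as one of the other possible ways, a list comprehension:

```python
from bs4 import BeautifulSoup

# Hypothetical page with three tables; index 2 selects the third one.
html = (
    "<table><tr><td>a</td></tr></table>"
    "<table><tr><td>b</td></tr></table>"
    "<table><tr><td>1930</td></tr><tr><td>1934</td></tr></table>"
)
soup = BeautifulSoup(html, "html.parser")
tr_elements = soup.find_all("table")[2].find_all("tr")

# Plain for loop over the rows.
loop_values = []
for tr in tr_elements:
    loop_values.append(tr.get_text(strip=True))

# The same result with a list comprehension.
comp_values = [tr.get_text(strip=True) for tr in tr_elements]

print(loop_values)
```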
Note:
If you are having trouble getting to the desired tag, remove the tags that come before it with the .decompose() method.
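A short sketch of the .decompose() idea, assuming BeautifulSoup 4. The two tables and their id values are invented for the example. Removing the first table from the tree makes the next one the first match:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: an unwanted table followed by the one we want.
html = (
    "<div>"
    "<table id='unwanted'><tr><td>skip</td></tr></table>"
    "<table id='wanted'><tr><td>1930</td></tr></table>"
    "</div>"
)
soup = BeautifulSoup(html, "html.parser")

# decompose() removes the tag and its contents from the tree.
soup.find("table").decompose()

first_table = soup.find("table")
remaining = len(soup.find_all("table"))
print(first_table["id"], remaining)
```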