BeautifulSoup - AttributeError: 'NavigableString' object has no attribute 'find_all'

Submitted by 六眼飞鱼酱① on 2021-01-28 08:42:01

Question


I'm trying to get this script to iterate through an HTML file and print out the desired results, but it keeps giving me the error below. It works fine with only one "game" in the table, but breaks as soon as there is more than one. I want it to iterate over more than one game/parking ticket row, and I can't get any further because of this.

Traceback (most recent call last):
  File "C:/Users/desktop/Desktop/tabletest.py", line 11, in <module>
    for rows in table.find_all('tr'):
  File "C:\Program Files\Python36\lib\site-packages\bs4\element.py", line 737, in __getattr__
    self.__class__.__name__, attr))
AttributeError: 'NavigableString' object has no attribute 'find_all'

This is my code:

import pandas as pd
from bs4 import BeautifulSoup
import requests
import lxml.html as lh


with open("htmltabletest.html", encoding="utf-8") as f:
    data = f.read()
    soup = BeautifulSoup(data, 'lxml')
    for table in soup.find('table', attrs={'id': 'eventSearchTable'}):
        for rows in table.find_all('tr'):
            cols = table.find_all('td')

            empty = cols[0].get_text()
            eventdate = cols[1].get_text()
            eventname = cols[2].get_text()
            tickslisted = cols[3].get_text()
            pricerange = cols[4].get_text()

            entry = (empty, eventdate, eventname, tickslisted, pricerange)

            print(entry)

This is what's in the HTML file:

<table class="dataTable st-alternateRows" id="eventSearchTable">
<thead>
<tr>
<th id="th-es-rb"><div class="dt-th"> </div></th>
<th id="th-es-ed"><div class="dt-th"><span class="th-divider"> </span>Event date<br/>Time (local)</div></th>
<th id="th-es-en"><div class="dt-th"><span class="th-divider"> </span>Event name<br/>Venue</div></th>
<th id="th-es-ti"><div class="dt-th"><span class="th-divider"> </span>Tickets<br/>listed</div></th>
<th id="th-es-pr"><div class="dt-th es-lastCell"><span class="th-divider"> </span>Price<br/>range</div></th>
</tr>
</thead>
<tbody class="" id="eventSearchTbody"><tr class="even" id="r-se-103577924">
<td class="nowrap"><input class="es-selectedEvent" id="se-103577924-check" name="selectEvent" type="radio"/></td>
<td class="nowrap" id="se-103577924-eventDateTime">Thu, 10/11/2018<br/>8:20 p.m.</td>
<td><div><a class="ellip" href="services/priceanalysis?eventId=103577924&amp;sectionId=0" id="se-103577924-eventName" target="_blank">Philadelphia Eagles at New York Giants</a></div><div id="se-103577924-venue">MetLife Stadium, East Rutherford, NJ</div></td>
<td id="se-103577924-nrTickets">6655</td>
<td class="es-lastCell nowrap" id="se-103577924-priceRange"><span id="se-103577924-minPrice">$134.50</span>  to<br/><span id="se-103577924-maxPrice">$2,222.50</span></td>
</tr><tr class="odd" id="r-se-103577925">
<td class="nowrap"><input class="es-selectedEvent" id="se-103577925-check" name="selectEvent" type="radio"/></td>
<td class="nowrap" id="se-103577925-eventDateTime">Thu, 10/11/2018<br/>8:21 p.m.</td>
<td><div><a class="ellip" href="services/priceanalysis?eventId=103577925&amp;sectionId=0" id="se-103577925-eventName" target="_blank">PARKING PASSES ONLY Philadelphia Eagles at New York Giants</a></div><div id="se-103577925-venue">MetLife Stadium Parking Lots, East Rutherford, NJ</div></td>
<td id="se-103577925-nrTickets">929</td>
<td class="es-lastCell nowrap" id="se-103577925-priceRange"><span id="se-103577925-minPrice">$20.39</span>  to<br/><span id="se-103577925-maxPrice">$3,602.50</span></td>
</tr></tbody>
</table>

Answer 1:


The error lies in the way you iterate on the table, more specifically at the line:

for table in soup.find('table', attrs={'id': 'eventSearchTable'}):

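Use find_all if you want to iterate over tables. find returns a single Tag, and looping over a Tag walks its children, which include the whitespace text nodes (NavigableString objects) between the child elements. Those have no find_all method, hence the error. You can see the difference by checking the type each method returns: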

print(type(soup.find('table', attrs={'id': 'eventSearchTable'})))
# <class 'bs4.element.Tag'>
print(type(soup.find_all('table', attrs={'id': 'eventSearchTable'})))
# <class 'bs4.element.ResultSet'>

In the first case you get a single table (a Tag); in the second you get a ResultSet of tables (just one in your case), each of type bs4.element.Tag.
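To see where the NavigableString in the traceback comes from, here is a minimal sketch that loops over the Tag returned by find and records the type of each child. The trimmed table below is a stand-in for your file, and html.parser is used only so the snippet has no dependency on lxml; both are assumptions for illustration.

```python
from bs4 import BeautifulSoup

# Trimmed, hypothetical stand-in for the table in htmltabletest.html
html = """<table id="eventSearchTable">
<thead><tr><th>Event name</th></tr></thead>
<tbody><tr><td>Philadelphia Eagles at New York Giants</td></tr></tbody>
</table>"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find("table", attrs={"id": "eventSearchTable"})

# Looping over a Tag yields its direct children, not a list of tables.
# The newline right after <table ...> is the first child, and it is text.
child_types = [type(child).__name__ for child in table]
print(child_types)

# find_all, by contrast, returns a ResultSet whose items are all Tags.
table_types = [type(t).__name__ for t in soup.find_all("table", attrs={"id": "eventSearchTable"})]
print(table_types)
```

The first child is the newline text node, so calling find_all on it fails before any row is reached.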

Thus, you have two options. Either fetch the single table and drop the outer for loop:

table = soup.find('table', attrs={'id': 'eventSearchTable'})

or keep the loop and iterate over the result of find_all:

for table in soup.find_all("table", {"id":"eventSearchTable"}):
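Putting the first option into the whole script, a possible sketch looks like the following. It goes a little beyond the fix above, so treat these parts as assumptions rather than as part of the original answer: the cells are read from each row (row.find_all) instead of from the whole table, since table.find_all('td') would return the first row's cells on every pass; the header row is skipped because it contains th cells and no td; and the HTML is an inline, shortened copy of your table parsed with html.parser so the example runs on its own. The function name parse_events is hypothetical.

```python
from bs4 import BeautifulSoup

# Shortened inline copy of the table from the question (assumption: the real
# script would read this from htmltabletest.html instead).
html = """<table id="eventSearchTable">
<thead><tr><th> </th><th>Event date</th><th>Event name</th><th>Tickets</th><th>Price</th></tr></thead>
<tbody><tr>
<td><input name="selectEvent" type="radio"/></td>
<td>Thu, 10/11/2018<br/>8:20 p.m.</td>
<td><div><a>Philadelphia Eagles at New York Giants</a></div><div>MetLife Stadium, East Rutherford, NJ</div></td>
<td>6655</td>
<td><span>$134.50</span>  to<br/><span>$2,222.50</span></td>
</tr><tr>
<td><input name="selectEvent" type="radio"/></td>
<td>Thu, 10/11/2018<br/>8:21 p.m.</td>
<td><div><a>PARKING PASSES ONLY Philadelphia Eagles at New York Giants</a></div><div>MetLife Stadium Parking Lots, East Rutherford, NJ</div></td>
<td>929</td>
<td><span>$20.39</span>  to<br/><span>$3,602.50</span></td>
</tr></tbody>
</table>"""


def parse_events(markup):
    """Return one tuple of cell texts per data row of the event table."""
    soup = BeautifulSoup(markup, "html.parser")
    table = soup.find("table", attrs={"id": "eventSearchTable"})
    if table is None:
        return []

    entries = []
    for row in table.find_all("tr"):
        cols = row.find_all("td")  # cells of this row only
        if len(cols) < 5:
            continue  # header row has <th> cells, so no <td> is found
        # Join text fragments with a space and trim surrounding whitespace
        entries.append(tuple(col.get_text(" ", strip=True) for col in cols[:5]))
    return entries


entries = parse_events(html)
for entry in entries:
    print(entry)
```

Each tuple holds the radio-button cell (empty text), event date, event name with venue, tickets listed, and price range, in that order.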


Source: https://stackoverflow.com/questions/52020762/beautifulsoup-attributeerror-navigablestring-object-has-no-attribute-find
