Question
I would like to ask how I can extract the events' fees from this website using a Python library (BeautifulSoup) for web scraping.
However, the fee element shares its class with other properties. Are there any suggestions for extracting only the fees? I have tried find_next, find_next_sibling and find next_parent, but with no success. Below is the raw HTML where the price is located:
<div class="eds-event-card-content__sub eds-text-bm eds-text-color--ui-600 eds-l-mar-top-1 eds-event-card-content__sub--cropped">Free</div>
I would appreciate any help.
Below is the code that I have tried. I only get a list of tags in my array.
import pandas as pd
import requests
from bs4 import BeautifulSoup
url = 'https://www.eventbrite.com/d/malaysia--kuala-lumpur--85675181/all-events/?page=1'
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")
#Finding common container for each event
containers = soup.find_all('article', class_ = 'eds-l-pad-all-4 eds-event-card-content eds-event-card-content--list eds-event-card-content--standard eds-event-card-content--fixed eds-l-pad-vert-3')
event_fees = []
for container in containers:
        fees = soup.select('div', class_ ='eds-event-card-content__sub eds-text-bm eds-text-color--ui-600 eds-l-mar-top-1 eds-event-card-content__sub--cropped')
        event_fees.append(fees.txt)
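Independently of where the prices come from, the loop above has a few problems of its own: it searches `soup` instead of `container`, `select()` takes a CSS selector string rather than a `class_` keyword, and a tag's text is read with `.text` or `get_text()`, not `.txt`. Below is a minimal sketch of a lookup scoped to each card, run against a small inline HTML sample. The sample markup and the `extract_fees` helper are assumptions made for illustration, not the live page. Because several `div`s in a card can share the same classes, the sketch takes the last matching one in each card, which is only a guess about the card layout.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup standing in for the real event cards.
SAMPLE_HTML = """
<article class="eds-event-card-content">
  <div class="eds-event-card-content__sub">Sat, Aug 15</div>
  <div class="eds-event-card-content__sub">Free</div>
</article>
<article class="eds-event-card-content">
  <div class="eds-event-card-content__sub">Sun, Aug 16</div>
  <div class="eds-event-card-content__sub">Starts at MYR 10.50</div>
</article>
"""


def extract_fees(html):
    """Return one fee string per event card (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    fees = []
    for container in soup.select("article.eds-event-card-content"):
        # Search inside the current card, not the whole document.
        subs = container.select("div.eds-event-card-content__sub")
        if subs:
            # Assumption: the fee is the last "sub" line in the card.
            fees.append(subs[-1].get_text(strip=True))
    return fees


print(extract_fees(SAMPLE_HTML))
```

Matching on a single stable class such as `eds-event-card-content__sub` is usually less brittle than matching the full list of classes, since utility classes tend to change more often.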
Answer 1:
The price data is loaded from an external URL. You can use the requests and json modules to get it:
import re
import json
import requests
url = "https://www.eventbrite.com/d/malaysia--kuala-lumpur--85675181/all-events/?page=1"
events_url = 'https://www.eventbrite.com/api/v3/destination/events/?event_ids={event_ids}&expand=event_sales_status,primary_venue,image,saves,my_collections,ticket_availability&page_size=99999'
html_text = requests.get(url).text
data1 = json.loads( re.search(r'window\.__SERVER_DATA__ = ({.*});', html_text).group(1) )
# uncomment this to print all data:
# print(json.dumps(data1, indent=4))
event_ids = ','.join(r['id'] for r in data1['search_data']['events']['results'])
data2 = requests.get(events_url.format(event_ids=event_ids)).json()
# uncomment this to print all data:
# print(json.dumps(data2, indent=4))
for e in data2['events']:
    print(e['name'])
    print(e['ticket_availability']['minimum_ticket_price']['display'],'-',e['ticket_availability']['maximum_ticket_price']['display'])
    print('-' * 80)
Prints:
Mega Career Fair & Post Graduate Education Fair 2020 - Mid Valley KL
0.00 MYR - 0.00 MYR
--------------------------------------------------------------------------------
Post Graduate Education Fair 2020 - Mid Valley KL
0.00 MYR - 0.00 MYR
--------------------------------------------------------------------------------
Traders Fair 2021 - Malaysia (Financial Education Event)
0.00 USD - 199.00 USD
--------------------------------------------------------------------------------
THE FIT Malaysia
0.00 MYR - 0.00 MYR
--------------------------------------------------------------------------------
Walk-In Interview with Career Partners of HRDF
0.00 USD - 0.00 USD
--------------------------------------------------------------------------------
Entrepreneurship for Beginners - Startup | Entrepreneur Hackathon Webinar
0.00 EUR - 0.00 EUR
--------------------------------------------------------------------------------
Good Shepherd Catholic Church  English Mass Registration- Scroll Down  pls
0.00 USD - 0.00 USD
--------------------------------------------------------------------------------
CGH 10:00am Assumption Mass Registration
0.00 USD - 0.00 USD
--------------------------------------------------------------------------------
Kuala Lumpu Video Speed Dating - Filter Off
0.00 USD - 0.00 USD
--------------------------------------------------------------------------------
Wiki Finance EXPO Kuala Lumpur 2021
0.00 USD - 0.00 USD
--------------------------------------------------------------------------------
English Sunday Service - 16 AUGUST
0.00 USD - 0.00 USD
--------------------------------------------------------------------------------
Good Shepherd Catholic  Bahasa Malaysia Mass Registration. Pls scroll down
0.00 USD - 0.00 USD
--------------------------------------------------------------------------------
How To Improve Your Focus and Limit Distractions - Kuala Lumpur
0.00 USD - 0.00 USD
--------------------------------------------------------------------------------
ANNUAL GENERAL MEETING
0.00 USD - 0.00 USD
--------------------------------------------------------------------------------
ITS ALL ABOUT PORTRAIT
0.00 USD - 0.00 USD
--------------------------------------------------------------------------------
First service (English)
0.00 USD - 0.00 USD
--------------------------------------------------------------------------------
KL International Flea Market 2020 / Bazaar Antarabangsa Kuala Lumpur
0.00 USD - 0.00 USD
--------------------------------------------------------------------------------
Branding Strategies For Startups
10.50 MYR - 31.50 MYR
--------------------------------------------------------------------------------
SHC 9.15am Sunday Mass Registration
0.00 USD - 0.00 USD
--------------------------------------------------------------------------------
SHC 9.15am Sunday Mass (Tamil) திருஇருதய ஆண்டவர் ஆலயத்தில்  காலை  9.15க்கு
0.00 USD - 0.00 USD
--------------------------------------------------------------------------------
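The two key steps in the code above are pulling the JSON assigned to `window.__SERVER_DATA__` out of the page source and then reading the nested price fields. The sketch below walks through both steps offline on a made-up page string, so it can be run without a network call. `extract_server_data` and `price_range` are hypothetical helper names, and the sample data is invented. The `.get` lookups are there so that an event without `ticket_availability` does not raise a `KeyError`; whether the real API ever omits these fields is an assumption.

```python
import json
import re

# Invented page source that mimics how the JSON is embedded in a script tag.
SAMPLE_PAGE = """
<script>
window.__SERVER_DATA__ = {"search_data": {"events": {"results": [{"id": "111"}, {"id": "222"}]}}};
</script>
"""


def extract_server_data(html_text):
    """Parse the JSON assigned to window.__SERVER_DATA__, or return None."""
    match = re.search(r'window\.__SERVER_DATA__ = ({.*});', html_text)
    if match is None:
        return None
    return json.loads(match.group(1))


def price_range(event):
    """Format 'min - max' from an event dict, tolerating missing fields."""
    availability = event.get("ticket_availability") or {}
    low = (availability.get("minimum_ticket_price") or {}).get("display")
    high = (availability.get("maximum_ticket_price") or {}).get("display")
    if low is None or high is None:
        return "price unavailable"
    return f"{low} - {high}"


data = extract_server_data(SAMPLE_PAGE)
event_ids = ",".join(r["id"] for r in data["search_data"]["events"]["results"])
print(event_ids)

# Invented event record shaped like the entries read in the loop above.
sample_event = {
    "name": "Sample event",
    "ticket_availability": {
        "minimum_ticket_price": {"display": "0.00 MYR"},
        "maximum_ticket_price": {"display": "0.00 MYR"},
    },
}
print(price_range(sample_event))
```

Returning `None` when the pattern is not found makes it easier to tell a changed page layout apart from a JSON parsing problem.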
Source: https://stackoverflow.com/questions/63390671/python-web-scrapping-html-with-same-class