BeautifulSoup

BeautifulSoup can't find required div

烈酒焚心 · submitted 2021-02-10 07:01:22

Question: I have been trying to get at a nested div and its contents but am not able to. I want to access the div with class 'box coursebox':

```python
response = res.read()
soup = BeautifulSoup(response, "html.parser")
div = soup.find_all('div', attrs={'class': 'box coursebox'})
```

The above code gives 0 elements, when there should be 8. find_all calls before this line work perfectly. Thanks for helping!

Answer 1: In the case of attributes having more than one value, Beautiful Soup puts all the values into a list.
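A minimal sketch of the workaround the answer points to: because bs4 treats class as a multi-valued attribute, you can match on a single class name, or require both classes with a CSS selector. The markup below is invented for illustration.

```python
from bs4 import BeautifulSoup

# Invented stand-in markup with a multi-valued class attribute
html = """
<div class="box coursebox">Course 1</div>
<div class="box coursebox">Course 2</div>
"""
soup = BeautifulSoup(html, "html.parser")

# Matching a single class works because bs4 splits class into a list
divs = soup.find_all("div", class_="coursebox")
print(len(divs))  # 2

# A CSS selector can require both classes at once
divs_css = soup.select("div.box.coursebox")
print(len(divs_css))  # 2
```

If the real page still returns 0 elements with both approaches, the div is likely rendered by JavaScript and absent from the raw HTML.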

Iterating html through tag classes with BeautifulSoup

≯℡__Kan透↙ · submitted 2021-02-10 06:58:46

Question: I'm saving some specific tags from a webpage to an Excel file, so I have this code:

```python
import requests
from bs4 import BeautifulSoup
import openpyxl

url = "http://www.euro.com.pl/telewizory-led-lcd-plazmowe,strona-1.bhtml"
source_code = requests.get(url)
plain_text = source_code.text
soup = BeautifulSoup(plain_text, "html.parser")

wb = openpyxl.Workbook()
ws = wb.active

tagiterator = soup.h2
row, col = 1, 1
ws.cell(row=row, column=col, value=tagiterator.getText())
tagiterator = tagiterator.find
```
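Since the snippet is cut off, here is a hedged sketch of one way to walk successive h2 tags with find_next; the markup is invented, and the openpyxl cell-writing step is indicated in a comment rather than executed.

```python
from bs4 import BeautifulSoup

# Invented markup standing in for the product-listing page
html = """
<div>
  <h2><a>TV 1</a></h2>
  <h2><a>TV 2</a></h2>
  <h2><a>TV 3</a></h2>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# Walk from the first h2 to each following one with find_next
rows = []
tag = soup.h2
while tag is not None:
    rows.append(tag.get_text(strip=True))
    tag = tag.find_next("h2")

print(rows)  # ['TV 1', 'TV 2', 'TV 3']
# Each entry could then be written with ws.cell(row=i, column=1, value=text)
```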

How to use beautifulsoup to check if a string exists

南笙酒味 · submitted 2021-02-10 06:33:55

Question: Hi, I am trying to write a program that scrapes a URL, and if the scraped data contains a particular string, does something. How can I use Beautiful Soup to achieve this?

```python
import requests
from bs4 import BeautifulSoup

data = requests.get('https://www.google.com', verify=False)
soup = BeautifulSoup(data.text, 'html.parser')
for inp in soup.find_all('input'):
    if inp == "Google Search":
        print("found")
    else:
        print("nothing")
```

Answer 1: Your inp is an HTML Tag object, not a string, so comparing it to "Google Search" will never match. You must use the get_text() function. import requests
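A sketch of two ways to do the check, on an invented input element: comparing a tag's value attribute, and simply testing whether the string occurs anywhere in the page.

```python
from bs4 import BeautifulSoup

# Invented markup; Google's search button carries the text in its value attribute
html = '<form><input type="submit" value="Google Search"></form>'
soup = BeautifulSoup(html, "html.parser")

# Option 1: compare the attribute value of each input tag
found = any(inp.get("value") == "Google Search" for inp in soup.find_all("input"))
print(found)  # True

# Option 2: test whether the string occurs anywhere in the markup
print("Google Search" in str(soup))  # True
```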

Narrow in a bit more on a particular bit of text using beautifulsoup

情到浓时终转凉″ · submitted 2021-02-10 06:31:51

Question: I'm trying to get the river level from https://flood-warning-information.service.gov.uk/station/8108. I'm using this script:

```python
import requests
from bs4 import BeautifulSoup

url = "https://flood-warning-information.service.gov.uk/station/8108"
r = requests.get(url)
soup = BeautifulSoup(r.content, "lxml")
g_data = soup.find_all("header", {"intro"})
print(g_data[0].text)
```

Which gives me "River Skerne at John St Darlington Latest recorded level 0.72m at 10:30am Thursday 8 October 2020", which is
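To narrow in on just the numeric level, one option (not from the original thread) is a regular expression over the header text; the markup below is a reconstruction of the quoted output.

```python
import re
from bs4 import BeautifulSoup

# Reconstructed header based on the text the question quotes
html = """
<header class="intro">
  River Skerne at John St Darlington
  Latest recorded level 0.72m at 10:30am Thursday 8 October 2020
</header>
"""
soup = BeautifulSoup(html, "html.parser")
text = soup.find("header", class_="intro").get_text(" ", strip=True)

# Pull out just the number before the "m" unit
match = re.search(r"level\s+([\d.]+)m", text)
level = float(match.group(1)) if match else None
print(level)  # 0.72
```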

Beautiful soup returns None

我的梦境 · submitted 2021-02-10 05:48:28

Question: I have the following HTML code and I use Beautiful Soup to extract information. I want to get, for example, "Relationship status: Relationship".

```html
<table class="box-content-list" cellspacing="0">
  <tbody>
    <tr class="first">
      <td>
        <strong>Relationship status:</strong>
        Relationship
      </td>
    </tr>
    <tr class="alt">
      <td>
        <strong>Living:</strong>
        With partner
      </td>
    </tr>
```

I have created the following code:

```python
xs = [x for x in soup.findAll('table', attrs={'class': 'box-content-list'})]
for x in xs:
    # print x
    sx
```
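One way to pair each label with its value (a sketch, not the original poster's solution): each <strong> holds the label, and the text node that follows it holds the value, reachable via next_sibling.

```python
from bs4 import BeautifulSoup

# Condensed copy of the table from the question
html = """
<table class="box-content-list" cellspacing="0">
  <tr class="first"><td><strong>Relationship status:</strong> Relationship</td></tr>
  <tr class="alt"><td><strong>Living:</strong> With partner</td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

info = {}
for strong in soup.find("table", class_="box-content-list").find_all("strong"):
    label = strong.get_text(strip=True).rstrip(":")
    # The value is the text node immediately after the <strong> tag
    value = strong.next_sibling.strip()
    info[label] = value

print(info)  # {'Relationship status': 'Relationship', 'Living': 'With partner'}
```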

What is the most awesome program you have written in Python?

那年仲夏 · submitted 2021-02-09 11:31:32

Compiled by Python开发者 (Python Developers) - Jake_on; original English source: Quora. http://python.jobbole.com/85986/

A user on Quora asked, "What is the most awesome program/script you have written in Python?" This article excerpts several small projects from three overseas programmers, with code.

Manoj Memana Jayakumar, 3000+ upvotes

Update: thanks to these scripts, I found a job! See my reply in this thread: "Has anyone got a job through Quora? Or somehow made lots of money through Quora?"

1. One-click subtitle downloader for movies and TV shows

We often run into this scenario: open a subtitle site such as subscene or opensubtitles, search for the name of the movie or show, pick the right release, download the subtitle file, unzip it, cut and paste it into the folder containing the movie, and rename the subtitle file to match the movie file. Tedious, isn't it? So I wrote a script that downloads the correct subtitle file for a movie or TV show and saves it alongside the movie file, all with a single click. Confused? Watch this YouTube video: https://youtu.be/Q5YWEqgw9X8

Source code on GitHub: subtitle-downloader

Update: the script now supports downloading multiple subtitle files at once. Steps

Python

浪尽此生 · submitted 2021-02-09 11:04:28

```python
import requests
from bs4 import BeautifulSoup
import sqlite3

conn = sqlite3.connect("test.db")
c = conn.cursor()

for num in range(1, 101):
    url = "https://cs.lianjia.com/ershoufang/pg%s/" % num
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36'
                      ' (KHTML, like Gecko) Chrome/64.0.3282.140 Safari/537.36',
    }
    req = requests.session()
    response = req.get(url, headers=headers, verify=False)
    info = response.text
    f1 = BeautifulSoup(info, 'lxml')
    f2 = f1.find(class_='sellListContent')
    f3 = f2.find_all(class_='clear LOGCLICKDATA')
    for i in f3:
        data_id =
```
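The snippet above is cut off at the inner loop; a hedged sketch of how it might continue, parsing each listing and inserting a row into SQLite. The markup and the data-lj_action_housedel_id attribute are assumptions, and an in-memory database stands in for test.db.

```python
import sqlite3
from bs4 import BeautifulSoup

# Invented stand-in for one page of listing markup (structure is an assumption)
html = """
<ul class="sellListContent">
  <li class="clear LOGCLICKDATA" data-lj_action_housedel_id="101">
    <div class="title"><a>Listing A</a></div>
  </li>
  <li class="clear LOGCLICKDATA" data-lj_action_housedel_id="102">
    <div class="title"><a>Listing B</a></div>
  </li>
</ul>
"""

conn = sqlite3.connect(":memory:")  # use "test.db" for a file on disk
c = conn.cursor()
c.execute("CREATE TABLE IF NOT EXISTS houses (data_id TEXT, title TEXT)")

soup = BeautifulSoup(html, "html.parser")
for li in soup.find_all("li", class_="LOGCLICKDATA"):
    data_id = li.get("data-lj_action_housedel_id")
    title = li.find("div", class_="title").get_text(strip=True)
    # Parameterized insert avoids SQL injection from scraped text
    c.execute("INSERT INTO houses VALUES (?, ?)", (data_id, title))
conn.commit()

rows = c.execute("SELECT data_id, title FROM houses").fetchall()
print(rows)  # [('101', 'Listing A'), ('102', 'Listing B')]
```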

Python crawler: scraping Lianjia second-hand housing listings

…衆ロ難τιáo~ · submitted 2021-02-09 10:00:58

```python
# coding=utf-8
import requests
from fake_useragent import UserAgent
from bs4 import BeautifulSoup
import json
import csv
import time

# Build the request headers
userAgent = UserAgent()
headers = {'user-agent': userAgent.Chrome}

# A list to hold the scraped dictionaries
data_list = []

def start_spider(page):
    # Set the number of reconnection retries
    requests.adapters.DEFAULT_RETRIES = 15
    s = requests.session()
    # Disable keep-alive on the connection
    s.keep_alive = False
    # URL to crawl; defaults to the Lianjia listings for Nanjing
    url = 'https://nj.lianjia.com/ershoufang/pg{}/'.format(page)
    # Request the URL
    resp = requests.get(url, headers=headers, timeout=10)
    # Convert the response body into a BeautifulSoup object
    soup = BeautifulSoup(resp.content, 'lxml')
    # Select all the li tags
```
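The function is cut off after "select all the li tags"; since the script imports csv and declares data_list, a hedged sketch of how the loop and export might look. The field classes (title, totalPrice) and markup are assumptions, and a StringIO buffer stands in for a real CSV file.

```python
import csv
import io
from bs4 import BeautifulSoup

# Invented fragment of one listing page (field classes are assumptions)
html = """
<ul class="sellListContent">
  <li class="clear">
    <div class="title"><a>Cozy 2BR</a></div>
    <div class="totalPrice"><span>210</span>万</div>
  </li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")

data_list = []
for li in soup.select("ul.sellListContent > li"):
    # Collect each listing as a dict, matching the data_list pattern above
    data_list.append({
        "title": li.find("div", class_="title").get_text(strip=True),
        "price": li.find("div", class_="totalPrice").get_text(strip=True),
    })

# Write the collected dicts to CSV (open a file instead of StringIO in real use)
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["title", "price"])
writer.writeheader()
writer.writerows(data_list)
print(buf.getvalue())
```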