Target data:

Code:
import requests
from lxml import etree

# Send a browser-like User-Agent with the request.
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36'
}
web_Source_Code = requests.get("http://www.yuetutu.com", headers=headers)
print(web_Source_Code.status_code)  # 200 means the request succeeded

# Parse the HTML and select every <li> in the update list.
html = etree.HTML(web_Source_Code.text)
block_1 = html.xpath('//div[@id="newscontent"]/div[@class="l"]/ul/li')

return_Date = []
for block_2 in block_1:
    # Each xpath() call is relative to the current <li> and returns a list.
    novel_Classification = block_2.xpath('span[@class="s1"]/a/text()')
    name_Of_The_Novel = block_2.xpath('span[@class="s2"]/a/text()')
    novel_Chapter = block_2.xpath('span[@class="s3"]/a/text()')
    author_Of_The_Novel = block_2.xpath('span[@class="s4"]/a/text()')
    update_Time = block_2.xpath('span[@class="s5"]/text()')
    return_Date.append({
        "novel_Classification": novel_Classification,
        "name_Of_The_Novel": name_Of_The_Novel,
        "novel_Chapter": novel_Chapter,
        "author_Of_The_Novel": author_Of_The_Novel,
        "update_Time": update_Time
    })

# Print one record per line.
for date_s in return_Date:
    print(date_s)
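
Because lxml's xpath() returns a list for every query, each value in the printed dictionaries is a list of strings (empty when nothing matched) rather than a plain string. The following is a minimal sketch of one way to flatten such a record. The helper names first_text and flatten_record and the sample values are hypothetical, made up for illustration; they are not part of the original script or of the site's real data.

```python
def first_text(values, default=None):
    """Return the first non-empty, stripped string from an XPath result list."""
    for value in values:
        text = value.strip()
        if text:
            return text
    return default


def flatten_record(record):
    """Turn a dict of XPath result lists into a dict of plain strings (or None)."""
    return {key: first_text(values) for key, values in record.items()}


# Hypothetical sample shaped like one entry of return_Date (not real site data).
sample = {
    "novel_Classification": ["Fantasy"],
    "name_Of_The_Novel": [" Example Novel "],
    "novel_Chapter": ["Chapter 1"],
    "author_Of_The_Novel": ["Example Author"],
    "update_Time": [],  # simulates an XPath query that matched nothing
}

flattened = flatten_record(sample)
print(flattened)
```

With the original script, the same helper could be applied as flatten_record(date_s) inside the final loop, so that missing fields show up as None instead of empty lists.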
Output screenshot:

Source: CSDN
Author: Ferencz
Link: https://blog.csdn.net/Ferencz/article/details/104088066