TXT Text Storage
Save the content of Zhihu's Explore section to a TXT file.
import requests
from pyquery import PyQuery as pq

url = "https://www.zhihu.com/explore"
myheader = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit 537.36 (KHTML, like Gecko) Chrome"
}
# Fetch the page and parse it with pyquery
html = requests.get(url, headers=myheader).text
doc = pq(html)
# Each feed item holds one question with its author and answer
items = doc('.explore-tab .feed-item').items()
for item in items:
    question = item.find('h2').text()
    author = item.find(".author-link-line").text()
    answer = pq(item.find(".content").html()).text()
    # Append mode ("a") keeps the records written on earlier iterations
    with open("explore.txt", "a", encoding="utf-8") as file:
        # The question is written too; the original extracted it but never used it
        file.write("\n".join([question, author, answer]))
        file.write("\n" + "=" * 50 + "\n")
File open modes:
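As a rough illustration of how the most common modes differ, the sketch below writes with "w", appends with "a", and reads back with "r". The file name modes_demo.txt and the temporary directory are assumptions made for this example only.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration
path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

# "w" creates the file, or truncates it if it already exists
with open(path, "w", encoding="utf-8") as file:
    file.write("first\n")

# "a" appends to the end and keeps the existing content
with open(path, "a", encoding="utf-8") as file:
    file.write("second\n")

# "r" opens the file for reading only
with open(path, "r", encoding="utf-8") as file:
    content = file.read()

print(content)
```

This is why the scraper above opens explore.txt with "a": using "w" inside the loop would overwrite the file on every iteration.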


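JSON File Storage
Reading JSON
Call the json library's loads() method to convert a JSON text string into a JSON object, and call the dumps() method to convert a JSON object back into a text string.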
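import json

# Note: the variable name str shadows Python's built-in str type
str = '''
[{
    "name": "Bob",
    "gender": "male",
    "birthday": "1992-10-18"
}, {
    "name": "Selina",
    "gender": "female",
    "birthday": "1995-10-18"
}]
'''
print(type(str))
# loads() parses the JSON text into Python objects (here a list of dicts)
word = json.loads(str)
print(word)
print(type(word))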
Output:
<class 'str'>
[{'name': 'Bob', 'gender': 'male', 'birthday': '1992-10-18'}, {'name': 'Selina', 'gender': 'female', 'birthday': '1995-10-18'}]
<class 'list'>
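There are two ways to get the value for a key: use square brackets with the key name, or pass the key name to the get() method (get() also accepts a second argument, a default value returned when the key is missing).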
word = json.loads(str)
print(word[0]["name"])
print(word[0].get("name"))
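The two access styles behave differently when a key is missing, which a small sketch can make concrete. The "age" key and the default value 25 are assumptions chosen for illustration; they do not appear in the data above.

```python
import json

data = json.loads('[{"name": "Bob", "gender": "male"}]')
person = data[0]

# get() returns None when the key is missing and no default is given
age = person.get("age")
# The second argument is returned instead of None for a missing key
age_with_default = person.get("age", 25)

# Square brackets raise KeyError for a missing key
try:
    person["age"]
    raised = False
except KeyError:
    raised = True

print(age, age_with_default, raised)
```

get() is therefore the safer choice when a field may be absent from some records.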
Writing JSON
The dumps() method converts a JSON object into a string.
import json

str = [{
    "name": "Bob",
    "gender": "male",
    "birthday": "1992-10-18"
}, {
    "name": "Selina",
    "gender": "female",
    "birthday": "1995-10-18"
}]
# Serialize the list to a JSON string and write it to the file
with open("datas.txt", "w", encoding="utf-8") as file:
    file.write(json.dumps(str))

The dumps() method also accepts an indent parameter, which sets the number of characters used for indentation.
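A minimal sketch of the effect of indent, using a shortened sample record (an assumption for brevity) and an indent of 2:

```python
import json

data = [{"name": "Bob", "gender": "male"}]

# Without indent, the whole structure is written on a single line
compact = json.dumps(data)
# With indent=2, each nesting level is indented by two spaces
pretty = json.dumps(data, indent=2)

print(compact)
print(pretty)
```

Both strings parse back to the same data; indent only changes how readable the saved file is.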
To write Chinese characters as readable text, also set the ensure_ascii parameter to False and specify the encoding of the output file:
with open("datas.txt", "w", encoding="utf-8") as file:
    file.write(json.dumps(str, ensure_ascii=False))
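To see what ensure_ascii changes, the sketch below serializes the same record both ways. The sample name is a hypothetical value added for illustration.

```python
import json

# Hypothetical sample record containing Chinese text
data = {"name": "王伟"}

# Default behavior: non-ASCII characters are written as \uXXXX escapes
escaped = json.dumps(data)
# With ensure_ascii=False the characters are kept as-is
readable = json.dumps(data, ensure_ascii=False)

print(escaped)
print(readable)
```

Both forms are valid JSON and load back to the same object; the second is simply easier to read in a text editor, provided the file is saved with an encoding such as UTF-8.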
CSV File Storage
Writing CSV Files
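import csv

# newline="" prevents blank lines between rows on Windows;
# the encoding matches the one used when the file is read back
with open("datas.csv", "w", newline="", encoding="utf-8") as csvfile:
    writer = csv.writer(csvfile)
    # The first row is the header, followed by one row per record
    writer.writerow(["id", "name", "age"])
    writer.writerow(["001", "wuyou", "21"])
    writer.writerow(["002", "chenwei", "20"])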
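To change the separator between columns, pass the delimiter parameter.
You can also call the writerows() method to write several rows at once; in that case the argument must be a two-dimensional list.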
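Both options can be combined in one short sketch. The space delimiter, the file name datas_demo.csv, and the temporary directory are assumptions chosen for this example.

```python
import csv
import os
import tempfile

# Hypothetical scratch file used only for this demonstration
path = os.path.join(tempfile.mkdtemp(), "datas_demo.csv")

# A two-dimensional list: one inner list per row
rows = [
    ["id", "name", "age"],
    ["001", "wuyou", "21"],
    ["002", "chenwei", "20"],
]

with open(path, "w", newline="", encoding="utf-8") as csvfile:
    # delimiter=" " separates columns with a space instead of a comma
    writer = csv.writer(csvfile, delimiter=" ")
    # writerows() writes every row in the list in one call
    writer.writerows(rows)

with open(path, "r", newline="", encoding="utf-8") as csvfile:
    raw = csvfile.read()

print(raw)
```

When reading such a file back with csv.reader, the same delimiter has to be passed so the columns are split correctly.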
Reading CSV Files
Using the csv library
import csv

with open("datas.csv", "r", encoding="utf-8") as csvfile:
    reader = csv.reader(csvfile)
    # Each row is returned as a list of strings
    for row in reader:
        print(row)
Using the pandas library's read_csv method
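A minimal sketch of this approach, assuming pandas is installed. The example first writes a small file so that it is self-contained; the file name datas_pandas_demo.csv and the dtype=str choice are assumptions for illustration, not part of the original article.

```python
import csv
import os
import tempfile

import pandas as pd

# Hypothetical scratch file with the same rows as the csv examples
path = os.path.join(tempfile.mkdtemp(), "datas_pandas_demo.csv")
with open(path, "w", newline="", encoding="utf-8") as csvfile:
    csv.writer(csvfile).writerows([
        ["id", "name", "age"],
        ["001", "wuyou", "21"],
        ["002", "chenwei", "20"],
    ])

# read_csv() loads the file into a DataFrame, using the first row as column names.
# dtype=str keeps values such as "001" as text instead of parsing them as integers.
df = pd.read_csv(path, dtype=str)
print(df)
```

Without dtype=str, pandas infers numeric types, so an id such as 001 would be read as the integer 1.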
Source: https://blog.csdn.net/qq_39905917/article/details/88847647