response | 易学教程

Python Web Crawler Notes

Crawlers. Use http://httpbin.org/ to verify requests.

1. The urllib library (Python 3): Python's built-in HTTP request library
urllib.request: request module ( https://yiyibooks.cn/xx/python_352/library/urllib.request.html#module-urllib.request )
urllib.error: exception-handling module ( https://yiyibooks.cn/xx/python_352/library/urllib.error.html#module-urllib.error )
urllib.parse: URL-parsing module ( https://yiyibooks.cn/xx/python_352/library/urllib.parse.html#module-urllib.parse )
urllib.robotparser: robots.txt-parsing module ( https://yiyibooks.cn/xx/python_352/library/urllib.robotparser.html#module-urllib.robotparser )

Making a request:
import urllib.request
urllib.request.urlopen(url, data=None, [timeout, ]*, cafile
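As a rough illustration of how urllib.request, urllib.parse, and urllib.error fit together, here is a minimal sketch. The helper names build_post_request and fetch and the form field "word" are assumptions made for this example; they are not part of the notes.

```python
import urllib.error
import urllib.parse
import urllib.request


def build_post_request(url, form):
    """Encode a dict as form data and wrap it in a Request (POST because data is set)."""
    data = urllib.parse.urlencode(form).encode("utf-8")
    return urllib.request.Request(url, data=data)


def fetch(request, timeout=5):
    """Send the request; return the body as text, or None if urllib raises URLError."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return resp.read().decode("utf-8")
    except urllib.error.URLError as exc:  # HTTPError is a subclass of URLError
        print("request failed:", exc.reason)
        return None


# Hypothetical form payload aimed at the httpbin echo endpoint.
req = build_post_request("http://httpbin.org/post", {"word": "hello"})
print(req.get_method(), req.full_url, req.data)
# To actually send it (needs network access): print(fetch(req))
```

Building the Request separately from sending it makes it possible to inspect the method, URL, and body before any network traffic happens.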

Python Crawler | Scraping Ximalaya Audio

"GOOD Python爬虫|爬取喜马拉雅音频喜马拉雅是知名的专业的音频分享平台，用户规模突破4.8亿，汇集了有声小说，有声读物，儿童睡前故事，相声小品等数亿条音频，成为国内发展最快、规模最大的在线移动音频分享平台。今晚分享突破障碍，探秘喜马拉雅的天籁之音，实现实时抓取，并保存到本地！知识点：开发环境：windows pycharm requests json 网络反爬技术文件的操作网络请求数据的转换数据类型的使用 1. 首先导入requests库 import requests 6. 将上面获得的json数据转换成字典格式（需要导入json模块） import json 4. header = {"User-Agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36"} 这是应对反爬虫机制，伪装成合法浏览器而添加，本来复制过来的是User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537

Tornado Coroutines

Synchronous and asynchronous I/O clients

from tornado.httpclient import HTTPClient, AsyncHTTPClient

def sync_visit():
    http_client = HTTPClient()
    response = http_client.fetch('http://www.baidu.com')  # blocks until the request completes
    print(response.body)

def handle_response(response):
    print(response.body)

def async_visit():
    http_client = AsyncHTTPClient()
    # non-blocking; the callback argument only works before Tornado 6.0, which removed it
    http_client.fetch('http://www.baidu.com', callback=handle_response)

async_visit()

Coroutines

1. Writing a coroutine function

from tornado import gen  # import the coroutine module
from tornado.httpclient import AsyncHTTPClient

@gen.coroutine
def coroutine_visit():
    http_client = AsyncHTTPClient()
    response = yield http_client.fetch('http://www.baidu.com')
    print
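To make the blocking versus non-blocking distinction concrete, here is a small sketch that uses the standard-library asyncio module as a stand-in for Tornado, so it runs without Tornado installed or network access. fake_fetch is a hypothetical placeholder for an HTTP fetch and simply sleeps.

```python
import asyncio
import time


async def fake_fetch(name, delay):
    """Stand-in for an HTTP fetch: waits without blocking the event loop."""
    await asyncio.sleep(delay)
    return name + " done"


async def visit_all():
    # Both waits overlap, so the total time is close to the longest delay,
    # not the sum of the two delays.
    return await asyncio.gather(fake_fetch("a", 0.2), fake_fetch("b", 0.2))


start = time.perf_counter()
results = asyncio.run(visit_all())
elapsed = time.perf_counter() - start
print(results, round(elapsed, 2))
```

Two blocking fetches of 0.2 s each would take about 0.4 s in sequence; the coroutine version finishes in roughly half that because the event loop switches to the second fetch while the first is waiting.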

Differences between request.setCharacterEncoding(), response.setCharacterEncoding(), and response.setContentType()

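I am new to Java EE and keep running into garbled-text (mojibake) problems. The following three methods are the usual fixes.

request.setCharacterEncoding()
response.setCharacterEncoding() // tells the servlet to use UTF-8 instead of the default ISO-8859-1
response.setContentType() // tells the browser to parse the returned data as UTF-8

The key difference: request.setCharacterEncoding() governs how data read from the request is decoded, whereas the two response methods govern how the data sent back is encoded and labelled.

response.setContentType specifies the encoding of the HTTP response and, at the same time, the encoding the browser uses to display it.

response.setCharacterEncoding sets the encoding of the HTTP response. If an encoding was set earlier with response.setContentType, the encoding given to response.setCharacterEncoding overrides it. As with response.setContentType, this method must be called before getWriter is called or before the response is committed.

request.setCharacterEncoding() sets the encoding used to decode the data you receive. response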
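The garbling itself is easy to reproduce outside a servlet container. The sketch below, written in Python purely for illustration, shows what happens when UTF-8 bytes are decoded as ISO-8859-1, and how a charset declared in a Content-Type header can be read back. The sample string and variable names are assumptions for this example.

```python
from email.message import Message

text = "中文"
wire_bytes = text.encode("utf-8")          # what a UTF-8 client puts on the wire
garbled = wire_bytes.decode("iso-8859-1")  # decoded with the wrong default charset
correct = wire_bytes.decode("utf-8")       # decoded with the charset actually used

# Reading the charset that a Content-Type header declares.
header = Message()
header["Content-Type"] = "text/html;charset=UTF-8"
declared = header.get_content_charset()

print(repr(garbled), correct, declared)
```

The bytes are never damaged; only the interpretation differs, which is why setting the decoding charset before reading parameters (and the encoding charset before writing) resolves the problem.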

Node.js Study Notes (5): The http Module

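In this article we look at Node's built-in http module, which is mainly used to build HTTP servers and clients.

1. HTTP server

(1) Creating a server

An HTTP server is implemented by http.Server. We can create an http.Server in either of two ways:

const http = require('http')
// Option 1
var server = new http.Server()
// Option 2
var server = http.createServer()

(2) Binding events

http.Server is an event-based server, so it does its work through handlers registered for its events. The most commonly used event is request, which fires whenever the server receives a request. The handler takes two arguments, an http.IncomingMessage and an http.ServerResponse.

server.on('request', function(message, response) {
    console.log(message instanceof http.IncomingMessage) // true
    console.log(response instanceof http.ServerResponse) // true
    console.log('收到请求') // "request received"
})

(3) Properties and methods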

Scraping Lagou with CrawlSpider

CrawlSpider inherits from Spider and provides powerful crawling rules (Rule). Fill custom_settings with the request headers copied from the browser.

from datetime import datetime

import scrapy
from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import CrawlSpider, Rule

from ArticleSpider.items import LagouJobItem, LagouJobItemLoader
from ArticleSpider.utils.common import get_md5

class LagouSpider(CrawlSpider):
    name = 'lagou'
    allowed_domains = ['www.lagou.com']
    start_urls = ['https://www.lagou.com/']
    custom_settings = { }
    rules = (
        Rule(LinkExtractor(allow=("zhaopin/.*",)), follow=True),
        # raw string with the dot escaped, so "." matches only a literal dot
        Rule(LinkExtractor(allow=(r"gongsi/j\d+\.html",)), follow=True),
        Rule
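To see which links such allow patterns would pick up, the sketch below applies them with the standard-library re module. It assumes that the link extractor tests each pattern with search semantics (a match anywhere in the URL), which is my understanding of Scrapy's behavior. The candidate URLs are hypothetical and were chosen only to exercise the patterns.

```python
import re

# Patterns mirroring the first two rules above (with the dot escaped).
ALLOW_PATTERNS = (r"zhaopin/.*", r"gongsi/j\d+\.html")

# Hypothetical URLs, not taken from the site.
CANDIDATES = [
    "https://www.lagou.com/zhaopin/Python/",
    "https://www.lagou.com/gongsi/j123.html",
    "https://www.lagou.com/about.html",
]


def is_followed(url):
    """Return True if any allow pattern is found anywhere in the URL."""
    return any(re.search(pattern, url) for pattern in ALLOW_PATTERNS)


followed = [url for url in CANDIDATES if is_followed(url)]
print(followed)
```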

Downloading an Excel Template from a Java Server in Vue with a GET Request

Vue method

handleDownTemplateXls(fileName){
  if(!fileName || typeof fileName != "string"){
    fileName = "导入模板" // default file name: "import template"
  }
  let param = {...this.queryParam};
  if(this.selectedRowKeys && this.selectedRowKeys.length>0){
    param['selections'] = this.selectedRowKeys.join(",")
  }
  console.log("下载模板参数",param) // "template download parameters"
  downFile(this.url.downTemplateXlsUrl,param).then((data)=>{
    if (!data) {
      this.$message.warning("文件下载失败") // "file download failed"
      return
    }
    if (typeof window.navigator.msSaveBlob !== 'undefined') {
      window.navigator.msSaveBlob(new Blob([data]), fileName+'.xls')
    }else{
      let url = window.URL.createObjectURL(new Blob([data]))
      let link = document

Http2

1. Advantages of HTTP/2

Connection multiplexing
Framed transmission
Server Push

In the accompanying figure, HTTP/1 is shown on top and HTTP/2 on the bottom.

2. Setting up HTTP/2

1) Set up the front-end file structure

server.js

const http = require('http');
const fs = require('fs')

http.createServer(function(request, response){
  console.log('request come', request.url)
  const html = fs.readFileSync("test.html",'utf-8')
  const img = fs.readFileSync("test.jpg");
  if(request.url === '/'){
    response.writeHead(200,{
      'Content-Type':'text/html',
      'Connection':'close',
      'Link': '</test.jpg>; as=image; rel=preload'
    })
    response.end(html)
  }else{
    response.writeHead(200,{
      'Content-Type':'image/jpeg', // the image must not be served as text/html
      'Connection':'close'
    })
    response.end(img)
  }
})

aiohttp High-Concurrency Scraping

Creating a session object

First create a session object, then use it to request pages. The code here visits the Python website. The async and await keywords mark the function as asynchronous, which is how aiohttp is used.

import aiohttp
import asyncio

async def hello(URL):
    async with aiohttp.ClientSession() as session:
        async with session.get(URL) as response:
            text = await response.text()
            print(text)

if __name__ == '__main__':
    URL = 'http://python.org'
    asyncio.run(hello(URL))

Request headers, timeouts, cookies, and proxies

These are set by modifying the second code sample.

from aiohttp import ClientSession
import aiohttp
import asyncio

# set the request headers
headers = {'content-type' : "application/json"}

async def hello(URL):
    async
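For the high-concurrency part, one common pattern is to cap the number of in-flight requests with a semaphore. The sketch below shows that pattern with the standard-library asyncio module only. fake_get is a hypothetical stand-in for session.get, and MAX_CONCURRENCY and the example.com URLs are assumptions for this example, not values from the notes.

```python
import asyncio

MAX_CONCURRENCY = 3  # hypothetical limit on simultaneous requests


async def fake_get(url, state):
    """Stand-in for session.get(url): records how many calls overlap."""
    state["active"] += 1
    state["peak"] = max(state["peak"], state["active"])
    await asyncio.sleep(0.01)  # simulated network wait
    state["active"] -= 1
    return url


async def crawl(urls):
    semaphore = asyncio.Semaphore(MAX_CONCURRENCY)
    state = {"active": 0, "peak": 0}

    async def bounded(url):
        async with semaphore:  # at most MAX_CONCURRENCY fetches run at once
            return await fake_get(url, state)

    pages = await asyncio.gather(*(bounded(u) for u in urls))
    return pages, state["peak"]


urls = ["http://example.com/page/%d" % i for i in range(10)]
pages, peak = asyncio.run(crawl(urls))
print(len(pages), peak)
```

With aiohttp, the body of bounded would presumably wrap a real session.get call instead, with a single ClientSession shared across all tasks.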

HTTP Service and JSP

Tomcat 8.0 must be installed first; the IDE used here is IDEA.

A web project: create a new project, then write the code.

// Create a new class
@WebServlet("/test")
public class Main extends HttpServlet {
    protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
        response.setHeader("Access-Control-Allow-Origin", "*");
        response.setHeader("Content-type", "text/html;charset=UTF-8");
        response.setCharacterEncoding("UTF-8");
        String text = request.getParameter("text");
        System.out.println("结果已经传入后台：" + text); // "value received by the back end: "
        String output = "后台返回的结果加上前台的结果" + text; // "back-end result plus the front-end value"
        response.getWriter().write(output);
    }

    protected void doPost(HttpServletRequest
