Scraping Lagou with Python: successfully resolving 'status': False, 'msg': '您操作太频繁,请稍后再访问' ("You are operating too frequently, please try again later"), 'clientIp': '117.136.41.XX', 'state': 2
This is my first time writing a crawler in Python. I had heard that Lagou has the toughest anti-scraping measures, so as soon as I finished learning Python syntax today I tried scraping its Java backend developer job listings. I did not expect to be knocked out at the very first step!

Example of the failing code:

    from urllib import request
    from urllib import parse

    url = 'https://www.lagou.com/jobs/positionAjax.json?needAddtionalResult=false'
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36',
        'Referer': 'https://www.lagou.com/jobs/list_python?labelWords=sug&fromSearch=true&suginput=py'
    }
    data = {
        'first': 'true',
        'pn': 1,
        'kd': 'python'
    }
    content = request.Request(url, headers=headers, data=parse.urlencode(data).encode