Scrapy Spider for JSON Response

Posted by 蓝咒 on 2019-12-03 04:05:38

I found two issues in your code:

  1. The start URL was not accessible, so I removed the www from it.
  2. I changed json.loads(response) to json.loads(response.body_as_unicode()), since json.loads expects the body text rather than the response object.
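
To illustrate the second point without running a crawl, here is a minimal sketch using only the standard library. FakeResponse is a hypothetical stand-in written for this example, not Scrapy's real response class, and its UTF-8 decoding is an assumption (the real method works out the encoding itself).

```python
import json


class FakeResponse(object):
    """Hypothetical stand-in for a Scrapy response; not the real class."""

    def __init__(self, body):
        self.body = body  # raw bytes, as received over the wire

    def body_as_unicode(self):
        # Assumption: the body is UTF-8 encoded.
        return self.body.decode("utf-8")


response = FakeResponse(b'{"feed": {"entry": []}}')

try:
    json.loads(response)  # passing the object itself is rejected
    error = None
except TypeError as exc:
    error = exc

# Passing the decoded body text works.
data = json.loads(response.body_as_unicode())
```

The first call fails with a TypeError because json.loads only accepts text (or bytes on newer Python versions), which is why the response has to be turned into a string first.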

This works well for me:

import json

from scrapy.spider import BaseSpider

# YoutubeItem is the Item class from the question's project; import it from
# wherever that project defines it.


class MySpider(BaseSpider):
    name = "youtubecrawler"
    allowed_domains = ["gdata.youtube.com"]
    start_urls = ['http://gdata.youtube.com/feeds/api/standardfeeds/DE/most_popular?v=2&alt=json']

    def parse(self, response):
        items = []
        # json.loads needs the decoded body text, not the response object.
        jsonresponse = json.loads(response.body_as_unicode())
        for video in jsonresponse["feed"]["entry"]:
            item = YoutubeItem()
            print(video["media$group"]["yt$videoid"]["$t"])
            print(video["media$group"]["media$description"]["$t"])
            item["title"] = video["title"]["$t"]
            print(video["author"][0]["name"]["$t"])
            print(video["category"][1]["term"])
            items.append(item)
        return items
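
The spider above stores only the title in the item and prints the other fields. As a rough sketch of how the same nested lookups could be collected into one record per video, the helper below works on a JSON string and needs no network access. extract_videos and the SAMPLE payload are made up for illustration; the sample only mimics the keys the spider reads and is not real API output.

```python
import json

# Made-up sample in the same shape the spider reads; not real API output.
SAMPLE = json.dumps({
    "feed": {"entry": [{
        "title": {"$t": "Example title"},
        "author": [{"name": {"$t": "Example author"}}],
        "category": [{"term": "kind"}, {"term": "Music"}],
        "media$group": {
            "yt$videoid": {"$t": "abc123"},
            "media$description": {"$t": "Example description"},
        },
    }]}
})


def extract_videos(text):
    """Return one plain dict per feed entry (hypothetical helper)."""
    feed = json.loads(text)
    videos = []
    for video in feed["feed"]["entry"]:
        group = video["media$group"]
        videos.append({
            "id": group["yt$videoid"]["$t"],
            "description": group["media$description"]["$t"],
            "title": video["title"]["$t"],
            "author": video["author"][0]["name"]["$t"],
            # The spider reads the second category entry, so this does too.
            "category": video["category"][1]["term"],
        })
    return videos


videos = extract_videos(SAMPLE)
```

Inside parse, the same function could be called as extract_videos(response.body_as_unicode()), with each dict copied into a YoutubeItem if the item declares those fields.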