XPath returns an empty list in Scrapy

Submitted by 只愿长相守 on 2019-12-24 15:48:20

Question


I am working with Scrapy and trying to gather some data from a site.

Spider Code

# Imports for the Scrapy release in use at the time (BaseSpider / HtmlXPathSelector API)
from scrapy.spider import BaseSpider
from scrapy.selector import HtmlXPathSelector
from scrapy.http import FormRequest

# ExampleItem is the project's own Item class, defined in its items module.


class NaaptolSpider(BaseSpider):
    name = "naaptol"
    domain_name = "www.naaptol.com"
    start_urls = ["http://www.naaptol.com/buy/mobile_phones/mobile_handsets.html"]

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        cell_matter = hxs.select('//div[@class="gridInfo"]/div[@class="gridProduct gridProduct_special"]')
        items = []
        for i in cell_matter:
            # The leading "." keeps each XPath relative to the current product node
            cell_names = i.select('.//p[@class="proName"]/a/text()').extract()
            prices = i.select('.//p[@class="values"]/strong/text()').extract()
            item = ExampleItem()
            item['cell_name'] = cell_names
            item['price'] = prices
            items.append(item)
        # POST the form that the page's JavaScript pagination submits
        return [FormRequest(url="http://www.naaptol.com/faces/jsp/search/searchResults.jsp",
                            formdata={'type': 'cat_catlg',
                                      'catid': '27',
                                      'sb': '9,8',
                                      'frm': '1',
                                      'max': '15',
                                      'req': 'ajax'},
                            callback=self.parse_item)]

    def parse_item(self, response):
        hxs = HtmlXPathSelector(response)
        cell_matter = hxs.select('//div[@class="gridInfo"]/div[@class="gridProduct gridProduct_special"]')
        for i in cell_matter:
            cell_names = i.select('.//p[@class="proName"]/a/text()').extract()
            prices = i.select('.//p[@class="values"]/strong/text()').extract()
            print(cell_names)
            print(prices)

Result:

2012-06-15 09:38:36+0530 [naaptol] DEBUG: Crawled (200) <POST http://www.naaptol.com/faces/jsp/search/searchResults.jsp> (referer: http://www.naaptol.com/buy/mobile_phones/mobile_handsets.html)
[]
[]

I posted the form to handle the pagination, which is implemented in JavaScript.

The response to the request issued from the parse method arrives in the parse_item method, but when I use the same XPath there as in parse, it returns an empty list, as shown above. Can anyone tell me why it returns an empty list and what is wrong with my code?

Thanks in advance


Answer 1:


The response is in JSON format:

{
  "prodList": [
    {
      "pid": "955492",
      "pnm": "Samsung Star 3 Duos",
      "mctid": "27",
      "pc": "5,650",
      "mrp": "6290",
      "pdc": "10",
      "pimg": "Samsung-Star-3-duos-1.jpg",
      "rt": "8",
      "prc": "1",
      "per": "Y",
      (...)
    },
    (...)
  ]
}

The body is not HTML, so the XPath expressions have nothing to match. To parse it, you can use Python's json module. An example of what you are trying to achieve is in the related question "Empty list for hrefs to achieve pagination through JavaScript onclick functions".
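
As a minimal sketch of the json approach, the snippet below decodes a trimmed copy of the sample response and collects one record per product. The helper name extract_products and the SAMPLE_BODY constant are hypothetical, and reading "pnm" as the product name and "pc" as the price is an assumption based only on the sample values shown above.

```python
import json

# Trimmed sample in the shape of the response shown above;
# only the keys used below are kept.
SAMPLE_BODY = '{"prodList": [{"pid": "955492", "pnm": "Samsung Star 3 Duos", "pc": "5,650"}]}'


def extract_products(body):
    """Return a list of {'cell_name', 'price'} dicts from a JSON body.

    Assumption: "pnm" holds the product name and "pc" the price.
    """
    data = json.loads(body)
    products = []
    # A missing "prodList" key is treated as an empty result.
    for prod in data.get("prodList", []):
        products.append({
            "cell_name": prod.get("pnm"),
            "price": prod.get("pc"),
        })
    return products


print(extract_products(SAMPLE_BODY))
```

Inside the spider, parse_item would presumably pass the response body to a helper like this instead of building an HtmlXPathSelector, then fill ExampleItem from each returned dict.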



Source: https://stackoverflow.com/questions/11045510/empty-list-returning-by-xpath-in-scrapy
