Using Python to use a website's search function

柔情痞子 提交于 2020-07-05 18:17:28

问题


I am trying to use a search function of a website with this code structure:

<div class='search'>
<div class='inner'>
<form accept-charset="UTF-8" action="/gr/el/products" method="get"><div style="margin:0;padding:0;display:inline"><input name="utf8" type="hidden" value="&#x2713;" /></div>
<label for='query'>Ενδιαφέρομαι για...</label>
<fieldset>
<input class="search-input" data-search-url="/gr/el/products/autocomplete.json" id="text_search" name="query" placeholder="Αναζητήστε προϊόν" type="text" />
<button type='submit'>Αναζήτηση</button>
</fieldset>
</form>
</div>
</div>

with this python script:

import requests
from bs4 import BeautifulSoup

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.1'}



payload = {
    'query':'test'
}

r = requests.get('http://www.pharmacy295.gr',data = payload ,headers = headers)

soup = BeautifulSoup(r.text,'lxml')
products = soup.findAll('span', {'class':'name'})
print(products)

This code came as a result of extensive searches on this website on how to do this task, however I never seem to manage to get any search results - just the main page of the website.


回答1:


Add products to your url and it will work fine, the method is get in the form and the form shows also the url. If you are unsure crack open use the developer console on firefox or chrome you can see exactly how the the request is made

payload = {
    'query':'neutrogena',

}

r = requests.get('http://www.pharmacy295.gr/products',data = payload ,headers = headers)

soup = BeautifulSoup(r.text,'lxml')
products = soup.findAll('span', {'class':'name'})
print(products)

Output:

[<span class="name">NEUTROGENA - Hand &amp; Nail Cream - 75ml</span>, <span class="name">NEUTROGENA - Hand Cream (Unscented) - 75ml</span>, <span class="name">NEUTROGENA - PROMO PACK 1+1 \u0394\u03a9\u03a1\u039f  Lip Moisturizer - 4,8gr</span>, <span class="name">NEUTROGENA - Lip Moisturizer with Nordic Berry - 4.9gr</span>]

Also if you prefer you can get the data as json:

In [13]: r = requests.get('http://www.pharmacy295.gr/el/products/autocomplete.json',data = payload ,headers = headers)

In [14]: print(r.json())
[{u'title': u'NEUTROGENA - Hand & Nail Cream - 75ml', u'discounted_price': u'5,31 \u20ac', u'photo': u'/system/uploads/asset/data/12584/tiny_108511.jpg', u'brand': u'NEUTROGENA ', u'path': u'/products/7547', u'price': u'8,17 \u20ac'}, {u'title': u'NEUTROGENA - Hand Cream (Unscented) - 75ml', u'discounted_price': u'4,03 \u20ac', u'photo': u'/system/uploads/asset/data/4689/tiny_102953.jpg', u'brand': u'NEUTROGENA ', u'path': u'/products/3958', u'price': u'6,20 \u20ac'}, {u'title': u'NEUTROGENA - PROMO PACK 1+1 \u0394\u03a9\u03a1\u039f  Lip Moisturizer - 4,8gr', u'discounted_price': u'3,91 \u20ac', u'photo': u'/system/uploads/asset/data/5510/tiny_118843.jpg', u'brand': u'NEUTROGENA ', u'path': u'/products/4644', u'price': u'4,60 \u20ac'}, {u'title': u'NEUTROGENA - Lip Moisturizer with Nordic Berry - 4.9gr', u'discounted_price': u'2,91 \u20ac', u'photo': u'/system/uploads/asset/data/12761/tiny_126088.jpg', u'brand': u'NEUTROGENA ', u'path': u'/products/7548', u'price': u'4,48 \u20ac'}]



回答2:


Do a r = requests.post('http://www.pharmacy295.gr',data = payload ,headers = headers)

GET requests also ignore the data...




回答3:


Firstly the URL is wrong. You are using http://www.pharmacy295.gr but you should be using http://www.pharmacy295.gr/gr/el/products. This URL can actually be simplified to http://www.pharmacy295.gr/products.

Also are making a GET request so, rather than data=payload, try params=payload.

data is for POST requests.

Here is the documentation for requests.get().




回答4:


import requests
from bs4 import BeautifulSoup

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:40.0) Gecko/20100101 Firefox/40.1'}

payload = {
    'query':'test',

}

r = requests.get('http://www.pharmacy295.gr/products',data = payload ,headers = headers)

soup = BeautifulSoup(r.text,'lxml')
products = soup.findAll('span', {'class':'name'})
print(products)


来源:https://stackoverflow.com/questions/35681210/using-python-to-use-a-websites-search-function

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!