mechanize | 易学教程

Python Mechanize Error +

I am new to Python (I have done some Java in the past). I recently decided to automate a process that takes me about 20 hours once a year. I need to log in to a vendor's website through its login form. That loads a new form from which I can select an order, which in turn loads yet another form where I can submit an item number. This then loads a page with the sizes of the item and the price per size; I take this information and put it into a spreadsheet. Each row has columns for the sizes followed by the prices (item,sm,med,lg,9.99,10.99,12.99). After that I return to the browser, hit the back
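The row layout described in the question (item, then the size labels, then one price per size) can be sketched with the standard library alone. This is only an illustration under assumptions: `build_row` and `rows_to_csv` are hypothetical helper names, the prices are the example values from the question, and the scraping step that would produce them is not shown.

```python
import csv
import io


def build_row(item, prices_by_size):
    """Build one spreadsheet row: item, size labels, then prices in the same order."""
    sizes = list(prices_by_size)  # dicts keep insertion order in Python 3.7+
    prices = [f"{prices_by_size[size]:.2f}" for size in sizes]
    return [item, *sizes, *prices]


def rows_to_csv(rows):
    """Serialize rows to CSV text that a spreadsheet program can open."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Example values taken from the row shown in the question.
row = build_row("item", {"sm": 9.99, "med": 10.99, "lg": 12.99})
print(rows_to_csv([row]), end="")
```

Because the number of sizes varies per item, the row length varies too; keeping the sizes and prices in matching order is what makes each row readable on its own.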

Get JavaScript variable using Mechanize

I want to get a JavaScript variable from https://admin.booking.com/hotel/hoteladmin, located at head > script > var token. I don't know how this variable is set by the browser, because when I fetch this page with Mechanize I get:

    var token = '' || 'empty-token',

Here is the code I use to GET this page:

    login_url = "https://admin.booking.com/hotel/hoteladmin"
    agent = Mechanize.new
    agent.verify_mode = OpenSSL::SSL::VERIFY_NONE
    page = agent.get(login_url)

Answer (Jan Dragsbaek): If you want to access this token via JavaScript in mechanize/watir, you need to be able to access it with your browser's developer tools as
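Mechanize does not execute JavaScript, so the only value it can see is whatever literal appears in the downloaded page source. A minimal sketch of reading that literal, written in Python with the standard `re` module rather than Ruby Mechanize: `extract_token` is a hypothetical helper, and the sample markup is modeled on the line quoted in the question, not on the real page.

```python
import re

# Right-hand side of the declaration, up to the first ';', ',' or newline.
# Assumption: the string literals themselves contain none of those characters.
DECL_RE = re.compile(r"var\s+token\s*=\s*([^;,\n]*)")
# A single- or double-quoted string literal.
LITERAL_RE = re.compile(r"""(['"])(.*?)\1""")


def extract_token(html):
    """Return the first non-empty string literal assigned to `token`.

    This mimics `a || b` for plain string literals only. It returns None
    when no declaration is found and "" when every literal is empty.
    """
    decl = DECL_RE.search(html)
    if decl is None:
        return None
    for literal in LITERAL_RE.finditer(decl.group(1)):
        if literal.group(2):
            return literal.group(2)
    return ""


sample = "<head><script>var token = '' || 'empty-token',</script></head>"
print(extract_token(sample))
```

If the result is the fallback value, as in the question, the real token is most likely filled in server-side only for an authenticated session, or set later by script. In either case it cannot be recovered from this response alone.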

what does mechanize tag br.set_handle_gzip do?

I'm trying the Python mechanize module in order to write some scripts. When I run my script I get the following error. What actually is this set_handle_gzip?

    manoj@ubuntu:~/pyth$ python rock.py
    rock.py:15: UserWarning: gzip transfer encoding is experimental!
      br.set_handle_gzip(True)
    Traceback (most recent call last):
      File "rock.py", line 60, in <module>
        br.follow_link(text='Sign out')
      File "/usr/lib/python2.7/dist-packages/mechanize/_mechanize.py", line 569, in follow_link
        return self.open(self.click_link(link, **kwds))
      File "/usr/lib/python2.7/dist-packages/mechanize/_mechanize.py",
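Going by the warning text, `set_handle_gzip(True)` turns on handling of gzip-compressed responses, and the `UserWarning` is only a notice, not the cause of the traceback. The sketch below shows the general idea of gzip handling with the standard `gzip` module. It does not use mechanize, `decode_body` is a hypothetical helper, and it is written for Python 3 although the question uses Python 2.7.

```python
import gzip


def decode_body(headers, body):
    """Return the response body as bytes, undoing gzip if the headers declare it."""
    encoding = headers.get("Content-Encoding", "").lower()
    if encoding == "gzip":
        return gzip.decompress(body)
    return body


# Simulate a server that compresses the page before sending it.
html = b"<html><body>Sign out</body></html>"
response_headers = {"Content-Encoding": "gzip"}
response_body = gzip.compress(html)

print(decode_body(response_headers, response_body) == html)
```

A client that advertises gzip support has to perform this decoding step before parsing links or forms; without it the parser would see compressed bytes instead of HTML.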

Finding next input element using Mechanize?

Using Mechanize, is it possible to find a phrase in the HTML of a page, for example "email", find the next <input> element after it, and fill in that input field and only that field?

Answer (the Tin Man): Mechanize uses Nokogiri internally to handle its DOM parsing, which is the basis of its ability to locate different elements in a page. It's possible to access the parsed DOM and, through it, use Nokogiri to locate elements Mechanize doesn't normally let us find. For instance:

    require 'mechanize'
    agent = Mechanize.new
    page = agent.get('http://www.example.com')
    # Use Nokogiri to find the content of the
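The same idea (walk the parsed document, note where the phrase occurs, then take the first input that follows) can be sketched in Python with the standard `html.parser` module. This is a substitute for the Nokogiri approach in the answer, not a translation of it: `NextInputFinder` and `next_input_name` are hypothetical names and the sample form is invented.

```python
from html.parser import HTMLParser


class NextInputFinder(HTMLParser):
    """Remember the first <input> that appears after a given phrase in the text."""

    def __init__(self, phrase):
        super().__init__()
        self.phrase = phrase.lower()
        self.seen_phrase = False
        self.found = False
        self.input_name = None

    def handle_data(self, data):
        # Text nodes arrive in document order, so a flag is enough.
        if self.phrase in data.lower():
            self.seen_phrase = True

    def handle_starttag(self, tag, attrs):
        if tag == "input" and self.seen_phrase and not self.found:
            self.found = True
            self.input_name = dict(attrs).get("name")


def next_input_name(html, phrase):
    """Return the name attribute of the first input after `phrase`, or None."""
    finder = NextInputFinder(phrase)
    finder.feed(html)
    finder.close()
    return finder.input_name


form_html = (
    '<form><input name="user">'
    '<label>Email</label> <input name="addr">'
    '<input name="pw"></form>'
)
print(next_input_name(form_html, "email"))
```

Once the field name is known, only that field needs to be set on the form before submitting, which is the "and only that field" part of the question.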

Python mechanize javascript

I'm trying to use mechanize to grab prices for New York's Metro-North Railroad from this site: http://as0.mta.info/mnr/fares/choosestation.cfm

The problem is that when you select the first option, the site uses JavaScript to populate the list of possible destinations. I have written equivalent code in Python, but I can't seem to get it all working. Here's what I have so far:

    import mechanize
    import cookielib
    from bs4 import BeautifulSoup

    br = mechanize.Browser()
    br.set_handle_robots(False)
    br.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615

Clicking in a online js button with python

I'm trying to click on the "Search all flights" button on http://www.priceline.com/ but I'm having some problems. I know that mechanize doesn't work with JavaScript, so I tried to look at the source code to reproduce what the button does, but I can't find the function. Is there any other way to do this?

Answer 1: I suggest using selenium (download link), which has very heavy support for JavaScript. All docs here. Here is a quick example of how you can do that:

    from selenium import webdriver
    driver = webdriver.Firefox()
    driver.get("http://www.priceline.com/")
    driver.find_element_by_id("hotel-btn-submit

mechanize (python) click on a javascript type link

Is it possible to have mechanize follow an anchor link that is of type javascript? I am trying to log in to a website in Python using mechanize and BeautifulSoup. This is the anchor link:

    <a id="StaticModuleID15_ctl00_SkinLogin1_Login1_Login1_LoginButton" href="javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("StaticModuleID15$ctl00$SkinLogin1$Login1$Login1$LoginButton", "", true, "Login1", "", false, true))"><img id="StaticModuleID15_ctl00_SkinLogin1_Login1_Login1_Image2" border="0" src="../../App_Themes/default/images/Member/btn_loginenter.gif" align="absmiddle" style=
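A `javascript:` link like this cannot be followed as a URL, but on ASP.NET WebForms pages the script it calls usually ends in an ordinary form post with two hidden fields, `__EVENTTARGET` and `__EVENTARGUMENT`, filled in. A minimal sketch of recovering those two values from the `href` with the standard `re` module, assuming the page follows that usual postback pattern: `postback_fields` is a hypothetical helper, and submitting the form is not shown.

```python
import re

# First two string arguments of WebForm_PostBackOptions: event target and argument.
POSTBACK_RE = re.compile(
    r'WebForm_PostBackOptions\(\s*"([^"]*)"\s*,\s*"([^"]*)"'
)


def postback_fields(href):
    """Return the hidden-field values the postback script would set, or None."""
    match = POSTBACK_RE.search(href)
    if match is None:
        return None
    target, argument = match.groups()
    return {"__EVENTTARGET": target, "__EVENTARGUMENT": argument}


# The href from the question, as it looks once the HTML attribute is decoded.
href = (
    'javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions('
    '"StaticModuleID15$ctl00$SkinLogin1$Login1$Login1$LoginButton", "", '
    'true, "Login1", "", false, true))'
)
print(postback_fields(href))
```

With mechanize, one would then typically select the page's form, fill in these hidden fields together with the username and password, and submit it instead of trying to follow the link.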

Python, Mechanize - request disallowed by robots.txt even after set_handle_robots and add_headers

I have made a web crawler which collects all links down to the first level of a page, and from those pages it collects every link and its text, plus image links and alt text. Here is the whole code:

    import urllib
    import re
    import time
    from threading import Thread
    import MySQLdb
    import mechanize
    import readability
    from bs4 import BeautifulSoup
    from readability.readability import Document
    import urlparse

    url = ["http://sparkbrowser.com"]
    i=0
    while i<len(url):
        counterArray = [0]
        levelLinks = []
        linkText = ["homepage"]
        levelLinks = []

        def scraper(root,steps):
            urls = [root]
            visited = [root]
            counter = 0
            while counter < steps:
                step_url =
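The "disallowed by robots.txt" error in the title refers to the check a crawler performs against a site's robots.txt rules before fetching a URL. That check can be sketched with the standard `urllib.robotparser` module. It does not use mechanize, it is written for Python 3 although the question's code is Python 2, and `is_allowed`, the rules, and the URLs are invented for illustration.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Check a URL against robots.txt rules supplied as text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Invented rules: everything under /private/ is off limits to all agents.
rules = """\
User-agent: *
Disallow: /private/
"""

print(is_allowed(rules, "my-crawler", "http://example.com/index.html"))
print(is_allowed(rules, "my-crawler", "http://example.com/private/page"))
```

A crawler that builds several browser or opener objects has to configure each one separately, so checking which object actually issues the failing request is a reasonable first debugging step.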