Finding next input element using Mechanize?

Question

Using Mechanize, is it possible to find a phrase in the HTML of a page, for example "email", find the next <input> element after it, and fill in that input field, and only that field?

Answer 1:

Mechanize uses Nokogiri internally to handle its DOM parsing, which is the basis of its ability to locate different elements in a page.

It's possible to access the parsed DOM and, through it, use Nokogiri to locate elements Mechanize doesn't normally let us find. For instance:

require 'mechanize'

agent = Mechanize.new
page = agent.get('http://www.example.com')

# Use Nokogiri to find the content of the <h1> tag...
puts page.at('h1').content # => "Example Domain"

For your search you'd want to use an XPath accessor to locate where "email" is in the page. Once you've done that you can locate the next <input> tag.

Starting from a simple HTML fragment, we'll pretend this comes from Mechanize:

page = Nokogiri::HTML('<div><form><p>email</p><input name="email"></form></div>')
puts page.to_html

Which looks like:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body><div><form>
<p>email</p>
<input name="email">
</form></div></body></html>

Searching for "email":

page.at("//*[contains(text(),'email')]")
#<Nokogiri::XML::Element:0x3ff50d0c4bc0 name="p" children=[#<Nokogiri::XML::Text:0x3ff50d0c497c "email">]>

Building upon that, this gets the <input> tag:

input_tag = page.at("//*[contains(text(),'email')]/following-sibling::input")
#<Nokogiri::XML::Element:0x3ff50d09b75c name="input" attributes=[#<Nokogiri::XML::Attr:0x3ff50d09b5f4 name="name" value="email">]>

Once you've found that input tag, you can get the "name" from the tag using Nokogiri, and then tell Mechanize to locate and fill in that particular input field:

input_tag['name']
=> "email"

For a web form to function correctly, it has to have names for the elements. Those get passed to the server when the form is submitted. Without the names it'd take a lot of work to determine which input sent a particular piece of data, and, programmers being lazy, we don't want to work hard, so you can count on having a name to work with.

See "Ruby Mechanize, Nokogiri and Net::HTTP" for more information. Searching Stack Overflow and reading the Nokogiri documentation and tutorials will give you much of what you need to figure out the rest.

Answer 2:

First find the element with the phrase text:

el = page.at('*[text()*="some phrase"]')

From there you can get the first following input:

input = el.at('./following::input')

Now, find the ancestor form node of that input:

form_node = input.ancestors('form')[0]

Then use that to get the Mechanize::Form object:

form = page.form_with(:form_node => form_node)

And now you can fill out the value:

form[input[:name]] = 'foo'

Answer 3:

For a well-formed HTML page, an input element should have a label showing what the input is for. In that case, you can iterate over all the labels, find the one containing the text "email", and get the associated input through the label's for attribute.

However, not all HTML pages are well-formed. There may be no label, no for attribute, or other problems with the markup.

If you mean the input right after some element in the DOM, you can do some DOM traversal to check whether an element containing "email" has an input element next to it.

If you mean the input next to an element in the rendered page, you have to define what "next to" means, and I think you cannot get what you want without great effort. An element located after the "email" element in the DOM might be placed before it with some CSS trick. You would need some graphical API to find that input, and I don't see one in Watir's API documentation.

Source: https://stackoverflow.com/questions/15697049/finding-next-input-element-using-mechanize

Tags

ruby

mechanize