Question
I am filling out and submitting a form with PhantomJS and then printing the resulting page. The problem is that I cannot tell whether the form is being submitted at all.
I print the resulting page, but it is the same as the original page. I don't know whether that is because the site redirects back, because the form was never submitted, or because I need to wait longer. In a real browser, the form sends a GET and receives a cookie, which the browser uses to send more GETs before eventually receiving the final result: flight data.
I copied the example from "How to submit a form using PhantomJS", using a different URL and different page.evaluate functions.
var page = new WebPage(), testindex = 0, loadInProgress = false;

// Forward console output from inside the page to the PhantomJS console
page.onConsoleMessage = function(msg) {
    console.log(msg);
};

page.onLoadStarted = function() {
    loadInProgress = true;
    console.log("load started");
};

page.onLoadFinished = function() {
    loadInProgress = false;
    console.log("load finished");
};

var steps = [
    function() {
        // Load the search page
        page.open("http://www.klm.com/travel/dk_da/index.htm");
    },
    function() {
        // Fill in the search form (relies on the page providing jQuery as $)
        page.evaluate(function() {
            $("#ebt-origin-place").val("CPH");
            $("#ebt-destination-place").val("CDG");
            $("#ebt-departure-date").val("1/5/2013");
            $("#ebt-return-date").val("10/5/2013");
        });
    },
    function() {
        // Submit the form
        page.evaluate(function() {
            $('#ebt-flightsearch-submit').click();
            // also tried:
            // $('#ebt-flight-searchform').submit();
        });
    },
    function() {
        // Output content of page to stdout after form has been submitted
        page.evaluate(function() {
            console.log(document.querySelectorAll('html')[0].outerHTML);
        });
    }
];

// Run the next step whenever no page load is in progress
var interval = setInterval(function() {
    if (!loadInProgress && typeof steps[testindex] == "function") {
        console.log("step " + (testindex + 1));
        steps[testindex]();
        testindex++;
    }
    if (typeof steps[testindex] != "function") {
        console.log("test complete!");
        phantom.exit();
    }
}, 50);
Answer 1:
The site of interest is rather complicated to scrape. I logged the HTTP traffic from the US KLM site and got this:
GET /travel/us_en/apps/ebt/ebt_home.htm?name=on&ebt-origin-place=New+York+-+John+F.+Kennedy+International+%28JFK%29%2CNew+York&ebt-destination-place=Paris+-+Charles+De+Gaulle+Airport+%28CDG%29%2C+France&c%5B0%5D.os=JFK&c%5B0%5D.ost=airport&c%5B0%5D.ds=CDG&c%5B0%5D.dst=airport&c%5B1%5D.os=CDG&c%5B1%5D.ost=airport&c%5B1%5D.ds=JFK&inboundDestinationLocationType=airport&redirect=no&chdQty=0&infQty=0&c%5B0%5D.dd=2013-07-31&c%5B1%5D.dd=2013-08-14&c%5B1%5D.format=dd%2Fmm%2Fyyyy&flex=true&ebt-cabin-class=ECONOMY&adtQty=1&goToPage=&cffcc=ECONOMY&sc=false HTTP/1.1
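The values you inject into the form elements are not what their server is looking for. The logged request carries full place names in ebt-origin-place and ebt-destination-place, separate airport-code fields such as c[0].os=JFK and c[0].ds=CDG, and dates in yyyy-mm-dd form such as c[0].dd=2013-07-31, rather than values like "CPH" or "1/5/2013" typed into the visible inputs.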
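One way to see what the server expects is to rebuild the logged query string from its individual parameters. The following is only a sketch: the helper buildQuery is hypothetical, the host www.klm.com is an assumption (the logged request shows only the path), just a subset of the logged parameters is included, and it is not known whether the server accepts this request without the cookies a real browser would send.

```javascript
// Assumed host; the path is taken from the logged request.
var BASE_URL = 'http://www.klm.com/travel/us_en/apps/ebt/ebt_home.htm';

// A subset of the parameters seen in the logged request, in [name, value] pairs.
var searchParams = [
    ['c[0].os', 'JFK'], ['c[0].ost', 'airport'],
    ['c[0].ds', 'CDG'], ['c[0].dst', 'airport'],
    ['c[0].dd', '2013-07-31'],
    ['c[1].os', 'CDG'], ['c[1].ost', 'airport'],
    ['c[1].ds', 'JFK'], ['c[1].dd', '2013-08-14'],
    ['c[1].format', 'dd/mm/yyyy'],
    ['ebt-cabin-class', 'ECONOMY'],
    ['adtQty', '1'], ['chdQty', '0'], ['infQty', '0']
];

// Hypothetical helper: percent-encodes each name and value and joins them with '&'.
function buildQuery(pairs) {
    return pairs.map(function (pair) {
        return encodeURIComponent(pair[0]) + '=' + encodeURIComponent(pair[1]);
    }).join('&');
}

var searchUrl = BASE_URL + '?' + buildQuery(searchParams);

// Only runs under PhantomJS; the guard lets the helper be loaded elsewhere too.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    page.open(searchUrl, function (status) {
        console.log('open: ' + status);
        console.log(page.content);
        phantom.exit();
    });
}
```

Comparing searchUrl with the logged request line shows which fields the form fills in behind the scenes, beyond the four visible inputs.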
Code passed to page.evaluate() runs sandboxed inside the page, but the sample code already includes a hook (page.onConsoleMessage) that forwards the page's console output to the external console. For other debugging, you can also use object inspectors and similar tools, but they must either be injected into the page or be part of the code passed to evaluate().
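To find out whether a submission actually triggers any requests, the console hook can be complemented with PhantomJS's network callbacks. This is a minimal sketch, not a complete scraper: describeRequest and describeResponse are hypothetical helpers, and the form-filling steps from the question would still have to be added after the page loads.

```javascript
// Hypothetical helpers that format one network event as a single log line.
function describeRequest(requestData) {
    return 'REQUEST ' + requestData.method + ' ' + requestData.url;
}

function describeResponse(response) {
    return 'RESPONSE ' + response.status + ' ' + response.url;
}

// Only runs under PhantomJS; the guard lets the helpers be loaded elsewhere too.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();

    // Console output from code running inside the page sandbox
    page.onConsoleMessage = function (msg) {
        console.log('PAGE: ' + msg);
    };

    // Every request the page issues, including those caused by a form submit
    page.onResourceRequested = function (requestData) {
        console.log(describeRequest(requestData));
    };

    // Each response is reported in stages; log it once, when it is complete
    page.onResourceReceived = function (response) {
        if (response.stage === 'end') {
            console.log(describeResponse(response));
        }
    };

    // Fires when the page navigates, for example after a redirect
    page.onUrlChanged = function (targetUrl) {
        console.log('URL CHANGED ' + targetUrl);
    };

    page.open('http://www.klm.com/travel/dk_da/index.htm', function (status) {
        console.log('open: ' + status);
        phantom.exit();
    });
}
```

If no REQUEST line appears after the click or submit step, the form was most likely never submitted; if requests appear but the URL returns to the start page, the site is redirecting back.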
Source: https://stackoverflow.com/questions/15658317/phantomjs-submit-a-form