CasperJS: Iterating through URLs

灰色年华 2020-12-18 10:36

I'm pretty new to CasperJS, but isn't there a way to open a URL and execute CasperJS commands in for loops? For example, this code doesn't work as I expected it to:

2 Answers
  • 2020-12-18 11:01

    If you need to get context then use the example here: https://groups.google.com/forum/#!topic/casperjs/n_zXlxiPMtk

    I used the IIFE (immediately-invoked-function-expression) option.

    Eg:

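    for (var i = 0; i < links.length; i++) {
      // The IIFE gives each iteration its own copy of the index.
      (function(index) {
        var link = links[index];
        // Build a file name from the link: drop the first '#', turn '/' into '-'.
        var filename = link.replace(/#/, '');
        filename = filename.replace(/\//g, '-') + '.png';
    
        // This echo runs immediately, while the steps are being queued.
        casper.echo('Attempting to capture: ' + link);
        casper.thenOpen(vars.domain + link).waitForSelector('.title h1', function () {
          this.capture(filename);
        });
      })(i);
    }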
    

    links could also be an array of objects, in which case each index gives you a group of properties rather than a single string:

    var links = [{'page':'some-page.html', 'filename':'page-page.png'}, {...}]
    
  • 2020-12-18 11:16

    As Fanch and Darren Cook stated, you could use an IIFE to fix the URL value inside the thenOpen step.

    An alternative is to use getCurrentUrl to check the URL. Change the line

    this.echo(normal_url);
    

    to

    this.echo(this.getCurrentUrl());
    

    The problem is that normal_url is read when the step callback runs, which is after the loop has finished, so it holds the last value that was set rather than the value from that iteration. This does not affect casper.thenOpen(normal_url, function(){...}); itself, because the URL is passed as an argument at the moment of the call. You only see the wrong URL printed; the correct URL is actually opened.
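
The late-binding effect described above can be reproduced without CasperJS. The following is a minimal sketch in plain JavaScript: `then` and `run` are a toy stand-in for a step queue (not the CasperJS API), and the URL values are hypothetical. It only illustrates that a callback stored now and executed later sees the final value of a shared `var`, while an IIFE parameter keeps the per-iteration value.

```javascript
// Toy step queue (an assumption for illustration, not CasperJS itself):
// callbacks are stored when then() is called and only executed by run().
var steps = [];
function then(fn) { steps.push(fn); }
function run() {
  var out = [];
  steps.forEach(function (fn) { out.push(fn()); });
  steps = [];
  return out;
}

var urls = ['/a', '/b', '/c']; // hypothetical example values

// 1) Shared `var`: every callback reads normal_url after the loop has ended.
for (var i = 0; i < urls.length; i++) {
  var normal_url = urls[i];
  then(function () { return normal_url; });
}
var shared = run();

// 2) IIFE: each iteration passes its value in as the parameter `url`,
//    so every callback closes over its own copy.
for (var j = 0; j < urls.length; j++) {
  (function (url) {
    then(function () { return url; });
  })(urls[j]);
}
var captured = run();

console.log(shared);   // all three entries are the last URL
console.log(captured); // one entry per URL, in order
```

In the first loop the three callbacks share a single variable, which is why the echoed value in the question looked wrong even though each open call had received the right URL.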


    Regarding your updated question:

    All then* and wait* functions in the CasperJS API are step functions. The function that you pass into them is scheduled and executed later (triggered by casper.run()). You shouldn't read variables that are set inside steps from code outside of steps. Just add further steps inside of the thenOpen call. They will be scheduled in the correct order. Also, you cannot return anything from thenOpen.

    var somethingDone = false;
    var status;
    casper.thenOpen(normal_url, function() {
        status = this.status(false)['currentHTTPStatus'];
        if (status != 200) {
            this.thenOpen(alternativeURL, function(){
                // do something
                somethingDone = true;
            });
        }
    });
    casper.then(function(){
        console.log("status: " + status);
        if (somethingDone) {
            // something has been done
            somethingDone = false;
        }
    });
    

    In this example, this.thenOpen is scheduled right after the casper.thenOpen step, so it runs before the final casper.then step. If the alternative URL was opened, somethingDone is therefore already true when casper.then runs.
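
The ordering described above can be sketched with a toy scheduler. This is not CasperJS code: `Scheduler`, its fields, and the `httpStatus` value are hypothetical, and the sketch only models the one behaviour the answer relies on, namely that a step registered while another step is running is placed directly after the running step.

```javascript
// Toy scheduler (an assumption for illustration, not the CasperJS implementation).
function Scheduler() {
  this.steps = [];
  this.current = -1;  // index of the running step, -1 while not running
  this.inserted = 0;  // number of steps the running step has added so far
}

Scheduler.prototype.then = function (fn) {
  if (this.current < 0) {
    // Not running yet: append to the end of the queue.
    this.steps.push(fn);
  } else {
    // Running: insert right after the current step (and after any steps
    // that the current step has already inserted).
    this.inserted++;
    this.steps.splice(this.current + this.inserted, 0, fn);
  }
  return this;
};

Scheduler.prototype.run = function () {
  for (this.current = 0; this.current < this.steps.length; this.current++) {
    this.inserted = 0;
    this.steps[this.current].call(this);
  }
  this.current = -1;
};

var log = [];
var somethingDone = false;
var httpStatus = 404; // hypothetical status, chosen so the nested step is added

var s = new Scheduler();
s.then(function () {
  log.push('open normal_url');
  if (httpStatus !== 200) {
    this.then(function () {
      log.push('open alternativeURL');
      somethingDone = true;
    });
  }
});
s.then(function () {
  log.push('final step, somethingDone=' + somethingDone);
});
s.run();

console.log(log);
```

The nested step lands between the two top-level steps, which is why the flag is already set by the time the last step reads it.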


    There are some things that you need to fix:

    • You don't use your counter i: you probably mean "./Draws/wimbledon_draw_" + i + ".json" not "./Draws/wimbledon_draw_" + counter + ".json"
    • You cannot require a JSON string. Interestingly, you can require a JSON file. I would still use fs.read to read the file and then parse its contents with JSON.parse.
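
The two fixes above (using the loop counter in the path, and parsing the file contents instead of requiring a string) can be sketched as follows. The helper names `drawPath` and `loadDraw`, the in-memory `fakeFiles` object, and the JSON contents are all hypothetical; the reader function is passed in as a parameter so the example runs without a real file system.

```javascript
// Build the path from the loop counter i.
function drawPath(i) {
  return './Draws/wimbledon_draw_' + i + '.json';
}

// readText is any function that takes a path and returns the file's text.
function loadDraw(i, readText) {
  return JSON.parse(readText(drawPath(i)));
}

// In-memory stand-in for the file system (hypothetical contents).
var fakeFiles = {
  './Draws/wimbledon_draw_1.json': '{"round": 1, "players": ["A", "B"]}',
  './Draws/wimbledon_draw_2.json': '{"round": 2, "players": ["C", "D"]}'
};
function readFromMemory(path) {
  return fakeFiles[path];
}

var draws = [];
for (var i = 1; i <= 2; i++) {
  draws.push(loadDraw(i, readFromMemory));
}

console.log(draws.length, draws[0].players, draws[1].round);
```

Inside a CasperJS script, the reader would presumably be a small wrapper such as `function (p) { return require('fs').read(p); }`, assuming the files exist at those paths.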

    Regarding your question...

    You didn't schedule any commands. Just add steps (then* or wait*) after or inside of thenOpen.
