Node.js async/await or generic-pool causes infinite loop?

Submitted by 杀马特。学长 韩版系。学妹 on 2019-12-11 05:16:56

Question


I am trying to write an automation script for work. It is supposed to use multiple puppeteer instances to process input strings simultaneously. The task queue and the number of puppeteer instances are controlled by the generic-pool package. Strangely, when I run the script on Ubuntu or Debian, it seems to fall into an infinite loop and tries to launch an unlimited number of puppeteer instances, whereas on Windows the output is normal.

const puppeteer = require('puppeteer');
const genericPool = require('generic-pool');
const faker = require('faker');
let options = require('./options');
let i = 0;
let proxies = [...options.proxy];

const pool = genericPool.createPool({
    create: async () => {
        i++;
        console.log(`create instance ${i}`);
        if (!proxies.length) {
            proxies = [...options.proxy];
        }
        let {control = null, proxy} = proxies.pop();
        let instance = await puppeteer.launch({
            headless: true,
            args: [
                `--proxy-server=${proxy}`,
            ]
        });
        instance._own = {
            proxy,
            tor: control,
            numInstance: i,
        };
        return instance;
    },
    destroy: async instance => {
        console.log('destroy instance', instance._own.numInstance);
        await instance.close()
    },
}, {
    max: 3, 
    min: 1, 
});

async function run(emails = []) {
    console.log('Processing', emails.length);
    const promises = emails.map(email => {
        console.log('Processing', email)
        pool.acquire()
            .then(browser => {
                console.log(`${email} handled`);
                pool.destroy(browser);
            })
    })
    await Promise.all(promises)
    await pool.drain();
    await pool.clear();
}

let emails = [a,b,c,d,e,];
run(emails)

Output

create instance 1
Processing 10
Processing Stacey_Haley52
Processing Polly.Block
create instance 2
Processing Shanny_Hudson59
Processing Vivianne36
Processing Jayda_Ullrich
Processing Cheyenne_Quitzon
Processing Katheryn20
Processing Jamarcus74
Processing Lenore.Osinski
Processing Hobart75
create instance 3
create instance 4
create instance 5
create instance 6
create instance 7
create instance 8
create instance 9

Is it because of my async functions? How can I fix it? I appreciate your help!

Edit 1: modified according to @James's suggestion.


Answer 1:


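Return the promise from your map callback rather than awaiting inside it, so that Promise.all has something to wait on. Likewise, don't await inside the destroy callback; return the result of instance.close() instead. You can then chain the calls, e.g.

const promises = emails.map(() => pool.acquire().then(browser => pool.destroy(browser)));

(The arrow function keeps destroy bound to the pool; passing pool.destroy on its own would lose its this.)

Alternatively, you could get rid of destroy completely, e.g.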

pool.acquire().then(b => b.close())
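
To see why the missing return matters, here is a minimal sketch that needs neither puppeteer nor generic-pool. fakeAcquire is a hypothetical stand-in for pool.acquire(), and the email addresses are placeholders. With a block-bodied callback and no return, map() produces an array of undefined, so Promise.all resolves right away and the code that follows (pool.drain() in the question) runs before any work has finished.

```javascript
// Hypothetical stand-in for pool.acquire(): resolves with a fake "browser" after 10 ms.
const fakeAcquire = (email) =>
    new Promise(resolve => setTimeout(() => resolve({ email }), 10));

async function runWithoutReturn(emails) {
    const handled = [];
    // Block body without `return`: every element of `promises` is undefined.
    const promises = emails.map(email => {
        fakeAcquire(email).then(browser => handled.push(browser.email));
    });
    await Promise.all(promises); // nothing to wait for, resolves almost immediately
    return handled.length;
}

async function runWithReturn(emails) {
    const handled = [];
    // Concise body: the promise chain itself is returned to map().
    const promises = emails.map(email =>
        fakeAcquire(email).then(browser => handled.push(browser.email)));
    await Promise.all(promises); // waits for every chain to finish
    return handled.length;
}

async function demo() {
    const emails = ['a@example.com', 'b@example.com', 'c@example.com'];
    const without = await runWithoutReturn(emails);
    const withReturn = await runWithReturn(emails);
    console.log('handled without return:', without, 'with return:', withReturn);
    return { without, withReturn };
}

demo();
```

The first count is taken before any of the fake acquisitions has completed, while the second is taken after all of them have.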



Answer 2:


The main problem you are trying to solve is this:

It is supposed to use multiple puppeteer instances to process input strings simultaneously.

Promise Queue

A simple promise queue is enough for this. The p-queue package lets you limit the concurrency as you wish. I have used it on multiple scraping projects.

Here is how you can use it.

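const puppeteer = require('puppeteer');
// require() style import as used when this answer was written;
// newer p-queue releases may export the class differently, so check your installed version
const PQueue = require('p-queue');

// emails to handle (placeholder values)
const emails = ['a', 'b', 'c', 'd', 'e'];

// placeholders for the values the question reads from its options file
const proxy = 'http://127.0.0.1:8080';
const control = null;
let i = 0;

// create queue with concurrency, i.e. how many instances we want to run at once
const queue = new PQueue({
    concurrency: 1
});

// single task processor
const createInstance = async (email) => {
    i++;
    const instance = await puppeteer.launch({
        headless: true,
        args: [
            `--proxy-server=${proxy}`,
        ]
    });
    instance._own = {
        proxy,
        tor: control,
        numInstance: i,
    };
    console.log('email:', email);
    return instance;
};

// add tasks to queue
for (const email of emails) {
    queue.add(() => createInstance(email));
}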
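
If it helps to see what "limiting concurrency" means without pulling in a dependency, here is a rough hand-rolled sketch. It is not how p-queue is implemented; createQueue, handle and the email values are hypothetical names made up for this illustration. The idea is simply to keep a counter of running tasks and start the next waiting task whenever one finishes.

```javascript
// Hand-rolled concurrency limiter for illustration only; NOT p-queue's implementation.
function createQueue(concurrency) {
    let running = 0;
    const waiting = [];

    const next = () => {
        if (running >= concurrency || waiting.length === 0) return;
        running++;
        const { task, resolve, reject } = waiting.shift();
        // Promise.resolve().then(task) also turns a synchronous throw into a rejection.
        Promise.resolve()
            .then(task)
            .then(resolve, reject)
            .finally(() => {
                running--;
                next(); // a slot is free again, start the next waiting task
            });
    };

    return {
        // Returns a promise that settles with the task's own result.
        add(task) {
            return new Promise((resolve, reject) => {
                waiting.push({ task, resolve, reject });
                next();
            });
        },
    };
}

// Hypothetical workload: pretends to handle one email and records peak concurrency.
async function demoQueue() {
    const queue = createQueue(2);
    let active = 0;
    let peak = 0;
    const handle = async (email) => {
        active++;
        peak = Math.max(peak, active);
        await new Promise(r => setTimeout(r, 10));
        active--;
        return email;
    };
    const emails = ['a', 'b', 'c', 'd', 'e'];
    const results = await Promise.all(emails.map(e => queue.add(() => handle(e))));
    return { results, peak };
}

demoQueue().then(({ results, peak }) => console.log(results, 'peak concurrency:', peak));
```

With a limit of 2, no more than two of the five tasks are ever in flight at once, which is the same guarantee you want for the number of browser instances.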

Generic Pool Infinite Loop Problem

I removed all puppeteer-related code from your sample and it still produced endless output on the console.

create instance 70326
create instance 70327
create instance 70328
create instance 70329
create instance 70330
create instance 70331
...

If you test it a few times, you will see that it only falls into the loop when something in your code is crashing. The culprit is the pool.acquire() promise, which simply re-queues the request when resource creation fails.

To find out what is causing the crash, listen for the following events:
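
The behaviour can be pictured with a toy model. This is not generic-pool's source code, just a sketch of "retry create on failure" under my own assumptions; acquireWithRetry, alwaysFails, succeedsOnThird and the maxAttempts cap are all hypothetical, and the cap exists only so the demo terminates. Without such a cap, a factory that always throws would keep printing "create instance N" forever and the caller would never see the error.

```javascript
// Toy model of "re-queue on create failure"; NOT generic-pool's source code.
// maxAttempts is an artificial cap added here so that the demo terminates.
async function acquireWithRetry(create, maxAttempts) {
    let attempts = 0;
    while (attempts < maxAttempts) {
        attempts++;
        try {
            const resource = await create(attempts);
            return { resource, attempts };
        } catch (err) {
            // Swallowing the error here is what hides the real crash from the caller.
        }
    }
    return { resource: null, attempts };
}

// Hypothetical factory that always fails, e.g. a browser that cannot launch.
const alwaysFails = async (n) => {
    throw new Error(`create instance ${n} failed`);
};

// Hypothetical factory that only succeeds on its third call.
const succeedsOnThird = async (n) => {
    if (n < 3) throw new Error('not yet');
    return { numInstance: n };
};

async function demoRetry() {
    const failed = await acquireWithRetry(alwaysFails, 5);
    const ok = await acquireWithRetry(succeedsOnThird, 5);
    return { failed, ok };
}

demoRetry().then(result => console.log(result));
```

A create function that fails only on one platform (for example, a browser that cannot start on a given Linux setup) would explain why the loop shows up on some machines and not on others.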

pool.on("factoryCreateError", function(err) {
  console.log('factoryCreateError',err);
});

pool.on("factoryDestroyError", function(err) {
  console.log('factoryDestroyError',err);
});

There are some reported issues related to this:

  • acquire() never resolves/rejects if factory always rejects.
  • About the acquire function in pool.js.
  • .acquire() doesn't reject when resource creation fails.
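
Since an acquire() that never settles will hang the caller silently, one defensive option is to race it against a timer. The sketch below uses only Promise.race and setTimeout; withTimeout and neverSettles are hypothetical names for this illustration and are not part of generic-pool.

```javascript
// Generic timeout guard built on Promise.race; withTimeout is a hypothetical helper.
function withTimeout(promise, ms, label = 'operation') {
    let timer;
    const timeout = new Promise((_, reject) => {
        timer = setTimeout(
            () => reject(new Error(`${label} timed out after ${ms} ms`)),
            ms
        );
    });
    // Whichever settles first wins; the timer is cleared either way.
    return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Stand-in for an acquire() call that never resolves or rejects.
const neverSettles = new Promise(() => {});

async function demoTimeout() {
    const fast = await withTimeout(Promise.resolve('browser'), 50, 'acquire');
    let message = null;
    try {
        await withTimeout(neverSettles, 20, 'acquire');
    } catch (err) {
        message = err.message;
    }
    return { fast, message };
}

demoTimeout().then(result => console.log(result));
```

In the question's script this would look like withTimeout(pool.acquire(), 30000, 'acquire'), where 30000 is an arbitrary value. The generic-pool documentation also describes an acquireTimeoutMillis option that may serve the same purpose; check it against the version you have installed.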

Good luck!



Source: https://stackoverflow.com/questions/52801911/node-js-async-await-or-generic-pool-causes-infinite-loop
