How to manage a login session through headless Chrome?

Submitted by 馋奶兔 on 2019-11-28 22:29:33

Question


I need to build a scraper that will:

open a headless browser, go to a URL, log in (the site uses Steam OAuth), fill in some inputs, and click two buttons.

The problem is that every new instance of the headless browser clears my login session, so I have to log in again and again. How can I persist the session across instances, for example using Puppeteer with headless Chrome?

Alternatively, how can I open a headless Chrome instance that is already logged in, given that I am already logged in in my main Chrome window?


Answer 1:


In Puppeteer you have access to the session cookies through page.cookies().

So once you log in, you could get every cookie and save it in a json file using jsonfile:

// Save session cookies
const jsonfile = require('jsonfile')
const cookiesObject = await page.cookies()
// Write cookies to a temp file to be used in other profile pages
jsonfile.writeFile(cookiesFilePath, cookiesObject, { spaces: 2 },
  function (err) {
    if (err) {
      console.log('The file could not be written.', err)
      return // do not report success after a failed write
    }
    console.log('Session has been successfully saved')
  })

Then, on your next iteration right before using page.goto() you can call page.setCookie() to load the cookies from the file one by one:

const fs = require('fs')

// This snippet is assumed to run inside an async function (note the return)
const previousSession = fs.existsSync(cookiesFilePath)
if (previousSession) {
  // If the file exists, load the cookies
  const cookiesArr = JSON.parse(fs.readFileSync(cookiesFilePath, 'utf8'))
  if (cookiesArr.length !== 0) {
    for (let cookie of cookiesArr) {
      await page.setCookie(cookie)
    }
    console.log('Session has been loaded in the browser')
    return true
  }
}

Check out the docs:

  • https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#pagecookiesurls
  • https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#pagesetcookiecookies
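One caveat with replaying a saved cookie file is that some entries may have expired since the file was written. The sketch below is not part of the original answer: it drops expired entries and restores the rest with a single page.setCookie() call. It assumes the saved objects carry the expires field that page.cookies() returns (Unix time in seconds, -1 for session cookies). The helper names dropExpired and restoreCookies are made up for illustration.

```javascript
const fs = require('fs');

// Hypothetical helper: keep session cookies (expires missing or negative)
// and cookies whose expiry time (Unix seconds) is still in the future.
function dropExpired(cookies, nowSeconds = Date.now() / 1000) {
  return cookies.filter((cookie) => {
    const expires = cookie.expires;
    if (typeof expires !== 'number' || expires < 0) return true;
    return expires > nowSeconds;
  });
}

// Hypothetical helper: read the saved file and restore what is still valid.
// `page` is a Puppeteer Page; setCookie accepts several cookies at once.
// Returns the number of cookies that were restored.
async function restoreCookies(page, cookiesFilePath) {
  if (!fs.existsSync(cookiesFilePath)) return 0;
  const saved = JSON.parse(fs.readFileSync(cookiesFilePath, 'utf8'));
  const valid = dropExpired(saved);
  if (valid.length > 0) await page.setCookie(...valid);
  return valid.length;
}
```

If restoreCookies() returns 0, the script would presumably fall back to a normal login and save a fresh cookie file.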



Answer 2:


You can persist user data by passing the userDataDir option when launching Puppeteer. Chrome then stores the session and other profile data in that directory, so later launches that point at the same directory can reuse it.

puppeteer.launch({
  userDataDir: "./user_data"
});

It doesn't go into great detail but here's a link to the docs for it: https://pptr.dev/#?product=Puppeteer&version=v1.6.1&show=api-puppeteerlaunchoptions
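As a minimal sketch of how this option might be wired into a script, the helper below resolves the profile path to an absolute one, creates the directory, and launches headless Chrome against it. The name launchWithProfile is hypothetical, and the Puppeteer module is passed in as an argument, which is an assumption made here so the sketch does not depend on how the library is installed.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: `puppeteer` is passed in (e.g. require('puppeteer')).
async function launchWithProfile(puppeteer, profileDir = './user_data') {
  // Resolve to an absolute path so every run points at the same profile,
  // regardless of the current working directory.
  const userDataDir = path.resolve(profileDir);
  fs.mkdirSync(userDataDir, { recursive: true });
  return puppeteer.launch({ headless: true, userDataDir });
}

// Usage (inside an async function):
//   const browser = await launchWithProfile(require('puppeteer'));
//   const page = await browser.newPage();
//   await page.goto('https://example.com'); // placeholder URL
//   await browser.close();
```

Closing the browser before the script exits is probably worth doing here, since Chrome generally expects only one running instance per profile directory.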




Answer 3:


Here is a version of the cookie-file solution above that works as written and does not rely on jsonfile, using Node's built-in fs module instead:

Setup:

const fs = require('fs');
const cookiesPath = "cookies.txt";

Reading the cookies (put this code first):

// If the cookies file exists, read the cookies.
const previousSession = fs.existsSync(cookiesPath)
if (previousSession) {
  const content = fs.readFileSync(cookiesPath);
  const cookiesArr = JSON.parse(content);
  if (cookiesArr.length !== 0) {
    for (let cookie of cookiesArr) {
      await page.setCookie(cookie)
    }
    console.log('Session has been loaded in the browser')
  }
}

Writing the cookies:

// Write Cookies
const cookiesObject = await page.cookies()
fs.writeFileSync(cookiesPath, JSON.stringify(cookiesObject));
console.log('Session has been saved to ' + cookiesPath);
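The read and write snippets above can be combined into one "log in only when needed" step. The following is a sketch under assumptions, not the answer author's code: ensureSession is a hypothetical name, and login stands for a caller-supplied async function that performs the site-specific sign-in on the page.

```javascript
const fs = require('fs');

// Hypothetical wrapper: restore cookies if a non-empty file exists,
// otherwise run the caller's login routine and save the new cookies.
async function ensureSession(page, cookiesPath, login) {
  if (fs.existsSync(cookiesPath)) {
    const cookies = JSON.parse(fs.readFileSync(cookiesPath, 'utf8'));
    if (cookies.length !== 0) {
      await page.setCookie(...cookies);
      return 'restored';
    }
  }
  // No usable cookie file: sign in, then persist the session for next time.
  await login(page);
  const fresh = await page.cookies();
  fs.writeFileSync(cookiesPath, JSON.stringify(fresh, null, 2));
  return 'logged-in';
}

// Usage (inside an async function, with `page` from browser.newPage()):
//   await ensureSession(page, 'cookies.txt', async (p) => {
//     /* site-specific login steps go here */
//   });
```

Note that page.cookies() called without arguments returns cookies for the current page URL only, so with a third-party login flow it may be necessary to pass the relevant URLs explicitly.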



Answer 4:


To write the saved cookies into the browser:

async function writingCookies() {
  const cookieArray = require(C.cookieFile); // C.cookieFile can be replaced by './filename.json'
  await page.setCookie(...cookieArray);
  await page.cookies(C.feedUrl); // C.feedUrl can be 'https://example.com'
}

To read the cookies from the browser and save them to a file, you first have to install jsonfile in your project (npm install jsonfile) and load it with const jsonfile = require('jsonfile'):

async function getCookies() {
const cookiesObject = await page.cookies();
jsonfile.writeFile('linkedinCookies.json', cookiesObject, { spaces: 2 },
  function (err) {
    if (err) {
      console.log('The Cookie file could not be written.', err);
    }
    console.log("Cookie file has been successfully saved in current working Directory : '" + process.cwd() + "'");
  })
}

Call these two functions using await and it will work for you.



Source: https://stackoverflow.com/questions/48608971/how-to-manage-log-in-session-through-headless-chrome
