Question
Node.js puppeteer - How do I download, access and process an XML file and its content in Puppeteer?
When clicking on a link like this:
await page.evaluate(() => {
  document.querySelector('#datagrid > div > a:nth-child(2)').click();
});
... I can download an XML file that looks like this:
XML file:
<table>
<row>
<column>Titel01</column>
<column>Titel02</column>
<column>Titel03</column>
<column>Titel04</column>
<column>Titel05</column>
<column>Titel06</column>
<column>Titel07</column>
<column>Titel08</column>
<column>Titel09</column>
<column>Titel10</column>
<column>Titel11</column>
<column>Titel12</column>
<column>Titel13</column>
<column>Titel14</column>
<column>Titel15</column>
<column>Titel16</column>
</row>
<row>
<column>Value01</column>
<column/>
<column>Value03</column>
<column>Value04</column>
<column>Value05</column>
<column>Value06</column>
<column>Value07</column>
<column>Value08</column>
<column>Value09</column>
<column>Value10</column>
<column>Value11</column>
<column>Value12</column>
<column>Value13</column>
<column>Value14</column>
<column>Value15</column>
<column>Value16</column>
</row>
... // possibly more rows start here
<row>
<column/>
<column/>
<column/>
<column/>
<column/>
<column/>
<column/>
<column/>
<column/>
<column/>
<column/>
<column/>
<column/>
<column/>
<column>Value15B</column>
<column>Value16B</column>
</row>
... // possibly more rows
</table>
How can I access the values and store them in variables so that I can process them further in Puppeteer?
Answer 1:
I don't know if this is the best solution, but it works. Instead of calling .click(), I would return
the href value with document.querySelector('#datagrid > div > a:nth-child(2)').href;
and then pass it to another .goto call.
Once the new page is open, you can parse it. Here is a full example:
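// Read the link target instead of clicking it.
const xmlUrl = await page.evaluate(() => {
  return document.querySelector('#datagrid > div > a:nth-child(2)').href;
});
// Open the XML file in the same tab.
await page.goto(xmlUrl, {waitUntil: 'load'});
// Collect the text of every <column> element inside the page context.
const json = await page.evaluate(() => {
  const columns = document.getElementsByTagName('column');
  const values = {values: []};
  // for...of visits only the elements; for...in would also visit keys such as "length" and "item".
  for (const column of columns) {
    // textContent is defined on every element, including non-HTML ones.
    values.values.push(column.textContent);
  }
  return JSON.stringify(values); // return the column values as a string
});
console.log(JSON.parse(json)); // all column values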
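The flat list above loses the row boundaries, so for further processing it may be easier to keep the values grouped by row. Below is a minimal sketch, assuming the first row holds the titles as in the question's XML. The helper name rowsToRecords and the sample data are made up for illustration and are not part of Puppeteer; the commented-out page.evaluate call is an untested suggestion for how the row arrays could be collected.

```javascript
// Hypothetical helper (not part of Puppeteer): turns an array of rows, where
// the first row holds the titles, into one object per data row.
function rowsToRecords(rows) {
  const [titles = [], ...dataRows] = rows;
  return dataRows.map((cells) =>
    // Missing trailing cells are treated as empty strings.
    Object.fromEntries(titles.map((title, i) => [title, cells[i] ?? '']))
  );
}

// In Puppeteer, the row arrays could be collected along these lines
// (untested sketch, assumes the XML page is already open in `page`):
// const rows = await page.evaluate(() =>
//   Array.from(document.getElementsByTagName('row'), (row) =>
//     Array.from(row.getElementsByTagName('column'), (c) => c.textContent)));

// Sample data shaped like the first columns of the question's XML.
const sample = [
  ['Titel01', 'Titel02', 'Titel03'],
  ['Value01', '', 'Value03'],
];
const records = rowsToRecords(sample);
console.log(records);
```

Each record can then be read by title, for example records[0].Titel03, instead of by position in a flat array.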
Source: https://stackoverflow.com/questions/51875386/node-js-puppeteer-downloading-accessing-a-xml-file-and-process-the-content