Puppeteer $$eval and querySelector


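I need to get the code and the product title from each header cell. I tried the code below, but it fails with "Cannot read property 'innerText' of null".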

HTML

 <th data-code="XXXXXX">
     <div>
      <div id="title"><span>Product title</span></div>
     </div>
    </th> 
    <th data-code="XXXXXX">
     <div>
      <div id="title"><span>Product title</span></div>
     </div>
    </th>

Puppeteer

 let table = await page.$$eval(
     'table th',
      divs => divs.map((div, index) => ({
        title: div.querySelector('#title').innerText,
        code: div.dataset.code
      })
    )
 );

1 Answer

Your existing code works fine for me:

const puppeteer = require("puppeteer"); // ^22.6.0

const html = `
<table>
  <th data-code="XXXXXX">
   <div>
    <div id="title"><span>Product title</span></div>
   </div>
  </th>
  <th data-code="XXXXXX 2">
   <div>
    <div id="title"><span>Product title 2</span></div>
   </div>
  </th>
</table>`;

let browser;
(async () => {
  browser = await puppeteer.launch();
  const [page] = await browser.pages();
  await page.setContent(html);

  // your exact code:
  let table = await page.$$eval(
     'table th',
      divs => divs.map((div, index) => ({
        title: div.querySelector('#title').innerText,
        code: div.dataset.code
      })
    )
  );

  console.log(table);
})()
  .catch(err => console.error(err))
  .finally(() => browser?.close());

Output:

[
  { title: 'Product title', code: 'XXXXXX' },
  { title: 'Product title 2', code: 'XXXXXX 2' }
]

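The site may have shadow DOM, iframes, arbitrary JS behavior, a Cloudflare block, A/B tests, or some other mitigating factor, so please share a minimal example that uses the actual page to get further help.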
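
In the meantime, a defensive version of the callback can help narrow down which cells lack a `#title` element. This is only a sketch under assumptions: `extractRows` is a hypothetical helper name, and waiting on `table th #title` assumes the cells are rendered dynamically in the main frame (it will not help with iframes or shadow DOM).

```javascript
// Hypothetical helper: Puppeteer serializes it and runs it in the browser
// when it is passed to page.$$eval, so it must not reference Node.js variables.
function extractRows(cells) {
  return cells.map(cell => {
    const titleEl = cell.querySelector('#title');
    return {
      // Optional chaining yields null instead of throwing when #title is missing.
      title: titleEl?.innerText?.trim() ?? null,
      code: cell.dataset.code ?? null,
    };
  });
}

// Usage sketch (assumes `page` has already navigated to the target site):
//   await page.waitForSelector('table th #title'); // wait for the cells to render
//   const table = await page.$$eval('table th', extractRows);
//   console.log(table.filter(row => row.title === null)); // cells without a title
```

If every row comes back with a `null` title even after waiting, the elements are probably not in the main document, which points toward the iframe or shadow DOM possibilities.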
