How can I get the raw download size of a request using Puppeteer?

Question   Votes: 0   Answers: 4

That is, the total amount of data downloaded for all resources (including video/media), similar to what the Network tab in Chrome DevTools reports.

javascript request puppeteer
4 Answers
11
votes

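As of January 2018, there doesn't seem to be any approach that both works for all resource types (listening for the response event fails for videos) and correctly counts compressed resources.

The best workaround seems to be to listen for the Network.dataReceived event and process it manually: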

const resources = {};
page._client.on('Network.dataReceived', (event) => {
  const request = page._networkManager._requestIdToRequest.get(
    event.requestId
  );
  // Skip events with no matching request (avoids calling url() on undefined)
  // and skip inline data: URLs, which are not downloaded.
  if (!request || request.url().startsWith('data:')) {
    return;
  }
  const url = request.url();
  // encodedDataLength is supposed to be the amount of data received
  // over the wire, but it's often 0, so just use dataLength for consistency.
  // https://chromedevtools.github.io/devtools-protocol/tot/Network/#event-dataReceived
  // const length = event.encodedDataLength > 0 ?
  //     event.encodedDataLength : event.dataLength;
  const length = event.dataLength;
  if (url in resources) {
    resources[url] += length;
  } else {
    resources[url] = length;
  }
});

// page.goto(...), etc.

// totalCompressedBytes is unavailable; see comment above
const totalUncompressedBytes = Object.values(resources).reduce((a, n) => a + n, 0);
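
As a complement to the snippet above, here is a minimal sketch that keys the counters by requestId instead of URL, so it does not depend on the private _requestIdToRequest map. It also records the encodedDataLength reported by the Network.loadingFinished event, which the DevTools Protocol documents as the total number of bytes received for the request. The helper name trackTransferSizes is hypothetical, the sketch assumes Network.enable has already been sent on the session, and unlike the answer above it does not filter out data: URLs.

```javascript
// Accumulates byte counts from Chrome DevTools Protocol network events.
// `client` is anything exposing on(eventName, handler), for example a
// Puppeteer CDPSession (assumption: Network.enable was sent on it).
function trackTransferSizes(client) {
  const decodedByRequest = new Map(); // requestId -> uncompressed bytes
  const encodedByRequest = new Map(); // requestId -> bytes over the wire

  // Each chunk of a response body raises one dataReceived event.
  client.on('Network.dataReceived', (event) => {
    const previous = decodedByRequest.get(event.requestId) || 0;
    decodedByRequest.set(event.requestId, previous + event.dataLength);
  });

  // loadingFinished fires once per request with the encoded total.
  client.on('Network.loadingFinished', (event) => {
    encodedByRequest.set(event.requestId, event.encodedDataLength);
  });

  const sum = (map) => [...map.values()].reduce((a, n) => a + n, 0);
  return {
    totalDecodedBytes: () => sum(decodedByRequest),
    totalEncodedBytes: () => sum(encodedByRequest),
  };
}

// Hypothetical usage with Puppeteer (not executed here):
//   const client = await page.target().createCDPSession();
//   await client.send('Network.enable');
//   const sizes = trackTransferSizes(client);
//   await page.goto(url);
//   console.log(sizes.totalEncodedBytes(), sizes.totalDecodedBytes());
```

Requests that are still streaming when the totals are read (long videos, for instance) will have no loadingFinished entry yet, so the encoded total may undercount in that case.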

4
votes

@mjs's solution still works perfectly in 2021. You only need to replace:

page._networkManager -> page._frameManager._networkManager

A complete example that worked for me:

const resources = {};
page._client.on('Network.dataReceived', (event) => {
  const request = page._frameManager._networkManager._requestIdToRequest.get(
    event.requestId
  );
  // Skip events with no matching request (avoids calling url() on undefined)
  // and skip inline data: URLs, which are not downloaded.
  if (!request || request.url().startsWith('data:')) {
    return;
  }
  const url = request.url();
  const length = event.dataLength;
  if (url in resources) {
    resources[url] += length;
  } else {
    resources[url] = length;
  }
});

await page.goto('https://stackoverflow.com/questions/48263345/how-can-i-get-the-raw-download-size-of-a-request-using-puppeteer');

const totalUncompressedBytes = Object.values(resources).reduce((a, n) => a + n, 0);
console.log(totalUncompressedBytes);
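
Because the location of the network manager moved between the two answers, a small lookup that tries both paths may make the snippet less brittle. This is only a sketch: findRequestMap is a hypothetical helper, and both paths are undocumented Puppeteer internals that may not exist in other versions.

```javascript
// Returns the private requestId -> request map, trying the two locations
// mentioned in the answers above. Both are undocumented internals, so the
// function returns null when neither is present.
function findRequestMap(page) {
  const manager =
    (page._frameManager && page._frameManager._networkManager) ||
    page._networkManager;
  return (manager && manager._requestIdToRequest) || null;
}
```

A caller would check the result for null before calling .get(event.requestId) on it, and fall back to counting by requestId if the map cannot be found.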

1
vote

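If you are using Puppeteer, you already have Node on the server side... so why not pipe the request through a stream (or streams) and then calculate the content size?

There is also https://github.com/watson/request-stats

In addition, you may want to call page.waitForNavigation, since you may be running into async timing issues.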
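
The streaming idea can be sketched as follows, assuming the resource is fetched separately in Node and exposed as a readable stream (for example an http.IncomingMessage). The helper name countStreamBytes is hypothetical. It counts body bytes only, so headers are not included.

```javascript
// Counts the bytes delivered by any Node.js readable stream and resolves
// with the total once the stream ends.
async function countStreamBytes(readable) {
  let total = 0;
  for await (const chunk of readable) {
    // Buffers report their size in bytes via .length; strings need an
    // explicit byte count because .length counts UTF-16 code units.
    total += typeof chunk === 'string' ? Buffer.byteLength(chunk) : chunk.length;
  }
  return total;
}
```

Note that this measures a second download made by Node rather than the traffic of the browser page itself, so the numbers may differ from what DevTools shows.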


-1
votes
// Collects the rendered width and height (in pixels) of every <img> on the page.
const images_width = await page.$$eval('img', imgs => [].map.call(imgs, img => img.width));
const images_height = await page.$$eval('img', imgs => [].map.call(imgs, img => img.height));