Unable to take a screenshot of a page


I'm trying to take a screenshot of the page https://przegladarka-ekw.ms.gov.pl/eukw_prz/KsiegiWieczyste/wyszukiwanieKW, but all I get is a message saying that my request was rejected.

I'm using the following code:

const puppeteer = require('puppeteer');         // Require Puppeteer module
const url = "https://przegladarka-ekw.ms.gov.pl/eukw_prz/KsiegiWieczyste/wyszukiwanieKW";           // Set website you want to screenshot
const Screenshot = async () => {                // Define Screenshot function
  const browser = await puppeteer.launch({headless: false, slowMo: 250});    // Launch a "browser"
  const page = await browser.newPage();         // Open a new page
  await page.goto(url);                         // Go to the website

  await page.screenshot({                       // Screenshot the website using defined options
    path: "./screenshot.png",                   // Save the screenshot in current directory
    fullPage: true                              // Take a full-page screenshot
  });
  await page.close();                           // Close the page
  await browser.close();                        // Close the browser
}
Screenshot();                                   // Call the Screenshot function

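In a regular browser the page is blank at first, refreshes after a few seconds, and then displays normally, so I tried to wait out the redirects, but that doesn't work either: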

const puppeteer = require('puppeteer');         // Require Puppeteer module
const url = "https://przegladarka-ekw.ms.gov.pl/eukw_prz/KsiegiWieczyste/wyszukiwanieKW";           // Set website you want to screenshot
const Screenshot = async () => {                // Define Screenshot function
  const browser = await puppeteer.launch({headless: false, slowMo: 250});    // Launch a "browser"
  const page = await browser.newPage();         // Open a new page
  await page.goto(url);                         // Go to the website

  page.on('console', msg => console.log('PAGE LOG:', msg.text()));    // Forward page console output to Node
  await page.evaluate(() => console.log(`url is ${location.href}`));  // Log the URL the page ended up on

  // Wait (with no time limit) for the search field to become visible
  await page.waitForSelector('#numerKsiegiWieczystej', { visible: true, timeout: 0 });
  await page.screenshot({                       // Screenshot the website using defined options
    path: "./screenshot.png",                   // Save the screenshot in current directory
    fullPage: true                              // Take a full-page screenshot
  });
  await page.close();                           // Close the page
  await browser.close();                        // Close the browser
}
Screenshot();                                   // Call the Screenshot function
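
A side note on the wait above: with timeout: 0 the script never gives up, so when the page is blocked it just hangs and produces no screenshot at all. Below is a minimal sketch of a bounded wait that saves whatever the page shows either way. The helper name screenshotWhenReady and the 30-second default are assumptions for illustration; it only relies on page.waitForSelector, page.url and page.screenshot.

```javascript
// Hypothetical helper (not part of Puppeteer): wait for `selector` for at most
// `timeoutMs`, then take a screenshot whether or not the selector appeared.
// Returns true if the selector showed up, false otherwise.
async function screenshotWhenReady(page, selector, path, timeoutMs = 30000) {
  let ready = true;
  try {
    await page.waitForSelector(selector, { visible: true, timeout: timeoutMs });
  } catch (err) {
    // Most likely a timeout; any other waitForSelector failure lands here too.
    ready = false;
    console.error(`"${selector}" not visible after ${timeoutMs} ms at ${page.url()}: ${err.message}`);
  }
  // Capture the page in either case, so a rejection page can be inspected.
  await page.screenshot({ path, fullPage: true });
  return ready;
}
```

It would be called in place of the waitForSelector and screenshot calls, for example: await screenshotWhenReady(page, '#numerKsiegiWieczystej', './screenshot.png');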

node.js puppeteer
1 Answer

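The page has a bot-detection script:

window["bobcmn"]

So add these to the dependencies in your package.json (alongside puppeteer itself, which puppeteer-extra wraps):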

"puppeteer-extra": "^3.3.6",
"puppeteer-extra-plugin-stealth": "^2.11.2",

Then the screenshot code would be:


const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
puppeteer.use(StealthPlugin());   // Register the stealth plugin before launching

const url = 'https://przegladarka-ekw.ms.gov.pl/eukw_prz/KsiegiWieczyste/wyszukiwanieKW';
let browser;                      // Declared out here so .finally() can close it

(async () => {
    browser = await puppeteer.launch({headless: true});   // Assign, don't redeclare
    const page = await browser.newPage();

    async function screenshot(url) {
        // timeout: 0 disables the navigation timeout
        await page.goto(url, {waitUntil: 'networkidle2', timeout: 0});
        await page.waitForSelector('div.content');
        await page.screenshot({path: './screenshot.png', fullPage: true});
        await page.close();
    }

    await screenshot(url);

})().catch(err => console.error(err)).finally(() => browser?.close());   // Always close the browser
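
If the site still blocks the request, the script above will happily save a screenshot of the rejection page. A minimal sketch of a guard against that follows. The helper name assertNotRejected is hypothetical, and the 'Request Rejected' marker is an assumption; replace it with whatever text the blocked page actually contains. It uses only page.content() and page.url().

```javascript
// Hypothetical helper: throw if the loaded page looks like a rejection page.
// `marker` is an assumed string; adjust it to the text the site really returns.
async function assertNotRejected(page, marker = 'Request Rejected') {
  const html = await page.content();   // Full HTML of the current document
  if (html.includes(marker)) {
    throw new Error(`Blocked at ${page.url()}: page contains "${marker}"`);
  }
}
```

Calling await assertNotRejected(page); right after page.goto(...) makes the run fail loudly through the existing .catch() instead of producing a misleading screenshot.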
