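I am trying to take a screenshot of the page https://przegladarka-ekw.ms.gov.pl/eukw_prz/KsiegiWieczyste/wyszukiwanieKW but all I get is a message saying that my request was rejected.
I am using the following code: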
const puppeteer = require('puppeteer'); // Require Puppeteer module
const url = "https://przegladarka-ekw.ms.gov.pl/eukw_prz/KsiegiWieczyste/wyszukiwanieKW"; // Set website you want to screenshot
const Screenshot = async () => { // Define Screenshot function
  const browser = await puppeteer.launch({headless: false, slowMo: 250}); // Launch a "browser"
  const page = await browser.newPage(); // Open a new page
  await page.goto(url); // Go to the website
  await page.screenshot({ // Screenshot the website using defined options
    path: "./screenshot.png", // Save the screenshot in current directory
    fullPage: true // take a fullpage screenshot
  });
  await page.close(); // Close the website
  await browser.close(); // Close the browser
}
Screenshot(); // Call the Screenshot function
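In a normal browser the page is blank at first, reloads after a few seconds, and then displays correctly, so I tried to wait through the redirects until the form appears, but that does not work either: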
const puppeteer = require('puppeteer'); // Require Puppeteer module
const url = "https://przegladarka-ekw.ms.gov.pl/eukw_prz/KsiegiWieczyste/wyszukiwanieKW"; // Set website you want to screenshot
const Screenshot = async () => { // Define Screenshot function
  const browser = await puppeteer.launch({headless: false, slowMo: 250}); // Launch a "browser"
  const page = await browser.newPage(); // Open a new page
  await page.goto(url); // Go to the website
  page.on('console', msg => console.log('PAGE LOG:', msg.text()));
  await page.evaluate(() => console.log(`url is ${location.href}`));
  await page.waitForSelector('#numerKsiegiWieczystej', { visible: true, timeout: 0 });
  await page.screenshot({ // Screenshot the website using defined options
    path: "./screenshot.png", // Save the screenshot in current directory
    fullPage: true // take a fullpage screenshot
  });
  await page.close(); // Close the website
  await browser.close(); // Close the browser
}
Screenshot(); // Call the Screenshot function
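Since the page shows up blank and then reloads itself, a query that is in flight during the reload can fail because the page it was running in has gone away. A minimal sketch of a polling wait that tolerates such failures is below. The name pollUntil and its timeoutMs and intervalMs options are hypothetical, not Puppeteer APIs, and this on its own will not get past a bot check; it only makes the waiting step more forgiving.

```javascript
// Hypothetical helper (not part of Puppeteer): calls `check` repeatedly
// until it returns a truthy value or the deadline passes.
// Errors thrown by `check` are treated as "not ready yet", because a page
// that reloads itself can invalidate a query that is already running.
async function pollUntil(check, { timeoutMs = 30000, intervalMs = 500 } = {}) {
  const deadline = Date.now() + timeoutMs;
  let lastError = null;
  while (Date.now() < deadline) {
    try {
      const result = await check();
      if (result) return result; // found what we were waiting for
    } catch (err) {
      lastError = err; // remember why the last attempt failed
    }
    // Sleep before the next attempt
    await new Promise(resolve => setTimeout(resolve, intervalMs));
  }
  const reason = lastError ? `: ${lastError.message}` : '';
  throw new Error(`pollUntil timed out after ${timeoutMs} ms${reason}`);
}
```

With Puppeteer this could be called as `await pollUntil(() => page.$('#numerKsiegiWieczystej'))`, since `page.$` resolves to null when nothing matches the selector.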
The page has a bot-detection script:
window["bobcmn"]
so add these to the dependencies in your package.json:
"puppeteer-extra": "^3.3.6",
"puppeteer-extra-plugin-stealth": "^2.11.2",
Then the screenshot code would be:
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
puppeteer.use(StealthPlugin()); // Register the stealth plugin before launching

const url = 'https://przegladarka-ekw.ms.gov.pl/eukw_prz/KsiegiWieczyste/wyszukiwanieKW';
let browser; // Declared outside so the finally handler can close it

(async () => {
  browser = await puppeteer.launch({headless: true}); // Assign to the outer variable instead of shadowing it
  const page = await browser.newPage();
  await page.goto(url, {waitUntil: 'networkidle2', timeout: 0}); // timeout: 0 disables the navigation timeout
  await page.waitForSelector('div.content'); // Wait for the page content to be present
  await page.screenshot({path: './screenshot.png', fullPage: true});
})().catch(err => console.error(err)).finally(() => browser?.close()); // Always close the browser, even on error
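To tell whether a run was still blocked, one option is to inspect the HTML the script actually received. The sketch below is only a heuristic under stated assumptions: that the blocked response mentions bobcmn (the marker named above) and that the real page contains the numerKsiegiWieczystej field id used in the question. The function name looksBlocked is hypothetical.

```javascript
// Hypothetical helper: returns true when the HTML looks like the bot-check
// response rather than the real search form.
// Assumptions: the bot-check response mentions "bobcmn", and the real page
// contains the field id "numerKsiegiWieczystej" from the question.
function looksBlocked(html) {
  const hasChallengeMarker = html.includes('bobcmn');
  const hasSearchField = html.includes('numerKsiegiWieczystej');
  // Treat the page as blocked only if the marker is present and the
  // search field is missing, in case the real page also loads the script.
  return hasChallengeMarker && !hasSearchField;
}
```

It could be called as `looksBlocked(await page.content())` right before taking the screenshot, and the result logged so that a rejected request is not mistaken for a successful capture.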