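I'm using Puppeteer to control a website with a lookup form that can return either a result or a "No records found" message. How can I tell which one was returned? waitForSelector seems to wait for only one selector at a time, and waitForNavigation doesn't seem to work because the result is returned via Ajax. I'm using try/catch, but it is hard to get right and it slows everything down.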
try {
  await page.waitForSelector(SELECTOR1, {timeout: 1000});
} catch (err) {
  await page.waitForSelector(SELECTOR2);
}
You can use querySelectorAll together with waitForFunction to solve this. Passing all the selectors as one comma-separated list returns every node that matches any of them.
await page.waitForFunction(() =>
  document.querySelectorAll('Selector1, Selector2, Selector3').length
);
Note that this only resolves once at least one matching element is present; it does not tell you which selector matched which elements.
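If you also need to know which selector matched, one option is to have the polling function return the selector string itself instead of a count. The following is a minimal sketch under that assumption; waitForFirstSelector is a hypothetical helper name, not part of Puppeteer, and it reports the first selector in list order that currently has a match.

```javascript
// Sketch: resolve with the first selector (in list order) that has a match.
// `waitForFirstSelector` is a hypothetical helper name, not a Puppeteer API.
async function waitForFirstSelector(page, selectors, options = {}) {
  const handle = await page.waitForFunction(
    // Runs in the browser: return the matching selector string,
    // or false so that Puppeteer keeps polling.
    (sels) => sels.find((sel) => document.querySelector(sel)) || false,
    options,
    selectors
  );
  // The handle wraps a plain string, so it can be serialized back to Node.
  return handle.jsonValue();
}

// Possible usage inside an async function:
// const matched = await waitForFirstSelector(page, [SELECTOR1, SELECTOR2], {timeout: 10000});
// if (matched === SELECTOR2) { /* "No records found" branch */ }
```

Because the comparison happens on a returned string, no second waitForSelector call is needed to find out which branch was taken.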
How about using Promise.race(), as I did in the snippet below? And don't forget the { visible: true } option of page.waitForSelector().
public async enterUsername(username: string): Promise<void> {
  // Whichever selector becomes visible first wins the race.
  // (A bare .catch() with no handler does nothing, so it is omitted here.)
  const un = await Promise.race([
    this.page.waitForSelector(selector_1, { timeout: 4000, visible: true }),
    this.page.waitForSelector(selector_2, { timeout: 4000, visible: true }),
  ]);
  await un.focus();
  await un.type(username);
}
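One caveat with Promise.race() is that it settles on the first promise to settle, including a rejection, so a waiter that fails early can win the race. A possible variation, sketched below, uses Promise.any() (available in Node 15 and later), which only rejects when every waiter has failed. The helper name firstVisible and its return shape are assumptions made for illustration.

```javascript
// Sketch: wait for whichever selector appears first and report which one it was.
// `firstVisible` is a hypothetical helper name; `page` is assumed to be a Puppeteer Page.
async function firstVisible(page, selectors, timeout = 4000) {
  const attempts = selectors.map((selector) =>
    page
      .waitForSelector(selector, { timeout, visible: true })
      // Tag the handle with its selector so the caller knows which one won.
      .then((handle) => ({ selector, handle }))
  );
  // Promise.any fulfills with the first success and rejects with an
  // AggregateError only if every attempt fails (for example, all time out).
  return Promise.any(attempts);
}

// Possible usage inside an async function:
// const { selector, handle } = await firstVisible(page, [selector_1, selector_2]);
// await handle.focus();
```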
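I think the best way to approach this is from a more CSS-based perspective. waitForSelector appears to follow the CSS selector list rules, so you can match multiple CSS selectors simply by separating them with commas.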
try {
  await page.waitForSelector('.selector1, .selector2', {timeout: 1000})
} catch (error) {
  // handle error
}
Following Md. Abu Taher's suggestion, I ended up with this:
// One of these SELECTORs should appear, we don't know which
await page.waitForFunction((sel) => {
  return document.querySelectorAll(sel).length;
}, {timeout: 10000}, SELECTOR1 + ", " + SELECTOR2);
// Now see which one appeared:
try {
  await page.waitForSelector(SELECTOR1, {timeout: 10});
  // SELECTOR1 found
} catch (err) {
  // Check for "not found"
  let ErrMsg = await page.evaluate((sel) => {
    let element = document.querySelector(sel);
    return element ? element.innerHTML : null;
  }, SELECTOR2);
  if (ErrMsg) {
    // SELECTOR2 found
  } else {
    // Neither found, try adjusting timeouts until you never get this...
  }
}
In Puppeteer, you can simply use multiple selectors separated by commas, like this:
const foundElement = await page.waitForSelector('.class_1, .class_2');
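The returned value is an ElementHandle for the first matching element found in the page.
Next, if you want to know which element was found, you can get its class name like this: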
const className = await page.evaluate(el => el.className, foundElement);
In your case, code similar to this should work:
const foundElement = await page.waitForSelector([SELECTOR1, SELECTOR2].join(','));
const responseMsg = await page.evaluate(el => el.innerText, foundElement);
if (responseMsg == "No records found") {
  // Your code here
}
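Comparing class names or inner text works when those values are known in advance. A more general sketch, assuming only that the selectors are valid CSS, is to ask the returned element which selector it satisfies through the standard Element.matches() DOM method. The helper name waitForAnyAndIdentify is hypothetical.

```javascript
// Sketch: wait for any of the selectors, then ask the element which one it satisfies.
// `waitForAnyAndIdentify` is a hypothetical helper name, not a Puppeteer API.
async function waitForAnyAndIdentify(page, selectors, options = {}) {
  const handle = await page.waitForSelector(selectors.join(','), options);
  // Element.matches() runs in the browser and tests the element against one selector.
  const selector = await handle.evaluate(
    (el, sels) => sels.find((sel) => el.matches(sel)) || null,
    selectors
  );
  return { selector, handle };
}

// Possible usage inside an async function:
// const { selector } = await waitForAnyAndIdentify(page, [SELECTOR1, SELECTOR2]);
// if (selector === SELECTOR2) { /* "No records found" branch */ }
```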
I had a similar issue and went with this simple solution:
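helpers.waitForAnySelector = (page, selectors) => new Promise((resolve, reject) => {
  let hasFound = false
  selectors.forEach(selector => {
    // page.waitFor() was removed from later Puppeteer releases;
    // page.waitForSelector() is the replacement for selector strings.
    page.waitForSelector(selector)
      .then(() => {
        // Only the first selector to appear resolves the outer promise
        if (!hasFound) {
          hasFound = true
          resolve(selector)
        }
      })
      .catch((error) => {
        // console.log('Error while looking up selector ' + selector, error.message)
      })
  })
})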
Then use it:
const selector = await helpers.waitForAnySelector(page, [
  '#inputSmsCode',
  '#buttonLogOut'
])
if (selector === '#inputSmsCode') {
  // We need to enter the 2FA sms code.
} else if (selector === '#buttonLogOut') {
  // We successfully logged in
}
Taking Promise.race() a step further, wrap it and check the returned index to drive the follow-up logic:
// TypeScript
export async function racePromises(promises: Promise<any>[]): Promise<number> {
  const indexedPromises: Array<Promise<number>> = promises.map(
    (promise, index) =>
      new Promise<number>((resolve) => promise.then(() => resolve(index)))
  );
  return Promise.race(indexedPromises);
}
// JavaScript
export async function racePromises(promises) {
  const indexedPromises = promises.map(
    (promise, index) =>
      new Promise((resolve) => promise.then(() => resolve(index)))
  );
  return Promise.race(indexedPromises);
}
Usage:
const navigationOutcome = await racePromises([
  page.waitForSelector('SELECTOR1'),
  page.waitForSelector('SELECTOR2')
]);
if (navigationOutcome === 0) {
  // logic for 'SELECTOR1'
} else if (navigationOutcome === 1) {
  // logic for 'SELECTOR2'
}
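One thing to watch with this wrapper: promise.then(() => resolve(index)) has no rejection handler, so when a losing waitForSelector later times out, Node reports an unhandled rejection (which, depending on the Node version, may terminate the process), and if every promise rejects the wrapper never settles. A possible variant that handles both cases is sketched below; raceToIndex is a hypothetical name.

```javascript
// Sketch: resolve with the index of the first promise to fulfill.
// Rejections are observed so they do not surface as unhandled, and the
// returned promise rejects only once every input promise has rejected.
// `raceToIndex` is a hypothetical helper name.
function raceToIndex(promises) {
  return new Promise((resolve, reject) => {
    let failures = 0;
    promises.forEach((promise, index) => {
      promise.then(
        () => resolve(index),
        (err) => {
          failures += 1;
          // Reject with the last error when nothing can fulfill anymore.
          if (failures === promises.length) reject(err);
        }
      );
    });
  });
}

// Possible usage inside an async function:
// const outcome = await raceToIndex([
//   page.waitForSelector('SELECTOR1'),
//   page.waitForSelector('SELECTOR2'),
// ]);
```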
If you want to wait for the first of multiple selectors and get the matched element(s), you can start with waitForFunction:
const matches = await page.waitForFunction(() => {
  const matches = [...document.querySelectorAll(YOUR_SELECTOR)];
  return matches.length ? matches : null;
});
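waitForFunction returns a single handle that wraps the whole array (a JSHandle), not an array of ElementHandles. If you only need native DOM methods, there is no need to get individual handles. For example, to get the text from this array: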
const contents = await matches.evaluate(els => els.map(e => e.textContent));
In other words, matches behaves much like the array Puppeteer passes to $$eval.
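On the other hand, if you really do need an array of handles, the following demo code performs the conversion and shows the handles being used normally: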
const puppeteer = require("puppeteer"); // ^16.2.0
const html = `
<!DOCTYPE html>
<html>
<head>
<style>
h1 {
  display: none;
}
</style>
</head>
<body>
<script>
setTimeout(() => {
  // add initial batch of 3 elements
  for (let i = 0; i < 3; i++) {
    const h1 = document.createElement("button");
    h1.textContent = \`first batch #\${i + 1}\`;
    h1.addEventListener("click", () => {
      h1.textContent = \`#\${i + 1} clicked\`;
    });
    document.body.appendChild(h1);
  }
  // add another element 1 second later to show it won't appear in the first batch
  setTimeout(() => {
    const h1 = document.createElement("h1");
    h1.textContent = "this won't be found in the first batch";
    document.body.appendChild(h1);
  }, 1000);
}, 3000); // delay before first batch of elements are added
</script>
</body>
</html>
`;
let browser;
(async () => {
  browser = await puppeteer.launch({headless: true});
  const [page] = await browser.pages();
  await page.setContent(html);
  const matches = await page.waitForFunction(() => {
    const matches = [...document.querySelectorAll("button")];
    return matches.length ? matches : null;
  });
  const length = await matches.evaluate(e => e.length);
  const handles = await Promise.all([...Array(length)].map((e, i) =>
    page.evaluateHandle((m, i) => m[i], matches, i)
  ));
  await handles[1].click(); // show that the handles work
  const contents = await matches.evaluate(els => els.map(e => e.textContent));
  console.log(contents);
})()
  .catch(err => console.error(err))
  .finally(() => browser?.close())
;
Unfortunately, it's a bit verbose, but this can be made into a helper.
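As a rough illustration of such a helper, the conversion step from the demo can be pulled into its own function. This is only a sketch: splitArrayHandle is a hypothetical name, and arrayHandle is assumed to be the handle returned by page.waitForFunction() when the page function returns an array.

```javascript
// Sketch: turn a handle that wraps an in-page array into an array of per-item handles.
// `splitArrayHandle` is a hypothetical helper name; `arrayHandle` is assumed to be
// the handle returned by page.waitForFunction() for an array result.
async function splitArrayHandle(page, arrayHandle) {
  // Ask the browser how many items the array holds.
  const length = await arrayHandle.evaluate((arr) => arr.length);
  // Request one handle per index; the requests run concurrently.
  return Promise.all(
    Array.from({ length }, (_, i) =>
      page.evaluateHandle((arr, index) => arr[index], arrayHandle, i)
    )
  );
}

// Possible usage inside an async function:
// const handles = await splitArrayHandle(page, matches);
// await handles[1].click();
```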
If you're interested in integrating the {visible: true} option, see also "Wait for first visible among multiple elements matching selector".
/**
* @typedef {import('puppeteer').ElementHandle} PuppeteerElementHandle
* @typedef {import('puppeteer').Page} PuppeteerPage
*/
/**
 * Handler invoked with the element that matched an outcome's selector.
 * @callback OutcomeHandler
 * @async
 * @param {PuppeteerElementHandle} element matched element
 * @returns {Promise<*>} can return anything; the value is passed back through handlePossibleOutcomes
 */
/**
* @typedef {Object} PossibleOutcome
* @property {string} selector The selector to trigger this outcome
* @property {OutcomeHandler} handler handler will be called if selector is present
*/
/**
 * Waits for a number of selectors (outcomes) on a Puppeteer page and calls the handler of the first one found.
 * Outcomes should be ordered by preference: if several are present, only the handler of the first
 * matching outcome in the list is called.
 * @param {PuppeteerPage} page Puppeteer page object
 * @param {PossibleOutcome[]} outcomes each possible selector, and the handler you'd like called.
 * @returns {Promise<*>} returns the result from the outcome handler
 */
async function handlePossibleOutcomes(page, outcomes)
{
  var outcomeSelectors = outcomes.map(outcome => {
    return outcome.selector;
  }).join(', ');
  // page.waitFor() was removed from later Puppeteer releases;
  // page.waitForSelector() is the replacement for selector strings.
  return page.waitForSelector(outcomeSelectors)
    .then(_ => {
      let awaitables = [];
      outcomes.forEach(outcome => {
        // Named "lookup" rather than "await", which is a reserved word in async code
        let lookup = page.$(outcome.selector)
          .then(element => {
            if (element) {
              return [outcome, element];
            }
            return null;
          });
        awaitables.push(lookup);
      });
      return Promise.all(awaitables);
    })
    .then(checked => {
      let found = null;
      checked.forEach(check => {
        if (!check) return; // selector not present
        if (found) return;  // an earlier outcome already matched
        let outcome = check[0];
        let element = check[1];
        let p = outcome.handler(element);
        found = p;
      });
      return found;
    });
}
To use it, just call it with an array of possible outcomes and their selectors/handlers:
await handlePossibleOutcomes(page, [
  {
    selector: '#headerNavUserButton',
    handler: element => {
      console.log('Logged in', element);
      loggedIn = true;
      return true;
    }
  },
  {
    selector: '#email-login-password_error',
    handler: element => {
      console.log('password error', element);
      return false;
    }
  }
]).then(result => {
  if (result) {
    console.log('Logged in!', result);
  } else {
    console.log('Failed :(');
  }
})
I use Puppeteer and ran into the same problem, so I wanted to write a custom function that covers the same use case.
The function is as follows:
async function waitForMySelectors(selectors, page){
  for (let i = 0; i < selectors.length; i++) {
    await page.waitForSelector(selectors[i]);
  }
}
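The first parameter is the array of selectors; the second is the page on which to wait. Note that the function waits for each selector in turn, so it returns only after all of them have appeared. Call it as in the following example: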
var SelectorsArray = ['#username', '#password'];
await waitForMySelectors(SelectorsArray, page);
Although I haven't run any tests on it, it looks functional.
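Because the loop awaits each selector before starting the next wait, the individual timeouts can add up. If the goal is still to wait for all of the selectors, a possible alternative is to start every wait at once with Promise.all(), as in this sketch. The name waitForAllSelectors is hypothetical, and like the loop above it waits for every selector, not just the first one.

```javascript
// Sketch: wait for every selector concurrently instead of one after another.
// `waitForAllSelectors` is a hypothetical helper name.
function waitForAllSelectors(page, selectors, options = {}) {
  // All waits start immediately; the result is an array of element handles
  // in the same order as `selectors`. It rejects as soon as any wait fails.
  return Promise.all(
    selectors.map((selector) => page.waitForSelector(selector, options))
  );
}

// Possible usage inside an async function:
// const [username, password] = await waitForAllSelectors(page, ['#username', '#password']);
```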
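For certain types of errors, Puppeteer uses specific error classes. These classes are available via require('puppeteer/Errors').
List of supported classes: TimeoutError
An example of handling a timeout error: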
const {TimeoutError} = require('puppeteer/Errors');
// ...
try {
  await page.waitForSelector('.foo');
} catch (e) {
  if (e instanceof TimeoutError) {
    // Do something if this is a timeout.
  }
}
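Applied to the original question, the same idea can be wrapped so that a timeout becomes an ordinary null result while any other error still propagates. The sketch below is an assumption-laden illustration: waitForSelectorOrNull is a hypothetical name, and it checks the error's name property so that it does not depend on a particular import path; the instanceof TimeoutError check is the stricter option when the class is available.

```javascript
// Sketch: resolve to null on a timeout, rethrow anything else.
// `waitForSelectorOrNull` is a hypothetical helper name. Checking `err.name`
// is an assumption made to avoid depending on an import path; prefer
// `err instanceof TimeoutError` when the class is available.
async function waitForSelectorOrNull(page, selector, timeout = 1000) {
  try {
    return await page.waitForSelector(selector, { timeout });
  } catch (err) {
    if (err && err.name === 'TimeoutError') {
      return null; // the selector did not appear in time
    }
    throw err; // unrelated failures (closed page, bad selector) still surface
  }
}

// Possible usage inside an async function:
// const result = await waitForSelectorOrNull(page, SELECTOR1, 1000);
// if (result === null) { /* check SELECTOR2 instead */ }
```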