Integrating a search bar using libcurl


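I'm using the curl library to access a website and fetch the results produced by the search bar on that site. Everything happens on the same page with no redirect, so I can't simply request .com/search?=... The search bar is defined as: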

<input type="search" id="search" name="search" placeholder="Search for a series" wire:model.debounce.600ms="query" class="block w-full rounded border border-transparent bg-neutral-800 py-2 pl-10 pr-3 leading-5 text-neutral-300 placeholder-neutral-400 focus:border-neutral-600 focus:bg-neutral-600 focus:text-white focus:outline-none focus:ring-neutral-900 sm:text-sm">

Two identical listeners are attached to it, defined as: [image: imgListeners]

Whenever input is entered, these elements appear below it, with the search results nested deeper inside them: [image: imgAppear]

But no matter what I do, I can't get the search to work (the website itself works fine). I always get an error saying the method is not allowed.

<hr>
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta name="robots" content="noindex,nofollow,noarchive" />
<title>An Error Occurred: Method Not Allowed</title>
<link rel="icon" href="data:image/svg+xml,<svg xmlns=%22http://www.w3.org/2000/svg%22 viewBox=%220 0 128 128%22><text y=%221.2em%22 font-size=%2296%22>❌</text></svg>">
<style>body { background-color: #fff; color: #222; font: 16px/1.5 -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, "Helvetica Neue", Arial, sans-serif; margin: 0; }
.container { margin: 30px; max-width: 600px; }
h1 { color: #dc3545; font-size: 24px; }
h2 { font-size: 18px; }</style>
</head>
<body>
<div class="container">
<h1>Oops! An Error Occurred</h1>
<h2>The server returned a "405 Method Not Allowed".</h2>
<p>
Something is broken. Please let us know what you were doing when this error occurred.
We will fix it as soon as possible. Sorry for any inconvenience caused.
</p>
</div>
<script>(function(){var js = "window['__CF$cv$params']={r:'83bc5ac5bd5b5aec',t:'MTcwMzYyNTQ4OC41NzEwMDA='};_cpo=document.createElement('script');_cpo.nonce='',_cpo.src='/cdn-cgi/challenge-platform/scripts/jsd/main.js',document.getElementsByTagName('head')[0].appendChild(_cpo);";var _0xh = document.createElement('iframe');_0xh.height = 1;_0xh.width = 1;_0xh.style.position = 'absolute';_0xh.style.top = 0;_0xh.style.left = 0;_0xh.style.border = 'none';_0xh.style.visibility = 'hidden';document.body.appendChild(_0xh);function handler() {var _0xi = _0xh.contentDocument || _0xh.contentWindow.document;if (_0xi) {var _0xj = _0xi.createElement('script');_0xj.innerHTML = js;_0xi.getElementsByTagName('head')[0].appendChild(_0xj);}}if (document.readyState !== 'loading') {handler();} else if (window.addEventListener) {document.addEventListener('DOMContentLoaded', handler);} else {var prev = document.onreadystatechange || function () {};document.onreadystatechange = function (e) {prev(e);if (document.readyState !== 'loading') {document.onreadystatechange = prev;handler();}};}})();</script><script defer src="https://static.cloudflareinsights.com/beacon.min.js/v84a3a4012de94ce1a686ba8c167c359c1696973893317" integrity="sha512-euoFGowhlaLqXsPWQ48qSkBSCFs3DPRyiwVu3FjR96cMPx+Fr+gpWRhIafcHwqwCqWS42RZhIudOvEI+Ckf6MA==" data-cf-beacon='{"rayId":"83bc5ac5bd5b5aec","version":"2023.10.0","token":"2b8d18b8cd694f8abcc90dc2b359c7f7"}' crossorigin="anonymous"></script>
</body>
</html>



I just want to get the URLs of the search results and put them into a vector. I'm willing to switch libraries if there is something better.
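To make that goal concrete: once the response body is available as a string, collecting the links could look roughly like the sketch below. It assumes the results show up as ordinary double-quoted `href` attributes in the returned HTML, which has not been verified for this site. `extractLinks` is a hypothetical helper name, and a regular expression is only a stand-in for a real HTML parser.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not part of libcurl): collects the value of every
// double-quoted href attribute found in the given HTML text.
// Assumption: links are written as href="..."; single-quoted or unquoted
// attributes are not handled by this sketch.
std::vector<std::string> extractLinks(const std::string& html) {
    static const std::regex hrefPattern(R"re(href\s*=\s*"([^"]*)")re",
                                        std::regex::icase);
    std::vector<std::string> links;
    for (std::sregex_iterator it(html.begin(), html.end(), hrefPattern), end;
         it != end; ++it) {
        links.push_back((*it)[1].str());  // capture group 1 is the URL text
    }
    return links;
}
```

Calling `extractLinks(responseData)` after the transfer would then give a `std::vector<std::string>` holding whatever link targets the page contains, which still has to be filtered down to the actual search results.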

Here is my current attempt:

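#include <curl/curl.h>
#include <iostream>
#include <string>

// URL is a std::string holding the site's address (its definition is not shown here).

// Appends each chunk libcurl receives to the std::string passed via CURLOPT_WRITEDATA.
size_t WriteCallback(void* contents, size_t size, size_t nmemb, std::string* output) {
    size_t totalSize = size * nmemb;
    output->append(static_cast<char*>(contents), totalSize);
    return totalSize;
}

int main() {
    curl_global_init(CURL_GLOBAL_DEFAULT);

    CURL* curl = curl_easy_init();

    // Setting CURLOPT_POSTFIELDS makes libcurl send the request as a POST.
    std::string searchQuery = "sss";
    std::string postData = "search=" + searchQuery;
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, postData.c_str());

    std::string responseData;
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, WriteCallback);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, &responseData);
    curl_easy_setopt(curl, CURLOPT_USERAGENT, "simple scraper");

    curl_easy_setopt(curl, CURLOPT_URL, URL.c_str());

    CURLcode res = curl_easy_perform(curl);
    if (res != CURLE_OK) {
        // Report transport-level failures; an HTTP 405 still counts as CURLE_OK.
        std::cerr << "curl_easy_perform failed: " << curl_easy_strerror(res) << std::endl;
    }

    curl_easy_cleanup(curl);
    curl_global_cleanup();

    std::cout << "Response:\n" << responseData << std::endl;

    return 0;
}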

I also tried the approach suggested in the duplicate-question hint, but that doesn't work either:

int main() {
    CURL* curl;
    curl_global_init(CURL_GLOBAL_ALL);
    curl = curl_easy_init();

    curl_easy_setopt(curl, CURLOPT_URL, URL.c_str());
    // CURLOPT_POST expects a long, hence 1L; it explicitly requests an HTTP POST.
    curl_easy_setopt(curl, CURLOPT_POST, 1L);
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, "search=sss");
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, WriteCallback);
    curl_easy_setopt(curl, CURLOPT_USERAGENT, "simple scraper");

    // The response body is appended after the initial "none".
    std::string data = "none";
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, &data);
    curl_easy_perform(curl);

    std::cout << data;

    curl_easy_cleanup(curl);
    curl_global_cleanup();

    return 0;
}
html c++ libcurl
1 Answer

You haven't shown the form or its submit button. Judging from the error, I can only guess that this has to be a GET request:

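CURL* curl = curl_easy_init();

std::string responseData;
curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, WriteCallback);
// WriteCallback expects a std::string*, so CURLOPT_WRITEDATA must be set as well.
curl_easy_setopt(curl, CURLOPT_WRITEDATA, &responseData);

// Pass the query as a URL parameter; without POST options libcurl issues a GET.
URL += "?search=" + searchQuery;
curl_easy_setopt(curl, CURLOPT_URL, URL.c_str());

CURLcode res = curl_easy_perform(curl);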
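
The snippet above appends the query to the URL as-is, which is fine for "sss" but breaks as soon as the query contains spaces or characters such as `&`. As a rough sketch of how the query could be percent-encoded first: libcurl ships `curl_easy_escape` for this purpose, but the version below uses only the standard library so that it compiles on its own. `percentEncode` and `buildSearchUrl` are hypothetical helper names, the parameter name `search` is taken from the input's `name` attribute, and whether the site actually answers such a GET request is still a guess.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: percent-encodes every byte that is not an RFC 3986
// "unreserved" character (ASCII letters, digits, '-', '.', '_', '~').
// Assumes the default "C" locale, where std::isalnum only accepts ASCII.
std::string percentEncode(const std::string& value) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : value) {
        if (std::isalnum(c) || c == '-' || c == '.' || c == '_' || c == '~') {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];    // high nibble
            out += hex[c & 0x0F];  // low nibble
        }
    }
    return out;
}

// Hypothetical helper: appends "search=<encoded query>" to a base URL,
// using '&' instead of '?' when the URL already carries a query string.
std::string buildSearchUrl(const std::string& baseUrl, const std::string& query) {
    const char separator = (baseUrl.find('?') == std::string::npos) ? '?' : '&';
    return baseUrl + separator + "search=" + percentEncode(query);
}
```

The result of `buildSearchUrl(URL, searchQuery)` can then be passed to `CURLOPT_URL` in place of the manually concatenated string.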