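Questions tagged wget

GNU Wget is a non-interactive network downloader (it can be called from scripts, cron jobs, terminals without X-Windows support, and so on) that retrieves content from web servers. The name is derived from "World Wide Web" and "get".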

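How to use Python requests to fake a browser visit a.k.a. and generate User Agent? [duplicate]

I want to get the content from this website. If I use a browser like Firefox or Chrome, I get the real website page I want, but if I use the Python Requests package (or the wget command) to get ...

python web-scraping python-requests wget user-agent

Answers: 9, Votes: 0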
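
A minimal sketch of the usual approach to the question above: send a browser-style User-Agent header with the request. It uses the standard library's urllib rather than the Requests package named in the question; the User-Agent string and the helper names `build_browser_request` and `fetch` are assumptions made for illustration.

```python
import urllib.request

# Assumption: an example browser-style User-Agent string; any current one would do.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"


def build_browser_request(url, user_agent=BROWSER_UA):
    """Return a Request object that carries a browser-like User-Agent header."""
    return urllib.request.Request(
        url,
        headers={
            "User-Agent": user_agent,
            "Accept": "text/html,application/xhtml+xml",
        },
    )


def fetch(url):
    """Send the request and return the decoded body (performs network I/O)."""
    with urllib.request.urlopen(build_browser_request(url), timeout=30) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

With the Requests package the same idea would be `requests.get(url, headers={"User-Agent": BROWSER_UA})`. Some sites check more than this one header, so a changed User-Agent may not be enough on its own.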

Wget include/exclude directories

I can't get wget to do what I expect. The directory looks like this:
server:
> d1
-> d2
--> A
---> 2021
----> file_A_2021_1
----> file_A_2021_2
----> file_A_2021_3
---> 2022
---->

regex wget

Answers: 1, Votes: 0

Trying to execute a wget command from azurerm_virtual_machine_extension via Terraform, but curl fails to execute

Trying to execute a wget command from azurerm_virtual_machine_extension via Terraform; however, curl fails to execute. Here is the code: resource "azurerm_virtual_machine_extension" "

azure shell command-line wget terraform-provider-azure

Answers: 1, Votes: 0

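How to set a proxy for wget?

I want to download something with wget through a proxy: HTTP proxy: 127.0.0.1, port: 8080. The proxy does not need a username or password. How can I do this?

linux proxy wget

Answers: 13, Votes: 0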
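
One common way to do this is through the `http_proxy` and `https_proxy` environment variables, which wget reads. The sketch below builds such an environment and hands it to wget from a script; the helper names are hypothetical, and it assumes wget is installed and on PATH.

```python
import os
import subprocess


def proxy_environment(host, port, base_env=None):
    """Copy the environment and add the proxy variables that wget reads."""
    env = dict(os.environ if base_env is None else base_env)
    proxy_url = f"http://{host}:{port}"
    env["http_proxy"] = proxy_url
    env["https_proxy"] = proxy_url
    return env


def download_via_proxy(url, host="127.0.0.1", port=8080):
    """Run wget through the proxy (needs wget on PATH; performs network I/O)."""
    return subprocess.run(
        ["wget", url],
        env=proxy_environment(host, port),
        check=True,  # raise CalledProcessError if wget exits non-zero
    )
```

The same two variables can be exported in the shell before calling wget, or set permanently in ~/.wgetrc with lines such as `use_proxy = on` and `http_proxy = http://127.0.0.1:8080`.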

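How to make a script that moves files from an FTP folder to local storage

I'm trying to make a script that moves (so downloads and then deletes) all files downloaded by a seedbox (put.io in my case) via FTP to a local folder (on a Synology NAS in my case). Note that, obviously, I don't...

curl ftp wget move

Answers: 1, Votes: 0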

Diagnosing a 403 Forbidden error from a wget command

When I try the following code, I get a 403 Forbidden error, and I can't work out why. wget --random-wait --wait 1 --no-directories --user-agent="Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit...

cmd wget http-status-code-403

Answers: 2, Votes: 0

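Configure Python to use the same SSL certificates as wget?

While using detectron2 I ran into SSL issues. However, the SSL issues seem to be related to Python, since they don't occur when using wget. I can't download the weight files with detectron2 h...

python python-requests ssl-certificate wget detectron

Answers: 2, Votes: 0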
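
The excerpt above is cut off, so the actual cause is unknown. As a rough sketch of one possible direction, Python can be told to trust a specific CA bundle file (for example the one the system's wget uses). The function names are hypothetical, and the bundle path is something the reader has to supply; on Debian or Ubuntu it is often /etc/ssl/certs/ca-certificates.crt, but that is an assumption, not something stated in the question.

```python
import ssl
import urllib.request


def make_tls_context(ca_bundle=None):
    """Create a verifying TLS context, optionally trusting one CA bundle file.

    With ca_bundle=None the system default trust store is loaded instead.
    """
    return ssl.create_default_context(cafile=ca_bundle)


def fetch(url, ca_bundle=None):
    """Download url, verifying the server against the chosen CA bundle."""
    context = make_tls_context(ca_bundle)
    with urllib.request.urlopen(url, context=context, timeout=30) as resp:
        return resp.read()
```

Libraries that download through Requests read the `REQUESTS_CA_BUNDLE` environment variable, and the ssl module honours `SSL_CERT_FILE`, so exporting one of those is another route that needs no code change.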

How to `wget` a list of URLs in a text file?

Suppose I have a text file in one location containing hundreds of URLs, e.g. http://url/file_to_download1.gz http://url/file_to_download2.gz http://url/file_to_download3.gz http://url/file_to_download4.gz

text wget

Answers: 6, Votes: 0
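
wget can read its URLs from a file through the `-i` (`--input-file`) option. The sketch below shows that invocation as an argument list, plus a small hypothetical helper for reading the same list in Python; the file name `urls.txt` in the usage note is an assumption.

```python
def read_url_list(text):
    """Return the URLs in a list file, one per line, skipping blank lines."""
    urls = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            urls.append(line)
    return urls


def wget_list_command(list_path):
    """Build the wget invocation that downloads every URL listed in a file."""
    return ["wget", "-i", list_path]
```

For example, `subprocess.run(wget_list_command("urls.txt"), check=True)` would be the scripted form of typing `wget -i urls.txt` in a shell.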

wget can't download - 404 error

I tried to download an image using wget but got an error like the following. --2011-10-01 16:45:42-- http://www.icerts.com/images/logo.jpg Resolving www.icerts.com... 97.74.86.3 Connecting to www.

wget

Answers: 9, Votes: 0

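Wget fails with a certificate error

As part of an automated build, we download some code from GitHub. Minimal example: wget github.com Recently, this command started failing with a certificate error: URL transformed to HTTPS...

ssl wget ubuntu-16.04

Answers: 3, Votes: 0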

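Curl fails after following 50 redirects, but wget works fine

I have an experimental PHP-based web crawler, and I noticed that it can't read some pages; for example, on certain domains curl says it failed after following 50 redirects, but wget reads...

php redirect curl web-crawler wget

Answers: 2, Votes: 0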

HTTP request sent, awaiting response... 400 Bad Request (via wget)

I tried to request a bank URL with wget/Postman, but I get a 400 response: >wget --load-cookies /home/testauto/nxdh/cookies.txt https://www.bundesbank.de/statistic-rmi/StatisticDownload?tsId=BB...

postman http-headers wget request-headers

Answers: 1, Votes: 0

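python wget.download() throws OSError: Address family not supported by protocol

I'm trying to download model weight files from Hugging Face using Python's wget library. What I ran: import wget wget.download("https://huggingface.co/h94/IP-Adapter/resolve/main/models/

python wget

Answers: 1, Votes: 0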

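How to pipe the result of wget to tar

I'm trying to create a command that downloads the latest .gz file and installs it in a container in a Dockerfile. I came up with this, but I can't seem to get it to work. curl -s https://api.github....

bash curl wget tar

Answers: 1, Votes: 0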
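
In a shell, the usual pattern is `wget -qO- URL | tar -xz`: wget writes the archive to standard output and tar unpacks it from the pipe. The sketch below shows the same streaming idea in Python, where tarfile's `"r|gz"` mode reads the archive strictly front to back, as a pipe would. The function names are hypothetical, and it should only be pointed at archives you trust.

```python
import tarfile
import urllib.request


def extract_gzip_tar_stream(fileobj, dest_dir):
    """Unpack a .tar.gz read sequentially from fileobj; return member names."""
    names = []
    # Mode "r|gz" treats the input as a forward-only stream (no seeking),
    # which is what lets this work directly on a network response.
    with tarfile.open(fileobj=fileobj, mode="r|gz") as archive:
        for member in archive:
            archive.extract(member, path=dest_dir)
            names.append(member.name)
    return names


def download_and_extract(url, dest_dir):
    """Stream a remote .tar.gz straight into dest_dir (performs network I/O)."""
    with urllib.request.urlopen(url, timeout=60) as resp:
        return extract_gzip_tar_stream(resp, dest_dir)
```

Nothing is written to disk except the extracted files, which mirrors what the pipe does in the shell version.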

Getting page output with curl --fail

Calling curl without parameters, I get the page output even when the HTTP status code is 404:

    $ curl http://www.google.com/linux
    <!DOCTYPE html>
    <html lang=en>
    <meta charset=utf-8>
    <meta name=viewport content="initial-scale=1, minimum-scale=1, width=device-width">
    <title>Error 404 (Not Found)!!1</title>
    <style>
    *{margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px}* > body{background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px}p{margin:11px 0 22px;overflow:hidden}ins{color:#777;text-decoration:none}a img{border:0}@media screen and (max-width:772px){body{background:none;margin-top:0;max-width:none;padding-right:0}}#logo{background:url(//www.google.com/images/errors/logo_sm_2.png) no-repeat}@media only screen and (min-resolution:192dpi){#logo{background:url(//www.google.com/images/errors/logo_sm_2_hr.png) no-repeat 0% 0%/100% 100%;-moz-border-image:url(//www.google.com/images/errors/logo_sm_2_hr.png) 0}}@media only screen and (-webkit-min-device-pixel-ratio:2){#logo{background:url(//www.google.com/images/errors/logo_sm_2_hr.png) no-repeat;-webkit-background-size:100% 100%}}#logo{display:inline-block;height:55px;width:150px}
    </style>
    <a href=//www.google.com/><span id=logo aria-label=Google></span></a>
    <p><b>404.</b> <ins>That’s an error.</ins>
    <p>The requested URL <code>/linux</code> was not found on this server. <ins>That’s all we know.</ins>
    $ echo $?
    0

The status code is 0. Calling it with --fail shows no output:

    $ curl --fail http://www.google.com/linux
    curl: (22) The requested URL returned error: 404 Not Found
    $ echo $?
    22

Now the status code is 22...

I would like to get the output even when the HTTP status is 404 or 500 (as in the first curl run) and, at the same time, get a different system error (as in the second curl run, where $? = 22). Is this possible with curl? If not, how can I achieve it with another tool (the tool must accept file uploads and POST data! wget does not seem to be an alternative...)

This is now possible with curl. Since version 7.76.0 you can do

    curl --fail-with-body ...

This is exactly what the OP asked for: it shows the document body and exits with code 22. See https://curl.se/docs/manpage.html#--fail-with-body

First, the maximum value of an error code (or exit code) is 255. Here is the reference. Also, --fail will not let you do what you are looking for. However, you can handle the case another way (by writing a shell script), though I'm not sure it will work for you!

    http_code=$(curl -s -o out.html -w '%{http_code}' http://www.google.com/linux;)
    if [[ $http_code -eq 200 ]]; then
      exit 0
    fi
    ## decide which status you want to return for 404 or 500
    exit 204

Now check $? and you will get the exit code from there. You will find the response HTML in the out.html file. You can also pass the URL to the script as a command-line argument. Check here.

Unfortunately, this is not possible with curl. But you can do it with wget.

    $ wget --content-on-error -qO- http://httpbin.org/status/418

        -=[ teapot ]=-

           _...._
         .'  _ _ `.
        | ."` ^ `". _,
        \_;`"---"`|//
          |       ;/
          \_     _/
            `"""`
    $ echo $?
    8

Thanks to @timaschew, here is my enhanced version based on pure awk:

    curl_fail_with_body() {
      curl -o - -w "\n%{http_code}\n" "$@" | awk '{l[NR] = $0} END {for (i=1; i<=NR-1; i++) print l[i]}; END{ if ($0<200||$0>299) exit $0 }'
    }
    # example usage
    curl_fail_with_body -sS http://httpbin.org/status/418

Explanation:
-o - -w "\n%{http_code}\n" - prints to stdout (in effect it is piped to the next command), with the status code at the end
{l[NR] = $0} END {for (i=1; i<=NR-1; i++) print l[i]} - prints all lines except the last one
END{ if ($0<200||$0>299) exit $0 } - exits with a non-zero code if the last line is not 2xx

Alternative version, if you want the error code printed after the command output:

    END{ if ($0<200||$0>299) {print "The requested URL returned error: " $0; exit 1} }

By the way, curl supports the --fail-with-body option as of v7.76.0. This option lets you achieve the desired behavior without external tools.

I found a solution, since wget is not suitable for sending multipart/form-data:

    curl -o - -w "\n%{http_code}\n" http://httpbin.org/status/418 | tee >(tail -n 1 | cmp <(echo 2xx) - ) | tee >(grep "char 2"; echo $? > status-code) && grep 0 status-code

Explanation:
-o - -w "\n%{http_code}\n" - prints to stdout (in effect it is piped to the next command), with the status code at the end
tee - the output is piped to the next command and additionally printed to stdout
tail -n 1 - extracts the status code from the last line
cmp <(echo 2xx) - - compares the status code, first character only
grep "char 2" - requires the first character to be 2; otherwise it fails

In a shell script you could also make a better comparison (currently it only allows 2xx, so redirects such as 300 are treated as errors, given how cmp is used above).

Here is my solution - it uses jq and assumes the body is JSON:

    # this code adds a statusCode field to the json it receives and then jq squeezes them together
    # curl 7.76.0 will have curl --fail-with-body and thus eliminate all this
    local result
    result=$(
      curl -sL -w ' { "statusCode": %{http_code}} ' -X POST "${headers[@]}" "${endpoint}" \
        -d "${body}" "$curl_opts" | jq -ren '[inputs] | add'
    )
    # always output the result
    echo "${result}"
    # jq -e will produce an error code if the expression result is false or null - thus resulting in a
    # error return code from this function naturally. This is much preferred rather than assume/hardcode
    # the existence of a error object in the body payload
    echo "${result}" | jq -re '.statusCode >= 200 and .statusCode < 300' > /dev/null

http curl http-post wget postfile

Answers: 6, Votes: 0
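
For comparison, the behavior asked for (always print the body, but signal HTTP errors through the exit code) can be sketched with Python's standard library as well. This is not taken from any of the answers; the function names are hypothetical, and the exit code 22 is simply borrowed from curl's convention.

```python
import sys
import urllib.error
import urllib.request


def fetch_keep_body(url, data=None):
    """Return (status, body) for any HTTP response, including 4xx and 5xx."""
    request = urllib.request.Request(url, data=data)  # bytes in data make it a POST
    try:
        with urllib.request.urlopen(request, timeout=30) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # HTTPError doubles as a response object, so the error body is readable.
        return err.code, err.read()


def exit_code_for(status):
    """Mimic curl --fail-with-body: 22 for HTTP errors (status >= 400), else 0."""
    return 22 if status >= 400 else 0


def main(url):
    """Print the body whatever the status, then return the exit code to use."""
    status, body = fetch_keep_body(url)
    sys.stdout.buffer.write(body)
    return exit_code_for(status)
```

A script would end with `sys.exit(main(sys.argv[1]))`. Connection failures (DNS, refused connections) raise `urllib.error.URLError` and are deliberately left unhandled here.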

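How can we download a specific folder from a GitLab repository using curl or wget?

I know how to download a file from a GitLab repository, but I haven't found anything about downloading a directory. I only need to download a single directory.

curl gitlab wget gitlab-api

Answers: 4, Votes: 0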
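
One option worth checking against your GitLab version is the repository archive endpoint of the REST API, which in recent releases accepts a `path` parameter that limits the archive to a subdirectory. The sketch below only builds that URL; the project path, folder, and branch used in the example are placeholders, and the helper name is hypothetical.

```python
from urllib.parse import quote, urlencode


def gitlab_folder_archive_url(project_path, folder, ref="main",
                              host="https://gitlab.com", fmt="tar.gz"):
    """Build an archive URL restricted to one folder of a GitLab project.

    project_path is the "group/project" path; it must be URL-encoded
    (slashes included) when used in place of the numeric project ID.
    """
    project_id = quote(project_path, safe="")
    query = urlencode({"path": folder, "sha": ref})
    return f"{host}/api/v4/projects/{project_id}/repository/archive.{fmt}?{query}"
```

The resulting URL can be passed to curl or wget; for a private project a token header such as `PRIVATE-TOKEN: <your token>` is needed as well.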

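crosstool-ng fails to fetch the Linux tarball

I'm trying to build a toolchain with crosstool-ng. I've got it all set up and selected my CPU as described at http://crosstool-ng.org/#download_and_usage, and I'm at the step where I build my toolset...

linux wget crosstool-ng

Answers: 4, Votes: 0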

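How to use wget without preserving the date?

When I run wget, I want the file saved to the filesystem to carry the download date, i.e. now, rather than the date on the server. So when I do: ll -ltr I want the file to show up at...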
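
wget has a `--no-use-server-timestamps` option that leaves the local file stamped with the download time. Where the file has already been fetched, resetting its timestamp afterwards has the same effect; the sketch below does that with a hypothetical helper.

```python
import os


def stamp_with_download_time(path):
    """Set the file's access and modification times to the current time."""
    # os.utime with no times argument uses "now", like the touch command.
    os.utime(path)
```

After this call the file sorts as the newest entry in a listing ordered by modification time.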