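I wrote a VBA script that uses an XML HTTP request to scrape the breadcrumbs from a web page. The script works fine with the ActiveX component MSXML2.XMLHTTP.6.0, but it fails badly when I switch to MSXML2.serverXMLHTTP.6.0.
Because I plan to use a proxy in the script, I need to stick with MSXML2.serverXMLHTTP.6.0. However, the second script does not work. For reference, when I print .responseText I see garbled content like this: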
??GN?!v:h??_??Og<]?????X ?6??'o??F??6 ?uh????x?r???????sP??????????[B??k????]??????yC????'???L???????,*?Z????? ?vX ?c?q\t?j??????K?|???P 7??k?y?<;?>????a?*P1????w???[?T?/f?? ?7?gn??V<E?Z??6t:??1??????E'v?1?? ?w??+??????-aD????wy?
Using MSXML2.XMLHTTP.6.0 (works perfectly):
Option Explicit
Sub GrabInfo()
    Const Url$ = "https://www.amazon.com/gp/product/B00FQT4LX2?th=1"
    Dim oHttp As Object, Html As HTMLDocument, breadCrumbs$
    Set Html = New HTMLDocument
    Set oHttp = CreateObject("MSXML2.XMLHTTP.6.0")
    With oHttp
        .Open "GET", Url, True
        .setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36"
        .send
        While .readyState < 4: DoEvents: Wend
        MsgBox "Status code: " & .Status
        Html.body.innerHTML = .responseText
        breadCrumbs = Html.querySelector("#wayfinding-breadcrumbs_feature_div")
        MsgBox breadCrumbs
    End With
End Sub
Using MSXML2.serverXMLHTTP.6.0 (throws the error "Object variable or With block variable not set"):
Option Explicit
Sub GrabInfo()
    Const Url$ = "https://www.amazon.com/gp/product/B00FQT4LX2?th=1"
    Dim oHttp As Object, Html As HTMLDocument, breadCrumbs$
    Set Html = New HTMLDocument
    Set oHttp = CreateObject("MSXML2.serverXMLHTTP.6.0")
    With oHttp
        .Open "GET", Url, True
        .setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36"
        .send
        While .readyState < 4: DoEvents: Wend
        MsgBox "Status code: " & .Status
        Html.body.innerHTML = .responseText
        breadCrumbs = Html.querySelector("#wayfinding-breadcrumbs_feature_div")
        MsgBox breadCrumbs
    End With
End Sub
How can I make the second script, the one built on MSXML2.serverXMLHTTP.6.0, work?
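I tried your code and got the same "gibberish" response. However, removing the setRequestHeader statement (the one that sets the User-Agent) fixed the problem and the response became readable, whereas using a different user agent (the one suggested here) produced a strange response.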
Note that Html.querySelector returns an object, not a string, so you should use:
Dim breadCrumbs As Object
Set breadCrumbs = Html.querySelector("#wayfinding-breadcrumbs_feature_div")
MsgBox breadCrumbs.innerHTML
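Putting both points together, the second script might look roughly like the sketch below. It keeps the structure of the original code, omits the User-Agent header, and declares breadCrumbs as an Object. The Is Nothing check and the commented-out proxy line are my own additions for illustration: the proxy address is a placeholder, not something from the question, and I have not verified that the page always returns a readable response without the header.

```vba
Option Explicit

' Requires a reference to "Microsoft HTML Object Library" for HTMLDocument.
Sub GrabInfo()
    Const Url$ = "https://www.amazon.com/gp/product/B00FQT4LX2?th=1"
    Dim oHttp As Object, Html As HTMLDocument, breadCrumbs As Object

    Set Html = New HTMLDocument
    Set oHttp = CreateObject("MSXML2.serverXMLHTTP.6.0")

    With oHttp
        ' Assumption: if a proxy is needed, it could be set here.
        ' The address below is a placeholder; 2 = SXH_PROXY_SET_PROXY.
        ' .setProxy 2, "127.0.0.1:8080"

        .Open "GET", Url, True
        ' No User-Agent header is set here; removing it made the response readable.
        .send
        While .readyState < 4: DoEvents: Wend

        MsgBox "Status code: " & .Status
        Html.body.innerHTML = .responseText
    End With

    ' querySelector returns an element object, or Nothing when no match is found.
    Set breadCrumbs = Html.querySelector("#wayfinding-breadcrumbs_feature_div")
    If breadCrumbs Is Nothing Then
        MsgBox "Breadcrumb element not found in the response."
    Else
        MsgBox breadCrumbs.innerHTML
    End If
End Sub
```

Checking for Nothing before reading innerHTML avoids the "Object variable or With block variable not set" error when the response is unreadable or the element is missing.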