由JavaScript生成的VBA抓取内容

问题描述 投票:2回答:2

我正在尝试使用VBA从Excel工作表中定义的URL获取推荐的客户定价信息。这些值在Excel的Cells(i,11)中,都指向https://ark.intel.com上的特定页面。值从第5行开始。

例如,如果要查找Intel Xeon 8268的价格,请导航至https://ark.intel.com/content/www/us/en/ark/products/192481/intel-xeon-platinum-8268-processor-35-75m-cache-2-90-ghz.html。如果查看源代码,很明显,此内容是使用JavaScript生成的,所以我改用Firefox Web浏览器上的“检查元素”选项。

[从这里,我可以向下浏览并在标签中找到想要的内容。见下图:

Recommendd Customer Price for Intel Xeon 8268

我无法捕获该值并将其写入excel列,该列将为E列。以下是我进行的一次尝试:

Sub ProcessorPricing()
    Dim URL As String, lastRow As Long
    Dim XMLHTTP As Object, HTML As Object, objResult As Object, Price As Object

    lastRow = Range("A" & Rows.Count).End(xlUp).row

    Dim cookie As String
    Dim result_cookie As String

    For i = 5 To lastRow

        If Cells(i, 1) <> "" Then

            URL = Cells(i, 11)

            Set XMLHTTP = CreateObject("MSXML2.serverXMLHTTP")
            XMLHTTP.Open "GET", URL, False
            XMLHTTP.setRequestHeader "Content-Type", "text/xml"
            XMLHTTP.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0"
            XMLHTTP.send

            Set HTML = CreateObject("htmlfile")
            HTML.body.innerHTML = XMLHTTP.responseText

            Set objResult = html.getElementsByID("bladeInside")
            Set Price = objResult.getElementsByTagName("span")(0)

            Cells(i, 5) = Price.Value
            DoEvents
        End If
    Next
End Sub

任何帮助将不胜感激。

PS-我也尝试过在https://www.myonlinetraininghub.com/web-scraping-with-vba处找到的代码也无济于事

更新:

能够在您的帮助下使所有工作正常进行。谢谢Bertrand Martel和Stavros Jon。

这里是整个脚本:

Sub UpdateProcessorInfo()
'requirements:  JSON Parser installation needs to be added to project - https://github.com/VBA-tools/VBA-JSON - (Download latest release -> Import JsonConverter.bas -> File -> Import File)
'requirements:  Windows only, include Reference to "Microsoft Scripting Runtime" (Tools -> References -> Check Microsoft Scripting Runtime)
'requirements:  Add a refernce to Microsoft WinHTTP Services 5.1.  (Tools -> References -> Check Microsoft WinHTTP Services 5.1)

Dim Connection As WorkbookConnection
Dim url As String, lastRow As Long
Dim XMLHTTP As Object, html As Object, objResultDiv As Object, link As Object
Dim cookie As String
Dim result_cookie As String
Dim req As New WinHttpRequest
Dim ids As String
Dim responseJSON As Object

For Each Connection In ThisWorkbook.Connections
    Connection.Refresh
Next Connection

Worksheets("Processor_DB_Intel").Range("A2:A1000").Copy
Worksheets("Processor Comparisons").Range("A5").PasteSpecial Paste:=xlPasteValues

lastRow = Range("A" & Rows.Count).End(xlUp).row

Range("k5:k300").Clear

For i = 5 To lastRow

    If Cells(i, 1) <> "" Then

        url = "https://www.google.com/search?q=" & "site:ark.intel.com " & Cells(i, 1) & "&rnd=" & WorksheetFunction.RandBetween(1, 10000)

        Set XMLHTTP = CreateObject("MSXML2.serverXMLHTTP")
        XMLHTTP.Open "GET", url, False
        XMLHTTP.setRequestHeader "Content-Type", "text/xml"
        XMLHTTP.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0"
        XMLHTTP.send

        Set html = CreateObject("htmlfile")
        html.body.innerHTML = XMLHTTP.responseText
        Set objResultDiv = html.getElementById("rso")
        Set link = objResultDiv.getElementsByTagName("a")(0)

        Cells(i, 11) = link
        DoEvents
    End If

Next

lastRow = Range("A" & Rows.Count).End(xlUp).row

For i = 5 To lastRow

    ids = Cells(i, 13)
    url = "https://ark.intel.com/libs/apps/intel/support/ark/recommendedCustomerPrice?ids=" & ids & "&siteName=ark"

    If Cells(i, 1) <> "" Then

        With req
            .Open "GET", url, False
            .send
            Set responseJSON = JsonConverter.ParseJson(.responseText)
        End With

        On Error Resume Next

        'Debug.Print responseJSON(1)("displayPrice")
        Cells(i, 14) = responseJSON(1)("displayPrice")

    End If
Next

结束子

我正在尝试使用VBA从Excel工作表中定义的URL获取推荐的客户定价信息。这些值在Excel中的Cells(i,11)中,它们都指向特定的...

javascript excel vba web-scraping screen-scraping
2个回答
3
投票

AS @Bertrand Martel指出,有一个非常方便的API,可以用来获取所需的信息。


1
投票

您已经注意到数据不是嵌入在html中,而是使用外部JSON API通过Javascript加载的:

© www.soinside.com 2019 - 2024. All rights reserved.