Troubleshooting VBA errors when web-scraping monthly UK bond data from investing.com


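I am trying to scrape monthly UK bond data from investing.com for every maturity from 1 month to 50 years. For example, the 5-year data can be found at https://uk.investing.com/rates-bonds/uk-5-year-bond-yield-historical-data (change the "5-year" part of the URL for a different year/month maturity).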
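The URL substitution described above can be sketched as a small helper. BondHistoryUrl is a hypothetical name, and the sketch assumes every maturity follows the same pattern as the 5-year URL, which the question states but which has not been checked for each maturity.

```vba
Option Explicit

' Hypothetical helper: builds the historical-data URL for one maturity,
' e.g. tenor = "5-year" or "1-month".
' Assumption: all maturities share the URL pattern of the 5-year page.
Public Function BondHistoryUrl(ByVal tenor As String) As String
    BondHistoryUrl = "https://uk.investing.com/rates-bonds/uk-" & _
                     LCase$(Trim$(tenor)) & "-bond-yield-historical-data"
End Function
```

Calling BondHistoryUrl("5-year") returns the URL quoted above, so a loop over a list of maturity strings could reuse the same request code for each one.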

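My knowledge of XML and HTML is very limited, so I have been borrowing heavily from @Erjon's very useful program, but I keep running into HTTP 403 errors and a VBA type mismatch (run-time error 13) when assigning the HTML element.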

I tried to adapt that code to pull the 5-year bond data, as follows:

Option Explicit
Sub Export_Table()

'Html Objects---------------------------------------'
 Dim htmlDoc As MSHTML.HTMLDocument
 Dim htmlBody As MSHTML.htmlBody
 Dim ieTable As MSHTML.HTMLTable
 Dim Element As MSHTML.HTMLTableRow


'Workbooks, Worksheets, Ranges, LastRow, Incrementers ----------------'
 Dim wb As Workbook
 Dim Table As Worksheet
 Dim i As Long

 Set wb = ThisWorkbook
 Set Table = wb.Worksheets("Sheet1")

 '-------------------------------------------'
 Dim xmlHttpRequest As New MSXML2.XMLHTTP60  '
 '-------------------------------------------'


 i = 2

'Web Request --------------------------------------------------------------------------'
 With xmlHttpRequest
     .Open "GET", "https://uk.investing.com/rates-bonds/uk-5-year-bond-yield-historical-data", False
     .setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
     .setRequestHeader "X-Requested-With", "XMLHttpRequest"
     .setRequestHeader "user-agent", "Chrome/99.0.4844.74"

     .send "curr_id=951681&smlID=200239&st_date=01%2F01%2F2017&end_date=03%2F01%2F2019&interval_sec=Monthly&sort_col=date&sort_ord=DESC&action=historical_data"

     If .Status = 200 Then

         Set htmlDoc = CreateHTMLDoc
         Set htmlBody = htmlDoc.body

         htmlBody.innerHTML = xmlHttpRequest.responseText

         Set ieTable = htmlDoc.getElementsByClassName("datatable_table__D_jso datatable_table--border__B_zW0 datatable_table--mobile-basic__W2ilt datatable_table--freeze-column__7YoIE")

         For Each Element In ieTable.getElementsByTagName("tr")
             Table.Cells(i, 1) = Element.Children(0).innerText
             Table.Cells(i, 2) = Element.Children(1).innerText
             Table.Cells(i, 3) = Element.Children(2).innerText
             Table.Cells(i, 4) = Element.Children(3).innerText
             Table.Cells(i, 5) = Element.Children(4).innerText
             Table.Cells(i, 6) = Element.Children(5).innerText
             Table.Cells(i, 7) = Element.Children(6).innerText

             i = i + 1
             DoEvents
         Next Element
     End If
 End With


Set xmlHttpRequest = Nothing
Set htmlDoc = Nothing
Set htmlBody = Nothing
Set ieTable = Nothing
Set Element = Nothing

End Sub

Public Function CreateHTMLDoc() As MSHTML.HTMLDocument
    Set CreateHTMLDoc = CreateObject("htmlfile")
End Function

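The problem I am currently facing is that I cannot set the "ieTable" variable, because doing so raises run-time error 13. In Erjon's original code (linked above), he pulls the data from "https://www.investing.com/instruments/HistoricalDataAjax", but if I try that, I keep getting the HTTP error

403: Access Forbidden.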
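On the run-time error 13: getElementsByClassName returns a collection of elements rather than a single table, so assigning its result to a variable declared As MSHTML.HTMLTable is a likely cause of the type mismatch. A minimal sketch of one way around it follows. FindTableByClass is a hypothetical helper, not part of Erjon's code, and it assumes the page served to VBA actually contains a table whose class attribute includes the given fragment. The class names in the question look auto-generated and may change.

```vba
Option Explicit

' Hypothetical helper: returns the first <table> whose class attribute
' contains classFragment, or Nothing when no table matches.
' Requires a reference to the Microsoft HTML Object Library.
Public Function FindTableByClass(ByVal doc As MSHTML.HTMLDocument, _
                                 ByVal classFragment As String) As MSHTML.HTMLTable
    Dim candidates As MSHTML.IHTMLElementCollection
    Dim candidate As Object

    ' getElementsByTagName also returns a collection, so iterate over it
    ' instead of assigning it directly to a table variable.
    Set candidates = doc.getElementsByTagName("table")

    For Each candidate In candidates
        If InStr(1, candidate.className, classFragment, vbTextCompare) > 0 Then
            Set FindTableByClass = candidate
            Exit Function
        End If
    Next candidate
End Function
```

In the code above this could replace the failing line with Set ieTable = FindTableByClass(htmlDoc, "datatable_table"), followed by a check such as If ieTable Is Nothing Then before looping over the rows. Matching on a fragment of the class name is a design choice that tolerates changes to the generated suffixes.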

If anyone could point out where I am going wrong, that would be great - thanks!

excel xml vba parsing web-scraping
1 Answer

I used Erjon's original code (Web scraping in Investing.com using Excel VBA) and ran into the same problem you mention: 403: Access Forbidden. Is there a solution?
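A 403 status means the server refused the request. The thread does not establish why, and server-side filtering of automated clients is only an assumption here. The code in the question also does nothing visible when the status is not 200, which makes the failure harder to diagnose. The sketch below is not a fix for the 403. It only reports what the server returned. FetchStatus is a hypothetical helper name.

```vba
Option Explicit

' Hypothetical diagnostic helper: sends a synchronous GET, stores the
' response text in body, and returns the HTTP status code.
' Requires a reference to Microsoft XML, v6.0.
Public Function FetchStatus(ByVal url As String, ByRef body As String) As Long
    Dim req As New MSXML2.XMLHTTP60

    req.Open "GET", url, False   ' False = wait for the response
    req.send

    body = req.responseText
    FetchStatus = req.Status

    ' Write the outcome to the Immediate window (Ctrl+G in the VBA editor).
    Debug.Print url & " -> " & req.Status & " " & req.statusText
End Function
```

Calling it with the HistoricalDataAjax URL and with the historical-data page URL shows which of the two endpoints is refusing the request, and the returned body can be inspected to see whether the table is present in the HTML at all.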
