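I want to import some data from https://www.amfiindia.com/nav-history-download. The page has a link, "Download Complete NAV Report in Text Format", that gives me the data I need. The link is not static, though, so I can't hard-code it in VBA to download the data. How can I use Excel to download data from a hyperlink on a web page?
My approach is to read the hyperlink into a variable first and then use that variable to fetch the data.
However, when I send a request to the hyperlink I get a "BAD REQUEST" response, and I don't know why. This is the code I'm using: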
Sub GrabLastNames()
'dimension (set aside memory for) our variables
Dim objIE As InternetExplorer
Dim ele As Object
Dim y As Integer
Dim mtbl As String
Dim request As Object
Dim response As String
Dim html As New HTMLDocument
Dim website As String
Dim price As Variant
Dim cellAddress As String
Dim rowNumber As Long
'start a new browser instance
Set objIE = New InternetExplorer
'make browser visible
objIE.Visible = True
'navigate to page with needed data
objIE.navigate "https://www.amfiindia.com/nav-history-download"
'wait for page to load
Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
' ht.querySelector(".nav-hist-dwnld a").href
'we will output data to excel, starting on row 1
y = 1
mtbl = objIE.document.querySelector(".nav-hist-dwnld a").href
' mtbl = Sheets("Sheet1").Range("A" & y).Value
' Website to go to.
' website = mtbl
' Create the object that will make the webpage request.
Set request = CreateObject("MSXML2.XMLHTTP")
' Where to go and how to go there - probably don't need to change this.
request.Open "GET", mtbl, False
' Get fresh data.
request.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
' Send the request for the webpage.
request.send
' MsgBox "bye"
' Get the webpage response data into a variable.
response = request.responseText
' Put the webpage into an html object to make data references easier.
'html.body.innerHTML = response
MsgBox "Hi"
' MsgBox "Bye Bye"
' Get the price from the specified element on the page.
Sheets("Sheet1").Range("A" & y + 1).Value = "Hi"
MsgBox response
'look at all the 'tr' elements in the 'table' with id 'myTable',
'and evaluate each, one at a time, using 'ele' variable
ActiveWorkbook.Save
End Sub
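The response variable should contain all the data from the site, but the MsgBox shows "Bad Request" instead.
You set request headers with the .setRequestHeader method.
These are the request headers I see: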
GET /spages/NAVAll.txt?t=06052020095056 HTTP/1.1
Host: www.amfiindia.com
Connection: keep-alive
Cache-Control: max-age=0
DNT: 1
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
Sec-Fetch-Site: none
Sec-Fetch-Mode: navigate
Sec-Fetch-User: ?1
Sec-Fetch-Dest: document
Accept-Encoding: gzip, deflate, br
Accept-Language: en-GB,en-US;q=0.9,en;q=0.8
Cookie: __utma=57940026.1471746098.1588710696.1588710696.1588710696.1; __utmc=57940026; __utmz=57940026.1588710696.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)
If-None-Match: "0d8e9bad223d61:0"
If-Modified-Since: Wed, 06 May 2020 18:18:24 GMT
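You are unlikely to need all of these, but User-Agent is the one most likely to be required, since a missing User-Agent often leads to a 403. For example: .setRequestHeader "User-Agent", "Mozilla/5.0"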
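To illustrate the suggestion, here is a minimal sketch of the request with a User-Agent header added. The procedure name DownloadNavReport is hypothetical, and the URL is assembled from the Host and GET lines in the captured headers, assuming the same https scheme as the page. Its t= value changes over time, so in practice you would pass in the href read from the page. Whether this header alone clears the error on this site is not confirmed here, so the sketch prints the status instead of assuming success.

```vba
Sub DownloadNavReport()
    ' Hypothetical procedure name; a sketch, not a verified fix.
    Dim request As Object
    Dim url As String

    ' Assumption: URL built from the captured headers above.
    ' The t= query value is time-dependent, so replace this with the
    ' href obtained from the page when running for real.
    url = "https://www.amfiindia.com/spages/NAVAll.txt?t=06052020095056"

    Set request = CreateObject("MSXML2.XMLHTTP")
    request.Open "GET", url, False

    ' Headers must be set after Open and before send.
    request.setRequestHeader "User-Agent", "Mozilla/5.0"
    request.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
    request.send

    If request.Status = 200 Then
        ' Show only the start of the report; the full text is large.
        Debug.Print Left$(request.responseText, 500)
    Else
        ' Report the HTTP status so a 400 or 403 is visible.
        Debug.Print "Request failed: " & request.Status & " " & request.statusText
    End If
End Sub
```

Checking request.Status before reading responseText makes it easier to tell a rejected request from an empty report. If the status is still an error, adding the other captured headers one at a time is a reasonable next step.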