Tweaking a function that extracts a table from a web page so it extracts only a single element, then repeats


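I have a slightly odd project I'd like to get started on. Essentially, I have a tool that creates a sprawling spreadsheet of the entire inventory of a specific section of the warehouse. It lists each item's location, inventory status, and item ID ("ASIN", basically a barcode and dummy reference in our internal system). The problem is that it doesn't list "velocity" (a metric of how many units we sell in a week), and I'd like that metric printed next to every item ID so I can work out which items aren't selling and send them to the warehouse's long-term storage area.

I've found another tool that pulls a table of information about a single item ID from our internal wiki ("FCresearch"), and that table happens to contain this specific metric. I'd like to take only the item's velocity from that table (essentially the number at this location: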

/html/body/div[2]/div/div[1]/div/div[1]/div/div[2]/div/div/div[2]/table/tbody/tr[19]/td 

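on the web page), and then tweak this macro so that it acts on the ASINs in the table created by the first tool, prints each velocity into the adjacent cell, then moves down one row and repeats for all ~4000 entries until it hits a blank cell.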

Here is the full relevant function:

Sub getFCresearch()
    Dim A As Object, H As Object, D As Object, C As Object, asin$, B$, F$
    Dim x&, t&
    Set C = CreateObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
    Set D = CreateObject("HTMLFile")
    Set A = CreateObject("New:{00000566-0000-0010-8000-00AA006D2EA4}")
    Set H = CreateObject("WinHTTP.WinHTTPRequest.5.1")
    H.SetAutoLogonPolicy 0

    'passes badge
    H.Open "GET", "https://hrwfs.amazon.com/?Operation=empInfoByUid&ContentType=JSON&employeeUid=" & Environ("USERNAME")
    H.send

    DoEvents

    B = Split(Split(H.ResponseText, "employeeBarcode"":""")(1), Chr(34))(0)

    H.Open "POST", "http://fcmenu-iad-regionalized.corp.amazon.com/do/login"
    H.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
    H.setRequestHeader "Content-Length", Len("badgeBarcodeId=" & B)
    H.send "badgeBarcodeId=" & B

    DoEvents

    H.Open "GET", "http://fcmenu-iad-regionalized.corp.amazon.com/" & F
    H.send
    DoEvents

    'Needs to derive "asin" variable from adjacent cell
    asin = Sheets("Sheet1").[A1]

    'This gathers the specific item's page on the wiki "FCresearch"
    H.Open "GET", "http://fcresearch-na.aka.amazon.com/DEN3/results/inventory?s=" & asin, False
    H.send

    'This gets the whole table, where I only need one specific element called "velocity" at: /html/body/div[2]/div/div[1]/div/div[1]/div/div[2]/div/div/div[2]/table/tbody/tr[19]/td
    D.body.InnerHTML = H.ResponseText
    C.SetText D.GetElementById("table-inventory").OuterHTML
    C.PutInClipboard

    'This pastes the table to a different sheet, but needs to paste to a cell adjacent to the "asin" variable of each row
    'Before moving down to the next row and repeating the process
    Sheet2.[C:Z].Cells.ClearContents
    Sheet2.[C1].PasteSpecial

    Sheet2.[C:N].WrapText = False
    Sheet2.Columns("C:N").AutoFit
End Sub

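Any help you can offer would be amazing. Sorry that this is such a broad request. I'm pretty new to this and can only tweak minor things in the code, and I can't find documentation anywhere that goes deeper than the .GetElementById function, which doesn't work for HTML elements that have no ID.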

Image of table HTML, + plaintext

 <table data-row-id="1579657885" class="a-keyvalue"><tbody><tr><th>ASIN</th><td><a href="/DEN3/results?s=1579657885">1579657885</a></td></tr><tr><th>Title</th><td><a target="_blank" href="http://www.amazon.com/gp/product/1579657885">1,000 Places to See Before You Die (Deluxe Edition): The World as You've Never Seen It Before</a></td></tr><tr><th>Binding</th><td>Hardcover</td></tr><tr><th>Publisher</th><td></td></tr><tr><th>Vendor Code</th><td>ATSAN</td></tr><tr><th>Weight</th><td>6.45 pounds</td></tr><tr><th>Dimensions</th><td>1.50 x 13.00 x 9.80 IN</td></tr><tr><th>List Price</th><td>USD 50.00</td></tr><tr><th>Expiration Date</th><td class=""></td></tr><tr><th>Asin Demand</th><td><a target="_blank" href="https://ufo.amazon.com/srw14na/asins/place_in_line/1579657885?warehouse=DEN3">Demand for 1579657885</a></td></tr><tr><th>Sortable</th><td>true</td></tr><tr><th>Conveyable</th><td>true</td></tr><tr><th>Very High Value</th><td>false</td></tr><tr><th>Master Case</th><td>false</td></tr><tr><th>FCSku Scope</th><td>FNSKU</td></tr><tr><th>Sales Forecast</th><td>4.0</td></tr><tr><th>Sales History (approx)</th><td>5.0</td></tr><tr><th>Sales Override</th><td>0.0</td></tr><tr><th>ASIN Velocity (approx)</th><td>5.0</td></tr><tr><th>Provenance Value</th><td>UNTRACKED</td></tr><tr><th>Provenance IOG</th><td>Info Not Found</td></tr></tbody></table>
excel vba getelementbyid getelementsbytagname getelementsbyclassname
1 Answer

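Well, here are two ways to get the information you need. If you understand the logic, I believe some combination of these approaches should be enough for you to adapt the code to your needs.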

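For simplicity, I assume the HTML has already been loaded into an HTMLDocument object named D. For demonstration purposes, the value of interest is printed to the Immediate window.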

First, you need to add a reference to the Microsoft HTML Object Library (VBE > Tools > References > ...).

I'll be using the following variables:

Dim table As HTMLTable
Dim tableOfInterest As HTMLTable
Dim row As HTMLTableRow
Dim rowOfInterest As HTMLTableRow
Dim cell As HTMLTableCell
Dim cellOfInterest As HTMLTableCell

Assuming the index of the table, the index of the row, and the index of the cell are always the same and you know them:

Set tableOfInterest = D.getElementsByTagName("table")(0) 'Assuming the table of interest is the first table to appear in the HTML document. Keep in mind indexing starts at zero!
Set rowOfInterest = tableOfInterest.getElementsByTagName("tr")(18) 'Assuming the row of interest is the 19th row in the table.
Set cellOfInterest = rowOfInterest.getElementsByTagName("td")(0) 'Assuming the cell of interest is the 1st cell in the row.
Debug.Print cellOfInterest.innerText

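Assuming you don't know the table and row indices explicitly, but you do know something else about them, such as an attribute or the inner text: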

For Each table In D.getElementsByTagName("table")
    If table.Attributes("data-row-id").Value = "1579657885" Then 'assuming the value of this attribute is always the same
        Set tableOfInterest = table
    End If
Next table

For Each row In tableOfInterest.getElementsByTagName("tr")
    If row.innerText Like "*ASIN Velocity (approx)*" Then 'assuming that's the text you're looking for
        Set rowOfInterest = row
    End If
Next row
Debug.Print rowOfInterest.Cells(1).innerText 'in this case the "th" element is also considered a cell so the cell you're interested in is the 2nd one.
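
To tie this back to the question's "repeat for every ASIN" requirement, here is a rough sketch of how the label-based lookup could be wrapped in a loop. It rests on several assumptions, so treat it as a starting point rather than a finished macro. The names FillVelocities and GetValueByLabel are hypothetical. The ASINs are assumed to sit in column A of "Sheet1" starting at row 1, with column B free for the result. The response at the URL used in the question is assumed to contain the key-value table shown above (if that table is served from a different URL, substitute it). The badge login from getFCresearch is left out and would need to run on the same request object first. The lookup matches on the "ASIN Velocity (approx)" label rather than on data-row-id, because in the sample HTML that attribute appears to hold the ASIN itself and so would differ per item. It uses late binding, as the question's code does, so it does not depend on the library reference.

```vba
' Hypothetical helper: scans every <tr> in doc and returns the text of the
' second cell of the first row whose first cell's text equals label.
' Returns an empty string when no such row is found.
Function GetValueByLabel(ByVal doc As Object, ByVal label As String) As String
    Dim row As Object
    For Each row In doc.getElementsByTagName("tr")
        If row.Cells.Length >= 2 Then
            If Trim$(row.Cells(0).innerText) = label Then
                GetValueByLabel = Trim$(row.Cells(1).innerText)
                Exit Function
            End If
        End If
    Next row
End Function

' Hypothetical driver: walks down column A until the first blank cell and
' writes each item's velocity into column B of the same row.
Sub FillVelocities()
    Dim http As Object, doc As Object
    Dim ws As Worksheet
    Dim r As Long, asin As String

    Set http = CreateObject("WinHTTP.WinHTTPRequest.5.1")
    Set doc = CreateObject("HTMLFile")
    Set ws = ThisWorkbook.Worksheets("Sheet1")   ' assumption: sheet name from the question

    ' Assumption: the badge login from getFCresearch has to happen here,
    ' on this same http object, before the first request (omitted).

    r = 1                                        ' assumption: first ASIN is in A1
    Do While Len(Trim$(ws.Cells(r, "A").Value)) > 0
        asin = Trim$(ws.Cells(r, "A").Value)

        ' Synchronous request, same URL pattern as in the question
        http.Open "GET", "http://fcresearch-na.aka.amazon.com/DEN3/results/inventory?s=" & asin, False
        http.send

        doc.body.innerHTML = http.responseText
        ws.Cells(r, "B").Value = GetValueByLabel(doc, "ASIN Velocity (approx)")

        r = r + 1
    Loop
End Sub
```

Writing an empty string when the label is missing keeps the loop running across all rows, and the blank cells in column B then show which ASINs need a second look.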