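I am trying to rename several headers in my code, which extracts the Team Statistics table from this site, but I am not sure where in the code to change them manually.
For example, I tried manually changing header 8, GF, to GFPG on the line that adds the 'TEAM' header, but I get this error: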
Exception calling "Add" with "2" argument(s): "Item has already been added. Key in dictionary: 'GF'  Key being added: 'GF'" At C:\NHLScraper.ps1:32 char:5 + $objHash.Add($headers[$j],$rowdata[$j])
My code:
$url = "https://www.hockey-reference.com/leagues/NHL_2020.html"
#getting the data
$data = Invoke-WebRequest $url
#grab the third table
$table = $data.ParsedHtml.getElementsByTagName("table") | Select -skip 2 | Select -First 1
#get the rows of the Team Statistics table
$rows = $table.rows
#get table headers
$headers = $rows.item(1).children | select -ExpandProperty InnerText
#count the number of rows
$NumOfRows = $rows | Measure-Object
#Manually injecting TEAM header
$headers = @($headers[0];'TEAM';$headers[1..($headers.Length-1)])
#enumerate the remaining rows (we need to skip the header row) and create a custom object
$out = for ($i=2;$i -lt $NumofRows.Count;$i++) {
    #define an empty hashtable
    $objHash=[ordered]@{}
    #getting the child rows
    $rowdata = $rows.item($i).children | select -ExpandProperty InnerText
    for ($j=0;$j -lt $headers.count;$j++) {
        #add each row of data to the hash table using the correlated table header value
        $objHash.Add($headers[$j],$rowdata[$j])
    }
    #turn the hashtable into a custom object
    [pscustomobject]$objHash
}
$out | Select TEAM,AvAge,GP,W,L,OL,PTS,PTS%,GF,GA,SOW,SOL,SRS,SOS,TG/G,EVGF,EVGA,PP,PPO,PP%,PPA,PPOA,PK%,SH,SHA,PIM/G,oPIM/G,S,S%,SA,SV%,SO -SkipLast 1 | Export-Csv -Path "C:\$((Get-Date).ToString("'NHL Stats' yyyy-MM-dd")).csv" -NoTypeInformation
You can add a condition that checks whether the key has already been added and, if it has, either updates it or ignores it:
if (!$objHash.Contains($headers[$j])) {
    $objHash.Add($headers[$j],$rowdata[$j])
} else {
    $objHash[$headers[$j]] = $rowdata[$j] # Overwrite the existing value
}
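The difference between Add and indexer assignment can be seen in a small sketch. The header and row values below are made up for illustration (they are not scraped from the site); the repeated 'GF' only simulates a header list that contains the same name twice.

```powershell
# Hypothetical sample data; the duplicate 'GF' simulates a repeated header name.
$headers = 'Rk', 'GF', 'GA', 'GF'
$rowdata = '1', '3.10', '2.50', '45'

# Add() throws as soon as a key is added a second time.
$objHash = [ordered]@{}
$errorMessage = $null
try {
    foreach ($j in 0..($headers.Count - 1)) {
        $objHash.Add($headers[$j], $rowdata[$j])
    }
} catch {
    # Keep the message so it can be inspected afterwards.
    $errorMessage = $_.Exception.Message
}

# Indexer assignment does not throw; a repeated key overwrites the earlier value.
$safeHash = [ordered]@{}
foreach ($j in 0..($headers.Count - 1)) {
    $safeHash[$headers[$j]] = $rowdata[$j]
}
```

After this runs, $errorMessage holds the "already been added" error, while $safeHash contains three keys and its 'GF' entry holds the value from the last 'GF' column. Overwriting silently drops the earlier column, so renaming one of the duplicates is usually the safer choice when both columns are needed.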
However, after reading through the code a few more times, the loop itself deserves a closer look:
$out = for ($i=2;$i -lt $NumofRows.Count;$i++) {
    #define an empty hashtable
    $objHash=[ordered]@{} # re-created on every iteration, so each row starts empty
    #getting the child rows
    $rowdata = $rows.item($i).children | select -ExpandProperty InnerText
    for ($j=0;$j -lt $headers.count;$j++) {
        #add each row of data to the hash table using the correlated table header value
        $objHash.Add($headers[$j],$rowdata[$j]) # Add throws if the key already exists
    }
    #turn the hashtable into a custom object
    [pscustomobject]$objHash # emitted as loop output and collected into $out
}
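The loop runs once per row and creates a fresh $objHash each time. Because the loop is assigned to $out, every [pscustomobject] it emits is collected there, so earlier rows are not lost. The hashtables themselves, however, are not kept after each iteration.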
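To check what a loop assigned to a variable actually returns, a minimal sketch with made-up team names (not the scraped data) can be run on its own:

```powershell
# Hypothetical team names standing in for the scraped rows.
$teams = 'Team A', 'Team B', 'Team C'

$out = for ($i = 0; $i -lt $teams.Count; $i++) {
    # A new ordered hashtable is built for each row.
    $objHash = [ordered]@{}
    $objHash['Rk']   = $i + 1
    $objHash['TEAM'] = $teams[$i]

    # Anything written to the output stream inside the loop
    # becomes one element of $out.
    [pscustomobject]$objHash
}

# Inspect the collected objects.
$out | Format-Table
```

In this sketch $out ends up with one object per team, which is why the original script can pipe $out straight into Select and Export-Csv.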
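Suggested solution
You can use another variable to keep track of every hashtable that gets created, and assign values through the indexer so that a duplicate key does not throw an exception.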
# If you want to change the header value from GF to GFPG, you can do that in the place you have defined $headers
#get table headers
$headers = $rows.item(1).children | select -ExpandProperty InnerText
$headers = $headers | % { if ($_ -eq "GF") { "GFPG" } else { $_ }}
#count the number of rows
$NumOfRows = $rows | Measure-Object
#Manually injecting TEAM header
$headers = @($headers[0];'TEAM';$headers[1..($headers.Length-1)])
#enumerate the remaining rows (we need to skip the header row) and create a custom object
$allData = @{}
$out = for ($i=2;$i -lt $NumofRows.Count;$i++) {
    #define an empty hashtable
    $objHash=[ordered]@{}
    #getting the child rows
    $rowdata = $rows.item($i).children | select -ExpandProperty InnerText
    for ($j=0;$j -lt $headers.count;$j++) {
        #add each row of data to the hash table using the correlated table header value
        $objHash[$headers[$j]] = $rowdata[$j]
    }
    #turn the hashtable into a custom object
    [pscustomobject]$objHash
    $allData.Add($i, $objHash)
}
I use $allData, with $i as the key, to store each result so that it can be accessed later.
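Since the question asks about renaming several headers, the single GF check could be generalized with a lookup table. This is only a sketch: $renameMap is a hypothetical name, the header list is a shortened made-up sample, and only the GF to GFPG pair comes from the question (GA to GAPG is purely illustrative).

```powershell
# Shortened, made-up header list for illustration.
$headers = 'Rk', 'AvAge', 'GP', 'GF', 'GA'

# Hypothetical rename map: old header name -> new header name.
$renameMap = @{
    'GF' = 'GFPG'   # from the question
    'GA' = 'GAPG'   # illustrative only
}

# Replace each header that has an entry in the map; leave the rest unchanged.
$headers = $headers | ForEach-Object {
    if ($renameMap.ContainsKey($_)) { $renameMap[$_] } else { $_ }
}

$headers -join ', '
```

Whatever names are chosen, the final Select in the script has to list the new names (for example GFPG instead of GF), otherwise those columns come out empty in the CSV.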