Using Select-String to match multiple patterns on a single line and write them to output

Question  Votes: 1  Answers: 3

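I want to build a simple script that uses regular expressions to match multiple patterns on a single line, working through the entire input file, and writes the results to an output file. But I've hit a wall: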

Sample text:

BMC12345 COMBINED PHASE STATISTICS:  31 ROWS SELECTED FOR SPACE 'KDDT111D.DIH0345S', 0 ROWS SELECTED BUT DISCARDED DUE TBMC123456 COMBINED PHASE STATISTICS:  10 PHYSICAL (10 LOGICAL) RECORDS DISCARDED TO SYSDISC

Here is what I have so far:

$table = [regex] "'.*'"
$discard = [regex] "\d* PHYSICAL"

Select-String -Pattern ($table, $discard) -AllMatches .\test.txt | foreach {
    $_.Matches.Value
} > output.txt

Output:

'KDDT111D.DIH0345S'

Desired output:

'KDDT111D.DIH0345S' 10 Physical

For some reason, I can't get both patterns written to output.txt. Ideally, once I get this working, I'd like to use Export-Csv to get something a bit cleaner, like:

|KDDT111D|DIH0345S|10 Physical|
regex powershell select-string
3 Answers
1
vote

I think you'll find the -match operator better suited to this. [grin] Using named captures against your sample, stored in $InStuff, this ...

$InStuff -match ".+SPACE '(?<Space>.+)\.(?<SubSpace>.+)'.+: (?<Discarded>.+) \(.+"

... gives the following set of matches ...

Name                           Value                                                                              
----                           -----                                                                              
Space                          KDDT111D                                                                           
SubSpace                       DIH0345S                                                                           
Discarded                      10 PHYSICAL                                                                        
0                              BMC12345 COMBINED PHASE STATISTICS: 31 ROWS SELECTED FOR SPACE 'KDDT111D.DIH0345...

The named matches can be addressed via $Matches.<the capture group name>.
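As a minimal sketch of how those named groups might be turned into the pipe-delimited line the question asks for: the sample line and the pattern are taken from above, while the format string and the $pattern and $line variable names are assumptions made for illustration.

```powershell
# Sample line from the question, stored in $InStuff as in the answer above.
$InStuff = "BMC12345 COMBINED PHASE STATISTICS:  31 ROWS SELECTED FOR SPACE 'KDDT111D.DIH0345S', 0 ROWS SELECTED BUT DISCARDED DUE TBMC123456 COMBINED PHASE STATISTICS:  10 PHYSICAL (10 LOGICAL) RECORDS DISCARDED TO SYSDISC"

# Same pattern as in the answer; $pattern is a hypothetical variable name.
$pattern = ".+SPACE '(?<Space>.+)\.(?<SubSpace>.+)'.+: (?<Discarded>.+) \(.+"

if ($InStuff -match $pattern) {
    # Trim() drops the leading space that is captured when the log
    # has two spaces after the colon, as in the sample line.
    $line = '|{0}|{1}|{2}|' -f $Matches.Space, $Matches.SubSpace, $Matches.Discarded.Trim()
    $line   # |KDDT111D|DIH0345S|10 PHYSICAL|
}
```

Note that $Matches is only populated when -match returns $true for a single string, which is why the lookup sits inside the if block.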


1
vote

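You've run into a Select-String limitation: the .Matches property of the [Microsoft.PowerShell.Commands.MatchInfo] objects that Select-String emits for each input object (line) only ever contains the (potentially multiple) matches for the first regex passed to the -Pattern parameter that matched.[1]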

You can work around the problem by passing a single regex instead, built by combining the input regexes via alternation (|):

Select-String -Pattern ($table, $discard -join '|') -AllMatches .\test.txt | 
  ForEach-Object { $_.Matches.Value } > output.txt

A simple example:

# ('f.', '.z' -join '|') -> 'f.|.z'
'foo bar baz' | Select-String -AllMatches ('f.', '.z' -join '|') |
  ForEach-Object { $_.Matches.Value }

The above yields:

fo
az

demonstrating that the matches for both regexes were reported.

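A caveat regarding output order: using alternation (|) causes the matches for a given input string to be reported in the order in which they are found in the input, not in the order in which the regexes were specified. That is, both -Pattern 'f.|.z' and -Pattern '.z|f.' above would result in the same output order.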
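If it matters which of the combined regexes produced a given match, one possible approach (an assumption added here, not part of the original answer) is to wrap each alternative in a named group and check which group succeeded. The group names first and second and the input string 'baz foo' are hypothetical.

```powershell
# Each alternative gets its own named group so the source regex can be identified.
$results = 'baz foo' |
  Select-String -AllMatches -Pattern '(?<first>f.)|(?<second>.z)' |
  ForEach-Object { $_.Matches } |
  ForEach-Object {
    # Only the group belonging to the alternative that matched reports Success.
    $which = if ($_.Groups['first'].Success) { 'first' } else { 'second' }
    '{0}: {1}' -f $which, $_.Value
  }

$results
# second: az
# first: fo
```

The output follows the order of appearance in the input ('az' before 'fo'), even though the f. alternative is listed first in the pattern.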


[1] The problem exists as of Windows PowerShell v5.1 / PowerShell Core 6.2.0-preview.4 and is discussed in this GitHub issue.


0
votes

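Thanks to the contributors for the ideas and the learning experience. I was able to get the desired output by combining the two answers I received.

I found that the -match operator only returned the first occurrence of the regex pattern match from the source file, so I needed to add a foreach loop in order to return matches from across the whole log file.

I also modified the regex to include only discard values greater than 0.

Sample text: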

BMC51472I COMBINED PHASE STATISTICS:  0 ROWS SELECTED FOR SPACE 'KDDT000D.KDAICH0S', 0 ROWS SELECTED BUT DISCARDED DUE TOBMC51479I COMBINED PHASE STATISTICS:  0 PHYSICAL (0 LOGICAL) RECORDS DISCARDED TO SYSDISC
BMC51472I COMBINED PHASE STATISTICS:  3499604 ROWS SELECTED FOR SPACE 'KDDT000D.KDAIND0S', 0 ROWS SELECTED BUT DISCARDED BMC51479I COMBINED PHASE STATISTICS:  0 PHYSICAL (0 LOGICAL) RECORDS DISCARDED TO SYSDISC
BMC51472I COMBINED PHASE STATISTICS:  1 ROWS SELECTED FOR SPACE 'KDDT000D.KDCISR0S', 0 ROWS SELECTED BUT DISCARDED DUE TOBMC51479I COMBINED PHASE STATISTICS:  0 PHYSICAL (0 LOGICAL) RECORDS DISCARDED TO SYSDISC
BMC51472I COMBINED PHASE STATISTICS:  9185775 ROWS SELECTED FOR SPACE 'KDDT000D.KDIADR0S', 0 ROWS SELECTED BUT DISCARDED BMC51479I COMBINED PHASE STATISTICS:  11 PHYSICAL (11 LOGICAL) RECORDS DISCARDED TO SYSDISC
BMC51472I COMBINED PHASE STATISTICS:  0 ROWS SELECTED FOR SPACE 'KDDT000D.KDICHT0S', 0 ROWS SELECTED BUT DISCARDED DUE TOBMC51479I COMBINED PHASE STATISTICS:  0 PHYSICAL (0 LOGICAL) RECORDS DISCARDED TO SYSDISC
BMC51472I COMBINED PHASE STATISTICS:  2387375 ROWS SELECTED FOR SPACE 'KDDT000D.KDICMS0S', 0 ROWS SELECTED BUT DISCARDED BMC51479I COMBINED PHASE STATISTICS:  0 PHYSICAL (0 LOGICAL) RECORDS DISCARDED TO SYSDISC
BMC51472I COMBINED PHASE STATISTICS:  1632821 ROWS SELECTED FOR SPACE 'KDDT000D.KDIPRV0S', 0 ROWS SELECTED BUT DISCARDED BMC51479I COMBINED PHASE STATISTICS:  0 PHYSICAL (0 LOGICAL) RECORDS DISCARDED TO SYSDISC
BMC51472I COMBINED PHASE STATISTICS:  0 ROWS SELECTED FOR SPACE 'KDDT000D.KDLADD0S', 0 ROWS SELECTED BUT DISCARDED DUE TOBMC51479I COMBINED PHASE STATISTICS:  24845 PHYSICAL (24845 LOGICAL) RECORDS DISCARDED TO SYSDISC

Example:

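    # Capture space, subspace and discard count; the trailing part only
    # matches when the LOGICAL count starts with a nonzero digit.
    $regex = ".+SPACE '(?<Space>.+)\.(?<SubSpace>.+)'.+: (?<Discarded>.+) .[1-9][0-9]*\s\b"

    # Date stamp used in the output file name.
    $timestamp = Get-Date -Format "MM_dd_yy"
    $dir = "C:\Users\JonMonJovi"

    # Test every line of every log file; emit one pipe-delimited row per match.
    Get-Content $dir\*.log.txt | Where-Object {
        $_ -match $regex
    } | ForEach-Object {
        $Matches.Space, $Matches.SubSpace, $Matches.Discarded -join "|"
    } > C:\Users\JonMonJovi\Discarded\Discard_Log_$timestamp.txt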

Output:

KDDT000D|KDIADR0S| 11 PHYSICAL
KDDT000D|KDLADD0S| 24845 PHYSICAL

From here I can import the pipe-delimited .txt output file into Excel, which meets my requirements.
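Since the question mentions Export-Csv, here is a hedged sketch of how the same extraction could produce objects instead of joined strings. The tightened regex, the $lines array (three of the sample lines above) and the $rows variable are assumptions for illustration, not the author's script.

```powershell
# Three of the sample lines above: one with zero discards, two with nonzero discards.
$lines = @(
    "BMC51472I COMBINED PHASE STATISTICS:  0 ROWS SELECTED FOR SPACE 'KDDT000D.KDAICH0S', 0 ROWS SELECTED BUT DISCARDED DUE TOBMC51479I COMBINED PHASE STATISTICS:  0 PHYSICAL (0 LOGICAL) RECORDS DISCARDED TO SYSDISC"
    "BMC51472I COMBINED PHASE STATISTICS:  9185775 ROWS SELECTED FOR SPACE 'KDDT000D.KDIADR0S', 0 ROWS SELECTED BUT DISCARDED BMC51479I COMBINED PHASE STATISTICS:  11 PHYSICAL (11 LOGICAL) RECORDS DISCARDED TO SYSDISC"
    "BMC51472I COMBINED PHASE STATISTICS:  0 ROWS SELECTED FOR SPACE 'KDDT000D.KDLADD0S', 0 ROWS SELECTED BUT DISCARDED DUE TOBMC51479I COMBINED PHASE STATISTICS:  24845 PHYSICAL (24845 LOGICAL) RECORDS DISCARDED TO SYSDISC"
)

# Assumed variant of the regex: [1-9]\d* accepts only nonzero PHYSICAL counts,
# and \s+ keeps the extra space after the colon out of the capture.
$regex = "SPACE '(?<Space>[^.']+)\.(?<SubSpace>[^']+)'.*:\s+(?<Discarded>[1-9]\d* PHYSICAL)"

# Build one object per matching line; @() keeps the result an array even for a single match.
$rows = @(foreach ($line in $lines) {
    if ($line -match $regex) {
        [pscustomobject]@{
            Space     = $Matches.Space
            SubSpace  = $Matches.SubSpace
            Discarded = $Matches.Discarded
        }
    }
})

# Render as pipe-delimited CSV text.
$rows | ConvertTo-Csv -Delimiter '|' -NoTypeInformation
```

To write a file rather than print the text, Export-Csv accepts the same -Delimiter and -NoTypeInformation parameters together with a -Path argument.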
