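My goal is to extract the paragraphs of a text file that contain a specific keyword: not just the lines that contain the keyword, but the whole paragraph. The rule imposed on my text files is that every paragraph starts with a specific pattern (e.g. Pa0), and that pattern is used nowhere else in the text except at the start of a paragraph. Each paragraph ends with a newline.
For example, suppose I have the following text: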
Pa0
This is the first paragraph bla bla bla
This is another line in the same paragraph bla bla
This is a third line bla bla
Pa0
This is the second paragraph bla bla bla
Second line bla bla My keyword is here!
bla bla bla
bla
Pa0
Hey, third paragraph bla bla bla!
bla bla
Pa0
keyword keyword
keyword
Another line! bla
My goal is to extract the paragraphs that contain the word "keyword". For example, the output should be:
Pa0
This is the second paragraph bla bla bla
Second line bla bla My keyword is here!
bla bla bla
bla
Pa0
keyword keyword
keyword
Another line! bla
I need the PowerShell equivalent of the following command:
awk '/keyword/' RS="\n\n" ORS="\n\n" input.txt
Read the file in as large chunks with Get-Content -Delimiter, then use Where-Object to keep only the chunks that contain the keyword:
$paragraphs = Get-Content .\input.txt -Delimiter "`n`n" |Where-Object { $_ -like '*keyword*' }
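The command above relies on paragraphs being separated by a blank line, as in the awk version. If the file has no blank lines and only the Pa0 marker separates paragraphs (as in the sample as shown here), a variation along these lines might work instead. This is only a sketch: it assumes PowerShell 3.0 or later (for -Raw), that Pa0 always sits at the start of a line, and that a case-insensitive match on the keyword is acceptable.

# Assumption: paragraphs are identified by a leading 'Pa0' line, not by blank lines.
# Read the whole file as a single string (-Raw needs PowerShell 3.0+).
$text = Get-Content .\input.txt -Raw

# Split just before every line that starts with the marker.
# The lookahead (?=...) is zero-width, so 'Pa0' stays at the top of each chunk.
$paragraphs = $text -split '(?m)(?=^Pa0\b)' |
    Where-Object { $_ -match 'keyword' } |      # keep chunks containing the keyword
    ForEach-Object { $_.TrimEnd() }             # drop the trailing line break

$paragraphs

Note that -split and -match are case-insensitive by default; -csplit and -cmatch are the case-sensitive variants. Also, if the file uses Windows (CRLF) line endings, a "`n`n" delimiter will not match the blank lines, and "`r`n`r`n" would be needed in the Get-Content version.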