I have a text file from which I want to delete some lines. Sample contents of the file:
v1 has output 1.1
v2 has output 10.2
v3 has output 5.4
v4 has output 1.1
v5 has output 10.2
v6 has output 12
------------------
and so on
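Because the values 1.1 and 10.2 above repeat several times, I want to keep the first 10 lines for 1.1, for 10.2, and for every value like them (the values are distinct, and there are hundreds of different numbers), but delete all later repeats, even though the v parameter is different each time. I also want to keep the lines that are not repeated.
I tried sort with uniq, but that only removes exactly identical lines; it cannot filter on a specific condition.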
sort file.txt | uniq -i
It sounds like all you need is:
awk '++cnt[$NF]<11' file
For example, keeping only the first 2 lines per value:
$ cat file
v1 has output 1.1
v2 has output 10.2
v3 has output 5.4
v4 has output 1.1
v5 has output 10.2
v6 has output 12
v7 has output 1.1
v8 has output 10.2
v9 has output 5.4
v10 has output 1.1
v11 has output 10.2
v12 has output 12
$ awk '++cnt[$NF]<3' file
v1 has output 1.1
v2 has output 10.2
v3 has output 5.4
v4 has output 1.1
v5 has output 10.2
v6 has output 12
v9 has output 5.4
v12 has output 12
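To show what the one-liner above is doing, here is a long-form sketch of the same counting idea. The file name sample.txt and the variable names key and max are only illustrative and are not part of the original answer. The last field ($NF) is used as an array key, so values are compared as text (1.1 and 1.10 would be counted separately).

```shell
# Build a small sample file (sample.txt is a hypothetical name).
printf '%s\n' \
  'v1 has output 1.1' \
  'v2 has output 10.2' \
  'v3 has output 5.4' \
  'v4 has output 1.1' \
  'v5 has output 1.1' > sample.txt

# Long form of: awk '++cnt[$NF]<3' sample.txt
result=$(awk -v max=2 '
{
    key = $NF            # last field is the value, e.g. 1.1
    cnt[key]++           # number of lines seen so far with this value
    if (cnt[key] <= max)
        print            # keep only the first "max" lines per value
}' sample.txt)

printf '%s\n' "$result"
```

With max=2 the third 1.1 line (v5) is dropped; set max=10 to keep the first 10 lines per value as asked in the question.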
Here is an awk solution:
awk 'a[$4==1.1 || $4==10.2]++<10 {print;next} !($4==1.1 || $4==10.2)' file
v1 has output 1.1
v2 has output 10.2
v3 has output 5.4
v4 has output 1.1
v5 has output 10.2
v6 has output 12
It prints the first 10 lines whose value is 1.1 or 10.2, counted together rather than per value, plus all other lines.
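If the intent is to cap each listed value separately while leaving every other value untouched, a variant along these lines might work. This is only a sketch: the array name limited, the variable max, and the file name sample.txt are assumptions for illustration, not part of either answer.

```shell
# Build a small sample file (sample.txt is a hypothetical name).
printf '%s\n' \
  'v1 has output 1.1' \
  'v2 has output 10.2' \
  'v3 has output 5.4' \
  'v4 has output 1.1' \
  'v5 has output 5.4' \
  'v6 has output 1.1' \
  'v7 has output 5.4' \
  'v8 has output 10.2' \
  'v9 has output 10.2' > sample.txt

result=$(awk -v max=2 '
# Values that should be capped; referencing an element creates it.
BEGIN { limited["1.1"]; limited["10.2"] }
# Lines with any other value always pass through unchanged.
!($NF in limited) { print; next }
# For capped values, print only the first "max" lines of each value.
++cnt[$NF] <= max
' sample.txt)

printf '%s\n' "$result"
```

Here 5.4 is never capped, while 1.1 and 10.2 each keep their own counter, so v6 and v9 are dropped with max=2. Use max=10 for the limit described in the question.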