Bash - Get the count of a word grouped per second [closed]

Question  Votes: -1  Answers: 2

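Below is a sample of the text file. I need a per-second count of the lines containing the word "Id", grouped by the timestamp that comes before the pipe ("|").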

2019-02-10 12:00:03.448|Id: 26102338
2019-02-10 12:00:03.448|Id: 25941418
2019-02-10 12:00:03.449|Id: 25827373
2019-02-10 12:00:03.449|Id: 26102038
2019-02-10 12:00:03.449|Id: 25929358

2019-02-10 12:00:04.382 | =====================================Start fetching=====================================
2019-02-10 12:00:04.451 |
2019-02-10 12:00:04.426|Id: 25713118
2019-02-10 12:00:04.426|Id: 26076208
2019-02-10 12:00:04.426|Id: 26079643
2019-02-10 12:00:04.426|Id: 26085973
2019-02-10 12:00:04.426|Id: 26090023
2019-02-10 12:00:04.426|Id: 26130133
2019-02-10 12:00:04.426|Id: 25954018
2019-02-10 12:00:04.427|Id: 25951468
2019-02-10 12:00:04.427|Id: 26136148
2019-02-10 12:00:04.427|Id: 26103013
2019-02-10 12:00:04.427|Id: 25806433

I need output like this:

Time               |Count(Id)  
2019-02-10 12:00:03|5    
2019-02-10 12:00:04|11

Can anyone help?

bash shell group-by count centos7
2 Answers
1
vote

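If each line always has the Id at the end, and you don't mind the format being the other way round (count first), this is very simple: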

grep 'Id:' /tmp/data.txt | cut -f 1 -d '.' | uniq -c

   5 2019-02-10 12:00:03   
  11 2019-02-10 12:00:04
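  1. grep discards every line that does not contain 'Id:' (the blank lines and the "Start fetching" banner).
  2. cut takes the part before the dot (i.e. the time without the milliseconds).
  3. uniq -c counts how many times each timestamp occurs in a row.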

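(If the file is not always in time order, you may also need a sort before the uniq, because uniq only merges adjacent identical lines.)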

To swap the two fields and add a pipe, matching the format you asked for, you can pipe the output through sed, like this:

sed -re 's/ +([0-9]+) (.+)/\2|\1/'
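
Putting the steps of this answer together, the whole pipeline might look like the sketch below. The function name count_ids_per_second and the temporary sample file are made up for illustration, and sed -E is used in place of -r (both select extended regular expressions in GNU sed; -E is also accepted by BSD sed). A sort is included on the assumption that timestamps may arrive out of order.

```shell
#!/usr/bin/env bash

# Hypothetical helper (name chosen for this example only).
# $1: path to a log file in the format shown in the question.
count_ids_per_second() {
    grep 'Id:' "$1" |       # keep only the lines that carry an Id
        cut -d '.' -f 1 |   # drop the milliseconds
        sort |              # bring equal timestamps next to each other
        uniq -c |           # prefix each distinct timestamp with its count
        sed -E 's/^ *([0-9]+) (.+)$/\2|\1/'   # "   3 <time>" becomes "<time>|3"
}

# A small sample in the same format, including a blank line,
# a banner line and one out-of-order timestamp.
sample=$(mktemp)
cat > "$sample" <<'EOF'
2019-02-10 12:00:03.448|Id: 26102338
2019-02-10 12:00:03.449|Id: 25827373

2019-02-10 12:00:04.382 | =====Start fetching=====
2019-02-10 12:00:04.426|Id: 25713118
2019-02-10 12:00:04.427|Id: 25951468
2019-02-10 12:00:03.427|Id: 25806433
EOF

count_ids_per_second "$sample"
```

For this sample the function prints 2019-02-10 12:00:03|3 followed by 2019-02-10 12:00:04|2.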

-1
votes

Contents of data.txt:

2019-02-10 12:00:03.448|Id: 26102338
2019-02-10 12:00:03.448|Id: 25941418
2019-02-10 12:00:03.449|Id: 25827373
2019-02-10 12:00:03.449|Id: 26102038
2019-02-10 12:00:03.449|Id: 25929358

2019-02-10 12:00:04.426|Id: 25713118
2019-02-10 12:00:04.426|Id: 26076208
2019-02-10 12:00:04.426|Id: 26079643
2019-02-10 12:00:04.426|Id: 26085973
2019-02-10 12:00:04.426|Id: 26090023
2019-02-10 12:00:04.426|Id: 26130133
2019-02-10 12:00:04.426|Id: 25954018
2019-02-10 12:00:04.427|Id: 25951468
2019-02-10 12:00:04.427|Id: 26136148
2019-02-10 12:00:04.427|Id: 26103013
2019-02-10 12:00:04.427|Id: 25806433

2019-02-10 12:00:03.427|Id: 25806433

Command:

grep 'Id:' data.txt | cut -f 1 -d '.' | sort | uniq -c | awk '{print $2" "$3" | "$1}'

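The sort runs before the count so that out-of-order timestamps (such as the last line of data.txt) are still grouped together.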

Output:

2019-02-10 12:00:03 | 6
2019-02-10 12:00:04 | 11
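
As an alternative to the grep/cut/sort/uniq chain, the same grouping can be done in a single awk pass. This is a different technique from the one used in the answers above, shown only as a minimal sketch: the function name count_ids_awk and the sample file are hypothetical, and it assumes every timestamp has the fixed 19-character "YYYY-MM-DD HH:MM:SS" prefix seen in the question.

```shell
#!/usr/bin/env bash

# Hypothetical helper (name chosen for this example only).
# $1: path to a log file in the format shown in the question.
count_ids_awk() {
    echo 'Time|Count(Id)'
    awk -F'|' '
        $2 ~ /^Id:/ {                  # only lines whose second field starts with Id:
            sec = substr($1, 1, 19)    # timestamp truncated to whole seconds
            count[sec]++
        }
        END {
            for (sec in count) print sec "|" count[sec]
        }
    ' "$1" | sort                      # awk array order is unspecified, so sort the rows
}

# A small sample with a blank line, a banner line and an out-of-order entry.
log=$(mktemp)
cat > "$log" <<'EOF'
2019-02-10 12:00:03.448|Id: 26102338
2019-02-10 12:00:03.449|Id: 25827373

2019-02-10 12:00:04.382 | =====Start fetching=====
2019-02-10 12:00:04.426|Id: 25713118
2019-02-10 12:00:04.427|Id: 25951468
2019-02-10 12:00:03.427|Id: 25806433
EOF

count_ids_awk "$log"
```

Because the counts are kept in an associative array, the input does not need to be sorted first; only the handful of result rows are sorted at the end.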