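I want to create the "turn" column shown in the example data frame below. My real dataset is much larger, with thousands of rows. The column should give the number of the current speaking turn. Consecutive rows from the same speaker count as one turn, even when the sentences sit on different rows. When a speaker talks again after someone else has spoken, that starts a new turn n rather than continuing their earlier one.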
df <- data.frame(
  line = c(1:9),
  speaker = c("nick", "nick", "nick", "bob", "nick", "ann", "ann", "nick", "bob"),
  sentence = c("hi", "how are you?", "what's up?", "i'm good", "me too", "hi guys", "any plans for the weekend", "no", "ya, the movies"),
  turn = c(1, 1, 1, 2, 3, 4, 4, 5, 6))
I have tried two approaches, but neither reproduces the turn column:
  line speaker     sentence turn turn_curgroupid
1    1    nick           hi    1               3
2    2    nick how are you?    1               3
3    3    nick   what's up?    1               3
4    4     bob     i'm good    2               2
5    5    nick       me too    3               3
6    6     ann      hi guys    4               1
  line speaker     sentence turn turn_seqalong
1    1    nick           hi    1             1
2    2    nick how are you?    1             2
3    3    nick   what's up?    1             3
4    4     bob     i'm good    2             1
5    5    nick       me too    3             4
6    6     ann      hi guys    4             1
Thanks for your help.
library(dplyr)
df |>
  mutate(turn2 = cumsum(speaker != lag(speaker, 1, "")),  # new turn whenever the speaker changes
         turn3 = consecutive_id(speaker))                 # same result; needs dplyr >= 1.1.0
# H/T @andre-wildberg for mentioning this useful dplyr 1.1.0 function
Result:
  line speaker                  sentence turn turn2 turn3
1    1    nick                        hi    1     1     1
2    2    nick              how are you?    1     1     1
3    3    nick                what's up?    1     1     1
4    4     bob                  i'm good    2     2     2
5    5    nick                    me too    3     3     3
6    6     ann                   hi guys    4     4     4
7    7     ann any plans for the weekend    4     4     4
8    8    nick                        no    5     5     5
9    9     bob            ya, the movies    6     6     6
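
If dplyr is not available, the same idea can be sketched in base R. This is not from the answer above, just an equivalent formulation under the assumption that the rows are already in speaking order. The column names turn_cumsum and turn_rle are made up for illustration, and the data frame is rebuilt here (without the sentence column) so the snippet runs on its own.

```r
# Base R sketch; rebuilds a trimmed copy of the example data.
df <- data.frame(
  line = 1:9,
  speaker = c("nick", "nick", "nick", "bob", "nick", "ann", "ann", "nick", "bob"),
  stringsAsFactors = FALSE  # keep speaker as character on older R versions
)

# TRUE on the first row and wherever the speaker differs from the previous row.
new_turn <- c(TRUE, df$speaker[-1] != df$speaker[-nrow(df)])

# The running count of those change points is the turn number.
df$turn_cumsum <- cumsum(new_turn)  # hypothetical column name

# Same idea via run-length encoding: number each run, then repeat it by run length.
runs <- rle(df$speaker)
df$turn_rle <- rep(seq_along(runs$lengths), runs$lengths)  # hypothetical column name

print(df)
```

Both columns should match the desired turn values for this example. The cumsum version assumes at least one row; the rle version relies on speaker being a character vector rather than a factor.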