In data like this:
df
# A tibble: 9 × 6
     id Utterance             Story        Climax              Starttime_ms Endtime_ms
  <dbl> <chr>                 <chr>        <chr>                      <dbl>      <dbl>
1     4 "yeah"                NA           NA                         20405      23532
2     5 "Come Home "          "Come Home " NA                         20405      47677 #<--
3     6 ">last time "         NA           NA                         23818      25110
4     7 "two weeks ago? "     NA           NA                         25470      26259
5     8 "and X"               NA           NA                         26623      32103
6     9 "and then last night" NA           NA                         32688      33797
7    10 "are you sure?"       NA           NA                         34099      37542
8    11 "Come Home climax "   NA           "Come Home climax "        34099      39895 #<---
9    12 "=she said Y"         NA           NA                         38075      39895
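I need to rearrange the rows so that any row whose interval between Starttime_ms and Endtime_ms is greater than that of the previous row AND whose Starttime_ms is the same as in the previous row is placed before that previous row. How can this be done?
The desired output is this: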
df
# A tibble: 9 × 6
     id Utterance             Story        Climax              Starttime_ms Endtime_ms
  <dbl> <chr>                 <chr>        <chr>                      <dbl>      <dbl>
2     5 "Come Home "          "Come Home " NA                         20405      47677
1     4 "yeah"                NA           NA                         20405      23532
3     6 ">last time "         NA           NA                         23818      25110
4     7 "two weeks ago? "     NA           NA                         25470      26259
5     8 "and X"               NA           NA                         26623      32103
6     9 "and then last night" NA           NA                         32688      33797
8    11 "Come Home climax "   NA           "Come Home climax "        34099      39895
7    10 "are you sure?"       NA           NA                         34099      37542
9    12 "=she said Y"         NA           NA                         38075      39895
This isn't in a convenient format for copying (see ?dput or the reprex package), but something like
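library(dplyr)
df_sort <- (df
    # interval length, using the column names from the question
    |> mutate(dt = Endtime_ms - Starttime_ms)
    # start time first; among equal start times, longest interval first
    |> arrange(Starttime_ms, desc(dt))
)
should work. The first argument to arrange is the primary sort key; later arguments act as tie-breakers among rows that share the same primary key. Wrapping dt in desc() puts the longer interval first, which is what the desired output shows. If you don't want to keep the time-difference variable, you can add select(-dt).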
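For anyone who wants to try this without retyping the printout, here is a minimal, self-contained sketch. The data frame is rebuilt by hand from the question (an assumption on my part: Story and Climax are left out because they don't affect the ordering, and id is stored as an integer rather than a double). It also skips the intermediate dt column by putting the expression directly inside arrange().

```r
library(dplyr)

# Hand-typed copy of the question's data; Story and Climax are omitted
# because they play no role in the sort.
df <- data.frame(
  id = 4:12,
  Utterance = c("yeah", "Come Home ", ">last time ", "two weeks ago? ",
                "and X", "and then last night", "are you sure?",
                "Come Home climax ", "=she said Y"),
  Starttime_ms = c(20405, 20405, 23818, 25470, 26623, 32688, 34099, 34099, 38075),
  Endtime_ms   = c(23532, 47677, 25110, 26259, 32103, 33797, 37542, 39895, 39895)
)

# Primary key: start time (ascending).
# Tie-breaker: interval length (descending), so the longer span comes first.
df_sort <- df |>
  arrange(Starttime_ms, desc(Endtime_ms - Starttime_ms))

df_sort$id
```

Since ties only occur between rows with the same Starttime_ms, sorting those ties by desc(Endtime_ms) would give the same order; the explicit difference is kept here only because it mirrors how the question is worded.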