排列(字母专用)[重复]

问题描述 投票:0回答:1

我有一个这样的数据集 :

# install.packages(dplyr)
library(dplyr)
df <- tibble(period = c(201501,201502,201503,201504,201505,201506,201507,201508,201509,201510,201511,201512,201513),
             sales = sample(1:100,13),
             P1 = c(1,0,0,0,0,0,0,0,0,0,0,0,0),
             P10 = c(0,0,0,0,0,0,0,0,0,1,0,0,0),
             P11 = c(0,0,0,0,0,0,0,0,0,0,1,0,0),
             P12 = c(0,0,0,0,0,0,0,0,0,0,0,1,0),
             P13 = c(0,0,0,0,0,0,0,0,0,0,0,0,1),
             P2 = c(0,1,0,0,0,0,0,0,0,0,0,0,0),
             P3 = c(0,0,1,0,0,0,0,0,0,0,0,0,0),
             P4 = c(0,0,0,1,0,0,0,0,0,0,0,0,0),
             P5 = c(0,0,0,0,1,0,0,0,0,0,0,0,0),
             P6 = c(0,0,0,0,0,1,0,0,0,0,0,0,0),
             P7 = c(0,0,0,0,0,0,1,0,0,0,0,0,0),
             P8 = c(0,0,0,0,0,0,0,1,0,0,0,0,0),
             P9 = c(0,0,0,0,0,0,0,0,1,0,0,0,0),
             )
print(df)

所以我有这个:

# A tibble: 13 x 15
period sales    P1   P10   P11   P12   P13    P2    P3    P4    P5    P6
<dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
  1 201501    92     1     0     0     0     0     0     0     0     0     0
2 201502    60     0     0     0     0     0     1     0     0     0     0
3 201503    31     0     0     0     0     0     0     1     0     0     0
4 201504    74     0     0     0     0     0     0     0     1     0     0
5 201505    82     0     0     0     0     0     0     0     0     1     0
6 201506    86     0     0     0     0     0     0     0     0     0     1
7 201507    19     0     0     0     0     0     0     0     0     0     0
8 201508    32     0     0     0     0     0     0     0     0     0     0
9 201509    99     0     0     0     0     0     0     0     0     0     0
10 201510    47     0     1     0     0     0     0     0     0     0     0
11 201511    21     0     0     1     0     0     0     0     0     0     0
12 201512    77     0     0     0     1     0     0     0     0     0     0
13 201513    25     0     0     0     0     1     0     0     0     0     0
# ... with 3 more variables: P7 <dbl>, P8 <dbl>, P9 <dbl>

是否有一种自动的方法来获得相同的tibble,但P+数字列的正确顺序......。P1,P2,P3,P4等等等等...。

r data-manipulation
1个回答
2
投票

我们可以用 mixedsortmixedordergtools :

library(dplyr)
df %>% select(period, sales, gtools::mixedsort(names(.)))


#  period sales    P1    P2    P3    P4    P5    P6    P7    P8    P9   P10   P11   P12   P13
#    <dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 201501    34     1     0     0     0     0     0     0     0     0     0     0     0     0
# 2 201502    22     0     1     0     0     0     0     0     0     0     0     0     0     0
# 3 201503    17     0     0     1     0     0     0     0     0     0     0     0     0     0
# 4 201504    91     0     0     0     1     0     0     0     0     0     0     0     0     0
# 5 201505    27     0     0     0     0     1     0     0     0     0     0     0     0     0
# 6 201506    58     0     0     0     0     0     1     0     0     0     0     0     0     0
# 7 201507    57     0     0     0     0     0     0     1     0     0     0     0     0     0
# 8 201508     2     0     0     0     0     0     0     0     1     0     0     0     0     0
# 9 201509    24     0     0     0     0     0     0     0     0     1     0     0     0     0
#10 201510    92     0     0     0     0     0     0     0     0     0     1     0     0     0
#11 201511    21     0     0     0     0     0     0     0     0     0     0     1     0     0
#12 201512    59     0     0     0     0     0     0     0     0     0     0     0     1     0
#13 201513     4     0     0     0     0     0     0     0     0     0     0     0     0     1

1
投票

这是一个基本的R解决方案,它依赖于你在开始时有额外的两列。

i1 <- order(as.numeric(gsub('\\D+', '', names(df[-c(1:2)]))))
df <- df[c(1, 2, i1 + 2)]

现在调查这些名字,我们得到

names(df)
#[1] "period" "sales"  "P1"     "P2"     "P3"     "P4"     "P5"     "P6"     "P7"     "P8"     "P9"     "P10"    "P11"    "P12"    "P13"  
© www.soinside.com 2019 - 2024. All rights reserved.