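We want to reshape the data from dat_1, which is in wide format, to dat_2, which is in long format. To do this, we used tidyr::pivot_longer() with the argument names_pattern = '(.+)_(.+)'. It lets us gather the data, as shown in the boxes below, from the input format to the output format.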
> dat_1
  original_id timepoint msp3_mfi msp3_dil pf_aarp_mfi pf_aarp_dil
  <chr>       <chr>        <dbl>    <int>       <dbl>       <int>
1 id_005      C_0           10.5      400        22.2         400
2 id_005      D10            8.5      400       10.25         400
3 id_005      D13             11      400        10.2         400
4 id_005      D28              8      400        9.75         400
5 id_005      D60              7      400        0.30         400
dat_2 <- dat_1 %>%
  pivot_longer(
    cols = msp3_mfi:pf_aarp_dil,
    names_to = c('antigen', 'antigen_dil'),
    names_pattern = '(.+)_(.+)',
    values_to = c('mfi', 'dil'))
> dat_2
   original_id timepoint antigen antigen_dil   mfi   dil
   <chr>       <chr>     <chr>   <chr>       <dbl> <int>
 1 id_005      C_0       msp3    mfi          10.5    NA
 2 id_005      C_0       msp3    dil            NA   400
 3 id_005      C_0       pf_aarp mfi          22.2    NA
 4 id_005      C_0       pf_aarp dil            NA   400
 5 id_005      D10       msp3    mfi           8.5    NA
 6 id_005      D10       msp3    dil            NA   400
 7 id_005      D10       pf_aarp mfi         10.25    NA
 8 id_005      D10       pf_aarp dil            NA   400
 9 id_005      D13       msp3    mfi            11    NA
10 id_005      D13       msp3    dil            NA   400
11 id_005      D13       pf_aarp mfi          10.2    NA
12 id_005      D13       pf_aarp dil            NA   400
13 id_005      D28       msp3    mfi             8    NA
14 id_005      D28       msp3    dil            NA   400
15 id_005      D28       pf_aarp mfi          9.75    NA
16 id_005      D28       pf_aarp dil            NA   400
17 id_005      D60       msp3    mfi             7    NA
18 id_005      D60       msp3    dil            NA   400
19 id_005      D60       pf_aarp mfi          0.30    NA
20 id_005      D60       pf_aarp dil            NA   400
However, when we updated our R session (R version 3.6.3) from tibble 2.1.3 to tibble 3.0.1, we got the following error:
Error: Assigned data `values_to` must be compatible with existing data.
x Existing data has 4 rows.
x Assigned data has 2 rows.
i Only vectors of size 1 are recycled.
Run `rlang::last_error()` to see where the error occurred.
Any idea why we run into this problem with names_pattern after updating the version of the tibble package?
Thanks in advance.
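Although I don't know why the change happened, my guess is that, at heart, the error is related to your dat_2 not being tidy. The distinction between "mfi" and "dil" is stored twice: once as a separate column, antigen_dil, and again as two separate columns, mfi and dil.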
Depending on what your data mean, there are two formats that are easy to produce with pivot_longer:
dat_1 %>%
  pivot_longer(
    cols = msp3_mfi:pf_aarp_dil,
    names_to = c('antigen', '.value'),
    names_pattern = '(.+)_(.+)'
  )
#> # A tibble: 4 x 5
#>   original_id timepoint antigen   mfi   dil
#>   <chr>       <chr>     <chr>   <dbl> <dbl>
#> 1 id_005      C_0       msp3     10.5   400
#> 2 id_005      C_0       pf_aarp  22.2   400
#> 3 id_005      D10       msp3      8.5   400
#> 4 id_005      D10       pf_aarp  10.2   400
or
dat_1 %>%
  pivot_longer(
    cols = msp3_mfi:pf_aarp_dil,
    names_to = c('antigen', 'antigen_dil'),
    names_pattern = '(.+)_(.+)'
  )
#> # A tibble: 8 x 5
#>   original_id timepoint antigen antigen_dil value
#>   <chr>       <chr>     <chr>   <chr>       <dbl>
#> 1 id_005      C_0       msp3    mfi          10.5
#> 2 id_005      C_0       msp3    dil           400
#> 3 id_005      C_0       pf_aarp mfi          22.2
#> 4 id_005      C_0       pf_aarp dil           400
#> 5 id_005      D10       msp3    mfi           8.5
#> 6 id_005      D10       msp3    dil           400
#> 7 id_005      D10       pf_aarp mfi          10.2
#> 8 id_005      D10       pf_aarp dil           400
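As a side note, here is a rough sketch of how the pattern '(.+)_(.+)' carves up the four column names. It uses base R only (regexec() and regmatches()), not tidyr's internals, and the names cols, parts, split_names and the "measure" label are made up for illustration. Because the first (.+) is greedy, a name with two underscores such as pf_aarp_mfi should split at the last underscore, giving pf_aarp and mfi.

```r
# Illustrative sketch only: mimics the split that names_pattern performs,
# using base R rather than tidyr itself.
cols <- c("msp3_mfi", "msp3_dil", "pf_aarp_mfi", "pf_aarp_dil")

# regexec() locates the full match and each capture group;
# regmatches() extracts them as character vectors (full match first).
parts <- regmatches(cols, regexec("(.+)_(.+)", cols))

# Drop the full match (element 1) and keep the two capture groups.
split_names <- do.call(rbind, lapply(parts, function(p) p[2:3]))
colnames(split_names) <- c("antigen", "measure")  # hypothetical labels
split_names
```

The first captured piece plays the role of antigen in both calls above; the second piece becomes either the output column names (with '.value') or the contents of antigen_dil.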
If you really need the tibble in the format you described, you can use:
dat_1 %>%
  pivot_longer(
    cols = msp3_mfi:pf_aarp_dil,
    names_to = c('antigen', 'antigen_dil'),
    names_pattern = '(.+)_(.+)'
  ) %>%
  mutate(
    mfi = if_else(antigen_dil == "mfi", value, NA_real_),
    dil = if_else(antigen_dil == "dil", value, NA_real_)
  ) %>%
  select(-value)
#> # A tibble: 8 x 6
#>   original_id timepoint antigen antigen_dil   mfi   dil
#>   <chr>       <chr>     <chr>   <chr>       <dbl> <dbl>
#> 1 id_005      C_0       msp3    mfi          10.5    NA
#> 2 id_005      C_0       msp3    dil            NA   400
#> 3 id_005      C_0       pf_aarp mfi          22.2    NA
#> 4 id_005      C_0       pf_aarp dil            NA   400
#> 5 id_005      D10       msp3    mfi           8.5    NA
#> 6 id_005      D10       msp3    dil            NA   400
#> 7 id_005      D10       pf_aarp mfi          10.2    NA
#> 8 id_005      D10       pf_aarp dil            NA   400
The snippets above use the following code to create dat_1:
library(tidyverse)
dat_1 <-
  tribble(
    ~original_id, ~timepoint, ~msp3_mfi, ~msp3_dil, ~pf_aarp_mfi, ~pf_aarp_dil,
    "id_005", "C_0", 10.5, 400, 22.2, 400,
    "id_005", "D10", 8.5, 400, 10.25, 400
  )
dat_1
#> # A tibble: 2 x 6
#>   original_id timepoint msp3_mfi msp3_dil pf_aarp_mfi pf_aarp_dil
#>   <chr>       <chr>        <dbl>    <dbl>       <dbl>       <dbl>
#> 1 id_005      C_0           10.5      400        22.2         400
#> 2 id_005      D10            8.5      400        10.2         400