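I am trying to subset a data frame into a smaller data frame based on the values in one of its columns. The head() of the data frame is shown below: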
Experiment SRA_Sample Sample_Name Grupo_analisis body_site
1 SRX567480 SRS626942 GTEX-111CU-0226-SM-5GZXC 1 Thyroid
2 SRX615964 SRS644174 GTEX-111FC-1026-SM-5GZX1 1 Thyroid
3 SRX563960 SRS625636 GTEX-111VG-0526-SM-5N9BW 3 Thyroid
4 SRX564185 SRS625665 GTEX-111YS-0726-SM-5GZY8 1 Thyroid
5 SRX559141 SRS624025 GTEX-1122O-0226-SM-5N9DA 1 Thyroid
6 SRX561718 SRS625313 GTEX-1128S-0126-SM-5H12S 1 Thyroid
molecular_data_type sex Group ShortName
1 Allele-Specific Expression male NIT 111CU_NIT
2 RNA Seq (NGS) male NIT 111FC_NIT
3 RNA Seq (NGS) male ELI 111VG_ELI
4 Allele-Specific Expression male NIT 111YS_NIT
5 RNA Seq (NGS) female NIT 1122O_NIT
6 Allele-Specific Expression female NIT 1128S_NIT
There are 3 groups (ELI, NIT and SFI), and I want to randomly draw 10 samples from each group. This is what I am trying:
> set.seed(12)
> targets10<- filter(targets, targets$Group== ("NIT", "ELI", "SFI")) %>% sample_n(., 10)
or
>targets10<-filter(targets, targets$Group== "NIT","ELI","SFI") %>% sample_n(., 10)
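But I get the following error: Error in (~targets$Group == "NIT") & ~"ELI" : operations are possible only for numeric, logical or complex types
Can anyone help me?
Thanks so much in advance.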
Try reformatting your code like this.
dplyr solution:
targets10 <- targets %>%
filter(Group %in% c("NIT", "ELI", "SFI")) %>%
sample_n(., 10)
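For context on why the original attempts fail: in the second call, every extra argument to filter() is treated as another condition and combined with &, so the bare strings "ELI" and "SFI" trigger the type error, while the parenthesised ("NIT", "ELI", "SFI") in the first call is not valid R syntax at all. Wrapping the values in c() and using == would run, but it compares element by element with recycling rather than testing membership. A minimal base R sketch of the difference, using a made-up vector in place of targets$Group:

```r
# Toy vector standing in for targets$Group (hypothetical values)
group  <- c("ELI", "NIT", "SFI", "NIT", "XYZ")
wanted <- c("NIT", "ELI", "SFI")

# %in% asks, for each element of group, whether it occurs anywhere in wanted
keep <- group %in% wanted
print(keep)

# == compares position by position, recycling wanted to the length of group,
# so matching rows can be dropped silently (R also warns about the lengths here)
recycled <- suppressWarnings(group == wanted)
print(recycled)
```

Here keep marks the first four elements, whereas recycled only marks the positions that happen to line up with the recycled vector.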
base R solution:
targets10 <- subset(targets, Group %in% c("NIT", "ELI", "SFI"))
targets10 <- targets10[sample(nrow(targets10), 10), ]
Edit:
To sample 10 rows from each group, you just need to add group_by:
targets10 <- targets %>%
filter(Group %in% c("NIT", "ELI", "SFI")) %>%
group_by(Group) %>%
sample_n(., 10)
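To check that the grouped version really draws 10 rows per group, here is a small self-contained sketch on made-up data. The toy data frame and its id column are assumptions for illustration only. It uses slice_sample(), which supersedes sample_n() in dplyr 1.0.0 and later, so it assumes that version or newer; sample_n(10) in its place should behave the same way here.

```r
library(dplyr)

set.seed(12)

# Hypothetical toy data: 4 groups of 20 rows each
toy <- data.frame(
  id    = 1:80,
  Group = rep(c("NIT", "ELI", "SFI", "OTHER"), each = 20)
)

per_group <- toy %>%
  filter(Group %in% c("NIT", "ELI", "SFI")) %>%  # keep the three groups of interest
  group_by(Group) %>%                            # sample within each group
  slice_sample(n = 10) %>%                       # 10 random rows per group
  ungroup()

# Count rows per group to confirm the draw
print(count(per_group, Group))
```

Note that sample_n() raises an error if a group has fewer than 10 rows unless replace = TRUE is set, so it is worth checking the group sizes first.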