Split the columns of a data frame and recombine them into one column without blank entries

Question. Votes: 1. Answers: 2.

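I have two data frames whose values I want to split on ";", and then recombine the resulting pieces into a single column each, with the corresponding values kept side by side.

See the example below. Each number in df2 corresponds to a string in df1 ("aaa" <-> 111, "bbbb" <-> 2222, "ccc" <-> 333, and so on).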

#The dataframes I have
df1 = c("aaa","bbbb;ccc","dd;eeee;ffff","gg") #1st dataframe
df2 = c("111","2222;333","44;5555;6666","77") #2nd dataframe
df = as.data.frame(cbind(df1,df2)) #combine df1 and df2

#The output I'm trying to achieve
df1_desired = c("aaa","bbbb","ccc","dd","eeee","ffff","gg")
df2_desired = c("111","2222","333","44","5555","6666","77")
df_desired = as.data.frame(cbind(df1_desired,df2_desired)) #this is the format I want

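I tried the following, but it does not give me the arrangement I need.

library(stringr) # str_split_fixed() comes from the stringr package
split_df1 = str_split_fixed(df$df1, ";", 3)
split_df2 = str_split_fixed(df$df2, ";", 3)
combined_output = cbind(split_df1, split_df2)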

Thank you very much for your suggestions!

UPDATE

The solution provided by @snoram works perfectly for me:

library(data.table)
setDT(df)
dfd <- df[, lapply(.SD, tstrsplit, ";"), by = seq_len(nrow(df))][, seq_len := NULL]
dfd
Tags: r dataframe split

2 Answers

0 votes
library(data.table)
setDT(df)
dfd <- df[, lapply(.SD, tstrsplit, ";"), by = seq_len(nrow(df))][, seq_len := NULL]
dfd
    df1  df2
1:  aaa  111
2: bbbb 2222
3:  ccc  333
4:   dd   44
5: eeee 5555
6: ffff 6666
7:   gg   77

Base R, inspired by rar:

data.frame(lapply(lapply(df, strsplit, ";"), unlist))
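The one-liner above relies on every row splitting into the same number of pieces in both columns; if that ever fails, unlist() would misalign the values without any warning. A minimal sketch of the same idea with that check made explicit is shown here. The helper name split_rows is hypothetical (it is not part of any package), and the data is rebuilt with stringsAsFactors = FALSE on the assumption that the columns should be plain character vectors.

```r
# Toy data copied from the question; stringsAsFactors = FALSE keeps the
# columns as character vectors on R versions before 4.0 as well.
df <- data.frame(
  df1 = c("aaa", "bbbb;ccc", "dd;eeee;ffff", "gg"),
  df2 = c("111", "2222;333", "44;5555;6666", "77"),
  stringsAsFactors = FALSE
)

# split_rows() is a hypothetical helper name used only for illustration.
split_rows <- function(d, sep = ";") {
  # Split every column; each result is a list with one vector per row.
  pieces <- lapply(d, function(col) strsplit(as.character(col), sep, fixed = TRUE))
  # Number of pieces per row, computed separately for each column.
  counts <- lapply(pieces, lengths)
  # All columns must produce the same per-row counts, otherwise the
  # flattened columns would no longer line up.
  stopifnot(all(vapply(counts, identical, logical(1), counts[[1]])))
  # Flatten each column and rebuild the data frame with the original names.
  data.frame(lapply(pieces, unlist), stringsAsFactors = FALSE)
}

result <- split_rows(df)
result
```

With the example data this should return seven rows under the original column names df1 and df2. If one column had an extra ";" in some row, the stopifnot() call would raise an error instead of returning shifted pairs.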

0 votes
> data.frame(cbind(unlist(strsplit(df1,";")),unlist(strsplit(df2,";"))))
    X1   X2
1  aaa  111
2 bbbb 2222
3  ccc  333
4   dd   44
5 eeee 5555
6 ffff 6666
7   gg   77

First split the text on ";", then unlist each result, and finally bind the two results together into a data frame.
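The three steps just described can be written out one at a time, which makes the intermediate values easier to inspect. This is only a sketch of the same approach; the intermediate names (parts1, flat1, out, and so on) are made up for illustration, and the columns are named directly so the result does not end up with the default X1 and X2 headers.

```r
df1 <- c("aaa", "bbbb;ccc", "dd;eeee;ffff", "gg")
df2 <- c("111", "2222;333", "44;5555;6666", "77")

# Step 1: strsplit() returns a list with one character vector per input element.
parts1 <- strsplit(df1, ";", fixed = TRUE)
parts2 <- strsplit(df2, ";", fixed = TRUE)

# Step 2: unlist() flattens each list into one long character vector.
flat1 <- unlist(parts1)
flat2 <- unlist(parts2)

# Step 3: build the data frame directly, giving the columns explicit names.
out <- data.frame(df1 = flat1, df2 = flat2, stringsAsFactors = FALSE)
out
```

Building the result with data.frame() instead of cbind() skips the intermediate character matrix and lets the column names be set in the same call.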
