Using R

Question  Votes: 0  Answers: 3

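I have a data frame of character values and I want to convert it to presence/absence form: any cell that contains a character value should become 1, and any empty cell should become 0.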

df <- data.frame(
  A = c("G1","G2","G3","G4","G5","G6","G7","G8","G9","G10"),
  B = c("A",  "",  "A", "", "G", "B", "C",  "",  "",   "" ),
  C = c("B",  "",  "Z", "", "",   "", "",  'B',  "C",  "" ),
  D = c("Z", "D",  "",  "", "",   "", "",   "",  "",   "D"),
  E = c("A", "E",  "B", "A","",   "", "",   "",  "",    "")
)

The output should look like this:

df <- data.frame(
      A = c("G1", "G2", "G3", "G4", "G5","G6","G7", "G8", "G9","G10"),
      B = c(1, 0, 1, 0, 1, 1, 1, 0, 0, 0),
      C = c(1, 0, 1, 0, 0, 0, 0, 1, 1, 0),
      D = c(1, 1, 0, 0, 0, 0, 0, 0, 0, 1),
      E = c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0))

Thanks in advance.

r dataframe for-loop dplyr tidyverse
3 Answers
2
Votes

We can use != and coerce the result to binary in base R:

df[-1] <- +(df[-1] != "")

-output

> df
     A B C D E
1   G1 1 1 1 1
2   G2 0 0 1 1
3   G3 1 1 0 1
4   G4 0 0 0 1
5   G5 1 0 0 0
6   G6 1 0 0 0
7   G7 1 0 0 0
8   G8 0 1 0 0
9   G9 0 1 0 0
10 G10 0 0 1 0
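
To see why the one-liner works, here is a minimal sketch that splits it into steps on a small made-up data frame. The names toy, present and flags are only for illustration and are not part of the question.

```r
# Hypothetical toy data, smaller than the data frame in the question
toy <- data.frame(
  id = c("G1", "G2", "G3"),
  x  = c("A", "", "B"),
  y  = c("", "", "Z"),
  stringsAsFactors = FALSE
)

# Step 1: comparing the non-id columns with "" yields a logical matrix
present <- toy[-1] != ""

# Step 2: unary plus coerces TRUE/FALSE to 1/0 and keeps the matrix shape
flags <- +present

# Step 3: assigning the matrix back overwrites the columns one by one
toy[-1] <- flags
toy
```

The id column is skipped with [-1] in both places, so it stays character while every other column becomes 0/1.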

Or with tidyverse:

library(dplyr) # version >= 1.1.0
df %>% 
   mutate(across(-A, ~ case_match(.x, "" ~ 0, .default = 1)))

-output

     A B C D E
1   G1 1 1 1 1
2   G2 0 0 1 1
3   G3 1 1 0 1
4   G4 0 0 0 1
5   G5 1 0 0 0
6   G6 1 0 0 0
7   G7 1 0 0 0
8   G8 0 1 0 0
9   G9 0 1 0 0
10 G10 0 0 1 0
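
case_match() is only available from dplyr 1.1.0. On an older installation, a possible fallback is to do the comparison directly inside across(). This is a sketch assuming dplyr >= 1.0.0 (where across() exists), shown on a made-up data frame called toy.

```r
library(dplyr)  # assumption: dplyr >= 1.0.0, so across() is available

# Hypothetical toy data with the same layout as the question (id column A)
toy <- data.frame(
  A = c("G1", "G2", "G3"),
  B = c("A", "", "G"),
  C = c("", "", "Z"),
  stringsAsFactors = FALSE
)

# Compare every column except A with "" and turn TRUE/FALSE into 1/0
toy_flags <- toy %>%
  mutate(across(-A, ~ as.integer(.x != "")))

toy_flags
```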

2
Votes

Using base R, you can replace the columns of the data.frame in place:

df[, -1] <- lapply(df[, -1], function(x) ifelse(nchar(x)>0,1,0))

Or, using dplyr, you can mutate the data.frame to create a new one:

library(dplyr)
df %>% 
  mutate(across(-A, ~if_else(nchar(.)>0, 1, 0)))
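
If you want the "new data frame" behaviour without dplyr, one option is to wrap the base R replacement in a function, since R modifies only the function's local copy. The helper name to_presence and its id_col argument are made up for this sketch.

```r
# Hypothetical helper: returns a converted copy and leaves the input untouched
to_presence <- function(data, id_col = 1) {
  # Only the local copy of `data` is modified here
  data[-id_col] <- lapply(data[-id_col], function(x) as.integer(nchar(x) > 0))
  data
}

toy <- data.frame(
  id = c("G1", "G2"),
  x  = c("A", ""),
  y  = c("", "D"),
  stringsAsFactors = FALSE
)

flagged <- to_presence(toy)

toy$x      # still character
flagged$x  # integer flags
```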

0
Votes

We can also use nzchar:

> df[-1] <- +nzchar(as.matrix(df[-1]))

> df
     A B C D E
1   G1 1 1 1 1
2   G2 0 0 1 1
3   G3 1 1 0 1
4   G4 0 0 0 1
5   G5 1 0 0 0
6   G6 1 0 0 0
7   G7 1 0 0 0
8   G8 0 1 0 0
9   G9 0 1 0 0
10 G10 0 0 1 0
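
One caveat if the real data contain NA instead of "": the approaches in this thread do not treat missing values the same way. A small sketch on a made-up vector, assuming a current R version and assuming NA should count as "absent" (that rule is my assumption, not something stated in the question):

```r
x <- c("A", "", NA)

cmp <- x != ""        # comparison with "" propagates NA
len <- nchar(x) > 0   # nchar() returns NA for a missing string
nz  <- nzchar(x)      # nzchar() treats NA as non-empty by default

# One possible rule: a value is present only if it is neither NA nor ""
flags <- as.integer(!is.na(x) & nzchar(x))
flags
```

With this rule the three elements map to 1, 0, 0. If NA should instead stay NA in the result, the != version already does that.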