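Please see the reproducible example and the desired output below.
I want to create a new variable that combines values taken from other observations (rows), and I want to pick those rows with subset() inside a loop. The subset condition is defined by the loop variable. In Example 1, subset(df, country == i) does not work, but doing it by hand in Example 2 with subset(df, country == 'US') does. I thought country == i and country == 'US' should be almost the same.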
# load dplyr for %>% and mutate()
library(dplyr)
# create a df
country <- c('US', 'US', 'China', 'china')
Trump_virus <- c('Y', 'N', 'Y', 'N')
cases <- c(1000, 2000, 4, 6)
df <- data.frame(country, Trump_virus, cases)
#################################################### Ex.1
for (i in df$country) {
  print(i)
  df <- df %>%
    mutate(cases_corected = ifelse(
      Trump_virus == 'Y',
      subset(df, Trump_virus == 'N' & country == i)$cases * 1000,
      'killer_virus'
    ))
}
##
df$cases_corected
#################################################### Ex.2
for (i in df$country) {
  print(i)
  df <- df %>%
    mutate(cases_corected = ifelse(
      Trump_virus == 'Y',
      subset(df, Trump_virus == 'N' & country == 'US')$cases * 1000,
      'killer_virus'
    ))
}
##
df$cases_corected
################################################### Desired output
> df$cases_corected
[1] "1e+06"        "killer_virus" "4000"         "killer_virus"
You don't need a for loop or the subset function to get that output. mutate works directly on the columns:
df <- df %>%
  mutate(
    cases_corected = ifelse(Trump_virus == 'Y', cases * 1000, 'killer_virus')
  )
df
  country Trump_virus cases cases_corected
1      US           Y  1000          1e+06
2      US           N  2000   killer_virus
3   China           Y     4           4000
4   china           N     6   killer_virus
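As for why Example 1 misbehaves, a small check can help. The sketch below uses base R only and rebuilds the question's data; the name matches is purely illustrative. It counts how many rows subset() finds for each value that i takes in the loop:

```r
# Same data as in the question (stringsAsFactors = FALSE keeps country as character
# on older R versions too).
df <- data.frame(
  country     = c("US", "US", "China", "china"),
  Trump_virus = c("Y", "N", "Y", "N"),
  cases       = c(1000, 2000, 4, 6),
  stringsAsFactors = FALSE
)

# For every value the loop variable i would take, count the rows that the
# subset() condition from Example 1 matches.
matches <- sapply(df$country, function(i) {
  nrow(subset(df, Trump_virus == "N" & country == i))
})

print(matches)
```

For i = 'China' the subset is empty, because the N row is spelled 'china', so that iteration has nothing to multiply. Each pass of the loop also recomputes the whole cases_corected column, so only the result of the last iteration remains. And where the lookup does match, it returns the cases of the N row (2000 for US), not the row's own cases that the desired output is built from.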