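I have a fairly complex function that modifies some character variables. While writing it, I ran into a strange problem with the handling of NA values. I will spare you the complex function and instead present the problem in the MWE below: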
# Create an example data frame
df <- data.frame(noun = c("apple", NA, "banana"))
# Display the example data frame
df
#>     noun
#> 1  apple
#> 2   <NA>
#> 3 banana
# Introduce the function
process_my_df <- function(input_data, my_var) {
  # Create a new variable based on an existing variable
  for (i in 1:nrow(input_data)) {
    if (!is.na(input_data[[my_var]][i])) {
      input_data[[paste0(my_var, "_result")]][i] <- "is a fruit"
    }
  }
  return(input_data)
}
# Call the function to process the data frame
processed_df <- process_my_df(df, "noun")
# Display the resulting df
processed_df
#>     noun noun_result
#> 1  apple  is a fruit
#> 2   <NA>  is a fruit
#> 3 banana  is a fruit
Created on 2023-11-03 with reprex v2.0.2
My question: given the condition
if (!is.na(input_data[[my_var]][i])) {}
I expected the following result:
#>     noun noun_result
#> 1  apple  is a fruit
#> 2   <NA>        <NA>
#> 3 banana  is a fruit
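What is going on here?
The problem occurs because the new column is created implicitly inside the loop. If you create it explicitly before calling the function, it works: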
# Call the function to process the data frame
df$noun_result = ""
processed_df <- process_my_df(df, "noun")
# Display the resulting df
processed_df
#     noun noun_result
# 1  apple  is a fruit
# 2   <NA>
# 3 banana  is a fruit
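To see why the implicit version fills every row, the assignment can be broken into steps. This is a minimal sketch of what the first loop iteration appears to do. The names `x`, `recycled` and `fixed` are made up for illustration and are not part of the original code. The last part is a variation on the fix above: pre-filling with `NA_character_` instead of `""` leaves the skipped rows as `NA`, which is the output the question asked for.

```r
# Same example data as in the question
df <- data.frame(noun = c("apple", NA, "banana"))

# Step 1: the column does not exist yet, so extracting it returns NULL
is.null(df[["noun_result"]])

# Step 2: assigning into element 1 of NULL yields a length-1 character vector
x <- NULL
x[1] <- "is a fruit"
length(x)

# Step 3: a length-1 value assigned as a new column is recycled to all rows,
# so row 2 is already filled before the loop ever reaches it
recycled <- df
recycled[["noun_result"]] <- x
recycled$noun_result

# Variation on the fix: pre-fill with NA_character_ so skipped rows stay NA
fixed <- df
fixed$noun_result <- NA_character_
for (i in seq_len(nrow(fixed))) {
  if (!is.na(fixed$noun[i])) {
    fixed$noun_result[i] <- "is a fruit"
  }
}
fixed$noun_result
```

Inside `process_my_df()` the same idea would mean creating the result column before the loop starts, so the function no longer depends on the caller to prepare it.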