I have been using the cleaning dictionary tool matchmaker::match_df from the matchmaker package.
The code is as follows:
dat <- import("coded-data.csv")
dict <- import("dict.csv")
df <- match_df(dat,
  dictionary = dict,
  from = "options",
  to = "values",
  by = "grp"
)
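However, I recently switched computers, and now when I run the same code that previously worked, I get the following error on all of my variables:
"NA Each element of `...` must be a named string."
I am not sure what this means or how to correct it.
All of my variables are of type character in both the data frame and the cleaning dictionary.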
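This question was answered at https://stackoverflow.com/a/78228141/2752888 and the behaviour is documented here: https://cran.r-project.org/web/packages/matchmaker/vignettes/intro.html#values-to-column
For more context and a reproducible example: this can happen if the values column of your dictionary contains blank cells: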
df <- data.frame(var1 = c("aaa", "miss", "bbb"))
print(df)
#>   var1
#> 1  aaa
#> 2 miss
#> 3  bbb
# Dictionary has a blank cell in the second row of the second column ------
dict <- data.frame(
  from = c("aaa", "miss", "bbb"),
  to = c("AAA", "", "bbb"),
  col = rep("var1", 3)
)
print(dict)
#>   from  to  col
#> 1  aaa AAA var1
#> 2 miss     var1
#> 3  bbb bbb var1
matchmaker::match_df(df, dict,
  from = "from",
  to = "to",
  by = "col",
  warn = TRUE
)
#>
#> ── Errors were found in the following columns ──
#>
#> • var1
#> 1. NA Each element of `...` must be a named string.
#>   var1
#> 1  aaa
#> 2 miss
#> 3  bbb
# Replacing the blank cell with the ".na" keyword fixes this. --------------
dict$to[2] <- ".na"
print(dict)
#>   from  to  col
#> 1  aaa AAA var1
#> 2 miss .na var1
#> 3  bbb bbb var1
matchmaker::match_df(df, dict,
  from = "from",
  to = "to",
  by = "col",
  warn = TRUE
)
#>   var1
#> 1  AAA
#> 2 <NA>
#> 3  bbb
Created on 2024-04-01 with reprex v2.0.2
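If the dictionary has many rows, fixing blank cells one index at a time gets tedious. Here is a minimal base R sketch that flags every blank value and replaces it with the ".na" keyword in one step. The dictionary below is made up for illustration, and the column name `to` is the one from the reprex, so substitute your own values column (e.g. "values" in the question). Treating NA the same as an empty string is an assumption on my part: the reprex only demonstrates the "" case, but depending on how the CSV is read, an empty cell may arrive as either.

```r
# Hypothetical dictionary with one empty string and one NA in the `to` column
dict <- data.frame(
  from = c("aaa", "miss", "unk", "bbb"),
  to   = c("AAA", "", NA, "bbb"),
  col  = rep("var1", 4),
  stringsAsFactors = FALSE
)

# Flag rows whose `to` value is NA, empty, or whitespace-only.
# is.na() is checked first so that NA entries evaluate to TRUE, not NA.
blank <- is.na(dict$to) | trimws(dict$to) == ""

# Inspect the offending rows before changing anything
print(dict[blank, ])

# Replace the blanks with the ".na" keyword used in the reprex above
dict$to[blank] <- ".na"
print(dict)
```

After this, the dictionary can be passed to matchmaker::match_df() as before. Printing the flagged rows first is deliberate, so you can confirm that those keys really are meant to become missing values rather than being data-entry gaps in the dictionary.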