Unexpected result using str_split and union in a function with sapply

Question

Given this data.frame:

library(dplyr)
library(stringr)
ml.mat2 <- structure(list(value = c("a", "b", "c"), ground_truth = c("label1, label3", 
"label2", "label1"), predicted = c("label1", "label2,label3", 
"label1")), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-3L))

glimpse(ml.mat2)

Observations: 3
Variables: 3
$ value        <chr> "a", "b", "c"
$ ground_truth <chr> "label1, label3", "label2", "label1"
$ predicted    <chr> "label1", "label2,label3", "label1"

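For each row, I want to measure the size of the union of ground_truth and predicted (the number of distinct labels across the two columns) after splitting the labels on ",".

In other words, I expect a result of length 3 with the values 2 2 1.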

I wrote a function to do this, but it only seems to work outside of sapply:

m_fn <- function(x,y) length(union(unlist(sapply(x, str_split,",")), 
                             unlist(sapply(y, str_split,","))))

m_fn(ml.mat2$ground_truth[1], y = ml.mat2$predicted[1])

[1] 2

m_fn(ml.mat2$ground_truth[2], y = ml.mat2$predicted[2])

[1] 2

m_fn(ml.mat2$ground_truth[3], y = ml.mat2$predicted[3])

[1] 1

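Rather than going through the rows of the dataset by hand like this, or with a loop, I would like to vectorize the solution with sapply, like this: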

sapply(ml.mat2$ground_truth, m_fn, ml.mat2$predicted)

However, the result is unexpected:

label1, label3         label2         label1 
             4              3              3

Answer 1

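Since you are iterating over the same number of observations in both columns, you can generate an index of row numbers and run it through sapply:

sapply(1:nrow(ml.mat2), function(i) m_fn(x = ml.mat2$ground_truth[i], y = ml.mat2$predicted[i]))
#[1] 2 2 1

Or build the index with seq_len(nrow(ml.mat2)) instead of 1:nrow(ml.mat2).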
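As for why the original call misbehaves: sapply only iterates over its first argument, and anything after the function is passed along unchanged, so every call to m_fn receives the whole predicted column as y. A minimal sketch of that, plus a pairwise alternative using base R's mapply, is below. It is not part of the answer above; the two plain vectors stand in for the columns of ml.mat2, and base strsplit is swapped in for str_split (an assumption made only so the snippet runs without any packages).

```r
# Plain character vectors standing in for the two columns of ml.mat2
ground_truth <- c("label1, label3", "label2", "label1")
predicted    <- c("label1", "label2,label3", "label1")

# Same logic as m_fn in the question; base strsplit() is swapped in for
# str_split() (an assumption, so that no package is needed)
m_fn <- function(x, y) {
  length(union(unlist(strsplit(x, ",", fixed = TRUE)),
               unlist(strsplit(y, ",", fixed = TRUE))))
}

# sapply() forwards the whole `predicted` vector as y on every call,
# so each element of ground_truth is unioned with all predicted labels
whole_vector <- sapply(ground_truth, m_fn, predicted, USE.NAMES = FALSE)

# mapply() walks both vectors in parallel, pairing element i with element i
pairwise <- mapply(m_fn, ground_truth, predicted, USE.NAMES = FALSE)

print(whole_vector)
print(pairwise)
```

Here whole_vector reproduces the 4 3 3 from the question, while pairwise gives the expected 2 2 1. With the tibble itself, the equivalent call would be mapply(m_fn, ml.mat2$ground_truth, ml.mat2$predicted, USE.NAMES = FALSE).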
