I have the following dataset:
mydata <- datasets::volcano
install.packages('e1071')
library(e1071)
library(tidyverse) #load required libraries
head(mydata) # quick view of the data.
#Part 1
#Calculating kurtosis and a new measure with apply() from base R, using an anonymous
#function and type 2 from the e1071 library
kurtosis <- apply(mydata, 2, function(x) kurtosis(x, type = 2))
new_measure <- apply(mydata, 2, function(x) sd(x) / mad(x))
#create a new dataframe with the calculated kurtosis and new measure
base_mydata <- data.frame(kurtosis = kurtosis, new_measure = new_measure)
That part works fine. What I want to do now is run the calculations above with dplyr or purrr, and I'm not sure why this doesn't work. I just get a vector of NaN values?
#Part 2
# Calculate kurtosis for each column
kurtosis_value <- mydata %>%
map_dbl(~ kurtosis(.x))
I expected the result to contain the kurtosis of each column.
Thanks for any help/guidance.
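map_dbl() expects a vector or a list as input. If you pass it a matrix, the matrix is treated as a plain vector and the function is applied to each individual element, which is why you get NaN values. First convert mydata from a matrix to a data frame. Because a data frame is a list of columns, map_dbl() then applies the function to each column: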
library(tidyverse)
library(moments) # kurtosis() here is Pearson (non-excess) kurtosis, so values differ from e1071's type = 2
mydata <- datasets::volcano
kurtosis_value <- map_dbl(as.data.frame(mydata), kurtosis, na.rm = TRUE)
kurtosis_value
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15
2.371050 2.514419 2.699051 2.757678 2.784320 2.735230 2.659157 2.593125 2.475620 2.272475 2.181941 2.147706 2.146325 2.121628 2.077791
V16 V17 V18 V19 V20 V21 V22 V23 V24 V25 V26 V27 V28 V29 V30
2.041687 2.038450 2.068429 2.088117 2.091098 2.087650 2.042588 1.973068 1.918383 1.855893 1.788262 1.788161 1.778543 1.771347 1.833231
V31 V32 V33 V34 V35 V36 V37 V38 V39 V40 V41 V42 V43 V44 V45
1.889760 1.948411 2.016484 2.072357 2.128480 2.114815 2.154601 2.105206 2.038636 1.977894 1.950674 1.914163 1.932104 1.963528 2.004136
V46 V47 V48 V49 V50 V51 V52 V53 V54 V55 V56 V57 V58 V59 V60
2.069453 2.125611 2.148218 2.191073 2.251291 2.180624 2.204499 2.290069 2.369687 2.420440 2.417594 2.270683 2.091416 2.174677 2.169017
V61
2.152479
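To make the reason for the NaN values concrete, here is a minimal sketch using only base R and e1071 (assumed to be installed). The names `n_cells`, `single_cell`, `cols`, and `n_cols` are made up for illustration and do not come from the original posts.

```r
library(e1071)

mydata <- datasets::volcano

# A matrix is an atomic vector with a dim attribute, not a list of columns.
is.list(mydata)
n_cells <- length(mydata)          # nrow(mydata) * ncol(mydata) cells

# Kurtosis of a single cell is undefined (0 / 0), so e1071 returns NaN.
single_cell <- kurtosis(mydata[1, 1])

# A data frame is a list with one element per column.
cols <- as.data.frame(mydata)
is.list(cols)
n_cols <- length(cols)             # equals ncol(mydata)
```

Iterating over `mydata` therefore means one call per cell, while iterating over `cols` means one call per column.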
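When you pass a matrix directly to map(), it loops over every individual element, whereas you want it to iterate over the matrix columns. A few examples of how to achieve that: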
library(e1071)
library(purrr)
mydata <- datasets::volcano
map_dbl(seq_len(ncol(mydata)), ~ kurtosis(mydata[, .x], type = 2))
#> [1] -0.5943826 -0.4424202 -0.2467199 -0.1845791 -0.1563397 -0.2083728
#> [7] -0.2890058 -0.3589954 -0.4835449 -0.6988673 -0.7948276 -0.8311148
#> ...
#> [61] -0.8260557
mydata %>%
array_branch(margin = 2) %>%
map_dbl(\(x) kurtosis(x, type = 2))
#> [1] -0.5943826 -0.4424202 -0.2467199 -0.1845791 -0.1563397 -0.2083728
#> [7] -0.2890058 -0.3589954 -0.4835449 -0.6988673 -0.7948276 -0.8311148
#> ...
#> [61] -0.8260557
Created on 2023-02-25 with reprex v2.0.2
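For completeness, the two-column data frame from Part 1 of the question can probably be rebuilt with purrr along the same lines. This is only a sketch, assuming e1071 and purrr are installed and R is at least 4.1 (for the `\(x)` syntax). The names `cols` and `purrr_mydata` are hypothetical and do not appear in the original posts.

```r
library(e1071)
library(purrr)

# Convert the matrix to a data frame so map_dbl() iterates over columns.
cols <- as.data.frame(datasets::volcano)

# One row per column of the original matrix, one variable per measure.
purrr_mydata <- data.frame(
  kurtosis    = unname(map_dbl(cols, \(x) kurtosis(x, type = 2))),
  new_measure = unname(map_dbl(cols, \(x) sd(x) / mad(x)))
)

head(purrr_mydata)
```

The values should match those from the apply() version in the question, since only the iteration mechanism changes.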