How do I calculate an index in R?

Question (votes: 1, answers: 1)

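This is my first question in this great community. I am trying to calculate indices on a data.frame, report them by borough or by neighborhood, and plot them. What code is best suited for this?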

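Here is a sample of my dataset. albo and aegyp are mosquito species, and each row is one surveyed house. The house index (HI) is (number of positive houses / number of houses surveyed) * 100, where a positive house is one in which at least one mosquito was found (value != 0). For example, HI = (7/11) * 100 = 63.63 overall (11 = number of houses surveyed, 7 = number of positive houses).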


borough  neighborhood  concession  albo  aegyp  Total_albo_aegyp
a1       mendong       1           1     5      6
a1       mendong       2           5     2      7
a1       mendong       3           2     1      3
a1       tam tam       4           0     0      0
a2       tam tam       5           4     6      10
a2       obili         6           0     1      1
a2       obili         7           0     0      0
a3       acacia        8           3     7      10
a4       melen         9           1     1      2
a4       melen         10          0     5      5
a4       polytech      11          8     0      10
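
For reproducibility, the sample can be typed in as a data frame, and the house-index formula checked on a single borough. This is only a sketch: the object name `df` and the helper `house_index` are assumptions chosen for illustration, not names from the original data.

```r
# Sample data copied from the table above; the name `df` is an assumption
df <- data.frame(
  borough = c(rep("a1", 4), rep("a2", 3), "a3", rep("a4", 3)),
  neighborhood = c("mendong", "mendong", "mendong", "tam tam", "tam tam",
                   "obili", "obili", "acacia", "melen", "melen", "polytech"),
  concession = 1:11,
  albo = c(1, 5, 2, 0, 4, 0, 0, 3, 1, 0, 8),
  aegyp = c(5, 2, 1, 0, 6, 1, 0, 7, 1, 5, 0),
  Total_albo_aegyp = c(6, 7, 3, 0, 10, 1, 0, 10, 2, 5, 10)
)

# Hypothetical helper: percentage of houses with at least one mosquito.
# mean() of a logical vector is the share of TRUE values.
house_index <- function(x) mean(x != 0) * 100

# Borough a1 has albo counts 1, 5, 2, 0: three of four houses are positive
a1_albo_hi <- house_index(df$albo[df$borough == "a1"])
a1_albo_hi  # 75
```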

HIcommune <- concessiondata %>% 
  group_by(commune) %>% 
  summarise(
  Mean = mean(concessiondata$total_aedes_albo_aegypti!=0),
  HIY = sum(concessiondata1$total_aedes_albo_aegypti!=0)/length(concessiondata1$total_aedes_albo_aegypti))

  Houseindex_total <- concessiondata1[, Mean := mean(total_aedes_albo_aegypti!=0), by = "commune"]


  ## This is what the results should look like

borough  albo_HI  aegy_HI  Total_albo_aegyp_HI
a1       75       75       75
a2       33.33    66.66    66.66
a3       100      100      100
a4       66.66    66.66    100

r indexing group-by row
1 answer

First, your code has a few common coding/syntax issues.

  1. I recommend not mixing dplyr and data.table syntax.
  2. You don't need to index columns with $ inside dplyr verbs.
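
To illustrate the second point, here is a minimal sketch on hypothetical toy data (the names `toy`, `g` and `x` are made up for this example). Inside a grouped `summarise()`, `toy$x` still refers to the whole column, so every group gets the same overall value, whereas the bare column name is evaluated per group.

```r
library(dplyr)

# Hypothetical toy data: two groups with different shares of non-zero counts
toy <- data.frame(
  g = c("a", "a", "b", "b"),
  x = c(0, 3, 2, 5)
)

res <- toy %>%
  group_by(g) %>%
  summarise(
    with_dollar = mean(toy$x != 0) * 100,  # whole column, ignores the grouping
    bare_column = mean(x != 0) * 100       # only the rows of the current group
  )

res
# with_dollar is 75 in both rows (3 of all 4 values are non-zero);
# bare_column is 50 for group a and 100 for group b
```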

I suggest working through one of the many freely available tidyverse tutorials to learn the basics of reshaping and manipulating data with dplyr/tidyr.

That aside, the following reproduces your expected output:

calc_index <- function(x) sum(x != 0) / length(x) * 100

library(dplyr)
df %>%
    group_by(borough) %>%
    summarise(
        albo_HI = calc_index(albo),
        aegyp_HI = calc_index(aegyp),
        Total_albo_aegyp = calc_index(Total_albo_aegyp))
## A tibble: 4 x 4
#  borough albo_HI aegyp_HI Total_albo_aegyp
#  <fct>     <dbl>    <dbl>            <dbl>
#1 a1         75       75               75
#2 a2         33.3     66.7             66.7
#3 a3        100      100              100
#4 a4         66.7     66.7            100

Alternatively, you can use summarise_all:

df %>%
    group_by(borough) %>%
    select(-neighborhood, -concession) %>%
    summarise_all(~calc_index(.x))
## A tibble: 4 x 4
#  borough  albo aegyp Total_albo_aegyp
#  <fct>   <dbl> <dbl>            <dbl>
#1 a1       75    75               75
#2 a2       33.3  66.7             66.7
#3 a3      100   100              100
#4 a4       66.7  66.7            100
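
In dplyr 1.0.0 and later, the scoped verbs such as `summarise_all()` are superseded by `across()`. A minimal sketch of the same calculation, assuming the sample data is stored in a data frame named `df` (rebuilt here with only the columns needed, so the snippet runs on its own):

```r
library(dplyr)

calc_index <- function(x) sum(x != 0) / length(x) * 100

# Sample data from the question; the name `df` is an assumption
df <- data.frame(
  borough = c(rep("a1", 4), rep("a2", 3), "a3", rep("a4", 3)),
  albo = c(1, 5, 2, 0, 4, 0, 0, 3, 1, 0, 8),
  aegyp = c(5, 2, 1, 0, 6, 1, 0, 7, 1, 5, 0),
  Total_albo_aegyp = c(6, 7, 3, 0, 10, 1, 0, 10, 2, 5, 10)
)

hi <- df %>%
  group_by(borough) %>%
  summarise(
    # apply calc_index to each listed column and append "_HI" to its name
    across(c(albo, aegyp, Total_albo_aegyp), calc_index, .names = "{.col}_HI")
  )

hi
```

Naming the columns explicitly in `across()` avoids the separate `select()` step, and `.names` gives the output columns an `_HI` suffix.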