Why doesn't filter work together with rowSums?

Question  Votes: 1  Answers: 3

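I am working through a Kaggle tutorial on R and tried to use filter in place of magrittr::equals. It does not work, and I don't understand why, because the two seem to do the same thing.

This is the code I tried.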

f_countOfMen <- mutFoodData %>%
    select(starts_with("gender")) %>%
    filter(Gender == 1) %>%
    rowSums(na.rm = T)

f_countOfWomen <- mutFoodData %>%
    select(starts_with("gender")) %>%
    filter(Gender == 2) %>%
    rowSums(na.rm = T)

mutFoodData <- mutFoodData %>%
    mutate(fMen = f_countOfMen, fWomen = f_countOfWomen) # add our new variables
# however it doesn't add the variables and produces an error


m_countOfMen <- mutFoodData %>%
    select(starts_with("gender")) %>%
    magrittr::equals(1) %>%
    rowSums(na.rm = T)

m_countOfWomen <- mutFoodData %>%
    select(starts_with("gender")) %>%
    magrittr::equals(2) %>%
    rowSums(na.rm = T)

mutFoodData <- mutFoodData %>%
    mutate(mMen = m_countOfMen, mWomen = m_countOfWomen) # add our new variables
# this code does as expected

I expected the new columns to be added, but I keep getting this error:

Error in mutate_impl(.data, dots) : wrong result size (76), expected 124 or 1

r filter magrittr
3 Answers
1
vote

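The issue is that filter reduces the number of rows, and the shorter result is then added to the original data set, which still has all of its rows. Instead of filtering, create a logical matrix and take its rowSums for "Men" and "Women".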

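library(dplyr)

# Comparing the gender column(s) with 1 or 2 gives a logical matrix that keeps every row;
# rowSums() then counts the TRUE values in each row.
mutFoodData %>%
    mutate(fMen = rowSums(select(., starts_with("gender")) == 1, na.rm = TRUE),
           fWomen = rowSums(select(., starts_with("gender")) == 2, na.rm = TRUE))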
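To see the logical-matrix approach run end to end, here is a minimal sketch on made-up data. The tibble `toy` and its second column `gender2` are hypothetical and are not part of the Kaggle data set; `gender2` is included only to show that rowSums counts across every column matched by starts_with("gender").

```r
library(dplyr)

# Hypothetical toy data standing in for mutFoodData
toy <- tibble(
  Gender  = c(1L, 2L, 1L, NA, 2L),
  gender2 = c(1L, 1L, 2L, 2L, NA)
)

# starts_with() ignores case by default, so both columns are selected
gender_cols <- toy %>% select(starts_with("gender"))

# Comparing a data frame with a value keeps every row and yields a logical matrix
is_one <- gender_cols == 1

# Row-wise counts have one value per row, so mutate() accepts them
toy <- toy %>%
  mutate(
    fMen   = rowSums(gender_cols == 1, na.rm = TRUE),
    fWomen = rowSums(gender_cols == 2, na.rm = TRUE)
  )
```

Because no rows are dropped, the new columns always have as many values as `toy` has rows, and NA entries simply count as zero.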

1
vote

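"The issue is that filter reduces the number of rows, and the shorter result is then added to the original data set, which still has all of its rows."

I hadn't realized that.

I just searched for a function that counts rows and found nrow, so I ran this code: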

rowscount <- mutFoodData %>%
    select(Gender) %>%
    nrow()

rowscountFilter <- mutFoodData %>%
    select(Gender) %>%
    filter(Gender == 1) %>%
    nrow()

rowscountMagittr <- mutFoodData %>%
    select(Gender) %>%
    magrittr::equals(1) %>%
    nrow()

print(rowscount)
print(rowscountFilter)
print(rowscountMagittr)

The results were:

124

76

124

Now I understand. Thanks.
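For reference, the 76 versus 124 mismatch comes from the size rule that mutate() enforces: a new column must have either a single value or one value per row. A minimal sketch on a made-up four-row tibble (`df` and the column `y` are hypothetical, and the exact wording of the error depends on the dplyr version):

```r
library(dplyr)

# Hypothetical four-row tibble
df <- tibble(x = 1:4)

# A single value is recycled to every row
one_value <- df %>% mutate(y = 0)

# A vector with one value per row is used as-is
full_length <- df %>% mutate(y = c(10, 20, 30, 40))

# Any other length is rejected; tryCatch() captures the error instead of stopping
wrong_length <- tryCatch(
  df %>% mutate(y = c(1, 2)),
  error = function(e) "size error"
)
```

The filtered rowSums result in the question is the third case: 76 values offered to a 124-row data set.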


0
votes
# A tibble: 6 × 64
    GPA Gender breakfast calories_chicken calories_day calories_scone coffee
  <chr>  <int>     <int>            <int>        <dbl>          <dbl>  <int>
1 3.654      1         1              610            3            420      2
2   3.3      1         1              720            4            420      2
3   3.2      1         1              430            3            420      2
4   3.5      1         1              720            2            420      2
5  2.25      1         1              610            3            980      2
6   3.8      2         1              610            3            420      2
# ... with 57 more variables: comfort_food <chr>, comfort_food_reasons <chr>,
#   comfort_food_reasons_coded <int>, cook <dbl>, comfort_food_reasons_coded_1 <int>, cuisine <dbl>,
#   diet_current <chr>, diet_current_coded <int>, drink <dbl>, eating_changes <chr>,
#   eating_changes_coded <int>, eating_changes_coded1 <int>, eating_out <int>, employment <dbl>,
#   ethnic_food <int>, exercise <dbl>, father_education <dbl>, father_profession <chr>,
#   fav_cuisine <chr>, fav_cuisine_coded <int>, fav_food <dbl>, food_childhood <chr>, fries <int>,
#   fruit_day <int>, grade_level <int>, greek_food <int>, healthy_feeling <int>, healthy_meal <chr>,
#   ideal_diet <chr>, ideal_diet_coded <int>, income <dbl>, indian_food <int>, italian_food <int>,
#   life_rewarding <dbl>, marital_status <dbl>, meals_dinner_friend <chr>, mother_education <dbl>,
#   mother_profession <chr>, nutritional_check <int>, on_off_campus <dbl>, parents_cook <int>,
#   pay_meal_out <int>, persian_food <dbl>, self_perception_weight <dbl>, soup <dbl>, sports <dbl>,
#   thai_food <int>, tortilla_calories <dbl>, turkey_calories <int>, type_sports <chr>,
#   veggies_day <int>, vitamins <int>, waffle_calories <int>, weight <chr>, food <int>, mMen <dbl>,
#   mWomen <dbl>

@akrun and anyone else who wants to see what the data looks like
