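I am working through a Kaggle tutorial on R and tried to use filter in place of magrittr::equals. It doesn't work, and I don't understand why, since the two seem to do the same thing.
I tried the code below.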
f_countOfMen <- mutFoodData %>%
  select(starts_with("gender")) %>%
  filter(Gender == 1) %>%
  rowSums(na.rm = T)
f_countOfWomen <- mutFoodData %>%
  select(starts_with("gender")) %>%
  filter(Gender == 2) %>%
  rowSums(na.rm = T)
mutFoodData <- mutFoodData %>%
  mutate(fMen = f_countOfMen, fWomen = f_countOfWomen) # add our new variables
# however it doesn't add the variables and produces an error
m_countOfMen <- mutFoodData %>%
  select(starts_with("gender")) %>%
  magrittr::equals(1) %>%
  rowSums(na.rm = T)
m_countOfWomen <- mutFoodData %>%
  select(starts_with("gender")) %>%
  magrittr::equals(2) %>%
  rowSums(na.rm = T)
mutFoodData <- mutFoodData %>%
  mutate(mMen = m_countOfMen, mWomen = m_countOfWomen) # add our new variables
# this code does as expected
I expected the new columns to be added, but I keep getting this error:
Error in mutate_impl(.data, dots) : wrong result size (76), expected 124 or 1
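The issue is that filter reduces the number of rows, and the shortened result is then added to the original dataset, which still has all of its rows. Here, instead of filtering, create a logical matrix and take the rowSums for "men" and "women":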
library(dplyr)
mutFoodData %>%
  mutate(fMen = rowSums(select(., starts_with("gender")) == 1, na.rm = TRUE),
         fFemale = rowSums(2 * (select(., starts_with("gender")) == 2), na.rm = TRUE))
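To make the row-count mismatch concrete, here is a minimal sketch on made-up data. The `toy` tibble and its `score` column are hypothetical and are not part of the food dataset; the sketch only assumes that dplyr and magrittr are installed.

```r
library(dplyr)

# Hypothetical toy data: 5 rows, Gender coded 1 or 2, with one missing value.
toy <- tibble(Gender = c(1L, 2L, 1L, NA, 2L),
              score  = c(10, 20, 30, 40, 50))

# filter() keeps only the matching rows, so the result is shorter than `toy`.
filtered <- toy %>%
  select(Gender) %>%
  filter(Gender == 1)
nrow(filtered)   # 2

# Comparing the selected columns with 1 gives a logical matrix with one row
# per original row, so rowSums() returns a vector as long as `toy`.
men <- toy %>%
  select(Gender) %>%
  magrittr::equals(1) %>%
  rowSums(na.rm = TRUE)
length(men)      # 5

# The vector has the same length as the data, so mutate() accepts it.
toy <- toy %>%
  mutate(fMen = men)
```

Passing a vector derived from `filtered` to mutate() on the full table would instead fail, because its length (2) is neither 5 nor 1.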
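"The issue is that filter reduces the number of rows, and the shortened result is then added to the original dataset, which still has all of its rows."
I hadn't realized that.
I just googled for a row-count function and found nrow. So I ran this code: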
rowscount <- mutFoodData %>%
  select(Gender) %>%
  nrow()
rowscountFilter <- mutFoodData %>%
  select(Gender) %>%
  filter(Gender == 1) %>%
  nrow()
rowscountMagittr <- mutFoodData %>%
  select(Gender) %>%
  magrittr::equals(1) %>%
  nrow()
print(rowscount)
print(rowscountFilter)
print(rowscountMagittr)
The results were:
124
76
124
I understand now. Thank you.
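As a side note to the nrow check, a sketch of how one might count matching rows without dropping any: summing the logical comparison counts each TRUE as 1. The `toy` tibble is again hypothetical and only stands in for mutFoodData.

```r
library(dplyr)

# Hypothetical toy data standing in for mutFoodData.
toy <- tibble(Gender = c(1L, 2L, 1L, 1L, 2L))

# Total number of rows in the table.
n_all <- nrow(toy)

# Number of rows that survive filter().
n_men_filter <- toy %>%
  filter(Gender == 1) %>%
  nrow()

# Same count without removing rows: TRUE is treated as 1 by sum().
n_men_sum <- sum(toy$Gender == 1, na.rm = TRUE)

print(c(n_all, n_men_filter, n_men_sum))
```

The last two values agree, while the table itself keeps all of its rows in the second approach.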
# A tibble: 6 × 64
GPA Gender breakfast calories_chicken calories_day calories_scone coffee
<chr> <int> <int> <int> <dbl> <dbl> <int>
1 3.654 1 1 610 3 420 2
2 3.3 1 1 720 4 420 2
3 3.2 1 1 430 3 420 2
4 3.5 1 1 720 2 420 2
5 2.25 1 1 610 3 980 2
6 3.8 2 1 610 3 420 2
# ... with 57 more variables: comfort_food <chr>, comfort_food_reasons <chr>,
# comfort_food_reasons_coded <int>, cook <dbl>, comfort_food_reasons_coded_1 <int>, cuisine <dbl>,
# diet_current <chr>, diet_current_coded <int>, drink <dbl>, eating_changes <chr>,
# eating_changes_coded <int>, eating_changes_coded1 <int>, eating_out <int>, employment <dbl>,
# ethnic_food <int>, exercise <dbl>, father_education <dbl>, father_profession <chr>,
# fav_cuisine <chr>, fav_cuisine_coded <int>, fav_food <dbl>, food_childhood <chr>, fries <int>,
# fruit_day <int>, grade_level <int>, greek_food <int>, healthy_feeling <int>, healthy_meal <chr>,
# ideal_diet <chr>, ideal_diet_coded <int>, income <dbl>, indian_food <int>, italian_food <int>,
# life_rewarding <dbl>, marital_status <dbl>, meals_dinner_friend <chr>, mother_education <dbl>,
# mother_profession <chr>, nutritional_check <int>, on_off_campus <dbl>, parents_cook <int>,
# pay_meal_out <int>, persian_food <dbl>, self_perception_weight <dbl>, soup <dbl>, sports <dbl>,
# thai_food <int>, tortilla_calories <dbl>, turkey_calories <int>, type_sports <chr>,
# veggies_day <int>, vitamins <int>, waffle_calories <int>, weight <chr>, food <int>, mMen <dbl>,
# mWomen <dbl>
@akrun, and for anyone else who wants to see what the data looks like.