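Using my dataset bikeshare, I want to recode the variable season as a factor with meaningful level names (i.e. "winter", "spring", "summer", "fall"), with spring as the baseline level.
Here is my attempt: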
bikeshare <- bikeshare %>%
mutate(season = factor(c(1 = "winter",
2 = "spring",
3 = "summer",
4 = "fall")))
This is the error I get:
Error in UseMethod("mutate_") : no applicable method for 'mutate_' applied
to an object of class "factor"
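I would appreciate any help; I am a beginner.
Here is an example of how to convert the numbers (1:4) to a factor ("winter", "spring", "summer", "fall"). The key is to use the factor function and set levels and labels accordingly.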
# Create example data frame
bikeshare <- data.frame(season = 1:4)
bikeshare
# season
# 1 1
# 2 2
# 3 3
# 4 4
library(dplyr)
bikeshare2 <- bikeshare %>%
  mutate(season = factor(as.character(season),
                         levels = c(1, 2, 3, 4),
                         labels = c("winter", "spring", "summer", "fall")))
bikeshare2
# season
# 1 winter
# 2 spring
# 3 summer
# 4 fall
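The example above leaves winter as the first level, whereas the question asks for spring as the baseline. One possible way to get there, shown here only as a minimal sketch, is base R's relevel(), which moves a chosen level to the front. The variable names season_factor and season_spring_first are made up for illustration, and the data frame is the same toy example as above.

```r
# Toy data, mirroring the example data frame used above
bikeshare <- data.frame(season = 1:4)

# Recode the integers into a factor with meaningful labels
season_factor <- factor(bikeshare$season,
                        levels = c(1, 2, 3, 4),
                        labels = c("winter", "spring", "summer", "fall"))

# Move "spring" to the front so it becomes the reference (baseline) level;
# the remaining levels keep their existing order
season_spring_first <- relevel(season_factor, ref = "spring")

levels(season_spring_first)
#> [1] "spring" "winter" "summer" "fall"
```

The values themselves are unchanged by relevel(); only the order of the levels differs.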
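To create a factor, you need to supply the data and the season labels to the factor() function.
However, since you want spring to be the baseline level, I believe you have to specify the levels and labels in a particular order, forcing spring to come first: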
bikeshare <- data.frame(season = 1:4)
bikeshare$seasonfactor <- factor(x = bikeshare$season,
                                 levels = c(2, 3, 4, 1),
                                 labels = c("spring", "summer", "fall", "winter"))
str(bikeshare$seasonfactor)
#> Factor w/ 4 levels "spring","summer",..: 4 1 2 3
bikeshare
#> season seasonfactor
#> 1 1 winter
#> 2 2 spring
#> 3 3 summer
#> 4 4 fall
Created on 2019-03-03 by the reprex package (v0.2.1)
This is a little confusing, because in seasonfactor spring is represented by 1, whereas in the original season it is represented by 2.
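To make that mismatch concrete, the following sketch (same toy data as above, base R only) prints the internal integer codes next to the labels. The codes are positions within levels(), not the original season numbers.

```r
# Same toy data and recoding as in the answer above
bikeshare <- data.frame(season = 1:4)
bikeshare$seasonfactor <- factor(x = bikeshare$season,
                                 levels = c(2, 3, 4, 1),
                                 labels = c("spring", "summer", "fall", "winter"))

# Internal integer codes: the position of each value within levels()
as.integer(bikeshare$seasonfactor)
#> [1] 4 1 2 3

# The labels still line up with the original season numbers 1:4
as.character(bikeshare$seasonfactor)
#> [1] "winter" "spring" "summer" "fall"

# The first level is the one treated as the baseline
levels(bikeshare$seasonfactor)[1]
#> [1] "spring"
```

So if the original numeric coding matters later on, it is probably safer to keep the season column alongside the factor, as the answer does, rather than trying to recover it with as.integer().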
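Another approach is to label the integer levels with the help of the labelled package. They remain integers, with the labels stored as metadata. If at any point you want to convert the labelled integers to a factor, you can use the to_factor function.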
library("tidyverse")
library("labelled")
labels <- c(
  "winter" = 1,
  "spring" = 2,
  "summer" = 3,
  "fall" = 4)
bikeshare <-
  tibble(season = 1:4) %>%
  mutate(season = labelled(season, labels)) %>%
  mutate(seasonF = to_factor(season))
bikeshare
#> # A tibble: 4 x 2
#> season seasonF
#> <int+lbl> <fct>
#> 1 1 [winter] winter
#> 2 2 [spring] spring
#> 3 3 [summer] summer
#> 4 4 [fall] fall
Created on 2019-03-03 by the reprex package (v0.2.1)