Create a column and sort it in ascending order


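I need to create a new column that is a copy of my existing "Engagement" column. This new column (named Hello) needs to be split by the average into two categories (8 values above the average, and the 8 values below the average in the other category). I tried using mutate and arrange from the dplyr package, but nothing has worked so far. I get the error "no applicable method for 'mutate' applied to an object of class "c('double', 'numeric')"" when I try this code: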

mutate (classe_df, Hello, Engagement)
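For context, this error message typically appears when the first argument passed to mutate is a plain numeric vector rather than a data frame. The sketch below reproduces that situation on a small made-up data frame (toy_df and its values are hypothetical, not the data from this question) and then shows the name = value form that mutate expects when copying a column.

```r
library(dplyr)

# Hypothetical toy data, not the 16 videos from the question
toy_df <- data.frame(Engagement = c(0.12, 0.05, 0.26, 0.02))

# Passing a bare numeric column as the first argument fails,
# because mutate() has no method for numeric vectors
err <- tryCatch(
  mutate(toy_df$Engagement, Hello = 1),
  error = function(e) conditionMessage(e)
)
print(err)

# Passing the data frame plus a name = value pair copies the column
copied <- mutate(toy_df, Hello = Engagement)
print(copied)
```

The new column has to be written as Hello = Engagement; listing Hello and Engagement as separate arguments does not create a copy.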

My data has 16 values, and the Engagement column is the number of likes divided by the number of views.

Here is my full code:

Sujet <- sprintf("s_%d", 1:16)

Titre <- c("Coffee cake cookies", "Poutine baloney", "Polite Boy","Infinity water glitch", "Garlic  butter steak bites", "Sun Conure", "Pizza in my room", "Parkour skills", "Supercars", "Brown monster", "Batch of pasta", "Police grappler", "F340", "Ceramic brakes", "GT3 Seat Solution", "Over Priced GT2RS" )

Likes <- c(150700, 11200, 3500000, 256700, 1400000, 791300, 439900, 382700, 45400, 1300000, 246200, 32200, 319000, 1892, 3413, 17400)

Vues <- c(1300000, 214500, 13500000, 11400000, 17100000, 6700000, 400000, 11300000, 203100, 18400000, 3000000, 1000000, 1800000, 12600, 23500, 213800)


Engagement <- c(150700 / 1300000, 11200 / 214500, 3500000 / 13500000, 256700 / 11400000, 1400000 / 17100000, 791300 / 6700000, 439900 / 4000000, 382700 / 11300000, 45400 / 203100, 1300000 / 18400000, 246200 / 3000000, 32200 / 1000000, 319000 / 1800000, 1892 / 12600, 3413 / 23500, 17400 / 213800)


classe_df <- data.frame(Sujet, Titre, Likes, Vues, Engagement)

#install.packages("ggplot2")
library(ggplot2)


ggplot(classe_df, aes(x = Vues, y = Likes) ) +
geom_point() +
labs(x = 'Vues', y = 'Likes', title = 'Relation entre les likes et les vues') +
theme_light()

ggsave("ggplot.jpg",
   scale = 1,
   width = 10,
   height = 7,
   units = "cm",
   dpi = 300)


#install.packages("dplyr")
library(dplyr)
1 answer

Is this what you're looking for, @Sam?

Updated based on Sam's edit

Sujet <- sprintf("s_%d", 1:16)

Titre <- c("Coffee cake cookies", "Poutine baloney", "Polite Boy","Infinity water glitch", "Garlic  butter steak bites", "Sun Conure", "Pizza in my room", "Parkour skills", "Supercars", "Brown monster", "Batch of pasta", "Police grappler", "F340", "Ceramic brakes", "GT3 Seat Solution", "Over Priced GT2RS" )

Likes <- c(150700, 11200, 3500000, 256700, 1400000, 791300, 439900, 382700, 45400, 1300000, 246200, 32200, 319000, 1892, 3413, 17400)

Vues <- c(1300000, 214500, 13500000, 11400000, 17100000, 6700000, 400000, 11300000, 203100, 18400000, 3000000, 1000000, 1800000, 12600, 23500, 213800)

Engagement <- c(150700 / 1300000, 11200 / 214500, 3500000 / 13500000, 256700 / 11400000, 1400000 / 17100000, 791300 / 6700000, 439900 / 4000000, 382700 / 11300000, 45400 / 203100, 1300000 / 18400000, 246200 / 3000000, 32200 / 1000000, 319000 / 1800000, 1892 / 12600, 3413 / 23500, 17400 / 213800)


classe_df <- data.frame(Sujet, Titre, Likes, Vues, Engagement)

library(dplyr)

# Label each row by comparing its Engagement to the median of the column
classe_df <- classe_df %>%
  mutate(Hello = ifelse(median(Engagement) > Engagement, "below", "above"))


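Using dplyr, I created a column that looks at classe_df$Engagement, determines the median, and returns "above" or "below" depending on how each value compares to the median (naturally, half of the values are above the median and the rest are below). If this answers your question, please mark this reply as correct/accepted. If not, let me know. Thanks!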
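Since the title also asks about ascending order, here is a minimal sketch of how the median split and the sort could be combined and checked. It uses a small made-up data frame (toy_df and its values are hypothetical), and it swaps in dplyr's if_else, arrange and count, which are not part of the code above. With an even number of distinct values the median gives two equal groups; a split on the mean does not guarantee that in general.

```r
library(dplyr)

# Hypothetical toy data, not the 16 videos from the question
toy_df <- data.frame(
  Sujet = sprintf("s_%d", 1:6),
  Engagement = c(0.12, 0.05, 0.26, 0.02, 0.08, 0.22)
)

toy_sorted <- toy_df %>%
  # Label each row relative to the median of the column
  mutate(Hello = if_else(Engagement > median(Engagement), "above", "below")) %>%
  # Sort rows by Engagement in ascending order
  arrange(Engagement)

# Count how many rows fall in each category
group_sizes <- count(toy_sorted, Hello)
print(toy_sorted)
print(group_sizes)
```

To sort in descending order instead, arrange(desc(Engagement)) could be used.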
