Here is my data frame, called cat_data:
print(cat_data)
Metrics 2016 2017 2018
Number of Cats 100 120 150
Number Leaving 32 40 65
Number Staying 68 80 85
Percent of Leavers .32 .33 .43
Percent of Stayers .68 .67 .57
I only want to convert rows 4 and 5 to percentages with a percent sign.
This is the output I want:
Metrics 2016 2017 2018
Number of Cats 100 120 150
Number Leaving 32 40 65
Number Staying 68 80 85
Percent of Leavers 32% 33% 43%
Percent of Stayers 68% 67% 57%
I tried this, but it doesn't work:
cat_data[4:5,2:4] <- paste0(cat_data[4:5,2:4] * 100, "%")
Can anyone tell me what I need to fix? Thanks.
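Your code doesn't work because paste0() is vectorized, and your table isn't set up in a way that allows that vectorization: a data frame is a list of columns, so paste0() works on whole columns rather than on individual cells.
It's a bit clunky, but you can do this: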
cat_data <- tibble::tribble(
~"Metrics", ~"2016", ~"2017", ~"2018",
"Number of Cats", 100, 120, 150,
"Number Leaving", 32, 40, 65,
"Number Staying", 68, 80, 85,
"Percent of Leavers", .32 , .33, .43,
"Percent of Stayers", .68, .67, .57) # create data
percent_data <- cat_data[4:5,] # separate percent rows
cat_data <- cat_data[-(4:5),] # remove percent rows
for (i in 2:4) { # apply the desired transformation to each column
  percent_data[[i]] <- paste0(percent_data[[i]] * 100, "%")
}
cat_data <- rbind(cat_data, percent_data) # bind them back
cat_data
# A tibble: 5 x 4
Metrics `2016` `2017` `2018`
<chr> <chr> <chr> <chr>
1 Number of Cats 100 120 150
2 Number Leaving 32 40 65
3 Number Staying 68 80 85
4 Percent of Leavers 32% 33% 43%
5 Percent of Stayers 68% 67% 57%
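To see what goes wrong in the original attempt, here is a minimal base R sketch. It assumes cat_data is a plain data.frame with numeric year columns (the question does not say), and the names pct, whole and one_col are only illustrative. Passing a data frame to paste0() gives one string per column, while passing a single column gives one string per value.

```r
# Assumption: a base data.frame with numeric year columns
cat_data <- data.frame(
  Metrics = c("Number of Cats", "Number Leaving", "Number Staying",
              "Percent of Leavers", "Percent of Stayers"),
  `2016` = c(100, 32, 68, .32, .68),
  `2017` = c(120, 40, 80, .33, .67),
  `2018` = c(150, 65, 85, .43, .57),
  check.names = FALSE
)

pct <- cat_data[4:5, 2:4] * 100

# A data frame is a list of columns, so paste0() returns one string per column
whole <- paste0(pct, "%")
length(whole)  # 3 strings, not one for each of the 6 cells

# A single column is an ordinary vector, so paste0() works element by element
one_col <- paste0(pct[["2016"]], "%")
one_col
```

That is why the loop above handles one column at a time.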
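Are you sure you want to do this to a data frame in R? It is usually better to format numbers for presentation at the end of an analysis, and formatting them inside the data frame is an unusual choice.
It isn't clear from the question what format your columns are in. Are they numeric, factor, or character?
Without knowing that, the best way to do this in base R is probably to use lapply on each column: convert it to numeric via character, multiply any value less than 1 by 100, convert the whole column to character, and then append a percent sign to the converted numbers.
However, this leaves you with a data frame made entirely of character strings, so you can no longer do arithmetic on the values without converting them back. It would be better to reconsider how you use or display the data.
That said, here is an implementation of the approach described above: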
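as.data.frame(lapply(df, function(x)
{
  # skip the label column, which contains the "Number ..." strings
  if(!any(grepl("Number", x)))
  {
    x <- as.numeric(as.character(x))  # handles numeric, factor or character input
    s <- which(x < 1)                 # proportions are the values below 1
    x[s] <- x[s] * 100
    x <- as.character(x)
    x[s] <- paste0(x[s], "%")
  }
  return(x)
}))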
#> Metrics X2016 X2017 X2018
#>1 Number of Cats 100 120 150
#>2 Number Leaving 32 40 65
#>3 Number Staying 68 80 85
#>4 Percent of Leavers 32% 33% 43%
#>5 Percent of Stayers 68% 67% 57%
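One way to follow the advice above, keeping the numbers numeric and formatting only for display, is sketched below. The helper name format_for_display is hypothetical, and the sketch assumes a base data.frame whose percent rows can be recognised by "Percent" in the Metrics column.

```r
# Assumption: a base data.frame with numeric year columns
cat_data <- data.frame(
  Metrics = c("Number of Cats", "Number Leaving", "Number Staying",
              "Percent of Leavers", "Percent of Stayers"),
  `2016` = c(100, 32, 68, .32, .68),
  `2017` = c(120, 40, 80, .33, .67),
  `2018` = c(150, 65, 85, .43, .57),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns a character copy for printing,
# leaving the input data frame untouched
format_for_display <- function(df, label_col = "Metrics", pattern = "Percent") {
  is_pct <- grepl(pattern, df[[label_col]])
  value_cols <- setdiff(names(df), label_col)
  out <- df
  out[value_cols] <- lapply(df[value_cols], function(x) {
    ifelse(is_pct, paste0(round(x * 100), "%"), as.character(x))
  })
  out
}

display <- format_for_display(cat_data)
print(display)

# The original columns are still numeric, so arithmetic keeps working
mean(cat_data[["2016"]][4:5])
```

The round() call guards against floating-point noise such as .57 * 100 not being exactly 57.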
This should also work:
cat_data[4:5,2:4] <- apply(cat_data[4:5,2:4]*100, 2, function(x) paste0(x, "%"))
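Here is a self-contained sketch of that one-liner. It assumes cat_data is a base data.frame (not a tibble) with numeric year columns, which the question does not confirm. After the assignment the year columns are character, because a single column cannot mix numbers and strings.

```r
# Assumption: a base data.frame with numeric year columns
cat_data <- data.frame(
  Metrics = c("Number of Cats", "Number Leaving", "Number Staying",
              "Percent of Leavers", "Percent of Stayers"),
  `2016` = c(100, 32, 68, .32, .68),
  `2017` = c(120, 40, 80, .33, .67),
  `2018` = c(150, 65, 85, .43, .57),
  check.names = FALSE
)

# apply() walks over the columns (MARGIN = 2) and returns a character matrix
cat_data[4:5, 2:4] <- apply(cat_data[4:5, 2:4] * 100, 2,
                            function(x) paste0(x, "%"))

# Inspect the result: every year column is now character
str(cat_data)
```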
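As @Phil mentioned in his answer, the problem is that your data types conflict. You have to convert the values in the 2016, 2017 and 2018 fields to character. One way is to mutate those fields as follows: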
# load packages
library(tidyverse)
library(scales) # package with function for converting decimal to percent
df %>%
rowwise() %>%
mutate(`2016` = if_else(str_detect(Metrics, "Percent"), scales::percent(`2016`, accuracy = 1), as.character(`2016`))) %>%
mutate(`2017` = if_else(str_detect(Metrics, "Percent"), scales::percent(`2017`, accuracy = 1), as.character(`2017`))) %>%
mutate(`2018` = if_else(str_detect(Metrics, "Percent"), scales::percent(`2018`, accuracy = 1), as.character(`2018`)))
# # A tibble: 5 x 4
# Metrics `2016` `2017` `2018`
# <fct> <chr> <chr> <chr>
# 1 Number of Cats 100 120 150
# 2 Number Leaving 32 40 65
# 3 Number Staying 68 80 85
# 4 Percent of Leavers 32% 33% 43%
# 5 Percent of Stayers 68% 67% 57%
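The three nearly identical mutate() calls can also be collapsed with across(). This is a sketch of an alternative phrasing, not the code above: it assumes dplyr 1.0.0 or later, an installed scales package, and a df with a character Metrics column and numeric year columns. The name result is only illustrative.

```r
library(dplyr)
library(stringr)

# Assumption: Metrics is character and the year columns are numeric
df <- tibble(
  Metrics = c("Number of Cats", "Number Leaving", "Number Staying",
              "Percent of Leavers", "Percent of Stayers"),
  `2016` = c(100, 32, 68, .32, .68),
  `2017` = c(120, 40, 80, .33, .67),
  `2018` = c(150, 65, 85, .43, .57)
)

# Apply the same rule to every column except Metrics
result <- df %>%
  mutate(across(
    -Metrics,
    ~ if_else(str_detect(Metrics, "Percent"),
              scales::percent(.x, accuracy = 1),
              as.character(.x))
  ))

result
```

Because if_else() and str_detect() are already vectorized, this version does not need rowwise().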
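Here is a tidyverse solution. It's hard to know how your data is structured, but it isn't "tidy". I assume you are trying to create a summary table; I have run into similar problems when trying to do the same thing. Using case_when together with mutate_at is one way to do it. If you want to include the % sign, the columns have to be character.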
library(dplyr)
library(stringr)
a <- c("Metrics", "Number of Cats", "Number Leaving", "Number Staying", "Percent of Leavers", "Percent of Stayers")
b <- c(2016, 100, 32, 68, .32, .68)
c <- c(2017, 120, 40, 80, .33, .67)
d <- c(2018, 150, 65, 85, .43, .57)
df <- tibble(a = a ,b = b, c = c, d = d)
df %>%
mutate_at(.vars = c("b", "c", "d"), .funs = list(~case_when(a %in% c("Percent of Leavers", "Percent of Stayers") ~ str_c(round(.x*100), " %"),
TRUE ~ as.character(.x))))
#OUTPUT
a b c d
<chr> <chr> <chr> <chr>
1 Metrics 2016 2017 2018
2 Number of Cats 100 120 150
3 Number Leaving 32 40 65
4 Number Staying 68 80 85
5 Percent of Leavers 32 % 33 % 43 %
6 Percent of Stayers 68 % 67 % 57 %