Problem with multiple conditions in R not working

Question  Votes: 1  Answers: 4

I am having trouble with multiple conditions in R. My data looks like this:

Region in UK     Year        Third column (year.city) 
Liverpool        2008           
Manchester       2010
Liverpool        2016
Chester          2015
Birmingham       2016
Blackpool        2012
Birmingham       2005
Chester          2009
Liverpool        2005
Hull             2011
Leeds            2013
Liverpool        2014
Bradford         2008
London           2010
Coventry         2009
Cardiff          2016 
Liverpool        2007

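What I want to create is a third column containing these groups: Liverpool before 2010, Liverpool after 2010, other cities before 2010, and other cities after 2010. I tried several approaches, such as mutate, but could not get it to work. Can you help me? Thanks.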

r if-statement
4 Answers
1
vote

I would do it as @dvibisan suggested and use dplyr.

# Create a dataframe
df <- structure(list(`Region in UK` = c("Liverpool", "Manchester", "Liverpool", 
                                        "Chester", "Birmingham", "Blackpool", "Birmingham", "Chester", 
                                        "Liverpool", "Hull", "Leeds", "Liverpool", "Bradford", "London", 
                                        "Coventry", "Cardiff", "Liverpool"), 
                     Year = c(2008L, 2010L, 2016L, 2015L, 2016L, 2012L, 2005L, 2009L, 2005L, 2011L, 2013L, 2014L, 2008L, 2010L, 2009L, 2016L, 2007L)), 
                row.names = c(NA, -17L), class = c("data.table", "data.frame"))

# Load the dplyr library to use mutate and if_else (if there were more than 2 conditions of interest for each variable could use case_when)
library(dplyr) 

# Create a new column using mutate, pasting together two conditions
df <-
  df %>% 
  mutate(`Third column (year.city)` = paste0(if_else(grepl("Liverpool", `Region in UK`, fixed = TRUE), `Region in UK`, "Other cities"),
                                             if_else(Year < 2010, " before 2010", " 2010 or after")))
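
The comment above mentions case_when for situations with more than two conditions per variable. As a rough sketch of that alternative (not part of the original answer), the four groups can also be written as explicit cases. The data frame `cities`, its column names `Region` and `Year`, and the label wording are assumptions made for this illustration.

```r
library(dplyr)

# Small illustrative data frame (a subset of the question's rows);
# the names `cities`, `Region` and `Year` are chosen for this sketch only.
cities <- data.frame(
  Region = c("Liverpool", "Manchester", "Liverpool", "Chester"),
  Year   = c(2008L, 2010L, 2016L, 2009L),
  stringsAsFactors = FALSE
)

# case_when() evaluates the conditions top to bottom and uses the first match,
# so later branches only see rows that earlier branches did not claim.
cities <- cities %>%
  mutate(year_city = case_when(
    Region == "Liverpool" & Year < 2010 ~ "Liverpool before 2010",
    Region == "Liverpool"               ~ "Liverpool 2010 or after",
    Year < 2010                         ~ "Other cities before 2010",
    TRUE                                ~ "Other cities 2010 or after"
  ))

print(cities)
```

Because the branches are checked in order, the second and third conditions do not need to repeat the year or city test.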

1
vote

I think the simplest way is to use base R's vectorization:

# create index of categories
vec <- c("Other cities after 2010", "Liverpool after 2010", "Other cities before 2010", "Liverpool before 2010")
# create index vector
ix <- 1 + (df$Region.in.UK == "Liverpool") + 2*(df$Year < 2010)

# index the categories-vector with the index-vector
df$year.city <- vec[ix]

Result:

> df
   Region.in.UK Year                year.city
1     Liverpool 2008    Liverpool before 2010
2    Manchester 2010  Other cities after 2010
3     Liverpool 2016     Liverpool after 2010
4       Chester 2015  Other cities after 2010
5    Birmingham 2016  Other cities after 2010
6     Blackpool 2012  Other cities after 2010
7    Birmingham 2005 Other cities before 2010
8       Chester 2009 Other cities before 2010
9     Liverpool 2005    Liverpool before 2010
10         Hull 2011  Other cities after 2010
11        Leeds 2013  Other cities after 2010
12    Liverpool 2014     Liverpool after 2010
13     Bradford 2008 Other cities before 2010
14       London 2010  Other cities after 2010
15     Coventry 2009 Other cities before 2010
16      Cardiff 2016  Other cities after 2010
17    Liverpool 2007    Liverpool before 2010
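
To see why the index arithmetic in this answer works, here is a minimal sketch on four rows. Logical values count as 0 or 1 in arithmetic, so the two tests combine into a unique number from 1 to 4. The vectors `region`, `year` and `labels` are names chosen for this illustration, and the label wording ("2010 or after") is an assumption that makes explicit that the year 2010 itself falls in the "after" group.

```r
# Hypothetical mini data set; vector names are chosen for this sketch only.
region <- c("Liverpool", "Manchester", "Liverpool", "Chester")
year   <- c(2008, 2010, 2016, 2009)

# One label per possible index value.
labels <- c("Other cities 2010 or after",  # index 1: neither condition holds
            "Liverpool 2010 or after",     # index 2: Liverpool only
            "Other cities before 2010",    # index 3: before 2010 only
            "Liverpool before 2010")       # index 4: both conditions hold

# TRUE counts as 1 and FALSE as 0, so each combination maps to one of 1..4.
ix <- 1 + (region == "Liverpool") + 2 * (year < 2010)

year_city <- labels[ix]
print(data.frame(region, year, ix, year_city))
```

The factor 2 on the second test is what keeps the combinations distinct; with a third yes/no condition the same idea would use a factor of 4 and eight labels.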

0
votes

Try this:

Region_in_UK = c( "Liverpool", "Manchester", "Liverpool", "Chester", 
"Birmingham", "Blackpool", "Birmingham", "Chester", "Liverpool", "Hull", 
"Leeds", "Liverpool", "Bradford", "London", "Coventry", "Cardiff", "Liverpool")
Year = c(2008, 2010, 2016, 2015, 2016, 2012, 2005, 2009, 2005, 2011, 2013,
2014, 2008, 2010, 2009, 2016, 2007)
df = data.frame(Region_in_UK, Year)

# Replace the code above with your own data frame if it is bigger
# than the data you displayed, and name it "df" (e.g.:
# df = your_dataframe)

library(dplyr)  # provides mutate()

df$year_city = rep(NA, dim(df)[1])
df = mutate(df, year_city = 
              ifelse (grepl("Liverpool", df$Region_in_UK) &  df$Year < 2010, 
                      "Liverpool before 2010", df$year_city))
df = mutate(df, year_city = 
              ifelse (grepl("Liverpool", df$Region_in_UK) &  df$Year >= 2010, 
                      "Liverpool 2010 and after", df$year_city))
df = mutate(df, year_city = 
              ifelse (!grepl("Liverpool", df$Region_in_UK) &  df$Year < 2010, 
                      "Other before 2010", df$year_city))
df = mutate(df, year_city = 
              ifelse (!grepl("Liverpool", df$Region_in_UK) &  df$Year >= 2010, 
                      "Other 2010 and after", df$year_city))

0
votes

With base R you can do:

 transform(df, year.city = factor(paste(sub('^((?!Liver).)*$', 'other', Region_in_UK,perl = TRUE), Year>2010), label=1:4))

   Region_in_UK Year year.city
1     Liverpool 2008         1
2    Manchester 2010         3
3     Liverpool 2016         2
4       Chester 2015         4
5    Birmingham 2016         4
6     Blackpool 2012         4
7    Birmingham 2005         3
8       Chester 2009         3
9     Liverpool 2005         1
10         Hull 2011         4
11        Leeds 2013         4
12    Liverpool 2014         2
13     Bradford 2008         3
14       London 2010         3
15     Coventry 2009         3
16      Cardiff 2016         4
17    Liverpool 2007         1

You could also do:

transform(df,m=factor(paste(!grepl("Liverpool",Region_in_UK),Year>2010),label=1:4))

or

transform(df,m = factor(paste(sub('(Liverpool)|.*','\\1',Region_in_UK),Year<=2010),label=4:1))
   Region_in_UK Year m
1     Liverpool 2008 1
2    Manchester 2010 3
3     Liverpool 2016 2
4       Chester 2015 4
5    Birmingham 2016 4
6     Blackpool 2012 4
7    Birmingham 2005 3
8       Chester 2009 3
9     Liverpool 2005 1
10         Hull 2011 4
11        Leeds 2013 4
12    Liverpool 2014 2
13     Bradford 2008 3
14       London 2010 3
15     Coventry 2009 3
16      Cardiff 2016 4
17    Liverpool 2007 1
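
The numeric codes 1 to 4 above depend on how factor() sorts the pasted strings. As a hedged variation on the same idea (not the answerer's exact code), the levels can be spelled out and given descriptive labels, so the mapping does not rely on sort order. The vector names and label wording below are assumptions for this sketch; it keeps the answer's `Year > 2010` cut-off, which places 2010 itself in the earlier group.

```r
# Hypothetical mini data set; vector names are chosen for this sketch only.
region <- c("Liverpool", "Manchester", "Liverpool", "Chester")
year   <- c(2008, 2010, 2016, 2009)

# Same key as in the answer: "<not Liverpool> <after 2010>".
key <- paste(region != "Liverpool", year > 2010)

# Explicit levels and labels, so the result does not depend on
# the alphabetical sorting of the key strings.
grp <- factor(
  key,
  levels = c("FALSE FALSE", "FALSE TRUE", "TRUE FALSE", "TRUE TRUE"),
  labels = c("Liverpool 2010 or before", "Liverpool after 2010",
             "Other cities 2010 or before", "Other cities after 2010")
)

print(data.frame(region, year, grp))
```

Calling as.integer(grp) recovers the same 1 to 4 coding that the transform() calls produce.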