Linear regression on split data in R


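I want to group my data so that each group holds the measurements of one species at one Lat and Long across several Years. Then I want to run a linear regression on each of these groups, with N as the dependent variable and Year as the independent variable.

Practice dataset: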

  Species Year Lat Long   N
1       1 1999   1    1   5
2       1 2001   2    1   5
3       2 2010   3    3   4
4       2 2010   3    3   2
5       2 2011   3    3   5
6       2 2012   3    3   8
7       3 2007   8    7 -10
8       3 2019   8    7 100
9       2 2000   1    1   5

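First, I average the data where several measurements were taken in the same Year for the same Species at the same latitude and longitude. Then I split the data by Lat, Long, and Species. However, this still groups together rows whose Lat, Long, and Species are not equal (see $`4`). I would also like to drop groups that contain a single row, such as $`3`, because I only want to use data measured in more than one Year. How can I do this?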

Data <- read.table("Dataset.txt", header = TRUE)
Agr_Data <- aggregate(N ~ Lat + Long + Year + Species, data = Data, mean)
Split_Data <- split(Agr_Data, Agr_Data$Lat + Agr_Data$Long + Agr_Data$Species)
Regression_Data <- lapply(Split_Data, function(Split_Data) lm(N~Year, data = Split_Data) )


Split_Data

$`3`
  Lat Long Year Species N
1   1    1 1999       1 5

$`4`
  Lat Long Year Species N
2   2    1 2001       1 5
3   1    1 2000       2 5

$`8`
  Lat Long Year Species N
4   3    3 2010       2 3
5   3    3 2011       2 5
6   3    3 2012       2 8

$`18`
  Lat Long Year Species   N
7   8    7 2007       3 -10
8   8    7 2019       3 100
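
The mixed `$4` group above appears to come from the `+` inside `split()`: the three columns are added arithmetically, so 2 + 1 + 1 and 1 + 1 + 2 land on the same key. A minimal sketch of the difference, assuming the key columns are copied from the output above (the names `keys`, `sum_key`, and `combo_key` are made up for illustration):

```r
# Key columns of the aggregated data, copied from the output above.
keys <- data.frame(
  Lat     = c(1, 2, 1, 3, 3, 3, 8, 8),
  Long    = c(1, 1, 1, 3, 3, 3, 7, 7),
  Species = c(1, 1, 2, 2, 2, 2, 3, 3)
)

# Arithmetic sum: rows 2 and 3 are different combinations but both give 4.
sum_key <- keys$Lat + keys$Long + keys$Species

# interaction() builds one factor level per distinct combination instead;
# drop = TRUE discards combinations that never occur in the data.
combo_key <- interaction(keys$Lat, keys$Long, keys$Species, drop = TRUE)

length(unique(sum_key))  # number of groups when keys are summed
nlevels(combo_key)       # number of groups when keys are combined
```

Passing a list of columns to `split()` has the same effect, since `split()` forms the interaction of the list elements internally.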

Desired output:

  Lat Long Species   Coefficients
    3    3       2   2.5
    8    7       3   9.167
Tags: r, aggregate, lm

1 answer

Base R solution:

# 1. Import data: 

df <- structure(list(Species = c(1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 2L ),
                     Year = c(1999L, 2001L, 2010L, 2010L, 2011L, 2012L, 2007L, 2019L, 2000L),
                     Lat = c(1L, 2L, 3L, 3L, 3L, 3L, 8L, 8L, 1L),
                     Long = c(1L, 1L, 3L, 3L, 3L, 3L, 7L, 7L, 1L),
                     N = c(5L, 5L, 4L, 2L, 5L, 8L, -10L, 100L, 5L)),
                class = "data.frame", row.names = c(NA, -9L ))

# 2. Aggregate data: 

df <- aggregate(N ~ Lat + Long + Year + Species, data = df, mean)

# 3. Concatenate vecs to create grouping vec: 

df$grouping_var <- paste(df$Species, df$Lat, df$Long, sep = ", ")

# 4. Split-apply-combine: fit lm(N ~ Year) per group and keep the Year slope.
#    Groups with a single row cannot be fitted, so they get NA.

coeff_n <- vapply(split(df, df$grouping_var),
                  function(x) {
                    if (nrow(x) > 1) {
                      coef(lm(N ~ Year, data = x))[["Year"]]
                    } else {
                      NA_real_
                    }
                  },
                  numeric(1))

# 5. Create a data frame of coefficients (names(coeff_n) are the group labels):

coeff_df <- data.frame(grouping_var = names(coeff_n), coeff_n = unname(coeff_n))

# 6. Merge the dataframes together: 

df <- merge(df, coeff_df, by = "grouping_var", all.x = TRUE)
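
The merge above keeps every aggregated row and leaves NA for single-row groups. If the goal is the compact table from the question (one row per Lat/Long/Species with the Year slope), the following is one possible sketch rather than the only approach. It assumes the data is copied from the question; the names `agg`, `groups`, `rows`, and `result` are made up for illustration.

```r
# Data copied from the question.
df <- data.frame(
  Species = c(1, 1, 2, 2, 2, 2, 3, 3, 2),
  Year    = c(1999, 2001, 2010, 2010, 2011, 2012, 2007, 2019, 2000),
  Lat     = c(1, 2, 3, 3, 3, 3, 8, 8, 1),
  Long    = c(1, 1, 3, 3, 3, 3, 7, 7, 1),
  N       = c(5, 5, 4, 2, 5, 8, -10, 100, 5)
)

# Average repeated measurements within the same Lat/Long/Year/Species.
agg <- aggregate(N ~ Lat + Long + Year + Species, data = df, FUN = mean)

# One group per Lat/Long/Species combination; drop = TRUE skips unused ones.
groups <- split(agg, agg[c("Lat", "Long", "Species")], drop = TRUE)

# Keep only groups measured in more than one year.
groups <- Filter(function(g) length(unique(g$Year)) > 1, groups)

# For each group, return one row with its keys and the Year slope.
rows <- lapply(groups, function(g) {
  fit <- lm(N ~ Year, data = g)
  data.frame(Lat = g$Lat[1], Long = g$Long[1], Species = g$Species[1],
             Coefficients = unname(coef(fit)["Year"]))
})

result <- do.call(rbind, rows)
rownames(result) <- NULL
result
```

With the practice dataset this should leave two groups, (Lat 3, Long 3, Species 2) and (Lat 8, Long 7, Species 3), matching the slopes 2.5 and 9.167 given in the question.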