How best to perform this particular row operation in R?


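In my real dataset I have 18 rows where indcode = 000000 and ownership code = 10. The distinguishing factor is the area. Similarly, I have 18 rows where indcode = 4911 and ownership code = 10. The sample data below narrows this down to 4 rows each for ease of computation. Some background: in my real dataset I have monthly data for the current year (02), with one column per month named like 02-Jan. 910 is a new indcode. It represents total federal employment for a particular area and time. Federal employment is defined as indcode = 000000 minus indcode = 4911. The indcode = 55 row is only there to make the example more realistic.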

PS: I had some difficulty with "02-Jan" as a name, so feel free to rename it to Jan. I was just trying to keep it consistent with the real data.

    indcode <- c("000000","000000","000000","000000", "55", "4911","4911","4911","4911")
    ownership <- c("10","10","10","10","10","10","10","10","10")
    area <- c("000000","031","029","017","029","000000","031","029","017")
    # Names such as 02-Jan are non-syntactic, so they must be quoted or backticked
    `02-Jan` <- c(1000,600,300,100,50,100,50,40,10)
    `02-Feb` <- c(1003,601,301,101,51,101,51,41,11)

    # With the default check.names = TRUE, data.frame() renames these columns
    # to X02.Jan and X02.Feb
    first <- data.frame(indcode, ownership, area, `02-Jan`, `02-Feb`)

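So here is an example of the desired result for each area. The actual 02-Jan value would not be 1000-100 but 900; I just thought writing it out this way would make it clearer.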

    indcode    ownership    area      02-Jan      02-Feb
    910        10           000000    1000-100    1003-101
    910        10           031       600-50      601-51
r dplyr data.table pivot
1 Answer
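library(dplyr)  # `.by` needs dplyr >= 1.1.0; the native pipe `|>` needs R >= 4.1
first |>
  summarize(across(3:4, ~paste(rev(range(.)), collapse = "-")), .by = area)
  # "3:4" refers to the 3rd and 4th columns once we set aside the area grouping column
  # We could alternatively specify the columns by name, e.g. X02.Jan:X02.Feb
  # rev(range(.)) is c(max, min) within each area; in the sample data the max is
  # the 000000 row and the min is the 4911 row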

Result

    area  X02.Jan  X02.Feb
1 000000 1000-100 1003-101
2    031   600-50   601-51
3    029   300-40   301-41
4    017   100-10   101-11
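
The result above shows the subtraction as text. Since the question says the real value should be 900 rather than "1000-100", here is a minimal sketch of a numeric variant that selects rows by indcode instead of relying on the max and min. It is a sketch, not part of the original answer. It assumes the data frame is built with `check.names = FALSE` so the `02-Jan` style names survive, and the name `federal` is introduced here only for illustration.

```r
library(dplyr)  # assumes dplyr >= 1.1.0 (for `.by`) and R >= 4.1 (for `|>`)

# Sample data from the question; check.names = FALSE (an assumption of this
# sketch) keeps the original "02-Jan" / "02-Feb" column names
first <- data.frame(
  indcode   = c("000000", "000000", "000000", "000000", "55",
                "4911", "4911", "4911", "4911"),
  ownership = rep("10", 9),
  area      = c("000000", "031", "029", "017", "029",
                "000000", "031", "029", "017"),
  `02-Jan`  = c(1000, 600, 300, 100, 50, 100, 50, 40, 10),
  `02-Feb`  = c(1003, 601, 301, 101, 51, 101, 51, 41, 11),
  check.names = FALSE
)

# Federal employment = indcode 000000 minus indcode 4911,
# computed per ownership and area for every month column
federal <- first |>
  filter(indcode %in% c("000000", "4911")) |>
  summarize(
    across(starts_with("02-"),
           ~ sum(.x[indcode == "000000"]) - sum(.x[indcode == "4911"])),
    .by = c(ownership, area)
  ) |>
  mutate(indcode = "910", .before = 1)  # label the new rows with the new indcode

federal
```

For the sample data this gives 900 and 902 for area 000000, 550 and 550 for 031, 260 and 260 for 029, and 90 and 90 for 017. Because the columns line up with the input, `bind_rows(first, federal)` would append the 910 rows to the original table if that is wanted.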