How to create a variable in R based on another dataset of a different length


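I am trying to create a variable STATE in my dataset. The variable already exists in another dataset, but that dataset has a different number of rows than mine.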

Both objects have a state code variable, GESTFIPS. So I just want R to check where GESTFIPS matches and then create the variable STATE in my dataset accordingly.

I tried:

> state_1865_base$STATE[state_1865_base$GESTFIPS==urate2$GESTFIPS] < - 
+ urate2$STATE[state_1865_base$GESTFIPS==urate2$GESTFIPS] 

and received this error message:

Error in -urate2$STATE[state_1865_base$GESTFIPS == urate2$GESTFIPS] : 
  invalid argument to unary operator
In addition: Warning messages:
1: In state_1865_base$GESTFIPS == urate2$GESTFIPS :
  longer object length is not a multiple of shorter object length
2: In state_1865_base$GESTFIPS == urate2$GESTFIPS :
  longer object length is not a multiple of shorter object length

My dataset looks like this (132,990 observations of 117 variables):

'data.frame':    132990 obs. of  117 variables:
 $ IDENTIFIER          : chr  "20030100013280" "20030100013344" "20030100013352" "20030100013848" ...
 $ AGE                 : num  60 41 26 36 51 32 44 21 33 39 ...
 $ MALE                : num  1 0 0 0 1 0 0 0 0 0 ...
 $ BLACK               : num  1 0 0 1 0 0 0 0 0 1 ...
 $ MARRIED             : num  1 1 1 1 1 0 1 0 1 1 ...
 $ NUM_CHILD           : num  0 2 0 2 2 1 1 1 3 4 ...
 $ HV_CHILD            : num  0 1 0 1 1 1 1 1 1 1 ...
 $ AGE_YOUNGEST        : num  NA 0 NA 9 14 2 9 14 3 4 ...
 $ CHILD_4             : num  0 1 0 0 0 1 0 0 1 0 ...
 $ CHILD_5             : num  0 1 0 0 0 1 0 0 1 1 ...
 $ GRADE               : num  17 13 13 12 17 16 12 13 13 13 ...
 $ SPOUSE_EMP          : num  0 1 0 1 0 1 1 NA 1 0 ...
 $ SPOUSE_WORKHOURS    : num  NA 50 NA 40 NA 40 50 NA 40 NA ...
 $ WORKING             : num  1 1 1 0 1 1 1 1 1 1 ...
 $ UNEMP               : num  0 0 0 1 0 0 0 0 0 0 ...
 $ RETIRED             : num  0 0 0 0 0 0 0 0 0 0 ...
 $ DISABLED            : num  0 0 0 0 0 0 0 0 0 0 ...
 $ STUDENT             : num  0 0 0 0 0 0 0 0 0 0 ...
 $ HOMEMAKER           : num  0 0 0 0 0 0 0 0 0 0 ...
 $ WORK_PART           : num  1 1 1 0 0 0 0 0 0 0 ...
 $ HH_INCOME_03        : num  660 200 200 NA NA ...
 $ WAGE_03             : num  22 6.67 16.67 NA NA ...
 $ WAGE_03_ALT         : num  22 NA 12.5 NA NA NA NA 9.5 14 12 ...
 $ YEAR                : num  2003 2003 2003 2003 2003 ...
 $ DATASET             : num  2003 2003 2003 2003 2003 ...
 $ INTERVIEW_DAY       : num  5 6 6 4 4 4 1 2 6 4 ...
 $ INTERVIEW_DATE      : Date, format: "2003-01-03" "2003-01-04" "2003-01-04" "2003-01-02" ...
 $ GESTFIPS            : num  6 6 6 13 21 21 22 26 27 34 ...
[list output truncated]

This is the dataset urate, which holds the state names (204 observations of 6 variables):

STATE  GESTFIPS  NOBS  TWOYEAR  UNEMP       URATE
AL     1         434   1        0.05392952  5.19585
AL     1         288   2        0.02666941  3.63750
AL     1         266   3        0.03848163  4.24585
AL     1         248   4        0.11545039  9.59580
AK     2          62   1        0.07917716  7.52915
AK     2          41   2        0.12782212  6.70415
AK     2          38   3        0.00000000  6.25835
1 Answer

state_1865_base$STATE <- urate2$STATE[match(state_1865_base$GESTFIPS, urate2$GESTFIPS)] should work.
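
Some background on why the original attempt fails, and a small sketch of the match() lookup. In the attempted code, `< -` (with a space) is parsed as the comparison `<` followed by a unary minus, not as the assignment `<-`; applying unary minus to a character vector gives the "invalid argument to unary operator" error. Separately, `==` compares two vectors element by element and recycles the shorter one, so comparing a 132990-row column with a 204-row column triggers the length warning and does not perform a lookup. The data below are made-up stand-ins for state_1865_base and urate2, not the real datasets; code 6 is deliberately left out of the toy lookup table to show what happens when a code has no match.

```r
# Toy stand-ins for the two data frames in the question (values are made up).
state_1865_base <- data.frame(
  IDENTIFIER = c("a", "b", "c", "d", "e"),
  GESTFIPS   = c(2, 1, 1, 6, 2),
  stringsAsFactors = FALSE
)
urate2 <- data.frame(
  STATE    = c("AL", "AL", "AK", "AK"),
  GESTFIPS = c(1, 1, 2, 2),
  TWOYEAR  = c(1, 2, 1, 2),
  stringsAsFactors = FALSE
)

# For each GESTFIPS in the large table, match() gives the position of the
# first row of urate2 with the same code, or NA when no row has that code.
idx <- match(state_1865_base$GESTFIPS, urate2$GESTFIPS)

# Indexing STATE by those positions yields exactly one value per row of the
# large table, so the two tables do not need to have the same length.
state_1865_base$STATE <- urate2$STATE[idx]

print(state_1865_base)
```

Because urate2 has several rows per state, match() simply uses the first row for each code, which is fine as long as STATE is the same on every row with a given GESTFIPS. Rows whose code does not appear in urate2 get NA. A merge() against the full urate2 would instead repeat each row of the large table once per matching row, so the lookup table would need to be reduced to one row per GESTFIPS first.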
