Tidying my panel dataset: drop observations that meet a condition if the same ID also meets a complementary condition


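I am working in Stata 16.0 with a dataset of 118,979 observations on 9 variables. The key variable, measure, records whether a company's observation on a given date is reported as "GPS" or "EPS". A company can report a "GPS" observation at one data point and an "EPS" observation at the next. See the data sample below for an illustration.

Data sample: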

clear
input str8 cusip8 str16 cname str4 measure double actual long anndats_act float(fyear tanalyst meanforcast UE)
"87482X10" "TALMER BANCORP"   "EPS"   1.21 20118 2014  29   .8686207     .3930131
"87482X10" "TALMER BANCORP"   "GPS"   1.02 20479 2015  34   .8576471     .1893004
end

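Whenever an identifier (cusip8 in the sample above) reports EPS on a date, I need to drop its GPS observations for that date, and this can happen on multiple dates. That is, if a company reports both GPS and EPS on, say, 1 January 2010, I want to drop the GPS observation and keep the EPS one. If a company reports only GPS and no EPS on a given date, I want to keep the GPS observation in the dataset.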

stata panel-data
1 Answer

The following worked for me (adjust the variable names as needed):

. clear

. input str10(company_id measure) month day year

     company_id measure month day year
  1. "Company A" "EPS" 1 1 2010
  2. "Company A" "GPS" 1 1 2010 
  3. "Company A" "GPS" 1 1 2010
  4. "Company A" "GPS" 1 2 2010
  5. "Company B" "EPS" 1 2 2010
  6. "Company B" "GPS" 1 1 2010
  7. "Company C" "GPS" 1 4 2010
  8. "Company C" "EPS" 1 4 2010
  9. end

. 
. gen date = mdy(month,day,year)

. format date %d

. drop month day year

. 
. sort company_id date measure

. 
. gen both = 0

. by company_id date: replace both = 1 if measure[1] == "EPS" & measure[2] == "GPS"
(5 real changes made)

. 
. list, sepby(company_id)

     +----------------------------------------+
     | company~d   measure        date   both |
     |----------------------------------------|
  1. | Company A       EPS   01jan2010      1 |
  2. | Company A       GPS   01jan2010      1 |
  3. | Company A       GPS   01jan2010      1 |
  4. | Company A       GPS   02jan2010      0 |
     |----------------------------------------|
  5. | Company B       GPS   01jan2010      0 |
  6. | Company B       EPS   02jan2010      0 |
     |----------------------------------------|
  7. | Company C       EPS   04jan2010      1 |
  8. | Company C       GPS   04jan2010      1 |
     +----------------------------------------+

. 
. drop if measure == "GPS" & both == 1
(3 observations deleted)

. 
. list, sepby(company_id)

     +----------------------------------------+
     | company~d   measure        date   both |
     |----------------------------------------|
  1. | Company A       EPS   01jan2010      1 |
  2. | Company A       GPS   02jan2010      0 |
     |----------------------------------------|
  3. | Company B       GPS   01jan2010      0 |
  4. | Company B       EPS   02jan2010      0 |
     |----------------------------------------|
  5. | Company C       EPS   04jan2010      1 |
     +----------------------------------------+
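
One caveat about the log above: the test `measure[1] == "EPS" & measure[2] == "GPS"` relies on the sort order and on each company-date group holding exactly one EPS row. If a group contained two EPS rows followed by GPS rows, `measure[2]` would be "EPS" and `both` would stay 0. A minimal sketch of a variant that does not depend on row positions is given below; it reuses the toy data from the answer, and the flag name `has_eps` is a hypothetical name chosen for illustration.

```stata
clear
input str10(company_id measure) month day year
"Company A" "EPS" 1 1 2010
"Company A" "GPS" 1 1 2010
"Company A" "GPS" 1 1 2010
"Company A" "GPS" 1 2 2010
"Company B" "EPS" 1 2 2010
"Company B" "GPS" 1 1 2010
"Company C" "GPS" 1 4 2010
"Company C" "EPS" 1 4 2010
end

generate date = mdy(month, day, year)
format date %td
drop month day year

* has_eps (hypothetical name) equals 1 on every row of a company-date
* group that contains at least one EPS row, and 0 otherwise
bysort company_id date: egen has_eps = max(measure == "EPS")

* Drop GPS rows only where the same company reported EPS on the same date
drop if measure == "GPS" & has_eps == 1

list, sepby(company_id)
```

With the variables from the question, the grouping would presumably be `bysort cusip8 anndats_act:`, assuming `anndats_act` is the date that defines a same-date report.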