How to calculate lagged investment in R

Question (votes: 0, answers: 2)

I created df1:

year gvkey  capex   ppent
2004  1004 13.033 139.137
2005  1004 16.296 213.380
2006  1004 29.891 260.167
2007  1004 30.334 310.393
2008  1004 27.535 245.586
2009  1004 28.855 334.430
...

I created df2:

year gvkey    ROA
2005  1004 0.02796478
2006  1004 0.04665171
2007  1004 0.05976127
2008  1004 0.06255035
2009  1004 0.03549220
2005  1013 0.06882688
...

I want to create df3:

year gvkey    ROA               lag_investment
2005  1004 0.02796478  capex from 2004 / ppent from 2004
2006  1004 0.04665171  capex from 2005 / ppent from 2005
2007  1004 0.05976127  capex from 2006 / ppent from 2006
2008  1004 0.06255035  capex from 2007 / ppent from 2007
2009  1004 0.03549220  capex from 2008 / ppent from 2008
2005  1013 0.06882688  capex from 2004 / ppent from 2004
...

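I have more than 2,000 firms. gvkey is the firm id.

What I basically want to do is the following:

1) Calculate the previous year's investment (capex / ppent) from df1.

2) Create a column named "lag_investment" in df2.

3) Insert the value from step 1) into the row for the current year in df2.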
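The three steps above can be sketched in base R as a join on firm and year. This is only a minimal sketch, not one of the posted answers: the helper data frame `lagged` is a name introduced here, and the capex/ppent numbers for firm 1013 are invented purely for illustration. The idea is to compute capex / ppent for every firm-year in df1, relabel each value with the following year, and then left-join it onto df2.

```r
# Toy data shaped like the question's df1 and df2.
# The capex/ppent values for gvkey 1013 are made up for illustration only.
df1 <- data.frame(
  year  = c(2004, 2005, 2006, 2004, 2005),
  gvkey = c(1004, 1004, 1004, 1013, 1013),
  capex = c(13.033, 16.296, 29.891, 5, 6),
  ppent = c(139.137, 213.380, 260.167, 50, 60)
)
df2 <- data.frame(
  year  = c(2005, 2006, 2005),
  gvkey = c(1004, 1004, 1013),
  ROA   = c(0.02796478, 0.04665171, 0.06882688)
)

# Step 1: investment rate per firm-year, labelled with the FOLLOWING year
# so that it lines up with the row it should be attached to.
lagged <- data.frame(
  year           = df1$year + 1,
  gvkey          = df1$gvkey,
  lag_investment = df1$capex / df1$ppent
)

# Steps 2 and 3: left join onto df2; firm-years without a match get NA.
df3 <- merge(df2, lagged, by = c("gvkey", "year"), all.x = TRUE)
df3 <- df3[order(df3$gvkey, df3$year),
           c("year", "gvkey", "ROA", "lag_investment")]
df3
```

Because the match is on year + 1 rather than on row position, a firm with a missing year in df1 gets NA for the following year instead of silently picking up a value from two years earlier.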

Follow-up question:

What would the code look like if I wanted to do the following instead?

I created df1:

  year gvkey        ROA   ppent  capex
1 2004  1004 0.01320911 139.137 13.033
2 2005  1004 0.03005708 213.380 16.296
3 2006  1004 0.05014214 260.167 29.891
4 2007  1004 0.06423255 310.393 30.334
5 2008  1004 0.06723031 245.586 27.535
6 2009  1004 0.03814769 334.430 28.855
...

I want to add the variable to df1:

  year gvkey        ROA   ppent  capex         lag_investment
1 2004  1004 0.01320911 139.137 13.033
2 2005  1004 0.03005708 213.380 16.296  capex from 2004 / ppent from 2004
3 2006  1004 0.05014214 260.167 29.891  capex from 2005 / ppent from 2005
4 2007  1004 0.06423255 310.393 30.334  capex from 2006 / ppent from 2006
5 2008  1004 0.06723031 245.586 27.535  capex from 2007 / ppent from 2007
6 2009  1004 0.03814769 334.430 28.855  capex from 2008 / ppent from 2008
...

I want to calculate lag_investment for every year except 2004.

Thank you very much!

I think you can use lag from dplyr:

library(dplyr)

df1 %>% mutate(lag_investment = lag(capex)/lag(ppent))

#  year gvkey    ROA ppent capex lag_investment
#1 2004  1004 0.0132   139  13.0         NA
#2 2005  1004 0.0301   213  16.3         0.0937
#3 2006  1004 0.0501   260  29.9         0.0764
#4 2007  1004 0.0642   310  30.3         0.1149
#5 2008  1004 0.0672   246  27.5         0.0977
#6 2009  1004 0.0381   334  28.9         0.1121

If the data frame is not sorted, sort it by year with arrange first:

df1 %>% arrange(year) %>% mutate(lag_investment = lag(capex)/lag(ppent))
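The sample data contains a single firm, but the question mentions more than 2,000 firms. With several gvkey values in one data frame, an ungrouped lag would carry the last year of one firm into the first year of the next. A hedged sketch of a per-firm variant follows; `df_multi` and its firm 1013 values are hypothetical and exist only to show the grouping.

```r
library(dplyr)

# Hypothetical two-firm data, deliberately unsorted.
# The values for gvkey 1013 are invented for illustration.
df_multi <- data.frame(
  year  = c(2005, 2004, 2004, 2005),
  gvkey = c(1004, 1004, 1013, 1013),
  capex = c(16.296, 13.033, 5, 6),
  ppent = c(213.380, 139.137, 50, 60)
)

res <- df_multi %>%
  arrange(gvkey, year) %>%   # order by year within each firm
  group_by(gvkey) %>%        # restart the lag at each firm
  mutate(lag_investment = lag(capex) / lag(ppent)) %>%
  ungroup()

res
```

Each firm's first year is NA here, because no earlier row exists inside its group. Note that lag shifts by row position, so this assumes every firm has consecutive years.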

Or with shift from data.table:

library(data.table)
setDT(df1)[, lag_investment := shift(capex)/shift(ppent)]
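For data with many firms, the same shift call can be applied within each firm by adding a `by` argument. The following is a sketch under that assumption; the table `dt` and the firm 1013 numbers are hypothetical illustration data, not taken from the question.

```r
library(data.table)

# Hypothetical two-firm table; the gvkey 1013 values are invented.
dt <- data.table(
  year  = c(2005, 2004, 2004, 2005),
  gvkey = c(1004, 1004, 1013, 1013),
  capex = c(16.296, 13.033, 5, 6),
  ppent = c(213.380, 139.137, 50, 60)
)

# Sort by firm and year so that shift() looks at the previous year.
setorder(dt, gvkey, year)

# shift() defaults to a lag of one row; by = gvkey restarts it for each firm.
dt[, lag_investment := shift(capex) / shift(ppent), by = gvkey]
dt
```

As with the ungrouped version, this lags by row, so a firm with a gap in its years would need an explicit join on year instead.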

Using data.table, we can do:

library(data.table)
setDT(df1)[, lag_investment := Reduce(`/`, shift(.SD)), .SDcols = c("capex", "ppent")]
df1
#   year gvkey        ROA   ppent  capex lag_investment
#1: 2004  1004 0.01320911 139.137 13.033             NA
#2: 2005  1004 0.03005708 213.380 16.296     0.09367027
#3: 2006  1004 0.05014214 260.167 29.891     0.07637079
#4: 2007  1004 0.06423255 310.393 30.334     0.11489159
#5: 2008  1004 0.06723031 245.586 27.535     0.09772772
#6: 2009  1004 0.03814769 334.430 28.855     0.11211958

Or in base R:

df1$lag_investment <- with(df1, c(NA, head(capex, -1)/head(ppent, -1)))
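The base R one-liner above treats the whole data frame as one series. A possible per-firm variant, sketched here with `ave` from the stats package, applies the same "prepend NA, drop the last value" shift inside each gvkey. The data frame `d` and the firm 1013 values are hypothetical.

```r
# Hypothetical two-firm data; the gvkey 1013 values are invented.
d <- data.frame(
  year  = c(2005, 2004, 2004, 2005),
  gvkey = c(1004, 1004, 1013, 1013),
  capex = c(16.296, 13.033, 5, 6),
  ppent = c(213.380, 139.137, 50, 60)
)

# Sort by firm and year so the previous row is the previous year.
d <- d[order(d$gvkey, d$year), ]

# Current-year investment rate, then shifted down by one row within each firm.
ratio <- d$capex / d$ppent
d$lag_investment <- ave(ratio, d$gvkey,
                        FUN = function(x) c(NA, head(x, -1)))
d
```

Lagging the ratio gives the same result as dividing the two lagged columns, since both numerator and denominator come from the same earlier row.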

Data

df1 <- structure(list(year = 2004:2009, gvkey = c(1004L, 1004L, 1004L, 1004L, 1004L, 1004L), ROA = c(0.01320911, 0.03005708, 0.05014214, 0.06423255, 0.06723031, 0.03814769), ppent = c(139.137, 213.38, 260.167, 310.393, 245.586, 334.43), capex = c(13.033, 16.296, 29.891, 30.334, 27.535, 28.855)), class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6"))

Tags: r, transform, lag