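I have a panel data set and need to add a counter that runs before and after a specific year within each id. That is, I have the following data: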
my.panel <- data.frame(
  id = c(1,1,1,1,1,1,1,1,1, 2,2,2,2,2,2,2,2,2),
  year = c(1910, 1911, 1912, 1913, 1914, 1915, 1916, 1917, 1918,
           1922, 1923, 1924, 1925, 1926, 1927, 1928, 1929, 1930),
  indicator = c(0,0,0,1,0,0,0,0,0, 0,0,0,0,0,0,1,0,0)
)
my.panel
id year indicator
1 1 1910 0
2 1 1911 0
3 1 1912 0
4 1 1913 1
5 1 1914 0
6 1 1915 0
7 1 1916 0
8 1 1917 0
9 1 1918 0
10 2 1922 0
11 2 1923 0
12 2 1924 0
13 2 1925 0
14 2 1926 0
15 2 1927 0
16 2 1928 1
17 2 1929 0
18 2 1930 0
and I need the following:
id year indicator counter
1 1 1910 0 -3
2 1 1911 0 -2
3 1 1912 0 -1
4 1 1913 1 0
5 1 1914 0 1
6 1 1915 0 2
7 1 1916 0 3
8 1 1917 0 4
9 1 1918 0 5
10 2 1922 0 -6
11 2 1923 0 -5
12 2 1924 0 -4
13 2 1925 0 -3
14 2 1926 0 -2
15 2 1927 0 -1
16 2 1928 1 0
17 2 1929 0 1
18 2 1930 0 2
I bet there is some simple dplyr solution out there.
Using dplyr:
library(dplyr)

my.panel %>%
  group_by(id) %>%
  # within each id, subtract the year in which indicator == 1
  mutate(counter = year - year[indicator == 1])
This yields:
# A tibble: 18 x 4
# Groups: id [2]
id year indicator counter
<dbl> <dbl> <dbl> <dbl>
1 1 1910 0 -3
2 1 1911 0 -2
3 1 1912 0 -1
4 1 1913 1 0
5 1 1914 0 1
6 1 1915 0 2
7 1 1916 0 3
8 1 1917 0 4
9 1 1918 0 5
10 2 1922 0 -6
11 2 1923 0 -5
12 2 1924 0 -4
13 2 1925 0 -3
14 2 1926 0 -2
15 2 1927 0 -1
16 2 1928 1 0
17 2 1929 0 1
18 2 1930 0 2
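The one-liner above relies on every id having exactly one row with indicator == 1. If an id had no flagged year, or more than one, year[indicator == 1] would not have length 1 and mutate() would most likely stop with a size error. A minimal sketch of a more defensive variant follows, assuming that the first flagged year should be treated as the event and that ids without any flagged year should get NA. The column name event_year is a hypothetical helper introduced only for illustration, not part of the original answer.

```r
library(dplyr)

my.panel <- data.frame(
  id = rep(1:2, each = 9),
  year = c(1910:1918, 1922:1930),
  indicator = c(0, 0, 0, 1, 0, 0, 0, 0, 0,  0, 0, 0, 0, 0, 0, 1, 0, 0)
)

result <- my.panel %>%
  group_by(id) %>%
  mutate(
    # event_year (hypothetical helper): first year flagged with indicator == 1.
    # Indexing with [1] returns NA when the id has no flagged year.
    event_year = year[indicator == 1][1],
    counter = year - event_year
  ) %>%
  ungroup()

print(result, n = Inf)
```

On the sample data this should give the same counter values as the original answer. The extra event_year column can be dropped with select(-event_year) if it is not needed.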