In this brief R demo, we will use Difference in Differences (DiD) and the Synthetic Control Method (SCM) for a stylized case study evaluating the impact of Right to Carry Laws on Violent Crime, following Right-to-Carry Laws and Violent Crime: A Comprehensive Assessment Using Panel Data and a State-Level Synthetic Control Analysis (2017) by John J. Donohue, Abhay Aneja, and Kyle D. Weber (available online here). We will use a reduced version of the dataset:
library(tidyverse)
rtc_panel <- read_csv("https://ebenmichael.github.io/assets/code/rtc_panel.csv")
The rtc_panel
dataset contains the following variables:
fips
: State FIPS codestusps
: State abbreviationyear
: YearRTC
: Year that Right to Carry Law was enacted (infinity if never enacted)violent_rate
: Violent crime rate per 100,000did
packageTo estimate the effect using differences in differences with the did
package, we first use the att_gt
function to estimate the “group-time” average treatment effects that correspond to the 2x2 DiD estimate for each treatment time cohort at each event time. The att_gt
function takes in the outcome variable (violent_rate
), the time variable (year
), the unit variable (fips
), the treatment variable (RTC
), and the dataset (rtc_panel
). (Note that the did
package requires the unit variable to be numeric, like the fips code). There is also an additional option for setting the “base period”. Choosing base_period = "universal"
will set the base period to be the first period preceding treatment for each treatment-time cohort. This is how DiD estimates are usually reported, but the default is base_period = "variable"
, which sets the base period to be the immediately preceding period. Here’s a longer discussion about these two choices.
library(did)
did_gt <- att_gt("violent_rate", "year", "fips", "RTC",
data = rtc_panel,
base_period = "universal")
We can then use the aggte
function to aggregate the effects and the ggdid
function to plot them. For instance, we can estimate the ATT across event times and plot the results:
dynamic_effects <- aggte(did_gt, type = "dynamic")
dynamic_effects
##
## Call:
## aggte(MP = did_gt, type = "dynamic")
##
## Reference: Callaway, Brantly and Pedro H.C. Sant'Anna. "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015>
##
##
## Overall summary of ATT's based on event-study/dynamic aggregation:
## ATT Std. Error [ 95% Conf. Int.]
## 110.6341 61.4493 -9.8043 231.0726
##
##
## Dynamic Effects:
## Event time Estimate Std. Error [95% Simult. Conf. Band]
## -37 105.3104 63.0344 -61.4677 272.0886
## -36 113.5941 58.6744 -41.6482 268.8363
## -35 118.4728 67.2342 -59.4173 296.3629
## -34 -91.3345 122.9539 -416.6493 233.9803
## -33 -120.4423 121.8629 -442.8703 201.9857
## -32 -140.7078 129.7619 -484.0351 202.6196
## -31 -162.7163 138.0197 -527.8925 202.4600
## -30 -153.9433 112.1725 -450.7322 142.8456
## -29 -167.8528 89.1481 -403.7233 68.0177
## -28 -146.0240 105.5458 -425.2799 133.2318
## -27 -115.0682 91.1402 -356.2094 126.0731
## -26 -104.0672 85.6207 -330.6046 122.4702
## -25 -108.3694 78.5029 -316.0745 99.3357
## -24 -94.2017 87.9473 -326.8949 138.4915
## -23 -103.5069 85.8359 -330.6137 123.5999
## -22 -112.2479 79.6504 -322.9891 98.4933
## -21 -118.2910 86.6415 -347.5294 110.9474
## -20 -110.6664 84.3430 -333.8233 112.4906
## -19 -113.5881 66.8787 -290.5375 63.3612
## -18 -83.9368 53.0146 -224.2042 56.3306
## -17 -72.9038 46.6644 -196.3697 50.5622
## -16 -90.0053 49.5295 -221.0518 41.0411
## -15 -106.1990 50.5708 -240.0007 27.6026
## -14 -84.5428 46.3322 -207.1297 38.0441
## -13 -49.6734 39.7648 -154.8841 55.5373
## -12 -41.9055 33.2076 -129.7671 45.9561
## -11 -35.2669 28.7651 -111.3744 40.8405
## -10 -34.5409 26.2050 -103.8747 34.7929
## -9 -42.2515 26.3759 -112.0376 27.5346
## -8 -30.9341 26.1834 -100.2108 38.3427
## -7 -29.5279 23.7564 -92.3832 33.3274
## -6 -29.0372 20.2852 -82.7084 24.6340
## -5 -39.9943 21.8755 -97.8731 17.8846
## -4 -37.6013 19.2878 -88.6336 13.4309
## -3 -23.3254 16.9534 -68.1812 21.5305
## -2 -7.8295 10.2285 -34.8923 19.2333
## -1 0.0000 NA NA NA
## 0 7.1452 8.2569 -14.7012 28.9917
## 1 8.8381 13.5319 -26.9649 44.6410
## 2 14.1987 18.5919 -34.9923 63.3896
## 3 20.5033 24.6949 -44.8350 85.8416
## 4 16.7708 29.8539 -62.2175 95.7591
## 5 25.1077 32.9377 -62.0398 112.2552
## 6 27.8412 35.1800 -65.2390 120.9214
## 7 42.7764 39.4095 -61.4942 147.0471
## 8 51.5599 46.9210 -72.5849 175.7046
## 9 71.2388 48.3499 -56.6867 199.1644
## 10 86.3364 51.8398 -50.8228 223.4956
## 11 102.3300 61.4329 -60.2107 264.8706
## 12 106.7955 73.8582 -88.6206 302.2116
## 13 98.3099 71.6390 -91.2343 287.8542
## 14 96.7693 74.8033 -101.1472 294.6859
## 15 104.1865 76.7680 -98.9282 307.3013
## 16 101.5163 78.0221 -104.9166 307.9491
## 17 107.2055 82.0136 -109.7882 324.1992
## 18 113.7354 68.9825 -68.7803 296.2511
## 19 155.2488 87.4207 -76.0512 386.5489
## 20 170.9679 96.6353 -84.7122 426.6480
## 21 175.9863 106.3567 -105.4150 457.3877
## 22 181.8122 103.4490 -91.8960 455.5203
## 23 194.4066 100.4358 -71.3290 460.1422
## 24 168.3268 95.2447 -83.6743 420.3278
## 25 160.4866 132.2870 -189.5219 510.4950
## 26 117.7078 177.9970 -353.2413 588.6569
## 27 140.7333 173.8922 -319.3551 600.8218
## 28 316.1989 97.2005 59.0232 573.3745 *
## 29 333.9844 92.2557 89.8919 578.0768 *
## ---
## Signif. codes: `*' confidence band does not cover 0
##
## Control Group: Never Treated, Anticipation Periods: 0
## Estimation Method: Doubly Robust
ggdid(dynamic_effects)
augsynth
packageNow we can use the augsynth
package to estimate the effect. To do this, we use the `multisynth
function. This function needs the treatment variable to be an indicator, so we can define a new variable rtc
which is equal to 1 if RTC is currently enacted in the year (i.e. year >= RTC
) and 0 otherwise. We can then estimate effects by defining the outcome and treatment variables as the left and right hand sides of an equation, violent_rate ~ rtc
, respectively. We also need to specify the unit variable (stusps
), the time variable (year
), and the dataset (rtc_panel
). We can also specify fixedeff = T
to include fixed effects and time_cohort = T
to aggregate to the time-cohort level when fitting the synthetic control weights. Finally, we can specify the number of leads (n_leads
) to include in the analysis, which is the number of periods after the initial treatment period for which we want to estimate effects.
rtc_panel <- rtc_panel %>%
mutate(rtc = ifelse(year >= RTC, 1, 0))
library(augsynth)
fe_ascm <- multisynth(
violent_rate ~ rtc, stusps, year,
rtc_panel, fixedeff = T, time_cohort = T, n_leads = 11)
We can then use the summary
function to report the ATT estimates along with confidence intervals:
fe_ascm_summ <- summary(fe_ascm)
fe_ascm_summ
##
## Call:
## multisynth(form = violent_rate ~ rtc, unit = stusps, time = year,
## data = rtc_panel, n_leads = 11, fixedeff = T, time_cohort = T)
##
## Average ATT Estimate (Std. Error): 25.485 (42.844)
##
## Global L2 Imbalance: 8.120
## Scaled Global L2 Imbalance: 0.049
## Percent improvement from uniform global weights: 95.1
##
## Individual L2 Imbalance: 76.800
## Scaled Individual L2 Imbalance: 0.158
## Percent improvement from uniform individual weights: 84.2
##
## Time Since Treatment Level Estimate Std.Error lower_bound upper_bound
## 0 Average 15.046516 32.66643 -44.23925 81.15388
## 1 Average 5.814935 36.88648 -58.49601 81.43983
## 2 Average 10.216679 39.22125 -56.00555 90.13640
## 3 Average 19.465837 41.78658 -50.06514 105.77084
## 4 Average 13.217443 40.85973 -54.23392 97.68726
## 5 Average 23.953310 39.17384 -41.29902 104.40867
## 6 Average 21.576615 42.50792 -49.76541 110.33772
## 7 Average 29.743117 46.95288 -45.08877 128.60331
## 8 Average 37.319898 51.22182 -48.60595 142.15950
## 9 Average 50.525898 57.76714 -44.31048 162.82109
## 10 Average 66.709053 60.20486 -33.18479 186.80444
We can also plot the effects (note that we can also use plot(fe_ascm)
, but this will run the summary
function first):
plot(fe_ascm_summ)
To hone in on just the ATT estimates, we can use the plot function with the levels
argument set to "Average"
:
plot(fe_ascm_summ, levels = "Average")