Chapter 17 Event Studies in Huntington-Klein (2021)
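library(tidyverse)     # tibble(), mutate(), left_join(), %>%
library(fixest)        # feols()
library(modelsummary)  # msummary()
library(knitr)         # kable()

N <- 500                  # number of firms
T <- 2                    # number of periods (this masks the T shorthand for TRUE)
time_effect <- c(3.5, 0)  # common shock in time 1 and time 2
# Firm-level data: firms with performance < 3 get a higher baseline profit
rd_did_firm <- tibble(
  firm = 1:N,
  performance = runif(N, 1, 10),
  firm_effect = rnorm(N, 0, 2) + ifelse(performance < 3, 3, 0)
)
# Firm-by-period panel: firms with performance > 3 report in time 2
rd_did_panel <- tibble(
  firm = rep(1:N, each = T),
  time = rep(1:T, times = N)) %>%
  left_join(rd_did_firm, by = "firm") %>%
  mutate(
    report = ifelse(time == 2, ifelse(performance > 3, 1, 0), 0),
    noise = rnorm(N*T, 0, 3),
    # Potential outcomes with and without a report: the causal effect is 6.5 - 1.5 = 5
    profit_report = 6.5 + time_effect[time] + firm_effect + noise,
    profit_no_report = 1.5 + time_effect[time] + firm_effect + noise,
    actual_profit = ifelse(report == 1, profit_report, profit_no_report))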
rd_did_panel %>%
  mutate(causal_effect = profit_report - profit_no_report) %>%
  group_by(time, report2 = performance > 3) %>%
  summarise(profit_report = mean(profit_report),
            profit_no_report = mean(profit_no_report),
            causal_effect = mean(causal_effect)) %>%
  kable(digits = 1)
time | report2 | profit_report | profit_no_report | causal_effect |
---|---|---|---|---|
1 | FALSE | 13.1 | 8.1 | 5 |
1 | TRUE | 10.2 | 5.2 | 5 |
2 | FALSE | 9.4 | 4.4 | 5 |
2 | TRUE | 6.6 | 1.6 | 5 |
# Pooled OLS, ignoring firm and time differences
did_lm <- feols(actual_profit ~ report, data = rd_did_panel)
# Cross-section of time 2 only
did_sub <- feols(actual_profit ~ report, data = filter(rd_did_panel, time == 2))
# Firm fixed effects only
did_fixed <- feols(actual_profit ~ report | firm, data = rd_did_panel)
# Firm and time fixed effects: the difference-in-differences specification
did_did <- feols(actual_profit ~ report | firm + time, data = rd_did_panel)
msummary(list(simple = did_lm, "time 2" = did_sub, "firm FE" = did_fixed, "two-way FE" = did_did),
         gof_omit = gof_omit, stars = stars)
| | simple | time 2 | firm FE | two-way FE |
|---|---|---|---|---|
| (Intercept) | 5.580*** | 4.380*** | | |
| | (0.144) | (0.308) | | |
| report | 1.005*** | 2.206*** | 1.403*** | 5.091*** |
| | (0.233) | (0.352) | (0.208) | (0.428) |
| Num.Obs. | 1000 | 500 | 1000 | 1000 |
| R2 | 0.018 | 0.073 | 0.624 | 0.685 |
| R2 Within | | | 0.071 | 0.222 |
| RMSE | 3.58 | 3.34 | 2.22 | 2.03 |
| Std.Errors | IID | IID | by: firm | by: firm |
| FE: firm | | | X | X |
| FE: time | | | | X |

Significance: * p < 0.1, ** p < 0.05, *** p < 0.01
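In a two-period panel like this one, the two-way FE coefficient is the same number as the textbook difference-in-differences of group means. The sketch below re-simulates the data in base R and checks that equality. The seed and the names `panel`, `change`, and `reporter` are illustrative assumptions, not part of the notes, and `lm()` with firm and time dummies stands in for `feols()`.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

n_firms <- 500
time_effect <- c(3.5, 0)
performance <- runif(n_firms, 1, 10)
firm_effect <- rnorm(n_firms, 0, 2) + ifelse(performance < 3, 3, 0)

# Two rows per firm, ordered by firm and then by time
panel <- data.frame(
  firm = rep(seq_len(n_firms), each = 2),
  time = rep(1:2, times = n_firms)
)
panel$report <- as.integer(panel$time == 2 & performance[panel$firm] > 3)
panel$actual_profit <- ifelse(panel$report == 1, 6.5, 1.5) +
  time_effect[panel$time] + firm_effect[panel$firm] +
  rnorm(nrow(panel), 0, 3)

# Difference-in-differences by hand: change in profit for reporters
# minus change in profit for non-reporters
change <- panel$actual_profit[panel$time == 2] -
  panel$actual_profit[panel$time == 1]
reporter <- performance > 3
did_by_hand <- mean(change[reporter]) - mean(change[!reporter])

# Same estimate from a regression with firm and time dummies
twfe <- lm(actual_profit ~ report + factor(firm) + factor(time), data = panel)
did_twfe <- unname(coef(twfe)["report"])

c(by_hand = did_by_hand, two_way_fe = did_twfe)
```

The two numbers should agree up to floating-point error and sit near the effect of 5 built into the simulation, which is the value the two-way FE column recovers in the table above.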
Note
We assume that over time investors and regulators get better at detecting when firms exaggerate in their report.
N <- 1000
T <- 3
cutoff2 <- 3 # performance cutoff to report for time 2
cutoff3 <- c(4/3, 4 + 2/3) # performance cutoffs for time 3: to report, and to earn the higher profit from reporting
profit1 <- 5 # profit for time 1 (no firm reports)
profit2 <- c(1.5, 6.5) # profits for time 2 depending on report
profit3 <- c(2/3, 3, 7 + 1/3) # profits for time 3: no report, report below cutoff3[2], report above cutoff3[2]
rd_did3_firm <- tibble(
  firm = 1:N,
  performance = runif(N, 0, 10),
  firm_effect = rnorm(N, 0, 2) + ifelse(performance < cutoff2, 3, 0)
)
rd_did3_panel <- tibble(
  firm = rep(1:N, each = T),
  time = rep(1:T, times = N)) %>%
  left_join(rd_did3_firm, by = "firm") %>%
  mutate(
    # When will firms report?
    report = case_when(
      time == 1 ~ 0,
      time == 2 & performance < cutoff2 ~ 0,
      time == 3 & performance < cutoff3[1] ~ 0,
      TRUE ~ 1),
    noise = rnorm(T*N, 0, 5),
    profit_no_report = firm_effect + noise +
      case_when(
        time == 1 ~ profit1,
        time == 2 ~ profit2[1],
        time == 3 ~ profit3[1]
      ),
    profit_report = firm_effect + noise +
      case_when(
        time == 1 ~ profit1,
        time == 2 ~ profit2[2],
        time == 3 & performance < cutoff3[2] ~ profit3[2],
        TRUE ~ profit3[3]
      ),
    actual_profit = ifelse(report == 1, profit_report, profit_no_report)
  )
causal_effects <- rd_did3_panel %>%
  mutate(causal_effect = profit_report - profit_no_report,
         group = case_when(
           performance < cutoff3[1] ~ 1,
           performance < cutoff2 ~ 2,
           performance < cutoff3[2] ~ 3,
           TRUE ~ 4
         )) %>%
  group_by(time, group) %>%
  summarise(report = mean(report),
            N = n(),
            M_report = mean(profit_report),
            M_no_report = mean(profit_no_report),
            M_causal_effect = mean(causal_effect))
time | group | report | N | M_report | M_no_report | M_causal_effect |
---|---|---|---|---|---|---|
1 | 1 | 0 | 141 | 7.4 | 7.4 | 0.0 |
1 | 2 | 0 | 159 | 7.5 | 7.5 | 0.0 |
1 | 3 | 0 | 168 | 4.9 | 4.9 | 0.0 |
1 | 4 | 0 | 532 | 5.0 | 5.0 | 0.0 |
2 | 1 | 0 | 141 | 9.3 | 4.3 | 5.0 |
2 | 2 | 0 | 159 | 9.3 | 4.3 | 5.0 |
2 | 3 | 1 | 168 | 6.7 | 1.7 | 5.0 |
2 | 4 | 1 | 532 | 6.7 | 1.7 | 5.0 |
3 | 1 | 0 | 141 | 4.7 | 2.3 | 2.3 |
3 | 2 | 1 | 159 | 5.4 | 3.1 | 2.3 |
3 | 3 | 1 | 168 | 2.9 | 0.6 | 2.3 |
3 | 4 | 1 | 532 | 7.2 | 0.5 | 6.7 |
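The models `twoway12`, `twoway13`, and `twoway123` summarised next are not defined in the code shown. A plausible reconstruction, assuming they repeat the earlier two-way FE specification on different subsets of periods, is sketched below. To keep the block runnable on its own, it restates the simulation compactly in base R; the seed and the names `panel`, `perf`, and `fml` are illustrative, and `cluster = ~firm` is set explicitly to match the "by: firm" standard errors reported in the table.

```r
library(fixest)

set.seed(42)  # arbitrary seed, only so the sketch is reproducible

n_firms <- 1000
n_periods <- 3
cutoff2 <- 3
cutoff3 <- c(4/3, 4 + 2/3)
profit1 <- 5
profit2 <- c(1.5, 6.5)
profit3 <- c(2/3, 3, 7 + 1/3)

performance <- runif(n_firms, 0, 10)
firm_effect <- rnorm(n_firms, 0, 2) + ifelse(performance < cutoff2, 3, 0)

panel <- data.frame(
  firm = rep(seq_len(n_firms), each = n_periods),
  time = rep(seq_len(n_periods), times = n_firms)
)
perf <- performance[panel$firm]

# Nobody reports in time 1; the reporting cutoff drops from cutoff2 to cutoff3[1]
panel$report <- as.integer(
  (panel$time == 2 & perf >= cutoff2) | (panel$time == 3 & perf >= cutoff3[1])
)

# Expected profit without a report, and with one (only used where report == 1)
profit_no_report <- c(profit1, profit2[1], profit3[1])[panel$time]
profit_report <- ifelse(panel$time == 2, profit2[2],
                        ifelse(perf < cutoff3[2], profit3[2], profit3[3]))
panel$actual_profit <- ifelse(panel$report == 1, profit_report, profit_no_report) +
  firm_effect[panel$firm] + rnorm(nrow(panel), 0, 5)

# Assumed specification: firm and time fixed effects, as in did_did earlier
fml <- actual_profit ~ report | firm + time
twoway12  <- feols(fml, data = subset(panel, time != 3), cluster = ~firm)
twoway13  <- feols(fml, data = subset(panel, time != 2), cluster = ~firm)
twoway123 <- feols(fml, data = panel, cluster = ~firm)

sapply(list(twoway12, twoway13, twoway123), function(m) coef(m)["report"])
```

Because the draws differ from the ones behind the table, the coefficients will not match the reported values exactly.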
msummary(list("time 1 and 2" = twoway12, "time 1 and 3" = twoway13,
              "time 1, 2 and 3" = twoway123),
         gof_omit = gof_omit,
         stars = c("*" = .1, "**" = .05, "***" = .01))
| | time 1 and 2 | time 1 and 3 | time 1, 2 and 3 |
|---|---|---|---|
| report | 4.882*** | 5.671*** | 4.219*** |
| | (0.488) | (0.647) | (0.409) |
| Num.Obs. | 2000 | 2000 | 3000 |
| R2 | 0.578 | 0.565 | 0.437 |
| R2 Within | 0.093 | 0.066 | 0.051 |
| RMSE | 3.49 | 3.71 | 4.15 |
| Std.Errors | by: firm | by: firm | by: firm |
| FE: firm | X | X | X |
| FE: time | X | X | X |

Significance: * p < 0.1, ** p < 0.05, *** p < 0.01
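To judge these estimates, it helps to work out the average treatment effect on the treated (ATT) implied by the simulation. The sketch below does that arithmetic from the group sizes in the table of causal effects and the effect sizes implied by `profit2` and `profit3`. The data frame `treated`, its column names, and the helper `att()` are illustrative.

```r
# Causal effects built into the simulation
effect_t2      <- 6.5 - 1.5          # profit2[2] - profit2[1] = 5
effect_t3_low  <- 3 - 2/3            # profit3[2] - profit3[1], about 2.33
effect_t3_high <- (7 + 1/3) - 2/3    # profit3[3] - profit3[1], about 6.67

# Reporting firm-periods, with group sizes taken from the table of causal effects
treated <- data.frame(
  time   = c(2, 2, 3, 3, 3),
  group  = c(3, 4, 2, 3, 4),
  n      = c(168, 532, 159, 168, 532),
  effect = c(effect_t2, effect_t2, effect_t3_low, effect_t3_low, effect_t3_high)
)

# ATT as the firm-weighted mean of the effects among reporting firms
att <- function(d) weighted.mean(d$effect, d$n)

att_time2 <- att(subset(treated, time == 2))  # 5
att_time3 <- att(subset(treated, time == 3))  # about 5.02
att_all   <- att(treated)                     # about 5.01
round(c(time2 = att_time2, time3 = att_time3, all = att_all), 2)
```

Under these assumptions the ATT is close to 5 in every sample. The point estimates for "time 1 and 3" and "time 1, 2 and 3" fall on opposite sides of it, even though the effect of reporting is positive for every firm.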
The paper is forthcoming in JFE but available on SSRN.
Finally, when research settings combine staggered timing of treatment effects and treatment effect heterogeneity across firms or over time, staggered DiD estimates are likely to be biased. In fact, these estimates can produce the wrong sign altogether compared to the true average treatment effects.
While the literature has not settled on a standard, the proposed solutions all deal with the biases arising from the “bad comparisons” problem inherent in TWFE DiD regressions by modifying the set of effective comparison units in the treatment effect estimation process. For example, each alternative estimator ensures that firms receiving treatment are not compared to those that previously received it.
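A minimal sketch of the "bad comparisons" problem, using a hypothetical two-firm, three-period panel with no noise (the firms, effect sizes, and variable names are made up for illustration). The early firm is treated from period 2, the late firm from period 3, and the early firm's effect grows over time. Two-way FE weights each observation by the treatment indicator after removing firm and time means, so the already-treated early firm in period 3 acts as a control and enters with a negative weight.

```r
# Hypothetical staggered design: one early-treated and one late-treated firm
toy <- data.frame(
  firm    = rep(c("early", "late"), each = 3),
  time    = rep(1:3, times = 2),
  treated = c(0, 1, 1,   0, 0, 1),
  effect  = c(0, 1, 4,   0, 0, 1)   # true effects, all positive where treated
)
toy$y <- toy$effect  # no firm effects, time effects, or noise

# Two-way FE estimate from a regression with firm and time dummies
twfe <- unname(coef(lm(y ~ treated + factor(firm) + factor(time), data = toy))["treated"])

# Same estimate via the residualised treatment
# (double demeaning is exact here because the panel is balanced)
resid_d <- toy$treated - ave(toy$treated, toy$firm) -
  ave(toy$treated, toy$time) + mean(toy$treated)
w <- resid_d / sum(resid_d^2)

true_att <- mean(toy$effect[toy$treated == 1])  # (1 + 4 + 1) / 3 = 2

cbind(toy[, c("firm", "time", "treated", "effect")], weight = round(w, 2))
c(twfe = twfe, weighted_sum = sum(w * toy$y), true_att = true_att)
# twfe and weighted_sum are both -0.5, while true_att is 2
```

The early firm's period-3 observation carries the largest effect but a weight of -0.5, which is enough to flip the sign of the estimate in this toy example.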
treatment_group
: first year of treatment

year
: calendar year

[1] "year::-18:cohort::1998" "year::-17:cohort::1998" "year::-16:cohort::1998"
[4] "year::-15:cohort::1998" "year::-14:cohort::1998" "year::-13:cohort::1998"
[7] "year::-12:cohort::1998" "year::-11:cohort::1998" "year::-10:cohort::1998"
[10] "year::-9:cohort::1989" "year::-9:cohort::1998" "year::-8:cohort::1989"
[13] "year::-8:cohort::1998" "year::-7:cohort::1989" "year::-7:cohort::1998"
[16] "year::-6:cohort::1989" "year::-6:cohort::1998" "year::-5:cohort::1989"
[19] "year::-5:cohort::1998" "year::-4:cohort::1989" "year::-4:cohort::1998"
[22] "year::-3:cohort::1989" "year::-3:cohort::1998" "year::-2:cohort::1989"
[25] "year::-2:cohort::1998" "year::0:cohort::1989" "year::0:cohort::1998"
[28] "year::1:cohort::1989" "year::1:cohort::1998" "year::2:cohort::1989"
[31] "year::2:cohort::1998" "year::3:cohort::1989" "year::3:cohort::1998"
[34] "year::4:cohort::1989" "year::4:cohort::1998" "year::5:cohort::1989"
[37] "year::5:cohort::1998"
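Coefficient names of the form `year::-9:cohort::1989` are consistent with the cohort-by-relative-year interactions that fixest's `sunab()` builds before aggregating across cohorts. The model `sa_fe` itself is not shown, so the sketch below only illustrates the general pattern on `base_stagg`, a simulated staggered-adoption data set shipped with fixest. Its variables (`id`, `year`, `year_treated`, `y`, `x1`) are not the ones used in these notes, and `sa_sketch` is an illustrative name.

```r
library(fixest)

data(base_stagg)  # simulated staggered-treatment panel shipped with fixest

# Sun and Abraham style event study: sunab(cohort, period) interacts
# treatment cohorts with relative-time dummies
sa_sketch <- feols(y ~ x1 + sunab(year_treated, year) | id + year,
                   data = base_stagg)

# Event-study coefficients, aggregated across cohorts by relative year
coef(sa_sketch)

# Overall average treatment effect on the treated
summary(sa_sketch, agg = "att")
```

In the notes' setting, the cohort argument would presumably be `treatment_group` and the period argument `year`, with firm and year fixed effects as in the table that follows.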
msummary(sa_fe, gof_omit = gof_omit, stars = stars, statistic = NULL,
         estimate = "{estimate} ({std.error}) {stars}", coef_omit = "-1")
| | (1) |
|---|---|
| year = -9 | −0.003 (0.006) |
| year = -8 | 0.001 (0.005) |
| year = -7 | −0.001 (0.006) |
| year = -6 | −0.002 (0.005) |
| year = -5 | 0.005 (0.005) |
| year = -4 | 0.003 (0.005) |
| year = -3 | 0.004 (0.004) |
| year = -2 | 0.010 (0.006) |
| year = 0 | 0.011 (0.005) * |
| year = 1 | 0.025 (0.006) *** |
| year = 2 | 0.042 (0.006) *** |
| year = 3 | 0.055 (0.005) *** |
| year = 4 | 0.062 (0.005) *** |
| year = 5 | 0.082 (0.006) *** |
| Num.Obs. | 119996 |
| R2 | 0.727 |
| R2 Within | 0.005 |
| RMSE | 0.17 |
| Std.Errors | by: state |
| FE: firm | X |
| FE: year | X |
Note
What is the level of the treatment variable? What is the comparison?