Stock price data

Author

Stijn Masschelein

On this page, we download the stock price data so that we can later calculate the abnormal return after the earnings announcements.

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(here)
here() starts at /Users/stijnmasschelein/Library/CloudStorage/Dropbox/Teaching/lecturenotes/method_package
library(RPostgres)
library(dbplyr)

Attaching package: 'dbplyr'

The following objects are masked from 'package:dplyr':

    ident, sql
i_am("freaky_friday/download_stocks.qmd")
here() starts at /Users/stijnmasschelein/Library/CloudStorage/Dropbox/Teaching/lecturenotes/method_package
earn_ann <- readRDS(here("data", "freaky_friday", "earn_ann.RDS")) 
wrds <- dbConnect(Postgres(),
                  host='wrds-pgdata.wharton.upenn.edu',
                  port=9737,
                  dbname='wrds',
                  user='stimas',
                  sslmode='require')

This section sets the parameters that we will need to limit the download. Following Dellavigna and Pollet (), the beginning date is 300 days before the first earnings announcement and the end date is 75 days after the last earnings announcement. I could have done this for every earnings announcement specificially but Finally, I keep the permno identifiers because these are the only stocks we want the data from.

crsp_input <- earn_ann %>%
  summarise(begin = min(anndat) - 300, end = max(anndat) + 75, .by = permno) %>%
  glimpse()
Rows: 8,745
Columns: 3
$ permno <dbl> 10560, 10656, 88784, 10574, 80585, 10659, 87832, 84606, 88836, …
$ begin  <date> 2003-12-26, 2002-07-12, 2001-01-17, 2001-07-04, 2003-05-06, 20…
$ end    <date> 2006-07-10, 2006-07-19, 2006-07-18, 2006-07-24, 2006-04-11, 20…
permnos <- crsp_input$permno
begin_date <- min(crsp_input$begin)
end_date <- max(crsp_input$end)

I use the same syntax as before to call the WRDS databases as before with sql interspersed with the R parameters created in the previous code block. We get the daily volume, return, price, shares outstanding, cumulative factor to adjust price, and cumulative factor to adjust shares. The latter two are adjustment factors for stock splits and dividends which we probably will not need but if we do we have them.

This is by far the largest download from WRDS and this is why it has it’s own page. We do not want to rerun this more than strictly necessary.

crsp_query <- tbl(wrds, in_schema("crsp_a_stock", "dsf")) %>%
  filter(permno %in% permnos, date >= begin_date, date <= end_date) %>%
  select(permno, date, vol, ret, prc, shrout, cfacpr, cfacshr)
all_stocks <- collect(crsp_query)
saveRDS(all_stocks, here("data", "freaky_friday", "all_stocks.RDS"))
print(all_stocks)
# A tibble: 14,932,742 × 8
   permno date          vol      ret   prc shrout cfacpr cfacshr
    <int> <date>      <dbl>    <dbl> <dbl>  <dbl>  <dbl>   <dbl>
 1  10002 1994-03-10    700 -0.00952 13      2999   1.5     1.5 
 2  10010 1994-03-10  65220 -0.0161   7.62   9348   1.1     1.1 
 3  10011 1994-03-10  31600  0.0179   7.12   5303   1       1   
 4  10012 1994-03-10  46000 -0.05     2.38  14581   1       1   
 5  10019 1994-03-10   2100  0        8.75   5238   1.5     1.5 
 6  10025 1994-03-10  16304 -0.00685 18.1    7329   1       1   
 7  10026 1994-03-10  23525 -0.00680 18.2   10341   2       2   
 8  10032 1994-03-10   4269  0       16.2    6456   4       4   
 9  10035 1994-03-10 268078  0.0282  18.2   13029   1       1   
10  10042 1994-03-10 103900  0        3.56  25318   0.25    0.25
# ℹ 14,932,732 more rows

One important footnote is that the price is negative on days where there were no trades. This might be important going forward.

References

Dellavigna, Stefano, and Joshua M. Pollet. 2009. “Investor Inattention and Friday Earnings Announcements.” The Journal of Finance 64 (2): 709–49. https://doi.org/10.1111/j.1540-6261.2009.01447.x.