Chapter 1 Preamble

The goal of the course is to give you enough information to start a quantitative research project in accounting and finance. The emphasis of the course is not so much on the actual accounting and finance focus of the thesis. (I think these notes could also be useful to people outside of accounting and finance who want to embark on their first research project with observational data.) Everyone will have a different focus and it is difficult to give general advice that will be useful to everyone. The aim of the course is threefold.

  1. To teach general research skills such as doing a literature review, pitching a research idea, and understanding a theory.
  2. To teach practical R programming skills.
  3. To teach how to perform common statistical procedures.

This should give you just enough information to start a research project but also just enough to make mistakes. In the notes, I avoid most of the statistical theory and ignore the assumptions of a lot of the statistical methods we are going to use. I do this for three reasons. First of all, the skills we will teach in this unit will translate better to jobs outside of academia. Second, there are excellent introductions to the theory available (Angrist and Pischke 2008; Cunningham 2018). (See Scott Cunningham’s website for the book and Jake Johnson’s GitHub repository for the R datasets and the R code equivalent to the Stata code in the book.) If you plan on becoming an academic in accounting and finance, these are must-reads. Third, the literature in your chosen field will have a number of preferred statistical methods for a given research question. While it is a good idea to question current research practices, as a budding researcher you might not yet have developed the knowledge base to question practices that have survived the peer review process and criticism from follow-up research.

The advantage of focusing on teaching just enough statistical knowledge to start a project is that I can spend more time and space on other topics (pet peeves of mine). While most statistical books emphasise the role of theory in the analysis, there is often very little guidance on what that means in an actual research project. From the start, I will use an example with compensation data from S&P 500 firms in the U.S. to illustrate how to make sure that your data reflect your theory. (I am not a specialist on the executive compensation literature and all the conclusions I draw should be taken with a grain of salt.)

This is where the R statistical language comes in. To test whether a theory is correct, we can use graphs, and the R language has excellent facilities for making graphics. Descriptive plots of the data can help to evaluate whether the data are measuring what we think they are measuring and whether the statistical method is actually appropriate. In addition, a lot of modern theories in accounting and finance require strong mathematical skills to really understand them. However, I find that simulations can often provide a better intuition for which variables are important in a theory and why. The R language again makes it very easy to program those simulations, and I will use simulations to illustrate an advanced theory in the executive compensation literature.

Lastly, simulations can also help to better understand the assumptions and limitations of a statistical method. My aim with the focus on simulations is to give you the tools to evaluate your preferred method even if you do not have the mathematical background to understand all the assumptions of the method. (I do not want to imply that simulations can replace a strong mathematical understanding, but they can get you halfway there.)
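To give a flavour of what such a simulation can look like, here is a minimal sketch in base R. It is not the example used later in the notes: the function name `simulate_slope`, the `confounding` parameter, and all the numbers are hypothetical choices made purely for illustration. The idea is to generate data where we know the true effect of `x` on `y` is 1, run a simple regression many times, and see what happens to the estimate when an unobserved variable `z` also drives `x`.

```r
# Simulation sketch: how does an OLS slope behave when an assumption fails?
# All names and parameter values are hypothetical, chosen for illustration.
set.seed(1)

simulate_slope <- function(n = 500, confounding = 0) {
  z <- rnorm(n)                      # unobserved variable
  x <- confounding * z + rnorm(n)    # x depends on z when confounding != 0
  y <- 1 * x + z + rnorm(n)          # the true effect of x on y is 1
  unname(coef(lm(y ~ x))["x"])       # slope from a regression that omits z
}

# Repeat the simulation many times for each scenario
slopes_ok     <- replicate(500, simulate_slope(confounding = 0))
slopes_biased <- replicate(500, simulate_slope(confounding = 1))

mean(slopes_ok)      # close to the true value of 1
mean(slopes_biased)  # systematically above 1 (around 1.5 in this setup)
```

Because we wrote the data generating process ourselves, we know what the right answer is, and we can see directly when the method recovers it and when it does not. A histogram of each set of slopes, for instance with `hist(slopes_biased)`, would show the same thing graphically.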


Angrist, Joshua D., and Jörn-Steffen Pischke. 2008. Mostly Harmless Econometrics: An Empiricist’s Companion. Princeton University Press.
Cunningham, Scott. 2018. Causal Inference: The Mixtape.

Page built: 2022-02-01 using R version 4.1.2 (2021-11-01)