Instructor: Kyle Carlson

Experiments are commonly used in software and internet businesses to evaluate product changes, marketing strategies, and other decisions. Expertise in applying experiments in this context is a valuable skill that this course will help students develop. We will cover the fundamentals of causality, experimental design, and statistical analysis. In addition, we will devote special attention to the particular challenges of experiments in a software business environment. We will analyze how these challenges relate to the theory of causality and to the assumptions of statistical approaches. Students will learn to build intuition about these problems using simulation.

## Course Goals

Students should learn the following by completing the course:

- How to think critically about causality and experiments. This skill includes a critical and skeptical attitude towards claims and analysis regarding causality. Students should understand the fundamental problem of causal inference and to what extent experiments can solve the problem when implemented correctly. Students should be able to understand current debates about the reliability of experiments and statistics.
- How to use experiments to understand causal relationships and make decisions. This skill includes knowledge of experiment design and statistical analysis. However, because this is an applied course, students should learn how to adapt the design and analysis towards the practical goals of their organization.
- How to identify and address practical complications in experiments. Experiments in software or internet businesses present numerous challenges that are less often seen in a research setting. Students should learn how to minimize practical threats to the validity of their experiments.
- How to plan and evaluate experiments using simulations. Experimenters can use simulations to quickly understand complications and potential solutions. This skill is especially useful in applied settings, where theoretical approaches may be too time-consuming or too difficult to communicate.

## Grading

The course will be graded based on the following components:

- 65% Assignments – Homework and Labs
- 10% Attendance
- 25% Final Exam

## Attendance

You will lose 10 percentage points of your attendance score for every class period you miss. And yes, the score can turn negative (e.g., if you miss 11 class periods, your attendance score is -10%).

## Lecture plan

### Class schedule

The course roughly splits in half. The first half focuses on establishing frameworks for thinking about experiments rigorously. In the second half we apply those frameworks to understand the complications of experimentation in practice.

| Date | Topics |
| --- | --- |
| 8/19 Monday | Lecture 1 - Causality fundamentals |
| 8/30 Friday | Lecture 2 - Uncertainty |
| 9/9 Monday | Lecture 3 - Estimation |
| 9/16 Monday | Lecture 4 - Complications in practice: Bias |
| 9/23 Monday | Lecture 5 - Complications in practice: Inference |
| 9/30 Monday | Lecture 6 - Decision-making |
| 10/7 Monday | Lecture 7 - Review |
| 10/14 Monday | Lecture 8 - TBD |

### Class outlines

#### Lecture 1 - Causality fundamentals

- Example cases: Why is experimentation important? Why is it challenging?
- Overview of the course
- Topics: Potential outcomes model, treatment effects, identification, selection bias, independence assumption, SUTVA
- In-class exercises: Law of large numbers, selection bias, conditional independence
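The selection bias exercise can be previewed with a small simulation. The following is a minimal sketch, not course material: the data-generating process, the function name `naive_difference`, and every parameter value are assumptions chosen for illustration. Each unit has a potential outcome without treatment (`y0`) and with treatment (`y1 = y0 + effect`). When units with high `y0` select into treatment, the naive difference in means mixes the treatment effect with the baseline difference between groups; under random assignment it should land close to the true effect.

```python
import random
import statistics


def naive_difference(self_select, n=20000, effect=2.0, seed=0):
    """Difference in mean observed outcomes, treated minus untreated."""
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n):
        y0 = rng.gauss(10.0, 3.0)   # potential outcome without treatment (assumed)
        y1 = y0 + effect            # potential outcome with treatment
        if self_select:
            # Units with a high untreated outcome tend to opt in.
            d = y0 + rng.gauss(0.0, 1.0) > 10.0
        else:
            # Coin-flip assignment, independent of potential outcomes.
            d = rng.random() < 0.5
        if d:
            treated.append(y1)
        else:
            untreated.append(y0)
    return statistics.mean(treated) - statistics.mean(untreated)


biased = naive_difference(self_select=True)
randomized = naive_difference(self_select=False)
print(f"self-selected: {biased:.2f}, randomized: {randomized:.2f}, true effect: 2.00")
```

Changing the strength of the selection rule (the noise standard deviation in the `self_select` branch) shows how the gap between the naive estimate and the true effect grows as selection becomes more closely tied to `y0`.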

#### Lecture 2 - Uncertainty

- Review material from lecture 1 + commentary on problems
- Topics
  - Identification vs. statistical inference
  - “Repeated sampling”
  - Standard errors and confidence intervals
  - Data-generating processes and models
  - Hypothesis testing and p-values
  - Power
  - Commentary about other forms of uncertainty (randomization, Bayesian)
- In-class exercises
  - Central limit theorem
  - Confidence intervals
  - Hypothesis testing
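The "repeated sampling" idea behind confidence intervals can be illustrated by drawing many samples and counting how often the interval covers the true mean. This is a minimal sketch under stated assumptions: the exponential outcome distribution, the sample size, the number of repetitions, and the helper name `ci_coverage` are all hypothetical choices, and the interval uses the normal approximation justified by the central limit theorem.

```python
import math
import random
import statistics


def ci_coverage(n=200, reps=2000, true_mean=1.0, seed=1):
    """Share of normal-approximation 95% intervals that contain true_mean."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(reps):
        # Skewed data: exponential with mean true_mean (rate = 1 / mean).
        sample = [rng.expovariate(1.0 / true_mean) for _ in range(n)]
        m = statistics.mean(sample)
        se = statistics.stdev(sample) / math.sqrt(n)   # estimated standard error
        if m - 1.96 * se <= true_mean <= m + 1.96 * se:
            covered += 1
    return covered / reps


coverage = ci_coverage()
print(f"share of intervals covering the true mean: {coverage:.3f}")
```

Rerunning with a much smaller `n` is one way to see where the normal approximation starts to struggle with skewed data.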

#### Lecture 3 - Estimation

- Review causality fundamentals + uncertainty + commentary on problems
- Topics
  - Calculating means
  - Regression for calculating means
  - Linear probability model
  - Matching to achieve conditional independence
  - Regression to achieve conditional independence
  - Standard errors for means and regressions
  - Hypothesis testing with regression
  - Covariates to improve precision and appropriate use in experiments
  - Robust and clustered standard errors
- In-class exercises
  - Calculating means and probabilities with regression (identifying various distributions)
  - Achieving conditional independence with matching and regression
  - Sampling distribution of point estimates vs. standard error estimates
  - Covariates in regression
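The link between regression and means can be checked numerically: with a single binary treatment indicator, the OLS intercept equals the control-group mean and the slope equals the difference in group means. The sketch below is illustrative only. The simulated data, the assumed effect of 1.5, and all variable names are hypothetical, and the OLS formulas are written out by hand rather than taken from a statistics library.

```python
import random
import statistics

rng = random.Random(2)
d = [rng.random() < 0.5 for _ in range(1000)]            # random assignment
y = [5.0 + 1.5 * di + rng.gauss(0.0, 2.0) for di in d]   # assumed true effect: 1.5

# OLS of y on a constant and the treatment dummy, computed by hand.
x = [1.0 if di else 0.0 for di in d]
x_bar, y_bar = statistics.mean(x), statistics.mean(y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Group means computed directly, for comparison.
mean_treated = statistics.mean([yi for yi, di in zip(y, d) if di])
mean_control = statistics.mean([yi for yi, di in zip(y, d) if not di])

print(f"intercept {intercept:.4f} vs control mean {mean_control:.4f}")
print(f"slope {slope:.4f} vs difference in means {mean_treated - mean_control:.4f}")
```

The two pairs of numbers agree up to floating-point error, so the regression is just another way to compute the same comparison of means.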

#### Lecture 4 - Complications in practice: Biased estimates

- Review causality fundamentals
- Topics
  - Commentary on instrumentation, tech debt, attention, neglect
  - Roll-out: Bias and conditional independence
  - Compliance: Intention-to-treat, “trigger analysis”
  - Unintended treatment-control differences (e.g., error rates, load time)
  - Experiments that affect sample size
  - Interacting experiments
- In-class exercises
  - Simulate bias from weekly cyclicality + roll-out + conditional independence
  - Simulate non-compliance and ITT
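A non-compliance simulation might look like the following. It is a minimal sketch, not the course's exercise: the 60% compliance rate, the baseline gap between compliers and non-compliers, the effect size, and the function name `simulate_itt` are all assumptions. It compares the intention-to-treat (ITT) estimate, which compares units by random assignment, with an "as-treated" comparison by treatment actually received.

```python
import random
import statistics


def simulate_itt(n=40000, effect=4.0, seed=3):
    """Return (ITT estimate, as-treated estimate) under one-sided non-compliance."""
    rng = random.Random(seed)
    assigned, not_assigned, took, did_not_take = [], [], [], []
    for _ in range(n):
        complier = rng.random() < 0.6                 # assumed 60% compliance
        # Assumption: compliers have a higher baseline outcome.
        baseline = rng.gauss(20.0, 5.0) + (3.0 if complier else 0.0)
        z = rng.random() < 0.5                        # random assignment
        d = z and complier                            # treatment actually received
        y = baseline + (effect if d else 0.0)
        (assigned if z else not_assigned).append(y)
        (took if d else did_not_take).append(y)
    itt = statistics.mean(assigned) - statistics.mean(not_assigned)
    as_treated = statistics.mean(took) - statistics.mean(did_not_take)
    return itt, as_treated


itt, as_treated = simulate_itt()
print(f"ITT: {itt:.2f}, as-treated: {as_treated:.2f}, effect on the treated: 4.00")
```

In this setup the ITT estimate is diluted toward the effect times the compliance rate, while the as-treated comparison overstates the effect because receiving treatment is correlated with the baseline outcome.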

#### Lecture 5 - Complications in practice: Inference

- Topics
  - Planning experiments: Power, unknown sample sizes
  - Early-stopping and “peeking”
  - Types of sampling: Simple randomization, blocked, stratified, clustered
  - Clustering and users of a site/software (cookies, sessions, pages)
  - Multiple hypothesis testing, “multiple comparisons”
- In-class exercises
  - Simulate peeking
  - Simulating bias from clustering (application of clustered SEs a la Athey)
  - Simulating multiple hypothesis testing
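The peeking exercise can be sketched as an A/A test in which there is no true difference between arms, so every rejection is a false positive. This is an illustrative sketch only: normally distributed outcomes with known unit variance, the sample sizes, the evenly spaced looks, and the function name `false_positive_rate` are all assumptions. The experiment stops at the first look where the two-sided z-test rejects at the nominal 5% level.

```python
import math
import random


def false_positive_rate(peeks, n_per_arm=500, reps=1000, seed=4):
    """Share of A/A experiments that reject at any of `peeks` evenly spaced looks."""
    rng = random.Random(seed)
    check_points = {n_per_arm * (k + 1) // peeks for k in range(peeks)}
    rejections = 0
    for _ in range(reps):
        sum_a = sum_b = 0.0
        for i in range(1, n_per_arm + 1):
            sum_a += rng.gauss(0.0, 1.0)   # arm A, no treatment effect
            sum_b += rng.gauss(0.0, 1.0)   # arm B, same distribution
            if i in check_points:
                # z-test with known unit variance: SE of the difference is sqrt(2 / i).
                z = (sum_a / i - sum_b / i) / math.sqrt(2.0 / i)
                if abs(z) > 1.96:
                    rejections += 1
                    break                  # stop the experiment at the first rejection
    return rejections / reps


single_look = false_positive_rate(peeks=1)
ten_looks = false_positive_rate(peeks=10)
print(f"one look: {single_look:.3f}, ten looks: {ten_looks:.3f}")
```

With a single look at the end, the false positive rate stays near the nominal level; checking repeatedly and stopping on the first significant result pushes it well above that level.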

#### Lecture 6 - Experiments and economic decision-making

- Review hypothesis testing and power and reasons for experimentation
- Topics
  - Differences between research experiments and applied experiments
  - Hypothesis tests and decision-making: Paradox if simple null hypothesis
  - Value of information
  - Deciding whether to do an experiment
  - Deciding which experiment to do
  - Experimentation for hypothesis-testing vs. optimization
- In-class exercises
  - Simulate researcher vs. applied and hypothesis testing errors
  - Simulate hypothesis testing vs. utility in an applied context
  - Simulate value of information of different experiment designs
  - Simulating multiple treatments and optimization

#### Lecture 7 - Review

#### Lecture 8 - TBD

## Additional materials

### Reference books

- *Mostly Harmless Econometrics: An Empiricist’s Companion* by Joshua D. Angrist and Jörn-Steffen Pischke
- *Counterfactuals and Causal Inference: Methods and Principles for Social Research* by Stephen L. Morgan and Christopher Winship
- *Econometric Analysis of Cross Section and Panel Data* by Jeffrey M. Wooldridge

### Edifying readings

- “Let’s Take the Con Out of Econometrics.” Edward E. Leamer. *American Economic Review* (1983)
- “The Credibility Revolution in Empirical Economics: How Better Research Design Is Taking the Con out of Econometrics.” Joshua D. Angrist and Jörn-Steffen Pischke. *Journal of Economic Perspectives* (2010)
- EconTalk: “Susan Athey on Machine Learning, Big Data, and Causation” (Sept. 12, 2016). Note that the discussion covers the Ed Leamer article above and the work of Card and Krueger (see slides).
- How to Design (and Analyze) a Business Experiment (Hauser and Luca)
- “Consumer heterogeneity and paid search effectiveness: A large-scale field experiment.” Blake, Thomas, Chris Nosko, and Steven Tadelis. *Econometrica* (2015)
- “Online Experimentation at Microsoft”
- Challenging Problems in Online Controlled Experiments: MIT Code 2015 invited talk (10/17/2015): Discussion of esoteric SUTVA violations
- Trustworthy Online Controlled Experiments at Large Scale slides

### Related courses

- “The Statistics of Causal Inference in the Social Sciences” (Sekhon, UC Berkeley)
- “Measuring Impact in Practice” (Broockman, Stanford GSB) [materials]
- “From Data to Decisions: The Role of Experiments” (Luca, Harvard Business School)
- “Empirical Microeconomics” (Jakiela and Ozier, U of MD)
- “Statistical Learning and Causal Inference for Economics” (DiTraglia, UPenn)