class: center, middle, inverse, title-slide # The Applied Economist’s Toolkit ### Wei Yang Tham
weiyang.tham@gmail.com
Twitter:
@wytham88
Website: wytham.rbind.io
### 2019-06-20 --- # About Me Finishing my PhD in Economics at The Ohio State University Economics of science and innovation R, dataviz, causal inference -- Improv at [The Nest Theatre](https://nesttheatre.com/) on Broad St. --- ## What happens if you do X? - Does employment decrease if you raise the minimum wage? - Does investing in stem cell research actually lead to more research? - Will running an advertising campaign raise revenue? -- These are questions about **causation** --- ## Estimating causal effects is the bread and butter of many economists Provide intuition for how economists and social scientists have approached this problem Inspiration for ways you might apply this in other contexts --- ## Caveats Lots of advances in methodology have come from other fields including political science, epidemiology, statistics, computer science Only one part of econometrics and economics --- class: inverse, middle ## Why is estimating causal effects hard? --- ## Eating chocolate wins Nobel Prizes? <img src="img/chocolate_nobel.png" width="60%" style="display: block; margin: auto;" /> --- ## .large[$$y = D\beta+ X\gamma + u$$] - `\(y\)` is an outcome we care about (e.g. wages) - `\(D\)` is an indicator for a treatment/intervention (e.g. going to college) - `\(X\)` are some characteristics we observe ("observables") - `\(u\)` is stuff we don't observe ("unobservables") -- #### The goal is estimate `\(\beta\)`, the effect of `\(D\)` on `\(y\)` -- ### Why can't we just run a regression and call it a day? --- ## People don't make choices randomly People who go to college are different than people who don't go to college - Different social networks - Different aptitudes -- ### Many of these differences are *unobserveable* and could directly affect wages --- ## The effect of `\(U\)` on `\(Y\)` is wrongly attributed as the effect of D on Y <img src="img/ovb.png" width="60%" style="display: block; margin: auto;" /> --- ## Usual Solution? Randomized Control Trials (RCTs) - Naive regression ends up comparing people with different unobservables - If we randomly assign the treatment, then the treatment and control groups have similar unobservables (on average) --- ## But RCTs are not always feasible - Some things can't be A/B tested (e.g. TV ads) - Expense - Internal resistance - Ethics (randomly assign lead poisoning) --- class: inverse, middle ## Quasi-experiments can be helpful here --- class: middle ### Sometimes experiments pop up "in the wild" because of arbitrary decisions or events --- ## Quasi-experimental designs - Instrumental variables - Difference-in-differences - Synthetic control - Regression discontinuity --- ## Focus is on credibility of Research Design, less on methods - To what extent does this approximate an actual randomized experiment? - Estimation is often done with linear regression --- class: middle, inverse ## Instrumental variables --- ## Cool origin story - First appears in appendix of a book on animal and vegetable oil tariffs - Unclear if idea came from the author, economist Philip Wright, or his son, a geneticist - Text analysis shows the writing style most closely matched Philip's --- ## Intuition Problem: College attendance and wages are both related to unobserved characteristics An *instrument* is something that affects college attendance but NOT wages Roughly, we are using the instrument to "purge" the part of college attendance that is explained by unobservables --- ## What makes a good instrument? 1. Relevant: Has an effect on the intervention ( `\(Z\)` affects `\(D\)`) 2. Exclusion restriction: Only related to the outcome through its effect on intervention ( `\(Z\)` only affects `\(Y\)` through `\(D\)`) --- ## The exclusion restriction is untestable <img src="img/iv_dag.png" width="85%" style="display: block; margin: auto;" /> --- ## Finding clever instruments is an entire industry... - Gender of children as instrument for family size - Quarter-of-birth for years spent in school - Muslim holidays as for number of drivers on the road --- ## ...but *randomization* is a great instrument! - Oregon Medicaid Lottery - If you have A/B tests, you have instruments! --- ## Do referrals increase retention? - Problem: people who already refer more friends are probably more invested in the product - Solution: find an A/B test that successfully increased referrals (e.g. an email campaign) and you have a ready-made instrument --- ## Validity checks There are statistical tests for relevance of an instrument Exclusion restriction is usually justified based on institutional knowledge Check that there is balance across observables - e.g. if instrument is binary, then check that units exposed to instrument look similar to units not exposed to instrument --- class: inverse, middle ## Difference-in-differences --- ### If you use quasi-experimental designs, you will almost certainly use difference-in-differences --- ## Example - Say you want to know how an increase in the Ohio state minimum wage changed employment - If employment decreased after the minimum wage increased, does that mean the causal effect of the minimum wage was to decrease employment? --- ## Intuition - Use a control group to account for underlying time trends - E.g. find a state that had similar trends in employment to Ohio before the minimum wage increase happened --- ## Similar trends, different levels <img src="econ_toolkit_2019-06-19_files/figure-html/dd_plots-1.png" width="85%" style="display: block; margin: auto;" /> --- ## Control and Treated groups both increased <img src="econ_toolkit_2019-06-19_files/figure-html/unnamed-chunk-4-1.png" width="85%" style="display: block; margin: auto;" /> --- ## Take the difference before and after treatment for the treatment and contrl groups <img src="econ_toolkit_2019-06-19_files/figure-html/unnamed-chunk-5-1.png" width="85%" style="display: block; margin: auto;" /> --- ## Take the difference of those differences <img src="econ_toolkit_2019-06-19_files/figure-html/unnamed-chunk-6-1.png" width="85%" style="display: block; margin: auto;" /> --- ## Difference-in-differences relies on the parallel trend assumption Without the intervention, the treatment and control groups would have experienced the same time trends --- ## Parallel trend assumption <img src="econ_toolkit_2019-06-19_files/figure-html/unnamed-chunk-7-1.png" width="85%" style="display: block; margin: auto;" /> --- ## Example: Ineffectiveness of Paid search <img src="img/ebayad.png" width="60%" style="display: block; margin: auto;" /> --- ## Randomization + Difference-in-differences - Randomly selected cities to turn off paid search - Selected control cities with similar trends in sales - Estimated little to no effect of paid search on sales --- <img src="img/ebay_dd.png" width="75%" style="display: block; margin: auto;" /> --- ## Validity checks - Statistical and graphical checks that trends are similar before the intervention - Even if trends are similar, need to justify why *levels* are different - Contextual knowledge: knowing what else is changing at the same time as the intervention - Placebo test: an intervention on 6th graders should have no effect 8th graders --- class: inverse, middle ## Synthetic Control --- ## Synthetic Control is a close cousin of difference-in-differences Used in comparative case studies - how did an event affect aggregate outcomes of a single unit (e.g. country, state)? Example: What was the economic impact of reunification on West Germany? --- ## Synthetic control finds the weighted combination of countries that are most similar to Germany - Minimize the distance between West Germany and potential control countries based on a set of pre-unification variables (e.g. GDP per capita, trade openness, inflation rate) --- ## West Germany looks different from the rest of the OECD <img src="img/beforesynth.png" width="50%" style="display: block; margin: auto;" /> --- ## West Germany and Synthetic West Germany look similar pre-reunification <img src="img/aftersynth.png" width="50%" style="display: block; margin: auto;" /> --- ## Synthetic Control: How can you measure the impact of TV advertising? - Can't randomly assign TV ads to individual users - Instead, randomly assign treatment to cities --- ## Two issues with randomizing by geographic area 1. There are a limited number of cities and they are very different from each other 2. If you divide up the US into treatment and control, then you can't estimate the effect for *entire* US --- ## Create synthetic US for both treated *and* control cities .center2[ <img src="img/wayfair_syntheticcontrol.png" width="85%" style="display: block; margin: auto;" /> ] --- ## Regression discontinuity - Compare units just above or below an arbitrary/random cutoff - E.g. test scores to get a scholarship --- ## Causal effect is difference between points "close" to the threshold <img src="econ_toolkit_2019-06-19_files/figure-html/unnamed-chunk-12-1.png" width="85%" style="display: block; margin: auto;" /> --- ## Do online reviews affect consumer demand? Yelp rounds online ratings to the nearest half-star (3.24 -> 3, 3.25 -> 3.5) Compared difference in revenue between firms around the rounding thresholds --- ## 1-star increase leads to 5-9% increase in revenue <img src="img/luca_yelp.png" width="70%" style="display: block; margin: auto;" /> --- ## Regression discontinuity doesn't work if units around threshold are not randomly assigned - There could be manipulation around the threshold e.g. influential parents lobby to get their children extra points if they're just below the threshold --- ## Validity Checks 1. Look at distribution of units around the threshold 2. Look at distribution of observables around the threshold --- ## #Units above threshold `\(\approx\)` #Units below threshold <img src="econ_toolkit_2019-06-19_files/figure-html/unnamed-chunk-14-1.png" width="85%" style="display: block; margin: auto;" /> --- <img src="econ_toolkit_2019-06-19_files/figure-html/unnamed-chunk-15-1.png" width="85%" style="display: block; margin: auto;" /> --- ## Observables should not be "affected" by the discontinuity In scholarship example, if we observe a jump in parental income at the discontinuity, then something is wrong Placebo test works here: there should be no effect on an unrelated outcome --- ### Even if we are satisfied with our quasi-experiment (internally valid), how do we know it will accurately predict effects of a future action? --- ## Local average treatment effects Often, we don't get to randomize across the whole population E.g. in regression discontinuity, we're estimating the causal effect for people *around the threshold* and ignoring those far away from the threshold This may still be relevant! Interventions often occur at the margin --- ## Equilibrium effects Scaling up an intervention can have substantial effects on how people interact E.g. Retaining sellers on an online marketplace could affect revenue gains due to competition --- ## Causal inference things I haven't covered - Matching - Directed Acyclic Graphs (DAGs) - Machine learning + causal inference --- ### Resources .small[ #### Books - [Mostly Harmless Econometrics](https://www.amazon.com/Mostly-Harmless-Econometrics-Empiricists-Companion/dp/0691120358) or [Mastering 'Metrics (less technical)](https://www.amazon.com/Mastering-Metrics-Path-Cause-Effect-ebook/dp/B00MZG71MC) by Angrist and Pischke - [Causal Inference: The Mixtape (free)](http://scunning.com/cunningham_mixtape.pdf) #### Many ideas for this presentation were borrowed liberally from: - [What's the Science in Data Science? - Skipper Seabold](https://youtu.be/kTo16ieMCi8) - [Teconomics](https://medium.com/teconomics-blog/) - [Wayfair example on synthetic control](https://tech.wayfair.com/data-science/2017/08/using-geographic-splitting-optimization-techniques-to-measure-marketing-performance/) #### Structural models are sometimes proposed as a way of overcoming external validity challenge - [A light intro](https://towardsdatascience.com/the-holy-grail-of-econometrics-3a45a2295ce5) - [Big picture discussion of their role in research](http://noahpinionblog.blogspot.com/2017/05/how-should-theory-and-evidence-relate.html) ] --- class: inverse, middle # Questions <i class="fas fa-envelope"></i> weiyang.tham@gmail.com <i class="fab fa-twitter"></i> @wytham88 Website: wytham.rbind.io