If you would like to produce good quantitative social-science research, try remembering these two words: “ceteris paribus.”
That’s Latin for “other things being equal.” And it’s a key principle when designing studies: Find two groups of people who, other things being equal, are distinguished by one key feature.
Consider health care. If you can find two otherwise equal groups of people who differ only in terms of health care coverage (one group has it, one doesn’t), then you may be able to identify a causal relationship at work: What difference does it make when people get health insurance?
Without such a research strategy, scholars can be left staring at a tangle of potential causes and effects. Suppose you have one group of people with health insurance, and one without — but the insured people are wealthier. Are they better off financially because they have insurance, or do they have insurance because they’re better off financially? It may be hard to know. You need groups of equal wealth to solve the causation conundrum.
“People are constantly looking at the world around them and trying to learn from it, and that’s natural,” MIT economist Joshua Angrist says. “But it turns out to be very difficult to sort out cause and effect, because the world is complicated, with many things happening at once.”
Angrist, the Ford Professor of Economics, has long been one of the leading advocates of research that uses “ceteris paribus” principles. Now, along with Jorn-Steffen Pischke of the London School of Economics, Angrist has written a book on the subject for a general audience, “Mastering ’Metrics: The Path from Cause to Effect,” published this month by Princeton University Press.
“Hopefully our book will find a place in the undergraduate curriculum,” Angrist says. “We also hope that many nonstudents — interested observers who like to think about data and the light it sheds on our world — will find it useful.”
The hunt for randomization
Much of Angrist’s work has been an attempt to replicate the clean structure of randomized controlled trials, as seen in research laboratories. “The best way to isolate cause and effect, and make sure you’ve only got one thing going on, is to do a randomized experiment,” Angrist notes.
But for logistical, monetary, or ethical reasons, social scientists conduct few long-term, large-scale experiments. As a substitute, they look for “natural experiments,” or “quasi-experiments,” where otherwise equal groups of people wind up in different circumstances because of policy changes, quirks of geography, or other such factors.
In their new book, Angrist and Pischke detail five methods of identifying causality in society (they call these methods the “furious five,” part of a Kung Fu motif in the text). The furious five of econometrics, they contend, are randomized trials; regression analysis; the use of “instrumental variables”; regression discontinuity designs; and the “differences in differences” approach.
Take randomization: Several years ago, the state of Oregon instituted a lottery system to fill out its allotment of slots for Medicaid treatment. This created “ceteris paribus” conditions: A pool of similar people applied for Medicaid, but only a random subset received it.
Thus researchers can study the difference Medicaid coverage makes. So far, it appears, Medicaid coverage leads people to get more medical care, and reduces the incidence of depression, but coverage has not produced changes in biomarkers such as blood pressure. Access to Medicaid, however, has reduced the financial burden on enrollees.
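To see why a lottery helps, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the function name `coverage_gap`, the standardized "wealth" and outcome scores, and the assumed true effect of 0.5. None of it is Oregon data or the researchers' actual procedure. It compares the gap between covered and uncovered people when wealthier people select into coverage with the gap when coverage is assigned by coin flip.

```python
import random


def coverage_gap(n, true_effect, lottery, seed=0):
    """Difference in mean outcomes between covered and uncovered people.

    Entirely synthetic: 'wealth' and the outcome are made-up standardized
    scores, not real survey or Medicaid data.
    """
    rng = random.Random(seed)
    covered, uncovered = [], []
    for _ in range(n):
        wealth = rng.gauss(0.0, 1.0)
        if lottery:
            # Coverage is assigned by a coin flip, independent of wealth.
            has_coverage = rng.random() < 0.5
        else:
            # Wealthier people are more likely to end up covered.
            has_coverage = wealth + rng.gauss(0.0, 1.0) > 0.0
        # The outcome depends on wealth AND on coverage.
        effect = true_effect if has_coverage else 0.0
        outcome = wealth + effect + rng.gauss(0.0, 1.0)
        (covered if has_coverage else uncovered).append(outcome)
    return sum(covered) / len(covered) - sum(uncovered) / len(uncovered)


TRUE_EFFECT = 0.5  # assumed for illustration only
self_selected_gap = coverage_gap(20_000, TRUE_EFFECT, lottery=False)
lottery_gap = coverage_gap(20_000, TRUE_EFFECT, lottery=True)

print(f"gap when people self-select: {self_selected_gap:.2f}")
print(f"gap under a lottery:         {lottery_gap:.2f}")
```

In this toy setup the self-selected gap comes out far larger than the assumed effect, because it mixes the effect of coverage with the effect of wealth. The lottery gap should land close to 0.5, since the coin flip makes the two groups alike in wealth on average.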
“The big problem in social science is that the relationship between ‘A’ and ‘B’ may be a misleading guide to the effect of ‘A’ on ‘B,’” Angrist says. “That isn’t a theorem that that’s always true. But a lot of what people believe about the world turns out to be incorrect or misleading.”
The rest of the “furious five”
As for the other methods Angrist and Pischke detail, a regression analysis charts the relationship between two phenomena, while another important econometrics tool, instrumental variables, amounts to a kind of imperfect randomized trial. Does growing up in a larger family make you a poorer adult? Using quasi-experimental variations in family size (including randomly occurring twin births, and the fact that family size relates to the gender composition of a set of children), researchers have found that, at least when it comes to completed education, family size has no effect.
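The instrumental-variables idea can be sketched with the Wald estimator, one simple form of the method for a yes-or-no instrument: divide the instrument's effect on the outcome by its effect on the treatment. The simulation below is a hedged illustration, not the studies' actual analysis. The variable names, the coefficients, the assumed true effect of -0.3, and the 50 percent instrument rate are all invented (real twin births are far rarer).

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def ols_slope(x, y):
    """Slope from a simple regression of y on x."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var


def wald_iv(z, d, y):
    """Wald estimator for a binary instrument z, treatment d, outcome y."""
    y1 = mean([yi for zi, yi in zip(z, y) if zi == 1])
    y0 = mean([yi for zi, yi in zip(z, y) if zi == 0])
    d1 = mean([di for zi, di in zip(z, d) if zi == 1])
    d0 = mean([di for zi, di in zip(z, d) if zi == 0])
    return (y1 - y0) / (d1 - d0)


rng = random.Random(1)
TRUE_EFFECT = -0.3  # assumed effect of family size, for illustration only

z, d, y = [], [], []
for _ in range(20_000):
    u = rng.gauss(0.0, 1.0)              # unobserved family background
    zi = 1 if rng.random() < 0.5 else 0  # stylized random instrument
    di = 2.0 + 1.0 * zi + 0.8 * u + rng.gauss(0.0, 0.5)          # family size
    yi = 12.0 + TRUE_EFFECT * di - 1.0 * u + rng.gauss(0.0, 1.0)  # schooling
    z.append(zi)
    d.append(di)
    y.append(yi)

ols_estimate = ols_slope(d, y)
iv_estimate = wald_iv(z, d, y)
print(f"simple regression slope: {ols_estimate:.2f}")
print(f"Wald IV estimate:        {iv_estimate:.2f}")
```

Because the unobserved background variable pushes family size up and schooling down, the simple regression slope overstates the negative effect in this toy data. The Wald estimate uses only the variation in family size driven by the random instrument, so it should land near the assumed -0.3.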
Regression discontinuity designs compare people who are narrowly on opposite sides of, say, a fateful policy cutoff. For example, students who took the entrance test but just missed being accepted to prestigious “exam schools” in New York and Boston did about as well over the long run as students who barely made it into the exam schools. Here the “ceteris paribus” principle stems from the basic similarity of both groups’ performance on the entrance exams.
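A rough sketch of the regression discontinuity logic, under loudly stated assumptions: the scores, the cutoff of 70, the bandwidth of 10 points, and the outcome formula are all invented, and the true jump at the cutoff is set to zero by construction. The sketch fits a straight line on each side of the cutoff, using only applicants near it, and compares the two fitted values at the cutoff. This is one common way to implement the design, not necessarily the one used in the exam-school research.

```python
import random


def fit_line(points):
    """Least-squares intercept and slope for a list of (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return my - slope * mx, slope


def rd_estimate(data, cutoff, bandwidth):
    """Jump in the fitted outcome at the cutoff, using nearby scores only."""
    # Center scores on the cutoff so each intercept is the fit AT the cutoff.
    left = [(s - cutoff, y) for s, y in data if cutoff - bandwidth <= s < cutoff]
    right = [(s - cutoff, y) for s, y in data if cutoff <= s <= cutoff + bandwidth]
    left_at_cutoff, _ = fit_line(left)
    right_at_cutoff, _ = fit_line(right)
    return right_at_cutoff - left_at_cutoff


rng = random.Random(2)
CUTOFF = 70.0    # hypothetical admission threshold
TRUE_JUMP = 0.0  # assumed: admission itself changes nothing

data = []
for _ in range(40_000):
    score = rng.uniform(0.0, 100.0)
    admitted = score >= CUTOFF
    # Later outcomes rise with the entrance score regardless of admission.
    outcome = 0.05 * score + (TRUE_JUMP if admitted else 0.0) + rng.gauss(0.0, 1.0)
    data.append((score, outcome))

admitted_mean = sum(y for s, y in data if s >= CUTOFF) / sum(1 for s, _ in data if s >= CUTOFF)
rejected_mean = sum(y for s, y in data if s < CUTOFF) / sum(1 for s, _ in data if s < CUTOFF)
naive_gap = admitted_mean - rejected_mean
rd_gap = rd_estimate(data, CUTOFF, bandwidth=10.0)

print(f"all admitted vs. all rejected: {naive_gap:.2f}")
print(f"jump at the cutoff:            {rd_gap:.2f}")
```

Comparing everyone admitted with everyone rejected shows a sizable gap here, but that gap reflects the difference in entrance scores, not admission. Near the cutoff the two groups are almost identical, so the estimated jump should be close to the assumed value of zero.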
Finally, the differences-in-differences approach compares how outcomes change over time in different groups, one affected by a policy and one not. Do bank rescues help economies? During the Great Depression, the Atlanta-based district of the U.S. Federal Reserve instituted a policy of lending to troubled banks, while the Fed’s St. Louis-based district did not. These districts shared a border that split Mississippi, creating a natural experiment, since other policy conditions in the state were equal. Ultimately the Atlanta Fed’s bank-saving efforts, dating to 1930, improved its district’s economic trajectory, while the St. Louis Fed’s district saw no such change.
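The arithmetic behind a two-group, two-period comparison of this kind is short enough to sketch. The function name `diff_in_diff` and the index values below are hypothetical, chosen only to show the calculation. They are not historical figures for either Federal Reserve district.

```python
def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Change in the treated group minus change in the comparison group."""
    treated_change = treated_after - treated_before
    control_change = control_after - control_before
    return treated_change - control_change


# Hypothetical activity index (baseline = 100), NOT historical data.
estimate = diff_in_diff(
    treated_before=100.0,  # district with the lending policy, before
    treated_after=95.0,    # same district, after
    control_before=100.0,  # district without the policy, before
    control_after=80.0,    # same district, after
)
print(estimate)  # 15.0
```

In this made-up case both districts decline, but the one with the policy declines by 15 points less. Reading that difference as the policy's effect rests on the assumption that the two districts would have followed the same trend had neither adopted the policy.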
Fun and failure
Angrist and Pischke are not alone in advocating these methods, which have gained popularity thanks to many scholars including the economists Orley Ashenfelter, David Card, Lawrence Katz, and Alan Krueger. Moreover, Angrist emphasizes, the best work of this type combines methodological sophistication and hard-earned knowledge of particular subjects.
“A PhD student who wants to be successful in our profession as a research economist has to master one of our fields, like labor economics, health economics, or development economics,” says Angrist, himself a labor economist. On the other hand, less specialized readers “should be able to understand our book, developing, we hope, a sense of how to think clearly about data and statistical relationships.”
Even sophisticated research may not produce airtight results. The book discusses a published study Angrist and a colleague conducted on the relationship between compulsory schooling and earnings. Over time, other researchers concluded their findings stemmed from varying regional economic trends in the U.S. states being sampled, not a strict causal link between the two phenomena.
“I think that’s a great lesson for graduate students and undergraduates,” Angrist reflects. “There’s more failure than success in empirical work.”
Such struggles aside, he concludes, “We want the book to be fun. If it makes readers smile, that will be a big accomplishment. If we get them to think clearly about statistics and causal relationships, then it will be a success.”