Researchers at the Broad Institute of MIT and Harvard and colleagues have published new papers this week that bring scientists closer to their ultimate goal -- to grasp the core mechanisms of human biology and disease -- by developing a comprehensive catalog of the genetic diversity in the human genome sequence across human populations.
The first step towards grasping those core mechanisms was realized in 2001, with the completion of the human genome sequence.
The new Broad papers describe both the content and uses of a comprehensive genomic catalog, known as HapMap, that maps common human DNA sequence variations, enabling systematic testing of genetic variants for their association with disease and their place in human evolutionary history.
HapMap gets its name from "haplotypes" -- collectively inherited groups of human genetic variants that are located physically close to one another in the genome.
"Built upon the foundation laid by the human genome sequence, the HapMap is a powerful new tool for exploring the root causes of common diseases. We absolutely require such a resource so that we can develop new and much-needed approaches to understand these diseases, such as diabetes, bipolar disorder, [and] cancer," said David Altshuler, director of the Broad's program in medical and population genetics and an associate professor at Massachusetts General Hospital and Harvard Medical School. Altshuler and Peter Donnelly, of the University of Oxford in England, are two of the authors of an Oct. 27 paper in Nature.
Diseases run in families, and perhaps half the risk of any given common disease can be explained by genetic differences inherited from one's parents. Inheritance also plays a role in the different responses people can have to a drug or to an environmental factor. A "map" to discern the range of genetic contributions to common diseases and responses to therapies was proposed 10 years ago. With HapMap, technology has caught up to biomedical research needs.
"The data from the HapMap project allows scientists to select the particular DNA variants that provide the greatest information in the most efficient manner, lowering the costs and increasing the power of genetic research," said Mark Daly, assistant professor at Massachusetts General Hospital, and an associate member of the Broad Institute. Daly led the Boston team's statistical and analytical work.
HapMap not only builds on the 2001 completion of the human genome sequence, it also advances the massive effort to characterize and catalog the millions of individual DNA base variations (single nucleotide polymorphisms or SNPs) across the genome in the human population. Based on the initial SNP and sequence data, the haplotype structure of the human genome was recognized as early as 2001. Broad Institute scientists led or contributed significantly to all of these efforts.
The HapMap project has also spurred remarkable advances in the technology for testing genetic variations in DNA, making it possible to undertake comprehensive studies in large numbers of patient samples at a lower cost. Stacey Gabriel, director of the Broad Institute's genetic analysis platform, noted, "Several years ago, determining the genotype of a single SNP in a patient cost nearly a dollar, and we could do hundreds a day. Today, the prices have dropped in many cases to a fraction of a penny per genotype, and we can do millions a day. This is the difference between not being able to do the studies, and getting them done rapidly and well."
The availability of rich "real world" data in HapMap has also led to the realization that previous computer models of human genetics are simply too limited, and can even lead to false conclusions about the role of genes or genetic loci in different diseases.
In a paper to be published in the November issue of Genome Research, Stephen Schaffner, Altshuler and their colleagues at the Broad Institute used HapMap's rich, real-world data not only to demonstrate the limitations of prior computer genetic models, but also to provide updated models, available to the entire scientific community, that more closely approximate the reality of human genetic variation.
Although much of the interest in HapMap focuses on disease genetics, its data are equally powerful in uncovering potential sites of natural selection in the human genome. Pardis Sabeti, Eric Lander and colleagues at the Broad Institute, together with Stephen O'Brien and his colleagues at the National Cancer Institute, used the HapMap to re-examine earlier work on natural selection on CCR5-Δ32, a genetic variation in a T-cell receptor that confers strong resistance to infection by HIV and that has been implicated in resistance to the bubonic plague.
"With the benefit of greater genotyping and empirical comparisons from the HapMap, we were able to show that the pattern of genetic variation seen at CCR5-Δ32 does not stand out as exceptional relative to other loci across the genome and is consistent with neutral evolution," said Sabeti, a student at Harvard Medical School and a postdoctoral fellow at the Broad Institute. "In fact, the CCR5-Δ32 allele is likely to have arisen more than 5,000 years ago, rather than during the last 1,000 years as was previously thought."
Their findings, reported in the November issue of PLoS Biology, show that the HapMap also gives scientists unprecedented ability to identify novel candidates for natural selection.
HapMap data are freely available in several public databases, including the HapMap Data Coordination Center, the NIH-funded National Center for Biotechnology Information's dbSNP and the JSNP Database in Japan.
A version of this article appeared in MIT Tech Talk on November 2, 2005.