A new, free software package developed by the Broad Institute of MIT and Harvard makes it easier and faster for scientists to capture the molecular signatures of cells in a particular state--information that's crucial to disease diagnosis and prognosis.
Almost all cells in an organism have the same genes, but those genes are expressed differently under different conditions such as tissue type and developmental stage. Technologies developed over the last 10 to 15 years have enabled scientists to measure the expression levels of all the genes in a cell during a single experiment, giving a molecular profile or signature of the cell.
The new software, dubbed GenePattern, allows researchers to better analyze--and share--the copious data resulting from these experiments. Among other things, it can be used to perform custom gene expression analysis experiments, record and replay analyses, and use tools from many different software sources within a single interface.
GenePattern addresses several hurdles facing biomedical research, particularly the need for interoperable tools that let researchers share methods and data with one another. Researchers can use GenePattern through a simple user interface or a powerful scripting language. Computational biologists can add a software module written in any language.
"GenePattern is a significant step forward," said Professor George M. Church of Harvard Medical School, the Harvard-MIT Division of Health Sciences and Technology, and MIT's Computational and Systems Biology Initiative. "It's much more flexible [than other programs of its type] and allows for a broad range of analyses. Additionally, it enables analyses to be performed in a way that they can be replicated. I can't overstate how important that is."
"GenePattern was designed to provide a powerful integrated environment that is unique among microarray analysis tools," said Michael Reich (S.B. 1989), group leader for cancer genomics informatics at the Broad Institute and software engineering manager for GenePattern.
GenePattern can run as a standalone application on a laptop, or it can take advantage of the greater power of a client-server installation. It is packaged with a core library of analytic and visualization modules. New modules are frequently released for download from the GenePattern web site.
"Researchers often have problems trying to reproduce work that others have published because the publishing process often leaves out necessary steps," said Pablo Tamayo, senior computational biologist and manager of cancer genomics informatics at the Broad Institute. "GenePattern allows users to create and share scripts that will reproduce an analysis at the level of detail that researchers require."
In the near future, the Broad scientists plan to integrate additional modules into the package that will extend the scope of GenePattern into sequence analysis, proteomics and metabolomics.
Other members of the GenePattern development team include Josh Gould, Charlotte Henson, Jim Lerner, Ted Liefeld, Stefano Monti, Keith Ohm, Ken Ross and Aravind Subramanian. The work was supported by the National Cancer Institute through a grant from the Biomedical Information Science and Technology Initiative of the National Institutes of Health.
GenePattern may be downloaded at http://www.broad.mit.edu/cancer/software/genepattern.
A version of this article appeared in MIT Tech Talk on April 14, 2004.