It’s been said that MIT adjunct professor of computer science Michael Stonebraker is a pioneer of modern database software.
Through a series of academic projects and startups, starting in the 1970s, he’s brought to market technology that still drives much of the commercial database management system (DBMS) software available today — such as that released by Microsoft, IBM and others.
Out of the six database-technology companies he’s co-founded since coming to MIT in 2001, perhaps most notable is Vertica Systems. It was one of a few companies that helped popularize the column-based DBMS — which could rapidly manage massive, fast-growing volumes of data — before its purchase by Hewlett-Packard (HP).
Like most of Stonebraker’s ventures, Vertica Systems started as research. The company was built around an academic project called C-Store, developed at MIT by Stonebraker and other researchers, that stored data vertically, in columns, rather than in successive rows like most DBMSs. Grouping similar data in columns cuts the time spent reading data from disk, a cost that adds up in the large-scale calculations typically run in the data warehouses used by large enterprises.
After earning a grant from MIT’s Deshpande Center for Technological Innovation, Stonebraker and Andrew Palmer co-founded Vertica and developed a working prototype. Soon commercialized as the Vertica Analytic Database, this DBMS demonstrated speeds up to two orders of magnitude faster than row-oriented DBMSs, Stonebraker says.
Against the backdrop of the “big-data craze” in the mid-2000s — when database management became a crucial tool for companies to make sense of rapidly growing quantities of data — the startup earned millions in venture capital, found a large customer base, and was sold to HP in 2011 for an undisclosed amount.
“In database management, the business plan is pretty straightforward: You find a data warehouse user and tell them, ‘If you’re currently getting a response time of an hour, we can give you a response time of a minute. Are you interested?’ Most everyone is,” Stonebraker says.
Other C-Store developers included researchers from MIT, Yale University, Brown University, Brandeis University and the University of Massachusetts at Boston.
Building ‘a better mousetrap’ for data
Database management is “a field where the rubber really hits the road,” says Stonebraker, who served as Vertica’s chief technology officer until its acquisition. Once a DBMS has proven it can meet two general criteria (reading data faster and processing it more efficiently than other systems), it’s probably commercially viable.
By flipping a row-oriented DBMS 90 degrees to create a column-oriented DBMS, Stonebraker and his team met both of those criteria, outperforming the row-based counterpart by up to two orders of magnitude. In that way, Stonebraker says, the team built “a better mousetrap” for data, meaning it captures data faster and processes it more rapidly.
The difference in the DBMSs comes down to the way they search for data, Stonebraker explains. For example, a row-oriented DBMS used for employee records may store the data in the following sequence: employee identification number, last name, first name, age, salary. A new record, with the same information for another employee, would start directly after the previous one.
This DBMS is effective in searching for individual employee records, but not for seeking one data element across all employees — such as finding the average age or salary. In that case, it would have to search each record looking for that data, using multiple disk operations to retrieve and examine the data.
In that same example, a column-oriented DBMS only looks at the age, which is grouped with all the other ages, and ignores the other columns. “Essentially, you read wildly less data,” Stonebraker says — and that speeds things up.
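The contrast can be sketched in a few lines of Python. This is only an illustration, not how C-Store or Vertica is implemented: the employee records are made up, the function names are hypothetical, and the count of values touched is a crude stand-in for the disk reads the article describes.

```python
from statistics import mean

# Field order follows the article's example record layout.
FIELDS = ("employee_id", "last_name", "first_name", "age", "salary")

# Made-up records, for illustration only.
rows = [
    (1, "Smith", "Ana", 36, 120000),
    (2, "Jones", "Ben", 41, 115000),
    (3, "Lee", "Cara", 46, 130000),
]

# Column-oriented layout: the same data, with all values of one field
# grouped together.
columns = {
    name: [record[i] for record in rows] for i, name in enumerate(FIELDS)
}


def average_age_row_store(records):
    """Average age from a row layout, counting every value passed over."""
    values_touched = 0
    ages = []
    for record in records:
        # The whole record is read in order to reach the single age field.
        values_touched += len(record)
        ages.append(record[FIELDS.index("age")])
    return mean(ages), values_touched


def average_age_column_store(cols):
    """Average age from a column layout, reading only the age column."""
    ages = cols["age"]
    return mean(ages), len(ages)


row_avg, row_touched = average_age_row_store(rows)
col_avg, col_touched = average_age_column_store(columns)
print(f"row store:    average age {row_avg}, values touched {row_touched}")
print(f"column store: average age {col_avg}, values touched {col_touched}")
```

Both layouts give the same average, but the row layout passes over every field of every record, while the column layout reads only the ages. With five fields per record the gap in this toy case is fivefold, and it widens as records gain more fields.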
But the DBMS can’t just be fast. It has to be “at least an order of magnitude faster,” Stonebraker says. “If it’s, say, two or three times faster, then potential users will be reluctant to switch away from their current solution.” When C-Store proved that it worked 50 times faster than row-oriented DBMSs, “it got everyone’s attention, including venture capitalists.”
To succeed commercially in database management, Stonebraker says, you have to both save enterprises time and make their processes easier. “Vertica was a case of simply having a better mousetrap in a well-established market that was growing in size and momentum,” he says.
Taking software to the next step
Stonebraker has spent decades building startups around academic research, starting with Ingres Corporation, Illustra and Cohera, which he co-founded while at the University of California at Berkeley, where he worked from the 1970s to the 1990s, before moving on to do the same thing at MIT.
But, he says, it can be challenging for academic researchers to find the wherewithal to build a commercial prototype: Government agencies and other organizations that typically fund academic projects cut the funding when it’s time to build commercial prototypes. So at MIT, when C-Store was about ready for commercialization, Stonebraker turned to the Deshpande Center, which provides funding for academic researchers to bring their innovations to market.
“The Deshpande Center was very helpful in providing funds to expedite the prototype, because you have to build working software to get to the next step,” Stonebraker says. “They serve a terrific need for entrepreneurs.”
The commercial version of C-Store, of course, evolved into something completely different from its academic prototype, as is common. But, Stonebraker says, the academic research is always the foundation. “You tell your engineers, ‘Start with this prototype and make it work better.’ It gives them something to build on. So from the lab to the market is more code evolution than code revolution,” he says.
Stonebraker’s five other MIT startups were also based on research. Two were acquired by other companies: Goby, a local search engine (which also received a Deshpande Center grant) based on the Morpheus project and bought by Telenav in 2011; and StreamBase, built around a DBMS that rapidly analyzes streaming data, which sold to TIBCO Software in June.
Three others, where Stonebraker serves as chief technology officer, are still independent: VoltDB, whose technology is based on the H-Store DBMS designed for online transaction processing; and Data Tamer and Paradigm4, two early-stage startups focused on solving big-data issues.
Although the companies are headquartered throughout Massachusetts, Stonebraker says Kendall Square is a prime location for tech startups. Primarily, he says, it’s because there’s a desirable — and affordable — talent pool at MIT and other nearby institutions.
He notes the many venture capitalists who have moved to Kendall Square to invest in startups — a sign of a budding entrepreneurial ecosystem. “The hub of the world [for tech startups] is right here,” he says. “I think MIT has helped make Kendall Square the center of entrepreneurial activity it is today.”