Humans have a natural tendency to find order in sets of information, a skill that has proven difficult to replicate in computers. Faced with a large set of data, computers don't know where to begin -- unless they're programmed to look for a specific structure, such as a hierarchy, linear order, or a set of clusters.
Now, in an advance that may impact the field of artificial intelligence, a new model developed at MIT can help computers recognize patterns the same way that humans do. The model, reported earlier this month in the Proceedings of the National Academy of Sciences, can analyze a set of data and figure out which type of organizational structure best fits it.
"Instead of looking for a particular kind of structure, we came up with a broader algorithm that is able to look for all of these structures and weigh them against each other," said Josh Tenenbaum, an associate professor of brain and cognitive sciences at MIT and senior author of the paper.
The model could help scientists in many fields analyze large amounts of data, and could also shed light on how the human brain discovers patterns.
The computer algorithm was developed by recent MIT PhD recipient Charles Kemp, now an assistant professor of psychology at Carnegie Mellon University, along with Tenenbaum.
The model considers a range of possible data structures, including trees, linear orders, rings, dominance hierarchies, and clusters. It finds the best-fitting structure of each type for a given data set and then picks the type of structure that best represents the data.
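The two-stage idea (fit the best structure of each type, then compare the types) can be illustrated with a toy sketch in Python. This is not Kemp and Tenenbaum's algorithm: the article does not say how the model scores a fit, so the sketch swaps in a brute-force least-squares comparison and considers only two forms, a linear order and a ring. The names `fit_form` and `select_form`, the dissimilarity-matrix input, and the squared-error score are all assumptions made for illustration.

```python
from itertools import permutations

# Toy illustration only: two candidate forms, each defined by how far apart
# two slots are when n items are laid out in that form.
def chain_distance(i, j, n):
    """Steps between slots i and j along a straight line."""
    return abs(i - j)

def ring_distance(i, j, n):
    """Steps between slots i and j around a closed loop of n slots."""
    d = abs(i - j)
    return min(d, n - d)

FORMS = {"chain": chain_distance, "ring": ring_distance}

def fit_form(dissim, graph_distance):
    """Find the best arrangement of the items for one form.

    Tries every ordering (feasible only for a handful of items) and returns
    (lowest squared error, ordering that achieves it).
    """
    n = len(dissim)
    best_error, best_order = float("inf"), None
    for order in permutations(range(n)):
        # position[item] is the slot that item occupies in this ordering
        position = {item: slot for slot, item in enumerate(order)}
        error = sum(
            (dissim[a][b] - graph_distance(position[a], position[b], n)) ** 2
            for a in range(n)
            for b in range(a + 1, n)
        )
        if error < best_error:
            best_error, best_order = error, order
    return best_error, best_order

def select_form(dissim):
    """Fit every form, then pick the form whose best fit has the lowest error."""
    fits = {name: fit_form(dissim, dist) for name, dist in FORMS.items()}
    best_name = min(fits, key=lambda name: fits[name][0])
    return best_name, fits

# Made-up data: six items whose dissimilarities follow a ring.
ring_data = [[ring_distance(a, b, 6) for b in range(6)] for a in range(6)]
best, fits = select_form(ring_data)
print(best)
```

Because both toy forms are equally simple, comparing raw error is enough here. A comparison across forms of differing complexity, such as a tree versus a flat set of clusters, would also need some penalty for complexity, which this sketch leaves out.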
Humans perform the same feat in everyday life, often unconsciously. Several scientific milestones have resulted from the human skill of finding patterns in data -- for example, the development of the periodic table of the chemical elements or the organization of biological species into a tree-structured system of classification.
Children exhibit this data-organization skill at a young age, when they learn that social networks can be organized into cliques, and that words can fit into overlapping categories (for example, dog, mammal, animal).
"We think of children as taking in data, forming theories, and testing those theories with experiments. They're like little scientists," Tenenbaum said. "Until now there's been no good computational model for how children can, like scientists, grasp the underlying global structure of a set of data."
The research was funded by the James S. McDonnell Foundation Causal Learning Research Collaborative, the Air Force Office of Scientific Research, and the NTT Communication Sciences Laboratory.