Deep learning is advancing at lightning speed, and Alexander Amini ’17 and Ava Soleimany ’16 want to make sure they have your attention as they dive deep on the math behind the algorithms and the ways that deep learning is transforming daily life.
Last year, their blockbuster course, 6.S191 (Introduction to Deep Learning), opened with a fake video welcome from former President Barack Obama. This year, the pair delivered their lectures “live” from the Stata Center — after taping them weeks in advance from their kitchen, outfitted for the occasion with studio lights, a podium, and a green screen for projecting the Kirsch Auditorium blackboard onto their Zoom backgrounds.
“It’s hard for students to stay engaged when they’re looking at a static image of an instructor,” says Amini. “We wanted to recreate the dynamic of a real classroom.”
Amini is a graduate student in MIT’s Department of Electrical Engineering and Computer Science (EECS), and Soleimany a graduate student at MIT and Harvard University. They co-developed 6.S191’s curriculum and have taught it during MIT’s Independent Activities Period (IAP) for four of the last five years. Their lectures and software labs are updated each year, but this year’s pandemic edition posed a special challenge. They responded with a mix of low- and high-tech solutions, from filming the lectures in advance to holding help sessions on a Minecraft-like platform that mimics the feel of socializing in person.
Some students realized the lectures weren’t live after noticing clues like the abrupt wardrobe change as the instructors shifted from lecture mode to the help session immediately after class. Those who caught on congratulated the pair in their course evaluations. Those who didn’t reacted with amazement. “You mean they weren’t livestreamed?” asked PhD student Nada Tarkhan, after a long pause. “It absolutely felt like one instructor was giving the lecture, while the other was answering questions in the chat box.”
The growing popularity of 6.S191 — both as a for-credit class at MIT, and a self-paced course online — mirrors the rise of deep neural networks for everything from language translation to facial recognition. In a series of clear and engaging lectures, Amini and Soleimany cover the technical foundations of deep nets, and how the algorithms pick out patterns in reams of data to make predictions. They also explore deep learning’s myriad applications, and how students can evaluate a model’s predictions for accuracy and bias.
Responding to student feedback, Amini and Soleimany this year extended the course from one week to two, giving students more time to absorb the material and put together final projects. They also added two new lectures: one on uncertainty estimation, the other on algorithmic bias and fairness. By moving the class online, they were also able to admit an extra 200 students who would have been turned away by Kirsch Auditorium’s 350-seat limit.
To make it easier for students to connect with teaching assistants and each other, Amini and Soleimany introduced Gather.Town, a platform they discovered at a machine learning conference this past fall. Students moved their avatars around the virtual 6.S191 auditorium to ask homework questions, find collaborators, and troubleshoot problems tied to their final projects.
Students gave the course high marks for its breadth and organization. “I knew the buzzwords like reinforcement learning and RNNs, but I never really grasped the details, like creating parameters in TensorFlow and setting activation functions,” says sophomore Claire Dong. “I came out of the class clearer and more energized about the field.”
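For readers curious about what those details look like in practice, here is a minimal, hypothetical TensorFlow sketch (not drawn from the course's labs) of a layer that creates its own trainable parameters and applies a chosen activation function:

```python
import tensorflow as tf

# A small dense layer built "from scratch": its weight matrix and bias are
# trainable parameters, and the activation function is chosen explicitly.
class MyDenseLayer(tf.keras.layers.Layer):
    def __init__(self, units, activation=tf.nn.relu):
        super().__init__()
        self.units = units
        self.activation = activation

    def build(self, input_shape):
        # Create the parameters once the input dimension is known.
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer="random_normal", trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer="zeros", trainable=True)

    def call(self, inputs):
        # Affine transform followed by the chosen nonlinearity.
        return self.activation(tf.matmul(inputs, self.w) + self.b)

# Example: three outputs with a sigmoid activation on a batch of two inputs.
layer = MyDenseLayer(3, activation=tf.nn.sigmoid)
print(layer(tf.random.normal((2, 5))).shape)  # (2, 3)
```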
This year, 50 teams presented final projects, twice as many as the year before, and they covered an even broader range of applications, say Amini and Soleimany, from trading cryptocurrencies to predicting forest fires to simulating protein folding in a cell.
“The extra week really helped them craft their idea, create some of it, code it up, and put together the pieces into a presentation,” says Amini.
“They were just brilliant,” adds Soleimany. “The quality and organization of their ideas, the talks.”
Four projects were picked for prizes.
The first was a proposal for classifying brain signals to distinguish right-hand movements from left. Before transferring to MIT from Miami-Dade Community College, Nelson Hidalgo had worked on brain-computer interfaces to help people with paralysis regain control of their limbs. For his final project, Hidalgo, a sophomore in EECS, used EEG brain-wave recordings to build a model for sorting the signals of an attempted right-hand movement from those of a left-hand one.
His neural network architecture featured a combined convolutional and recurrent neural net working in parallel to extract sequential and spatial patterns in the data. The result was a model that improved on other methods for predicting the brain’s intention to move either hand, he says. “A more accurate classifier could really make this technology accessible to patients on a daily basis.”
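Hidalgo's exact architecture isn't spelled out here, but a rough Keras sketch of the general idea (a convolutional branch and a recurrent branch reading the same EEG window in parallel, with hypothetical channel counts and window lengths) might look like this:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Hypothetical input: 64 EEG channels sampled over 256 time steps.
eeg = layers.Input(shape=(256, 64))

# Convolutional branch: local patterns across time and channels.
conv = layers.Conv1D(32, kernel_size=7, activation="relu")(eeg)
conv = layers.MaxPooling1D(4)(conv)
conv = layers.Conv1D(64, kernel_size=5, activation="relu")(conv)
conv = layers.GlobalAveragePooling1D()(conv)

# Recurrent branch: how the signal evolves over the whole window.
rnn = layers.LSTM(64)(eeg)

# Merge the two views and classify: attempted left- vs. right-hand movement.
merged = layers.concatenate([conv, rnn])
out = layers.Dense(1, activation="sigmoid")(merged)

model = tf.keras.Model(eeg, out)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```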
A second project explored the potential of AI-based forestry. Tree planting has become a popular way for companies to offset their carbon emissions, but tracking how much carbon dioxide those trees actually absorb is still an inexact science. Peter McHale, a master’s student at the MIT Sloan School of Management, proposed that his recently launched startup, Gaia AI, could fly drones over forests to capture detailed images of the canopy from above and below.
Those high-resolution pictures could help forest managers better estimate tree growth, he says, and calculate how much carbon they’ve soaked up from the air. The footage could also provide clues about what kinds of trees grow best in certain climates and conditions. “Drones can take measurements more cheaply and accurately than humans can,” he says.
Under Gaia AI’s first phase of development, McHale says he plans to focus on selling high-quality, drone-gathered sensor data to timber companies in need of cheaper, more accurate surveying methods, as well as to companies providing third-party validation for carbon offsets. In phase two, McHale envisions putting those data, and the profits they generate, toward attacking climate change through drone-based tree planting.
A third project explored the state of the art for encoding intelligent behavior into robots. As a SuperUROP researcher in Professor Sangbae Kim’s lab, Savva Morozov works with the mini cheetah robot and is interested in figuring out how robots like it might learn how to learn.
For his project, Morozov, a junior in the Department of Aeronautics and Astronautics, presented a scenario: a mini cheetah-like robot is struggling to scale a pile of rubble. It spots a wooden plank that it could pick up with its robotic arm and turn into a ramp, but it has neither the imagination nor the repertoire of skills to build such a tool and reach the summit. Morozov explained how different learning-to-learn methods could help solve the problem.
A fourth project proposed the use of deep learning to make it easier to analyze street-view images of buildings to model an entire city’s energy consumption. An algorithm developed by MIT’s Sustainable Design Lab and graduate student Jakub Szczesniak estimates the window-to-wall ratio for a building based on details captured in the photo, but processing the image requires a lot of tedious work at the front end.
Nada Tarkhan, a PhD student in the School of Architecture and Planning, proposed adding an image-processing convolutional neural net to the workflow to make the analysis faster and more reliable. “We hope it can help us gather more accurate data to understand building features in our cities — the façade characteristics, materials, and wall-to-window ratios,” she says. “The ultimate goal is to improve our understanding of how buildings perform citywide.”
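The network itself isn't described in the article, but as a hypothetical illustration, a small convolutional model that maps a facade photo to a single ratio between 0 and 1 could be sketched in Keras like this:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Hypothetical setup: a street-view facade photo in, one number out
# (the predicted window-to-wall ratio, constrained to the range 0-1).
model = tf.keras.Sequential([
    layers.Input(shape=(224, 224, 3)),
    layers.Rescaling(1.0 / 255),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(128, 3, activation="relu"),
    layers.GlobalAveragePooling2D(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # keeps the ratio in [0, 1]
])
model.compile(optimizer="adam", loss="mse", metrics=["mae"])
```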
Based on student feedback, Amini and Soleimany say they plan to keep the added focus on uncertainty and bias while pushing the course into new areas. “We love hearing that students were inspired to take further AI/ML classes after taking 6.S191,” says Soleimany. “We hope to continue innovating to keep the course relevant.”
Funding for the class was provided by Ernst & Young, Google, the MIT-IBM Watson AI Lab, and NVIDIA.