X-ray vision has long seemed like a far-fetched sci-fi fantasy, but over the last decade a team led by Professor Dina Katabi from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) has continually gotten us closer to seeing through walls.
Their latest project, “RF-Pose,” uses artificial intelligence (AI) to teach wireless devices to sense people’s postures and movement, even from the other side of a wall.
The researchers use a neural network to analyze radio signals that bounce off people’s bodies. The system then generates a dynamic stick figure that walks, stops, sits, and moves its limbs as the person performs those actions.
The team says that RF-Pose could be used to monitor diseases like Parkinson’s, multiple sclerosis (MS), and muscular dystrophy, providing a better understanding of disease progression and allowing doctors to adjust medications accordingly. It could also help elderly people live more independently, while providing the added security of monitoring for falls, injuries and changes in activity patterns. The team is currently working with doctors to explore RF-Pose’s applications in health care.
All data was collected with the subjects’ consent and is anonymized and encrypted to protect user privacy. For future real-world applications, the team plans to implement a “consent mechanism” in which the person who installs the device is cued to do a specific set of movements before it begins monitoring the environment.
“We’ve seen that monitoring patients’ walking speed and ability to do basic activities on their own gives health care providers a window into their lives that they didn’t have before, which could be meaningful for a whole range of diseases,” says Katabi, who co-wrote a new paper about the project. “A key advantage of our approach is that patients do not have to wear sensors or remember to charge their devices.”
Besides health care, the team says that RF-Pose could also be used for new classes of video games where players move around the house, or even in search-and-rescue missions to help locate survivors.
Katabi co-wrote the new paper with PhD student and lead author Mingmin Zhao, MIT Professor Antonio Torralba, postdoc Mohammad Abu Alsheikh, graduate student Tianhong Li, and PhD students Yonglong Tian and Hang Zhao. They will present it later this month at the Conference on Computer Vision and Pattern Recognition (CVPR) in Salt Lake City, Utah.
One challenge the researchers had to address is that most neural networks are trained using data labeled by hand. A neural network trained to identify cats, for example, requires that people look at a big dataset of images and label each one as either “cat” or “not cat.” Radio signals, meanwhile, can’t be easily labeled by humans.
To address this, the researchers collected examples using both their wireless device and a camera. They gathered thousands of images of people doing activities like walking, talking, sitting, opening doors and waiting for elevators.
They then used these images from the camera to extract the stick figures, which they showed to the neural network along with the corresponding radio signal. This combination of examples enabled the system to learn the association between the radio signal and the stick figures of the people in the scene.
Post-training, RF-Pose was able to estimate a person’s posture and movements without cameras, using only the wireless reflections that bounce off people’s bodies.
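The training idea described above, in which a camera-based "teacher" supplies the labels and a radio-based "student" learns from them, can be illustrated with a toy sketch. This is not the team's model: the data is synthetic, the student is a plain linear map rather than a neural network, and every name here (`make_example`, `teacher_keypoints`, `student_predict`, `train_student`, `MIXING`) is a hypothetical stand-in chosen for illustration.

```python
import random

random.seed(0)  # make the toy run repeatable

# Assumption: a fixed linear mapping from a hidden 2-D keypoint to 3 radio features.
MIXING = [(0.8, 0.3), (-0.2, 0.9), (0.5, -0.4)]


def make_example():
    """Simulate one moment seen by both sensors (all values are synthetic)."""
    pose = [random.uniform(-1.0, 1.0), random.uniform(-1.0, 1.0)]
    camera_frame = {"pose": pose}  # stand-in for an image of the person
    radio = [a * pose[0] + b * pose[1] + random.gauss(0.0, 0.01) for a, b in MIXING]
    return camera_frame, radio, pose


def teacher_keypoints(camera_frame):
    """Hypothetical vision 'teacher': turns a camera frame into keypoints."""
    return list(camera_frame["pose"])


def student_predict(weights, radio):
    """Student: predicts keypoints from the radio features alone."""
    return [sum(w * x for w, x in zip(row, radio)) for row in weights]


def train_student(pairs, epochs=50, lr=0.1):
    """Fit the student to the teacher's labels with stochastic gradient descent."""
    weights = [[0.0, 0.0, 0.0] for _ in range(2)]
    for _ in range(epochs):
        for camera_frame, radio in pairs:
            target = teacher_keypoints(camera_frame)  # label comes from the camera
            pred = student_predict(weights, radio)
            for k in range(2):
                err = pred[k] - target[k]
                for j in range(3):
                    weights[k][j] -= lr * err * radio[j]
    return weights


# Training uses synchronized camera and radio examples.
examples = [make_example() for _ in range(200)]
training_pairs = [(frame, radio) for frame, radio, _ in examples]
weights = train_student(training_pairs)

# After training, only the radio signal is used; the camera is out of the loop.
held_out = [make_example() for _ in range(50)]
errors = []
for _, radio, true_pose in held_out:
    pred = student_predict(weights, radio)
    errors.extend(abs(p - t) for p, t in zip(pred, true_pose))
mean_abs_error = sum(errors) / len(errors)
print("mean absolute keypoint error:", round(mean_abs_error, 4))
```

The point of the sketch is the division of labor: no human ever labels a radio signal, because the camera pipeline produces the targets, and at inference time the student needs only the radio input.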
Since cameras can’t see through walls, the network was never explicitly trained on data from the other side of a wall. That made it particularly surprising to the MIT team that the network could generalize its knowledge to handle through-wall movement.
“If you think of the computer vision system as the teacher, this is a truly fascinating example of the student outperforming the teacher,” says Torralba.
Besides sensing movement, the authors also showed that they could use wireless signals to correctly identify a person out of a line-up of 100 individuals 83 percent of the time. This ability could be particularly useful in search-and-rescue operations, where it may be helpful to know the identity of specific people.
For this paper, the model outputs a 2-D stick figure, but the team is also working to create 3-D representations that would be able to reflect even smaller micromovements. For example, it might be able to see if an older person’s hands are shaking regularly enough that they may want to get a check-up.
“By using this combination of visual data and AI to see through walls, we can enable better scene understanding and smarter environments to live safer, more productive lives,” says Zhao.