Robohub.org
 

Q&A: Vivienne Sze on crossing the hardware-software divide for efficient artificial intelligence


02 May 2021




Associate professor Vivienne Sze is bringing artificial intelligence applications to smartphones and tiny robots by co-designing energy-efficient hardware and software. Image credits: Lillie Paquette, MIT School of Engineering

Not so long ago, watching a movie on a smartphone seemed impossible. Vivienne Sze was a graduate student at MIT at the time, in the mid-2000s, and she was drawn to the challenge of compressing video to keep image quality high without draining the phone’s battery. The solution she hit upon called for co-designing energy-efficient circuits with energy-efficient algorithms.

Sze would go on to be part of the team that won an Engineering Emmy Award for developing the video compression standards still in use today. Now an associate professor in MIT’s Department of Electrical Engineering and Computer Science, Sze has set her sights on a new milestone: bringing artificial intelligence applications to smartphones and tiny robots.

Her research focuses on designing more-efficient deep neural networks to process video, and more-efficient hardware to run those applications. She recently co-published a book on the topic, and will teach a professional education course on how to design efficient deep learning systems in June.

On April 29, Sze will join Assistant Professor Song Han for an MIT Quest AI Roundtable on the co-design of efficient hardware and software moderated by Aude Oliva, director of MIT Quest Corporate and the MIT director of the MIT-IBM Watson AI Lab. Here, Sze discusses her recent work.

Q: Why do we need low-power AI now?

A: AI applications are moving to smartphones, tiny robots, and internet-connected appliances and other devices with limited power and processing capabilities. The challenge is that AI has high computing requirements. Analyzing sensor and camera data from a self-driving car can consume about 2,500 watts, but the computing budget of a smartphone is just about a single watt. Closing this gap requires rethinking the entire stack, a trend that will define the next decade of AI.

Q: What’s the big deal about running AI on a smartphone?

A: It means that the data processing no longer has to take place in the “cloud,” on racks of warehouse servers. Untethering compute from the cloud allows us to broaden AI’s reach. It gives people in developing countries with limited communication infrastructure access to AI. It also speeds up response time by reducing the lag caused by communicating with distant servers. This is crucial for interactive applications like autonomous navigation and augmented reality, which need to respond instantaneously to changing conditions. Processing data on the device can also protect medical and other sensitive records. Data can be processed right where they’re collected.

Q: What makes modern AI so inefficient?

A: The cornerstone of modern AI, deep neural networks, can require hundreds of millions to billions of calculations, orders of magnitude more than compressing video on a smartphone. But it’s not just number crunching that makes deep networks energy-intensive; it’s also the cost of shuffling data to and from memory to perform these computations. The farther the data have to travel, and the more data there are, the greater the bottleneck.

Q: How are you redesigning AI hardware for greater energy efficiency?

A: We focus on reducing data movement and the amount of data needed for computation. In some deep networks, the same data are used multiple times for different computations. We design specialized hardware to reuse data locally rather than send them off-chip. Storing reused data on-chip makes the process extremely energy-efficient.  

We also optimize the order in which data are processed to maximize their reuse. That’s the key property of the Eyeriss chip that was developed in collaboration with Joel Emer. In our follow-up work, Eyeriss v2, we made the chip flexible enough to reuse data across a wider range of deep networks. The Eyeriss chip also uses compression to reduce data movement, a common tactic among AI chips. The low-power Navion chip that was developed in collaboration with Sertac Karaman for mapping and navigation applications in robotics uses two to three orders of magnitude less energy than a CPU, in part by using optimizations that reduce the amount of data processed and stored on-chip.
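To make the data-reuse idea concrete, here is a minimal sketch that counts memory reads for a small one-dimensional convolution, once with every operand fetched from off-chip memory and once with each value fetched once and then reused from a local buffer. It is not a model of Eyeriss or any real chip: the function names and the relative energy costs are hypothetical placeholders chosen only to illustrate why off-chip traffic dominates.

```python
# Illustrative sketch only, not a model of any real accelerator.
# The relative costs below are assumed placeholders, not measured values.
OFF_CHIP_COST = 200.0  # assumed relative energy per off-chip read
ON_CHIP_COST = 1.0     # assumed relative energy per on-chip (local) read


def conv1d_access_counts(num_inputs, num_weights, reuse_on_chip):
    """Return (off_chip_reads, on_chip_reads) for a 'valid' 1-D convolution."""
    num_outputs = num_inputs - num_weights + 1
    # Each multiply-accumulate (MAC) reads one input and one weight.
    macs = num_outputs * num_weights
    if not reuse_on_chip:
        # Every operand of every MAC comes from off-chip memory.
        return 2 * macs, 0
    # Each input and weight is fetched from off-chip once,
    # then all MAC operands are served from the local buffer.
    return num_inputs + num_weights, 2 * macs


def relative_energy(off_chip_reads, on_chip_reads):
    """Total relative read energy under the assumed costs above."""
    return off_chip_reads * OFF_CHIP_COST + on_chip_reads * ON_CHIP_COST


if __name__ == "__main__":
    for reuse in (False, True):
        off, on = conv1d_access_counts(10, 3, reuse)
        print(f"reuse={reuse}: off-chip={off}, on-chip={on}, "
              f"energy={relative_energy(off, on)}")
```

Under these assumed costs, the version with local reuse performs more on-chip reads but far fewer off-chip reads, so its total comes out lower. The actual savings on real hardware depend on the memory hierarchy and on the processing order.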

Q: What changes have you made on the software side to boost efficiency?

A: The more that software aligns with hardware-related performance metrics like energy efficiency, the better we can do. Pruning, for example, is a popular way to remove weights from a deep network to reduce computation costs. But rather than remove weights based on their magnitude, our work on energy-aware pruning suggests you can remove the more energy-intensive weights to improve overall energy consumption. Another method we’ve developed, NetAdapt, automates the process of adapting and optimizing a deep network for a smartphone or other hardware platforms. Our recent follow-up work, NetAdaptv2, accelerates the optimization process to further boost efficiency.
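The contrast between pruning by magnitude and pruning with energy in mind can be sketched as follows. This is a simplified stand-in, not the energy-aware pruning algorithm from Sze's group: the per-weight energy values and the "magnitude per unit energy" ranking are assumptions made purely for illustration.

```python
# Toy comparison of two pruning criteria. Hypothetical sketch only:
# the per-weight energy numbers and the scoring rule are assumptions.

def magnitude_prune(weights, keep):
    """Keep the `keep` weights with the largest absolute value; zero the rest."""
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    kept = set(order[:keep])
    return [w if i in kept else 0.0 for i, w in enumerate(weights)]


def energy_weighted_prune(weights, energy, keep):
    """Keep the `keep` weights with the highest |value| per unit of assumed energy."""
    order = sorted(
        range(len(weights)),
        key=lambda i: abs(weights[i]) / energy[i],
        reverse=True,
    )
    kept = set(order[:keep])
    return [w if i in kept else 0.0 for i, w in enumerate(weights)]


def total_energy(weights, energy):
    """Sum the assumed energy cost of every weight that survived pruning."""
    return sum(e for w, e in zip(weights, energy) if w != 0.0)


if __name__ == "__main__":
    weights = [0.9, -0.8, 0.5, 0.4]
    energy = [10.0, 1.0, 1.0, 1.0]  # assumed: the first weight is costly to use
    by_magnitude = magnitude_prune(weights, keep=2)
    by_energy = energy_weighted_prune(weights, energy, keep=2)
    print(by_magnitude, total_energy(by_magnitude, energy))
    print(by_energy, total_energy(by_energy, energy))
```

In this toy case both criteria keep two weights, but the magnitude-only rule retains the one assumed to be expensive, while the energy-weighted rule drops it. A real method would also have to check that accuracy holds up after pruning, which this sketch does not attempt.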

Q: What low-power AI applications are you working on?

A: I’m exploring autonomous navigation for low-energy robots with Sertac Karaman. I’m also working with Thomas Heldt to develop a low-cost and potentially more effective way of diagnosing and monitoring people with neurodegenerative disorders like Alzheimer’s and Parkinson’s by tracking their eye movements. Eye-movement properties like reaction time could potentially serve as biomarkers for brain function. In the past, eye-movement tracking took place in clinics because of the expensive equipment required. We’ve shown that an ordinary smartphone camera can take measurements from a patient’s home, making data collection easier and less costly. This could help to monitor disease progression and track improvements in clinical drug trials.

Q: Where is low-power AI headed next?

A: Reducing AI’s energy requirements will bring AI to a wider range of embedded devices, extending its reach into tiny robots, smart homes, and medical devices. A key challenge is that efficiency often requires a tradeoff in performance. For wide adoption, it will be important to dig deeper into these different applications to establish the right balance between efficiency and accuracy.





MIT News









