If you’ve ever told Siri to call your friend Bob and she answers with, “Calling cops,” you’ve seen the instability of artificial intelligence (AI) in action.
Those mistakes reflect a limitation of the AI technology known as deep learning. They arise from the design of the deep neural network, as well as the network’s “training,” which applies mathematical optimization methods to massive amounts of data rather than hand-crafting rules to accomplish a specific task.
Emory College mathematician Lars Ruthotto has dedicated his research to modeling and solving such 21st-century problems with the innovative use of differential equations that date back to the late 1600s. The National Science Foundation has rewarded his efforts with a CAREER Award, its most prestigious honor for junior faculty.
Put simply, Ruthotto is pioneering a new field — combining applied math, engineering and computer science — that applies the logic of differential equations to refine the chaos of deep learning.
“Focusing on this research question can impact specific areas of deep learning now as well as emerging technology,” says Ruthotto, an assistant professor in mathematics and computer science. “It’s uncharted territory, and my students and I will be at the forefront exploring it.”
The award is recognition of the new knowledge Ruthotto and his students are creating in the emerging field and also a hint of what’s to come, says Vaidy Sunderam, chair of the Department of Math and Computer Science in Emory College of Arts and Sciences.
“This grant establishes Emory as a research and education pioneer in innovative methods for robust deep learning, a key technology in the coming AI decade,” Sunderam says.
The problem with deep learning
Overall, AI is seen as a technological wonder that has the potential to make life more convenient, as with Siri and other personal assistants, as well as more efficient, with self-driving cars and predictive analytics from machine learning that help with everything from medical diagnoses to stock market investments. All are AI technologies based on deep learning.
Deep learning relies on artificial neural networks with many hidden layers, which are known to be tricky to design and train so that they reliably perform critical tasks.
Ruthotto investigates the stability of those networks, or their ability to perform reliably when their input is perturbed. While being insensitive to background noise is desirable in Siri to avoid dialing a wrong number, the risks are clearly far greater if a machine makes a mistake with an autonomous vehicle.
Ruthotto began looking at differential equations as a modeling tool for deep neural networks about a year ago. His work relates the deep learning problem to other problems of optimizing the parameters of differential equations that he has tackled in applications of medical and geophysical imaging throughout his career.
Relating deep learning to those problems and related areas in physics, where trajectory optimization relies on sophisticated computations to find the shortest route that does not violate a specific set of constraints, provides a new direction in AI research.
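For readers who want to see the idea in concrete terms, here is a minimal sketch of how a deep network can be read as a differential equation. It is not taken from Ruthotto's work. It assumes the common interpretation that each residual layer acts like one forward Euler time step, and the weights, step size, layer count and input values are hypothetical, chosen only for illustration. The sketch passes a clean input and a slightly noisy copy through the same layers and compares how far apart the two results end up, which is one simple way to think about stability.

```python
import math

def residual_forward(y, weights, biases, h):
    """Apply one residual layer per time step: y <- y + h * tanh(K y + b).

    Assumption for illustration: this is forward Euler applied to
    dy/dt = tanh(K(t) y + b(t)), with step size h.
    """
    for K, b in zip(weights, biases):
        z = [sum(K[i][j] * y[j] for j in range(len(y))) + b[i]
             for i in range(len(y))]
        y = [y[i] + h * math.tanh(z[i]) for i in range(len(y))]
    return y

def distance(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical setup: 20 layers that share one antisymmetric weight matrix.
n_layers = 20
h = 0.1
K = [[0.0, 1.0], [-1.0, 0.0]]
weights = [K] * n_layers
biases = [[0.0, 0.0]] * n_layers

x = [1.0, 0.5]                      # clean input
x_noisy = [1.0 + 1e-3, 0.5 - 1e-3]  # same input with a small perturbation

out = residual_forward(x, weights, biases, h)
out_noisy = residual_forward(x_noisy, weights, biases, h)

# Ratio of output gap to input gap: how much the network amplified the noise.
amplification = distance(out, out_noisy) / distance(x, x_noisy)
print("noise amplification:", amplification)
```

In this toy setting the amplification cannot exceed (1 + h) raised to the number of layers, because tanh never stretches differences and the chosen matrix has norm 1. Larger weights or a larger step size loosen that bound, which hints at why the design of the layers matters for stability.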
The interdisciplinary nature of Ruthotto’s work also adds complexity to his goal of using math models to ultimately simplify the learning networks’ design.
Once he can demystify the design, he will be able to create the theoretical guidelines that help practitioners train networks to learn — making deep learning more reliable, more efficient and safer.
“We will be able to create optimization algorithms for a whole new breed of deep learning networks,” Ruthotto says. “This new breed will be more stable, ergo more predictable and more trustworthy.”
Students needed
A critical component in Ruthotto’s work — and the NSF CAREER Award — is equipping students with the skills needed to bridge the math and computer science involved. In fact, his most valuable contribution to the whole problem may be to populate the field whose research he is leading.
Already, he teaches a graduate-level course, Numerical Methods for Deep Learning, that covers the rigorous math required to understand the instabilities of network architectures and the resulting problems of derivative-based learning algorithms.
Plans are underway to recruit and involve undergraduates in research this summer. Computer science majors who understand the applications of deep learning will learn about the mathematics underlying its successes and failures, Ruthotto says.
Those with an applied math background will have a chance to apply their knowledge of differential equations and optimization to advance data science and discover an area with plenty of employment opportunities, he adds.
“There is great opportunity here,” Ruthotto says. “It used to be quite a complicated task to design a neural network and relate two different architectures. As in many other areas, mathematical modeling can also provide new tools that simplify deep learning.”