Vladimir Vapnik | Unlocking a Complex World Mathematically
Professor of Computer Science
This profile is included in the publication Excellentia, which features current research of Columbia Engineering faculty members.
Photo by Eileen Barroso
“When the solution is simple, God is answering,” Albert Einstein once commented. He believed we could discover nature’s laws only when they connected a few variables, like the relationship between temperature and pressure or energy and mass.
“When the number of factors coming into play is too large, scientific methods in most cases fail,” Einstein said.
Of course, Einstein did not have computers. Vladimir Vapnik does.
Vapnik works in machine learning, a discipline that uses algorithms to automatically detect laws of nature that depend on hundreds or even thousands of parameters. This enables computers to make better predictions and also provides insights into the elusive nature of human learning.
Today’s machine learning technology requires many examples to generate accurate rules. Yet humans clearly learn to understand their complex world from far fewer examples. This led Vapnik to consider how teachers provide students with what he calls “privileged information,” holistic knowledge often delivered as metaphors and comparisons.
Master classes for musicians are an example.
“The teachers cannot show students how to play an instrument because their technique is not as good,” he said. “Instead, teachers may use metaphors or comparisons to show students how to understand a piece. This may sound like nonsense in terms of musical technique, but it helps them play better.”
Vapnik has shown mathematically that privileged information could cut the number of samples needed for machine learning to roughly the square root of the original number.
“Instead of 10,000 examples, we would need only 100,” he said.
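To make the arithmetic behind that figure concrete, here is a minimal Python sketch. It assumes the simplified relationship quoted in this profile (required examples shrink to about the square root of the original count) and ignores the constants and conditions that the underlying theory would involve. The function name `samples_with_privileged_info` is hypothetical and used only for illustration.

```python
import math


def samples_with_privileged_info(n_standard: int) -> int:
    """Illustrative estimate only: return the integer square root of the
    number of examples needed without privileged information.

    This mirrors the simplified claim in the text, not a formal bound.
    """
    if n_standard < 0:
        raise ValueError("number of examples must be non-negative")
    return math.isqrt(n_standard)


if __name__ == "__main__":
    # Show how the estimate scales for a few training-set sizes.
    for n in (10_000, 1_000_000):
        print(n, "->", samples_with_privileged_info(n))
```

Under this assumption, a task that needs 10,000 examples would need about 100, and the savings grow with the size of the original training set.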
He demonstrated this by using privileged information to help a computer identify handwritten digits. He asked Professor of Russian Poetry Natalia Pavlovitch to write a short verse describing her feelings about each sample digit. That information was subjective and could not be obtained by analyzing the digits alone. Including it during training yielded more accurate results than training on the digits alone.
Vapnik also used surgeons’ descriptions of biopsy pictures, ranging from “quiet” to “wide aggressive proliferation,” to improve the classification of tumors. The notes were impressionistic, but they improved the computer’s ability to identify cancerous cells.
Humans frequently use such holistic privileged information to make sense of complex phenomena. Providing it to machines could open a new door onto a complex universe.
“For 2,000 years, we believed logic was the only instrument for solving intellectual problems. Now, our analysis of machine learning is showing us that to address truly complex problems, we need images, poetry, and metaphors as well,” Vapnik concluded.
M.S., Uzbek State University, Samarkand, 1958; Ph.D., Institute of Control Sciences, Moscow, 1964