Alumnus's algorithms are frugal with resources yet have a wealth of applications

If you need to monitor something big and complex with a tight budget of money, electrical power, or some other resource, Carlos Guestrin (PhD 2003 CS) knows just what to do. The Carnegie Mellon computer science assistant professor blends machine learning and optimization to produce algorithms that can tackle a wide variety of tasks. Applications range from cost-effectively placing contamination sensors in a municipal water system to determining which of the thousands of blogs around the Internet are the most influential. Guestrin’s work so fascinated the editors of Popular Science magazine that earlier this fall they named him to their “Brilliant 10,” an annual list of top scientists under 40 years old (BioE Assistant Professor Karl Deisseroth was also on the list). Below, Guestrin discusses his work, its many applications, and how, early in his career, he decided to give his theoretical research a strong grounding in real problems.

What was the Popular Science award for, and how is that an outgrowth of what you are trying to do?

The question that came out of my research turned out to be a very general one, but it started after Stanford, when in 2003 I was working at an Intel lab in Berkeley. The researchers there developed and deployed wireless sensors. I was talking to them about a deployment in a redwood forest, and I asked, “How did you decide where to put the sensors?” And they said, “Oh, basically using intuition.” I thought maybe I could help them out. It turned out to be a much bigger, more interesting problem than I expected: this idea of optimizing what information to gather. Maximizing the information gathered at the minimum possible cost turned out to be extremely useful in a number of real-world applications.

For example, when we were just starting out, we worked with researchers at UCLA and USC who wanted to monitor lakes and rivers for different kinds of algal blooms. When pollutant levels get high, algae can grow out of control and have a devastating effect on the water resource. The researchers wanted to characterize when this happens and why. One way of doing this is to collect data from lakes and rivers, and there are cool sensors mounted on buoys and on little boats that carry sensors around. One of the tasks was to optimize what information gets collected, because you have limited resources.

Another example is developing a smart chair, a chair that can make predictions about your posture. This is part of Carnegie Mellon’s Quality of Life Technology Center, where my colleagues are developing technology for the aging and disabled population. The first chair developed for the project was based on a very interesting sensor that measures pressure per square centimeter. It gives you a really nice view of what is going on when you sit on the chair. The problem is that the sensor cost per chair was about $16,000. So the question is, can you achieve the same prediction quality at a fraction of the price? We built a prototype, and our version cost only about $100 with the same prediction accuracy.

That’s a tremendous savings. Did you use cheaper sensors that you could reconfigure in a way that would accomplish the same prediction as the expensive one?

Exactly. That is the idea. Let me give you another example in water distribution systems, the pipes that bring water to your tap. If the water is contaminated, it can spread and infect thousands of people. What do water systems do now about contamination? Essentially, we put chlorine in at the source and hope for the best. We are interested in implementing a system with sensors that can detect contamination quickly. The problem, again, is that the sensors are really expensive; they cost about $14,000 each. This is such a big issue that we participated in a competition in which we were given a model of a city and a simulator from the Environmental Protection Agency, and we had to tell them where the sensors should be placed. We won the competition using the algorithms that we had developed. Most of the algorithmic ideas were the same ones we used for the smart chair or for monitoring algal blooms, though some new insights came out of addressing the huge, city-wide scale required by the competition.

Would this perhaps allow a reduction in the use of chlorine?

Right now, systems sometimes over-chlorinate the water just to be extra safe. If we had effective sensing, we could do this in a more adaptive fashion. But chlorine also can’t protect you from everything; it’s just one type of protection. There are all sorts of other contaminants that might appear that we want to be aware of and treat. So these are examples of physical sensing. As we were working on this project, I was sitting one day with my students Andreas Krause and Jure Leskovec, discussing the possibilities, and we wondered whether there were any other problems in which this sort of optimization would be useful. We came up with this idea of what it would mean to instrument information. Let me give you an example. There are thousands of blogs and millions of blog postings on the Web. Which blogs would we need to follow to stay on top of the big stories? It seems like a totally different problem from the ones we’ve discussed so far, but in fact it can be modeled in a very similar way, and we can optimize the information gathered. So we took blog data from 2006 and, in a paper, published a list of blogs that you should read in order to stay on top of the most important information.

What were the criteria that the algorithm considered to distinguish among the blogs?

Here we use the notion of an information cascade. Suppose somebody posts a story, and other people write about it and link to it. More people write and link in turn, and so on, creating a cascade of information on the Web about that story. Now we say a “better” blog captures the story as early as possible. If the story is going to be big, many people are going to write about it, so the number of people who pick it up after you is a measure both of how important the story was and of how good your blog is.
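To make that scoring idea concrete, here is a minimal Python sketch under assumed inputs (the data format, the function name score_blogs, and the toy cascades are illustrative, not the actual system): each cascade is a list of (blog, timestamp) posts, and a blog earns one point for every post in the cascade that appears after that blog's first post on the story.

```python
# Minimal sketch (assumed data format): each cascade is a list of
# (blog, timestamp) posts about one story. A blog earns one point for
# every post in the cascade that appears after its own first post,
# rewarding blogs that pick up big stories early.
from collections import defaultdict

def score_blogs(cascades):
    scores = defaultdict(int)
    for cascade in cascades:
        posts = sorted(cascade, key=lambda p: p[1])   # order posts by time
        first_seen = {}                               # blog -> its earliest post time
        for blog, t in posts:
            first_seen.setdefault(blog, t)
        for blog, t0 in first_seen.items():
            scores[blog] += sum(1 for _, t in posts if t > t0)
    return dict(scores)

# Toy example: blog "a" catches the first story earliest and gets the most credit there.
cascades = [[("a", 1), ("b", 2), ("c", 3)], [("b", 1), ("a", 2)]]
print(score_blogs(cascades))   # {'a': 2, 'b': 2, 'c': 0}
```

Choosing which handful of blogs to actually read then becomes a set-selection problem over these cascades, which is where the optimization discussed below comes in.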

You’ve talked a lot about optimization. At Stanford you studied machine learning. What’s the connection?

Although all of these applications are very different, they share a common mathematical property: diminishing returns. Say I have four sensors. Adding a fifth is going to help me more than adding the ten-thousand-and-first would if I already had ten thousand. Anytime you have a problem with this property, which is called a submodular optimization problem, beautiful things happen. We can come up with algorithms that guarantee our solution will be near the optimum. This is the optimization side of our work. We prove that all these applications have this mathematical property, and then design algorithms that exploit it to come up with near-optimal solutions. Even though the number of possible combinations of blogs is enormous (with just twenty blogs there are already more than a million possible subsets), we can still do it very, very fast, because our algorithm doesn’t have to exhaust every combination to reach a near-optimal solution.

Machine learning allows you to take data and make predictions. We can use the quality of our predictions as a measure of how good a set of sensors would be, whether for reading a set of blogs or for putting a set of sensors in a chair. And so, by building these machine learning models, we can ask: if I were to put these four sensors out here, how good a prediction would the models make? I have some data, and I want to make predictions based on the data I collect. The data I collect depends on which sensors I put in, and the machine learning models go from data to predictions, so they can measure how good a prediction I would get from a given set of data. That’s how machine learning connects with optimization.
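As a rough illustration of why near-optimal answers come cheaply here, the sketch below implements the textbook greedy rule for maximizing a monotone submodular function under a budget: repeatedly add whichever candidate gives the largest marginal gain. For such diminishing-returns objectives, the greedy answer is guaranteed to be within a constant factor (1 - 1/e, about 63 percent) of the best possible selection. The toy coverage objective and candidate names are assumptions for illustration; the real applications use richer objectives such as expected detection performance or prediction quality, along with faster "lazy" evaluations.

```python
# A minimal sketch of greedy selection for a monotone submodular objective
# under a cardinality budget. The coverage objective below is a stand-in:
# each candidate sensor/blog "covers" a set of items, and the value of a
# selection is how many distinct items it covers (a submodular function).

def greedy_select(candidates, value_fn, budget):
    """Repeatedly add the candidate with the largest marginal gain."""
    chosen = []
    for _ in range(budget):
        remaining = [c for c in candidates if c not in chosen]
        if not remaining:
            break
        best = max(remaining, key=lambda c: value_fn(chosen + [c]) - value_fn(chosen))
        chosen.append(best)
    return chosen

# Toy data: three candidate placements and the items each one would cover.
coverage = {"s1": {1, 2, 3}, "s2": {3, 4}, "s3": {4, 5, 6, 7}}

def value(selection):
    covered = set()
    for c in selection:
        covered |= coverage[c]
    return len(covered)

print(greedy_select(list(coverage), value, budget=2))   # ['s3', 's1']
```

The key design point is that the greedy loop only ever evaluates one candidate addition at a time, so it never touches the exponential number of possible subsets.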

What was your thesis topic at Stanford?

The topic was how to scale up decision making under uncertainty to very, very large problems. A key insight from my thesis was that even though you might have a very large problem, like a big factory, in which you are trying to make decisions, you can exploit the structure within the problem. The problem might look very complicated, but it is usually made up of interacting sub-parts that can’t be solved independently of each other; still, you can exploit the fact that each part is small and that the interactions themselves are structured. There’s a flow within the factory. Exploiting that structure, that flow, to solve very large problems was the focus of my thesis work. We developed a number of algorithms that can exploit that kind of structure to scale up to very large problems.

Take a car, for example. The whole car is a very complex system, and there are many parts, many subsystems, in the car. The fact is that even though they interact, they interact in a very structured manner. The injection system connects to the engine, which then connects to the exhaust system, and so on. You break the problem into simpler ones that you can handle almost separately, but you must also deal with the interactions between them. We have a way of characterizing the interactions and their effects, and of exploiting the local interactions between subsystems to scale up to something very large.
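As a toy illustration of exploiting that kind of local structure (not the actual thesis algorithms, which handle much richer interaction patterns), the sketch below optimizes a chain of subsystems, say injection system, engine, exhaust, by dynamic programming: each step only reasons about one pair of neighboring subsystems, so the best joint configuration is found without enumerating every combination. The subsystem settings and cost function are made up for the example.

```python
# Minimal sketch: find the cheapest joint configuration of a chain of
# subsystems in which costs only couple neighboring subsystems. Dynamic
# programming along the chain keeps the work linear in the number of
# subsystems instead of exponential in it.

def best_chain_cost(settings, pair_cost):
    """settings: one list of options per subsystem (in chain order).
    pair_cost(i, a, b): interaction cost when subsystem i picks a
    and subsystem i + 1 picks b."""
    # best[b] = cheapest cost of the chain so far, given the current
    # subsystem picks option b
    best = {b: 0.0 for b in settings[0]}
    for i in range(1, len(settings)):
        best = {
            b: min(best[a] + pair_cost(i - 1, a, b) for a in settings[i - 1])
            for b in settings[i]
        }
    return min(best.values())

# Toy example: three subsystems with two settings each; mismatched
# neighboring settings incur a cost of 1, matched settings cost 0.
settings = [["low", "high"], ["low", "high"], ["low", "high"]]
cost = lambda i, a, b: 0.0 if a == b else 1.0
print(best_chain_cost(settings, cost))   # 0.0 (all subsystems agree)
```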

You mentioned that after Stanford, you worked for Intel for a while. Did you expect that you would return to academia as a professor?

When I finished at Stanford, I got a job at Carnegie Mellon, but I left for a year to work at Intel. My thesis work got me the job at Carnegie Mellon, but it was more on the theory side, and I really wanted to work with real-world applications. I thought wireless sensors were going to be an important technology for the future, and at Intel there was a lab where they develop, build, and deploy wireless sensors. It’s more of a systems and engineering lab. So I went there for a year, learned about the technology and the problems and issues that arose, and tried to connect some of the machine learning, artificial intelligence, and optimization ideas I was working on with real-world applications. After that, we had so many cool applications to work on.