How data can help us understand cancer and its treatment

The dream, says a bioengineer, is to have cancer-relevant medical data flow unimpeded around the world, so that everyone, wherever they are, can see and use this information.

Illustration by Tricia Seibold and iStock/liuzishan

During his 2016 State of the Union address, President Barack Obama called on Vice President Joe Biden – who had months earlier lost his son Beau to brain cancer – to head a “moonshot” to significantly accelerate research into the disease. The president said he wanted to harness the spirit of American innovation that took us from zero to landing a man on the moon in a decade to similarly find new ways to prevent, diagnose and treat cancer.

One of those intrigued by that call to action was Stanford’s Jan Liphardt, an associate professor of bioengineering who specializes in biophysics, the tumor microenvironment and data analysis. Stanford Engineering talked to Liphardt about how he came to be involved with the moonshot, his approach to using data and the voice of patients to better understand cancer and its treatment, and how sharing information can better inform the course of cancer research.

How did you get involved in the National Cancer Moonshot?

In March, after the president’s charge, the vice president challenged scientists, doctors, industry and patients to give their best ideas to the moonshot. The White House also reached out to a few outsiders, myself included. The White House instructions were unusual: “Do something big and different. There is no money and you have 87 days. Go.”

I like a challenge, and this was a chance to serve, even in the face of administrative hurdles. So I looked for advice, teammates and support. Russ Altman, a colleague at Stanford, suggested it was time to give patients a way to volunteer their own health data in order to help find cures. I collaborated with Peter Kuhn, a professor of medicine and engineering at the University of Southern California, who’s known for carefully listening to cancer patients, advocates and their supporters. In short order we had links with advocates like AnneMarie Ciccarella, Sonja Durham, Lori Marx-Rubiner, Jack Whelan and Jack Park. That’s how we got to

What’s the idea the team came up with?

We thought for about a week: What would matter to the patients that Stanford and other research institutions serve? What would scale? Well, we’re not going to run a clinical trial, go near protected health information, invent a new drug or write a research proposal. There’s no time for that. Whatever it was, it had to be useful, scalable, legal and different. That pointed to data, the web, patients and decisions.

One thing jumped out: Right now, there’s significant friction in medical data sharing. People all over the world can already effortlessly share other kinds of information – pictures, movies, ideas, stories, tweets. Increasingly, they are using the same tools to share personal medical information. It’s remarkable what cancer patients already share: diagnoses, genomes, pathology images. But that information is not yet widely used to understand where they are with their diseases.

Ideally, everyone, including scientists and doctors, would have as much information as possible at their fingertips. Many patients think when they give data for research, magically scientists all over the world can dig into this information, find patterns and help. The practical reality is that it’s nearly impossible for any one scientist to access the amounts of data they would like.

So that’s the simple idea: Build a global map and give patients the tools they need to share their data – if they want to. They can donate information for the greater good. In return, we make a simple promise: When you post data, we’ll anonymize them and make them available to anyone on Earth in one second. We plan to display this information like real-time traffic data. HIPAA doesn’t apply to this direct data-sharing. The patients can give us whatever information they want, and they can tell us what they want us to do with it. We’re a conduit. Their data belong to them, not to us.

How does it work?

Today we ask just five basic questions. Over time we will add more. You join, give some information, and we’ll put you on a global map. Right now, some of the things we don’t know about cancer are incredibly simple: Where is everyone on Earth with cancer? How old are they? What is their diagnosis? Did their cancers metastasize? Global, instantaneous data sharing is the story.

In a second phase, we are going to see if we can plot all the information just like Waze does for traffic. Our role is to synthesize the information and plot it in ways that ordinary people can understand. Think of it this way – patients want to be able to chart their treatment path. Who went straight, who went left? People just getting on the highway are curious about what the people who came before them did, and what happened to those people. Did they arrive at the destination easily and promptly? We’re a real-time diagnosis and therapy mapping service for cancer.

You say that giving patients a way to share their health data is important to help find cures. Why?

Let me give you a specific example. At Stanford, I’m part of a team of cancer biologists and clinicians funded by the Stanford Cancer Institute to think about the next generation of screening for breast cancer in the U.S. Every year, the U.S. uses mammography to screen more than 40 million women for breast cancer. In this project, it quickly became clear that there is currently no central, easy-to-access repository of mammograms for research use.

That’s a major lost opportunity – our nation spends billions on screening, but we don’t store, share and analyze this information in a scalable and simple manner. In the traditional approach, our team would spend several hundred thousand dollars, and about three years, to assemble perhaps 1,000 mammograms. We would then use this tiny dataset to try to find something interesting, but since the dataset is so small, we would be blind to rare features of breast cancer and its predictors. It clearly makes a lot more sense to compare and explore 100 million images.

This sounds completely impossible until you realize that Instagram users upload 58 million images every day. Once you start to think about supposedly intractable research problems from a web or social networking perspective, new possibilities open. Imagine, for example, if there were a simple way for every single woman on Earth to upload and share her de-identified mammogram? Or more generally, imagine a world in which patients have the tools to globally share de-identified health data, if they want to. That’s exactly the idea behind CancerBase – let’s just give people those tools and see what happens.

How much data and how many people are needed to make this viable?

We think we are going to need several tens of thousands of members. There are approximately 50 million people on Earth with a cancer diagnosed in the last five years, and 200 million more people have an immediate family member with cancer. Almost 2 billion people are active on Twitter and Facebook – a quarter of the world’s population. If just a few percent of those people sign up, we could do something no one on Earth has done before.

Are there hopes to create a “developer community,” people who find ways to use your data that you didn’t even think about or have the time to work on?

Definitely. As much as we think we can predict what these data are useful for, we don’t really know. If we make the anonymized data available to everyone within one second, people might start to do things that we never dreamed of. The more eyes that look at these data, the better off everyone will be. The dream is to have cancer-relevant medical data flow unimpeded around the world in seconds, so that everyone, wherever they are, can see and use this information.
