Why you should care
Because the next big cure could be sitting somewhere in all that stored data.
No one needed to drag 12-year-old Atul Butte to Macy’s. At the time, department stores carried some of the latest and greatest in technology, aka personal computers, and young Atul was smitten. He’d bring spiral notebooks filled with computer programs from home and type in the code while his parents (from India, settled in South Jersey) did their shopping.
After Butte’s parents splurged on a new Apple II Plus, the rest was history: a computer science degree from Brown, summers at Apple and Microsoft, an M.D. (inspired by National Geographic articles on medical technology). Throw in a Ph.D. from MIT, for a whopping 17 years of post–high school education, and you’ll find that Butte is still tinkering. Only now he’s at the cutting edge of medicine and big data. After a decade at Stanford, Butte was tapped last year to head clinical informatics for all five of the University of California’s medical systems, and to lead UCSF’s new Institute for Computational Health Sciences. UCSF Chancellor Sam Hawgood calls Butte “a visionary leader” who will “revolutionize” the conception of wellness and medicine.
Butte’s mandate? To change the way the health care system uses and interprets (yes, you guessed it) “big data.” And by big data, we mean BIG data: the health records of over 14 million people in the UC system and countless research studies. If record keeping doesn’t sound sexy, it’s probably because you don’t realize that sloppy records could keep the best available cures from being found. Butte’s history of extracting insight from massive, forgotten messes made him a logical choice for the role.
Then there’s the fact that Butte has proved himself adept at monetizing said data insights. Last year, he sold his 2-year-old company, Carmenta Bioscience, to prenatal testing company Progenity for an undisclosed amount. (Butte puts it this way: “The inventors and the investors are really happy.”) The sale followed Butte’s discovery of a key diagnostic for preeclampsia, the sudden and life-threatening spike in blood pressure that, along with other pregnancy-related hypertension disorders, kills 76,000 women and 500,000 infants each year. The data was hidden in a mix of old studies published online. “It’s as stupidly naive as that,” Butte says.
Butte’s work is part of a wider movement in open-source data for medical discovery, in which researchers extract insights from old, published studies. The ethos here is signature Silicon Valley: Shake up what already exists with some beautiful software, and circumvent slow-moving institutions like the Food and Drug Administration and Big Pharma altogether. A couple of handfuls of software biotech firms, mostly in the Valley, Boston and New York, are trying to pave the way. And there are also individuals, like Butte’s “hero,” Robert Langer, a bioengineer, M.D. and MIT professor who co-founded a medical venture fund. Langer in turn calls Butte “a star.”
Butte is a data evangelist, and he infuses the lyrical into his cause. That so few faculty start companies is a “tragedy,” he says. Were it up to him, his lab-based colleagues would treat entrepreneurialism as de rigueur; even his 13-year-old daughter watches Shark Tank and knows about valuing companies. Butte is emphatic that open data, and especially biomedical data, is the West’s best export. “We give away this data to the world for free and a kid in Bakersfield or Bangladesh can discover a brand-new drug,” he says in his singsongy cadence. Then that kid in Bangladesh can sell it back to the U.S. for a few million dollars, a fraction of what it would cost a pharmaceutical company to make the same drug.
Not everyone thinks it’s fair to use or profit off the research of others. In January, Jeffrey M. Drazen, editor-in-chief of the New England Journal of Medicine, penned a critical editorial about scientists who use data sets from others, calling them “research parasites” and warning that research may be misused. The research community exploded and the hashtag #IAmAResearchParasite emerged, with people like U.S. Chief Data Scientist DJ Patil and National Academy of Sciences President Marcia McNutt identifying themselves as research parasites in defiance. Four days later, the prestigious journal issued a clarifying editorial. The backlash speaks to the quickly evolving nature of the field and the ongoing conversation around work product and fair use. “We need people with [Butte’s] expertise to help maximize the information that can be derived from clinical trial data,” Drazen writes in an email to OZY, but, he adds, patients in these trials deserve “special consideration” since they are putting themselves “at risk for the data to be accrued.”
When Butte isn’t at his stand-up desk in his open office, a force field of computer monitors surrounding him (he’ll be moving into a new building that’s going up next door, at UCSF’s Mission Bay Campus, he tells me), he’s hiking, sailing or leveraging his own data to keep off the 50 pounds he recently lost. It’s all about that data. And that’s unlikely to stop anytime soon. When I ask Butte about the next several years, he instantly responds that it will include more inventions and more companies. And he hopes others catch on. Until then, he’ll keep giving away his “secrets,” imploring med students to study computer science and telling anyone who will listen about untapped potential sitting in the interweb. Because, as they (might soon) say, mo’ data, mo’ money.