image: MIGHT algorithm for AI-informed medical decisions and MIGHT-informed liquid biopsies for distinguishing cancer from inflammatory diseases
Credit: Elizabeth Cooke
**EMBARGOED FOR RELEASE UNTIL AUG. 18 AT 3 P.M. ET**
Two studies led by Johns Hopkins Kimmel Cancer Center, Ludwig Center, and Johns Hopkins Whiting School of Engineering researchers report on a powerful new method that significantly improves the reliability and accuracy of artificial intelligence (AI) for many applications. As an example, they apply the new method to early cancer detection from blood samples, known as liquid biopsy.
One study reports on the development of MIGHT (Multidimensional Informed Generalized Hypothesis Testing), an AI method that the researchers created to meet the high level of confidence needed for AI tools used in clinical decision making. To illustrate the benefits of MIGHT, they used it to develop a test for early cancer detection using circulating cell-free DNA (ccfDNA)—fragments of DNA circulating in the blood. A companion study found that ccfDNA fragmentation patterns used to detect cancer also appear in patients with autoimmune and vascular diseases. To develop a test with high sensitivity for cancer but reduced false-positive results, MIGHT was expanded to incorporate data from autoimmune and vascular diseases obtained from colleagues at Johns Hopkins and other institutions who treat and study these diseases.
The studies, supported in part by the National Institutes of Health, are to be published on Aug. 18 in the Proceedings of the National Academy of Sciences.
A related article, authored by three researchers from Johns Hopkins, Pixar co-founder Ed Catmull, Ph.D., and Microsoft chief data scientist of the AI for Good Lab Juan Lavista Ferres, was published concurrently in Cancer Discovery, a publication of the American Association for Cancer Research. It discusses the challenges of incorporating AI into clinical practice, including challenges addressed by MIGHT.
MIGHT fine-tunes itself using real data and checks its accuracy on different subsets of the data, using tens of thousands of decision trees. It can be applied to any field employing big data, ranging from astronomy to zoology, and is particularly effective for the analysis of biomedical datasets with many variables but relatively few patient samples, a common situation in which traditional AI models often falter.
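The general idea of fitting many trees and scoring each sample only with trees that never saw it can be sketched as follows. This is a minimal illustration of out-of-bag evaluation with bagged one-split trees, not the MIGHT implementation (which the researchers distribute at treeple.ai). The function names, the synthetic data, and the choice of 500 trees are assumptions made for the example.

```python
import random

def fit_stump(X, y, rng):
    """Fit a one-split tree on one randomly chosen feature (hypothetical helper)."""
    j = rng.randrange(len(X[0]))
    values = sorted(set(row[j] for row in X))
    best = None
    # Try a cut at every observed value except the largest, so both sides are non-empty.
    for cut in values[:-1]:
        left = [label for row, label in zip(X, y) if row[j] <= cut]
        right = [label for row, label in zip(X, y) if row[j] > cut]
        p_left, p_right = sum(left) / len(left), sum(right) / len(right)
        # Each side predicts its fraction of positives; score the split by squared error.
        err = sum((v - p_left) ** 2 for v in left) + sum((v - p_right) ** 2 for v in right)
        if best is None or err < best[0]:
            best = (err, cut, p_left, p_right)
    if best is None:  # the feature is constant in this bootstrap sample
        p = sum(y) / len(y)
        return j, 0.0, p, p
    return j, best[1], best[2], best[3]

def out_of_bag_scores(X, y, n_trees=500, seed=0):
    """Average, for each sample, the predictions of trees whose bootstrap bag excluded it."""
    rng = random.Random(seed)
    n = len(X)
    totals, counts = [0.0] * n, [0] * n
    for _ in range(n_trees):
        bag = [rng.randrange(n) for _ in range(n)]  # bootstrap sample, drawn with replacement
        in_bag = set(bag)
        j, cut, p_left, p_right = fit_stump([X[i] for i in bag], [y[i] for i in bag], rng)
        for i in range(n):
            if i not in in_bag:  # held-out sample for this tree
                totals[i] += p_left if X[i][j] <= cut else p_right
                counts[i] += 1
    return [t / c if c else None for t, c in zip(totals, counts)]

# Synthetic data (an assumption for the demo): 40 samples, 5 features, only the first is informative.
rng = random.Random(1)
y = [i % 2 for i in range(40)]
X = [[3.0 * y[i] + rng.gauss(0, 1)] + [rng.gauss(0, 1) for _ in range(4)] for i in range(40)]
scores = out_of_bag_scores(X, y)
```

Because every score comes from trees that were fit without that sample, the scores behave like held-out predictions, which is one common way to keep accuracy estimates honest when samples are scarce.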
In tests using patient data, MIGHT consistently outperformed other AI methods in both sensitivity and consistency. It was applied to the blood of 1,000 individuals: 352 patients with advanced cancers and 648 individuals without cancer. For each sample, the researchers evaluated 44 different variable sets, each consisting of a set of biological features, such as DNA fragment lengths or chromosomal abnormalities. They found that aneuploidy-based features (those reflecting an abnormal number of chromosomes) delivered the best cancer detection performance, with a sensitivity of 72% (the share of cancers detected) at 98% specificity (the share of cancer-free individuals correctly identified as such). This balance is critical in real-world medical applications where minimizing false positives is necessary to avoid unneeded procedures.
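A figure such as "sensitivity at 98% specificity" is typically obtained by first choosing a score threshold that keeps false positives among controls within the allowed rate, then counting how many cancer samples exceed it. The sketch below shows that calculation on made-up scores. The function name and the toy numbers are assumptions for illustration and are not taken from the studies.

```python
def sensitivity_at_specificity(cancer_scores, control_scores, target_specificity=0.98):
    """Return (threshold, sensitivity, achieved_specificity).

    A sample is called positive when its score is strictly greater than the threshold.
    """
    controls = sorted(control_scores)
    n = len(controls)
    # Largest number of controls that may be called positive (small epsilon guards rounding).
    max_false_pos = min(int(n * (1 - target_specificity) + 1e-9), n - 1)
    # Use the (max_false_pos + 1)-th highest control score as the threshold.
    threshold = controls[n - max_false_pos - 1]
    false_pos = sum(s > threshold for s in controls)
    specificity = 1 - false_pos / n
    sensitivity = sum(s > threshold for s in cancer_scores) / len(cancer_scores)
    return threshold, sensitivity, specificity

# Toy scores (assumed): 100 controls spread evenly from 0.00 to 0.99, and 5 cancer samples.
toy_controls = [i / 100 for i in range(100)]
toy_cancers = [0.5, 0.97, 0.985, 0.99, 0.999]
threshold, sensitivity, specificity = sensitivity_at_specificity(toy_cancers, toy_controls)
print(threshold, sensitivity, specificity)
```

Fixing specificity first reflects the priority described above: the false-positive rate is capped, and sensitivity is whatever the test can achieve under that cap.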
“MIGHT gives us a powerful way to measure uncertainty and increase reliability, especially in situations where sample sizes are limited but data complexity is high,” says Joshua Vogelstein, Ph.D., associate professor of biomedical engineering and a lead investigator.
MIGHT was also extended to a companion algorithm, called CoMIGHT, to determine whether combining multiple variable sets could improve cancer detection.
The researchers applied CoMIGHT to blood samples from 125 patients with early-stage breast cancer and 125 patients with early-stage pancreatic cancer, which were analyzed along with 500 controls (participants without cancer). While pancreatic cancers were more often detected than breast cancers, CoMIGHT analysis suggested that early-stage breast cancer might benefit from combining multiple biological signals, highlighting the tool’s potential for tailoring detection strategies by cancer type.
In the companion study, researchers Christopher Douville, Ph.D., assistant professor of oncology, Samuel Curtis, Ph.D., postdoctoral fellow in the Ludwig Center, and their teams serendipitously discovered that ccfDNA fragmentation signatures previously believed to be specific to individuals with cancer also occur in patients with other diseases, including autoimmune conditions such as lupus, systemic sclerosis and dermatomyositis, and vascular diseases like venous thromboembolism.
Among individuals with abnormal fragmentation signatures, they found an increase in inflammatory biomarkers in all patients, whether they had autoimmune diseases, vascular disease or cancer. Their results suggest that inflammation, rather than cancer per se, is responsible for fragmentation signals, complicating efforts to use ccfDNA fragmentation as a biomarker specific for cancer.
To address the challenge of mistaking inflammation for cancer, the team added information characteristic of inflammation to its training data for MIGHT. The enhanced version reduced, but did not completely eliminate, the false-positive results from non-cancerous diseases. “Our main goal was to further investigate the biological mechanisms responsible for fragmentation signatures that have previously been thought to be specific for cancer,” says Curtis. “As the field moves to more complex biomarkers, understanding the underlying biological mechanisms leading to the results are critical to their interpretation, particularly to avoid false positive results. Our new data indicate that patients with diseases other than cancer can be mistakenly believed to have cancer unless appropriate safeguards are incorporated into the tests.”
Adds Douville, “A silver lining of this study is that reworking of MIGHT could result in a separate diagnostic test for inflammatory diseases.”
Together, the studies demonstrate the promise as well as the complexities of developing trustworthy clinical technologies using AI. In a related editorial, researchers noted several critical challenges that need to be addressed so that tools like MIGHT can be fully integrated into clinical practice.
They identified eight key barriers to bringing AI into routine clinical care. In simple terms, these include the false expectation that AI tools need to be flawless before they’re considered useful; the need to present results as probabilities rather than simple yes-or-no answers; making sure AI predictions match real-world probabilities; ensuring results are reproducible; training models on diverse populations; explaining how AI makes decisions; recognizing how test accuracy can change when diseases are rare; and avoiding over-reliance on computer-generated recommendations.
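The point about rare diseases can be made concrete with Bayes' rule: at fixed sensitivity and specificity, the chance that a positive result is a true positive (the positive predictive value) falls as the disease becomes rarer. The sketch below reuses the 72% sensitivity and 98% specificity reported above. The 1% and 0.1% prevalence values are assumptions chosen for illustration, and 35.2% is simply the share of cancer patients in the 1,000-person study cohort.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a positive test result is a true positive (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Same test characteristics, three different prevalence settings.
for prevalence in (0.352, 0.01, 0.001):
    ppv = positive_predictive_value(0.72, 0.98, prevalence)
    print(f"prevalence {prevalence:.1%}: PPV {ppv:.1%}")
```

Under these assumptions the positive predictive value is roughly 95% in the study cohort, about 27% at 1% prevalence, and about 3.5% at 0.1% prevalence, which is why a test that performs well in a case-control study can still produce mostly false positives in broad screening.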
“MIGHT could be applied to any field where measuring uncertainty and having confidence in the reliability and reproducibility of findings is key. This could be in the natural sciences, social sciences, or medical sciences. Research across all fields of science requires confidence that what the algorithm is spitting out is real, reproducible, and reliable,” says Joshua Vogelstein.
The researchers say results obtained using AI technologies should be viewed as AI-informed data that can complement but not replace clinical judgment. Although MIGHT and CoMIGHT offer powerful new tools in cancer detection, and potentially inflammatory and vascular disease detection, they say that further clinical trials and validation are necessary before such tests can be extended to clinical use.
“Trust in the result is essential, and now that there is a reliable, quantitative tool in MIGHT, we and other researchers can use it and focus our efforts on studying more patients and adding statistically meaningful features to our tests for earlier cancer detection,” says Bert Vogelstein, M.D., Clayton Professor of Oncology, co-director of the Ludwig Center, Howard Hughes Medical Institute investigator, and study co-leader.
MIGHT and its companion algorithm, CoMIGHT, are now publicly available at treeple.ai.
The study is a collaborative effort with researchers in Vietnam, led by Lan Ho-Pham and Tuan Nguyen, who provided critical clinical data, samples, and interpretation to the study.
In addition to Joshua Vogelstein, Douville, Curtis, and Bert Vogelstein, researchers from Johns Hopkins were Tingshan Liu, Sambit Panda, Adam Li, Haoyin Xu, Yuxin Bai, Admin Li, Lisa Dobbyn, Maria Popoli, Janine Ptak, Natalie Stillman, Chris Thoburn, Maximillian Konig, Michelle Petri, Antony Rosen, Christopher Mecoli, Ami Shah, Itsuki Ogihara, Eliza O’Reilly, Yuxuan Wang, Michael Goggins, Tian-Li Wang, Ie-Ming Shih, Amanda Fader, Anne Marie Lennon, Ralph Hruban, Chetan Bettegowda, Kenneth Kinzler, and Nickolas Papadopoulos. The research team also included investigators from the University of Pittsburgh, the University of Texas MD Anderson Cancer Center, and NYU Langone in the U.S.; the University of Melbourne in Australia; Saigon Precision Medicine Research Center, Pham Ngoc Thach University, and Tam Anh Research Institute in Ho Chi Minh City, Vietnam; the University of New South Wales; McGill University Health Centre in Montreal; and Amsterdam University Medical Centers. In addition to Catmull and Ferres, the editorial was authored by Elliot Fishman, Bert Vogelstein, and Joshua Vogelstein.
These studies were supported by National Institutes of Health grants R21NS113016, RA37CA230400, U01CA230691, 5P50CA062924-22, T32GM119998, Oncology Core CA 06973, Ovarian Cancer SPORE, DRP 80057309 and 1R21A1766764-01; the Virginia and D.K. Ludwig Fund for Cancer Research; the Lustgarten Foundation; the Commonwealth Fund; the Thomas M. Hohman Memorial Cancer Research Fund; the Sol Goldman Sequencing Facility at Johns Hopkins; the Conrad R. Hilton Foundation; the Benjamin Baker Endowment 80049589; Swim Across America/Baltimore; JHTV Innovation Grant; the Burroughs Wellcome Career Award for Medical Scientists; the National Health and Medical Research Council Investigator Grant APP1194970; the National Science Foundation Computing Innovation Fellowship 2127309 and award DMS-1921310; the Rheumatology Research Foundation Investigator Award; the Harrington Discovery Institute Scholar-Innovator Award; the Jerome L. Greene Foundation; the Cupid Foundation; and the Stephen & Renee Bisciotti Foundation.
Bert Vogelstein, Kenneth Kinzler, and Nickolas Papadopoulos are founders of Thrive Earlier Detection, an Exact Sciences Company. Kinzler, Papadopoulos, and Christopher Douville are consultants to Thrive Earlier Detection. B. Vogelstein, Kinzler, Papadopoulos, and Douville hold equity in Exact Sciences. B. Vogelstein, Kinzler, and Papadopoulos are founders of and own equity in Haystack Oncology and ManaT Bio. Kinzler and Papadopoulos are consultants to Neophore. Kinzler, B. Vogelstein, and Papadopoulos hold equity in and are consultants to CAGE Pharma. B. Vogelstein is a consultant to and holds equity in Catalio Capital Management. Chetan Bettegowda is a consultant to Depuy-Synthes, Bionaut Labs, Haystack Oncology and Galectin Therapeutics and is a co-founder of OrisDx. Bettegowda and Douville are co-founders of Diagnostics. The companies named above, as well as other companies, have licensed previously described technologies related to the work described in this paper from The Johns Hopkins University. B. Vogelstein, Kinzler, Papadopoulos, Bettegowda, and Douville are inventors on some of these technologies. Licenses to these technologies are or will be associated with equity or royalty payments to the inventors as well as to The Johns Hopkins University. Patent applications on the work described in this paper may be filed by The Johns Hopkins University. The terms of all these arrangements are being managed by The Johns Hopkins University in accordance with its conflict-of-interest policies.