During a typical 15- to 20-minute clinic visit with patients, oncologist Dr. Rom Leidner opens around 20 different files on his computer. They are pieces of a puzzle, creating a picture of the patient’s cancer – blood-test results, weight trends, radiology images, microbiology, pathology, cardiology, electronic messages from other doctors, electronic messages from patients, text pages from nurses and clinic staff, prescriptions, chemotherapy orders, insurance forms.
For some files, he has to log in to specialized software for access. Only after reviewing all that can he actually examine the patient and discuss their cancer care. To keep to his schedule for seeing patients, Dr. Leidner and many other doctors resort to spending their weekends copying and pasting all this information into medical charts ahead of time.
“The last hope of our profession may well be AI-assisted curation of information streams that converge in the exam room,” says Dr. Leidner, a medical oncologist specializing in hematology at Providence Cancer Institute Franz Clinic in Portland, Oregon.
Providence, a 51-hospital healthcare organization serving seven western states in the U.S., is working to do just that, developing research prototype AI tools to sort through growing mountains of patient data, in real time, to improve therapies and to advance the treatment of cancer, which accounts for nearly one in six deaths worldwide.
More refined lab tests, scans and genetic analyses can help promote a better understanding of each patient’s case, resulting in personalized therapies that adapt treatment and medication to each patient’s genetic biomarkers. Assessing all that information is an unfathomably huge task.
“It’s a fortunate convergence in the last two years, that just as we’re approaching a bottleneck with hyper-plexed cellular and molecular data from clinical trials that will exceed human capacity to readily analyze, we appear to be at the doorstep of a transformative technology capable of handling the data streams we foresee in next-gen medical science,” Dr. Leidner says.
Providence is working with Microsoft on prototype AI tools to improve care for cancer patients, accelerate progress in understanding cancer and perhaps find treatments or cures. The project is part of Microsoft’s commitment to apply generative AI to precision health.
“Our research collaboration at Providence is bearing fruits in end-to-end real-world applications toward precision medicine. The underlying technological advances can empower clinical practitioners and researchers to unlock more and more high-value applications for improving patient care and accelerating biomedical discovery,” says Hoifung Poon, Ph.D., general manager of Health Futures at Microsoft Research, who has been collaborating with Providence on these prototype AI tools.
Patient information exists in a variety of formats – electronic medical records (EMRs), imaging scans, genomics and all manner of lab tests. The same information might be noted using different terms or different formats, and extracting the core information requires synthesizing a large amount of unstructured data. This is just the kind of job AI can do well – summarize unstructured data in text form.
For cancer patients who have exhausted the first line of treatments, a clinical trial, in which new treatments are tested, can offer a best last hope. The hard part is finding one. The percentage of U.S. cancer patients participating in a trial is in the single digits, yet, ironically, lack of enrollment is a key reason clinical trials fail.
This problem has nothing to do with a scarcity of clinical trials. In fact, the number of registered clinical trials (for all treatments, globally) increased 59-fold between 2000 and 2021, according to the World Health Organization. In the U.S., the number of registered studies grew 289-fold in that same period.
Instead, one of the biggest hurdles to clinical trial enrollment is data. Again, the issue isn’t scarcity but the opposite – mountains of records.
“A patient’s performance status, location of the cancer, blood counts, critical organ function such as heart, liver and kidneys, and numerous other criteria must all be carefully assessed for every potential patient, in every clinical study,” says Dr. Leidner, who is running a clinical trial of gene-modified TCR-transduced T-cell adoptive therapy targeting KRAS neoantigen. This is a kind of immunotherapy in which a patient’s own T-cells are engineered to specifically recognize and eradicate cancer cells with mutations in the KRAS gene associated with cancer.
As if the medical puzzle pieces weren’t challenging enough to fit together, basic logistical information may also be missing from patient records. “It may be surprising, but identifying a patient’s oncologist or other specialists that have been involved isn’t necessarily straightforward. This kind of information should be readily organized in the patient record, but in reality it’s often fragmented or absent,” Dr. Leidner says.
Where AI shines
The challenge isn’t just multiple varieties of information for each patient, but also fragmentation in EMR formats from one clinic or healthcare system to another. AI, however, can summarize this information quickly. Most importantly, it doesn’t require information to be formatted – it can vacuum up lab results, doctors’ notes and digitized scans as they are. It also can figure out that two different terms refer to the same thing, because it can work with natural language.
“AI is very useful in going through the databases of research trials, gathering multiple trial eligibility criteria and matching that to each individual patient by culling that information from the digital medical record,” Dr. Leidner says. “As a clinician, there simply aren’t enough hours in the day to sift through the trials matching process and still see patients.”
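The matching step Dr. Leidner describes can be pictured, in very simplified form, as checking each patient's structured record against each trial's eligibility criteria. The sketch below is only an illustration under stated assumptions: the `Patient` and `Trial` classes, their fields, the trial IDs and all values are hypothetical, and a real system would first have to extract these fields from unstructured records and handle far more criteria.

```python
from dataclasses import dataclass, field


@dataclass
class Patient:
    """Hypothetical structured summary of a patient record."""
    patient_id: str
    biomarkers: set[str]      # e.g. gene mutations found by tumor testing
    hla_types: set[str]       # HLA typing results
    performance_status: int   # 0 = fully active; higher = more impaired


@dataclass
class Trial:
    """Hypothetical, heavily simplified eligibility criteria for one trial."""
    trial_id: str
    required_biomarkers: set[str] = field(default_factory=set)
    accepted_hla: set[str] = field(default_factory=set)  # empty = no HLA restriction
    max_performance_status: int = 2


def is_eligible(patient: Patient, trial: Trial) -> bool:
    # Every required biomarker must be present in the patient's record.
    has_biomarkers = trial.required_biomarkers <= patient.biomarkers
    # If the trial restricts HLA type, the patient needs at least one accepted type.
    has_hla = not trial.accepted_hla or bool(trial.accepted_hla & patient.hla_types)
    # The patient must be at least as fit as the trial demands.
    fit_enough = patient.performance_status <= trial.max_performance_status
    return has_biomarkers and has_hla and fit_enough


def match_trials(patient: Patient, trials: list[Trial]) -> list[str]:
    """Return the IDs of all trials whose criteria the patient meets."""
    return [t.trial_id for t in trials if is_eligible(patient, t)]


# Illustrative values only; not taken from any real patient or trial.
patient = Patient("P-001", {"KRAS G12D"}, {"HLA-C*08:02"}, performance_status=1)
trials = [
    Trial("TRIAL-A", {"KRAS G12D"}, {"HLA-C*08:02"}),
    Trial("TRIAL-B", {"EGFR L858R"}),
    Trial("TRIAL-C", max_performance_status=0),
]

print(match_trials(patient, trials))  # ['TRIAL-A']
```

Here the patient qualifies only for the first trial: the second requires a mutation the patient does not have, and the third demands a better performance status. The rule check itself is trivial; the hard part, and the part the article assigns to AI, is producing reliable structured fields from fragmented records in the first place.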
One thing that will become less important in clinical research is the site of origin of a patient’s cancer or the morphologic categorization – “this is quite a difficult thing for even the medical profession to grasp,” Dr. Leidner says. In some advanced clinical trials, “it’s more a question of ‘do they have the right immune system and the right gene mutation rather than which type of cancer?’”
Genetic testing of cancers is now routine, but HLA typing (for human leukocyte antigen, a set of genes that regulate the immune system), while routine for organ transplants, isn’t yet common in oncology. Providence has made HLA typing standard for cancer patients to enable personalized medicine and to be able to quickly find trials that offer a potential match.
Personalized medicine in oncology “is based on the presence or absence of genomic alterations. Each therapy is specific for those alterations, and those can be very rare,” says Dr. Carlo Bifulco, chief medical officer of Providence Genomics, a division of Providence that is using AI to transform health care.
Think of all these pieces as pixels in a photograph. In a low-resolution image, it might be possible to guess what kind of bird is in a photo, but with high resolution, it’s easier to recognize the species thanks to specific details.
Treating cancer by the organ where it occurs, such as lung cancer, is the low-resolution view. By increasing the resolution, it becomes clear that one patient’s cancer is driven by a set of genetic aberrancies and another patient’s is driven by a different set – they are different even though they are both lung cancer.
Biomarkers are not the end of the story. Other attributes, such as overall health, tolerance to cancer drugs, age and a patient’s other health problems further sharpen the resolution. Such a high-resolution, holistic representation is called a “patient embedding.” Finding enough patients with similar patient embeddings for a new treatment undergoing a clinical trial requires starting from a huge pool of patients.
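One common way to compare embeddings of this kind is to treat each patient as a vector of numbers and measure how closely two vectors point in the same direction (cosine similarity). The sketch below assumes that approach purely for illustration; the article does not say how Providence and Microsoft compare embeddings, and the three-number vectors and patient IDs here are invented, whereas real embeddings would be produced by a trained model and have many more dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def most_similar(query: list[float], cohort: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the IDs of the k cohort patients most similar to the query."""
    ranked = sorted(
        cohort.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [patient_id for patient_id, _ in ranked[:k]]


# Invented toy embeddings (hypothetical values, three dimensions only).
cohort = {
    "P1": [0.9, 0.1, 0.3],
    "P2": [0.1, 0.9, 0.2],
    "P3": [0.8, 0.2, 0.4],
}
query = [1.0, 0.0, 0.3]

print(most_similar(query, cohort, k=2))  # ['P1', 'P3']
```

Sorting the whole cohort is fine for a toy example, but at the scale of millions of patients a system would typically rely on an approximate nearest-neighbor index instead.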
The National Institutes of Health maintains a voluntary database of clinical trials, “but it only has a rudimentary search interface,” Poon says. In the U.S. alone, there are two million new cancer patients every year. Meanwhile, at any given moment, there are thousands of active trials.
“Today’s manual process is hopelessly non-scalable,” he adds. “Our dream is to structure all medical information and create a high-fidelity patient embedding to automatically match against trials continuously, thus enabling just-in-time clinical trial matching and democratizing this very important source of high-quality health care. As a research team, it has been exciting to work with Providence. Developing prototype AI tools using the principles of responsible AI like fairness, privacy and security, and reliability and safety is important when we look to improve patient outcomes in the future.”
With the latest advances in generative AI and the promising initial proof point at Providence, one can already imagine creating a population-scale dashboard for clinical researchers to find potential trial candidates in real time.
While AI untangles the logistics of clinical trial matching, it could have an even bigger role in medical discovery. “We are looking for ever more and new ways to understand the biology of cancer, and through that, to discover ways we can eradicate cancer,” Dr. Leidner says.
The goal is to develop computer models that can take the enormous amounts of data generated in clinical trials and real-world data to spot a trend and then prove that the therapy caused the trend. “Today, from one biopsy, we’re getting gigabytes of data at the cellular and molecular level,” Dr. Leidner says. “There can be tens of thousands of variables from every patient visit on a clinical trial.
“These datasets are simply beyond human capacity to analyze. Given the scale, it would conceivably require scores of Ph.D.’s, working for years, to complete the analysis of one clinical trial.”
The promise of multimodal machine learning
Providence and Microsoft are working together on multimodal machine learning, trained on diverse data generated and managed by Providence – text, images or genomics and, in the future, spatial biology, proteomics (the study of proteins in our bodies), transcriptomics (the study of the body’s RNA) and epigenomics (the study of the regulatory superstructure of the genome).
“We put together very complex data sources, which may be images, genomic datasets or just text, and there are gigantic data streams which can benefit from this approach,” Dr. Bifulco says. “The technique is matching solutions already. We have it already facing oncologists, research nurses and pathologists and we use it every day.”
The progress is remarkable on several levels. The diagnostic machines that are generating such huge amounts of data didn’t exist a couple of years ago. AI has also improved in that time – the joint Providence and Microsoft research team is exploring the cutting edge of foundation models and going further to bridge the competency gap in precision health. Oncology may be the tip of the spear, but exponentially expanding data streams, converging at every visit to the doctor’s office, will eventually impact all areas of medicine.
The AI prototypes for Providence were trained on such holistic, multimodal patient data. Microsoft helped Providence process legacy radiology images – more than two million studies with 600 million images. All computation was conducted within Providence’s private tenant and approved by the Providence Institutional Review Board (IRB), adhering to appropriate standards of privacy and compliance.¹
Microsoft also helped Providence digitize all cancer pathology slides (100,000-plus whole-slide images) as ultra-high-resolution images to become another research AI training set. The joint team has since made great strides in pretraining powerful biomedical large multimodal models (LMMs) from such large-scale, multimodal, real-world data.
“The resulting multimodal patient embedding can serve as a digital twin for the patient and enable patient-like-me reasoning at scale,” Poon says. “Such population-level real-world evidence can improve patient care by identifying what works and accelerate biomedical discovery by pinpointing where and how it doesn’t work.”
Many pieces of the puzzle, or pixels that would increase resolution of the picture, are still missing – or they exist as data that isn’t being analyzed.
“All the other things that we are still not capturing are influencing and impacting the outcomes of the patient,” Dr. Bifulco says. “Currently in the experimental setting, you’re limited by the computational aspects. AI can play a major role.”
¹ Providence IRB protocols #2019000204 and #2019000206.
Top photo: More complex medical tests are generating enormous amounts of data. AI is helping analyze that data much faster – and time is of the essence for patients with cancer. (Photo by sinology/Getty Images)