research | Korem Lab

Our research

The goal of our research is to create microbiome-based diagnostics, clinical practices, and therapeutics, through the development and application of novel data analysis methods.

A major challenge in studying the microbiome lies in our limited ability to characterize, sample, and manipulate it. In many cases, such as in the gut or lungs, it is difficult or impossible to sample these communities directly, forcing us to rely on imperfect proxies such as stool; the intricate interactions with the host limit the applicability of animal models (and in some cases, no good models exist); and most microbes are unculturable, preventing detailed characterization in vitro. We therefore focus on multi-omic data from observational studies, which allow us to pursue the development of diagnostics, and we combine these high-resolution data with new analysis methods that yield specific mechanistic hypotheses that can then be validated experimentally. We often conduct our own studies, particularly in the area of women's health.

Research funding

We are grateful to the American taxpayer and to the foundations and institutions that have supported our work.

Interested in supporting our work?

Private funding is critical for fostering innovative research that is often considered risky by traditional funders. If you are interested in supporting our work, please contact Dr. Korem directly (tal.korem@columbia.edu).

Research interests

Comparative metagenomics

Comparative genomics is a key strategy for identifying the genetic basis of biological mechanisms, using isolate sequencing data to link genomic variability in related strains to differences in phenotypes. We are developing frameworks and algorithms that use sequence graphs to scale this up to comparative metagenomics: studying genomic variation across multiple genomes and ecosystems.

Knowledge-driven metabolic analysis

We are adapting knowledge-driven metabolic modelling approaches to the complex setting of the microbiome in order to predict the profile of microbially generated and microbially modulated metabolites at a given interface with the host, identifying putative mechanistic pathways for host-microbiome interaction and potential intervention points.

Low-biomass microbiome analysis

Studying ecosystems with few microbes, such as the endometrium or tumor microenvironment, has distinct challenges, including contamination and batch effects. We are developing both methods and study design practices that make such analyses more robust and generalizable.

Women & reproductive health

We apply our research in diverse clinical settings, with a focus on adverse pregnancy outcomes and antibiotic resistance. We are searching for ways to use the microbiome in personalizing medical care, and are seeking putative mechanisms through which microbes affect disease that could be manipulated by therapeutics.

Recent and past projects

Bespoke modeling of contamination and bias facilitates improved microbiome-based prediction models

Contamination and bias in microbiome studies have been a cause of major controversies in the field, for example with respect to the placental and tumor microbiomes. SCRuB performs high-precision identification and removal of contamination (microbial material originating in irrelevant sources, such as reagents) using process controls. SCRuB addresses several pressing challenges in decontamination: it is the first method that handles cross-contamination ("leakage") of genuine biological material into controls; it performs partial decontamination of taxa that are both contaminants and genuinely present in samples; and it incorporates information across multiple samples and controls to improve performance, rather than operating separately on each sample (Austin et al., Nature Biotechnology 2023). DEBIAS-M models technical variability in microbiome data under a quasi-mechanistic model of taxon-specific multiplicative biases. It performs "batch correction", removing differences between studies or batches that are due to the variable efficiency of different experimental procedures. Importantly, it goes beyond simple correction to facilitate domain adaptation and the development of models that generalize across datasets without leakage of testing data (Austin et al., Nature Microbiology 2025).
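To make the idea of taxon-specific multiplicative bias concrete, here is a minimal sketch. It is not the DEBIAS-M implementation: the function names and the bias values are hypothetical, and the bias factors are simply given, whereas in practice they are unknown and have to be inferred from the data.

```python
def apply_bias(true_props, bias):
    """Distort relative abundances by per-taxon multiplicative factors,
    then renormalize so the observed profile sums to 1."""
    raw = [p * b for p, b in zip(true_props, bias)]
    total = sum(raw)
    return [r / total for r in raw]


def remove_bias(observed_props, bias):
    """Invert the distortion by dividing out each taxon's factor
    and renormalizing."""
    raw = [p / b for p, b in zip(observed_props, bias)]
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical three-taxon community and per-taxon measurement efficiencies
# for one batch (assumed known here purely for illustration).
true_props = [0.5, 0.3, 0.2]
batch_bias = [1.0, 4.0, 0.5]

observed = apply_bias(true_props, batch_bias)
recovered = remove_bias(observed, batch_bias)
```

In this toy example the second taxon looks dominant in the observed profile only because it is measured four times more efficiently; dividing out the factors restores the original proportions.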

SCRuB iteratively uses the shared information across samples and controls to estimate well-to-well leakage and the composition of the contamination source. It then uses estimates of the contamination sources to infer the underlying composition of the samples, and so on until convergence.
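The notion of partial decontamination can be illustrated with a much simpler toy calculation. This is not SCRuB: it assumes no well-to-well leakage, estimates the contaminant profile by pooling the controls, and takes the contamination fraction as known, whereas SCRuB estimates these quantities jointly and iteratively. All names and numbers are hypothetical.

```python
def pooled_profile(control_counts):
    """Estimate a contaminant profile by pooling reads across controls."""
    n_taxa = len(control_counts[0])
    totals = [sum(ctrl[t] for ctrl in control_counts) for t in range(n_taxa)]
    grand_total = sum(totals)
    return [t / grand_total for t in totals]


def partial_decontaminate(sample_counts, contaminant_profile, contamination_fraction):
    """Subtract the expected number of contaminant reads per taxon,
    never going below zero. Taxa that are both contaminants and genuinely
    present keep their excess reads."""
    total = sum(sample_counts)
    expected = [contamination_fraction * total * c for c in contaminant_profile]
    return [max(0.0, n - e) for n, e in zip(sample_counts, expected)]


# Two hypothetical process controls and one sample, three taxa each.
controls = [[0, 90, 10], [0, 80, 20]]
sample = [600, 300, 100]

profile = pooled_profile(controls)  # roughly [0.0, 0.85, 0.15]
cleaned = partial_decontaminate(sample, profile, contamination_fraction=0.2)
```

Here the second and third taxa appear in the controls, but they are only partially removed from the sample because the sample contains more of them than contamination alone would explain.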

Studying the interaction between the vaginal microbiome, metabolome, and adverse pregnancy outcomes

Despite many advances in diagnostics and precision medicine, most adverse pregnancy outcomes are still a "surprise" - they are typically detected late, when symptoms are overt; it is then often also too late to administer effective treatment. We are studying how the vaginal microbiome affects and predicts these outcomes.

We have recently conducted a large-scale analysis of the early-pregnancy vaginal microbiome and metabolome in a case-control study of preterm birth, the leading cause of neonatal morbidity and mortality. We found intriguing associations between several xenobiotics and subsequent preterm delivery, suggesting a role for exogenous exposures in its pathogenesis. We built supervised machine learning models able to predict subsequent prematurity, weeks-to-months in advance, with high accuracy, using measurements of the vaginal metabolome (Kindschuh et al., Nature Microbiology 2023). In another recent investigation, we found that strain diversity in the vaginal microbiome is also associated with preterm birth (Liao et al., Nature Communications 2023).

We have recently provided the first evidence that the deadly syndrome of preeclampsia, typically considered a late-pregnancy cardiovascular disease, is associated with the molecular milieu of the vaginal ecosystem as early as the first trimester. We showed that a combination of distinct microbial signatures, lower presence of host immune factors, and impaired metabolic health can predict preeclampsia over 6 months before diagnosis (Kindschuh et al., bioRxiv 2024).

We accurately predict the risk for subsequent preterm birth in held-out samples from metabolomics data collected early in pregnancy.

A data-driven approach reveals robust associations between serum metabolites and host, microbial, and environmental factors

The serum metabolome contains multiple biomarkers and causal agents that are important for a variety of human diseases. But even if these are identified, designing interventions that affect their levels remains a significant challenge. We took a data-driven approach to this problem, designing and contrasting machine-learning models that predict metabolite levels in held-out data using different feature types - microbiome, diet, clinical data, etc. Our results show a strong contribution by microbiome and diet, with distinct contributions for each - identifying starting points for hundreds of potential interventions (Bar* & Korem* et al., Nature 2020). Our lab is currently combining this data-driven approach with knowledge-driven mechanistic methods to identify potential mechanisms by which the microbiome produces different metabolites.

The relative power of each feature type in predicting the serum metabolome.
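The general idea of contrasting feature types by their held-out predictive power can be sketched as follows. This is a deliberately simplified illustration on synthetic data, not the models used in the study: each "feature type" is reduced to a single variable, the model is a one-variable least-squares line, and all names are hypothetical.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for a single predictor: returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(y_true, y_pred):
    """Coefficient of determination; can be negative on held-out data."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def held_out_r2(x, y, n_train):
    """Fit on the first n_train samples, score on the remaining ones."""
    slope, intercept = fit_line(x[:n_train], y[:n_train])
    predictions = [slope * a + intercept for a in x[n_train:]]
    return r_squared(y[n_train:], predictions)


# Synthetic data: the metabolite depends on one feature and not on the other.
rng = random.Random(0)
n_samples, n_train = 200, 150
informative = [rng.gauss(0, 1) for _ in range(n_samples)]
uninformative = [rng.gauss(0, 1) for _ in range(n_samples)]
metabolite = [2.0 * v + rng.gauss(0, 0.5) for v in informative]

scores = {
    "informative_feature": held_out_r2(informative, metabolite, n_train),
    "uninformative_feature": held_out_r2(uninformative, metabolite, n_train),
}
```

Scoring every model on samples it never saw during fitting is what makes the per-feature-type comparison meaningful; scoring on the training samples would flatter more flexible models.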

Systematic examination of microbial structural variations facilitates mechanistic investigations

The same microbial strain can be very different in different people, and a difference in very few genes can have significant phenotypic effects. We analyzed sequencing coverage to systematically detect genomic structural variations across multiple microbes in two large cohorts. We found that these duplicated or deleted regions exhibit multiple novel associations with several disease risk factors. Better yet, examining the genes encoded in these regions allows us to raise hypotheses regarding potential mechanisms underlying these host-microbe interactions (Zeevi* & Korem* et al., Nature 2019).

Structural variation in Anaerostipes hadrus whose deletion is associated with higher disease risk. Genes in this region code a composite inositol catabolism - butyrate production pathway, potentially supplying the microbe with additional energy while supplying the host with beneficial butyrate.
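As a rough illustration of how coverage can point to a deleted region, the sketch below flags runs of genomic bins whose read depth falls far below the genome-wide median in one sample. It is not the published method: the function name, the binning, and the 10% threshold are assumptions made for this example only.

```python
from statistics import median


def find_deleted_bins(coverage, threshold=0.1):
    """Return half-open (start, end) bin ranges whose depth is below
    threshold * median depth of the genome in this sample."""
    cutoff = threshold * median(coverage)
    regions = []
    start = None
    for i, depth in enumerate(coverage):
        if depth < cutoff and start is None:
            start = i  # a low-coverage run begins
        elif depth >= cutoff and start is not None:
            regions.append((start, i))  # the run ended at the previous bin
            start = None
    if start is not None:
        regions.append((start, len(coverage)))  # run extends to the last bin
    return regions


# Hypothetical per-bin read depth for one microbial genome in one sample.
coverage = [30, 32, 29, 31, 1, 0, 2, 30, 28, 33]
deleted = find_deleted_bins(coverage)
```

Running the same scan across many samples would show in which individuals the region is present or absent, which is the kind of variability that can then be tested for association with host phenotypes.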

Inferring dynamic microbial growth rates from a single, static snapshot

Most microbiome analysis approaches concentrate on composition - which and how many bacteria or genes are present in a sample. We were able to infer the growth dynamics of multiple bacteria from just a single sample, by extracting and examining the DNA copy-number signal produced by bacterial DNA replication. We showed that these growth rates can potentially serve as biomarkers for disease, antibiotic treatment, and pathogenic colonization (Korem et al., Science 2015; Joseph et al., Genome Research 2022).

Bacterial DNA replication generates a DNA copy number signal, which is discernible from coverage analysis (high coverage near the origin of replication and low coverage near the terminus).
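The copy-number signal described above can be summarized as a ratio between coverage at the peak (near the origin) and at the trough (near the terminus). The sketch below is a toy version under stated assumptions: a circular genome split into equal bins, a simulated coverage profile that decays log-linearly from origin to terminus, and simple moving-average smoothing. The function names and parameters are hypothetical, and real data would also require filtering and fitting steps that are omitted here.

```python
def circular_moving_average(values, window):
    """Smooth a circular signal with a centered moving average."""
    n = len(values)
    half = window // 2
    width = 2 * half + 1
    return [
        sum(values[(i + k) % n] for k in range(-half, half + 1)) / width
        for i in range(n)
    ]


def peak_to_trough_ratio(coverage, window=5):
    """Ratio of the highest to the lowest smoothed coverage along the genome."""
    smoothed = circular_moving_average(coverage, window)
    return max(smoothed) / min(smoothed)


def simulate_coverage(n_bins, origin, ratio, base_depth=20.0):
    """Toy coverage profile: depth falls log-linearly with circular
    distance from the origin, reaching base_depth at the terminus."""
    half = n_bins / 2
    profile = []
    for i in range(n_bins):
        distance = abs(i - origin)
        distance = min(distance, n_bins - distance)  # genome is circular
        profile.append(base_depth * ratio ** (1 - distance / half))
    return profile


growing = peak_to_trough_ratio(simulate_coverage(200, origin=50, ratio=2.0))
resting = peak_to_trough_ratio(simulate_coverage(200, origin=50, ratio=1.0))
```

A population in which many cells are replicating shows a ratio well above 1, while a non-replicating population shows flat coverage and a ratio of about 1. Smoothing makes the estimate slightly conservative relative to the simulated value.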

Microbiome analysis informs personally tailored diets

The way our blood sugar levels respond to food is an important risk factor for diabetes, obesity, and other metabolic diseases. We profiled a large cohort of 800 participants and showed that these sugar responses vary significantly between individuals (even when they consume identical meals), and that using meal, lifestyle, and microbiome data allows us to accurately predict these responses for each individual and for any complex meal. We demonstrated the practical use of this approach by showing normalization of blood sugar levels in a prospective dietary intervention trial (Zeevi* & Korem* et al., Cell 2015). We later showed that for some dietary decisions, such as between white and whole-wheat bread, examining just the microbiome is probably enough to make an accurate prediction (Korem et al., Cell Metab. 2017).

Postprandial glycemic responses (PPGRs; y-axis) are reduced in a personally tailored glucose-lowering diet (green) compared to a glucose-increasing diet (red).
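A postprandial glycemic response is commonly summarized as the incremental area under the glucose curve after a meal, relative to the pre-meal level. The sketch below assumes that summary and uses the trapezoidal rule; the function name, sampling times, and glucose values are hypothetical, and the treatment of dips below baseline (clipped to zero at each time point) is a simplification.

```python
def incremental_auc(times_min, glucose_mg_dl):
    """Area between the glucose curve and its pre-meal baseline,
    counting only the portion above baseline (trapezoidal rule)."""
    baseline = glucose_mg_dl[0]
    area = 0.0
    for i in range(1, len(times_min)):
        dt = times_min[i] - times_min[i - 1]
        left = max(0.0, glucose_mg_dl[i - 1] - baseline)
        right = max(0.0, glucose_mg_dl[i] - baseline)
        area += 0.5 * (left + right) * dt
    return area


# Hypothetical readings for one person after two different meals.
times = [0, 30, 60, 90, 120]
spiking_meal = [90, 140, 120, 100, 90]
gentle_meal = [90, 100, 95, 92, 90]

ppgr_spiking = incremental_auc(times, spiking_meal)  # 2700.0 mg/dL*min
ppgr_gentle = incremental_auc(times, gentle_meal)    # 510.0 mg/dL*min
```

Comparing this single number across meals for the same person is one simple way to rank meals as glucose-increasing or glucose-lowering for that individual.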