Machine-learning tools to better understand how genes and the environment affect health
Next-Generation Algorithms in Statistical Genetics Based on Modern Machine Learning
Researchers are building new computational methods that use very large genetic and health datasets to predict disease risk and find genetic causes of illness.
Quick facts
| Grant type | NIH-funded research |
|---|---|
| Study type | NIH-funded research |
| Institution | Cornell University (NIH-funded) |
| Lab location | 1 site (Ithaca, United States) |
| Project ID | NIH-11159437 on NIH RePORTER |
What this research studies
This project develops modern machine-learning algorithms to analyze millions of human genomes alongside clinical data and environmental information. It focuses on modeling genetic variation, reanalyzing genome-wide association study data, and predicting health risk from genes plus environment. The team will create open-source software for tasks like genetic imputation, haplotyping, identifying likely causal variants, and computing genetic risk scores. These tools are meant to help researchers and clinicians turn big genetic datasets into insights that could inform prevention and personalized care.
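As a rough illustration of one task named above, a genetic risk score in its simplest textbook form is a weighted sum: for each variant, the number of copies of the risk allele a person carries (0, 1, or 2) is multiplied by an effect weight, and the products are added up. The sketch below shows only that basic idea. It is not the project's software or method, and the function name, weights, and dosages are hypothetical values made up for illustration.

```python
from typing import Sequence


def genetic_risk_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Return the weighted sum of allele dosages.

    dosages: copies of the risk allele per variant (0, 1, or 2).
    weights: per-variant effect weights (hypothetical here; in practice
             these would be estimated from association study data).
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Made-up example: three variants for one person.
example_weights = [0.5, -0.25, 1.0]
example_dosages = [2, 1, 0]
example_score = genetic_risk_score(example_dosages, example_weights)
print(example_score)
```

A score like this is only meaningful relative to other people scored with the same variants and weights. The methods described in this project aim to go further by also accounting for clinical and environmental information.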
Who could benefit from this research
Good fit: People who can provide or share genetic data and medical records—especially those with well-documented diseases or family histories—would be the most relevant contributors to this work.
Not a fit: Patients without available genetic or clinical data, or whose conditions are driven mainly by non-genetic factors, are less likely to see direct benefits in the near term.
Why it matters
Potential benefit: If successful, the work could produce more accurate genetic risk predictions and clearer identification of disease-related variants that support more personalized prevention and treatment strategies.
How similar studies have performed: Related statistical-genetics and machine-learning efforts have improved risk prediction and variant discovery, but applying large-scale modern ML across millions of genomes and clinical records is relatively new and still being tested.
Where this research is happening
- Cornell University — Ithaca, United States (Active)
Researchers
- Principal investigator: Kuleshov, Volodymyr — Cornell University
- Study coordinator: Kuleshov, Volodymyr
About this research
- This is an active NIH-funded research project — typically early-stage science, not a clinical trial accepting patient enrollment.
- Some NIH-funded labs run parallel clinical studies or seek volunteers for related work. To check, contact the principal investigator or institution listed above.
- For full project details, budget, and progress reports, visit the project's official NIH RePORTER page.