Using privacy-preserving AI to combine genetic data across ancestries and diseases
Federated and transfer learning methods for cross-ancestry and cross-phenotype integration of genomic datasets
This project will build privacy-preserving machine learning tools to make genetic risk predictions more accurate for people of non-European ancestry.
Quick facts
| Field | Details |
|---|---|
| Grant type | R01 grant |
| Study type | NIH-funded research |
| Funding institution | Harvard University D/b/a Harvard School of Public Health |
| Lab location | 1 site (Boston, United States) |
| Project ID | NIH-11232300 on NIH RePORTER |
What this research studies
Researchers are developing transfer learning and federated learning tools that let different biobanks share insights without moving private data. They plan to combine genetic results across ancestry groups, related health traits, and variant function information to improve polygenic risk scores. The work uses large datasets such as the All of Us program and other biobanks while protecting individual privacy. The goal is to reduce bias and make genetic risk information more useful for people of African and other under-represented ancestries.
Who could benefit from this research
Good fit: The primary focus is people of non-European ancestry (for example, African descent), especially those who have or are at risk for cardiovascular disease and are willing to share genetic and health data through a biobank.
Not a fit: Individuals without available genetic data or those who cannot or will not share their data are unlikely to see direct benefit from this project in the short term.
Why it matters
Potential benefit: If successful, this could make polygenic risk scores more accurate for people from under-represented ancestries and help improve cardiovascular risk prediction for those groups.
How similar studies have performed: Previous studies show that polygenic scores often perform much better in European ancestry groups than in others. Some transfer-learning approaches have shown promise, but federated, privacy-preserving integration across multiple biobanks is still a relatively new approach.
Where this research is happening
Boston, United States
- Harvard University D/b/a Harvard School of Public Health — Boston, United States (Active)
Researchers
- Principal investigator: Duan, Rui — Harvard University D/b/a Harvard School of Public Health
- Study coordinator: Duan, Rui
About this research
- This is an active NIH-funded research project — typically early-stage science, not a clinical trial accepting patient enrollment.
- Some NIH-funded labs run parallel clinical studies or seek volunteers for related work. To check, contact the principal investigator or institution listed above.
- For full project details, budget, and progress reports, visit the official NIH RePORTER page below.