Using privacy-preserving AI to combine genetic data across ancestries and diseases

Federated and transfer learning methods for cross-ancestry and cross-phenotype integration of genomic datasets

NIH-funded research · Harvard University D/b/a Harvard School of Public Health · NIH-11232300

This project will build privacy-preserving machine learning tools to make genetic risk predictions more accurate for people of non-European ancestry.

Quick facts

Grant type: R01 grant
Study type: NIH-funded research
Funding institution: Harvard University D/b/a Harvard School of Public Health (NIH-funded)
Lab location: 1 site (Boston, United States)
Project ID: NIH-11232300 on NIH RePORTER

What this research studies

Researchers are developing transfer learning and federated learning tools that let different biobanks share insights without moving private data. They plan to combine genetic results across ancestry groups, related health traits, and variant function information to improve polygenic risk scores. The work uses large datasets such as the All of Us program and other biobanks while protecting individual privacy. The goal is to reduce bias and make genetic risk information more useful for people of African and other under-represented ancestries.
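
To make the idea of "sharing insights without moving private data" concrete, here is a minimal sketch of one common federated technique, federated averaging of gradients. It is an illustration only and is not the project's actual method. All names (`local_gradient`, `federated_fit`, `risk_score`) and the toy numbers are hypothetical. Each simulated site computes an update on its own records and sends back only that update and a sample count. A real system would typically add protections such as secure aggregation or differential privacy, which are not shown here.

```python
from typing import List, Tuple

# One site's private data: (genotype rows, trait values). Hypothetical toy format.
Site = Tuple[List[List[float]], List[float]]


def local_gradient(weights: List[float], rows: List[List[float]],
                   trait: List[float]) -> Tuple[List[float], int]:
    """Runs inside one site: mean-squared-error gradient for a linear model.

    Only the gradient and the sample count are returned; raw rows stay local.
    """
    n = len(rows)
    grad = [0.0] * len(weights)
    for x, y in zip(rows, trait):
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for j, xi in enumerate(x):
            grad[j] += 2.0 * err * xi / n
    return grad, n


def federated_fit(sites: List[Site], n_features: int,
                  rounds: int = 300, lr: float = 0.05) -> List[float]:
    """Coordinator loop: average site gradients, weighted by sample count."""
    weights = [0.0] * n_features
    for _ in range(rounds):
        updates = [local_gradient(weights, rows, trait) for rows, trait in sites]
        total = sum(n for _, n in updates)
        for j in range(n_features):
            avg = sum(g[j] * n for g, n in updates) / total
            weights[j] -= lr * avg
    return weights


def risk_score(weights: List[float], genotype: List[float]) -> float:
    """A polygenic-style score: weighted sum of variant dosages (0, 1, or 2)."""
    return sum(w * g for w, g in zip(weights, genotype))


# Toy example: two sites, two variants, made-up noiseless trait values.
sites: List[Site] = [
    ([[0, 1], [1, 0], [2, 1], [1, 2]], [-0.2, 0.5, 0.8, 0.1]),
    ([[2, 0], [0, 2], [1, 1]], [1.0, -0.4, 0.3]),
]
weights = federated_fit(sites, n_features=2)
print(weights, risk_score(weights, [2, 1]))
```

In this sketch the coordinator ends up with the same variant weights it would get from pooling all records, even though no individual-level data leaves either site. Transfer learning across ancestry groups, which the project also describes, is a separate step that this example does not cover.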

Who could benefit from this research

Good fit: The primary focus is people of non-European ancestry (for example, African descent), especially those who have or are at risk for cardiovascular disease and are willing to share genetic and health data through a biobank.

Not a fit: Individuals without available genetic data or those who cannot or will not share their data are unlikely to see direct benefit from this project in the short term.

Why it matters

Potential benefit: If successful, this could make polygenic risk scores more accurate for people from under-represented ancestries and help improve cardiovascular risk prediction for those groups.

How similar studies have performed: Previous studies show that polygenic scores often work much better in European-ancestry groups than in others. Some transfer-learning approaches have shown promise, but federated, privacy-preserving integration across multiple biobanks is still a relatively new and developing approach.

Where this research is happening

Boston, United States

About this research

  1. This is an active NIH-funded research project — typically early-stage science, not a clinical trial accepting patient enrollment.
  2. Some NIH-funded labs run parallel clinical studies or seek volunteers for related work. To check, contact the principal investigator or institution listed above.
  3. For full project details, budget, and progress reports, visit the official NIH RePORTER page below.

Conditions: Cardiovascular Diseases

Last reviewed 2026-06-13 by the Find a Trial editorial team. Information on this page is for educational purposes and is not medical advice. Always consult qualified healthcare professionals about clinical trial participation.