An atlas of genetic scores to predict multi-omic traits.
Xu Y., Ritchie SC., Liang Y., Timmers PRHJ., Pietzner M., Lannelongue L., Lambert SA., Tahir UA., May-Wilson S., Foguet C., Johansson Å., Surendran P., Nath AP., Persyn E., Peters JE., Oliver-Williams C., Deng S., Prins B., Luan J., Bomba L., Soranzo N., Di Angelantonio E., Pirastu N., Tai ES., van Dam RM., Parkinson H., Davenport EE., Paul DS., Yau C., Gerszten RE., Mälarstig A., Danesh J., Sim X., Langenberg C., Wilson JF., Butterworth AS., Inouye M.
The use of omic modalities to dissect the molecular underpinnings of common diseases and traits is becoming increasingly widespread. However, multi-omic traits can be genetically predicted, which enables highly cost-effective and powerful analyses for studies that do not have multi-omic data [1]. Here we examine a large cohort (the INTERVAL study [2]; n = 50,000 participants) with extensive multi-omic data for plasma proteomics (SomaScan, n = 3,175; Olink, n = 4,822), plasma metabolomics (Metabolon HD4, n = 8,153), serum metabolomics (Nightingale, n = 37,359) and whole-blood Illumina RNA sequencing (n = 4,136), and use machine learning to train genetic scores for 17,227 molecular traits, including 10,521 that reach Bonferroni-adjusted significance. We evaluate the performance of genetic scores through external validation across cohorts of individuals of European, Asian and African American ancestries. In addition, we show the utility of these multi-omic genetic scores by quantifying the genetic control of biological pathways and by generating a synthetic multi-omic dataset of the UK Biobank [3] to identify disease associations using a phenome-wide scan. We highlight a series of biological insights with regard to genetic mechanisms in metabolism and canonical pathway associations with disease; for example, JAK-STAT signalling and coronary atherosclerosis. Finally, we develop a portal (https://www.omicspred.org/) to facilitate public access to all genetic scores and validation results, as well as to serve as a platform for future extensions and enhancements of multi-omic genetic scores.
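
As a rough illustration of the two quantities the abstract relies on, the sketch below shows how a genetic score is commonly applied to one individual (a weighted sum of effect-allele dosages) and how a Bonferroni-adjusted threshold is computed. It is not the authors' pipeline: the function names, weights and dosages are hypothetical, and the 0.05 significance level and the use of 17,227 as the number of tests are assumptions made only for the example.

```python
# Illustrative sketch only. All names and values here are hypothetical;
# they are not taken from the study or from the OmicsPred portal.

def genetic_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0 to 2) for one individual."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Made-up per-variant weights for a single molecular trait.
weights = [0.5, -0.25, 1.0]
# Made-up effect-allele dosages for one individual at the same variants.
dosages = [2, 1, 0]

score = genetic_score(dosages, weights)

# Assumed: alpha = 0.05, with one test per trained trait (17,227 traits).
threshold = bonferroni_threshold(0.05, 17227)

print(f"genetic score: {score}")
print(f"Bonferroni threshold: {threshold:.3e}")
```

In practice the weights would come from a trained model for each trait and the dosages from genotype data, so a study without multi-omic measurements could compute predicted trait levels from genotypes alone.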