| Title | Variational autoencoders learn transferrable representations of metabolomics data. |
| Publication Type | Journal Article |
| Year of Publication | 2022 |
| Authors | Gomari DP, Schweickart A, Cerchietti L, Paietta E, Fernandez H, Al-Amin H, Suhre K, Krumsiek J |
| Journal | Commun Biol |
| Volume | 5 |
| Issue | 1 |
| Pagination | 645 |
| Date Published | 2022 Jun 30 |
| ISSN | 2399-3642 |
| Keywords | Diabetes Mellitus, Type 2; Humans; Metabolomics; Principal Component Analysis |
| Abstract | Dimensionality reduction approaches are commonly used for the deconvolution of high-dimensional metabolomics datasets into underlying core metabolic processes. However, current state-of-the-art methods are widely incapable of detecting nonlinearities in metabolomics data. Variational Autoencoders (VAEs) are a deep learning method designed to learn nonlinear latent representations which generalize to unseen data. Here, we trained a VAE on a large-scale metabolomics population cohort of human blood samples consisting of over 4500 individuals. We analyzed the pathway composition of the latent space using a global feature importance score, which demonstrated that latent dimensions represent distinct cellular processes. To demonstrate model generalizability, we generated latent representations of unseen metabolomics datasets on type 2 diabetes, acute myeloid leukemia, and schizophrenia and found significant correlations with clinical patient groups. Notably, the VAE representations showed stronger effects than latent dimensions derived by linear and non-linear principal component analysis. Taken together, we demonstrate that the VAE is a powerful method that learns biologically meaningful, nonlinear, and transferrable latent representations of metabolomics data. |
| DOI | 10.1038/s42003-022-03579-3 |
| Alternate Journal | Commun Biol |
| PubMed ID | 35773471 |
| PubMed Central ID | PMC9246987 |
| Grant List | / WT_ / Wellcome Trust / United Kingdom; U19 AG063744 / AG / NIA NIH HHS / United States; U24 CA196172 / CA / NCI NIH HHS / United States; / DH_ / Department of Health / United Kingdom; UG1 CA189859 / CA / NCI NIH HHS / United States; / MRC_ / Medical Research Council / United Kingdom; U10 CA180820 / CA / NCI NIH HHS / United States |
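
The abstract describes a VAE as a model that learns nonlinear latent representations. As a rough illustration of the generic VAE objective, and not the authors' implementation, the sketch below shows the two pieces that distinguish a VAE from a plain autoencoder: the reparameterized latent sample and the KL penalty toward a standard normal prior. The function names, the squared-error reconstruction term, and the `beta` weight are assumptions made for this example; the encoder and decoder networks that would produce `mu`, `log_var`, and `x_hat` are omitted.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), one draw per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


def vae_loss(x, x_hat, mu, log_var, beta=1.0):
    """Squared-error reconstruction term plus a beta-weighted KL term.

    `beta` is a hypothetical weighting knob for this sketch, not a value
    taken from the paper.
    """
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return recon + beta * kl_to_standard_normal(mu, log_var)
```

In a setup like this, the latent representation of an unseen sample would typically be taken as the encoder mean `mu`, which is what makes the learned mapping reusable on new datasets.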