Title | Loss-of-function tolerance of enhancers in the human genome. |
Publication Type | Journal Article |
Year of Publication | 2020 |
Authors | Xu D, Gokcumen O, Khurana E |
Journal | PLoS Genet |
Volume | 16 |
Issue | 4 |
Pagination | e1008663 |
Date Published | 2020 Apr |
ISSN | 1553-7404 |
Keywords | Conserved Sequence; Disease; Enhancer Elements, Genetic; Gene Expression Regulation; Genetic Predisposition to Disease; Genome, Human; Humans; Loss of Function Mutation; Organ Specificity; Reproducibility of Results; ROC Curve; Supervised Machine Learning |
Abstract | Previous studies have surveyed the potential impact of loss-of-function (LoF) variants and identified LoF-tolerant protein-coding genes. However, the tolerance of human genomes to losing enhancers has not yet been evaluated. Here we present the catalog of LoF-tolerant enhancers using structural variants from whole-genome sequences. Using a conservative approach, we estimate that individual human genomes possess at least 28 LoF-tolerant enhancers on average. We assessed the properties of LoF-tolerant enhancers in a unified regulatory network constructed by integrating tissue-specific enhancers and gene-gene interactions. We find that LoF-tolerant enhancers tend to be more tissue-specific and regulate fewer and more dispensable genes relative to other enhancers. They are enriched in immune-related cells while enhancers with low LoF-tolerance are enriched in kidney and brain/neuronal stem cells. We developed a supervised learning approach to predict the LoF-tolerance of all enhancers, which achieved an area under the receiver operating characteristics curve (AUROC) of 98%. We predict 3,519 more enhancers would be likely tolerant to LoF and 129 enhancers that would have low LoF-tolerance. Our predictions are supported by a known set of disease enhancers and novel deletions from PacBio sequencing. The LoF-tolerance scores provided here will serve as an important reference for disease studies. |
DOI | 10.1371/journal.pgen.1008663 |
Alternate Journal | PLoS Genet |
PubMed ID | 32243438 |
PubMed Central ID | PMC7159235 |
Grant List | R01 CA218668 / CA / NCI NIH HHS / United States |