This is an excerpt of a story that appeared on GenomeWeb.

Investigators in the Weill Cornell Medicine-New York Genome Center (WCM-NYGC) collaboration for Functional and Clinical Interpretation of Tumor Profiles recently received just shy of $490,000 from the National Cancer Institute to help further data analysis for the Cancer Genome Atlas project.

Under the grant, the WCM-NYGC collaborators will work on coding mutations in clinical contexts, including their relevance to immunotherapies. They’ll also explore the role of driver non-coding mutations in transcriptional regulation, as well as the driving role of structural variations. The collaboration is one of 11 specialized genomic data centers that will be responsible for analyzing genomic, epigenomic, transcriptomic, and other kinds of data for the next phase of the Cancer Genome Atlas.

Investigators at the institutions submitted an application for the center last year in response to an NCI funding opportunity that called for applications to establish up to 14 specialized genomic data centers. The NCI ultimately approved 11 applications to implement computational tools and pipelines for processing, integrating, and visualizing genomic data. Teams were asked to focus on at least one of the following areas: coding mutations, non-coding mutations, expression/mRNA analysis, copy number analysis, miRNA analysis, long non-coding RNA analysis, batch effects, methylation analysis, pathway analysis, and protein expression analysis.

Olivier Elemento, associate director of WCM’s Institute for Computational Biomedicine and one of three co-principal investigators on the project, told GenomeWeb that the two institutions chose to submit a joint application in response to the funding opportunity announcement (FOA) because both saw an opportunity to bring their experience in clinical variant interpretation and reporting, as well as their computational infrastructure, to bear on a large number of samples across many different tumor types. Researchers from both institutions have collaborated on several projects in the past and have published a number of papers together, including one last year in JAMA Oncology that described an assessment of treatment response biomarkers for a range of metastatic cancers.

The sheer size of the data that will likely come out of this phase of the TCGA also offered a compelling reason for collaboration. Both centers have sizable in-house computational infrastructure, but each center’s resources alone may not be sufficient to handle what will likely be petabytes of data generated by this phase of the project. It’s not clear yet exactly how much data there will be, but Elemento expects that it will be substantially more than was generated in the previous iteration of the project. “We will need a tremendous amount of resources to be able to do the analysis that we propose to do in the grant,” he said. “Working with the NYGC will make us be able to cope with the data analysis challenges that the TCGA is going to have.”

The collaborators will make use of the Precision Medicine Knowledgebase (PMKB), a database of clinical-grade tumor mutations, annotations, and interpretations gleaned from patient samples that was developed in Elemento’s laboratory. The database supports Weill Cornell’s Exome Cancer Test (EXaCT-1), which is used to detect point mutations, insertions and deletions, and copy number variations in patient samples. The investigators plan to use existing information within the PMKB to annotate variants identified in the TCGA samples, and they also plan to develop a new module for the resource through which they intend to crowdsource variant curations.

The way this will work, Elemento explained, is that he and his colleagues will upload variants identified from the TCGA samples to the PMKB and then reach out to experts in the biomedical community and ask them to submit clinical-grade interpretations based on peer-reviewed literature. A pre-selected team of board-certified pathologists will evaluate the submissions and modify or approve them as they see fit.

“[We realize] that interpretation of mutations in the clinical context is very labor intensive and hard for a single site to do,” he said. “So we’ll do this across at least [our] two sites but we’ll also make it possible for other sites to contribute interpretations.” They will also pull in additional data from other repositories, such as mutational frequency data, to provide stronger support for clinical interpretations as needed, he added.


Other Cornell investigators include Mark Rubin, M.D., Marcin Imielinski, M.D., Ph.D., and Ekta Khurana, Ph.D.