Englander Institute for Precision Medicine

CAMP: a modular metagenomics analysis system for integrated multistep data exploration.

Title: CAMP: a modular metagenomics analysis system for integrated multistep data exploration.
Publication Type: Journal Article
Year of Publication: 2026
Authors: Mak L, Tierney B, Wei W, Ronkowski C, Toscan RB, Turhan B, Toomey M, Andrade-Martinez JS, Fu C, Lucaci AG, Solano AHB, Setubal JC, Henriksen JR, Zimmerman S, Kopbayeva M, Noyvert A, Iwan Z, Kar S, Nakazawa N, Meleshko D, Horyslavets D, Kantsypa V, Frolova A, Kahles A, Danko D, Elhaik E, Labaj P, Mangul S, Mason CE, Hajirasouliha I
Corporate Authors: International MetaSUB Consortium
Journal: NAR Genom Bioinform
Volume: 8
Issue: 1
Pagination: lqaf172
Date Published: 2026 Mar
ISSN: 2631-9268
Keywords: Computational Biology, Metagenomics, Microbiota, Software, Workflow
Abstract

Computational analysis of large-scale metagenomics sequencing datasets provides valuable isolate-level taxonomic and functional insights from complex microbial communities. However, the ever-expanding ecosystem of metagenomics-specific methods and file formats makes designing scalable workflows and seamlessly exploring output data increasingly challenging. Although one-click bioinformatics pipelines can help organize these tools into workflows, they face compatibility and maintainability challenges that can prevent replication. To address the gap in easily extensible yet robustly distributable metagenomics workflows, we have developed the Core Analysis Modular Pipeline (CAMP), a module-based metagenomics analysis system written in Snakemake, with a standardized module and directory architecture. Each module can run independently or in sequence to produce target data formats (e.g. short-read preprocessing alone or followed by de novo assembly), and provides output summary statistics reports and Jupyter notebook-based visualizations. We applied CAMP to a set of 10 metagenomics samples, demonstrating how a modular analysis system with built-in data visualization facilitates rich seamless communication between outputs from different analytical purposes. The CAMP ecosystem (module template and analysis modules) can be found at https://github.com/Meta-CAMP.

DOI: 10.1093/nargab/lqaf172
Alternate Journal: NAR Genom Bioinform
PubMed ID: 41551931
PubMed Central ID: PMC12809600
Grant List:
R01 AI151059 / AI / NIAID NIH HHS / United States
R35 GM138152 / GM / NIGMS NIH HHS / United States
T32 GM083937 / GM / NIGMS NIH HHS / United States
U54 AG089334 / AG / NIA NIH HHS / United States
