With the advent of next-generation sequencing technologies, conventional data analysis methods are becoming inadequate for highly complex and voluminous microbiome data. Furthermore, current population-scale microbiome studies usually deal with complex variations of the microbiome associated with large-scale host phenotypes or environment types.
To enable the discovery of underappreciated but significant host-associated patterns from comprehensive microbiome sequencing and sample phenotype metadata, novel and effective methods suited to exploring population-scale microbiome datasets are urgently needed.
The research team led by Dr. ZHOU Haokui and supervised by Prof. ZHAO Guoping at the Shenzhen Institutes of Advanced Technology (SIAT) of the Chinese Academy of Sciences developed tmap as an integrative framework for both pattern discovery and hypothesis generation for population-scale microbiome studies (Fig. 1). Their study was published in Genome Biology on Dec. 23.
Fig. 1. Overview of the tmap workflow for integrative microbiome data analysis. (Image by SIAT)
Fig. 2. Stratification of the FGFP (Flemish Gut Flora project) microbiomes with host covariates. (Image by SIAT)
In tmap, the researchers adopted the Mapper algorithm for topological data analysis to construct an informative and compact network representation of a high-dimensional microbiome dataset. tmap enabled them to identify associations of taxa or metadata within the network and to extract enrichment subnetworks of various association patterns.
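The general idea behind Mapper can be illustrated with a minimal sketch. This is not tmap's API or implementation; the function names (`mapper_graph`, `_threshold_clusters`) and parameters (`n_intervals`, `overlap`, `eps`) are hypothetical and chosen only for illustration, and a simple distance-threshold clustering stands in for whatever clustering a real pipeline would use. The sketch covers the range of a one-dimensional filter ("lens") with overlapping intervals, clusters the samples falling in each interval, and links clusters that share samples.

```python
import numpy as np


def _threshold_clusters(X, idx, eps):
    """Single-linkage clusters: connected components of the 'distance <= eps' graph."""
    unvisited = set(idx.tolist())
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        component, frontier = {seed}, [seed]
        while frontier:
            p = frontier.pop()
            near = [q for q in unvisited if np.linalg.norm(X[p] - X[q]) <= eps]
            for q in near:
                unvisited.remove(q)
                component.add(q)
                frontier.append(q)
        clusters.append(frozenset(component))
    return clusters


def mapper_graph(X, lens, n_intervals=5, overlap=0.5, eps=1.0):
    """Toy Mapper (illustrative only): returns (nodes, edges).

    nodes: list of frozensets of sample indices (one per local cluster)
    edges: set of (i, j) node-index pairs whose clusters share a sample
    """
    X = np.asarray(X, dtype=float)
    lens = np.asarray(lens, dtype=float)
    lo, hi = lens.min(), lens.max()
    # Interval length such that n_intervals overlapping intervals span [lo, hi].
    length = (hi - lo) / (n_intervals - (n_intervals - 1) * overlap)
    step = length * (1.0 - overlap)
    nodes = []
    for i in range(n_intervals):
        start = lo + i * step
        end = start + length + 1e-9  # small tolerance for float rounding
        idx = np.where((lens >= start) & (lens <= end))[0]
        nodes.extend(_threshold_clusters(X, idx, eps))
    # Two nodes are connected when their clusters have samples in common.
    edges = {
        (a, b)
        for a in range(len(nodes))
        for b in range(a + 1, len(nodes))
        if nodes[a] & nodes[b]
    }
    return nodes, edges
```

For example, calling `mapper_graph(X, X[:, 0], n_intervals=3, overlap=0.5, eps=1.5)` on ten evenly spaced points along a line yields a short chain of nodes, each summarizing a group of neighboring samples. In a microbiome setting the rows of `X` would be sample profiles and the lens would typically be a low-dimensional projection of them.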
They applied tmap to both synthetic microbiomes and real-world microbiome datasets from studies of human populations (such as the FGFP study, Fig. 2) and earth environmental samples. They also compared tmap with conventional methods to demonstrate its superiority and effectiveness.
"tmap is able to integrate large-scale microbiome data with complex host phenotype metadata to systematically characterize the interrelations between host covariates and microbiome taxa, based on more efficient stratification and association analyses," said Dr. ZHOU.
In addition, tmap was capable of capturing both linear and nonlinear associations, an advantage over conventional methods. Based on a network representation of microbiome profiles, tmap was able to extract complex host-microbiome association patterns, providing insights to inform further hypothesis-driven studies.
tmap is available as open-source software at: https://github.com/GPZ-Bioinfo/tmap.
Detailed tutorials and online documents are available at: https://tmap.readthedocs.io/en/latest/.