Differential Abundance
EzBioCloud© 2024. All Rights Reserved
Let’s run a Differential Abundance analysis of the profiles in this dataset from the Parkinson’s disease study.
Differential abundance analysis (DAA) finds candidate biomarkers for a group of interest. It addresses questions such as ‘Which microbes are more abundant in the IBD group than in the healthy group?’. DAA tools identify taxa enriched in either group, and candidate biomarkers are selected based on (1) the fold change between groups and (2) the p-value. However, results can vary depending on the DAA tool used. Therefore, seven DAA tools (ANCOM-BC, ALDEx2, DESeq2, LEfSe, MaAsLin2, SIAMCAT and LinDA) are implemented, and the results of the tools are compared to find consensus biomarkers.
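To make the two selection criteria concrete, here is a minimal sketch that computes a log2 fold change and a p-value for a single taxon. It is not how any of the seven tools works internally: the permutation test, the pseudocount, the function names and the abundance values are all assumptions chosen for illustration.

```python
import math
import random
from statistics import mean


def log2_fold_change(case, control, pseudocount=1e-6):
    """log2 ratio of mean abundance, case over control (pseudocount avoids log of 0)."""
    return math.log2((mean(case) + pseudocount) / (mean(control) + pseudocount))


def permutation_pvalue(case, control, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(case) - mean(control))
    pooled = list(case) + list(control)
    n_case = len(case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_case]) - mean(pooled[n_case:]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical relative abundances of one taxon (made-up numbers).
case_abund = [0.12, 0.15, 0.10, 0.18, 0.14]
control_abund = [0.03, 0.05, 0.04, 0.02, 0.06]

lfc = log2_fold_change(case_abund, control_abund)
pval = permutation_pvalue(case_abund, control_abund)
print(f"log2 fold change = {lfc:.2f}, p = {pval:.4f}")
```

A positive log2 fold change here means the taxon is more abundant in the case group. Dedicated DAA tools additionally handle compositionality, sparsity and covariates, which this sketch ignores.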
Input: taxonomic profiles and metadata with group information. The engine currently supports two-group comparisons only (e.g., case vs. control).
ANCOM-BC/, ALDEx2/, DESeq2/, LEfSe/, MaAsLin2/, SIAMCAT/, LinDA/: candidate biomarkers from each tool. *_raw.csv is the tool's original, unprocessed output, so its columns differ between tools. *.csv is the processed version of *_raw.csv, with columns matched across tools.
Columns in the *.csv:
effect: effect size (for example, fold change values between the case and the control groups)
se: standard error of the effect size
pval: p-value
qval: q-value (the p-value after multiple-testing correction; this is the value usually used to select significant features)
direct: ‘+’ means enriched in the case group; ‘-’ means enriched in the control group.
sig: ‘1’ means the feature is significant at the q-value threshold (e.g., 0.05); ‘0’ means not significant.
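score: sig × 1 if direct is ‘+’, sig × (-1) if direct is ‘-’. A score of 1 is a significant case-enriched feature, -1 is a significant control-enriched feature, and 0 is not significant.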
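The relationship between these columns can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the platform's code: it assumes Benjamini-Hochberg as the multiple-testing correction (individual tools may use a different method), a positive effect meaning case-enriched, and a strict q < 0.05 cut-off. The taxon names and numbers are made up.

```python
def bh_qvalues(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    qvals = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        qvals[i] = running_min
    return qvals


def annotate(rows, alpha=0.05):
    """Add qval, direct, sig and score to rows that carry effect and pval."""
    qvals = bh_qvalues([r["pval"] for r in rows])
    for r, q in zip(rows, qvals):
        r["qval"] = q
        # Assumption: positive effect = enriched in the case group.
        # An effect of exactly 0 falls into '-' here, an arbitrary choice.
        r["direct"] = "+" if r["effect"] > 0 else "-"
        r["sig"] = 1 if q < alpha else 0
        r["score"] = r["sig"] * (1 if r["direct"] == "+" else -1)
    return rows


# Hypothetical results for four taxa.
rows = annotate([
    {"taxon": "Taxon_A", "effect": 1.8, "pval": 0.001},
    {"taxon": "Taxon_B", "effect": -1.2, "pval": 0.01},
    {"taxon": "Taxon_C", "effect": 0.3, "pval": 0.6},
    {"taxon": "Taxon_D", "effect": -0.1, "pval": 0.04},
])
for r in rows:
    print(r["taxon"], r["direct"], r["sig"], r["score"], round(r["qval"], 4))
```

Note that Taxon_D has a raw p-value below 0.05 but is no longer significant after correction, which is why the q-value, not the p-value, drives the sig column.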
Consensus/: a consensus table and a consensus plot (UpSet plot). ‘total_result.csv’ is the concatenation of the *.csv files from all tools. Features found significant by one or more DAA tools are collected from ‘total_result.csv’ into upsetPlot.csv, which is visualized as upsetPlot.png. In the plot, red means case-enriched and blue means control-enriched.
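The consensus step can be pictured as follows. This is a rough sketch, assuming each tool's result has been reduced to a per-taxon score (+1, -1 or 0); the actual layout of total_result.csv and upsetPlot.csv is not described here, so the data structures, the `build_consensus` helper and the example values are hypothetical.

```python
from collections import defaultdict

# Hypothetical per-tool scores:
# +1 case-enriched, -1 control-enriched, 0 not significant.
per_tool_scores = {
    "ALDEx2": {"Taxon_A": 1, "Taxon_B": 0, "Taxon_C": 0},
    "DESeq2": {"Taxon_A": 1, "Taxon_B": -1, "Taxon_C": 0},
    "LinDA": {"Taxon_A": 1, "Taxon_B": -1, "Taxon_C": 0},
}


def build_consensus(per_tool_scores):
    """List taxa flagged by at least one tool, with the tools that flagged them."""
    hits = defaultdict(dict)
    for tool, scores in per_tool_scores.items():
        for taxon, score in scores.items():
            if score != 0:
                hits[taxon][tool] = score

    consensus = []
    for taxon, tool_scores in hits.items():
        total = sum(tool_scores.values())
        if total > 0:
            direction = "case"
        elif total < 0:
            direction = "control"
        else:
            direction = "mixed"  # tools disagree on the direction
        consensus.append({
            "taxon": taxon,
            "tools": sorted(tool_scores),
            "n_tools": len(tool_scores),
            "direction": direction,
        })
    # Taxa supported by the most tools come first.
    consensus.sort(key=lambda r: (-r["n_tools"], r["taxon"]))
    return consensus


consensus = build_consensus(per_tool_scores)
for row in consensus:
    print(row["taxon"], row["n_tools"], row["direction"], ",".join(row["tools"]))
```

Taxa supported by more tools are stronger consensus candidates; a taxon that no tool flags, such as Taxon_C above, does not appear in the table at all.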