# Differential Abundance

Let’s run a **Differential Abundance** analysis of the profiles in this dataset from the Parkinson’s disease study.

## Select ‘Differential Abundance’

## Select your dataset

## Fill in details for future referencing

## Input the arguments according to your desired output

## Select ‘Run Analysis’

## Wait for the analysis to complete

## Discover Differential Analysis

Differential abundance analysis (DAA) finds candidate biomarkers for a group of interest. For example, it can address questions such as ‘Which microbes are more abundant in the IBD group compared to the healthy group?’. DAA tools find taxa enriched in either the IBD or the healthy group, and candidate biomarkers are selected based on (1) the fold change between groups and (2) the p-value.

However, results may vary depending on the DAA tool used. Therefore, seven DAA tools (ANCOM-BC, ALDEx2, DESeq2, LEfSe, MaAsLin2, SIAMCAT and LinDA) are implemented to find candidate biomarkers, and the results of each tool are compared to find consensus biomarkers.

#### Input

Taxonomic profiles and metadata with group information. This engine currently supports two groups only (e.g., case vs. control).

#### Output

ANCOM-BC/, ALDEx2/, DESeq2/, LEfSe/, MaAsLin2/, SIAMCAT/, LinDA/: candidate biomarkers for each tool.

* \*\_raw\.csv: the unprocessed output produced directly by the tool. Columns differ between tools.
* \*.csv: the processed output derived from \*\_raw\.csv. Columns are matched across tools.

Columns in the \*.csv:

* effect: effect size (for example, the fold change between the case and the control groups)
* se: standard error of the effect size
* pval: p-value
* qval: q-value (the p-value after multiple testing correction; this value is often used to select significant features)
* direct: ‘+’ means enriched in the case group.
* sig: ‘1’ means a significant feature based on the q-value threshold (e.g., 0.05); ‘0’ means not significant.
* score: sig × 1 if direct is ‘+’; sig × -1 if direct is ‘-’ (so significant features score 1 or -1, and non-significant features score 0).

Consensus/: a consensus table and a consensus plot (upsetPlot).

* ‘total\_result.csv’ is a concatenated table built from the \*.csv files of each tool.
* upsetPlot.csv collects the features from ‘total\_result.csv’ that were selected as significant by one or more DAA tools.
* upsetPlot.png visualizes upsetPlot.csv. In the plot, ‘red’ means case-enriched, and ‘blue’ means control-enriched.
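
The score column and the consensus step described above can be illustrated with a minimal Python sketch. It is not the engine's actual implementation: the column names `tool` and `feature`, the sample rows, and the helper names `score` and `consensus` are assumptions made for illustration, and the real layout of ‘total\_result.csv’ may differ.

```python
import csv
import io
from collections import defaultdict


def score(direct: str, sig: int) -> int:
    """Apply the score rule: sig * 1 if direct is '+', sig * -1 otherwise."""
    return sig * (1 if direct == "+" else -1)


def consensus(rows):
    """Map each feature to {tool: score} for the tools that called it significant."""
    calls = defaultdict(dict)
    for row in rows:
        s = score(row["direct"], int(row["sig"]))
        if s != 0:  # keep only significant calls
            calls[row["feature"]][row["tool"]] = s
    return dict(calls)


# Made-up rows standing in for a concatenated per-tool table.
# The 'tool' and 'feature' columns are hypothetical names.
SAMPLE = """tool,feature,direct,sig
ALDEx2,taxon_A,+,1
DESeq2,taxon_A,+,1
LinDA,taxon_A,+,0
ALDEx2,taxon_B,-,1
DESeq2,taxon_B,-,0
DESeq2,taxon_C,+,0
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
result = consensus(rows)

# List features by how many tools agree, most-supported first.
for feature, by_tool in sorted(result.items(), key=lambda kv: -len(kv[1])):
    print(feature, len(by_tool), by_tool)
```

Features supported by more tools are stronger consensus candidates; a feature that no tool calls significant (here `taxon_C`) does not appear in the result at all.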