# Beta Diversity

Discover the taxonomic relationships between multiple sample profiles

## What is beta diversity in microbiome analysis?

Simply put, beta diversity is the analysis of taxonomic differences (i.e., differences in composition) between samples, whereas alpha diversity describes the diversity of taxa within a single sample. Samples can be grouped so that features can be compared between the groups, e.g., the microbes found in microbiome samples from subjects with ‘antibiotic use’ versus ‘no antibiotic use’.

## How does profiling work?

Below is the general procedure of our profiling tool:

## Profiling Tool Workflow

NGS raw data (FASTQ or FASTA format) are uploaded to our servers. Our profiling pipeline automatically processes your data and converts them into a data unit called an STP (standard taxonomic profile). An STP represents the taxonomic composition of your sample. In addition, data from public sources, including the Human Microbiome Project and the Sequence Read Archive (SRA), have been processed in advance, so they can be grouped and compared with your own STPs. Each STP contains run QC information such as read length and the number of reads matched. You also get alpha-diversity statistics, along with the taxonomic hierarchy and composition, all of which can be explored in the outputs.

Your own STPs or those from the public domain database can then be grouped into STP sets for comparison. The best way to group samples is to use metadata tags.

In what we call secondary analysis, two or more STP sets can then be compared for beta-diversity analytics. For example, you may find differentially present bacterial species between healthy and unhealthy human subjects. We provide a variety of statistical algorithms and parameters that can be run instantly and interactively.

## Analytics

Beta diversity analytics allow us to understand the relationship of microbial communities stored in multiple standard taxonomic profiles (STPs). Currently, these are presented as either **ordination analysis** or **hierarchical clustering**.

The first step for beta diversity is to calculate the distances among the set of STPs.

In **ordination** (PCA, PCoA, and NMDS techniques), similar species and samples are plotted close together, whereas dissimilar species and samples are plotted far apart. Ordination takes complex community data and projects it onto a lower-dimensional space so that patterns can be visualized. These patterns are observed along gradients, whether environmental or defined by some other variable.

The same distance matrix can be used to carry out **hierarchical clustering** with the UPGMA algorithm (Unweighted Pair Group Method with Arithmetic Mean), which we use to build our dendrograms.
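To make the UPGMA step concrete, here is a minimal pure-Python sketch of average-linkage clustering on a small distance matrix. The function name `upgma`, the sample labels, and the distance values are hypothetical and chosen only for illustration; this is not the platform's implementation.

```python
def upgma(dist, labels):
    """Average-linkage (UPGMA) clustering on a symmetric distance matrix.

    Returns the merges in order as (members_a, members_b, height) tuples.
    """
    # Each active cluster is a tuple of original sample indices.
    clusters = [(i,) for i in range(len(labels))]

    def avg_dist(c1, c2):
        # UPGMA distance: unweighted mean over all cross-cluster sample pairs.
        return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    merges = []
    while len(clusters) > 1:
        # Find the closest pair of active clusters.
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = avg_dist(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        a, b = clusters[x], clusters[y]
        merges.append(([labels[i] for i in a], [labels[i] for i in b], d))
        # Replace the two merged clusters with their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [a + b]
    return merges


# Hypothetical distances between three samples (illustrative values only).
labels = ["A", "B", "C"]
dist = [
    [0.0, 0.1, 1.0],
    [0.1, 0.0, 0.9],
    [1.0, 0.9, 0.0],
]
merges = upgma(dist, labels)
for left, right, height in merges:
    print(left, right, round(height, 3))
```

Each merge height is the average distance between the two clusters being joined, which corresponds to the level at which the branches meet in a dendrogram.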

## PCoA (Ordination)

In ordination analysis, the distance matrix is reduced by principal coordinate analysis (PCoA) to give the major axes, the principal coordinates (PCs). We take the first two or three PCs to draw the scatter plots. The first axis (PC1) corresponds to the largest gradient in the dataset, PC2 to the second largest, and so on. This means that the first axis captures the largest share of the variance in the dataset.
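As an illustration of the reduction step, the sketch below runs classical PCoA on a distance matrix: the squared distances are double-centered, the result is eigendecomposed, and the leading eigenvectors are scaled into coordinates. It assumes NumPy is available; the function name `pcoa`, the toy distance matrix, and the choice to clip negative eigenvalues to zero are assumptions made for this example, not a description of the platform's code.

```python
import numpy as np


def pcoa(dist, n_axes=2):
    """Classical PCoA; returns (coordinates, proportion of variance per axis)."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    # Gower double-centering of the squared distance matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ (d ** 2) @ centering
    # eigh returns eigenvalues in ascending order; sort them descending.
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Non-Euclidean measures (e.g. Bray-Curtis) can yield small negative
    # eigenvalues; clipping them to zero is one common, simple choice.
    positive = np.clip(eigvals, 0.0, None)
    coords = eigvecs[:, :n_axes] * np.sqrt(positive[:n_axes])
    explained = positive[:n_axes] / positive.sum()
    return coords, explained


# Toy example: three samples lying on a line at positions 0, 1 and 3.
dist = [
    [0.0, 1.0, 3.0],
    [1.0, 0.0, 2.0],
    [3.0, 2.0, 0.0],
]
coords, explained = pcoa(dist)
print(coords[:, 0])   # PC1 coordinates (sign is arbitrary)
print(explained)      # share of variance captured by PC1 and PC2
```

Because the toy samples lie on a single line, PC1 alone reproduces the original pairwise distances and carries essentially all of the variance.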

There are several algorithms for calculating beta-diversity distances, each with its own parameters; we use the Jaccard and Bray-Curtis indices. Both range from 0 to 1: values close to 0 indicate similar samples, and values close to 1 indicate dissimilar samples.
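The two indices can be sketched in a few lines of plain Python. The read counts below are hypothetical, and the Jaccard variant shown is the presence/absence form; whether the platform uses this form or an abundance-weighted one is not stated here, so treat that as an assumption.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors (same taxon order)."""
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0  # two empty profiles are treated as identical in this sketch
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def jaccard(a, b):
    """Jaccard distance on presence/absence: 1 - shared taxa / taxa in either sample."""
    present_a = {i for i, x in enumerate(a) if x > 0}
    present_b = {i for i, x in enumerate(b) if x > 0}
    union = present_a | present_b
    if not union:
        return 0.0
    return 1.0 - len(present_a & present_b) / len(union)


def distance_matrix(profiles, metric):
    """Pairwise distances between all profiles under the given metric."""
    n = len(profiles)
    return [[metric(profiles[i], profiles[j]) for j in range(n)] for i in range(n)]


# Hypothetical read counts for three samples across four taxa.
profiles = [
    [10, 0, 5, 5],
    [8, 2, 5, 5],
    [0, 20, 0, 0],
]
bc = distance_matrix(profiles, bray_curtis)
jc = distance_matrix(profiles, jaccard)
print(bc[0][1], bc[0][2])  # 0.1 (similar), 1.0 (no shared abundance)
print(jc[0][1], jc[0][2])  # 0.25 (3 of 4 taxa shared), 1.0 (no shared taxa)
```

The first two samples differ only slightly and score near 0 on both indices, while the third sample shares no taxa with the first and scores 1.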

## PERMANOVA (Hierarchical Clustering)

Permutational multivariate analysis of variance (PERMANOVA), a non-parametric multivariate statistical test, is used to test the null hypothesis that the centroids and dispersion of the STP sets, as defined in the space of the chosen distance measure, are equivalent for all sets. Rejecting the null hypothesis means that the centroid, the spread, or both differ between the STP sets.

The PERMANOVA test is first performed with all STP sets together, then for each pair of STP sets.
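The procedure can be sketched as follows: compute a pseudo-F statistic from the distance matrix and the group labels, then shuffle the labels many times to see how often a permuted statistic is at least as large as the observed one. All names (`pseudo_f`, `permanova`, `pairwise_permanova`), the group labels, the number of permutations, and the toy data are hypothetical; this is a minimal sketch of the general technique, not the platform's implementation.

```python
import random
from itertools import combinations


def pseudo_f(dist, groups):
    """Pseudo-F statistic computed from pairwise distances and group labels."""
    n = len(groups)
    names = sorted(set(groups))
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for name in names:
        members = [i for i, g in enumerate(groups) if g == name]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(members, 2)) / len(members)
    ss_among = ss_total - ss_within
    if ss_within == 0:
        return float("inf")  # no within-group spread at all
    return (ss_among / (len(names) - 1)) / (ss_within / (n - len(names)))


def permanova(dist, groups, n_perm=999, seed=0):
    """Returns (observed pseudo-F, permutation p-value)."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small tolerance so label swaps that give the same F are counted.
        if pseudo_f(dist, shuffled) >= observed - 1e-9:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


def pairwise_permanova(dist, groups, **kwargs):
    """Runs the test separately on each pair of groups."""
    results = {}
    for g1, g2 in combinations(sorted(set(groups)), 2):
        idx = [i for i, g in enumerate(groups) if g in (g1, g2)]
        sub = [[dist[i][j] for j in idx] for i in idx]
        results[(g1, g2)] = permanova(sub, [groups[i] for i in idx], **kwargs)
    return results


# Toy data: six samples placed on a line, three per hypothetical group.
positions = [0.0, 0.1, 0.2, 0.8, 0.9, 1.0]
groups = ["healthy"] * 3 + ["unhealthy"] * 3
dist = [[abs(p - q) for q in positions] for p in positions]

f_stat, p_value = permanova(dist, groups)
pairwise = pairwise_permanova(dist, groups)
print(f_stat, p_value)
```

With only three samples per group, there are few distinct ways to reassign the labels, so the p-value cannot become very small; real analyses generally need more samples per STP set for the permutation test to have useful resolution.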
