Clinical Metagenomics Report

Beta-test users can refer to this page for answers to questions about our clinical metagenomics report, which is currently in a preliminary form.

Last updated 11 months ago

Overview


Sample Information

Name

This is the name of the sample as it was uploaded to EzBioCloud.

Pipeline & Database

For consistency in analyses and referencing, the report states the analytic engine and the reference databases (genes and genomes) that were used.

Input

The input type of the uploaded sample: PAIRED_FQ (paired FASTQ) or SINGLE_FQ (single FASTQ).


Sequencing data statistics

Input file(s) & QC-passed & Host-depleted

Input file(s) are all the reads uploaded to EzBioCloud. QC-passed reads are those that remain after passing quality control. Host-depleted reads are those that remain after removing any reads that match the sample host, e.g. reads associated with the human genome.

Median length

The median length is the middle value of the read lengths when all reads are ordered by length. It indicates the typical length of the sequencing reads in the dataset.

Interquartile range (IQR)

The interquartile range is the range between the first quartile (25th percentile) and the third quartile (75th percentile) of read lengths. It measures the spread of the central 50% of the data, providing insight into the variability of read lengths.
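
As an illustration, the median and IQR of a set of read lengths could be computed as sketched below. The function name `length_summary` and the example lengths are hypothetical, and the quartile rule (Python's `statistics.quantiles` with the inclusive method) is an assumption; the report may use a different interpolation rule.

```python
from statistics import median, quantiles

def length_summary(read_lengths):
    """Return (median, Q1, Q3, IQR) for a list of read lengths."""
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, _, q3 = quantiles(read_lengths, n=4, method="inclusive")
    return median(read_lengths), q1, q3, q3 - q1

# Hypothetical read lengths, for illustration only.
lengths = [100, 120, 150, 150, 151, 151, 151]
med, q1, q3, iqr = length_summary(lengths)
print(f"median={med}, Q1={q1}, Q3={q3}, IQR={iqr}")
```

A small IQR relative to the median suggests that most reads have a similar length.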

Q20 rate

The Q20 rate represents the percentage of bases in the sequencing reads with a quality score of 20 or higher. A quality score of 20 corresponds to a 1% error rate, meaning there is a 99% chance the base is called correctly.

Q30 rate

The Q30 rate is the percentage of bases with a quality score of 30 or higher. A quality score of 30 indicates a 0.1% error rate, meaning there is a 99.9% chance the base is called correctly.
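
The Q20 and Q30 figures follow from the standard Phred relationship Q = -10 * log10(P), where P is the probability that a base call is wrong. The sketch below shows that conversion and one way a Q20 or Q30 rate could be computed from FASTQ quality strings. It assumes Phred+33 encoding, and the function names and example quality strings are hypothetical; it is not necessarily how the pipeline calculates these values.

```python
def error_probability(q):
    """Phred definition: P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)

def q_rate(quality_strings, threshold, offset=33):
    """Percentage of bases whose Phred score is >= threshold.

    Assumes Phred+33 encoded quality strings (an assumption).
    """
    total = passed = 0
    for qual in quality_strings:
        for ch in qual:
            total += 1
            if ord(ch) - offset >= threshold:
                passed += 1
    return 100.0 * passed / total if total else 0.0

# Hypothetical quality strings for two short reads.
quals = ["IIII5", "I?+5I"]
print(error_probability(20), error_probability(30))
print(q_rate(quals, 20), q_rate(quals, 30))
```

Because every base with a score of 30 or higher also has a score of 20 or higher, the Q30 rate can never exceed the Q20 rate.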

Number of reads

The number of reads refers to the total count of sequencing reads in the sample. Each read represents a fragment of the DNA or RNA that was sequenced.

Number of bases

The number of bases is the total number of nucleotide bases (A, T, C, G) present in all the reads combined. It is a measure of the total amount of sequenced data.

Retained

Retained is the percentage of the sequencing data that remains after the processing steps (quality control and host depletion), i.e. how much of the initial data was kept for analysis after filtering out low-quality reads and unwanted sequences.
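
A minimal sketch of the retained percentage, assuming it is the final (host-depleted) count divided by the input count. Whether the report bases this on reads or on bases is not stated here, and the function name and counts below are hypothetical.

```python
def retained_percent(input_count, final_count):
    """Percentage of the input that remains after QC and host depletion."""
    if input_count == 0:
        return 0.0
    return 100.0 * final_count / input_count

# Hypothetical counts: 1,000,000 input reads, 190,000 left after host depletion.
print(retained_percent(1_000_000, 190_000))
```

A low retained value is not necessarily a problem: samples dominated by host DNA lose most of their reads at the host-depletion step.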


Microorganisms detected with the highest coverage

MLST

Multi-Locus Sequence Typing (MLST) is a method for subtyping bacteria based on the sequences of several housekeeping genes. 'Unresolved' indicates that coverage of the MLST genes was insufficient to determine the type.

Mapped reads

Mapped reads is the number of sequencing reads that were successfully aligned to the reference genome of the listed microorganism. It indicates how many of the reads in your sample correspond to that microorganism.

Coverage breadth

Coverage breadth refers to the proportion of the reference genome that is covered by the mapped reads. The percentage indicates how much of the genome is represented in the sequencing data.

Coverage depth

Coverage depth is the average number of times each base in the reference genome is covered by the mapped reads. It measures the redundancy of the sequencing data and indicates how robust the sequence data are. 'Over aligned sites' is the coverage depth calculated only over the positions of the reference genome that are covered by reads. 'Over whole reference' is calculated over the whole reference genome, which may be only partially covered by the reads in your sample. A low coverage breadth therefore leads to a lower coverage depth over the whole reference.
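
The relationship between coverage breadth and the two depth figures can be sketched from a per-base depth profile. The function `coverage_summary` and the ten-position profile are hypothetical and only illustrate the definitions above; the pipeline's actual calculation may differ in detail.

```python
def coverage_summary(per_base_depth):
    """Return (breadth %, depth over aligned sites, depth over whole reference).

    per_base_depth holds the number of reads covering each reference position.
    """
    ref_len = len(per_base_depth)
    covered = [d for d in per_base_depth if d > 0]
    total = sum(covered)
    breadth = 100.0 * len(covered) / ref_len if ref_len else 0.0
    depth_aligned = total / len(covered) if covered else 0.0
    depth_whole = total / ref_len if ref_len else 0.0
    return breadth, depth_aligned, depth_whole

# Hypothetical reference of 10 positions, 4 of them covered by reads.
depths = [0, 0, 4, 6, 5, 5, 0, 0, 0, 0]
breadth, depth_aligned, depth_whole = coverage_summary(depths)
print(breadth, depth_aligned, depth_whole)
```

In this example the uncovered positions pull the whole-reference depth well below the depth over aligned sites, which is the effect of low coverage breadth.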

Relative abundance

Relative abundance is the proportion of the total sequencing reads that are attributed to a specific microorganism. It provides insight into the prevalence of each microorganism in the sample.
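
As a rough sketch, relative abundance can be expressed as each microorganism's mapped reads divided by a total read count. Which total the report uses as the denominator is an assumption here, and the function name, taxon labels and counts are hypothetical.

```python
def relative_abundance(mapped_reads_by_taxon, total_reads):
    """Return each taxon's share of total_reads as a percentage."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return {
        taxon: 100.0 * count / total_reads
        for taxon, count in mapped_reads_by_taxon.items()
    }

# Hypothetical mapped-read counts out of 10,000 reads.
abundance = relative_abundance({"taxon_A": 6000, "taxon_B": 1500}, 10_000)
print(abundance)
```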

Recognized as pathogen

This indicates whether the microorganism is known to be pathogenic, meaning it can cause disease. The value is either 'yes' or 'no'. It helps in assessing the potential clinical relevance of the detected microorganisms, which can be tested for with traditional clinical identification techniques.


Antibiotic resistance genes detected

Gene name

This is the specific name of the antibiotic resistance gene detected in the sample. These genes confer resistance to antibiotics and are identified by their specific sequences.

Antibiotics (subclasses)

This column lists the classes and subclasses of antibiotics to which the gene confers resistance, with the subclass given in parentheses. For example, 'Macrolide (Macrolide)' means the detected gene provides resistance to antibiotics in the macrolide class.
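
If the report values need to be processed downstream, a label such as 'Macrolide (Macrolide)' could be split into its class and subclass as sketched below. This assumes each label has the form 'Class (Subclass)' with a single class and subclass; the function name is hypothetical and real report cells may be formatted differently.

```python
import re

# Assumed label format: "Class (Subclass)".
_LABEL = re.compile(r"^\s*(?P<cls>[^()]+?)\s*\((?P<sub>[^()]*)\)\s*$")

def parse_antibiotic_label(label):
    """Split 'Class (Subclass)' into (class, subclass).

    Returns (label, None) when no parenthesised subclass is present.
    """
    match = _LABEL.match(label)
    if match is None:
        return label.strip(), None
    return match.group("cls"), match.group("sub").strip() or None

print(parse_antibiotic_label("Macrolide (Macrolide)"))
```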

Breadth

In this context, breadth refers to the proportion of the resistance gene that is covered by the mapped reads. It is usually expressed as a percentage and indicates how much of the gene is represented in the sequencing data. A breadth of 100% means the entire gene sequence is covered by reads, while a lower percentage indicates partial coverage.

Depth

Depth here refers to the average number of times each base in the resistance gene sequence is covered by the mapped reads. It is a measure of the redundancy of the sequencing data for that particular gene. Higher depth values indicate more robust and reliable sequencing data, which can provide greater confidence in the detection and characterization of the resistance gene.
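
Breadth and depth are most informative when read together. The sketch below picks out genes that are fully covered and deeply sequenced. The thresholds (100% breadth, more than 50x depth) are illustrative choices rather than official cut-offs, and the function name, gene names and values are hypothetical.

```python
def confident_genes(genes, min_breadth=100.0, min_depth=50.0):
    """Return names of genes with breadth >= min_breadth and depth > min_depth."""
    return [
        g["gene"]
        for g in genes
        if g["breadth"] >= min_breadth and g["depth"] > min_depth
    ]

# Hypothetical report rows: breadth in percent, depth in x-fold coverage.
rows = [
    {"gene": "geneA", "breadth": 100.0, "depth": 72.4},
    {"gene": "geneB", "breadth": 100.0, "depth": 12.0},
    {"gene": "geneC", "breadth": 64.2, "depth": 88.0},
]
print(confident_genes(rows))
```

A gene with high depth but low breadth, like the third row, may reflect reads piling up on a short region shared with other genes, so it deserves more caution than a fully covered gene.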


Virulence factor genes detected

Gene Name

This column lists the specific names of the virulence factor genes detected in the sample. These genes are associated with the ability of pathogens to infect and cause disease in a host.

Virulence Factor

This indicates the function or role of the gene in contributing to the virulence of the microorganism. Examples include:

  • Capsule (Immune modulation): Genes involved in the formation of a protective capsule around the bacterium, helping it evade the host's immune system.

  • Effector delivery system (T6SS, T2SS): Genes involved in the secretion systems that deliver toxins or effector molecules into host cells.

  • Exotoxin (ShET2): Genes that produce toxins secreted by bacteria to damage the host.

  • Adherence (Type 1 fimbriae, Curli fibers, ECP): Genes that help the bacteria adhere to host cells.

  • Regulation (Fur, SigA): Genes involved in regulatory functions that control the expression of other virulence factors.

  • Stress survival (KatA): Genes that help bacteria survive under stressful conditions, such as oxidative stress.

Breadth

Breadth refers to the proportion of the virulence factor gene that is covered by the mapped reads. It is expressed as a percentage, indicating how much of the gene sequence is represented in the sequencing data. For example, a breadth of 53.4% means that 53.4% of the gene is covered by the sequencing reads.

Depth

Depth refers to the average number of times each base in the virulence factor gene sequence is covered by the mapped reads. It is a measure of the redundancy and reliability of the sequencing data for that particular gene. Higher depth values indicate greater confidence in the detection and characterization of the gene. For instance, a depth of 1.033x means each base in the gene sequence is covered, on average, slightly more than once by the sequencing reads.


Full lists

Full list of 'potential pathogens' detected from the sample

Full list of antibiotic resistance genes detected from the sample

Summary page from our preliminary clinical metagenomics report.
Information relating to the sample run through the clinical metagenomics tool.
Statistics detailing the size and quality of your sample, including read and base information.
Microorganisms in your sample with the highest coverage. Taxa in the top three of the Bacteria, Fungi, and Viruses coverage lists have a high relative abundance of genetic material in your sample.
In another example, two viruses that are putative pathogens were detected in the sample.
Antibiotic resistance genes found in your sample. These genes could potentially be part of any of the microbial genomes identified. The top three listed here are fully mapped (100% breadth) at a high depth of coverage (>50x).
Virulence factor genes