Clinical Metagenomics Report

This page explains each field of the clinical metagenomics report for beta-test users. The report is currently in a preliminary form.

Overview

Sample Information

Name

The name of the sample as it was uploaded to EzBioCloud.

Pipeline & Database

The analytic engine and the reference databases (genes and genomes) used for the analysis are listed so that results can be compared and referenced consistently.

Input

The input type of the uploaded sample: PAIRED_FQ (paired FASTQ) or SINGLE_FQ (single FASTQ).

Sequencing data statistics

Input file(s) & QC-passed & Host-depleted

Input file(s) are all the reads uploaded to EzBioCloud. QC-passed reads are those that remain after passing the quality control standards. Host-depleted reads are those that remain after removing any reads that match the sample host, e.g. reads associated with the human genome.

Median length

The median length is the middle value of the read lengths when all reads are ordered by length. It indicates the typical length of the sequencing reads in the dataset.

Interquartile range (IQR)

The interquartile range is the range between the first quartile (25th percentile) and the third quartile (75th percentile) of read lengths. It measures the spread of the central 50% of the data, providing insight into the variability of read lengths.
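As an illustration of the two length statistics above, the following minimal sketch computes the median and IQR for a small list of read lengths. The function name `length_summary` and the example lengths are hypothetical, and the quartile method (`inclusive`) is an assumption: different tools interpolate quartiles differently, so the report's values may not match exactly.

```python
import statistics

def length_summary(read_lengths):
    """Return (median, q1, q3, iqr) for a list of read lengths."""
    ordered = sorted(read_lengths)
    median = statistics.median(ordered)
    # With n=4, quantiles() returns the three quartile cut points (Q1, Q2, Q3).
    q1, _, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return median, q1, q3, q3 - q1

# Hypothetical read lengths, for illustration only.
lengths = [100, 120, 150, 150, 151, 151, 151]
median, q1, q3, iqr = length_summary(lengths)
print(median, q1, q3, iqr)
```

A small IQR relative to the median suggests that most reads have similar lengths.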

Q20 rate

The Q20 rate represents the percentage of bases in the sequencing reads with a quality score of 20 or higher. A quality score of 20 corresponds to a 1% error rate, meaning there is a 99% chance the base is called correctly.

Q30 rate

The Q30 rate is the percentage of bases with a quality score of 30 or higher. A quality score of 30 indicates a 0.1% error rate, meaning there is a 99.9% chance the base is called correctly.
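The Q20 and Q30 rates follow from the Phred scale, where a quality score Q corresponds to an error probability of 10^(-Q/10). The sketch below shows how such rates could be computed from FASTQ quality strings. It assumes the common Phred+33 ASCII encoding; the function names and the example quality strings are hypothetical and not taken from the EzBioCloud pipeline.

```python
def error_probability(q):
    """Error probability implied by a Phred quality score q."""
    return 10 ** (-q / 10)

def q_rate(quality_strings, threshold, offset=33):
    """Percentage of bases whose Phred score is >= threshold.

    Assumes Phred+33 encoding (score = ASCII code - 33).
    """
    total = 0
    passing = 0
    for qual in quality_strings:
        for ch in qual:
            total += 1
            if ord(ch) - offset >= threshold:
                passing += 1
    return 100.0 * passing / total if total else 0.0

# Hypothetical quality lines from two reads.
quals = ["IIII5", "??+5I"]
q20 = q_rate(quals, 20)
q30 = q_rate(quals, 30)
print(q20, q30)
```

Because every base that reaches Q30 also reaches Q20, the Q30 rate can never exceed the Q20 rate.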

Number of reads

The number of reads refers to the total count of sequencing reads in the sample. Each read represents a fragment of the DNA or RNA that was sequenced.

Number of bases

The number of bases is the total number of nucleotide bases (A, T, C, G) present in all the reads combined. It is a measure of the total amount of sequenced data.

Retained

Retained is the percentage of the sequencing data that remains after the processing steps (quality control and host depletion), i.e. how much of the initial data was kept for analysis after low-quality reads and unwanted sequences were filtered out.
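The retained percentage can be sketched as a simple ratio. This example assumes the percentage is computed from read counts relative to the original input; the report may instead use base counts, and the function name and all numbers below are hypothetical.

```python
def retained_percent(input_count, remaining_count):
    """Percentage of the input that remains after a processing step."""
    if input_count <= 0:
        raise ValueError("input_count must be positive")
    return 100.0 * remaining_count / input_count

# Hypothetical read counts at each stage.
input_reads = 1_000_000
qc_passed_reads = 950_000
host_depleted_reads = 190_000

retained_after_qc = retained_percent(input_reads, qc_passed_reads)
retained_after_host = retained_percent(input_reads, host_depleted_reads)
print(retained_after_qc, retained_after_host)
```

In this made-up case most reads pass quality control, but only a small share remains after host depletion, which would be expected for a sample dominated by host DNA.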

Microorganisms detected with the highest coverage

MLST

Multi-Locus Sequence Typing (MLST) is a method for subtyping bacteria based on the sequence of several housekeeping genes. 'Unresolved' indicates there was not enough coverage of the MLST genes to determine the type.
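To illustrate the idea, the sketch below assigns a sequence type by looking up the combination of allele numbers in a profile table, and reports 'Unresolved' when any locus lacks an allele call. The locus names, allele numbers, sequence type labels, and the "No matching profile" result are all hypothetical; real MLST schemes define their own loci and profile tables, and this is not the pipeline's actual implementation.

```python
# Hypothetical housekeeping loci for an imaginary typing scheme.
LOCI = ("locusA", "locusB", "locusC")

# Hypothetical profile table: allele numbers per locus -> sequence type.
PROFILES = {
    (1, 4, 2): "ST10",
    (3, 4, 7): "ST25",
}

def assign_sequence_type(allele_calls, profiles=PROFILES, loci=LOCI):
    """Return a sequence type, or 'Unresolved' if any locus has no call."""
    alleles = tuple(allele_calls.get(locus) for locus in loci)
    # A missing call (None) stands for a locus with insufficient coverage.
    if any(a is None for a in alleles):
        return "Unresolved"
    # Assumption: a complete but unknown profile gets a separate label.
    return profiles.get(alleles, "No matching profile")

resolved = assign_sequence_type({"locusA": 1, "locusB": 4, "locusC": 2})
unresolved = assign_sequence_type({"locusA": 1, "locusB": None, "locusC": 2})
print(resolved, unresolved)
```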

Mapped reads

Mapped reads is the number of sequencing reads that were successfully aligned to the reference genome of the listed microorganism. It indicates how many of the reads in your sample correspond to that microorganism.

Coverage breadth

Coverage breadth refers to the proportion of the reference genome that is covered by the mapped reads. The percentage indicates how much of the genome is represented in the sequencing data.

Coverage depth

Coverage depth is the average number of times each base in the reference genome is covered by the mapped reads. It measures the redundancy of the sequencing data and indicates how robust the data are. 'Over aligned sites' is the depth averaged only over the reference positions that are covered by at least one read. 'Over whole reference' is the depth averaged over the entire reference genome, including positions that no read in your sample covers. When coverage breadth is low, the depth over the whole reference will therefore be lower than the depth over aligned sites.
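The relationship between coverage breadth and the two depth values can be sketched from a per-base depth profile. The function name `coverage_summary` and the depth values are hypothetical, and treating a position as covered when its depth is above zero is an assumption; the pipeline may apply its own thresholds.

```python
def coverage_summary(per_base_depth):
    """Return (breadth %, depth over aligned sites, depth over whole reference)."""
    ref_len = len(per_base_depth)
    if ref_len == 0:
        raise ValueError("reference must contain at least one position")
    covered = sum(1 for d in per_base_depth if d > 0)
    total = sum(per_base_depth)
    breadth = 100.0 * covered / ref_len
    depth_whole = total / ref_len
    depth_aligned = total / covered if covered else 0.0
    return breadth, depth_aligned, depth_whole

# Hypothetical 10-base reference in which only 4 positions are covered.
depths = [0, 0, 0, 0, 0, 4, 4, 6, 6, 0]
breadth, depth_aligned, depth_whole = coverage_summary(depths)
print(breadth, depth_aligned, depth_whole)
```

Under these assumptions, depth over the whole reference equals depth over aligned sites multiplied by the breadth fraction (here 5.0 × 0.40 = 2.0).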

Relative abundance

Relative abundance is the proportion of the total sequencing reads that are attributed to a specific microorganism. It provides insight into the prevalence of each microorganism in the sample.
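As a minimal sketch, relative abundance can be computed by dividing each microorganism's read count by a total. The choice of denominator (here passed in explicitly as `total_reads`) is an assumption, since the report may count all analysed reads or only classified reads; the function name and read counts are hypothetical.

```python
def relative_abundance(mapped_reads, total_reads):
    """Percentage of total_reads attributed to each microorganism."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return {name: 100.0 * n / total_reads for name, n in mapped_reads.items()}

# Hypothetical read counts per microorganism.
mapped = {
    "Escherichia coli": 6000,
    "Klebsiella pneumoniae": 3000,
    "Staphylococcus aureus": 1000,
}
abundance = relative_abundance(mapped, total_reads=10_000)
print(abundance)
```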

Recognized as pathogen

This column indicates whether the microorganism is known to be pathogenic, meaning it can cause disease. The value is either 'yes' or 'no'. It helps in assessing the potential clinical relevance of the detected microorganisms, which can then be tested for with traditional clinical identification techniques.

Antibiotic resistance genes detected

Gene name

This is the specific name of the antibiotic resistance gene detected in the sample. These genes confer resistance to antibiotics and are identified by their specific sequences.

Antibiotics (subclasses)

This column lists the classes and subclasses of antibiotics to which the gene confers resistance. For example, 'Macrolide (Macrolide)' means the detected gene confers resistance to antibiotics in the macrolide class, with the subclass given in parentheses.

Breadth

In this context, breadth refers to the proportion of the resistance gene that is covered by the mapped reads. It is usually expressed as a percentage and indicates how much of the gene is represented in the sequencing data. A breadth of 100% means the entire gene sequence is covered by reads, while a lower percentage indicates partial coverage.

Depth

Depth here refers to the average number of times each base in the resistance gene sequence is covered by the mapped reads. It is a measure of the redundancy of the sequencing data for that particular gene. Higher depth values indicate more robust and reliable sequencing data, which can provide greater confidence in the detection and characterization of the resistance gene.

Virulence factor genes detected

Gene Name

This column lists the specific names of the virulence factor genes detected in the sample. These genes are associated with the ability of pathogens to infect and cause disease in a host.

Virulence Factor

This indicates the function or role of the gene in contributing to the virulence of the microorganism. Examples include:

Capsule (Immune modulation): Genes involved in the formation of a protective capsule around the bacterium, helping it evade the host's immune system.
Effector delivery system (T6SS, T2SS): Genes involved in the secretion systems that deliver toxins or effector molecules into host cells.
Exotoxin (ShET2): Genes that produce toxins secreted by bacteria to damage the host.
Adherence (Type 1 fimbriae, Curli fibers, ECP): Genes that help the bacteria adhere to host cells.
Regulation (Fur, SigA): Genes involved in regulatory functions that control the expression of other virulence factors.
Stress survival (KatA): Genes that help bacteria survive under stressful conditions, such as oxidative stress.

Breadth

Breadth refers to the proportion of the virulence factor gene that is covered by the mapped reads. It is expressed as a percentage, indicating how much of the gene sequence is represented in the sequencing data. For example, a breadth of 53.4% means that 53.4% of the gene is covered by the sequencing reads.

Depth

Depth refers to the average number of times each base in the virulence factor gene sequence is covered by the mapped reads. It is a measure of the redundancy and reliability of the sequencing data for that particular gene. Higher depth values indicate greater confidence in the detection and characterization of the gene. For instance, a depth of 1.033x means each base in the gene sequence is covered, on average, slightly more than once by the sequencing reads.

Full lists

Full list of 'potential pathogens' detected from the sample

Full list of antibiotic resistance genes detected from the sample

Last updated 1 year ago
