Clinical Metagenomics Report
Beta-test users can use this page to find answers to questions about our clinical metagenomics report, which is currently in its preliminary form.
Last updated
This is the name of the sample as it was uploaded to EzBioCloud.
The analytic engine and the reference databases (genes and genomes) used are listed so that analyses can be compared and referenced consistently.
The input type of the uploaded sample: PAIRED_FQ (paired FASTQ) or SINGLE_FQ (single FASTQ).
The input files are all the reads uploaded to EzBioCloud, whereas QC-passed reads are those that passed quality control. Host-depleted reads are the reads that remain after removing any that match the sample host, e.g. human-genome-associated reads.
The median length is the middle value of the read lengths when all reads are ordered by length. It indicates the typical length of the sequencing reads in the dataset.
The interquartile range is the range between the first quartile (25th percentile) and the third quartile (75th percentile) of read lengths. It measures the spread of the central 50% of the data, providing insight into the variability of read lengths.
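The median and interquartile range described above can be sketched as follows. This is a minimal illustration, not the report's actual implementation: the function name `read_length_summary` and the example lengths are hypothetical, and the quartile convention (Python's "inclusive" method) is an assumption, since different tools interpolate quartiles slightly differently.

```python
import statistics

def read_length_summary(lengths):
    """Return (median, q1, q3, iqr) for a list of read lengths."""
    median = statistics.median(lengths)
    # With n=4, statistics.quantiles returns the three quartile cut points.
    # The "inclusive" method is an assumed convention, not the report's.
    q1, _, q3 = statistics.quantiles(lengths, n=4, method="inclusive")
    return median, q1, q3, q3 - q1

# Hypothetical read lengths, for illustration only.
lengths = [100, 120, 150, 150, 151, 151, 151]
median, q1, q3, iqr = read_length_summary(lengths)
print(f"median={median}, Q1={q1}, Q3={q3}, IQR={iqr}")
```

A small interquartile range relative to the median suggests that most reads are close to the typical length.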
The Q20 rate represents the percentage of bases in the sequencing reads with a quality score of 20 or higher. A quality score of 20 corresponds to a 1% error rate, meaning there is a 99% chance the base is called correctly.
The Q30 rate is the percentage of bases with a quality score of 30 or higher. A quality score of 30 indicates a 0.1% error rate, meaning there is a 99.9% chance the base is called correctly.
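The relationship between a quality score Q and the error probability is p = 10^(-Q/10), which gives 1% at Q20 and 0.1% at Q30. The sketch below shows that conversion and one way a Q20 or Q30 rate could be computed from FASTQ quality strings. It assumes the common Phred+33 encoding; the function names and example strings are hypothetical, and this is not necessarily how the report computes these values.

```python
def phred_to_error_probability(q):
    """Convert a Phred quality score to the probability of a wrong base call."""
    return 10 ** (-q / 10)

def q_rate(quality_strings, threshold, offset=33):
    """Percentage of bases whose quality score is >= threshold.

    Assumes Phred+33 encoded quality strings (offset=33).
    """
    total = 0
    passing = 0
    for qual in quality_strings:
        for ch in qual:
            total += 1
            if ord(ch) - offset >= threshold:
                passing += 1
    return 100.0 * passing / total if total else 0.0

# Hypothetical quality strings: 'I' = Q40, '5' = Q20, '+' = Q10, '!' = Q0.
quals = ["II55", "5!I+"]
q20 = q_rate(quals, 20)
q30 = q_rate(quals, 30)
print(f"Q20 rate: {q20}%, Q30 rate: {q30}%")
```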
The number of reads refers to the total count of sequencing reads in the sample. Each read represents a fragment of the DNA or RNA that was sequenced.
The number of bases is the total number of nucleotide bases (A, T, C, G) present in all the reads combined. It is a measure of the total amount of sequenced data.
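As an illustration of the two counts above, the following sketch tallies reads and bases from FASTQ text. It assumes simple four-line FASTQ records with unwrapped sequence lines; the function name `count_reads_and_bases` and the example records are hypothetical.

```python
import io

def count_reads_and_bases(handle):
    """Count reads and total bases in a 4-line-per-record FASTQ stream."""
    reads = 0
    bases = 0
    for i, line in enumerate(handle):
        # In a 4-line record, the second line holds the sequence.
        if i % 4 == 1:
            reads += 1
            bases += len(line.strip())
    return reads, bases

# Two hypothetical reads of 4 and 6 bases.
example = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGTAC\n+\nIIIIII\n")
reads, bases = count_reads_and_bases(example)
print(f"reads={reads}, bases={bases}")
```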
Retained refers to the percentage of sequencing data that remains after the processing steps (quality control and host depletion), i.e. how much of the initial data was kept for analysis after filtering out low-quality reads and unwanted sequences.
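The retained percentage can be sketched as a simple ratio against the initial input. Whether the report computes it on reads or on bases is not stated here, so the counts below are hypothetical and the function name `retained_percent` is an assumption.

```python
def retained_percent(input_count, remaining_count):
    """Percentage of the initial data that remains after a processing step."""
    if input_count == 0:
        return 0.0
    return 100.0 * remaining_count / input_count

# Hypothetical counts for a sample dominated by host reads.
input_reads = 1_000_000
qc_passed_reads = 950_000
host_depleted_reads = 50_000

after_qc = retained_percent(input_reads, qc_passed_reads)
after_host_depletion = retained_percent(input_reads, host_depleted_reads)
print(f"after QC: {after_qc}%, after host depletion: {after_host_depletion}%")
```

In clinical samples, a low retained percentage after host depletion is not unusual, because most reads can originate from the host.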
Multi-Locus Sequence Typing (MLST) is a method for subtyping bacteria based on the sequence of several housekeeping genes. 'Unresolved' indicates there was not enough coverage of the MLST genes to determine the type.
Mapped reads is the number of sequencing reads that have been successfully aligned to the reference genome of each microorganism listed. This indicates how many of the reads in your sample correspond to that particular microorganism.
Coverage breadth refers to the proportion of the reference genome that is covered by the mapped reads. The percentage indicates how much of the genome is represented in the sequencing data.
Coverage depth is the average number of times each base in the reference genome is covered by the mapped reads. It measures the redundancy of the sequencing data and indicates how robust the data are. 'Over aligned sites' is the coverage depth calculated only over the positions of the reference genome that are covered by reads. 'Over whole reference' is calculated over the entire reference genome, which may be only partially covered by the reads in your sample. A low coverage breadth therefore lowers the 'over whole reference' coverage depth relative to the 'over aligned sites' value.
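The relationship between coverage breadth and the two depth values can be sketched from a list of per-base depths. This is an illustration under stated assumptions: the function name `coverage_summary` and the depth values are hypothetical, and a position counts as covered when at least one read overlaps it.

```python
def coverage_summary(per_base_depth):
    """Return (breadth %, depth over aligned sites, depth over whole reference).

    per_base_depth holds the number of reads covering each reference position.
    """
    ref_len = len(per_base_depth)
    covered = sum(1 for d in per_base_depth if d > 0)
    total = sum(per_base_depth)
    breadth = 100.0 * covered / ref_len if ref_len else 0.0
    depth_aligned = total / covered if covered else 0.0
    depth_whole = total / ref_len if ref_len else 0.0
    return breadth, depth_aligned, depth_whole

# Hypothetical 10-base reference where only 3 positions are covered.
depths = [0, 0, 4, 4, 2, 0, 0, 0, 0, 0]
breadth, depth_aligned, depth_whole = coverage_summary(depths)
print(f"breadth={breadth}%, aligned={depth_aligned:.3f}x, whole={depth_whole:.3f}x")
```

With these definitions, the depth over the whole reference equals the depth over aligned sites multiplied by the breadth expressed as a fraction, which is why a low breadth pulls the whole-reference value down.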
Relative abundance is the proportion of the total sequencing reads that are attributed to a specific microorganism. It provides insight into the prevalence of each microorganism in the sample.
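Relative abundance as described above can be sketched as each microorganism's share of the total reads. Which read total the report uses as the denominator is not specified here, so it is passed explicitly; the function name `relative_abundance`, the species and the counts are hypothetical.

```python
def relative_abundance(mapped_reads, total_reads):
    """Percentage of total_reads attributed to each microorganism.

    mapped_reads maps a microorganism name to its mapped read count.
    total_reads is the denominator (an assumption: the caller decides
    which read total is appropriate).
    """
    if total_reads == 0:
        return {name: 0.0 for name in mapped_reads}
    return {name: 100.0 * n / total_reads for name, n in mapped_reads.items()}

# Hypothetical mapped read counts out of 1000 total reads.
counts = {"Escherichia coli": 600, "Klebsiella pneumoniae": 200}
abundance = relative_abundance(counts, 1000)
for name, pct in abundance.items():
    print(f"{name}: {pct}%")
```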
This term indicates whether the listed microorganism is known to be pathogenic, meaning it can cause disease. The value is either 'yes' or 'no'. It helps in assessing the potential clinical relevance of the detected microorganisms, which can then be tested by traditional clinical identification techniques.
This is the specific name of the antibiotic resistance gene detected in the sample. These genes confer resistance to antibiotics and are identified by their specific sequences.
This column lists the classes and subclasses of antibiotics to which the resistance gene confers resistance. For example, 'Macrolide (Macrolide)' means the detected gene provides resistance to antibiotics in the macrolide class.
In this context, breadth refers to the proportion of the resistance gene that is covered by the mapped reads. It is usually expressed as a percentage and indicates how much of the gene is represented in the sequencing data. A breadth of 100% means the entire gene sequence is covered by reads, while a lower percentage indicates partial coverage.
Depth here refers to the average number of times each base in the resistance gene sequence is covered by the mapped reads. It is a measure of the redundancy of the sequencing data for that particular gene. Higher depth values indicate more robust and reliable sequencing data, which can provide greater confidence in the detection and characterization of the resistance gene.
This column lists the specific names of the virulence factor genes detected in the sample. These genes are associated with the ability of pathogens to infect and cause disease in a host.
This indicates the function or role of the gene in contributing to the virulence of the microorganism. Examples include:
Capsule (Immune modulation): Genes involved in the formation of a protective capsule around the bacterium, helping it evade the host's immune system.
Effector delivery system (T6SS, T2SS): Genes involved in the secretion systems that deliver toxins or effector molecules into host cells.
Exotoxin (ShET2): Genes that produce toxins secreted by bacteria to damage the host.
Adherence (Type 1 fimbriae, Curli fibers, ECP): Genes that help the bacteria adhere to host cells.
Regulation (Fur, SigA): Genes involved in regulatory functions that control the expression of other virulence factors.
Stress survival (KatA): Genes that help bacteria survive under stressful conditions, such as oxidative stress.
Breadth refers to the proportion of the virulence factor gene that is covered by the mapped reads. It is expressed as a percentage, indicating how much of the gene sequence is represented in the sequencing data. For example, a breadth of 53.4% means that 53.4% of the gene is covered by the sequencing reads.
Depth refers to the average number of times each base in the virulence factor gene sequence is covered by the mapped reads. It is a measure of the redundancy and reliability of the sequencing data for that particular gene. Higher depth values indicate greater confidence in the detection and characterization of the gene. For instance, a depth of 1.033x means each base in the gene sequence is covered, on average, slightly more than once by the sequencing reads.