EzBioCloud© 2024. All Rights Reserved

Identification with 16S rRNA

How can the 16S rRNA gene be used to identify bacterial species?

Last updated 1 year ago

16S identification algorithm for identification of a bacterium

The most critical measurement for 16S-based species identification is pairwise sequence similarity. However, different sequence alignment algorithms may produce different similarity values, so it is important to use a taxonomically valid algorithm for alignment and similarity calculation. Ideally, we would calculate the similarity between the isolate and the type strains of all known species. This is doable but inefficient: computing all pairs (>70,000) takes a long time, while we only need the values that are close enough (i.e., species with >98.7% similarity). For this reason, the EzBioCloud 16S Identification service uses a two-step approach: a fast search first narrows down the candidate species, and the exact pairwise similarity is then calculated only for those candidates. It is the same approach as the one used on our public 16S identification service (www.ezbiocloud.net), except that the reference database used in EzBioCloud 16S Identification is more stringently curated.
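To make the role of the 98.7% cutoff concrete, here is a minimal sketch of how a similarity value could be derived from an already-aligned pair of sequences and then used to keep only close candidates. The function names and the gap-handling rule (columns with a gap in either sequence are skipped) are assumptions for illustration; the exact similarity definition used by EzBioCloud is not spelled out in this article.

```python
SPECIES_CUTOFF = 98.7  # 16S similarity cutoff (%) quoted in the text


def pairwise_similarity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two already-aligned sequences.

    Assumption: columns with a gap ('-') in either sequence are skipped.
    This is one convention among several, not necessarily EzBioCloud's.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def close_candidates(similarities: dict[str, float],
                     cutoff: float = SPECIES_CUTOFF) -> list[tuple[str, float]]:
    """Keep only references above the cutoff, most similar first."""
    hits = [(name, s) for name, s in similarities.items() if s > cutoff]
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

Only the references that survive a filter of this kind matter for species-level interpretation, which is why computing exact similarities against every type strain is unnecessary.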

The EzBioCloud 16S Identification engine works in the following steps:

1. The query sequence is chopped into three fragments of equal length. If the length of the query sequence is < 1000 bp, the query is chopped into two fragments. If the length of the query sequence is < 500 bp, the query is not chopped. The original full-length query and the fragments (four sequences in total for a query of 1000 bp or longer) are used as query sequences for a BLASTn-based search against the EzBioCloud 16S Identification Database. Searching with different parts of the query sequence ensures that all potentially similar reference sequences are found.
2. Fifty hits are collected from each of the four BLASTn searches and combined. Because there are always duplicated hits, the final hit list contains far fewer than 200 hits.
3. A robust pairwise sequence alignment (Myers and Miller, 1988) is carried out between the query sequence and every BLASTn hit species identified in the previous step. The alignment algorithm used in the EzBioCloud 16S Identification service is the same as the one used in defining the 16S cutoff (98.7%) for species definition (Kim et al., 2014) and in the highly cited EzBioCloud (formerly EzTaxon) service. For more details about 16S similarity calculation, please read this article. Please note that BLASTn identity values are not used for taxonomic purposes [Learn more].
4. The completeness (%) of the query sequence is calculated [Learn more]. For example, 50% completeness means that the query sequence covers only half of the full-length 16S gene. The taxonomically meaningful 16S sequence similarity was proposed on the basis of full-length sequences, so similarity values based on partial sequences should be interpreted carefully.
5. Finally, the hit species are sorted by 16S similarity, displayed as a table, and stored. Interpretation of 16S similarity values should be made carefully. For example, Bacillus cereus shows >99.8% 16S similarity to about ten species, which means that a very similar 16S sequence does not always imply that the isolate belongs to the hit species.
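The candidate-gathering and completeness steps above can be sketched as follows. This is an illustrative outline, not the service's actual code: the function names are hypothetical, the length thresholds are read as "shorter than 1000 bp" and "shorter than 500 bp", the BLASTn search itself is left out (its results are passed in as plain lists of reference IDs), and completeness is assumed to be the query length relative to a full-length reference.

```python
def split_query(query: str) -> list[str]:
    """Return the full-length query followed by its fragments.

    Assumed rule: 3 fragments by default, 2 if shorter than 1000 bp,
    no fragmentation if shorter than 500 bp.
    """
    n = len(query)
    if n < 500:
        return [query]
    parts = 2 if n < 1000 else 3
    size = -(-n // parts)  # ceiling division; the last fragment may be slightly shorter
    fragments = [query[i:i + size] for i in range(0, n, size)]
    return [query] + fragments


def merge_hits(hit_lists: list[list[str]], per_search: int = 50) -> list[str]:
    """Combine the top hits of each search, dropping duplicates but keeping order."""
    merged: list[str] = []
    seen: set[str] = set()
    for hits in hit_lists:
        for ref_id in hits[:per_search]:
            if ref_id not in seen:
                seen.add(ref_id)
                merged.append(ref_id)
    return merged


def completeness(query_len: int, full_length_ref_len: int) -> float:
    """Percent of the full-length 16S gene covered by the query (assumed definition)."""
    return min(100.0, 100.0 * query_len / full_length_ref_len)


def rank_hits(similarities: dict[str, float]) -> list[tuple[str, float]]:
    """Sort hit species by 16S similarity, highest first."""
    return sorted(similarities.items(), key=lambda item: item[1], reverse=True)
```

For a 1500 bp query, `split_query` yields four sequences, so at most 4 × 50 = 200 hits reach `merge_hits`, and deduplication is what brings the final list well below that number.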
