# Tetra-Nucleotide Frequencies

## Tetra-Nucleotide Frequency Patterns

A tetra-nucleotide is a fragment of DNA sequence four bases long (e.g. AGTC or TTGG). Pride *et al.* (2003) showed that the frequencies of tetra-nucleotides in bacterial genomes contain useful, albeit weak, phylogenetic signals. Although tetra-nucleotide analysis (TNA) draws on information from the whole genome, it cannot replace alignment-based phylogenetic methods such as [OrthoANI](https://help.ezbiocloud.net/orthoani-genomic-similarity/) or 16S rRNA phylogeny. However, TNA can be useful for phylogenetic characterization when the whole genome or the 16S rRNA gene sequence is not available. For example, a partial genomic fragment obtained from a metagenome can be identified by TNA (Teeling *et al.*, 2004). TNA is also fast enough to be used as a search engine against a large genome database.

## Algorithm

The information contained in a genome sequence can be transformed into an array of tetra-nucleotide frequencies (see the figure below).

<figure><img src="/files/O6pal7EMe8HNdunHB3OF" alt=""><figcaption></figcaption></figure>

Each genome sequence is now represented as counts of the 256 possible tetra-nucleotides (4^4 = 256). The more similar two genome sequences are, the more strongly their tetra-nucleotide patterns correlate. Statistical measures of the correlation between two tetra-nucleotide frequency profiles can therefore serve as a rough estimate of the relatedness of two genomes.
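As an illustration of the counting step, the following Python sketch builds a 256-element count array from a sequence. It is a minimal example and not the EzBioCloud implementation: the function name `tetra_counts`, the overlapping one-base sliding window, the alphabetical ordering of tetra-nucleotides, and the choice to skip windows containing ambiguous bases such as N are all assumptions made here. It also counts a single strand only.

```python
from itertools import product

# All 256 possible tetra-nucleotides in a fixed (alphabetical) order.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]
INDEX = {t: i for i, t in enumerate(TETRAMERS)}


def tetra_counts(sequence):
    """Return a list of 256 counts, one per tetra-nucleotide (hypothetical helper)."""
    seq = sequence.upper()
    counts = [0] * len(TETRAMERS)
    # Slide a 4-base window along the sequence, one base at a time (assumed scheme).
    for i in range(len(seq) - 3):
        idx = INDEX.get(seq[i:i + 4])
        if idx is not None:  # skip windows containing ambiguous bases such as N
            counts[idx] += 1
    return counts


counts = tetra_counts("AGTCAGTCNAGTC")
print(len(counts), counts[INDEX["AGTC"]])  # prints: 256 3
```

Dividing each count by the total number of counted windows would turn the counts into frequencies, which makes profiles from sequences of different lengths easier to compare by eye.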

The tetra-nucleotide correlation coefficient ranges from 0 to 1, and two identical genomes produce a value of 1.0.
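To make the comparison step concrete, the sketch below computes a correlation coefficient between two count arrays. This page does not state which statistic is used, so a plain Pearson correlation is assumed here purely for illustration. Note that Pearson's coefficient can in principle be negative, so it may not match the 0-to-1 range of the actual measure. The function name `pearson` and the toy vectors are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    norm = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if norm == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / norm


# Toy vectors standing in for 256-element tetra-nucleotide count arrays.
genome_a = [12, 7, 3, 9, 15, 1]
genome_b = [11, 8, 2, 10, 14, 2]

print(pearson(genome_a, genome_a))  # identical profiles: approximately 1.0
print(pearson(genome_a, genome_b))  # similar profiles: close to, but below, 1.0
```

Because the correlation is unchanged when one vector is multiplied by a positive constant, raw counts and normalized frequencies give the same value, so genomes of different sizes can be compared directly.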

## References

1. Pride, D. T., Meinersmann, R. J., Wassenaar, T. M. & Blaser, M. J. Evolutionary implications of microbial genome tetranucleotide frequency biases. Genome Res [13, 145-158 (2003)](http://genome.cshlp.org/content/13/2/145.long).
2. Teeling, H., Meyerdierks, A., Bauer, M., Amann, R. & Glöckner, F. O. Application of tetranucleotide frequencies for the assignment of genomic fragments. Environ Microbiol [6, 938-947 (2004)](http://onlinelibrary.wiley.com/doi/10.1111/j.1462-2920.2004.00624.x/abstract;jsessionid=B9A8B9CFC9F73D55F2F18895A669BCD5.f01t04).

