NCBI Route

NCBI provides multiple options for retrieving metagenomic sequence data and its associated metadata, including direct downloads of individual files, Galaxy, and cloud delivery. All of them begin with the same steps for locating the sequencing data.


Visit the NCBI Sequence Read Archive (SRA) website.

Input Project ID

In the search bar, input “PRJNA834801” and press Enter.

Access Project

The search results will display the project page for PRJNA834801. Click on it to access the project details.
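The same lookup can also be scripted. The sketch below is one possible approach, not part of the original walkthrough: it builds a query for NCBI's public E-utilities ESearch endpoint against the `sra` database. The helper names `build_esearch_url` and `count_sra_records` are hypothetical, and the second one needs network access.

```python
import json
import urllib.parse
import urllib.request

# Public NCBI E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(project_id, retmax=20):
    """Return an ESearch URL that searches the SRA database for a project ID."""
    params = {
        "db": "sra",          # search the Sequence Read Archive
        "term": project_id,   # e.g. the BioProject ID typed into the search bar
        "retmode": "json",    # ask for a JSON response
        "retmax": retmax,     # maximum number of record IDs to return
    }
    return f"{EUTILS_ESEARCH}?{urllib.parse.urlencode(params)}"


def count_sra_records(project_id):
    """Query ESearch (requires network access) and return the match count."""
    with urllib.request.urlopen(build_esearch_url(project_id), timeout=30) as response:
        payload = json.load(response)
    # ESearch reports the count as a string inside "esearchresult".
    return int(payload["esearchresult"]["count"])


if __name__ == "__main__":
    # Only prints the URL; call count_sra_records() to actually query NCBI.
    print(build_esearch_url("PRJNA834801"))
```

Opening the printed URL in a browser shows the raw search response, which can be a quick check that the project ID resolves before moving on.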

Review Details

Look for information regarding the experimental design, sample annotations, or metadata that differentiate Control and Non-Control groups within the project.

Filter & Select

Use the NCBI SRA Run Selector or the SRA Toolkit to filter and download ten random Control and Non-Control FASTQ files. Filter on sample attributes, experimental factors, or other annotations to select the files that belong to the Control and Non-Control groups of interest.
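If the run metadata has been saved from the Run Selector as a CSV table, the random selection can be made reproducible with a short script. This is a minimal sketch under stated assumptions: the table has a `Run` column holding run accessions, the column that separates the groups is called `Group` (a hypothetical name; check the actual metadata for this project), and "ten random" is read as ten runs per group. The accessions in the demo are made-up placeholders, not runs from PRJNA834801.

```python
import csv
import io
import random


def pick_random_runs(metadata_csv, group_column, n_per_group=10, seed=0):
    """Return {group label: sorted list of randomly chosen run accessions}.

    metadata_csv is any open text file (or file-like object) containing a CSV
    table with a "Run" column and the given group column.
    """
    runs_by_group = {}
    for row in csv.DictReader(metadata_csv):
        runs_by_group.setdefault(row[group_column], []).append(row["Run"])

    rng = random.Random(seed)  # fixed seed so the same runs are picked each time
    return {
        group: sorted(rng.sample(runs, min(n_per_group, len(runs))))
        for group, runs in runs_by_group.items()
    }


if __name__ == "__main__":
    # Placeholder table for illustration only; "Group" is an assumed column name.
    demo = io.StringIO(
        "Run,Group\n"
        "SRR0000001,Control\n"
        "SRR0000002,Non-Control\n"
        "SRR0000003,Control\n"
        "SRR0000004,Non-Control\n"
    )
    for group, runs in pick_random_runs(demo, "Group", n_per_group=1).items():
        print(group, runs)
```

Fixing the seed keeps the "random" subset stable between sessions, which makes it easier to share the exact selection with collaborators.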

Download Files

Once you have selected the FASTQ files of interest from the NCBI Sequence Read Archive (SRA), there are several ways to download them:

  1. SRA Toolkit Command Line: If you’re familiar with the command line, you can use the SRA Toolkit. After identifying the accession numbers of the chosen runs, run a command such as fastq-dump with each accession to download the reads as FASTQ files directly to your local machine.

  2. NCBI SRA Run Selector: On the project page, you might see a “Send to” button. Click on it, select “File,” and then choose your desired format for downloading, such as FASTQ. This will prepare a package for download that contains the selected files.

  3. SRA Run Selector Web Interface: In the NCBI SRA Run Selector interface, select the desired files and click the “Accession List” button to get a list of accession numbers. Then use the fastq-dump command from the SRA Toolkit or the NCBI SRA Toolkit web browser to download the files directly.
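The command-line route can be wrapped in a small script that reads an accession list and calls fastq-dump once per run. This is a sketch rather than a definitive workflow: it assumes the SRA Toolkit is installed and on the PATH, that the list holds one run accession per line, and that paired reads should be split and gzip-compressed (the `--split-files` and `--gzip` options). The function names and the `fastq` output directory are hypothetical choices.

```python
import re
import shutil
import subprocess

# Run accessions start with SRR, ERR, or DRR followed by digits.
RUN_ACCESSION = re.compile(r"^[SED]RR\d+$")


def read_accessions(text):
    """Parse accession-list text (one run accession per line), skipping blanks."""
    accessions = [line.strip() for line in text.splitlines() if line.strip()]
    invalid = [acc for acc in accessions if not RUN_ACCESSION.match(acc)]
    if invalid:
        raise ValueError(f"Not run accessions: {invalid}")
    return accessions


def fastq_dump_command(accession, outdir="fastq"):
    """Build the fastq-dump argument list for a single run accession."""
    return ["fastq-dump", "--split-files", "--gzip", "--outdir", outdir, accession]


def download_all(accessions, outdir="fastq"):
    """Run fastq-dump for each accession; stops at the first failure."""
    if shutil.which("fastq-dump") is None:
        raise RuntimeError("fastq-dump not found on PATH; install the SRA Toolkit first")
    for accession in accessions:
        subprocess.run(fastq_dump_command(accession, outdir), check=True)


if __name__ == "__main__":
    # Placeholder accessions; replace with the contents of your accession list.
    for acc in read_accessions("SRR0000001\nSRR0000002\n"):
        print(" ".join(fastq_dump_command(acc)))
```

The demo only prints the commands it would run. Calling `download_all(read_accessions(open(path).read()))` with the path to a saved accession list performs the actual downloads.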

Sample File

Provided below is the accession list (list of sequence identifiers) of a randomized selection of sequence runs within the selected project.


EzBioCloud© 2024. All Rights Reserved