# Linux Route

Retrieving sequencing data on Linux requires finding the accession numbers and setting up a suitable environment. If you’re familiar with the command line, you can use the SRA Toolkit. After identifying the accession numbers of the chosen FASTQ files, use commands like **fastq-dump** to download them directly to your local machine.

## Retrieve an SRA accession list

1. Visit the NCBI Sequence Read Archive (SRA) website.
2. In the search bar, enter “PRJNA834801” and press Enter.
3. The search results will display the project page for PRJNA834801. Click on it to access the project details.
4. Look for information on the experimental design, sample annotations, or metadata that differentiates the Control and Non-Control groups within the project.
5. Use the SRA Toolkit or the NCBI SRA Run Selector to filter and download ten random Control and Non-Control FASTQ files, or download them all if you have the space. Use filters such as sample attributes, experimental factors, or other annotations to select the files corresponding to the Control and Non-Control groups of interest.
6. Download the list of selected accession numbers for further analysis or processing in your research.

Once you have selected the FASTQ files of interest from the NCBI Sequence Read Archive (SRA), there are several ways to download them.

## Download the accession list

In this example, I will use a Linux system that I have set up with fastq-dump and its prerequisites.

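Before passing a downloaded accession list to the SRA Toolkit, it can help to sanity-check the file. The following is a minimal sketch, not part of the original protocol: `clean_accession_list` is a hypothetical helper name, and the pattern assumes run accessions of the form SRR, ERR, or DRR followed by digits. It trims whitespace (including Windows-style carriage returns), drops anything that does not look like a run accession, and removes duplicates.

```shell
# Hypothetical helper: print only well-formed run accessions from a list file.
# Assumption: run accessions look like SRR/ERR/DRR followed by digits.
clean_accession_list() {
    # sed trims leading and trailing whitespace (this includes carriage returns);
    # grep keeps lines that consist of exactly one run accession;
    # sort -u removes duplicate entries.
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' "$1" \
        | grep -E '^[SED]RR[0-9]+$' \
        | sort -u
}
```

For example, `clean_accession_list downloaded_list.txt > accession_list.txt` would write a cleaned copy; both file names here are placeholders.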
**SRA Run Selector Web Interface**: In the NCBI SRA Run Selector interface, select the desired files and click the “Accession List” button to get a list of their accession numbers. Afterward, use the **fastq-dump** command from the SRA Toolkit to download the files directly.

Provided below is the accession list (list of sequence identifiers) of a randomized selection of sequence runs within the selected project.

{% file src="/files/rvfzGBlgtNrC7tDN191O" %}

## Set up SRA Toolkit on Linux

To set up the SRA Toolkit and its prerequisites on Linux, follow these steps:

### **For Debian/Ubuntu-based systems**

```
sudo apt-get update
sudo apt-get install sra-toolkit
```

### **For Red Hat-based systems**

```
sudo yum install sra-toolkit
```

## **Install prerequisites**

Ensure you have the required dependencies installed. These typically include **wget**, **curl**, and **libxml-libxml-perl**:

For Debian/Ubuntu-based systems

```
sudo apt-get install wget curl libxml-libxml-perl
```

For Red Hat-based systems

```
sudo yum install wget curl perl-XML-LibXML
```

## **Verify installation**

1. After installation, verify that the SRA Toolkit is correctly installed by running:

   ```
   fastq-dump --version
   ```
2. If the installation was successful, this command displays the installed SRA Toolkit version.

Once installed, you can use the **fastq-dump** command to download FASTQ files from the NCBI Sequence Read Archive (SRA) using their accession numbers or URLs.

## Download selected accession files

After verifying the installation of the SRA Toolkit, follow these steps to download the selected SRA files using an accession list:

1. **Accession List Preparation**:
   * Create a text file containing the accession numbers of the SRA files you want to download. Each accession number should be on a separate line in the file.
2.
   **Download SRA Files**:
   * Open a terminal window and navigate to the directory that contains the accession list text file.
3. **Use SRA Toolkit to Download**:
   * Run the **prefetch** command from the SRA Toolkit, providing the accession list file as input. For example:

     ```
     prefetch --option-file accession_list.txt
     ```

     Replace **accession\_list.txt** with the actual filename containing your accession numbers, for example PRJNA834801\_accession\_subselection.txt.
4. **Conversion to FASTQ**:
   * Once the prefetch command completes, use the **fastq-dump** command to convert the downloaded SRA files to FASTQ format. For instance:

     ```
     fastq-dump --split-files SRRXXXXXX
     ```

     Replace **SRRXXXXXX** with an accession number obtained from the SRA, and repeat the command for each downloaded file. Adjust as necessary for your downloaded files.

These commands download the selected SRA files based on the accession list you’ve prepared and then convert them to FASTQ format for further analysis or processing in your research.

## Download selected accession files to a specific location

To download the SRA files to a specific location, such as a directory named ‘sra\_downloads’ on a mounted D drive, follow these modified steps:

1. **Accession List Preparation**:
   * Create a text file containing the accession numbers of the SRA files you want to download. Each accession number should be on a separate line in the file.
2. **Download SRA Files to a Specific Location**:
   * Open a terminal window and navigate to the directory that contains the accession list text file.
   * Use the **prefetch** command from the SRA Toolkit, specifying the target directory with the **-O** option. For example:

     ```
     prefetch --option-file accession_list.txt -O /mnt/d/sra_downloads
     ```

     Replace **accession\_list.txt** with the actual filename containing your accession numbers.
The **-O** flag followed by the directory path **/mnt/d/sra\_downloads** sets the location where the downloaded files are stored.

## Convert SRA files to FASTQ in a specific location

3. **Conversion to FASTQ**:
   * After downloading the SRA files, navigate to the directory where they are stored (**/mnt/d/sra\_downloads** in this example).
   * Use the **fastq-dump** command to convert the downloaded SRA files to FASTQ format. For instance:

     ```
     fastq-dump --split-files -O /mnt/d/sra_downloads SRRXXXXXX
     ```

     Replace **SRRXXXXXX** with the specific accession number obtained from the SRA. This command converts the downloaded SRA file to FASTQ format and places it in the specified directory.

These commands download the selected SRA files from the accession list to the specified location on your mounted D drive and convert them to FASTQ format for further analysis. Adjust paths and commands according to your system setup and requirements.

## Download and convert multiple SRA files at once

To convert multiple downloaded SRA files to FASTQ format with a single **fastq-dump** command, do the following:

1. **Download SRA Files**:
   * Use **prefetch** to download the SRA files to the specified directory on your D drive:

     ```
     prefetch --option-file accession_list.txt -O /mnt/d/sra_downloads
     ```
2. **Conversion to FASTQ for Multiple Files**:
   * Navigate to the directory where the downloaded SRA files are stored (**/mnt/d/sra\_downloads**).
   * Use the **fastq-dump** command with the **--split-files** option to convert multiple SRA files to FASTQ format in one command:

     ```
     fastq-dump --split-files *.sra
     ```

     This command uses the wildcard **\*.sra** to specify that all SRA files in the current directory should be converted to FASTQ format.

Make sure you are in the directory where your downloaded SRA files are stored before running the **fastq-dump** command with the wildcard.
Adjust paths and file extensions according to your specific setup if needed.
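The download-and-convert workflow in this guide can also be scripted as a loop over the accession list. The sketch below is one possible way to do it, under stated assumptions: `fetch_and_convert` and the `DRY_RUN` variable are hypothetical names introduced here, and the loop calls **prefetch** and **fastq-dump** once per accession rather than using `--option-file` or a wildcard. Setting `DRY_RUN=1` prints the commands instead of running them, which is a cheap way to check the list and paths before starting a long download.

```shell
# Hypothetical helper: fetch and convert every run listed in a file.
# $1 = accession list (one run per line), $2 = output directory.
# With DRY_RUN=1 the commands are printed instead of executed.
fetch_and_convert() {
    list=$1
    outdir=$2
    # Create the output directory unless this is a dry run.
    [ "${DRY_RUN:-0}" = "1" ] || mkdir -p "$outdir" || return 1
    while IFS= read -r acc || [ -n "$acc" ]; do
        # Drop carriage returns from Windows-style line endings.
        acc=$(printf '%s' "$acc" | tr -d '\r')
        # Skip blank lines in the list.
        [ -z "$acc" ] && continue
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "prefetch -O $outdir $acc"
            echo "(cd $outdir && fastq-dump --split-files $acc)"
        else
            # Redirect stdin so the tools cannot consume the accession list.
            prefetch -O "$outdir" "$acc" < /dev/null || return 1
            # Convert from inside the download directory, as in the steps above.
            (cd "$outdir" && fastq-dump --split-files "$acc" < /dev/null) || return 1
        fi
    done < "$list"
}
```

A dry run might look like `DRY_RUN=1 fetch_and_convert accession_list.txt /mnt/d/sra_downloads`. Stopping at the first failed accession is a design choice of this sketch; depending on your needs, you may prefer to log failures and continue.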