2.1.3 - Mapping
Mapping aligns the reads of each fastq library to the regions of the reference genome where they most likely originated. In practice, this means aligning millions of short reads to the genome with fast alignment algorithms implemented in dedicated mapping tools (mappers).
To complete the mapping step via Tophat/Hisat2 & Cufflinks with the GTF annotation file, we map the preprocessed fastq files (from BI1-BI5 and BC1-BC4) to the fSpaAur1.1 reference genome. The Step-by-Step menu offers three mappers: TopHat (Kim et al., 2013; Trapnell et al., 2012), Hisat2 (Kim et al., 2015), and STAR (Dobin et al., 2013). In this tutorial, we will use TopHat. To start, select the Step-by-Step menu path Tophat/Hisat2 & Cufflinks → Mapping → Tophat and proceed as shown in Video 4.
Video 4. Mapping fastq libraries on the fSpaAur1.1 reference genome using Tophat and the GTF annotation file.
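For readers who want to see what a comparable run might look like outside the Step-by-Step menu, the following is a minimal sketch that only assembles one TopHat command line per sample and does not execute anything. The options `-G` (GTF annotation), `-o` (output directory), and `-p` (threads) are standard TopHat options. The index basename, GTF file name, fastq file names, output directory layout, and the assumption of one fastq file per sample are hypothetical and may differ from what the platform runs internally.

```python
def build_tophat_command(index_base, fastq_files, gtf_file, out_dir, threads=4):
    """Return the argument list for one TopHat run (nothing is executed here)."""
    return [
        "tophat",
        "-p", str(threads),      # number of threads
        "-G", str(gtf_file),     # GTF annotation used to guide the mapping
        "-o", str(out_dir),      # output directory for this sample
        str(index_base),         # basename of the Bowtie2 index of the genome
        *[str(f) for f in fastq_files],
    ]


# Sample names come from the tutorial; every file name below is hypothetical.
samples = [f"BI{i}" for i in range(1, 6)] + [f"BC{i}" for i in range(1, 5)]

commands = {
    sample: build_tophat_command(
        index_base="fSpaAur1.1",          # hypothetical index basename
        fastq_files=[f"{sample}.fastq"],  # hypothetical preprocessed library
        gtf_file="fSpaAur1.1.gtf",        # hypothetical annotation file name
        out_dir=f"tophat_out/{sample}",   # hypothetical output layout
    )
    for sample in samples
}

for sample, cmd in commands.items():
    print(sample, " ".join(cmd))
```

Each argument list could be passed to `subprocess.run` on a machine where TopHat and a matching genome index are installed. Within the tutorial itself, the menu path shown in Video 4 remains the intended route.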
Expected results from mapping analysis
When TopHat finishes, you will receive a BAM file for each sample containing the reads mapped against the reference genome. The expected results are available at the following link: Mapping. You can check how the job was completed in the job tracking panel. Pay particular attention to the log file metrics showing the percentage of reads successfully mapped. An acceptable value is more than 80% of reads mapped per fastq library. If the percentage is lower than 70%, try preprocessing the samples again to clean the fastq libraries more thoroughly. To learn more about TopHat, see https://ccb.jhu.edu/software/tophat/index.shtml
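The check on the mapping percentage can also be scripted. The sketch below assumes the log contains a line of the form `NN.N% overall read mapping rate.`, as TopHat2 writes in its alignment summary; the exact content of the log shown in the job tracking panel may differ. The example log text and its numbers are invented for illustration, and the "borderline" label for values between 70% and 80% is an assumption, since the tutorial only names the two thresholds.

```python
import re

ACCEPTABLE_ABOVE = 80.0   # tutorial: more than 80% mapped is acceptable
REPROCESS_BELOW = 70.0    # tutorial: below 70%, preprocess the samples again

# Assumed log line format: "91.2% overall read mapping rate."
RATE_PATTERN = re.compile(r"([\d.]+)%\s+overall read mapping rate")


def overall_mapping_rate(log_text):
    """Return the overall mapping rate in percent, or None if no such line is found."""
    match = RATE_PATTERN.search(log_text)
    return float(match.group(1)) if match else None


def assess(rate):
    """Classify a mapping rate using the thresholds given in the tutorial."""
    if rate is None:
        return "unknown"
    if rate > ACCEPTABLE_ABOVE:
        return "acceptable"
    if rate < REPROCESS_BELOW:
        return "re-preprocess"
    return "borderline"  # assumption: the tutorial does not name this range


# Invented example log, for illustration only.
example_log = """Reads:
          Input     :   1000000
           Mapped   :    912000 (91.2% of input)
91.2% overall read mapping rate.
"""

rate = overall_mapping_rate(example_log)
print(rate, assess(rate))
```

Running the same two functions over the log of each library (BI1-BI5 and BC1-BC4) gives a quick overview of which samples, if any, should be preprocessed again.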