Course: Tutorial RNASeq of comparative transcriptomics based on the species Sparus aurata, Topic: 1.2

1.2 - Tutorial material and case study

In this tutorial, we use data from a comparative transcriptomics case study of the species Sparus aurata, previously published in Pérez-Sánchez et al. (2019). The tutorial material consists of nine RNAseq samples from spleen biopsies of S. aurata specimens. Specimens were separated into two groups: control (BC) (n = 4) and parasite-infected fish (BI) (n = 5). Table 1 lists the nine fastq files with their SRA accessions, their library names, and the tag that assigns each sample to its group.

Table 1: Samples and case study groups

SRA accession	Library Names	Tags
SRR8255970	ZFG-17-12_03_26333_S7_R1_001.fastq	BC1
SRR8255963	ZFG-17-12_06_26336_S10_R1_001.fastq	BC2
SRR8255962	ZFG-17-12_09_26339_S13_R1_001.fastq	BC3
SRR8255949	ZFG-17-12_12_26342_S16_R1_001.fastq	BC4
SRR8255945	ZFG-17-12_16_26346_S2_R1_001.fastq	BI1
SRR8255941	ZFG-17-12_20_26350_S6_R1_001.fastq	BI2
SRR8255956	ZFG-17-12_24_26354_S10_R1_001.fastq	BI3
SRR8255952	ZFG-17-12_28_26358_S14_R1_001.fastq	BI4
SRR8255939	ZFG-17-12_32_26362_S18_R1_001.fastq	BI5

*BC = control; BI = infected fish.

The nine fastq files can be downloaded from NCBI at https://www.ncbi.nlm.nih.gov/bioproject/PRJNA507368. If you need help downloading this material from NCBI, contact us for support at https://forum.biotechvana.com.
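As a convenience, the sample assignment in Table 1 can be written down as a small Python sample sheet. The sketch below groups the accessions by tag prefix and builds (but does not run) download commands. It assumes the NCBI SRA Toolkit (`prefetch` and `fasterq-dump`) is installed, which this tutorial does not require; the function names are illustrative only.

```python
# Sample sheet copied from Table 1: SRA accession -> tag.
SAMPLES = {
    "SRR8255970": "BC1",
    "SRR8255963": "BC2",
    "SRR8255962": "BC3",
    "SRR8255949": "BC4",
    "SRR8255945": "BI1",
    "SRR8255941": "BI2",
    "SRR8255956": "BI3",
    "SRR8255952": "BI4",
    "SRR8255939": "BI5",
}


def group_of(tag):
    """Return the group prefix of a tag, e.g. 'BC1' -> 'BC'."""
    return tag.rstrip("0123456789")


def samples_by_group(samples):
    """Map each group (BC or BI) to the list of its SRA accessions."""
    groups = {}
    for accession, tag in samples.items():
        groups.setdefault(group_of(tag), []).append(accession)
    return groups


def download_commands(samples):
    """Build shell commands for the SRA Toolkit (assumed to be installed).

    The commands are only returned as strings; nothing is executed here.
    """
    commands = []
    for accession in samples:
        commands.append(f"prefetch {accession}")
        commands.append(f"fasterq-dump {accession}")
    return commands


if __name__ == "__main__":
    for group, accessions in samples_by_group(SAMPLES).items():
        print(group, len(accessions), ",".join(accessions))
    print("\n".join(download_commands(SAMPLES)))
```

Files fetched this way are typically named after the accession rather than the library name, so the dictionary above is also useful for mapping them back to the tags used in the tutorial.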

RefSeq material: To complete the tutorial you will need the following reference sequences:

The genome assembly draft of S. aurata (fSpaAur1.1 Torre de la Sal release) will be used as the reference genome sequence in the Tophat/Hisat2 & Cufflinks protocol.
The GTF file associated with the coding genes of the fSpaAur1.1 release will be used as the reference gene annotation in the Tophat/Hisat2 & Cufflinks protocol.
The RefSeq file of transcripts of S. aurata (fSpaAur1.1 release) will be used as the transcriptome reference sequence in the Mapping & Counting protocol.
A csv file with the functional descriptions and annotations for all gene features of S. aurata (fSpaAur1.1 release) will be used to add functional information, such as gene ontology (GO) categories, descriptions or formal annotations, to the differential expression results.

You can download the RefSeq material from TorreLaSal CSIC Nutrigroup at https://nutrigroup-iats.org/welcome/request_file. For more details contact Professor Jaume Perez-Sanchez (jaime.perez.sanchez@csic.es).

Alternatively, you can use the RefSeq release provided by NCBI at https://www.ncbi.nlm.nih.gov/assembly/GCF_900880675.1. However, please note that the NCBI release for S. aurata differs in size and annotations from the TorreLaSal release, so differential expression results will likely differ from those presented in this tutorial (which is based on the TorreLaSal release).
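To illustrate how the annotation csv file listed above could be attached to differential expression results, here is a minimal sketch using only the Python standard library. The column names (`gene_id`, `description`, `go_terms`, `log2_fold_change`, `q_value`) and all values are hypothetical toy data; the real files may be laid out differently.

```python
import csv
import io

# Hypothetical toy annotation table (NOT the real file layout).
ANNOTATION_CSV = """gene_id,description,go_terms
gene_0001,toy protein A,GO:0000001;GO:0000002
gene_0002,toy protein B,GO:0000003
"""

# Hypothetical toy differential expression results.
DE_RESULTS_CSV = """gene_id,log2_fold_change,q_value
gene_0001,2.1,0.003
gene_0003,-1.4,0.020
"""


def read_rows(text):
    """Parse csv text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def annotate(de_rows, annotation_rows, key="gene_id"):
    """Add description and GO terms to each DE row, matching on the gene ID.

    Genes without an annotation entry get empty strings.
    """
    by_gene = {row[key]: row for row in annotation_rows}
    merged = []
    for row in de_rows:
        extra = by_gene.get(row[key], {})
        merged.append({
            **row,
            "description": extra.get("description", ""),
            "go_terms": extra.get("go_terms", ""),
        })
    return merged


if __name__ == "__main__":
    rows = annotate(read_rows(DE_RESULTS_CSV), read_rows(ANNOTATION_CSV))
    for row in rows:
        print(row)
```

Because the join relies on matching gene identifiers, it only works when the annotation file and the DE results come from the same release; mixing the TorreLaSal and NCBI releases would leave most genes unannotated.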

GOseq input material: The tutorial demonstrates how to run GOseq analyses on differential expression (DE) results obtained with either the “Tophat/Hisat2 & Cufflinks” or the “Mapping & Counting” protocol. Enrichment analyses are performed with the software GOseq (Young et al., 2010). As S. aurata has to be handled as a custom species in this software, you need four input files for the analysis: 1) assayed genes; 2) differentially expressed genes; 3) gene sizes; and 4) GO terms per gene.

To facilitate this tutorial, we provide you with the following material:

The contents of these four files differ depending on whether the analysis is performed via the “Tophat/Hisat2 & Cufflinks” or the “Mapping & Counting” protocol. Nevertheless, the procedure is identical in both cases. For this reason, we provide these four files pre-created for the “Tophat/Hisat2 & Cufflinks” protocol only, simply to show you the format of each input file.

Remember that these files are only valid for GOseq analyses performed under the “Tophat/Hisat2 & Cufflinks” protocol. If you want to complete a GOseq analysis under the “Mapping & Counting” protocol, you need to prepare these four files yourself. Similarly, the pre-prepared files for GOseq analyses are not valid if you use the NCBI release, so in that case they must also be prepared separately.
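If you prepare the four GOseq inputs yourself, a quick consistency check before running the analysis can save time. The sketch below is not part of GOseq. It assumes the four files have already been loaded into plain Python structures (a list of assayed gene IDs, a list of DE gene IDs, a dictionary of gene sizes, and a dictionary of GO terms per gene); the function name and the toy values are hypothetical.

```python
def check_goseq_inputs(assayed, de_genes, gene_sizes, go_terms):
    """Return a list of problems found in the four inputs (empty if none).

    assayed    : list of all assayed gene IDs
    de_genes   : list of differentially expressed gene IDs
    gene_sizes : dict mapping gene ID -> size in bases
    go_terms   : dict mapping gene ID -> list of GO term IDs
    """
    problems = []
    assayed_set = set(assayed)

    # Each assayed gene should be listed once.
    if len(assayed_set) != len(assayed):
        problems.append("duplicate gene IDs among assayed genes")

    # DE genes must be a subset of the assayed genes.
    unknown_de = sorted(set(de_genes) - assayed_set)
    if unknown_de:
        problems.append(f"DE genes not among assayed genes: {unknown_de}")

    # Every assayed gene needs a positive size.
    no_size = sorted(assayed_set - set(gene_sizes))
    if no_size:
        problems.append(f"assayed genes without a size: {no_size}")
    bad_size = sorted(g for g, n in gene_sizes.items() if n <= 0)
    if bad_size:
        problems.append(f"genes with a non-positive size: {bad_size}")

    # Genes without GO terms are allowed, but at least one assayed gene
    # should carry a GO term, otherwise no category can be tested.
    if not any(go_terms.get(g) for g in assayed_set):
        problems.append("no assayed gene has any GO term")

    return problems


if __name__ == "__main__":
    # Hypothetical toy inputs.
    assayed = ["g1", "g2", "g3"]
    de_genes = ["g1"]
    gene_sizes = {"g1": 1200, "g2": 800, "g3": 450}
    go_terms = {"g1": ["GO:0000001"], "g2": []}
    print(check_goseq_inputs(assayed, de_genes, gene_sizes, go_terms))
```

The same check applies regardless of the protocol or reference release used, since only the contents of the four inputs change, not their relationship to each other.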

Biotechvana

This platform is part of: IVACE PROJECT IMDIGB/2020/56