2.1.6 - GOSeq
The next step of this protocol is to determine the enrichment of Gene Ontology (GO) terms using the software GOSeq (Young et al., 2010). Note that this should also be done for any other annotation, such as metabolic pathways, SignalP domains, etc. For simplicity, we focus only on differential expression results at the gene level. The GOSeq analysis can be performed using reference data from native or customized species. For native species, gene lengths and gene categories are available automatically because they are stored in the local GOSeq database, so the user only needs to provide the assayed genes and differentially expressed genes files. For customized species, which are not in the native GOSeq database, the user must provide the gene lengths and gene categories in addition to the assayed genes and differentially expressed genes files.
As the GOSeq analysis in this protocol is customized, you must prepare four individual files: the assayed genes, the differentially expressed genes, the gene sizes, and the GO terms per gene. You may use the prepared files (see Section 1.2).
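To make the role of these inputs concrete, here is a minimal Python sketch of a category over-representation test. It is not GOSeq's own method: GOSeq additionally uses the gene lengths to correct for length bias, whereas this sketch applies a plain hypergeometric test and ignores gene length. The function names, gene IDs, and category labels are hypothetical and serve only as an illustration.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which belong to the category."""
    upper = min(K, n)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def enrichment(assayed, de_genes, gene2cat):
    """Return {category: p-value} for over-representation among DE genes.

    assayed  : iterable of all assayed gene IDs (the background)
    de_genes : iterable of differentially expressed gene IDs
    gene2cat : dict mapping gene ID -> list of category labels
    """
    assayed = set(assayed)
    de = set(de_genes) & assayed  # ignore DE genes outside the background

    # Invert gene -> categories into category -> genes (assayed genes only).
    cat2genes = {}
    for gene, cats in gene2cat.items():
        if gene in assayed:
            for cat in cats:
                cat2genes.setdefault(cat, set()).add(gene)

    N, n = len(assayed), len(de)
    return {
        cat: hypergeom_upper_tail(len(genes & de), len(genes), n, N)
        for cat, genes in cat2genes.items()
    }


# Toy data (hypothetical): 10 assayed genes, 3 of them differentially expressed.
assayed = [f"g{i}" for i in range(1, 11)]
de_genes = ["g1", "g2", "g3"]
gene2cat = {
    "g1": ["GO:A"], "g2": ["GO:A"], "g3": ["GO:A"], "g4": ["GO:A"],
    "g5": ["GO:B"], "g6": ["GO:B"],
}
for cat, p in sorted(enrichment(assayed, de_genes, gene2cat).items()):
    print(cat, p)
```

In this toy example all three differentially expressed genes fall in category GO:A, so it receives a small p-value, while GO:B contains none of them and is not enriched.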
If you used the NCBI genome release, you must prepare these four files yourself. When these files are ready, move them from your directory browser to the folder 08_GOSeq of your server-side user account using the FTP browser.
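Before uploading, it may help to check that the four files are consistent with one another. The following Python sketch assumes a layout that the protocol does not specify: one gene ID per line for the assayed and differentially expressed genes, and tab-separated "gene, value" lines for gene sizes and GO terms. Adapt the parsing to the actual format of your files; all function names are hypothetical.

```python
def parse_ids(text):
    """Return the non-empty lines of a one-ID-per-line file."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def parse_pairs(text):
    """Return (gene, value) tuples from tab-separated lines."""
    pairs = []
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            raise ValueError(f"expected two tab-separated fields: {line!r}")
        pairs.append((fields[0].strip(), fields[1].strip()))
    return pairs


def check_inputs(assayed_txt, de_txt, lengths_txt, go_txt):
    """Report gene IDs that are inconsistent across the four inputs."""
    assayed = set(parse_ids(assayed_txt))
    de = set(parse_ids(de_txt))
    with_length = {gene for gene, _ in parse_pairs(lengths_txt)}
    with_go = {gene for gene, _ in parse_pairs(go_txt)}
    return {
        # DE genes should be a subset of the assayed genes.
        "de_not_assayed": sorted(de - assayed),
        # Every assayed gene needs a length for the length-bias correction.
        "assayed_without_length": sorted(assayed - with_length),
        # Informational only: genes without annotation are not errors.
        "assayed_without_go": sorted(assayed - with_go),
    }


# Toy file contents (hypothetical IDs and values).
report = check_inputs(
    "g1\ng2\ng3\n",
    "g1\ng4\n",
    "g1\t1000\ng2\t500\n",
    "g1\tGO:0008150\ng3\tGO:0008150\n",
)
for key, genes in report.items():
    print(key, genes)
```

To check real files, read each one with `open(path).read()` and pass the contents to `check_inputs`.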
To run the PATHSeq analysis, the steps are the same; only the corresponding files change. Specifically, you must replace the gos_homo_sapiens.txt file in the GO terms/categories section with the maps_homo_sapiens.txt file.
When you are ready to perform the GO enrichment, go to the Step-by-Step menu path Tophat/Hisat2 & Cufflinks → GOseq → GOseq, and follow the instructions in Video 8.
Video 8. GO enrichment analysis with GOSeq from results obtained from differential expression analyses with Cufflinks.
Expected results from GOSeq and PATHSeq analysis: When GOSeq is complete, you will receive the results of the differential expression sample with the reads mapped against the reference genome. The expected results of this step are available at the following links: GOSeq and PATHSeq. Remember that you can check whether the job completed successfully in the job tracking panel of RNASeq. To learn more about GOSeq and its outputs, see https://bioconductor.org/packages/release/bioc/html/goseq.html