VariantSeq: tutorial for usage with case study.

1.2 - Tutorial material and case study


    In this tutorial, we will use data from a case study of variant analysis published by Trilla-Fuertes et al. (2020), based on whole-exome NGS data sequenced from samples of human anal cancer. The tutorial material consists of five formalin-fixed, paraffin-embedded (FFPE) samples from patients diagnosed with localized anal squamous cell carcinoma (ASCC). These samples were analyzed by whole-exome sequencing with paired-end Illumina reads (NextSeq 500). The sample names are provided in Table 1.

    NGS data: five FASTQ files with the following SRA accessions (Table 1):


    Table 1: Exome samples

    SRA accession    Library name
    SRR10164002      CAN2
    SRR10163991      CAN3
    SRR10163980      CAN4
    SRR10163969      CAN5
    SRR10163960      CAN12


    The five FASTQ files can be downloaded from NCBI at https://www.ncbi.nlm.nih.gov/bioproject/PRJNA573670. If you have questions about downloading this material, ask for support in our forum at https://forum.biotechvana.com.
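
    As a rough sketch of how the download could be scripted, the snippet below builds one `fasterq-dump` command per accession in Table 1. It assumes the SRA Toolkit is installed and on your PATH; the output directory name `fastq` is a hypothetical choice, and the commands are only printed, not executed.

```python
# Accessions and library names copied from Table 1.
SAMPLES = {
    "SRR10164002": "CAN2",
    "SRR10163991": "CAN3",
    "SRR10163980": "CAN4",
    "SRR10163969": "CAN5",
    "SRR10163960": "CAN12",
}


def build_download_commands(samples, outdir="fastq"):
    """Return one fasterq-dump command (as an argument list) per accession."""
    commands = []
    for accession in samples:
        # --split-files writes the two mates of a paired-end run to separate files.
        commands.append(
            ["fasterq-dump", "--split-files", "--outdir", outdir, accession]
        )
    return commands


if __name__ == "__main__":
    for cmd in build_download_commands(SAMPLES):
        # Printed only; to execute, pass each list to subprocess.run(cmd, check=True).
        print(" ".join(cmd))
```

    Keeping the accession-to-library mapping in one dictionary makes it easy to rename the downloaded files to CAN2, CAN3, and so on afterwards.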

    RefSeq material

    In this tutorial, we use the GATK Resource Bundle, which is based on the hg19 release of the human genome, as the source of RefSeq material. As training material, we use an interval file based on the seqCap VCRome V2 design for the human exome. The interval file, seqCap_VCRome_V2_intervals_list.intervals, can be downloaded from the tutorial page.
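
    To get a feel for what an interval file describes, the sketch below parses GATK-style interval strings of the form `contig:start-end` (1-based, both ends inclusive) and sums the targeted bases. That the tutorial's interval file uses exactly this line format is an assumption, and the two example intervals are made up for illustration.

```python
import re

# GATK-style interval: contig:start-end, 1-based, both ends inclusive.
INTERVAL_RE = re.compile(r"^(?P<contig>[^:\s]+):(?P<start>\d+)-(?P<end>\d+)$")


def parse_intervals(lines):
    """Parse 'contig:start-end' lines into (contig, start, end) tuples."""
    intervals = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        match = INTERVAL_RE.match(line)
        if match is None:
            raise ValueError(f"Unrecognized interval line: {line!r}")
        start, end = int(match["start"]), int(match["end"])
        if start > end:
            raise ValueError(f"Start is greater than end: {line!r}")
        intervals.append((match["contig"], start, end))
    return intervals


def total_target_bases(intervals):
    """Sum interval lengths; both ends are inclusive, hence the +1."""
    return sum(end - start + 1 for _, start, end in intervals)


if __name__ == "__main__":
    example = ["chr1:100-200", "chr2:50-59"]  # hypothetical intervals
    parsed = parse_intervals(example)
    print(parsed)
    print(total_target_bases(parsed))
```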

    For the cancer variant analysis, you need a panel of normals (PON) to filter out possible germline variants. The PON files, HPON.vcf and HPON.vcf.idx, can be downloaded from the tutorial page.

    This PON was created from 11 human Iberian exome samples of the Spanish population, sequenced with Illumina technology (Illumina HiSeq 20) and provided by the 1000 Genomes Project (1000 Genomes whole-exome sequencing of the IBS population). The 11 samples can be downloaded from the NCBI SRA archive (http://www.ncbi.nlm.nih.gov/sra/) under the following accessions: SRR768531, SRR768530, SRR768529, SRR766062, SRR766027, SRR766011, SRR766005, SRR765982, SRR765992, SRR764760, SRR764761.

    - Reference genome

    The FASTQ libraries must be mapped to a reference genome (ucsc.hg19.fasta), which serves as the RefSeq sequence. The additional .dict and .fai files are the sequence dictionary and the index, respectively, associated with that sequence. You need the following files from the hg19 release:

    • ucsc.hg19.dict.gz
    • ucsc.hg19.fasta.fai.gz
    • ucsc.hg19.fasta.gz

    - Training sets and known site files

    For this experiment, you need the following training material from the hg19 release:

    • dbsnp_138.hg19.vcf.gz
    • dbsnp_138.hg19.vcf.idx.gz
    • hapmap_3.3.hg19.sites.vcf.gz
    • hapmap_3.3.hg19.sites.vcf.idx.gz
    • 1000G_phase1.snps.high_confidence.hg19.sites.vcf.gz
    • 1000G_phase1.snps.high_confidence.hg19.sites.vcf.idx.gz
    • Mills_and_1000G_gold_standard.indels.hg19.sites.vcf.gz
    • Mills_and_1000G_gold_standard.indels.hg19.sites.vcf.idx.gz
    • af-only-gnomad.raw.sites.hg19.vcf
    • af-only-gnomad.raw.sites.hg19.vcf.idx

    The material listed above (reference genome and training sets) can be downloaded from the Broad Institute's GATK resource bundle, described at https://gatk.broadinstitute.org/hc/en-us/articles/360035890811-Resource-bundle. All of the files come from the hg19 release folder.
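
    After downloading, it can help to confirm that nothing from the lists above is missing. The following is a minimal sketch that checks a local folder against those file names; the folder name `bundle` is a hypothetical location, not something the tutorial prescribes.

```python
from pathlib import Path

# File names copied from the reference genome and training set lists above.
REFERENCE_FILES = [
    "ucsc.hg19.dict.gz",
    "ucsc.hg19.fasta.fai.gz",
    "ucsc.hg19.fasta.gz",
]
TRAINING_FILES = [
    "dbsnp_138.hg19.vcf.gz",
    "dbsnp_138.hg19.vcf.idx.gz",
    "hapmap_3.3.hg19.sites.vcf.gz",
    "hapmap_3.3.hg19.sites.vcf.idx.gz",
    "1000G_phase1.snps.high_confidence.hg19.sites.vcf.gz",
    "1000G_phase1.snps.high_confidence.hg19.sites.vcf.idx.gz",
    "Mills_and_1000G_gold_standard.indels.hg19.sites.vcf.gz",
    "Mills_and_1000G_gold_standard.indels.hg19.sites.vcf.idx.gz",
    "af-only-gnomad.raw.sites.hg19.vcf",
    "af-only-gnomad.raw.sites.hg19.vcf.idx",
]


def find_missing(directory, expected):
    """Return the expected file names that are not present in `directory`."""
    directory = Path(directory)
    return [name for name in expected if not (directory / name).is_file()]


if __name__ == "__main__":
    # "bundle" is a hypothetical download folder.
    missing = find_missing("bundle", REFERENCE_FILES + TRAINING_FILES)
    if missing:
        print("Missing files:")
        for name in missing:
            print("  " + name)
    else:
        print("All expected files are present.")
```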

    The training sets and known-site resources are files containing lists of variants that machine-learning algorithms use to model the properties of true variation versus artifacts. They are required in several steps of the SPMI protocol to help the caller distinguish true variants from false positives. For more details, see the GATK documentation at https://gatk.broadinstitute.org/hc/en-us/articles/360035890831-Known-variants-Training-resources-Truth-sets
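
    As one illustration of how known-site files are handed to a tool, the sketch below assembles a GATK4 `BaseRecalibrator` command line. Whether VariantSeq runs exactly this command is not stated here, so treat it as an assumption; the BAM and output table names are hypothetical, and the resource files are assumed to have been decompressed after download.

```python
def build_bqsr_command(bam, reference, known_sites, out_table):
    """Assemble a GATK4 BaseRecalibrator call with one --known-sites per file."""
    cmd = ["gatk", "BaseRecalibrator", "-I", bam, "-R", reference]
    for vcf in known_sites:
        cmd += ["--known-sites", vcf]
    cmd += ["-O", out_table]
    return cmd


if __name__ == "__main__":
    cmd = build_bqsr_command(
        bam="CAN2.sorted.bam",            # hypothetical input BAM
        reference="ucsc.hg19.fasta",      # assumes the bundle file was decompressed
        known_sites=[
            "dbsnp_138.hg19.vcf",
            "Mills_and_1000G_gold_standard.indels.hg19.sites.vcf",
        ],
        out_table="CAN2.recal.table",     # hypothetical output name
    )
    # Printed only; to execute, pass the list to subprocess.run(cmd, check=True).
    print(" ".join(cmd))
```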

    This platform is part of: IVACE PROJECT IMDIGB/2020/56



    Biotechvana © 2021