Changes between Version 3 and Version 4 of GoNL_Immunochip_Data_Preparation


Timestamp:
Apr 21, 2011 4:39:30 PM (14 years ago)
Author:
laurent
Comment:

--

  • GoNL_Immunochip_Data_Preparation

    v3 v4  
    11= GoNL Immunochip Data Preparation for Concordance =
    2 
    32[[TOC]]
    43
    54This page describes the steps needed to produce an hg19 VCF file containing the GoNL Immunochip data from the raw/QC'ed Immunochip data in PED format. The procedure uses tools available in early 2011 and should become much simpler once PLINK/Seq is released.
    65
    7 Here, the procedure is shown for a FORWARD strand PED file. If you have a TOP/TOP PED file, you will still need to correct for strand. 
     6Here, the procedure is shown for a FORWARD strand PED file. If you have a TOP/TOP PED file, you will still need to correct for strand.
    87
    9 = PED to VCF=
     8= PED to VCF =
    109The following steps explain how to produce a VCF file from PLINK ped files. It is a rather cumbersome process at the moment and should be streamlined when PLINK/Seq is released.
    1110
     
    3534=== [Optional] Flip Strand ===
    3635A small script, ''flip-vcf-snp.pl'', is available in case you have the following:
    37 * A VCF file coming from a TOP/TOP PLINK file set
    38 * A BIM file corresponding to the same dataset but in forward strand
     36
     37 * A VCF file coming from a TOP/TOP PLINK file set
     38 * A BIM file corresponding to the same dataset but in forward strand
     39
    3940The script can be used to flip the strand according to the BIM file.
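
The logic of such a strand flip can be sketched in Python. This is only an illustration and not the actual ''flip-vcf-snp.pl'' implementation, whose internals are not described here; the function names `complement` and `orient_alleles` are hypothetical. It assumes single-base alleles and that the BIM file uses "0" for a missing allele.

```python
# Hypothetical sketch: orient VCF alleles to match the forward-strand BIM alleles.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(allele):
    """Return the complementary base(s) of an allele (A<->T, C<->G)."""
    return allele.translate(COMPLEMENT)


def orient_alleles(vcf_alleles, bim_alleles):
    """Return the VCF alleles as-is or strand-flipped so they agree with the BIM.

    Returns None when neither orientation is consistent with the BIM alleles.
    """
    vcf_set = set(vcf_alleles)
    bim_set = set(bim_alleles) - {"0"}  # assumption: "0" marks a missing allele
    flipped = tuple(complement(a) for a in vcf_alleles)
    if bim_set <= vcf_set:
        return tuple(vcf_alleles)  # already on the same strand as the BIM
    if bim_set <= set(flipped):
        return flipped  # complementing the alleles matches the BIM
    return None  # alleles disagree on both strands
```

Note that A/T and C/G SNPs look identical on both strands, so a comparison of alleles alone cannot tell whether they need flipping; in this sketch they are returned unchanged.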
    4041
    4142= Liftover file =
    4243The last step in preparing the Immunochip data for comparison with the sequence data is to lift the VCF file over to the same human genome reference build as the sequence data, so that the two can be compared. Here is how:
    43 # Get the appropriate chain files
    44 #* From the [ftp://gsapubftp-anonymous@ftp.broadinstitute.org Broad GSA ftp] (password is blank)
    45 #* From the [http://hgdownload.cse.ucsc.edu/downloads.html#human UCSC Genome Browser]
    46 # Get the appropriate fasta files (you'll need both from-and-to builds fasta files)
    47 #* From [ftp://ftp.ncbi.nlm.nih.gov/genomes/H_sapiens/ NCBI]
    48 #* From [http://hgdownload.cse.ucsc.edu/downloads.html#human UCSC Genome Browser]
    49 # Index the fasta files appropriately to get .fai (samtools) and .dict files (Picard) as described on the [http://www.broadinstitute.org/gsa/wiki/index.php/Preparing_the_essential_GATK_input_files:_the_reference_genome GATK wiki]
    50 # Get and run the [http://www.broadinstitute.org/gsa/wiki/index.php/LiftOverVCF.pl GATK LiftOver tool]
    5144
    52 Note that once the liftover VCF has successfully been created, it can be used to liftover the PLINK files. To do so:
    53 # Remove all SNPs that are not present in the new reference VCF file (using plink --extract)
    54 # Use the liftover VCF as an input to the ''liftover-bim.pl'' tool.
     45 1. Get the appropriate chain files
     46   * From the [ftp://gsapubftp-anonymous@ftp.broadinstitute.org Broad GSA ftp] (password is blank)
     47   * From the [http://hgdownload.cse.ucsc.edu/downloads.html#human UCSC Genome Browser]
     48 1. Get the appropriate fasta files (you'll need both from-and-to builds fasta files)
     49   * From [ftp://ftp.ncbi.nlm.nih.gov/genomes/H_sapiens/ NCBI]
     50   * From [http://hgdownload.cse.ucsc.edu/downloads.html#human UCSC Genome Browser]
     51 1. Index the fasta files appropriately to get .fai (samtools) and .dict files (Picard) as described on the [http://www.broadinstitute.org/gsa/wiki/index.php/Preparing_the_essential_GATK_input_files:_the_reference_genome GATK wiki]
     52 1. Get and run the [http://www.broadinstitute.org/gsa/wiki/index.php/LiftOverVCF.pl GATK LiftOver tool]
     53
     54Note that once the liftover VCF has been created successfully, it can be used to lift over the PLINK files. To do so:
     55
     56 1. Remove all SNPs that are not present in the new reference VCF file (using plink --extract)
     57 1. Use the liftover VCF as an input to the ''liftover-bim.pl'' tool.
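
The SNP list needed for the plink --extract step can be derived from the liftover VCF itself. The following is a minimal sketch, assuming a tab-delimited VCF whose third column (ID) holds the same SNP names as the PLINK files; the function names and file names are hypothetical.

```python
def vcf_snp_ids(lines):
    """Collect the SNP IDs (third VCF column) from an iterable of VCF lines."""
    ids = []
    for line in lines:
        # Skip meta-information lines, the column header line and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        snp_id = fields[2]
        if snp_id != ".":  # "." means the record has no ID
            ids.append(snp_id)
    return ids


def write_extract_list(vcf_path, out_path):
    """Write one SNP ID per line, the format read by plink --extract."""
    with open(vcf_path) as vcf, open(out_path, "w") as out:
        for snp_id in vcf_snp_ids(vcf):
            out.write(snp_id + "\n")
```

After calling, for example, `write_extract_list("immunochip.hg19.vcf", "keep_snps.txt")`, the list could be passed to PLINK along the lines of `plink --file immunochip --extract keep_snps.txt --make-bed --out immunochip_kept` (file names are placeholders).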