Test imputation pipeline

Version 4 (modified by a.kanterakis, 13 years ago)

--

Introduction

The purpose of this run is to test the efficiency of the existing imputation pipelines on the Grid.

Datasets

The reference dataset was created from the raw VCF data of the 1000 Genomes Project.

  • Download the VCF files from: ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20110521
  • Export only the SNPs (filtering out the indels and SVs) from the VCF data using vcftools:
    vcftools \
    --gzvcf ALL.chr1.phase1_release_v2.20101123.snps_indels_svs.vcf.gz \
    --keep-INFO LCSNP --keep-INFO EXSNP --keep-INFO SNP \
    --IMPUTE \
    --out ALL.chr1.phase1_release_v2.20101123.snps_indels_svs.
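
The command above covers chromosome 1 only. A minimal sketch of repeating the same export for chromosomes 1 to 22 could look like the following. It assumes that the files for the other chromosomes follow the chr1 naming pattern, which this page does not confirm. The helper names vcf_name and export_chr and the DRY_RUN switch are hypothetical, and the output prefix here drops the trailing period. By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch: repeat the chr1 SNP export for chromosomes 1-22.
# Assumption: every chromosome file is named like the chr1 file.

# Build the input file name for a given chromosome (hypothetical helper).
vcf_name() {
    echo "ALL.chr$1.phase1_release_v2.20101123.snps_indels_svs.vcf.gz"
}

# Print the vcftools command for one chromosome; run it only if DRY_RUN=0.
export_chr() {
    vcf=$(vcf_name "$1")
    prefix=${vcf%.vcf.gz}   # output prefix: input name without .vcf.gz
    set -- vcftools --gzvcf "$vcf" \
        --keep-INFO LCSNP --keep-INFO EXSNP --keep-INFO SNP \
        --IMPUTE --out "$prefix"
    echo "$@"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

chr=1
while [ "$chr" -le 22 ]; do
    export_chr "$chr"
    chr=$((chr + 1))
done
```

Running it with the default dry-run setting prints one vcftools command per chromosome, which makes it easy to check the file names before starting the real export.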
    

The study panel is an artificial genotype dataset. Created by