Changes between Version 40 and Version 41 of ImputationPipeline

Timestamp: Dec 6, 2011 1:03:11 PM
Author: a.kanterakis
Comment: --

ImputationPipeline

-                      v40
+                      v41
  * chromosome : The chromosome of this study
  * r2_threshold : The R2 threshold
+=== Statistics_of_imputation_results ===
+ * Location: http://www.bbmriwiki.nl/svn/Imputation/alex/scripts/Statistics_of_imputation_results.ftl
+Computes several statistics of imputation results. This is suitable when we have "real" genotype data to benchmark our imputation pipeline. The computed statistics are:
+ * Allelic R2 : As defined in http://www.sciencedirect.com/science/article/pii/S0002929709000123#sec2.7.2
+ * Real_Allelic_R2 : Computes the R2 (or coefficient of determination) between a real and an imputed genotype.
+ * Imputation_Allele_Frequency and Standardized_allele_frequency_error : (From: http://www.sciencedirect.com/science/article/pii/S0002929709000123) Allele-frequency error is the difference between the true allele frequency in the sample and the estimated allele frequency in the sample, where the estimate is computed from the posterior genotype probabilities. If the three posterior genotype probabilities for an individual are denoted pAA, pAB, and pBB, then the estimated A allele frequency is found by summing (2pAA + pAB) over all individuals and dividing by twice the number of individuals. However, allele-frequency error is difficult to interpret unless the true allele frequency and sample size are known. The standardized allele-frequency error therefore divides the absolute error by the standard deviation of the sample allele frequency: abs(p - q) / sqrt((p * (1 - p)) / (2 * n)), where p is the allele frequency in the sample of n individuals from a population in Hardy-Weinberg equilibrium and q is the estimated allele frequency obtained from the imputed posterior genotype probabilities.
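The statistics described above can be sketched in a few lines of Python. This is only an illustration of the formulas, not the code in Statistics_of_imputation_results.ftl: the function names, the input layout (one (pAA, pAB, pBB) tuple per individual) and the reading of Real_Allelic_R2 as a squared Pearson correlation between real allele counts and imputed dosages are all assumptions made for this example.

```python
import math


def estimated_allele_frequency(posteriors):
    """Estimated A allele frequency from posterior genotype probabilities.

    posteriors: list of (pAA, pAB, pBB) tuples, one per individual
    (hypothetical input layout chosen for this sketch).
    """
    n = len(posteriors)
    if n == 0:
        raise ValueError("at least one individual is required")
    # Sum (2*pAA + pAB) over individuals, divide by twice their number.
    return sum(2.0 * p_aa + p_ab for p_aa, p_ab, _ in posteriors) / (2.0 * n)


def standardized_allele_frequency_error(p, q, n):
    """abs(p - q) / sqrt(p * (1 - p) / (2 * n)).

    p: true allele frequency in the sample, q: estimated allele frequency,
    n: number of individuals.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return abs(p - q) / math.sqrt(p * (1.0 - p) / (2.0 * n))


def squared_correlation(real, imputed):
    """Squared Pearson correlation between real and imputed allele dosages.

    Assumption: this is one common way to compute an R2 between real and
    imputed genotypes; the script itself may define it differently.
    """
    if len(real) != len(imputed) or len(real) < 2:
        raise ValueError("need two equally long sequences of length >= 2")
    n = len(real)
    mean_r = sum(real) / n
    mean_i = sum(imputed) / n
    cov = sum((r - mean_r) * (i - mean_i) for r, i in zip(real, imputed))
    var_r = sum((r - mean_r) ** 2 for r in real)
    var_i = sum((i - mean_i) ** 2 for i in imputed)
    if var_r == 0.0 or var_i == 0.0:
        return float("nan")  # undefined for a monomorphic marker
    return (cov * cov) / (var_r * var_i)


if __name__ == "__main__":
    example = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (0.5, 0.5, 0.0)]
    q = estimated_allele_frequency(example)
    print(q, standardized_allele_frequency_error(0.5, q, len(example)))
```

The standardized error is undefined when p is 0 or 1, so the sketch rejects those values rather than dividing by zero.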
+[[BR]]
+Options:
+ * input_beagle_dosage_filename : The output of the Beagle imputation
+ * input_beagle_unimputed_filename : The Beagle file with the "real", un-imputed genotypes
+ * output_filename : Output filename for the stats
 == Complete pipelines ==
 == Results ==