Changes between Version 9 and Version 10 of DataConcordance
- Timestamp: Apr 21, 2011 4:31:15 PM (14 years ago)
DataConcordance
v9 v10
65 65 Below is a chart showing the shared and unique SNPs in the two datasets regardless of their genotypes. As expected, the vast majority of the SNPs are shared between the datasets, a relatively high number of SNPs are only found in Groningen (amongst them a majority of unfiltered false positives) and a small number of SNPs unique to the BGI dataset (to be investigated).
66 66
- 67 [[Image(bgi_groningen_loci_concordance.jpg)]]
+ 67 [[Image(bgi.snps.comparison.jpg)]]
68 68
69 69 After investigation, the three least concordant individuals encountered a problem while processing one of their lanes, thus leading to 2/3 of the normal coverage. The figures should be updated when the lanes have been processed and these individuals corrected.
… …
72 72 The following chart shows the genotype concordance on the shared SNPs between BGI and Groningen datasets.
73 73
- 74 [[Image(bgi_groningen_concordance.jpg)]]
+ 74 [[Image(pilot.bgi.nosex.concordance.jpg)]]
75 75
76 76 Note: The chart above does not take sex chromosomes into account as an artifact introduced by the way the Y-chrom was mapped by BGI was showing all males as completely discordant over the sex chromosomes.
… …
90 90 The following chart shows the genotype concordance on the 165K Immunochip loci left after QC.
91 91
- 92 [[Image(groningen_immunochip_concordance.jpg)]]
+ 92 [[Image(pilot.immuno_seq.concordance.v2.jpg)]]
93 93
94 94 The 5 least concordant individuals can be explained as follow:
… …
101 101 The graph below shows a preliminary analysis of the "types" of discordance observed. An important caveat has to be taken into account: VCFTools only reports sites where the alleles perfectly match. This means that all monomorphic sites in one dataset that are polymorphic in the other will not appear. This was especially problematic since we compared each sequenced sample separately against the whole Immunochip dataset. As a result almost all homozygous reference sites in the sequence data were not reported by VCFTools. All the discordant sites that did not have perfectly matching alleles are reported below as 'unknown' as it has yet to be investigated what discordance "type" they belong to.
102 102
- 103 [[Image(groningen_immunochip_discordance_matrix.jpg)]
+ 103 [[Image(pilot.immuno.seq.gen.concordance.test.jpg)]
104 104
105 105 == BGI / Immunochip ==
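The allele-matching caveat described at line 101 of the page can be illustrated with a toy sketch. The Python below is not VCFTools code and makes no claim about how VCFTools is implemented. The function name `classify_shared_sites`, the dictionary layout, and the example sites are all hypothetical and chosen only for illustration. It shows why a site that is monomorphic in one call set but polymorphic in the other ends up in the 'unknown' bucket when alleles are required to match exactly.

```python
def classify_shared_sites(seq_calls, chip_calls):
    """Classify sites present in both call sets.

    Each argument maps (chrom, pos) -> (ref, alt, genotype).
    This layout is an assumption made for the sketch, not a real file format.
    """
    counts = {"concordant": 0, "discordant": 0, "unknown": 0}
    # Only loci present in both call sets can be compared at all.
    for site in seq_calls.keys() & chip_calls.keys():
        seq_ref, seq_alt, seq_gt = seq_calls[site]
        chip_ref, chip_alt, chip_gt = chip_calls[site]
        if (seq_ref, seq_alt) != (chip_ref, chip_alt):
            # Alleles do not match exactly, e.g. the site is monomorphic
            # (alt ".") in one call set and polymorphic in the other.
            counts["unknown"] += 1
        elif seq_gt == chip_gt:
            counts["concordant"] += 1
        else:
            counts["discordant"] += 1
    return counts


# Hypothetical example data, not taken from the BGI, Groningen or Immunochip sets.
seq = {
    ("1", 100): ("A", "G", "0/1"),
    ("1", 200): ("C", ".", "0/0"),  # homozygous reference, no alt allele called
    ("1", 300): ("T", "C", "1/1"),
    ("1", 400): ("G", "A", "0/1"),  # only in the sequence data
}
chip = {
    ("1", 100): ("A", "G", "0/1"),
    ("1", 200): ("C", "T", "0/1"),
    ("1", 300): ("T", "C", "0/1"),
    ("1", 500): ("A", "C", "0/0"),  # only on the chip
}

print(classify_shared_sites(seq, chip))
```

In this toy data the site at position 200 is homozygous reference in the sequence call set, so its alleles cannot match the chip record and it is counted as 'unknown' rather than as a specific discordance type.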