Version 1 (modified by Yurii Aulchenko, 14 years ago)

Idea of Mendelian QC

In trios, Mendelian errors can only be deduced from incompatibility between the genotypes of the parents and the child.

Allowed and prohibited genotypic configurations can be summarized in the table:

          AA           AB           BB           missing
AA        (AA,oo,oo)   (AA,AB,oo)   (oo,AB,oo)   (AA,AB,oo)
AB        (AA,AB,oo)   (AA,AB,BB)   (oo,AB,BB)   (AA,AB,BB)
BB        (oo,AB,oo)   (oo,AB,BB)   (oo,oo,BB)   (oo,AB,BB)
missing   (AA,AB,oo)   (AA,AB,BB)   (oo,AB,BB)   (AA,AB,BB)

where the column and row margins show the genotypes of the two parents (AA, AB, BB, or 'missing'), and each cell lists, in the order (AA, AB, BB), which offspring genotypes are possible; 'oo' marks an impossible genotype.

The table above may serve as the starting point for Mendelian QC.
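
As an illustration, the table can be derived from the rule that each parent transmits one allele. The following is a minimal Python sketch, not part of any existing pipeline; the function names (`gametes`, `allowed_offspring`, `table_cell`, `is_mendelian_error`) are hypothetical, and a missing parent is assumed to be able to transmit either allele.

```python
from itertools import product

GENOTYPES = ("AA", "AB", "BB")
MISSING = "missing"


def gametes(genotype):
    """Alleles a parent can transmit; a missing parent may transmit either."""
    if genotype == MISSING:
        return {"A", "B"}
    return set(genotype)


def allowed_offspring(parent1, parent2):
    """Offspring genotypes compatible with the two parental genotypes."""
    return {
        "".join(sorted(a + b))
        for a, b in product(gametes(parent1), gametes(parent2))
    }


def table_cell(parent1, parent2):
    """Format one cell in (AA,AB,BB) order, with 'oo' for impossible genotypes."""
    allowed = allowed_offspring(parent1, parent2)
    return "(" + ",".join(g if g in allowed else "oo" for g in GENOTYPES) + ")"


def is_mendelian_error(parent1, parent2, child):
    """True if the child's call is impossible; a missing child is never an error."""
    if child == MISSING:
        return False
    return child not in allowed_offspring(parent1, parent2)


if __name__ == "__main__":
    # Print the full table, one row per genotype of the first parent.
    for p1 in GENOTYPES + (MISSING,):
        print(p1, *(table_cell(p1, p2) for p2 in GENOTYPES + (MISSING,)))
```

With these definitions, `is_mendelian_error("AA", "AA", "AB")` flags the trio, while any trio whose child genotype is missing is never flagged.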

In classical Mendelian checks, it is not possible to tell which person is more likely to have been mistyped, so the genotypes of all pedigree members are set to missing. With sequencing data, coverage and quality information is available, so it is both natural and possible to estimate the chance that a particular person's genotype is wrong and to set only that genotype to missing.
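
One possible way to pick the suspect genotype is sketched below. It is only an illustration under stated assumptions: each trio member is assumed to come with per-genotype likelihoods P(data | genotype) (hypothetical inputs here, e.g. derived from depth and base quality), all Mendelian-consistent configurations are given a flat prior, and the members whose call differs from the most likely consistent configuration are reported as suspects. The names `best_consistent_trio` and `suspect_members` are invented for this sketch.

```python
from itertools import product

GENOTYPES = ("AA", "AB", "BB")
MEMBERS = ("father", "mother", "child")


def offspring(parent1, parent2):
    """Offspring genotypes possible for two fully typed parents."""
    return {"".join(sorted(a + b)) for a in set(parent1) for b in set(parent2)}


def best_consistent_trio(likelihoods):
    """Most likely Mendelian-consistent trio under a flat prior.

    likelihoods: dict member -> dict genotype -> P(data | genotype).
    """
    best, best_score = None, -1.0
    for father, mother, child in product(GENOTYPES, repeat=3):
        if child not in offspring(father, mother):
            continue  # skip Mendelian-impossible configurations
        score = (likelihoods["father"][father]
                 * likelihoods["mother"][mother]
                 * likelihoods["child"][child])
        if score > best_score:
            best = {"father": father, "mother": mother, "child": child}
            best_score = score
    return best


def suspect_members(calls, likelihoods):
    """Members whose call disagrees with the best consistent configuration."""
    best = best_consistent_trio(likelihoods)
    return [m for m in MEMBERS if calls[m] != best[m]]


if __name__ == "__main__":
    # Hypothetical trio: both parents called AA, child called AB.
    calls = {"father": "AA", "mother": "AA", "child": "AB"}
    likelihoods = {
        "father": {"AA": 0.98, "AB": 0.02, "BB": 0.00},  # confident call
        "mother": {"AA": 0.55, "AB": 0.45, "BB": 0.00},  # weak call
        "child":  {"AA": 0.01, "AB": 0.98, "BB": 0.01},  # confident call
    }
    print(suspect_members(calls, likelihoods))
```

In this made-up example only the mother's weakly supported AA call is flagged, so only her genotype would be set to missing. A real implementation would likely need allele-frequency priors and a threshold on how much better the best configuration must be.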

These considerations suggest that Mendelian checks could be incorporated further into the TrioAwareVariantDiscoveryPipeline by addressing questions such as:

  • when one of the parental genotypes is missing, can we infer his or her genotype? Clearly, if the child is AB, one of the parents is AA, and the other is missing, then the other parent must carry the 'B' allele
  • for some 'Mendelian impossible' configurations, e.g. if both parents are AA and the child is AB, could it be, given the depth and phasing info, that one of the parents is actually AB?
  • ...
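
The first question above can be sketched in a few lines of Python. This is a hedged illustration only: the helper names (`possible_missing_parent`, `obligate_alleles`) are hypothetical, genotype calls of the known parent and the child are assumed to be correct, and no likelihood information is used.

```python
GENOTYPES = ("AA", "AB", "BB")


def offspring(parent1, parent2):
    """Offspring genotypes possible for two fully typed parents."""
    return {"".join(sorted(a + b)) for a in set(parent1) for b in set(parent2)}


def possible_missing_parent(known_parent, child):
    """Genotypes of the untyped parent that are compatible with the child."""
    return [g for g in GENOTYPES if child in offspring(known_parent, g)]


def obligate_alleles(known_parent, child):
    """Alleles the untyped parent must carry under every compatible genotype."""
    candidates = possible_missing_parent(known_parent, child)
    if not candidates:
        return set()  # the known parent and child are already incompatible
    return set.intersection(*(set(g) for g in candidates))


if __name__ == "__main__":
    print(possible_missing_parent("AA", "AB"))  # ['AB', 'BB']
    print(obligate_alleles("AA", "AB"))         # {'B'}
```

An empty list from `possible_missing_parent` means the child is incompatible with the known parent alone, which is itself a Mendelian error regardless of the missing parent.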