
Changes between Version 60 and Version 61 of SnpCallingPipeline

Timestamp: Jan 24, 2011 5:51:39 PM
Author: laurent
Comment: --

SnpCallingPipeline

-                      v60
+                      v61
 The important quality-control values currently under discussion, along with their thresholds, are the following:
 * RawData
-** FastQC report (per mate of the pair)
-*** Inspect the files manually and check:
-**** Average quality per read > 30
-**** Number of sequences ~60 million
-**** Sequence quality should look OK
+        * FastQC report (per mate of the pair)
+                * Inspect the files manually and check:
+                        * Average quality per read > 30
+                        * Number of sequences ~60 million
+                        * Sequence quality should look OK
 * Alignment (per lane)
-** Picard Alignment Summary Metrics
-*** % Purified reads aligned > 90%
-*** Purified High Quality Error Rate < 1%
-*** Purified reads aligned > 150 million
-** Picard GC Bias Metrics
-*** GC curve should look OK
-*** Median GC% windows between 30 and 40
-*** Average mean base quality should be OK
-** Picard Insert Size Metrics
-*** Peak should be ~500
-*** Peak should be narrow
-*** Should have few outliers
-** Picard BAM Index Stats
-*** Should be uniform by chromosome
-** GATK or Picard (currently testing) Coverage Metrics
-*** Should correspond to a Poisson curve with peak at 12x
-** Picard Mark Duplicates
-*** % duplicates between 5% and 8%
+        * Picard Alignment Summary Metrics
+                * % Purified reads aligned > 90%
+                * Purified High Quality Error Rate < 1%
+                * Purified reads aligned > 150 million
+        * Picard GC Bias Metrics
+                * GC curve should look OK
+                * Median GC% windows between 30 and 40
+                * Average mean base quality should be OK
+        * Picard Insert Size Metrics
+                * Peak should be ~500
+                * Peak should be narrow
+                * Should have few outliers
+        * Picard BAM Index Stats
+                * Should be uniform by chromosome
+        * GATK or Picard (currently testing) Coverage Metrics
+                * Should correspond to a Poisson curve with peak at 12x
+        * Picard Mark Duplicates
+                * % duplicates between 5% and 8%
 * Recalibration
-** GATK AnalyzeCovariates
-*** No output currently; revisit once it is working
-** Picard Quality by Cycle
-*** To be determined once data is produced
-** Picard Quality Distribution
-*** To be determined once data is produced
+        * GATK AnalyzeCovariates
+                * No output currently; revisit once it is working
+        * Picard Quality by Cycle
+                * To be determined once data is produced
+        * Picard Quality Distribution
+                * To be determined once data is produced
 * Initial SNP Calling
-** To be determined once data is produced and analyzed. An initial basis should be derived from the difference between chip data and sequence data, and from the % of SNPs found in dbSNP.
+        * To be determined once data is produced and analyzed. An initial basis should be derived from the difference between chip data and sequence data, and from the % of SNPs found in dbSNP.
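
The numeric thresholds listed above could be checked automatically once the metrics have been collected. The following is only a minimal sketch, not part of the pipeline: the metric names, the `THRESHOLDS` table, and `check_metrics` are all hypothetical, and every bound is treated as exclusive, which is an assumption (the page does not say whether "between 5% and 8%" includes the endpoints). Checks described only as "should look OK" still require manual inspection and are left out.

```python
# Hypothetical metric names; the bounds are the thresholds listed on this page.
# Each entry is (lower bound, upper bound). Bounds are treated as exclusive
# (an assumption); None means there is no bound on that side.
THRESHOLDS = {
    "avg_quality_per_read": (30, None),   # FastQC: > 30
    "pct_reads_aligned": (90, None),      # Alignment summary: > 90%
    "hq_error_rate_pct": (None, 1),       # Alignment summary: < 1%
    "reads_aligned": (150e6, None),       # Alignment summary: > 150 million
    "median_gc_pct": (30, 40),            # GC bias: between 30 and 40
    "pct_duplicates": (5, 8),             # Mark duplicates: between 5% and 8%
}


def check_metrics(metrics):
    """Return the names of metrics that are missing or outside their bounds."""
    failed = []
    for name, (low, high) in THRESHOLDS.items():
        value = metrics.get(name)
        if value is None:
            failed.append(name)  # a missing value counts as a failure
        elif low is not None and value <= low:
            failed.append(name)
        elif high is not None and value >= high:
            failed.append(name)
    return failed


# Made-up example values for one lane, for illustration only.
sample = {
    "avg_quality_per_read": 34.2,
    "pct_reads_aligned": 95.1,
    "hq_error_rate_pct": 0.4,
    "reads_aligned": 162e6,
    "median_gc_pct": 36,
    "pct_duplicates": 12.0,
}
print(check_metrics(sample))  # prints ['pct_duplicates']
```

Keeping the thresholds in a single table means that values still marked "to be determined" can be added later without touching the checking logic.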