Changes between Version 1 and Version 2 of ChipBasedQcPipelineIdea
- Timestamp:
- Sep 26, 2010 7:30:57 PM (14 years ago)
ChipBasedQcPipelineIdea
v1  v2
3   3   Below, a detailed outline of the steps that should be included in ChipBasedQcPipeline is provided. This document also suggests the CipBasedQcPipelineWorkflow sequence and principal ideas for solutions. This document does not specify exact tools to be used; as most operations are data manipulations, it will be up to the involved analysts to decide which tool is more convenient for them and to formulate the CipBasedQcPipelineWorkflow. The actual implementation of the workflow should allow automatic reproduction of the results and application of the same workflow to new data. This is important not only from a good-practice point of view, but also because more data will come to the same pipeline in the future.
4   4
5       The document assumes VCF v.4 format (http://1000genomes.org/wiki/doku.php?id=1000_genomes:analysis:vcf4.0) is used; “+” strand is used in VCF. It is assumed that chip data come in ???WHAT??? format (WE NEED TO COME UP WITH STANDARD FORMAT WE CAN GET ALL CHIP DATA IN, OR A NUMBER OF FORMATS).
    5   The document assumes VcfGtDataFromat (in particular, VCF v.4 format) is used; “+” strand is used for sequencing data. It is assumed that chip data come in ChipGtDataFormat.
6   6
7   7   == CHIP-VCF BUILD AND DBSNP MATCHING TABLE ==