[[TOC]]

== Groningen cluster ==

'''People'''

UMCG: Morris, Freerk, more?

'''Description'''

To be written: code templates and automatic PBS script generation; job submission and monitoring.

== Port applications to Dutch Life Science Grid ==

'''People'''

 * AMC: Antoine van Kampen, Barbera van Schaik, Silvia D Olabarriaga, Mark Santcroos
 * Sara/BiGGrid: Tom Visser, more?
 * UMCG: Morris, Freerk

'''Description'''

The software will be implemented as workflow components, and the workflows will run on the Dutch Life Science Grid.

 * Information about the infrastructure: http://www.bioinformaticslaboratory.nl/twiki/bin/view/EBioScience/
 * Getting started: http://www.bioinformaticslaboratory.nl/twiki/bin/view/EBioScience/GettingStarted

'''Implemented workflow components at AMC'''

The workflow components listed below are already available. The list can be expanded with Pindel and (parts of) the GATK pipeline.

 * Splitting of fastq files
 * Building a BWA index on the genome sequence (base space and color space)
 * BWA for shotgun reads (base space and color space). Parameter sweeps are possible. Output is in bam format
 * Merge bam results
 * Samtools pileup
 * Varscan (pileup to snp, indel and cns)
 * Bam2coverage: creates a UCSC wiggle file to display the genome coverage (per 50kbp)
 * Coverage-per-base: determines the coverage for every base in the genome and summarizes the results (coverage versus frequency)
 * Annovar (implementation in progress): a pipeline to annotate variants (gene, dbsnp, hapmap, 1000g, conservation, etc)

'''Implemented components of the Groningen pipeline'''

''A more detailed description will follow later''

 * !BwaIllumina (''test phase'') - pe00-bwa-align-pair1.ftl, pe01-bwa-align-pair2.ftl, pe02-bwa-sampe.ftl, pe03-sam-to-bam.ftl, pe04-sam-sort.ftl
 * !MarkDuplicates (''test phase'') - pe05-mark-duplicates.ftl
 * !PicardQC (''test phase'') - pe04b-picardQC.ftl

'''To be implemented'''

 * The components of the Groningen pipeline that are not yet implemented as workflow components
 * Pindel

'''Data access rights'''

To restrict data access to the smallest possible group of people, we have created a subgroup "gvnl" within the "vlemed" Virtual Organisation (VO). To join this group, a person needs a Grid certificate and membership of the "vlemed" VO. The following page explains how to get a certificate and how to join the "vlemed" VO: http://www.bioinformaticslaboratory.nl/twiki/bin/view/EBioScience/EBioInfra#Access

For more information about data access, see http://www.bioinformaticslaboratory.nl/twiki/bin/view/EBioScience/DataManagement

'''Things to address'''

 * Available disk space on the grid storage elements / worker nodes

== Alternatives ==

=== Clusters ===

 * Groningen
 * Leiden
 * Huygens
 * Lisa
 * Philips
 * DAS

=== Grid ===

 * EBioInfra http://www.bioinformaticslaboratory.nl/twiki/bin/view/EBioScience/
 * BiGGrid Cloud http://www.cloud.sara.nl/
 * Topos https://grid.sara.nl/wiki/index.php/Using_the_Grid/ToPoS
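
== Example sketch: splitting fastq files ==

The "Splitting of fastq files" component listed under the AMC workflow components could look roughly like the sketch below. This is not the AMC implementation; it is a minimal Python illustration that assumes uncompressed FASTQ input with exactly four lines per read. The function name `split_fastq`, the `reads_per_chunk` parameter and the `.partNNNN.fastq` output naming are hypothetical and chosen only for this example.

```python
import itertools
from pathlib import Path


def split_fastq(path, reads_per_chunk, out_dir):
    """Split a FASTQ file into chunks of at most `reads_per_chunk` reads.

    Assumption: every read occupies exactly four lines (header, sequence,
    '+' separator, qualities). Multi-line FASTQ records are not handled.
    Returns the list of chunk files that were written, in input order.
    """
    if reads_per_chunk < 1:
        raise ValueError("reads_per_chunk must be at least 1")

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(path).stem  # hypothetical naming: <stem>.partNNNN.fastq

    chunk_paths = []
    with open(path) as handle:
        for index in itertools.count():
            # Read the next block of 4 * reads_per_chunk lines.
            lines = list(itertools.islice(handle, 4 * reads_per_chunk))
            if not lines:
                break  # end of input reached
            if len(lines) % 4 != 0:
                raise ValueError("truncated FASTQ record at end of file")
            chunk_path = out_dir / f"{stem}.part{index:04d}.fastq"
            chunk_path.write_text("".join(lines))
            chunk_paths.append(chunk_path)
    return chunk_paths
```

Each chunk could then be aligned independently (for example by the BWA component) and the resulting bam files combined by the "Merge bam results" component. The chunk size is a trade-off between the number of grid jobs and the disk space needed per worker node.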