Changes between Version 1 and Version 2 of DataManagement/SftpServer


Timestamp:
Aug 17, 2011 11:37:38 PM (13 years ago)
Author:
laurent
Comment:

--

  • DataManagement/SftpServer

    v1 v2  
    1 1 == UMCG SFTP (application20.target.rug.nl) ==
    2 === raw_data ===
    3 Contains all the data coming from BGI, including their variant calls.
    4 * /fastq
    5 **  A4 trio fasta files
    6 */hg18
    7 ** Pilot snp, cnv, indel, stat files sent by BGI at the beginning of 2011
    8 * /hg19
    9 ** A4 trio bam, snp, cnv files sent by BGI in April 2011
    10 2
    11 === resources ===
    12 * GoNL resources tarball (Thanks Freerk!)
     3The SFTP server can be used to access most of the data on the UMCG cluster. Because bandwidth is limited, please download only the files you need, and download the compressed versions of the files when available (usually the case for all plain text files).
    13 4
    14 === results ===
    15 Here is all the data that has gone through any kind of processing at UMCG
    16 */bam/umcg/
    17 ** A4 trio complete bam files
    18 ** pilot chromosomes 19, 20, X, Y, MT bam files
    19 * /snp/hg18
    20 ** Pilot cleaned up VCF files from the BGI on hg18 (sorted, updated to VCF4.0)
    21 * /snp/hg19
    22 ** Pilot initial unfiltered calls from UMCG
    23 ** Lifted-over files from BGI
     5=== /target/gpfs2/gcc/groups/gonl/sftp/ ===
     6Root of the SFTP server.
     7
     8=== /target/gpfs2/gcc/groups/gonl/sftp/A4 ===
     9Contains all the information about the A4 test trio, including all the raw and aligned data.
     10
     11=== /target/gpfs2/gcc/groups/gonl/sftp/BGI ===
     12Contains all the data coming from BGI, including their variant calls. The data is organized by batch in the batchX subfolders. Each of the subfolders typically contains the following:
     13* batchX/
     14** A set of compressed archives, intended for download, containing the plain text data and md5 files. These are named as follows: timestamp.BGI.batchX.data_type.hg1X.data_format.tar.bz2. All plain text data should be available as a compressed file, including but not limited to: CNV, InDel, InDel annotations, SNP, SNP annotations. Some of these are available in multiple formats; see the BGI data page for more information about the BGI data and its formats.
     15** md5 checksum files for all files.
     16* batchX/bam OR batchX/alignment
     17** The BAM files aligned by BGI
     18* batchX/CNV
     19** CNVs in CNV Detector format. To download the data for all samples, please use the compressed archive in batchX/
     20* batchX/indel
     21** InDels in samtools pileup format. To download the data for all samples, please use the compressed archive in batchX/
     22* batchX/indel_annotation
     23** InDel annotations in GFF format. To download the data for all samples, please use the compressed archive in batchX/
     24* batchX/SNP
     25** SNPs in SOAPsnp format. To download the data for all samples, please use the compressed archive in batchX/
     26* batchX/SNP_annotation
     27** SNP annotations in GFF format. To download the data for all samples, please use the compressed archive in batchX/
     28* batchX/vcf_format/CNV
     29** CNVs in VCF format. To download the data for all samples, please use the compressed archive in batchX/
     30* batchX/vcf_format/indel
     31** InDels in VCF format. To download the data for all samples, please use the compressed archive in batchX/
     32* batchX/vcf_format/SNP
     33** SNPs in VCF format. To download the data for all samples, please use the compressed archive in batchX/
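
The archive naming pattern given for batchX/ (timestamp.BGI.batchX.data_type.hg1X.data_format.tar.bz2) can be split into its fields before deciding what to download. The following is only a minimal sketch: it assumes that fields are separated by single dots and contain no dots themselves, and it treats the timestamp as an opaque string because its format is not documented here. The function name, the field names, and the example file name are hypothetical and are not taken from the server.

```python
from typing import NamedTuple, Optional


class BgiArchiveName(NamedTuple):
    """Fields of a name shaped like
    timestamp.BGI.batchX.data_type.hg1X.data_format.tar.bz2"""
    timestamp: str      # kept as-is; the timestamp format is an assumption-free string
    batch: str          # e.g. "batch1"
    data_type: str      # e.g. "SNP" or "SNP_annotation"
    genome_build: str   # "hg18" or "hg19"
    data_format: str    # format label as written in the file name


SUFFIX = ".tar.bz2"


def parse_bgi_archive_name(filename: str) -> Optional[BgiArchiveName]:
    """Return the parsed fields, or None if the name does not fit the pattern."""
    if not filename.endswith(SUFFIX):
        return None
    parts = filename[:-len(SUFFIX)].split(".")
    # Expect exactly: timestamp, "BGI", batch, data_type, build, data_format
    if len(parts) != 6 or parts[1] != "BGI":
        return None
    timestamp, _, batch, data_type, build, data_format = parts
    if not batch.startswith("batch") or not build.startswith("hg1"):
        return None
    return BgiArchiveName(timestamp, batch, data_type, build, data_format)


# Hypothetical file name, for illustration only
info = parse_bgi_archive_name("20110801.BGI.batch1.SNP.hg19.vcf.tar.bz2")
if info is not None:
    print(info.batch, info.data_type, info.genome_build)
```

Returning None rather than raising an error is deliberate: names that do not follow the pattern can then be skipped or inspected by hand.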
     34
     35NOTES:
     36* Unless specified otherwise, all data is aligned to hg19
     37* Some folder and file names are inconsistent from one batch to another, because the original names found on the BGI hard disk have been kept.
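
The md5 checksum files listed for each batchX/ folder can be used to confirm that a download completed correctly. The sketch below assumes that each md5 file follows the common md5sum layout, with the hexadecimal digest as the first whitespace-separated token; that layout has not been checked against the actual files on the server. The function names are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks so that
    large archives do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5_file(data_path, md5_path):
    """Compare a downloaded file against its md5 file.

    Assumption: the expected digest is the first token in the md5 file,
    as written by the md5sum tool.
    """
    with open(md5_path) as handle:
        tokens = handle.read().split()
    if not tokens:
        return False  # empty md5 file: nothing to compare against
    return md5_of_file(data_path) == tokens[0].lower()
```

A call such as matches_md5_file("archive.tar.bz2", "archive.tar.bz2.md5") (both file names hypothetical) returns True only when the digests agree; a False result suggests the file should be downloaded again.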
     38
     39=== /target/gpfs2/gcc/groups/gonl/sftp/pilot ===
     40Data from the pilot, including aligned BAMs and SNPs.
     41
     42=== /target/gpfs2/gcc/groups/gonl/sftp/resources ===
     43GoNL resources tarball (Thanks Freerk!)
     44
     45=== /target/gpfs2/gcc/groups/gonl/sftp/upload ===
     46This is where everyone has write permissions. This directory should be used for data exchange.