Gene, exon and transcript counts
Counts and Spearman correlations for run 1
Date: 06-November-2013
Analysis by: Peter-Bram 't Hoen
The combined gene counts for the 2330 samples from run 1 are available on the VM at /virdir/Backup/run_1_gene_counts/combined_gene_count_run_1.txt. They were generated using this script: R script for merging gene count tables
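The merge script itself is available as an attachment (presumably merge_count_script.r) and is not reproduced here. As a rough illustration of the idea only, the sketch below merges per-sample count tables on a gene identifier column. The column names (gene, count), the sample names and the toy counts are assumptions made for this example and do not reflect the format of the actual files.

```r
# Toy per-sample count tables (hypothetical sample names and counts).
sample_tables <- list(
  sampleA = data.frame(gene = c("g1", "g2", "g3"), count = c(10, 0, 5)),
  sampleB = data.frame(gene = c("g1", "g2", "g3"), count = c(12, 1, 7)),
  sampleC = data.frame(gene = c("g1", "g3", "g2"), count = c(9, 4, 2))
)

# Rename each count column to its sample name so the merged columns stay distinguishable.
renamed <- lapply(names(sample_tables), function(s) {
  tab <- sample_tables[[s]]
  names(tab)[names(tab) == "count"] <- s
  tab
})

# Merge all tables on the gene identifier; all = TRUE keeps genes missing from some samples.
combined <- Reduce(function(x, y) merge(x, y, by = "gene", all = TRUE), renamed)

# Genes absent from a sample become NA after the outer merge; treat them as zero counts.
combined[is.na(combined)] <- 0
print(combined)
```

Merging on the identifier rather than binding columns side by side means the result does not depend on the row order of the individual tables.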
Pairwise Spearman correlations between all samples were then calculated from these counts: /virdir/Backup/run_1_gene_counts/Spearman_correlations_complete_gene_data_run_1.txt
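A minimal sketch of a pairwise Spearman calculation in R, using a small simulated count matrix rather than the real data. The matrix dimensions and sample names are assumptions; the only call that matters is cor() with method = "spearman", which correlates columns, so samples must be in columns.

```r
set.seed(1)
# Toy count matrix: 100 genes (rows) x 6 samples (columns); values are simulated.
counts <- matrix(rpois(600, lambda = 20), nrow = 100,
                 dimnames = list(paste0("gene", 1:100), paste0("sample", 1:6)))

# Sample-by-sample Spearman correlation matrix.
spearman <- cor(counts, method = "spearman")

# Sanity check: Spearman correlation is the Pearson correlation of per-sample ranks.
ranked <- apply(counts, 2, rank)
stopifnot(isTRUE(all.equal(spearman, cor(ranked))))

print(round(spearman, 2))
```

For 2330 samples the result is a 2330 x 2330 symmetric matrix with ones on the diagonal.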
From these, the median Spearman correlation of each sample with all other samples was calculated. This is also called the D-statistic. The D-statistics, ranked from low to high, can be found in this file: Median Spearman correlations
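The D-statistic can be sketched as a per-row median of the correlation matrix. The example below uses a made-up 4 x 4 matrix. Whether the original analysis excluded each sample's self-correlation is not stated on this page; it is excluded here as an assumption.

```r
# Toy symmetric correlation matrix for four samples (values are made up).
spearman <- matrix(c(1.00, 0.95, 0.90, 0.40,
                     0.95, 1.00, 0.92, 0.35,
                     0.90, 0.92, 1.00, 0.45,
                     0.40, 0.35, 0.45, 1.00),
                   nrow = 4, byrow = TRUE,
                   dimnames = list(paste0("s", 1:4), paste0("s", 1:4)))

# Mask the diagonal so each sample's self-correlation of 1 is left out (assumption).
off_diag <- spearman
diag(off_diag) <- NA

# D-statistic: per-sample median of the correlations with all other samples.
d_stat <- apply(off_diag, 1, median, na.rm = TRUE)

# Rank from low to high; poorly correlated samples come first.
print(sort(d_stat))
```

In this toy matrix, s4 correlates poorly with every other sample and therefore ends up at the top of the ranked list, which is the pattern an outlier sample would show.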
Boxplot of median Spearman correlations grouped by flowcell (Martijn Vermaat)
Boxplot of median Spearman correlations grouped by biobank
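A grouped boxplot of this kind could be produced roughly as follows. The D-statistics and flowcell labels are simulated, and the column names (d_stat, flowcell) are hypothetical; the same call works for biobank by swapping the grouping column.

```r
set.seed(2)
# Toy D-statistics with hypothetical flowcell labels; values are simulated.
qc <- data.frame(
  d_stat   = c(rnorm(10, 0.95, 0.01), rnorm(10, 0.93, 0.01), rnorm(10, 0.90, 0.02)),
  flowcell = rep(c("FC1", "FC2", "FC3"), each = 10)
)

# Per-group medians give a quick numeric view of what the boxplot shows.
group_medians <- tapply(qc$d_stat, qc$flowcell, median)
print(round(group_medians, 3))

# One box per group. pdf(NULL) draws to a null device so no file is written;
# replace it with e.g. pdf("boxplot.pdf") to save the figure.
pdf(NULL)
boxplot(d_stat ~ flowcell, data = qc,
        xlab = "Flowcell", ylab = "Median Spearman correlation (D-statistic)",
        las = 2)
invisible(dev.off())
```

A group whose box sits clearly below the others would point to a batch effect tied to that flowcell or biobank.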
After removing the two samples with very low Spearman correlations to all other samples, a distance matrix was calculated as 1 minus the correlation matrix, and a two-dimensional MDS plot was created using the R function cmdscale. This is the resulting mdsplot. The points are colored by biobank according to the following color scheme:
"LL" - gold
"RS" - blue
"CODAM" - orange
"LLS" - pink
"Amsterdam" - darkred
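The filtering, distance and MDS steps described above can be sketched as follows. The counts, sample names, biobank assignments and the two "outlier" samples are all made up for the example; only the 1 - correlation distance, the call to cmdscale and the color scheme come from this page.

```r
set.seed(3)
# Toy data: 200 genes x 12 samples with hypothetical biobank labels.
biobank <- rep(c("LL", "RS", "CODAM", "LLS"), each = 3)
counts <- matrix(rpois(200 * 12, lambda = 50), nrow = 200,
                 dimnames = list(NULL, paste0("s", 1:12)))
spearman <- cor(counts, method = "spearman")

# Drop two samples by name (hypothetical outliers) from both rows and columns.
outliers <- c("s5", "s9")
keep <- !(colnames(spearman) %in% outliers)
filtered <- spearman[keep, keep]

# Distance matrix: 1 - correlation, so highly correlated samples are close together.
dist_mat <- 1 - filtered

# Classical MDS in two dimensions.
coords <- cmdscale(as.dist(dist_mat), k = 2)

# Color scheme as listed on this page.
biobank_colors <- c(LL = "gold", RS = "blue", CODAM = "orange",
                    LLS = "pink", Amsterdam = "darkred")
point_col <- biobank_colors[biobank[keep]]

# pdf(NULL) avoids writing a file; replace with pdf("mdsplot.pdf") to save the figure.
pdf(NULL)
plot(coords, col = point_col, pch = 19,
     xlab = "MDS dimension 1", ylab = "MDS dimension 2")
legend("topright", legend = names(biobank_colors), col = biobank_colors, pch = 19)
invisible(dev.off())
```

Removing extreme outliers before the MDS keeps them from dominating the first dimensions and compressing all other samples into a single cluster.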
The same MDS plot, colored according to mean GC percentage: mdsplot GC
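Coloring points by a continuous variable such as mean GC percentage needs a mapping from values to a color gradient. The sketch below shows one way to do this in base R; the coordinates and GC values are simulated, and the blue-to-red gradient with 50 bins is an assumption, not the scale used in the actual plot.

```r
set.seed(4)
# Hypothetical 2-D MDS coordinates and mean GC percentages for 20 samples.
coords <- matrix(rnorm(40), ncol = 2)
gc_pct <- runif(20, min = 45, max = 55)

# Map GC percentage onto a blue-to-red gradient by binning the values.
n_bins <- 50
palette_fn <- colorRampPalette(c("blue", "red"))
bin <- cut(gc_pct, breaks = n_bins, labels = FALSE)
point_col <- palette_fn(n_bins)[bin]

# pdf(NULL) avoids writing a file; replace with pdf("mdsplot_gc.pdf") to save the figure.
pdf(NULL)
plot(coords, col = point_col, pch = 19,
     xlab = "MDS dimension 1", ylab = "MDS dimension 2",
     main = "MDS colored by mean GC percentage")
invisible(dev.off())
```

If samples separate along an MDS dimension in step with the gradient, GC content is a likely technical driver of that dimension.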
Attachments (7)
- merge_count_script.r (756 bytes)
- Median_pairwise_spearman_correlations_complete_gene_data_run_1.txt (54.0 KB)
- Median_pairwise_spearman_correlations_by_flowcell_complete_gene_data_run_1.pdf (22.9 KB)
- mdsplot_filt_colored_biobank.pdf (21.1 KB)
- Dstat_biobank_boxplot.pdf (5.6 KB)
- VM_QC_correlations_only_expressed_genes.R (1.7 KB)
- mdsplot_filt_colored_gc.pdf (24.1 KB)