| 1 | = Meta-exon annotation = |
| 2 | |
| 3 | |
| 4 | To create the meta-exon annotation, the following steps were taken: [[BR]] |
| 5 | 1. The exon annotation was downloaded from Ensembl BioMart, release 71. The file contained the following columns: [[BR]]chromosome, exon start, exon end, Ensembl exon ID, Ensembl gene ID, gene name, strand.[[BR]] |
| 6 | 2. All additional contigs (GL*, LRG*, etc.) were removed, so that only the ordinary chromosomes (1-22, X, Y, MT) remained. This was done with the custom script cutStrangeChr.py (see attachment). [[BR]] |
| 7 | 3. The Biomart file was converted to BED format (0-based start coordinates, strand written as + or -) and sorted by chromosome and start coordinate.[[BR]] |
| 8 | 4. Overlapping exons were merged using the mergeBed tool from the BEDTools suite, keeping the names of the merged exons (option -nms).[[BR]] |
| 9 | 5. The resulting file was converted to GTF format with the custom script mergedBed_to_gtf.py (see attachment), which retains the strand information.[[BR]] |
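Steps 2 and 3 can be illustrated with a minimal Python sketch. It is not the attached cutStrangeChr.py, whose contents are not shown here; the function name `biomart_row_to_bed`, the constant `KEPT_CHROMOSOMES`, and the example row (with invented IDs) are assumptions made for illustration only. It assumes the seven tab-separated columns listed in step 1, with the strand given as 1 or -1.

```python
# Hypothetical sketch of steps 2 and 3; not the attached cutStrangeChr.py.
KEPT_CHROMOSOMES = {str(n) for n in range(1, 23)} | {"X", "Y", "MT"}


def biomart_row_to_bed(row):
    """Convert one tab-separated Biomart row to a BED6 line.

    Returns None for rows whose first column is not an ordinary chromosome
    (for example GL* or LRG* contigs, or a header line).
    """
    fields = row.rstrip("\n").split("\t")
    chrom, start, end, exon_id, gene_id, gene_name, strand = fields
    if chrom not in KEPT_CHROMOSOMES:
        return None
    # Biomart starts are 1-based; BED starts are 0-based, ends are unchanged.
    bed_start = int(start) - 1
    bed_strand = "-" if strand == "-1" else "+"
    # Pack exon ID, gene ID and gene name into the BED name column.
    name = ":".join((exon_id, gene_id, gene_name))
    return "\t".join((chrom, str(bed_start), end, name, ".", bed_strand))


# Invented example row, for illustration only.
example = "1\t1000\t1200\tENSE00000000001\tENSG00000000001\tGENE1\t-1"
print(biomart_row_to_bed(example))
```

Packing the three identifiers into the name column with ":" as a separator keeps them available after merging, so a later step can look up the gene and strand again.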
| 10 | |
| 11 | |
| 12 | The final commands to generate the meta-exon annotation were the following:[[BR]][[BR]] |
| 13 | {{{./cutStrangeChr.py biomart_export.txt | awk 'BEGIN {FS="\t"}; {OFS="\t"}; {if ($7 == "-1") $7 = "-"; else $7 = "+"}; {print $1, $2 - 1, $3, $4 ":" $5 ":" $6, ".", $7}' | sort -k1,1n -k2,2n | mergeBed -nms -d -1 -i stdin > biomart_export.merged.tmp}}}[[BR]][[BR]] |
| 14 | {{{./mergedBed_to_gtf.py biomart_export.merged.tmp biomart_export.txt | sort -k1,1n -k4,4n > meta-exons_v71_cut_sorted_18-04-14.gtf}}} |
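The merging rule in the first command can be sketched in Python as follows. This is an illustration of the idea, not the BEDTools implementation. It assumes that `-d -1` makes mergeBed join only exons sharing at least one base (book-ended exons stay separate) and that `-nms` joins the names of merged features with semicolons. The function name `merge_overlapping` and the example intervals are invented.

```python
def merge_overlapping(intervals):
    """Merge BED-style (chrom, start, end, name) intervals that overlap.

    Coordinates are half-open (0-based start, exclusive end). Two intervals
    on the same chromosome are merged only if they share at least one base.
    """
    merged = []
    for chrom, start, end, name in sorted(intervals, key=lambda iv: iv[:3]):
        last = merged[-1] if merged else None
        # start < previous end means at least one shared base.
        if last is not None and last[0] == chrom and start < last[2]:
            last[2] = max(last[2], end)
            last[3].append(name)
        else:
            merged.append([chrom, start, end, [name]])
    return [(c, s, e, ";".join(names)) for c, s, e, names in merged]


# Invented intervals: a and b overlap, c only touches the end of b.
exons = [
    ("1", 100, 200, "a"),
    ("1", 150, 300, "b"),
    ("1", 300, 400, "c"),
    ("2", 100, 200, "d"),
]
for meta_exon in merge_overlapping(exons):
    print(*meta_exon, sep="\t")
```

Note that the sketch, like the command above, ignores strand while merging; in the pipeline the strand is restored afterwards by mergedBed_to_gtf.py.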
| 15 | |
| 16 | |
| 17 | |