Version 2 (modified by jamverlouw, 8 years ago)

Meta-exon annotation 05-06-13

To create the meta-exon annotation, we merged all overlapping exons from Ensembl version 71 (see [BIOS_ReferenceFiles Transcript annotation section]) using the mergeBed tool from the BEDTools suite. Overlapping exons were merged into a single meta-exon even when they belong to different genes or lie on different strands. The commands used are described in the Meta-exon annotation documentation.
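The merging step can be illustrated with a minimal Python sketch. This is not the mergeBed command that was actually run (see the Meta-exon annotation documentation for that); it only shows the idea of collapsing overlapping exons per chromosome while ignoring gene and strand. The function name merge_exons, the closed-interval convention, and the example coordinates are all assumptions made for illustration.

```python
from collections import defaultdict


def merge_exons(exons):
    """Merge overlapping (chrom, start, end) intervals, ignoring gene and strand.

    Assumes closed intervals: two exons overlap when the next start is
    less than or equal to the current end.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in exons:
        by_chrom[chrom].append((start, end))

    merged = []
    for chrom in sorted(by_chrom):
        current_start, current_end = None, None
        for start, end in sorted(by_chrom[chrom]):
            if current_start is None:
                current_start, current_end = start, end
            elif start <= current_end:
                # Overlap: extend the meta-exon that is being built.
                current_end = max(current_end, end)
            else:
                # Gap: close the current meta-exon and start a new one.
                merged.append((chrom, current_start, current_end))
                current_start, current_end = start, end
        if current_start is not None:
            merged.append((chrom, current_start, current_end))
    return merged


# Hypothetical exons, not taken from the Ensembl annotation.
example = [("1", 100, 200), ("1", 150, 300), ("1", 400, 500), ("2", 100, 200)]
meta_exons = merge_exons(example)
```

In this sketch the first two exons on chromosome 1 collapse into one meta-exon spanning 100 to 300, while the other two are kept unchanged.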

Location: srm://srm.grid.sara.nl/pnfs/grid.sara.nl/data/bbmri.nl/RP3/dzhernakova/meta-exons_v71_cut_sorted_05-06-13.gtf.gz
Contact: dasha.zhernakova@gmail.com

Issue: overlapping exons

The meta-exon annotation 05-06-13 track incorrectly contains some overlapping exons. Since this problem was only detected after the first freeze, all exon and gene counts for the freeze are still based on this track. We may correct the track in the future for later pipeline runs.

As an example, the following two entries are in the track:

MT 10470 10766
MT 10760 12137

However, they overlap: the second entry starts at 10760, before the first one ends at 10766. This should never be the case in a correctly merged track.

In the complete meta-exon annotation, 269 such regions were identified, originating from 730 exons that were not properly merged, all of them on the MT and X chromosomes. You can easily do the same check with bedtools intersect -n on the meta-exon track. The problem was solved by removing the additional contigs (GL*, LRG*, etc.) from the Ensembl BioMart export before merging.
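For readers without BEDTools at hand, the same kind of check can be sketched in plain Python. This is an illustrative assumption, not the procedure used to obtain the 269 regions: the function name find_overlaps is hypothetical, intervals are treated as closed, and the third entry in the example track is made up to show a non-overlapping case. The first two entries are the ones quoted above.

```python
def find_overlaps(entries):
    """Return (earlier, later) pairs of entries that overlap on one chromosome.

    Each entry is a (chrom, start, end) tuple with closed coordinates.
    Every entry is compared against the entry with the largest end seen
    so far on the same chromosome.
    """
    overlaps = []
    last = None  # entry reaching furthest right on the current chromosome
    for entry in sorted(entries):
        chrom, start, end = entry
        if last is not None and last[0] == chrom and start <= last[2]:
            overlaps.append((last, entry))
        if last is None or last[0] != chrom or end > last[2]:
            last = entry
    return overlaps


track = [
    ("MT", 10470, 10766),
    ("MT", 10760, 12137),
    ("MT", 13000, 13100),  # hypothetical entry, not from the real track
]
overlapping = find_overlaps(track)
```

A correctly merged track should yield an empty list. Here the sketch reports the single pair formed by the two MT entries from the example.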

Affected exons (730) are attached (and could be used as a blacklist): meta-exons_v71_cut_sorted_05-06-13.unmerged.bed
