Source data information
The current miRcode release is based on the hg19 genome assembly and relies on the data sources listed below.
GENCODE transcripts/genes
Transcripts in the GENCODE 11 annotation were analyzed in all regions, and results aggregated on a per-gene basis.
For ease of use, we classify genes into a few broad categories, but all of GENCODE is included and searchable.
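As a rough illustration of per-gene aggregation, the sketch below merges site records from all transcripts of a gene into one de-duplicated list. It is not the miRcode implementation: the function name, the identifiers, and the use of (chromosome, position) tuples as site records are all assumptions made for the example.

```python
from collections import defaultdict

def aggregate_sites_per_gene(transcript_sites, transcript_to_gene):
    """Union the site records of all transcripts belonging to each gene."""
    gene_sites = defaultdict(set)
    for transcript_id, sites in transcript_sites.items():
        gene_id = transcript_to_gene[transcript_id]
        gene_sites[gene_id].update(sites)
    # Sort so that the output order is deterministic.
    return {gene: sorted(sites) for gene, sites in gene_sites.items()}

# Hypothetical identifiers and coordinates, for illustration only.
transcript_to_gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}
transcript_sites = {
    "tx1": [("chr1", 1000), ("chr1", 1500)],
    "tx2": [("chr1", 1500), ("chr1", 2200)],
    "tx3": [("chr2", 300)],
}
per_gene = aggregate_sites_per_gene(transcript_sites, transcript_to_gene)
```

A site shared by two transcripts of the same gene (here chr1:1500) is reported once for that gene.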
Total transcripts [1] | 179,905 |
Total genes [1] | 53,520 |
LncRNA genes [2], all | 10,419 |
LncRNA genes [2], intergenic [3] | 5,680 |
LncRNA genes [2], overlapping | 4,739 |
Coding genes [4] | 19,999 |
Pseudogenes [5] | 12,549 |
Other [6] | 10,553 |
[1] Ambiguously mapped transcripts are excluded, so counts differ slightly from the official GENCODE figures.
[2] Genes with no coding splice forms and with mature transcripts >200 nt.
[3] Not overlapping with any transcript of a coding gene.
[4] Genes producing at least one coding, non-NMD isoform; several non-coding transcripts may also be produced.
[5] GENCODE pseudogenes are also included in this miRcode release.
[6] Remaining genes (e.g. tRNAs, snoRNAs, all-NMD coding genes).
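The category rules in the footnotes can be sketched as a small decision function. This is one reading of those rules, not the miRcode code itself. The `Transcript` fields, the `kind` labels, the `is_pseudogene` flag, and the order in which the rules are checked are assumptions, as is the requirement that all (rather than any) mature transcripts exceed 200 nt.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Transcript:
    kind: str                          # assumed labels: "coding", "coding_nmd", "noncoding"
    mature_length: int                 # spliced (mature) length in nt
    overlaps_coding_gene: bool = False # overlaps any transcript of a coding gene

def classify_gene(transcripts: List[Transcript], is_pseudogene: bool = False) -> str:
    kinds = {t.kind for t in transcripts}
    # At least one coding, non-NMD isoform makes the gene coding.
    if "coding" in kinds:
        return "coding"
    if is_pseudogene:
        return "pseudogene"
    # lncRNA: no coding splice forms, and (assumed) every mature transcript >200 nt.
    if kinds == {"noncoding"} and all(t.mature_length > 200 for t in transcripts):
        if any(t.overlaps_coding_gene for t in transcripts):
            return "lncRNA (overlapping)"
        return "lncRNA (intergenic)"
    # Everything else, e.g. short RNAs and all-NMD coding genes.
    return "other"
```

For example, a gene with one coding and one non-coding transcript is classified as coding, while a gene whose only transcripts are NMD targets falls into the remaining category.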
Multiz alignments
The Multiz 46-way vertebrate alignment was used for evaluating site conservation.
Primates | 9 |
Placental mammals | 23 |
Non-mammal vertebrates | 13 |
TargetScan microRNA family definitions
Analyses are based on microRNA seed families as defined by TargetScan 6, as these are widely adopted.
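To make the notion of a seed family concrete, the sketch below groups mature microRNA sequences by their seed, taken here as nucleotides 2-8. The function names and the microRNA names are hypothetical, the sequences are toy input, and the actual family assignments should be taken from the TargetScan 6 family files rather than recomputed this way.

```python
from collections import defaultdict

def seed(mature_sequence: str) -> str:
    """Return nucleotides 2-8 (1-based) of a mature microRNA sequence."""
    return mature_sequence.upper().replace("T", "U")[1:8]

def group_by_seed(mirnas):
    """Map each seed to the list of microRNA names that share it."""
    families = defaultdict(list)
    for name, sequence in mirnas.items():
        families[seed(sequence)].append(name)
    return dict(families)

# Hypothetical names with toy sequences, for illustration only.
mirnas = {
    "mir-x1": "UGAGGUAGUAGGUUGUAUAGUU",
    "mir-x2": "UGAGGUAGUAGGUUGUGUGGUU",
    "mir-y1": "UAGCUUAUCAGACUGAUGUUGA",
}
families = group_by_seed(mirnas)
```

The first two sequences differ outside the seed but share nucleotides 2-8, so they land in the same family, while the third forms a family of its own.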