Tryporfs is a query tool for accessing a database of upstream open reading frames (uORFs) annotated in the Trypanosoma congolense TcIL3000 genome. Genomic data and CDS nucleotide sequences were obtained from the TriTrypDB database (release 25). 5’ UTRs are derived from a T. congolense genome re-sequencing and re-annotation project (unpublished work, M. Jąkalski, Institute of Bioinformatics Münster). uORFs were identified computationally as a start codon in a 5’ UTR followed by an in-frame stop codon, with no minimum length restriction. For detailed information on data retrieval, please see “usage”.
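The uORF definition above can be illustrated with a minimal Python sketch. It is not the pipeline used to build the database. The function name find_uorfs is hypothetical, and the sketch assumes that ATG is the only start codon, that TAA, TAG and TGA are the stop codons, and that the stop codon must lie within the 5’ UTR sequence passed in. None of these details are stated in the description.

```python
# Illustrative sketch only; names and codon choices are assumptions.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(utr):
    """Return (start, end, sequence) for each ATG with an in-frame stop.

    Coordinates are 0-based and end-exclusive, relative to the UTR string.
    No minimum length is enforced, so ATG directly followed by a stop counts.
    """
    utr = utr.upper()
    uorfs = []
    for start in range(len(utr) - 2):
        if utr[start:start + 3] != "ATG":
            continue
        # Walk codon by codon in the frame set by this start codon.
        for pos in range(start + 3, len(utr) - 2, 3):
            if utr[pos:pos + 3] in STOP_CODONS:
                uorfs.append((start, pos + 3, utr[start:pos + 3]))
                break
    return uorfs


if __name__ == "__main__":
    for start, end, seq in find_uorfs("GGATGTAACC"):
        print(start, end, seq)
```

Every start codon is reported separately, so several in-frame start codons upstream of the same stop codon each yield their own uORF variant.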
Philipp Fervers, Florian Fervers, Norbert Grundmann, Tabea Kischka and Marcin Jakalski
The user can query the entire set of 31149 annotated uORFs by defining filters on the distinct uORF features listed below. Output can be ordered by a specified column and exported to an Excel file.
Selecting a uORF ID in the Tryporfs output opens a gene-specific window that provides the following data:
gene, chromosome, start and stop coordinates, strand, product, length, maximum 5’ UTR length, ASI and CAI of the CDS, tandem IDs and number of tandem repeats, gene orthology group, stage-specific splice sites (in the format: splice site ID, coordinates, number of reads), maximum number of uORFs, and stage-specific mRNA-normalized proteomic data.
To download the results of a query, use the "Export Data to Excel" link on the result page, which stores the results in an Excel file. Additionally, GFF and BED files are provided for the entire set of predicted T. congolense uORFs, as well as for a filtered set of non-redundant uORFs in which, where several consecutive in-frame start codons occur, only the longest uORF variant was retained.
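The non-redundant filtering can be sketched as follows, as one plausible reading of the description rather than the exact procedure used for the provided files. The sketch assumes that redundant variants are uORFs within the same 5’ UTR that end at the same stop codon, and that each uORF is given as a (start, end) pair of coordinates on that UTR. The function name longest_per_stop is hypothetical.

```python
def longest_per_stop(uorfs):
    """Keep only the longest uORF among those sharing a stop position.

    uorfs: iterable of (start, end) pairs within a single 5' UTR.
    Variants that end at the same stop codon are in the same frame,
    so the one with the most upstream start is the longest.
    """
    best = {}
    for start, end in uorfs:
        if end not in best or start < best[end][0]:
            best[end] = (start, end)
    return sorted(best.values())


if __name__ == "__main__":
    # Two variants share the stop at position 16; the longer one is kept.
    print(longest_per_stop([(7, 16), (1, 16), (4, 10)]))
```

Grouping by stop position keeps uORFs in different frames or with different stop codons as separate entries.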