#===============================================================================#
# README
#===============================================================================#

This README contains more information on the columns in the uORFdb database dumps.

#-------------------------------------------------------------------------------#
# publication_dump_uORFdb.tsv
#-------------------------------------------------------------------------------#

Contains the authors, taxa and genes for the publications in the database.

*) PubMedID          The identifier from PubMed
*) Authors           The author(s) of the publication
*) Title             Title of the publication
*) Abstract          Abstract of the publication
*) PublicationType   The type of the publication according to the RIS specification
*) PublDate          The publication date
*) StartPage         First page of the publication
*) EndPage           Last page of the publication
*) Volume            Volume of the journal in which the publication was published
*) Issue             Issue of the journal in which the publication was published
*) DOI               DOI of the publication
*) JournalName       Name of the journal
*) AbbrJournalName   Abbreviated journal name
*) Publisher         The name of the publisher
*) Country           The country of the publisher

For the following columns, please see the description in the uORFdb documentation
(https://www.bioinformatics.uni-muenster.de/tools/uorfdb/documentation).
Except for the Taxa, GeneIDs, GenBankIDs, GeneSymbols and GeneNameInPaper columns,
all columns contain boolean values: 1 ("+" on the web site) indicates positive
evidence and 0 ("-" on the web site) indicates negative evidence for a feature.
An empty field means that the respective publication provides no information on
the feature.
*) AlternativePromotors
*) AlternativeSplicing
*) Tissue-specific_uORFs
*) Non-AUGuORFs
*) Number
*) Length
*) DistanceFrom5'Cap
*) DistanceFrom_uORF-STOPtoCDS
*) CDSoverlap
*) CDSrepression
*) CDSinduction
*) StartSiteSelection
*) mRNAdestabilization
*) Nonsense-mediatedDecay
*) RNAsecondaryStructure
*) RibosomeLoad
*) RibosomePausingStalling
*) RibosomeShunting
*) KozakConsensusSequence
*) TranslationalStatus
*) TerminationContext
*) uORF_RNApeptideSequence
*) RegulatorySequenceMotif
*) Co-factorRibosomeInteraction
*) Disease-related_uORFs
*) AcquiredMutationsSNPs
*) MouseModels
*) RibosomeProfiling
*) BioinformaticsArraysScreens
*) Proteomics
*) Methods
*) Review
*) Taxa
*) GeneIDs      Entrez gene ID(s) of the gene(s) connected to the publication
*) GenBankIDs   NCBI nucleotide accession(s) of the gene(s) connected to the
                publication
*) GeneSymbols
*) GeneNameInPaper

#--------------------------------------------------------------------------------#
# example.publication_dump_uORFdb.tsv
#--------------------------------------------------------------------------------#

Uses the same columns as publication_dump_uORFdb.tsv, but only contains the first
10 records.

#--------------------------------------------------------------------------------#
# uORF_dump_uORFdb.tsv
#--------------------------------------------------------------------------------#

Contains all transcripts and uORFs (if available) for each gene in the database.
Please see the description in the uORFdb documentation
(https://www.bioinformatics.uni-muenster.de/tools/uorfdb/documentation).
In some cases, we changed the column names relative to the web interface to make
the content clearer. In these cases, we provide the name used in the web
interface for reference.
*) Taxon
*) Assembly
*) Chr                   See genes: Chromosome
*) Symbol                See genes: Gene symbol
*) GeneID                Entrez gene ID of the gene
*) GenBankID             NCBI nucleotide accession of the gene
*) SymbolAliases
*) GeneNames             See genes: Names
*) NCBIID                See transcripts: NCBI ID
*) TranscrStart          See transcripts: Genomic start
*) TranscrEnd            See transcripts: Genomic end
*) Strand
*) TranscrLength         See transcripts: Length
*) TLSlength
*) CDSstart              The start position of the CDS (0-based, half-open)
*) CDSend                The end position of the CDS (0-based, half-open)
*) TranscrKozakContext   See transcripts: Kozak context
*) TranscrKozakStrength  See transcripts: Kozak strength
*) ExonStarts            List of exon start positions (0-based, half-open and
                         ascending)
*) ExonEnds              List of exon end positions (0-based, half-open and
                         ascending)
*) uORF_ID
*) uORFstart             See uORFs: Genomic start
*) uORFend               See uORFs: Genomic end
*) uORFstartCodon        See uORFs: Start codon
*) uORFstopCodon         See uORFs: Stop codon
*) uORFlength
*) uORFCDSdistance       See uORFs: CDS distance
*) uORF5'-capDistance    See uORFs: 5'-cap distance
*) uORFkozakContext      See uORFs: Kozak context
*) uORFkozakStrength     See uORFs: Kozak strength
*) uORFtype              See uORFs: Type
*) uORFreadingFrame      See uORFs: Reading frame
*) uORFnucleotideSeq     See uORFs: Exonic sequence
*) uORFaminoSeq          See uORFs: Amino acid sequence
*) SharedStartCodon

#--------------------------------------------------------------------------------#
# example.uORF_dump_uORFdb.tsv
#--------------------------------------------------------------------------------#

Uses the same columns as uORF_dump_uORFdb.tsv, but only contains the first 10
records.

#--------------------------------------------------------------------------------#
# variant_sequence_dump_uORFdb.tsv
#--------------------------------------------------------------------------------#

Contains all human cancer variants in the database together with the effects on
the associated uORFs.
Please see the description in the uORFdb documentation
(https://www.bioinformatics.uni-muenster.de/tools/uorfdb/documentation).
In some cases, we changed the column names relative to the web interface to make
the content clearer. In these cases, we provide the name used in the web
interface for reference.

The results in this file are in whole or part based upon data generated by the
TCGA Research Network: https://www.cancer.gov/tcga.

*) uORF_ID
*) Chr                  The chromosome on which the variant resides
*) Position             See variants: Genomic position
*) Ref                  See variants: Reference allele
*) Alt                  See variants: Alternate alleles; one per line
*) dbSNP_ID             See variants: dbSNP IDs; one per line
*) ClinVar_ID           See variants: ClinVar IDs; one per line
*) Location             See variants: Locations; one per line
*) AltCDSdistance       See variants: Alternate CDS distances; one per line
*) AltuORFlength        See variants: Alternate uORF lengths; one per line
*) EffectStartCodon     See variants: Start codon effects; one per line
*) AltStartCodon        See variants: ALT start codons; one per line
*) EffectStopCodon      See variants: Stop codon effects; one per line
*) AltStopCodon         See variants: ALT stop codons; one per line
*) EffectKozakContext   See variants: Kozak effects; one per line
*) AltKozakContext      See variants: ALT Kozak contexts; one per line
*) EffectSequence       See variants: Sequence effects; one per line
*) AltSequence          See variants: ALT nucleotide sequences; one per line

#--------------------------------------------------------------------------------#
# example.variant_sequence_dump_uORFdb.tsv
#--------------------------------------------------------------------------------#

Uses the same columns as variant_sequence_dump_uORFdb.tsv, but only contains the
first 10 records.
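
Because every row in variant_sequence_dump_uORFdb.tsv carries a uORF_ID, variant
records can be matched to the uORF records in uORF_dump_uORFdb.tsv. The Python
sketch below shows one possible way to do this with the standard library only.
It assumes that both files are tab-separated, that they start with a header row
holding the column names listed in this README, and that uORF_ID values are
written identically in both files. None of this is stated explicitly here, so
please check it against the example.* files first. The helper names and the
sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def read_tsv(handle):
    """Return the rows of a tab-separated file as a list of dicts (header row assumed)."""
    # QUOTE_NONE keeps quote characters inside fields (e.g. in abstracts) untouched.
    return list(csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE))


def group_by_uorf_id(rows):
    """Group rows by their uORF_ID column; rows without a uORF_ID are skipped."""
    groups = defaultdict(list)
    for row in rows:
        uorf_id = row.get("uORF_ID")
        if uorf_id:
            groups[uorf_id].append(row)
    return groups


def join_variants_to_uorfs(uorf_rows, variant_rows):
    """Yield (uORF row, variant row) pairs that share the same uORF_ID."""
    variants = group_by_uorf_id(variant_rows)
    for uorf in uorf_rows:
        for variant in variants.get(uorf.get("uORF_ID"), []):
            yield uorf, variant


# Made-up sample data, only to show the mechanics; real rows have many more columns.
UORF_SAMPLE = "Symbol\tuORF_ID\tuORFstartCodon\nGENE1\tu1\tATG\nGENE2\tu2\tCTG\n"
VARIANT_SAMPLE = (
    "uORF_ID\tChr\tPosition\tRef\tAlt\n"
    "u1\tchr1\t100\tA\tG\n"
    "u1\tchr1\t105\tC\tT\n"
)

uorf_rows = read_tsv(io.StringIO(UORF_SAMPLE))
variant_rows = read_tsv(io.StringIO(VARIANT_SAMPLE))
pairs = list(join_variants_to_uorfs(uorf_rows, variant_rows))
for uorf, variant in pairs:
    print(uorf["Symbol"], variant["Position"], variant["Ref"], variant["Alt"])
```

For the real dumps, the io.StringIO objects would be replaced by file handles,
for example open("uORF_dump_uORFdb.tsv", newline=""). The same approach should
work for variant_frequency_dump_uORFdb.tsv, which also has a uORF_ID column.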
#--------------------------------------------------------------------------------#
# variant_frequency_dump_uORFdb.tsv
#--------------------------------------------------------------------------------#

Contains all human cancer variants and their allele frequencies in the database
together with their associated uORF IDs.
Please see the description in the uORFdb documentation
(https://www.bioinformatics.uni-muenster.de/tools/uorfdb/documentation).
In some cases, we changed the column names relative to the web interface to make
the content clearer. In these cases, we provide the name used in the web
interface for reference.

The reference allele frequencies of the six cancer types and over all cancers
are NA by definition. In our graphs, we define the reference frequency as one
minus the sum of the frequencies of all alt alleles of the respective variant.

The results in this file are in whole or part based upon data generated by the
TCGA Research Network: https://www.cancer.gov/tcga.

*) uORF_ID
*) Chr                  The chromosome on which the variant resides
*) Position             See variants: Genomic position
*) Ref                  See variants: Reference allele
*) Alt                  See variants: Alternate alleles; one per line
*) dbSNP_ID             See variants: dbSNP IDs; one per line
*) ClinVar_ID           See variants: ClinVar IDs; one per line
*) Location             See variants: Locations; one per line
*) ExAC(ref; alt)       The frequencies of the ref and alt alleles in ExAC
                        (according to dbSNP)
*) gnomAD(ref; alt)     The frequencies of the ref and alt alleles in gnomAD -
                        Genomes (according to dbSNP)
*) TopMed(ref; alt)     The frequencies of the ref and alt alleles in TopMed
                        (according to dbSNP)
*) BRCA_AF(ref; alt)    The allele frequencies of the reference and alternate
                        allele inferred from the tumor genotypes of breast
                        cancer patients and the total number of breast cancer
                        patients (117)
*) COAD_AF(ref; alt)    The allele frequencies of the reference and alternate
                        allele inferred from the tumor genotypes of colon
                        cancer patients and the total number of colon cancer
                        patients (113)
*) LAML_AF(ref; alt)    The allele frequencies of the reference and alternate
                        allele inferred from the tumor genotypes of blood
                        cancer patients and the total number of blood cancer
                        patients (48)
*) LUAD_AF(ref; alt)    The allele frequencies of the reference and alternate
                        allele inferred from the tumor genotypes of lung
                        cancer patients and the total number of lung cancer
                        patients (142)
*) PRAD_AF(ref; alt)    The allele frequencies of the reference and alternate
                        allele inferred from the tumor genotypes of prostate
                        cancer patients and the total number of prostate cancer
                        patients (122)
*) SKCM_AF(ref; alt)    The allele frequencies of the reference and alternate
                        allele inferred from the tumor genotypes of skin
                        cancer patients and the total number of skin cancer
                        patients (135)
*) Total_AF(ref; alt)   The allele frequencies of the reference and alternate
                        allele inferred from the tumor genotypes of all
                        cancer patients and the total number of all cancer
                        patients (677)

#--------------------------------------------------------------------------------#
# example.variant_frequency_dump_uORFdb.tsv
#--------------------------------------------------------------------------------#

Uses the same columns as variant_frequency_dump_uORFdb.tsv, but only contains the
first 10 records.
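
The reference frequency used in the graphs for variant_frequency_dump_uORFdb.tsv
(one minus the frequency of all alt alleles) can be illustrated with a short
worked example. The Python sketch below assumes that a "(ref; alt)" cell holds
the two values separated by a semicolon, that missing values are written as NA,
and that a variant with several alt alleles contributes one alt frequency per
allele. The exact cell layout is not specified in this README, so the parser and
the sample values are hypothetical and should be checked against the example
file.

```python
from typing import List, Optional, Tuple


def parse_ref_alt(cell: str) -> Tuple[Optional[float], Optional[float]]:
    """Split a "ref; alt" cell into two floats; "NA" or an empty part becomes None."""
    parts = [part.strip() for part in cell.split(";")]
    if len(parts) != 2:
        raise ValueError(f"expected 'ref; alt', got {cell!r}")

    def to_float(text: str) -> Optional[float]:
        return None if text in ("", "NA") else float(text)

    return to_float(parts[0]), to_float(parts[1])


def reference_frequency(alt_frequencies: List[float]) -> float:
    """Reference frequency as defined for the graphs: 1 - sum of all alt frequencies."""
    return 1.0 - sum(alt_frequencies)


# Hypothetical Total_AF cells for one variant with two alt alleles.
# The ref part is NA by definition for the cancer cohorts.
cells = ["NA; 0.02", "NA; 0.03"]
alt_values = [parse_ref_alt(cell)[1] for cell in cells]
ref_value = reference_frequency([v for v in alt_values if v is not None])
print(round(ref_value, 4))
```

The same helper can be applied to the ExAC, gnomAD and TopMed columns, where the
ref part may already be filled in and no derivation is needed.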