Institute of Bioinformatics Münster
TEclass2 - Overview
TEclass2 classifies consensus sequences of unknown transposable elements (TEs) into 16 classes taken from the classification system of Wicker et al.:
16-superfamily model: Copia, Crypton, ERV, Gypsy, hAT, Helitron, Jockey, L1_L2, Maverick, Merlin, SINE, P, Pao, RTE, TcMar and Transib, classified with a weighted average F1-score of 0.79.
Methods
The classification uses a Transformer deep learning model (Vaswani et al. 2017) and outputs a softmax score (Goodfellow et al. 2016) for each TE category. These scores can be interpreted as probabilities. The input sequences must be TE models given as consensus sequences in FASTA format. You can either upload the file you want to process or paste the sequences directly. Note that the tool cannot distinguish between TEs and non-TEs, so every sequence will be assigned to one of the categories even if it is not a TE.
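To illustrate how softmax scores behave, here is a minimal Python sketch. It is not the TEclass2 implementation: the function names `softmax` and `classify` and the raw model outputs (logits) are hypothetical and chosen only for this example. It shows that the scores always sum to 1 and that a label is always returned, even when the model has no preference among the categories.

```python
import math

# The 16 categories listed above, in the same order.
SUPERFAMILIES = [
    "Copia", "Crypton", "ERV", "Gypsy", "hAT", "Helitron", "Jockey", "L1_L2",
    "Maverick", "Merlin", "SINE", "P", "Pao", "RTE", "TcMar", "Transib",
]


def softmax(logits):
    """Turn raw scores into non-negative values that sum to 1."""
    highest = max(logits)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits):
    """Return (label, score) for the category with the highest softmax score."""
    if len(logits) != len(SUPERFAMILIES):
        raise ValueError("expected one raw score per category")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return SUPERFAMILIES[best], probs[best]


# Hypothetical raw scores: the model clearly favours Gypsy.
confident = [0.0] * len(SUPERFAMILIES)
confident[SUPERFAMILIES.index("Gypsy")] = 4.0
print(classify(confident))

# Hypothetical raw scores with no preference at all (e.g. a non-TE input):
# a category is still returned, but its score is only 1/16.
undecided = [0.0] * len(SUPERFAMILIES)
print(classify(undecided))
```

A low top score, as in the second call, is one hint that a sequence may not fit any category well, although the tool itself does not draw that conclusion for you.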
2023-08-28 12:50