Institute of Bioinformatics Münster
MetaG
News
2024-01-10
Updates and new standards for non-viral databases
2023-09-18
Updated filter database to T2T-CHM13v2
2023-06-07
Updated RefSeq and BV-BRC (formerly: PATRIC)
2023-06-02
Updated ICTV
2023-03-17
Updated ICTV
2022-11-24
Updated ICTV
2022-09-16
Updated ICTV
2022-06-30
Query can be a nested archive
2022-05-14
Updated ICTV
2022-04-27
MTX now uses NCBI taxonomy
2022-03-23
Updated names of downloads in finished runs
Usage of MetaG

Start the Application

Click on "Run" to start the analysis. The input mask looks like this:
[Screenshot: input mask of the "Run" page]
Every request is associated with a unique ID number. All running processes have their own ID and are handled in order of submission time. You can access previous runs from the "View results" menu by entering the request ID in the "ID" field. Links to your last 10 runs are provided under the "ID" field. You can explore the results for some test cases by clicking on "View testcases" in the menu.
Notes
  • The calculation of the results may take from minutes to hours. As long as the calculation is in progress, you are redirected to a "waiting" page, which updates automatically every minute. You can close the "waiting" page at any time: MetaG will send you an email (if you entered an address) when the calculations are finished. If you did not provide an email address, you have to remember your request ID!
  • Requests are kept for 30 days on the server. After this period all the data will be deleted.
  • There is a limit of 100 open requests. That means that if you try to submit your data and there are 100 requests running, you will have to wait until the queue is ready to accept requests again.
  • The maximum size of a query is 6 GB.

Entering Data

Database

Select a database to analyze your reads.

Database Profile

The parameters (alignment and MetaG) used for the taxonomic assignment. You can use our standard parameters for the different sequencing technologies or upload your own parameter file (see "Load your own parameters" below). If you make no selection, you should modify the parameters in the "Advanced Parameters" menu.

Filter

Remove reads that match this database from the analysis.

Filter Profile

The standard alignment parameters for the filter. You can choose our standards for the different sequencing technologies. "Filter Profile" should match the profile selected as "Database Profile", unless you uploaded your own "Database Profile".

Query File

The query consists of your sequencing reads. It must always be a single file (in FASTQ or FASTA format). Please make sure that your sequences do not contain special characters like "-" or "?". To send several files in one request, you can archive them together and send the archive file. Supported archive formats are .tar, .gz, .zip and .bz2. On Unix-like systems, for example, you can create a ZIP archive with:
zip archive.zip file1 file2 ...
or use a Windows tool like WinZip. Then you can upload the file archive.zip to MetaG and proceed. It is also possible to submit nested archives: For example, you can compress multiple *.fastq.gz files into a single ZIP archive. Please make sure that all files in the archive(s) have unique names.
Note: Transfer and calculation times will generally increase when sending big files. The maximum size is 6 GB.
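The archiving step described above can also be scripted. The following is a minimal Python sketch, not part of MetaG: the function name build_query_archive and the constant MAX_QUERY_BYTES are hypothetical, and reading "6 GB" as 6 × 10^9 bytes is an assumption. It bundles several read files into one ZIP archive, refuses duplicate file names, and checks the archive size before upload.

```python
import zipfile
from pathlib import Path

# Assumption: the documented "6 GB" limit is read here as 6 * 10**9 bytes.
MAX_QUERY_BYTES = 6 * 10**9


def build_query_archive(read_files, archive_path="archive.zip"):
    """Bundle read files into one ZIP archive with unique, flat file names."""
    paths = [Path(f) for f in read_files]

    # All files in the archive must have unique names.
    seen = set()
    for path in paths:
        if path.name in seen:
            raise ValueError(f"duplicate file name in archive: {path.name}")
        seen.add(path.name)

    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in paths:
            # Store each file under its bare name (no directories).
            zf.write(path, arcname=path.name)

    archive = Path(archive_path)
    if archive.stat().st_size > MAX_QUERY_BYTES:
        raise ValueError("archive exceeds the maximum query size")
    return archive
```

Already compressed inputs such as *.fastq.gz files can be passed in unchanged, which yields a nested archive of the kind mentioned above.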

Minimum Sequence Length

Reads shorter than this threshold are ignored. No entry or "0" turns this filter off.

Email

Your email address, in case you want to be notified when your analysis finishes. If you don't provide your email, you have to remember your request ID.

Analysis Name

The name of your analysis. This helps you to recognize it later.

Advanced Parameters

MetaG Parameters

The parameters influence which taxa are assigned to your reads. The analysis is based on the LAST alignments. Please see the tutorial section for more details.
Only alignments with an e-value less than "E-Value Cutoff" are chosen for the analysis. This field gives the exponent to "e", meaning "-3" will be internally treated as "e^-3".
If there are multiple alignments for a single read, they will be filtered based on the "Alignment Score Cutoff". This threshold is relative to the maximum alignment score for a single read. If you set it to 1, only the alignment with the highest score will be chosen.
In case of ambiguous assignments for a read, the program will continue to find the best matching taxon, until the "Confidence Cutoff" is violated. The cutoff ranges between 0 and 1 (1 is strict).
In the summary tab of the results, you will see the confidence per taxon. This is the average confidence over all matching reads. MetaG will use your "Method for Average Confidence" to calculate the value. The choice will only influence the display in the summary tab of the results. It will not affect the calculations.
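To illustrate the relative "Alignment Score Cutoff", here is a small Python sketch. It is not MetaG's actual implementation: the function name filter_by_relative_score and the (read, taxon, score) tuple layout are hypothetical. It assumes that "relative to the maximum" means an alignment is kept when its score is at least the cutoff multiplied by the best score of the same read, and that scores are positive.

```python
from collections import defaultdict


def filter_by_relative_score(alignments, score_cutoff):
    """Keep, per read, the alignments scoring >= score_cutoff * best score.

    alignments: iterable of (read_id, taxon, score) tuples (hypothetical layout).
    score_cutoff: value between 0 and 1; 1 keeps only top-scoring alignments.
    """
    by_read = defaultdict(list)
    for read_id, taxon, score in alignments:
        by_read[read_id].append((taxon, score))

    kept = {}
    for read_id, hits in by_read.items():
        best = max(score for _, score in hits)
        # Assumption: the threshold is the cutoff times the read's best score.
        kept[read_id] = [(t, s) for t, s in hits if s >= score_cutoff * best]
    return kept


hits = [("r1", "A", 100), ("r1", "B", 95), ("r1", "C", 60), ("r2", "D", 40)]
print(filter_by_relative_score(hits, 0.9))
```

Under these assumptions, a cutoff of 0.9 keeps taxa A and B for read r1 and drops C, while a cutoff of 1 keeps only A. Alignments that tie for the best score would all be kept in this sketch.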

LAST Parameters

For details on the LAST parameters, please have a look at the LAST web site. To alter the LAST alignment parameters, open the "Advanced Parameters" menu on the "Run" page. Alternatively, you can upload a parameter file (see "Load your own parameters" below).
The substitution matrix and the fields "Match Score" and "Mismatch Cost" are mutually exclusive - meaning that you can only use one of them. If you do not enter values into the substitution matrix (meaning empty or "0"), you can select the "Match Score" and "Mismatch Cost" fields. Otherwise they are invisible.

Load your own parameters

For most cases, our standard profiles are a good fit. If you need different parameters for your analysis, you can change them one by one or load a file. Currently, this is only possible for the "Database Profile" which contains the alignment and MetaG parameters for the taxonomic assignment. The file should have this format:
#last -a 14
#last -A 15
#last -b 4
#last -B 4
#last -S 1
#lastsplit -m 0.95
#metag -e -6
#metag -ac 0.9
#metag -cc 1
#metag -m williams
       A      C      G      T
A      6    -13     -6    -16
C    -15      6    -23     -8
G     -5    -25      6    -16
T    -16    -10    -13      6
Each line starting with "#" sets one parameter: it names the tool (last, lastsplit or metag), followed by the option and its value. The remaining lines contain the substitution matrix. Fields are separated by blanks or tabs. Parameters that are missing from the file are not set or overwritten when the file is loaded. For MetaG, "-e" is the "E-Value Cutoff", "-ac" is the "Alignment Score Cutoff", "-cc" is the "Confidence Cutoff" and "-m" is the "Method for Average Confidence".
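If you want to check a parameter file before uploading it, the format shown above is simple to read programmatically. The following Python sketch is only an illustration and not the parser used by MetaG: the function name parse_profile and the returned data layout are hypothetical. It assumes that every "#" line has exactly three fields and that the first line without "#" is the column header of the matrix.

```python
SAMPLE = """\
#last -a 14
#last -A 15
#last -b 4
#last -B 4
#last -S 1
#lastsplit -m 0.95
#metag -e -6
#metag -ac 0.9
#metag -cc 1
#metag -m williams
       A      C      G      T
A      6    -13     -6    -16
C    -15      6    -23     -8
G     -5    -25      6    -16
T    -16    -10    -13      6
"""


def parse_profile(text):
    """Return (params, matrix) parsed from a profile in the format above."""
    params = {}     # e.g. {("metag", "-e"): "-6"}
    matrix = {}     # e.g. {("A", "C"): -13}, keyed by (row, column)
    columns = None
    for line in text.splitlines():
        fields = line.split()  # splits on blanks and tabs
        if not fields:
            continue
        if fields[0].startswith("#"):
            if len(fields) != 3:
                raise ValueError(f"unexpected parameter line: {line!r}")
            tool, option, value = fields[0][1:], fields[1], fields[2]
            params[(tool, option)] = value
        elif columns is None:
            columns = fields  # header row of the substitution matrix
        else:
            row, scores = fields[0], fields[1:]
            for col, score in zip(columns, scores):
                matrix[(row, col)] = int(score)
    return params, matrix


params, matrix = parse_profile(SAMPLE)
print(params[("metag", "-e")], matrix[("A", "C")])
```

Values are kept as strings because the options mix integers, decimals and names such as "williams". The sketch makes no statement about which matrix axis refers to the reference and which to the query.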

Results

When the request is finished, the info tab of the results page will be displayed first:
[Screenshot: info tab of the results page]
You can select a specific results page from the menu in the header line. For more information, take a look at the tutorial.
2024-11-04 16:25