The start page allows you to query the database. The menu on the left lets you access
functions of the web site, regardless of whether you are on the start page or not. The most
important features that can be accessed from the menu are introduced in the following
sections.
Starting with a query
The central entry point of the database is the "Query" page. Here, you can search for
your topic of interest by selecting a search topic ("Patients", "Samples", and "Taxa")
and then entering data in the search bar(s) and/or ticking the checkbox(es) (Figure 1).
It is also possible to browse the database by submitting an empty search. The search topic
determines not only the searchable fields, but also which content you will see first
after hitting the "Goto" button.
Figure 1: The search interface for "Patients".
The fields for free text search take different data types, as indicated by the letter
directly in front of each field: "F" = floating point number, "I" = integer, and "T" =
text. Text is automatically removed from fields of type "F" or "I". Integers and floats are
allowed in fields of type "T", where they are interpreted as text (character varying). For an
advanced search, use the following special query terms: ">" = greater than,
"<" = less than, ">=" = greater than or equal to, "<=" = less than or equal to,
"," to separate multiple terms (internally "OR"), and "~" to conduct a pattern match.
For example, type ">20 <40" in the search bar for "Mother's Age at Delivery"
("Patients") to view all children whose mothers were older than
20 and younger than 40 years at delivery. Type "1,2" in the search bar for "Pregnancy Order" ("Patients") to
see all first- and second-born children. Typing "~Escherichia" in the search bar for "Taxon Name"
("Taxa") searches for all taxa whose names contain the term "Escherichia", e.g. Escherichia coli.
The "Clear" button resets all fields. Please note that it is not possible to combine searches
across search topics: you cannot search for "Patient alias" ("Patients") and "Timepoint"
("Samples") in a single query. However, you can first search for a patient and then restrict the time point in the
results (see below).
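The query terms described above can be illustrated with a small Python sketch. It is not the database's actual implementation: the function names are hypothetical, and it assumes that space-separated comparison terms are combined with "AND" (as the ">20 <40" example suggests) and that "~" behaves like a case-insensitive substring match.

```python
import operator

# Two-character operators come first so that ">=" is not read as ">".
_OPS = [(">=", operator.ge), ("<=", operator.le), (">", operator.gt), ("<", operator.lt)]


def _compare(term, value):
    """Evaluate one comparison term such as '>20' against a numeric value."""
    for symbol, compare in _OPS:
        if term.startswith(symbol):
            return compare(float(value), float(term[len(symbol):]))
    raise ValueError(f"not a comparison term: {term!r}")


def _match_group(group, value):
    """Evaluate one comma-separated part of the query."""
    group = group.strip()
    if group.startswith("~"):
        # Pattern match, assumed here to be a case-insensitive substring test.
        return group[1:].lower() in str(value).lower()
    if group.startswith((">", "<")):
        # Assumption: space-separated comparisons must all hold (AND).
        return all(_compare(term, value) for term in group.split())
    try:
        return float(value) == float(group)  # numeric equality where possible
    except ValueError:
        return str(value) == group  # otherwise exact text match


def matches(query, value):
    """Return True if the value satisfies the query; comma-separated parts are ORed."""
    if not query.strip():
        return True  # an empty search matches everything (browse mode)
    return any(_match_group(g, value) for g in query.split(",") if g.strip())
```

For instance, `matches(">20 <40", 30)` and `matches("1,2", 2)` are both true in this sketch, while `matches(">20 <40", 40)` is false.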
Navigate in the search results
If you have chosen to use the search topic "Patients", you will be redirected to the patients view
and see those patients that matched your query. We use the term view for a display of related
items (patients, samples, taxa). You can still navigate to any other view in the database
by using the buttons at the top of the page.
Figure 2: Controls in views, using the patients view as an example.
You can select one or multiple rows by ticking the check boxes in front of the rows and then
clicking "Apply" in the top panel. This will remove any other rows from your current view
(use "Undo" to restore the display of rows). If you navigate to any other view, all items
shown there will be related to your selected rows. Dedicated buttons let you tick all check boxes
on the current page ("Page+") or on all pages ("All"), uncheck all boxes ("None"), or
uncheck all boxes on the current page ("Page-"). The number of checked boxes is shown next
to these buttons.
On the top right, there are the pagination controls ("<<": first, "<":
one page backward, ">": one page forward, ">>": last page), followed by the
current page and total page count.
In the taxa view, click on the "?" to see all lineages that the current taxon belongs to
(see also Column Descriptions) or use the ">>" buttons in the designated
fields to query external resources.
Filters
The filters described in the following only affect the current view. Unlike the selection
of a row, they do not directly affect the other views. However, you can make a selection based
on the filtered rows, which will alter the display in the other views. For example, suppose the goal is
to find all patients that have samples containing Salmonella enterica at a "significant"
frequency. First, use the search interface to search the taxa view for
Salmonella enterica. In the view, you will see some hits with low read counts. Using the
filter panel (see below), you can now restrict the view to samples where the read count for this
taxon is at least 200. At this point, filtering has only affected the taxa view. Click "All" and then
"Apply" to make a selection based on the filtered taxa. After switching to the patients view,
you will only see those patients that matched your search and filter criteria.
Samples
Samples can be filtered using a minimum and/or maximum sequence count by entering the respective
values in the "# Sequences" filter and clicking the "Filter" button. The "Clear" button removes
the filter.
Taxa
Taxa can be filtered by rank, average read length and/or quality (minimum and/or maximum), and/or
read count (minimum and/or maximum) using the filter panel at the top of the page as described for the
samples.
Exports
Tick the check boxes in front of the entries that you wish to export. Select the desired export option
from the drop-down menu at the top of the page and hit the "GO" button.
Sequences
Export sequences in FASTQ format for the selected samples (samples view).
MicrobiomeAnalyst
Export the classifications and metadata for the selected patients (patients view) / samples (samples view) for
analysis in the
MicrobiomeAnalyst Marker Data Profiling workflow. A ZIP archive will be automatically downloaded
to your computer and a tab with the upload page of the MicrobiomeAnalyst will open in your browser.
Extract the archive and upload the files as shown in Figure 3. Please note that "Taxonomy labels" needs
to be set to "Not Specific / Other".
Figure 3: Upload to the MicrobiomeAnalyst.
Depending on your sample/patient choice, a WARNING file can appear in the ZIP archive. It contains the IDs of samples
that were not exported because no classifications were available. Samples with missing or incomplete metadata,
however, are exported, and missing values are set to "NA". Please read the instructions on the first output page of
the MicrobiomeAnalyst carefully. They explain why not all of the exported metadata is visible inside the tool.
In order to comply with the formatting requirements of the MicrobiomeAnalyst, the taxonomy only includes the ranks
"domain", "phylum", "class", "order", "family", "genus", and "species". If a sequence is not classified at the current
rank and all following, the taxon is set to "NA" (database: "UNMATCHED") and classifications with no name at the
current rank are indicated by the special taxon "NoName" (database: "NA").
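Because "NA" means different things in the database and in the export, the relabelling described above can be sketched as follows. The function name and the input shape (a mapping from rank to the taxon name stored in the database) are hypothetical, and the example lineage is purely illustrative.

```python
# Ranks kept in the export, in the order given above.
EXPORT_RANKS = ("domain", "phylum", "class", "order", "family", "genus", "species")

# Database label -> label used in the export.
_RELABEL = {"UNMATCHED": "NA", "NA": "NoName"}


def export_lineage(lineage):
    """Translate database taxon names into export labels, one per exported rank."""
    labels = []
    for rank in EXPORT_RANKS:
        name = lineage[rank]
        labels.append(_RELABEL.get(name, name))  # ordinary names pass through unchanged
    return labels


example = {
    "domain": "Bacteria", "phylum": "Proteobacteria", "class": "NA",
    "order": "Enterobacterales", "family": "Enterobacteriaceae",
    "genus": "UNMATCHED", "species": "UNMATCHED",
}
print(export_lineage(example))
```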
Column descriptions
In the following, you will find an in-depth description of all columns in the MetagenomicsDB web interface.
The documentation is split into parts according to the views.
Patients
Patient alias
Pseudonym of the patient in the study.
Sex
Sex of the patient: "m" for male, "f" for female.
Pregnancy Order
The patient is the mother's nth child where n is the pregnancy order.
Birth Mode
How the patient was born: "natural" or "caesarean section".
Mother's Age at Delivery
The age of the mother in years when giving birth to the patient.
Mother's pre-Pregnancy BMI
The body mass index (BMI) of the mother before getting pregnant
with the patient. The BMI is calculated as [weight in kg] / [height in m] ^ 2.
Mother's pre-Pregnancy BMI Category
Rating of "Mother's pre-Pregnancy BMI":
BMI <18.5: "underweight"
BMI >=18.5 and <25: "normal weight"
BMI >=25 and <30: "overweight"
BMI >=30: "obesity"
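As a minimal sketch, the BMI formula and the category thresholds listed above translate into Python as follows (the function names are chosen for illustration only):

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight in kg divided by the squared height in m."""
    return weight_kg / height_m ** 2


def bmi_category(value):
    """Map a BMI value to the categories used in the patients view."""
    if value < 18.5:
        return "underweight"
    if value < 25:
        return "normal weight"
    if value < 30:
        return "overweight"
    return "obesity"
```

For example, a weight of 60 kg at a height of 1.70 m gives a BMI of about 20.8, which falls into "normal weight".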
Maternal Illness during Pregnancy
Mother's illness while being pregnant with the patient. One of:
"diabetes", "thyroid disease", "hypertension", "diabetes + thyroid disease",
"diabetes + hypertension", "thyroid disease + hypertension",
"diabetes + thyroid disease + hypertension"
Maternal Antibiotics during Pregnancy
Did the mother receive antibiotics while pregnant with the patient: "yes", "no".
Difference in Body Mass at Delivery
The increase/decrease in body weight (kg) of the mother during pregnancy.
Category in Difference in Body Mass at Delivery
Rating of "Difference in Body Mass at Delivery" depending on "Mother's pre-Pregnancy BMI":
"not enough":
BMI <18.5 and weight difference <12.5
BMI >=18.5 and <25.0 and weight difference <11.5
BMI >=25.0 and <30.0 and weight difference <7.0
BMI >=30.0 and weight difference <5.0
"appropriate":
BMI <18.5 and weight difference >=12.5 and <=18.0
BMI >=18.5 and <25.0 and weight difference >=11.5 and <=16.0
BMI >=25.0 and <30.0 and weight difference >=7.0 and <=11.5
BMI >=30.0 and weight difference >=5.0 and <=9.0
"too much":
BMI <18.5 and weight difference >18.0
BMI >=18.5 and <25.0 and weight difference >16.0
BMI >=25.0 and <30.0 and weight difference >11.5
BMI >=30.0 and weight difference >9.0
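The rating above reduces to one range of appropriate weight gain per BMI class. The following sketch encodes the listed thresholds in a lookup table; the function and variable names are illustrative and not taken from the database code.

```python
# (exclusive upper BMI bound, lowest appropriate gain in kg, highest appropriate gain in kg)
_GAIN_RANGES = [
    (18.5, 12.5, 18.0),
    (25.0, 11.5, 16.0),
    (30.0, 7.0, 11.5),
    (float("inf"), 5.0, 9.0),
]


def weight_gain_category(pre_pregnancy_bmi, weight_difference_kg):
    """Rate the mother's weight change during pregnancy relative to her BMI class."""
    for bmi_upper, low, high in _GAIN_RANGES:
        if pre_pregnancy_bmi < bmi_upper:
            if weight_difference_kg < low:
                return "not enough"
            if weight_difference_kg <= high:
                return "appropriate"
            return "too much"
```

With this table, a mother with a pre-pregnancy BMI of 22 and a weight gain of 16.0 kg is rated "appropriate", whereas 16.1 kg is rated "too much".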
Samples
Patient alias
See patients view.
Timepoint
The sampling time point relative to the date of birth. Abbreviations are:
"d" = day(s), "w" = week(s), "m" = month(s), "y" = year(s).
Control
The sample is a water control sample: "yes", "no". Control samples
don't have measurement values.
# Sequences
The number of sequences for a sample.
Weight-for-age category
Based on the sampling date, weight, sex, and the
WHO weight-for-age metrics (l,m,s),
a z-score is calculated according to the formulas provided in the
WHO manual, page 302f.
A child is appropriate for gestational age (AGA) if its z-score at meconium is ≥ -2.
A child is small for gestational age (SGA) if its z-score at meconium is < -2.
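The following sketch shows the general LMS z-score formula that is commonly used with (l, m, s) reference values, together with the AGA/SGA threshold given above. It assumes that this is the formula the description refers to, and it omits any special handling of extreme values that the WHO procedure may prescribe. The function names are hypothetical, and the l, m, s values must be looked up in the WHO tables for the child's age and sex.

```python
import math


def lms_z_score(measurement, l, m, s):
    """z-score of a measurement (e.g. weight in kg) from LMS reference values."""
    if l == 0:
        return math.log(measurement / m) / s
    return ((measurement / m) ** l - 1) / (l * s)


def weight_for_age_category(z_score_at_meconium):
    """AGA if the z-score at the meconium sample is at least -2, otherwise SGA."""
    return "AGA" if z_score_at_meconium >= -2 else "SGA"
```

A measurement equal to the reference median m always yields a z-score of 0 in this formula, which is a quick sanity check for looked-up values.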
Weight-for-age sub-category
Related to the SGA "Weight-for-age category", but indicates whether SGA children
caught up in growth with their AGA peers. Catch-up can only occur from the second sample on, so
at the first sample, children are classified according to "Weight-for-age category".
Catch-up children must have a z-score > -2, and the difference between the current z-score
and the minimum z-score at a previous measurement must be ≥ 0.67. If the catch-up event
happens within the first 6 months, it is called "early catch-up", otherwise "late catch-up". Prior
to the catch-up event (if it happens at all), the sub-category is "no catch-up".
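The catch-up rules can be sketched as follows for the time-ordered measurements of a single SGA child. This is an illustration under stated assumptions, not the database's implementation: the function name and input shape are hypothetical, "within the first 6 months" is read as an age of at most 6 months, and the label is assumed to persist once a catch-up event has occurred.

```python
def catch_up_subcategories(measurements):
    """Label each sample of one SGA child.

    measurements: time-ordered list of (age_in_months, z_score) pairs.
    """
    labels = []
    event = None          # label of the catch-up event once it has happened
    min_previous = None   # lowest z-score among all earlier samples
    for index, (age_months, z) in enumerate(measurements):
        if index == 0:
            # First sample: classified by "Weight-for-age category".
            labels.append("SGA")
        else:
            if event is None and z > -2 and z - min_previous >= 0.67:
                event = "early catch-up" if age_months <= 6 else "late catch-up"
            labels.append(event or "no catch-up")
        min_previous = z if min_previous is None else min(min_previous, z)
    return labels


print(catch_up_subcategories([(0, -2.5), (1, -2.2), (3, -1.5), (12, -1.0)]))
```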
Feeding Mode
How the patient was fed: "breastfed", "formula", "mixed", or "diet extension".
Probiotics
Did the patient receive probiotics: "yes", "no".
Antibiotics
Did the patient receive antibiotics: "yes", "no".
Taxa
Patient alias
See patients view.
Timepoint
See samples view.
Control
See samples view.
Count
The number of reads in the given sample that support the respective taxon. Read counts
are determined by taxon name and rank (irrespective of lineage). In some classification
databases, the same taxon name is used multiple times at the same rank, even though the
organisms come from different phylogenetic lineages (see also "Taxon Name").
Taxon Name
The name of the taxon. Use the "?" button to see whether taxa from different lineages share
the same name at the given rank and thus contribute together to the read count for the taxon
(see also "Count"). There are some special taxa: "UNMATCHED" means unclassified at this rank and
all following; "FILTERED" marks reads that were removed during the filtering of human contamination;
"NA" marks taxa with no name.
Rank
The taxonomic rank of the taxon.
Avg. read length
The average length of reads supporting this taxon.
Avg. read quality
The average quality of reads supporting this taxon.
Program
The program that was used to classify the reads.
Database
The classification database that was used with the
"Program" to classify the reads.
PubMed
Conducts a PubMed search for the
current taxon and its connection to SGA/birth weight. The exact query is:
"(TAXON NAME[Title/Abstract]) AND ((birth weight[Title/Abstract]) OR (SGA[Title/Abstract])
OR (small for gestational age[Title/Abstract])) AND
((newborn[Title/Abstract]) OR (infant[Title/Abstract]) OR (child[Title/Abstract])
OR (baby[Title/Abstract]))". No link is provided for special taxa
(see "Taxon Name").
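The query above can be assembled programmatically, as in the following sketch. The function names are hypothetical, and the use of the public PubMed search address with a "term" parameter is an assumption about how such a link could be built, not a description of the web interface's code.

```python
from urllib.parse import urlencode

# Special taxa for which no PubMed link is offered (see "Taxon Name").
SPECIAL_TAXA = {"UNMATCHED", "FILTERED", "NA"}


def pubmed_query(taxon_name):
    """Build the search term described above for one taxon name."""
    return (
        f"({taxon_name}[Title/Abstract]) AND ((birth weight[Title/Abstract]) "
        "OR (SGA[Title/Abstract]) OR (small for gestational age[Title/Abstract])) AND "
        "((newborn[Title/Abstract]) OR (infant[Title/Abstract]) "
        "OR (child[Title/Abstract]) OR (baby[Title/Abstract]))"
    )


def pubmed_link(taxon_name):
    """Return a search URL for the taxon, or None for special taxa."""
    if taxon_name in SPECIAL_TAXA:
        return None
    # Assumed base URL of the PubMed search page.
    return "https://pubmed.ncbi.nlm.nih.gov/?" + urlencode({"term": pubmed_query(taxon_name)})


print(pubmed_link("Escherichia coli"))
```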