Annotation Database Import for Annotators

Revision as of 22:40, 30 November 2023 by Rob (talk | contribs)

Background

This application produces a configuration that the importer uses to import annotation data into the database. The importer downloads the configuration produced by this app and combines it with other sources of data to assemble the import.

The database contains a standardized set of fields and tags related to observations of species in real time from video feeds, or from video and photo annotations, in particular, those generated using Biigle. The labels and descriptions used in the field or during annotation have to be mapped to entities in the database in order to create a coherent dataset across multiple surveys.

The labeling strategy is documented in the annotation protocol -- configured on the Annotation Protocol page -- which tells future consumers of this data something about the annotators' objectives and provides some context to the results.

The Database Import for Annotators app consists of several sequential pages which are completed in order. When the current page's requirements are satisfied, the Next button becomes enabled. The Next, Previous and intermediate buttons can be used to navigate through the app.

Note: this application is under development and changes will probably occur. Feedback is always welcome.

Start

The start page provides a brief description of the app and what is required from the user.

Sign In

Here, the user will provide their login credentials and their Biigle username and API token.

Biigle API tokens are created on the Access Tokens page of the Biigle user profile.

Once the user logs in, a list of previously completed import jobs is displayed at the bottom of the page. These jobs can be viewed, edited and deleted.

To create a new import job, click the New Import Job button.

Select Project

On this page, if the user has provided correct Biigle credentials, the Select Biigle Project drop-down will be populated with all Biigle projects available to the user. (Otherwise, a pop-up will appear with the message, "Unauthenticated." This can be dismissed and will be removed in the future.)
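
For readers curious about what populating the drop-down involves, the following is a minimal sketch of an authenticated request for the project list. It assumes that the Biigle API accepts HTTP Basic authentication with the Biigle username and API token, and that projects are listed at /api/v1/projects. The host name, the function name and the credentials are placeholders, and the app's actual implementation may differ. The request is built but not sent.

```python
import base64
import urllib.request

# Assumption: placeholder host; replace with the address of your Biigle instance.
BIIGLE_URL = "https://biigle.example.org"


def build_projects_request(username: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a request for the projects visible to the user."""
    # HTTP Basic authentication: base64 of "username:token".
    credentials = f"{username}:{token}".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    # Assumption: /api/v1/projects is the project listing endpoint.
    request = urllib.request.Request(f"{BIIGLE_URL}/api/v1/projects")
    request.add_header("Authorization", f"Basic {encoded}")
    request.add_header("Accept", "application/json")
    return request


# Placeholder credentials, for illustration only.
request = build_projects_request("annotator@example.org", "my-api-token")
print(request.full_url)
```

If the credentials are wrong, a request like this would be rejected by the server, which is presumably the situation behind the "Unauthenticated." pop-up.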

The Load a Label Tree button allows the user to load a JSON file representing an artificial label tree derived from a database or spreadsheet file. More information about this can be found on the Label Trees page. This is not used for Biigle annotations.

Label Mapping

This is the page where most of the work is done.

The first section, Tag Documentation, provides a listing of the tags available for label mapping. Mouse over any of the tags to see a brief description of its purpose.

The Project Label Trees drop-down provides a listing of label trees in the selected Biigle project. If a non-Biigle label tree file has been loaded, there will be one tree in the list. Select a label tree from the list and click the + button to add it to the app. In most cases it will be desirable to add all of the available label trees for the project.

The All Label Trees list contains all available label trees in Biigle. In instances where labels have been used in a project but then removed, it will be necessary to map that tree even though it is no longer in the project label trees list. You may add it from this list if it still exists. (There is no current solution for missing label trees. It is a bad idea to delete them before this process is complete!)

If it is not clear which label tree a specific label comes from, copy the label's ID into the Find a Label's Tree text box and click the search button. This will return the name of the tree that owns the label, if it exists. You can then add that tree using the All Label Trees drop-down.
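
The lookup described above can be pictured as an index from label ID to tree name. The sketch below is illustrative only: the tree names, label IDs and function names are made up, and the app itself presumably queries Biigle rather than an in-memory dictionary.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical data: tree name -> list of (label ID, label name).
LABEL_TREES: Dict[str, List[Tuple[int, str]]] = {
    "Fish": [(101, "Rockfish"), (102, "Flatfish")],
    "Habitat": [(201, "Mud"), (202, "Sand")],
}


def build_label_index(trees: Dict[str, List[Tuple[int, str]]]) -> Dict[int, str]:
    """Map every label ID to the name of the tree that owns it."""
    index: Dict[int, str] = {}
    for tree_name, labels in trees.items():
        for label_id, _label_name in labels:
            index[label_id] = tree_name
    return index


def find_label_tree(index: Dict[int, str], label_id: int) -> Optional[str]:
    """Return the owning tree's name, or None if the label is unknown."""
    return index.get(label_id)


index = build_label_index(LABEL_TREES)
print(find_label_tree(index, 201))
print(find_label_tree(index, 999))
```

A result of None corresponds to the case where the label's tree no longer exists.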

The next section of the page contains the label trees and labels. Each label tree is listed in order, beginning with its title. If the tree has more than 50 labels, a page navigation strip is displayed below the title. Below the page navigation are the following controls:

  • Show hidden -- A checkbox that displays individual labels that have been hidden (more on that below).
  • Show label hierarchy -- A checkbox that toggles whether the full hierarchical path of a label is shown, or just its name.
  • Delete -- A button that removes the tree from the label mapping app.
  • Save -- A button that saves your current progress (this also happens automatically every thirty seconds).

Note that if you leave the page or delete a label tree from the app, the app will remember whatever progress has been made on that tree: if the tree is added again, the previous configurations should appear. When a new label tree is loaded, if any labels have been mapped before, the most recently used mapping will be pre-populated. Once the mapping is edited, it will maintain the edited state.
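
One way to picture the pre-population rule is as "the latest saved mapping wins". The following sketch assumes that saved mappings are keyed by label ID and carry a save counter; the class, field and function names are hypothetical and are not taken from the app.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SavedMapping:
    """A previously saved mapping for one label (hypothetical structure)."""
    label_id: int
    tags: List[str] = field(default_factory=list)
    saved_at: int = 0  # assumption: larger means saved more recently


def latest_mappings(history: List[SavedMapping]) -> Dict[int, SavedMapping]:
    """Keep only the most recently saved mapping for each label ID."""
    latest: Dict[int, SavedMapping] = {}
    for mapping in history:
        current = latest.get(mapping.label_id)
        if current is None or mapping.saved_at >= current.saved_at:
            latest[mapping.label_id] = mapping
    return latest


def prepopulate(label_ids: List[int], history: List[SavedMapping]) -> Dict[int, List[str]]:
    """Return the starting tags for each label in a newly loaded tree."""
    latest = latest_mappings(history)
    return {
        label_id: list(latest[label_id].tags) if label_id in latest else []
        for label_id in label_ids
    }


history = [
    SavedMapping(101, ["Observation", "Species"], saved_at=1),
    SavedMapping(101, ["Observation", "Species", "Dead"], saved_at=2),
    SavedMapping(201, ["Habitat", "Substrate", "Type"], saved_at=1),
]
table = prepopulate([101, 201, 301], history)
print(table)
```

Label 301 has never been mapped, so it starts with no tags.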

The labels are displayed in a table whose columns depend on which tags are selected in the Tag(s) drop-down. The static columns are:

  • Hide -- If the button is clicked, the row is hidden.
  • Label ID -- The Biigle label ID.
  • Source Label -- The Biigle label text. If Show label hierarchy is checked, the full path will be shown.
  • Tag(s) -- A multi-select drop-down to which the user will add tags to perform label mapping.
  • General -- An experimental feature which will be documented or removed in the future.

To begin the mapping process, choose tags from the Tag(s) drop-down.

If the label is an observation of a species:

  1. Select the Observation tag. The list of available tags will update to contain choices relevant to that top-level label (see the tags list at the top of the page to see the tag relationships).
  2. Select the Species tag. The Observation > Species field set will be added to the table. This includes the scientific and common names of the species, the Aphia ID, iNaturalist ID and Hart code, and an OTU.
  3. Click the search button to the right of the label. You may also select text within the label and then click search -- the search will be performed only on the selected text.
  4. If the searched text represents the name, or part of the name, of a species, a list of results will be shown, containing the scientific and common names of the species (or genus, class, etc.), and the Aphia ID, iNaturalist ID and Hart code, if available. If the search results are unsatisfactory, the user can type any text into the search field and try again, or click the search button next to any of the names in the results list to search on those names. Clicking on the common and scientific names and IDs will place those values into the appropriate fields. The fields can also be populated manually with any appropriate value.
  5. Choose additional tags from the Tag(s) drop-down to modify or specialize the label mapping. If the organism is dead, the Dead tag can be applied; if the organism appears as a grouping or school that is too large to count, the Group tag will indicate as much. These labels appear in the database as a list of tags.
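
The outcome of the steps above can be thought of as a record that pairs a source label with its tags and the Observation > Species fields. The sketch below is an illustration under assumptions: the class and attribute names are invented, the label ID and label text are made up, and the ID fields are left empty rather than guessed. The search_text helper mirrors the behaviour in step 3, where selected text takes precedence over the whole label.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SpeciesFields:
    """The Observation > Species field set (attribute names are hypothetical)."""
    scientific_name: str = ""
    common_name: str = ""
    aphia_id: Optional[int] = None
    inaturalist_id: Optional[int] = None
    hart_code: Optional[str] = None
    otu: Optional[str] = None


@dataclass
class LabelMapping:
    """One row of the mapping table (hypothetical structure)."""
    label_id: int
    source_label: str
    tags: List[str] = field(default_factory=list)
    species: Optional[SpeciesFields] = None


def search_text(label: str, selection: str = "") -> str:
    """Search on the selected text if there is any, otherwise on the whole label."""
    selection = selection.strip()
    return selection if selection else label.strip()


mapping = LabelMapping(
    label_id=101,  # made-up label ID
    source_label="Fish > Rockfish > Yelloweye rockfish (dead)",
    tags=["Observation", "Species", "Dead"],
    species=SpeciesFields(
        scientific_name="Sebastes ruberrimus",
        common_name="yelloweye rockfish",
    ),
)
print(search_text(mapping.source_label, "Yelloweye rockfish"))
```

Here the Dead tag carries the "(dead)" qualifier from the source label, so the species fields themselves stay clean.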

If the label is a habitat:

  1. Select the Habitat tag.
  2. Select the Substrate tag. Substrate is generally used to describe the underlying substrate, while Biocover is used to describe living material covering the substrate.
  3. Select the Type tag. This will cause a new field set, Habitat > Substrate > Type, to appear with one field, Substrate. This field is a drop-down containing existing substrate types. Select one to match the substrate in the label.

It may happen that a label contains multiple characteristics. For example, the label "Habitat (Subdominant) > Substrate: Mud" describes a substrate of type "mud," but also describes it as subdominant. Many survey protocols capture both the dominant and subdominant habitat substrate, so the user can select the Subdominant or Dominant tags to flag the habitat. A label may also include complexity or relief information. Selecting the Complexity or Relief tags will display a drop-down, allowing the user to configure those characteristics as well.
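
To make the example concrete, the sketch below breaks a label of that shape into the tags and field values a user would choose for it. In the app this selection is made by hand; the function name, the output structure and the assumption that labels follow the "Substrate: <type>" pattern are all hypothetical.

```python
import re
from typing import Dict, List, Union


def suggest_habitat_tags(label: str) -> Dict[str, Union[List[str], Dict[str, str]]]:
    """Suggest tags and field values for a habitat label (illustrative only)."""
    tags: List[str] = []
    fields: Dict[str, str] = {}
    if re.search(r"\bhabitat\b", label, re.IGNORECASE):
        tags.append("Habitat")
    # The word boundary keeps "dominant" from matching inside "subdominant".
    if re.search(r"\bsubdominant\b", label, re.IGNORECASE):
        tags.append("Subdominant")
    elif re.search(r"\bdominant\b", label, re.IGNORECASE):
        tags.append("Dominant")
    # Assumption: the substrate type follows "Substrate:" up to the next ">".
    match = re.search(r"\bsubstrate\s*:\s*([^>]+)", label, re.IGNORECASE)
    if match:
        tags.extend(["Substrate", "Type"])
        fields["Substrate"] = match.group(1).strip()
    return {"tags": tags, "fields": fields}


suggestion = suggest_habitat_tags("Habitat (Subdominant) > Substrate: Mud")
print(suggestion)
```

For the example label this yields the tags Habitat, Subdominant, Substrate and Type, with the Substrate field set to "Mud".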

Personnel

On this page, a list of annotators will appear alongside a list of matching users already stored in the database. If an annotator doesn't yet exist in the database, they can be added. The page will attempt to map names automatically, but the user may have to verify and adjust the result.
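
The page's matching method is not documented here. As an illustration of why the result may need adjusting, the sketch below uses approximate string matching from Python's standard difflib module, which is an assumption and not necessarily what the app does; the names and the cutoff value are made up.

```python
import difflib
from typing import Dict, List, Optional


def normalize(name: str) -> str:
    """Lower-case the name and collapse repeated whitespace."""
    return " ".join(name.lower().split())


def match_annotators(
    annotators: List[str], db_users: List[str], cutoff: float = 0.8
) -> Dict[str, Optional[str]]:
    """Pair each annotator with the closest database user, or None."""
    normalized = {normalize(user): user for user in db_users}
    result: Dict[str, Optional[str]] = {}
    for annotator in annotators:
        close = difflib.get_close_matches(
            normalize(annotator), list(normalized), n=1, cutoff=cutoff
        )
        result[annotator] = normalized[close[0]] if close else None
    return result


matches = match_annotators(
    ["Jane  Doe", "Jon Smith", "Alex Unknown"],
    ["Jane Doe", "John Smith"],
)
print(matches)
```

An annotator mapped to None would be a candidate for adding to the database, while a near match such as "Jon Smith" is the kind of result the user should verify.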

Note: there are no personnel for label trees loaded through a label tree file (yet).

Annotation Protocol

This section describes the protocol used for annotation. Fields are available for the protocol name, the person who originated the protocol, the observation interval, etc. There is space at the bottom of the form for uploading annotation protocol documents, species guides or other files.

Complete

Here, the user can enter a unique name for the annotation job, and the name of the cruise to which it applies. Notes can be added for future reference or as a guide to the person who does the final import into the database.

Click Submit to submit the job. The app will switch back to the sign-in page, where the job will appear in the list.