Annotation Database: Difference between revisions
No edit summary |
No edit summary |
||
Line 6: | Line 6: | ||
There are two databases on the server, <code>msea</code> and <code>msea_stage</code>. The former is the production database. The latter is a copy of the production database used for staging changes and upgrades: once the upgrades are applied and tested successfully, they can be applied to the production database. | There are two databases on the server, <code>msea</code> and <code>msea_stage</code>. The former is the production database. The latter is a copy of the production database used for staging changes and upgrades: once the upgrades are applied and tested successfully, they can be applied to the production database. | ||
= Design = | = Design = | ||
Line 17: | Line 13: | ||
Full database documentation, generated from the production database, can be found here: [[Database Entity Documentation]]. | Full database documentation, generated from the production database, can be found here: [[Database Entity Documentation]]. | ||
= Connecting | = Connecting = | ||
There are a variety of scripts for accessing, importing and managing data, in addition to the Website itself, which is based on Django. All of these load connection parameters from configuration files in the <code>configs</code> directory in the user's home directory, whether on Windows or Linux. Typically, the configuration files are accessed by a path similar to <code>~/configs/db_prod.conf</code>, were <code>~</code> is the standard shorthand for the user's home directory and <code>db_prod.conf</code> is the file containing connection parameters for the production database's non-administrative user. A standard set of files in the <code>configs</code> might look like: | There are a variety of scripts for accessing, importing and managing data, in addition to the Website itself, which is based on Django. All of these load connection parameters from configuration files in the <code>configs</code> directory in the user's home directory, whether on Windows or Linux. Typically, the configuration files are accessed by a path similar to <code>~/configs/db_prod.conf</code>, were <code>~</code> is the standard shorthand for the user's home directory and <code>db_prod.conf</code> is the file containing connection parameters for the production database's non-administrative user. A standard set of files in the <code>configs</code> might look like: | ||
Line 37: | Line 33: | ||
The database name will be one of <code>msea_dev</code> for the local development database (though the developer can choose any name); <code>msea</code> for the production database; or <code>msea_stage</code> for the staging database. | The database name will be one of <code>msea_dev</code> for the local development database (though the developer can choose any name); <code>msea</code> for the production database; or <code>msea_stage</code> for the staging database. | ||
= | = Upgrades = | ||
As above, the database is upgraded by running a script, which applies the necessary DDL files. | As above, the database is upgraded by running a script, which applies the necessary DDL files. | ||
Line 58: | Line 48: | ||
In both of these invocations, the configuration file is for the production database with admin permissions (which are required to alter the structure). The <code>-c</code> flag commits the changes and the <code>-p</code> flag applies the correct permissions to any created entities. | In both of these invocations, the configuration file is for the production database with admin permissions (which are required to alter the structure). The <code>-c</code> flag commits the changes and the <code>-p</code> flag applies the correct permissions to any created entities. | ||
= Backups = | |||
A [https://en.wikipedia.org/wiki/Cron cron] job runs a backup on the production database every night, saving a compressed dump to a designated folder. The backup is actually performed by the [https://gitlab.com/rskelly/msea-rov-db/-/blob/main/utilities/maintenance/db_maintenance.sh db_maintenance.sh] script, which is installed by the database install script, [https://gitlab.com/rskelly/msea-rov-db/-/blob/main/app/server_config/install.sh install.sh]. More information can be found on the [[Deployment]] page. | |||
= Helpful Database Tools = | |||
A few tools that can be useful for navigating and visualising the database: | |||
* [https://www.pgadmin.org/ pgAdmin] | |||
* [https://dbeaver.io/ DBeaver] |
Revision as of 01:51, 10 November 2023
Software
The database runs on PostgreSQL with the PostGIS spatial extension. The database and Web server (Nginx) are hosted on a cloud-based server running the Debian Linux operating system.
Organisation
There are two databases on the server, msea
and msea_stage
. The former is the production database. The latter is a copy of the production database used for staging changes and upgrades: once the upgrades are applied and tested successfully, they can be applied to the production database.
Design
The database is relational and normalized which maximizes data integrity while minimizing duplication. The database was originally designed using the industry-standard Visual Paradigm software, but is now updated by the incremental application of DDL by a script, db_manager.py (see below). The database was developed alongside the Website backend using Django but the Django models do not manage the table structure directly: the database is intended to be framework-agnostic, so the Django models must be manually updated to agree with the database structure.
Full database documentation, generated from the production database, can be found here: Database Entity Documentation.
Connecting
There are a variety of scripts for accessing, importing and managing data, in addition to the Website itself, which is based on Django. All of these load connection parameters from configuration files in the configs
directory in the user's home directory, whether on Windows or Linux. Typically, the configuration files are accessed by a path similar to ~/configs/db_prod.conf
, were ~
is the standard shorthand for the user's home directory and db_prod.conf
is the file containing connection parameters for the production database's non-administrative user. A standard set of files in the configs
might look like:
~/configs/db_prod.conf
-- The production database non-administrative user. Used by the Website to access data.~/configs/db_prod_admin.conf
-- The production database administrative user. Used bydb_manager.py
and other scripts that can change the structure of the database or its contents.~/configs/db_stage.conf
-- Similar todb_prod.conf
but used for the staging database.~/configs/db_stage_admin.conf
-- Similar todb_prod_admin.conf
but used for the staging database.~/configs/db_dev.conf
-- Used on the developer's local machine for access to a development database.
The configuration files will contain colon-delimited pairs of strings on each line:
host:localhost port:5432 dbname:[database name] username:[user name] password:[password]
The database name will be one of msea_dev
for the local development database (though the developer can choose any name); msea
for the production database; or msea_stage
for the staging database.
Upgrades
As above, the database is upgraded by running a script, which applies the necessary DDL files.
With no switches, the script will perform a dry run, applying the DDL to the database but not committing the changes. If it succeeds, the upgrade can be applied permanently by using the -c
switch. The script checks a table to determine the current database version and applies upgrades from that version forward (the versions are encoded in the DDL file names). If the version is given as an argument, upgrades are applied only up to that version (inclusive). The final argument to the script is the path to a configuration file which contains colon-delimited database connection parameters.
An example invocation of the script for a dry run is,
./db_manager.py ~/configs/db_prod_admin.conf
To apply the changes permanently, one would invoke,
./db_manager.py -c -p ~/configs/db_prod_admin.conf
In both of these invocations, the configuration file is for the production database with admin permissions (which are required to alter the structure). The -c
flag commits the changes and the -p
flag applies the correct permissions to any created entities.
Backups
A cron job runs a backup on the production database every night, saving a compressed dump to a designated folder. The backup is actually performed by the db_maintenance.sh script, which is installed by the database install script, install.sh. More information can be found on the Deployment page.
Helpful Database Tools
A few tools that can be useful for navigating and visualising the database: