Annotation Database Deployment

Latest revision as of 20:01, 19 February 2024

Functional Overview

This is a brief description of the services operating on the server, how they fit together and how they're configured. More detail on each piece will eventually be available below.

Nginx

Nginx is the front door of the server and all of the applications hosted there. Nginx can serve static files, redirect a request to other hosts, or behave as a proxy for other services on the same machine (this simply means that it forwards requests to another server while behaving as if it were the server itself). Most importantly, Nginx serves as a load balancer: it can absorb high volumes of traffic and respond to failures in other services by redirecting clients or telling them what went wrong.

By convention, Nginx configuration files are stored in /etc/nginx/sites-available and symbolically linked into /etc/nginx/sites-enabled to activate them. This provides an easy way to take services offline without deleting the original file, and it prevents the proliferation of different versions of a file. Because the entry in sites-enabled is a link rather than a copy, edits made in sites-available are automatically reflected there, although Nginx must still be reloaded before it applies them.
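The linking convention can be tried out safely with a small Python sketch. It is only an illustration: the function names are made up for this example, and temporary directories stand in for /etc/nginx so nothing on a real server is touched.

```python
import tempfile
from pathlib import Path


def enable_site(available: Path, enabled: Path, name: str) -> Path:
    """Activate a site by linking sites-available/<name> into sites-enabled."""
    link = enabled / name
    if link.is_symlink() or link.exists():
        link.unlink()  # replace any stale entry
    link.symlink_to(available / name)
    return link


def disable_site(enabled: Path, name: str) -> None:
    """Take a site offline by removing only the link; the original file stays."""
    (enabled / name).unlink()


def demo() -> tuple[str, bool]:
    # Temporary directories stand in for /etc/nginx (an assumption of this sketch).
    with tempfile.TemporaryDirectory() as root:
        available = Path(root, "sites-available")
        enabled = Path(root, "sites-enabled")
        available.mkdir()
        enabled.mkdir()
        (available / "msea").write_text("# first version\n")
        link = enable_site(available, enabled, "msea")
        # Edit the original; the link shows the new content immediately.
        (available / "msea").write_text("# edited version\n")
        seen_through_link = link.read_text()
        disable_site(enabled, "msea")
        return seen_through_link, (available / "msea").exists()
```

Calling demo() returns the text read through the link after the edit, plus a flag showing that the original file survives once the link is removed.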

One can also link files directly from the repository to sites-enabled, but one should be very careful with this strategy: breaking changes in the repo will automatically break the server, which is a bad idea in production!

The configuration files used by the ROV database site are msea.nginx and mseastage.nginx, both kept in the repository's app/server_config directory. The former is used for production, the latter for a staging site where changes are tested before deployment. They are renamed to msea and mseastage when they are deployed to the server; the .nginx extension is just for easy identification in the repo.

IP Filtering

Because some parts of the site are not open to the public, we need an IP filtering block that can be included in a location block in the configuration. The file is called ipfilter.nginx.
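The idea behind an allow-list filter can be sketched in Python with the standard ipaddress module. This is not the content of ipfilter.nginx: the networks below are reserved documentation ranges chosen purely for illustration, and is_allowed is a hypothetical helper.

```python
import ipaddress

# Hypothetical allow-list; these are reserved documentation ranges,
# not the networks actually listed in ipfilter.nginx.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_allowed(client_ip: str, networks=ALLOWED_NETWORKS) -> bool:
    """Return True when the client address falls inside any allowed network."""
    address = ipaddress.ip_address(client_ip)
    # Membership is simply False when the address and network versions differ.
    return any(address in network for network in networks)
```

A request from an address outside every listed network would be refused, which is the behaviour the filter block gives to the locations that include it.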

Wiki Configuration

This site uses MediaWiki and runs on an Apache instance which receives requests through Nginx, for the sake of convenience. Configuration is described on the MediaWiki site.

uWSGI

uWSGI (i.e., micro-WSGI) is a Python application server which receives proxy requests from Nginx. Rather than running Django as a standalone application as one would in development, uWSGI configures an environment and loads the application into memory, reserving memory and processes to handle requests. uWSGI is configured to receive requests through a socket file stored in the /tmp directory.

The configuration path structure for uWSGI is similar to that of Nginx. Application configuration files are stored in /etc/uwsgi/apps-available and linked into /etc/uwsgi/apps-enabled. For each of the msea and mseastage sites mentioned in the Nginx section above, there is a corresponding Django app, configured by msea_uwsgi.ini and mseastage_uwsgi.ini respectively.

The following example shows an Nginx configuration which forwards requests to the REST API, a Django application:

       # API endpoint.
       location /api {
               uwsgi_pass unix:/tmp/uwsgi_msea.sock;
               include uwsgi_params;
               uwsgi_read_timeout 300s;
               client_max_body_size 64m;
       }

The uwsgi_pass directive instructs the server to forward requests to the given socket file.

The corresponding uWSGI configuration follows:

   [uwsgi]
   plugin = python3
   venv = /var/msea/.venv
   home = /var/msea/.venv
   chdir = /var/msea/
   env = DJANGO_SETTINGS_MODULE=main.settings.prod
   master = True
   log-master = True
   vacuum = True
   max-requests = 5000
   module = main.wsgi:application
   workers = 2
   socket = /tmp/uwsgi_msea.sock
   py-autoreload = 1
   uid = www-data
   gid = www-data

Notice the socket directive, which is identical to the socket file in the Nginx location block.
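Because a mismatch between the two socket paths is an easy mistake to make, a quick consistency check can help. The following Python sketch is one way to do it, under some assumptions: the helper names are invented for this example, and the embedded snippets are shortened copies of the configurations shown above rather than the deployed files.

```python
import configparser
import re

# Shortened copies of the configurations shown above.
NGINX_SNIPPET = """
location /api {
    uwsgi_pass unix:/tmp/uwsgi_msea.sock;
    include uwsgi_params;
}
"""

UWSGI_INI = """
[uwsgi]
module = main.wsgi:application
socket = /tmp/uwsgi_msea.sock
"""


def nginx_sockets(text: str) -> set[str]:
    """Collect every Unix socket path named by a uwsgi_pass directive."""
    return set(re.findall(r"uwsgi_pass\s+unix:([^;\s]+)\s*;", text))


def uwsgi_socket(text: str) -> str:
    """Read the socket option from the [uwsgi] section of an ini file."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(text)
    return parser["uwsgi"]["socket"]


def sockets_match(nginx_text: str, uwsgi_text: str) -> bool:
    """True when the uWSGI socket is one that Nginx forwards requests to."""
    return uwsgi_socket(uwsgi_text) in nginx_sockets(nginx_text)
```

In practice the two strings would be read from the files in sites-enabled and apps-enabled before being passed to sockets_match.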

The venv and home directives point to a Python virtual environment directory, which must have Django and all other dependencies installed in it (a requirements file is a good idea). Note that the plugin directive names uWSGI's python3 plugin, which is built against the system Python. It is used to set up the uWSGI environment, even though a user working in the virtual environment on the command line would run python and pip from the venv itself.

The module directive points to the Django application. In this case, the Django project is called main (the project name is the name of the folder with the wsgi.py file in it). Within the wsgi.py file, the application instance is created as application. Note that some documentation and example configurations name the instance app; that will fail here unless wsgi.py actually defines app. Open wsgi.py and verify the name of the application instance.
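The check on the instance name can also be done programmatically. The Python sketch below resolves a string in the same module:attribute form as the module directive; load_wsgi_callable is a hypothetical helper written for this page, not part of uWSGI or Django.

```python
import importlib
from typing import Callable


def load_wsgi_callable(spec: str) -> Callable:
    """Resolve a 'package.module:attribute' string to the object it names."""
    module_name, separator, attribute = spec.partition(":")
    if not separator or not attribute:
        raise ValueError(f"expected 'module:attribute', got {spec!r}")
    module = importlib.import_module(module_name)
    # getattr raises AttributeError when the name is wrong, e.g. "app"
    # in a module that only defines "application".
    target = getattr(module, attribute)
    if not callable(target):
        raise TypeError(f"{spec} is not callable")
    return target
```

On the server this would be run from /var/msea with the virtual environment active and DJANGO_SETTINGS_MODULE set, as load_wsgi_callable("main.wsgi:application"). It can be tried anywhere with a standard-library module, for example load_wsgi_callable("wsgiref.simple_server:demo_app").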

The env directive sets the DJANGO_SETTINGS_MODULE environment variable, which points to the application settings. Like the module directive, its value starts with the project's root module (main) followed by the settings submodule (settings). The ROV database application has several sub-submodules under settings, of which prod is selected here, but most applications will only have one settings module, referenced as main.settings.

Python Requirements

The requirements file is used by the install script to configure the virtual environment used by uWSGI to serve the application. This file is output by pip freeze.
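Since pip freeze writes one name==version pin per line, the file is easy to inspect from Python. The sketch below is a simplified reader, assuming plain pins only; parse_pins is a hypothetical helper, and it skips any line that is not a simple pin (such as URL-based requirements).

```python
def parse_pins(text: str) -> dict[str, str]:
    """Map lower-cased package names to pinned versions from freeze-style text."""
    pins: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # not a simple pin; ignored by this sketch
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins
```

Comparing the result for two copies of the file (for example, the repository version and one regenerated inside the virtual environment) shows which packages differ.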

Installation

The database, utilities and website are all stored in the same git repository (https://gitlab.com/rskelly/msea-rov-db), which is checked out on the server machine and deployed using a single script, install.sh. The script is run from the command line, at which point it will prompt the user to type stage or prod. If the former, the application is deployed to the staging environment; if the latter, it is deployed to the production environment.

Two switches are available:

  • -v will recompile and install the Vue code.
  • -r will reinstall the Python libraries required by the application.

If neither of these steps is required, the script can be run without switches. This is appropriate when only the backend code has been updated.
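The combination of the prompt and the two switches can be modelled in a few lines of Python. This is not a port of install.sh, whose internals are not shown here: the function names, the step descriptions and the handling of stray whitespace or capitals in the answer are all assumptions made for illustration.

```python
import argparse


def parse_install_args(argv: list[str]) -> argparse.Namespace:
    """Parse the two optional switches described above."""
    parser = argparse.ArgumentParser(prog="install.sh")
    parser.add_argument("-v", dest="rebuild_vue", action="store_true",
                        help="recompile and install the Vue code")
    parser.add_argument("-r", dest="reinstall_python", action="store_true",
                        help="reinstall the required Python libraries")
    return parser.parse_args(argv)


def resolve_target(answer: str) -> str:
    """Validate the answer typed at the stage/prod prompt."""
    target = answer.strip().lower()
    if target not in ("stage", "prod"):
        raise ValueError(f"expected 'stage' or 'prod', got {answer!r}")
    return target


def plan(argv: list[str], answer: str) -> list[str]:
    """List the steps a run would perform (descriptions are illustrative)."""
    args = parse_install_args(argv)
    target = resolve_target(answer)
    steps = []
    if args.rebuild_vue:
        steps.append("rebuild Vue code")
    if args.reinstall_python:
        steps.append("reinstall Python libraries")
    steps.append(f"deploy backend to {target}")
    return steps
```

For instance, plan(["-v"], "stage") lists a Vue rebuild followed by a staging deployment, while plan([], "prod") lists only the production deployment.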

The database is not automatically updated by the install script. To deploy the database, follow the instructions on the Annotation Database page.

Tile Server Configuration

The map tile server runs on an Apache instance which receives requests through Nginx. The tile server configuration is described here.

Firewall Configuration

A script, ufw.sh, is run by the install script to reconfigure the firewall so that only the necessary services are available to the world.

The exclusion of IPs external to DFO is not performed by the firewall because some services must be available to external users.