BDR – Brown University Library Digital Technologies

Deploying with shiv

I recently watched a talk called “Containerless Django – Deploying without Docker”, by Peter Baumgartner. Peter lists some benefits of Docker: that it gives you a pipeline for getting code tested and deployed, the container adds some security to the app, state can be isolated in the container, and it lets you run the exact same code in development and production.

Peter also lists some drawbacks to Docker: it’s a lot of code that could slow things down or have bugs, Docker artifacts can be relatively large, and it adds extra abstractions to the system (e.g. filesystem, network). He argues that an ideal deployment would consist of downloading a binary, creating a configuration file, and running it (like one can do with compiled C or Go programs).

Peter describes a process of deploying Django apps by creating a zipapp using shiv and goodconf, and deploying it with systemd constraints that add to the security. He argues that this process achieves most of the benefits of Docker, but more simply, and that there’s a sweet spot for application size where this type of deploy is a good solution.

I decided to try using shiv with our image server Loris. I ran the shiv command “shiv -o loris.pyz .”, and I got the following error:

User “loris” and or group “loris” do(es) not exist.
Please create this user, e.g.:
`useradd -d /var/www/loris -s /sbin/false loris`

The issue is that the install process in the Loris setup.py file not only checks for the loris user, as shown in the error, but also sets up directories on the filesystem (including setting the owner and permissions, which requires root permissions). I submitted a PR to remove the filesystem setup from the Python package installation (and put it in a script the user can run), and hopefully in the future it will be easier to package up Loris and deploy it in different ways.

Checksums

In the BDR, we calculate checksums automatically on ingest (Fedora 3 provides that functionality for us), so all new content binaries going into the BDR get a checksum, which we can go back and check later as needed.

We can also pass checksums into the BDR API, and then we verify that Fedora calculates the same checksum for the ingested file, which shows that the content wasn’t modified since the first checksum was calculated. We have only been able to use MD5 checksums, but we want to be able to use more checksum types. This isn’t a problem for Fedora, which can calculate multiple checksum types, such as MD5, SHA1, SHA256, and SHA512.
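To make the client side of this concrete, here is a minimal sketch of computing several checksum types for a file before sending it to an API. It uses only Python's standard hashlib module; the function names `file_checksums` and `checksum_matches` are made up for this illustration and are not part of the BDR API, Fedora, or eulfedora.

```python
import hashlib

# Algorithm names as hashlib spells them, matching the types listed above.
ALGORITHMS = ("md5", "sha1", "sha256", "sha512")


def file_checksums(path, algorithms=ALGORITHMS, chunk_size=1024 * 1024):
    """Return {algorithm: hex digest} for the file at path, reading it once."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as f:
        # Read in chunks so large binaries are never loaded into memory whole.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def checksum_matches(path, algorithm, expected_hex):
    """Compare a previously recorded checksum with one computed now."""
    actual = file_checksums(path, [algorithm])[algorithm]
    return actual == expected_hex.strip().lower()
```

Reading the file once and feeding every hasher from the same chunk keeps the cost close to a single pass, even when several checksum types are requested.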

However, there is a complicating factor – if Fedora gets a checksum mismatch, by default it returns a 500 response code with no message, so we can’t tell whether it was a checksum mismatch or some other server error. Thanks to Ben Armintor, though, we found that we can update our Fedora configuration so it returns the Checksum Mismatch information.

Another issue in this process is that we use eulfedora (which doesn’t seem to be maintained anymore). If a checksum mismatch happens, it raises a DigitalObjectSaveFailure, but we want to know that there was a checksum mismatch. We forked eulfedora and exposed the checksum mismatch information. Now we can remove some extra code that we had in our APIs, since more functionality is handled in Fedora/eulfedora, and we can use multiple checksum types.

Looking at the Oxford Common Filesystem Layout (OCFL)

Currently, the BDR contains about 34TB of content. The storage layer is Fedora 3, and the data is stored internally by Fedora (instead of being stored externally). However, Fedora 3 is end-of-life, which means that we either maintain it ourselves or migrate to something else. We don’t want to migrate 34TB and then have to migrate it again if we change software again; we’d like to be able to change our software without migrating all our data.

This is where the Oxford Common Filesystem Layout (OCFL) work is interesting. OCFL is an effort to define how repository objects should be laid out on the filesystem. OCFL is still very much a work-in-progress, but the “Need” section of the specification speaks directly to what I described above. If we set up our data using OCFL, hopefully we can upgrade and change our software as necessary without having to move all the data around.

Another benefit of the OCFL effort is that it’s work being done by people from multiple institutions, building on other work and experience in this area, to define a good, well-thought-out layout for repository objects.

Finally, using a common specification for the filesystem layout of our repository means that there’s a better chance that other software will understand how to interact with our files on disk. The more people using the same filesystem layout, the more potential collaborators and applications for implementing the OCFL specification – safely creating, updating, and serving out content for the repository.

MySQL 5.7 migration

We recently migrated the BDR databases from MySQL version 5.5 to 5.7. Here are a couple of benefits for us as application developers:

Stricter Data Handling

By default, MySQL 5.7 uses stricter data handling than 5.5, so we don’t have to manually put MySQL into strict mode.
MySQL 5.5’s loose data handling bit us last summer. We have an application where files can be uploaded, and the file names are stored in the database. A user started getting errors trying to upload new files, because the file names were duplicates (all the file names in the database are required to be unique). It turned out that the file names were too long for the field, so they were being truncated and put into the table anyway. Then, duplicate errors were thrown if a new file name truncated to the same name as another truncated file name. After that, we put MySQL into strict mode for some of our databases, but now it will be that way by default.
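The failure is easy to reproduce in miniature. The sketch below is a pure-Python simulation, not MySQL itself: a set stands in for a column with a unique index, and the function names and the 13-character column width are invented for the example.

```python
def truncate_like_non_strict_mysql(value, max_length):
    """Mimic non-strict mode: silently cut the value to the column width."""
    return value[:max_length]


def insert_unique(table, value, max_length, strict):
    """Insert value into a set standing in for a column with a unique index."""
    if len(value) > max_length:
        if strict:
            # Strict mode rejects the row up front, with the real reason.
            raise ValueError("data too long for column")
        value = truncate_like_non_strict_mysql(value, max_length)
    if value in table:
        # The confusing error: the full names differ, the truncated ones do not.
        raise KeyError("duplicate entry: " + value)
    table.add(value)
    return value


table = set()
# Stored as "annual_report" because the column only holds 13 characters.
insert_unique(table, "annual_report_2016_final.pdf", 13, strict=False)
try:
    insert_unique(table, "annual_report_2017_final.pdf", 13, strict=False)
except KeyError as exc:
    print("non-strict:", exc)
```

With `strict=True`, the first over-long name would fail immediately with a "too long" error, which points at the actual problem instead of a misleading duplicate.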

Support

The second benefit is that Django 2.1 won’t support 5.5 anymore, and MySQL 5.5 will be End-of-Life this year, so this migration gets us on a better-supported version of MySQL.

Now, if only ‘UTF-8’ in MySQL actually meant UTF-8… Actually, MySQL 8.0 was recently released, and it looks like it uses UTF8MB4 (ie. real UTF-8) by default, so that may be helpful in the future when we move to 8.0.
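The difference comes down to bytes per character: MySQL’s older ‘utf8’ character set stores at most three bytes per character, while real UTF-8 needs four for characters outside the Basic Multilingual Plane (most emoji, for example). Here is a small sketch of checking a string before it reaches the database; the helper name is made up for this illustration.

```python
def fits_in_three_byte_utf8(text):
    """True if every character needs at most 3 bytes when encoded as UTF-8."""
    return all(len(ch.encode("utf-8")) <= 3 for ch in text)


# An accented letter (2 bytes) and the euro sign (3 bytes) both fit.
print(fits_in_three_byte_utf8("caf\u00e9 \u20ac"))
# U+1F600 (an emoji) needs 4 bytes, so it needs utf8mb4.
print(fits_in_three_byte_utf8("\U0001F600"))
```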

Python/Django warnings

I recently updated a Django project from 1.8 to 1.11. In the process, I started turning warnings into errors. The Django docs recommend resolving any deprecation warnings with the current version before upgrading to a new version of Django. In this case, I didn’t start my upgrade work by resolving warnings, but I did run the tests with warnings enabled for part of the process.

Here’s how to enable all warnings when you’re running your tests:

From the CLI
- use -Werror to raise Exceptions for all warnings
- use -Wall to print all warnings
In the code
- import warnings; warnings.filterwarnings('error') – raise Exceptions on all warnings
- import warnings; warnings.filterwarnings('always') – print all warnings
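As a small illustration of the in-code options, the sketch below wraps each filter in warnings.catch_warnings() so the change only applies inside the block. `old_api` is a hypothetical stand-in for any deprecated call, and the two helper names are invented for the example.

```python
import warnings


def old_api():
    """Hypothetical function standing in for a deprecated call."""
    warnings.warn("old_api() is deprecated", DeprecationWarning, stacklevel=2)
    return 42


def call_with_warnings_as_errors(func):
    """Run func with every warning promoted to an exception."""
    with warnings.catch_warnings():
        warnings.filterwarnings("error")
        return func()


def collect_warnings(func):
    """Run func and return the warnings it emitted instead of printing them."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.filterwarnings("always")
        func()
    return caught


try:
    call_with_warnings_as_errors(old_api)
except DeprecationWarning as exc:
    print("raised:", exc)

for w in collect_warnings(old_api):
    print("recorded:", w.category.__name__, w.message)
```

Scoping the filter with catch_warnings() is handy in a single test; the -W command-line flags are the simpler choice when you want the behavior for a whole test run.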

If a project runs with no warnings on a Django LTS release, it’ll (generally) run on the next LTS release as well. This is because Django intentionally tries to keep compatibility shims until after an LTS release, so that third-party applications can more easily support multiple LTS releases.

Enabling warnings is nice because you see warnings from Python or other packages, so you can address whatever problems they’re warning about, or at least know that they will be an issue in the future.

Fedora Functionality

We are currently using Fedora 3 for storing our repository object binaries and metadata. However, Fedora 3 is end of life and unsupported, so eventually we’ll have to decide what our plan for the future is. Here we inventory some of the functions that we use (or could use) from Fedora. We’ll use this as a start for determining the features we’ll be looking for in a replacement.

Binary & metadata storage
Binary & metadata versioning
Tracks object & file created/modified dates
Checksum calculation/verification (after ingestion, during transmission to Fedora). Note: in Fedora 3.8.1, Fedora returns a 500 response with an empty body if the checksums don’t match – that makes Fedora’s checking less useful, since the API client can’t tell why the ingest caused an exception.
SSL REST API for interacting with objects/content
Messages generated whenever an object is added/updated/deleted
Grouping of multiple binaries in one object
Works with binaries stored outside of Fedora
Files are human-readable
Search (by state, date created, date modified – search provided by the database)
Safety when updating the same object from multiple processes

Upgrades and Architecture changes in the BDR

Recently we have been making some architectural changes in the BDR. One big change was migrating from RHEL 5 to RHEL 7, but we also moved from basically a one-server setup to four separate servers.

RHEL 5 => RHEL 7

RHEL 5 support ended in March, so we needed to upgrade. We initially got a RHEL 6 server, but then decided to upgrade to RHEL 7, which will give us more time before we have to upgrade again. Moving to RHEL 7 lets us use more up-to-date software like Redis 2.8.19, instead of 2.4.10, but the biggest issue is that security updates are no longer available for RHEL 5.

Added a Server for Loris

We started using Loris back in the fall. We installed Loris on a new server, and eventually we shut down our previous image server that was running on the same server as most of our other services.

Added Servers for Fedora & Solr

We also added a new server for Solr, and then a new server for Fedora. These two services previously ran on the one server that handled almost everything for the BDR, but now each one is on its own server.

Our fourth server is also RHEL 7 now – that’s where we moved our internet-facing services.

Pros & Cons

One advantage of being on four servers is the security we get from having our services isolated. Processes on a single server can be separated from each other with different users, firewall rules, and so on, but having our backend servers firewalled off from the Internet and separated from each other encourages better security practices.

Also, the resources our services use are separated. If one service has an issue and starts using all the CPU or memory, it can’t take resources from the other services.

One downside of using four servers is that it increases the amount of work to set up and maintain things. There are four servers to set up and install updates on, instead of one. Also, the correct firewall rules have to be set up between the servers.

Storing Embargo Data in Fedora

We have been storing dissertations in the BDR for a while. Students have the option to embargo their dissertations, and in that case we set the access rights so that the dissertation documents are only accessible to the Brown community (although the metadata is still accessible to everyone). The problem is that embargoes can be extended upon request, so we really needed to store the embargo extension information.

We wanted to use a common, widely-used vocabulary for describing the embargoes, instead of using our own terms. We investigated some options, including talking with Hydra developers on Slack, and emailing the PCDM community. Eventually, we opened a PCDM issue to address the question of embargoes in PCDM. As part of the discussion and work from that issue, we created a shared document that lists many vocabularies that describe rights, access rights, embargoes, … Eventually, the consensus in the PCDM community was to recommend the PSO and FaBiO ontologies (part of the SPAR Ontologies suite), and a wiki page was created with this information.

At Brown, we’re using the “Slightly more complex” option on that wiki page. It looks like this:

<pcdm:Object> pso:withStatus pso:embargoed .

<pcdm:Object> fabio:hasEmbargoDate "2018-11-27T00:00:01Z"^^xsd:dateTime .

In our repository, we’re not on Fedora 4 or PCDM, so we just put statements like these in the RELS-EXT datastream of our Fedora 3 instance. It looks like this:

<rdf:RDF xmlns:fabio="http://purl.org/spar/fabio/#" xmlns:pso="http://purl.org/spar/pso/#" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about="info:fedora/test:230789">
<pso:withStatus rdf:resource="http://purl.org/spar/pso/#embargoed"></pso:withStatus>
<fabio:hasEmbargoDate>2018-11-27T00:00:01Z</fabio:hasEmbargoDate>
<fabio:hasEmbargoDate>2020-11-27T00:00:01Z</fabio:hasEmbargoDate>
</rdf:Description>
</rdf:RDF>
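For readers who want to consume this metadata, here is a rough sketch of reading a RELS-EXT document like the one above with Python’s standard library. It assumes that when several fabio:hasEmbargoDate values are present (the original date plus extensions), the latest one is the effective end of the embargo; that rule, and the function names, are assumptions made for this example rather than a description of how the BDR code works.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

FABIO = "http://purl.org/spar/fabio/#"
PSO = "http://purl.org/spar/pso/#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

SAMPLE_RELS_EXT = """<rdf:RDF xmlns:fabio="http://purl.org/spar/fabio/#" xmlns:pso="http://purl.org/spar/pso/#" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about="info:fedora/test:230789">
<pso:withStatus rdf:resource="http://purl.org/spar/pso/#embargoed"></pso:withStatus>
<fabio:hasEmbargoDate>2018-11-27T00:00:01Z</fabio:hasEmbargoDate>
<fabio:hasEmbargoDate>2020-11-27T00:00:01Z</fabio:hasEmbargoDate>
</rdf:Description>
</rdf:RDF>"""


def embargo_end(rels_ext_xml):
    """Return the latest fabio:hasEmbargoDate as an aware datetime, or None."""
    root = ET.fromstring(rels_ext_xml)
    dates = [
        datetime.strptime(el.text.strip(), "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
        for el in root.iter("{%s}hasEmbargoDate" % FABIO)
    ]
    # Assumption: the latest date wins when an embargo has been extended.
    return max(dates) if dates else None


def is_embargoed(rels_ext_xml, now=None):
    """True if the object has the embargoed status and the end date is still ahead."""
    root = ET.fromstring(rels_ext_xml)
    has_status = any(
        el.get("{%s}resource" % RDF) == PSO + "embargoed"
        for el in root.iter("{%s}withStatus" % PSO)
    )
    end = embargo_end(rels_ext_xml)
    now = now or datetime.now(timezone.utc)
    return has_status and end is not None and now < end


print(embargo_end(SAMPLE_RELS_EXT))
```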

In the future, we may want to track various statuses for an item (eg. dataset) over its lifetime. In that case, we may move toward more complex PSO metadata that describes various states that the item has been in.

Fedora 4 – testing

Fedora 4.7.1 is scheduled to be released on 1/5/2017, and testing is important to ensure good quality releases (release testing page for Fedora 4.7.1).

Sanity Builds

Some of the testing is for making sure the Fedora .war files can be built with various options on different platforms. To perform this testing, you need to have 3 required dependencies installed, and run a couple commands.

Dependencies

Java 8 is required for running Fedora. Git is required to clone the Fedora code repositories. Finally, Fedora uses Maven as its build/management tool. For each of these dependencies, you can grab it from your package manager, or download it (Java, Git, Maven).

Build Tests

Once your dependencies are installed, it’s time to build the .war files. First, clone the repository you want to test (eg. fcrepo-webapp-plus):

git clone https://github.com/fcrepo4-exts/fcrepo-webapp-plus

Next, in the directory you just created, run the following command to test building it:

mvn clean install

If the output shows a successful build, you can report that to the mailing list. If an error was generated, you can ask the developers about that (also on the mailing list). The generated .war files will be installed to your local Maven repository (as noted in the output of the “mvn clean install” command).

Manual Testing

Another part of the testing is to perform different functions on a deployed version of Fedora.

Deploy

One way to deploy Fedora is on Tomcat 7. After downloading Tomcat, uncompress it and run ./bin/startup.sh. You should see the Tomcat Welcome page at localhost:8080.

To deploy the Fedora application, shut down your Tomcat instance (./bin/shutdown.sh) and copy the fcrepo-webapp-plus war file you built in the steps above to the Tomcat webapps directory. Next, add the following line to a new setenv.sh file in the bin directory of your Tomcat installation (update the fcrepo.home directory as necessary for your environment):

export JAVA_OPTS="${JAVA_OPTS} -Dfcrepo.home=/fcrepo-data -Dfcrepo.modeshape.configuration=classpath:/config/file-simple/repository.json"

By default, the fcrepo-webapp-plus application is built with WebACLs enabled, so you’ll need a user with the “fedoraAdmin” role to be able to access Fedora. Edit your Tomcat conf/tomcat-users.xml file to add the “fedoraAdmin” role and give that role to whatever user you’d like to log in as.

Now start Tomcat again, and you should be able to navigate to http://localhost:8080/fcrepo-webapp-plus-4.7.1-SNAPSHOT/ and start testing Fedora functionality.

Django project update

Recently, I worked on updating one of our Django projects. It hadn’t been touched for a while, and Django needed to be updated to a current version. I also added some automated tests, switched from mod_wsgi to Phusion Passenger, and moved the source code from Subversion to Git.

Django Update

The Django update didn’t end up being too involved. The project was running Django 1.6.x, and I updated it to the Django LTS 1.8.x. Django migrations were added in Django 1.7, and as part of the update I added an initial migration for the app. In my test script, I needed to add a django.setup() for the new Django version, but otherwise, there weren’t any code changes required.

Automated Tests

This project didn’t have any automated tests. I added a few tests that exercised the basic functionality of the project by hitting different URLs with the Django test client. These tests were not comprehensive, but they did run a significant portion of the code.

mod_wsgi => Phusion Passenger

We used to use mod_wsgi for serving our Python code, but now we use Phusion Passenger. Passenger lets us easily run Ruby and Python code on the same server, and different versions of Python if we want (eg. Python 2.7 and Python 3). (The mod_wsgi site has details of when it can and can’t run different versions of Python.)

Subversion => Git

Here at the Brown University Library, we used to store our source code in Subversion. Now we put our code in Git, either on Bitbucket or GitHub, so one of my changes was to move this project’s code from Subversion to Git.

Hopefully these changes will make it easier to work with the code and maintain it in the future.