Web – Brown University Library Digital Technologies

Monitoring Passenger’s Requests in Queue over time

As I mentioned in a previous post, we use Phusion Passenger as the application server to host our Ruby applications. A while ago, upon the recommendation of my coworker Ben Cail, I created a cron job that calls passenger-status every 5 minutes to log the status of Passenger on our servers. Below is a sample of the passenger-status output:

Version : 5.1.12
Date : Mon Jul 30 10:42:54 -0400 2018
Instance: 8x6dq9uX (Apache/2.2.15 (Unix) DAV/2 Phusion_Passenger/5.1.12)

----------- General information -----------
Max pool size : 6
App groups : 1
Processes : 6
Requests in top-level queue : 0

----------- Application groups -----------
/path/to/our/app:
App root: /path/to/our/app
Requests in queue: 3
* PID: 43810 Sessions: 1 Processed: 20472 Uptime: 1d 7h 31m 25s
CPU: 0% Memory : 249M Last used: 1s ago
* PID: 2628 Sessions: 1 Processed: 1059 Uptime: 4h 34m 39s
CPU: 0% Memory : 138M Last used: 1s ago
* PID: 2838 Sessions: 1 Processed: 634 Uptime: 4h 30m 47s
CPU: 0% Memory : 134M Last used: 1s ago
* PID: 16836 Sessions: 1 Processed: 262 Uptime: 2h 14m 46s
CPU: 0% Memory : 160M Last used: 1s ago
* PID: 27431 Sessions: 1 Processed: 49 Uptime: 25m 27s
CPU: 0% Memory : 119M Last used: 0s ago
* PID: 27476 Sessions: 1 Processed: 37 Uptime: 25m 0s
CPU: 0% Memory : 117M Last used: 0s ago

Our cron job to log this information over time is something like this:

/path/to/.gem/gems/passenger-5.1.12/bin/passenger-status >> ./logs/passenger_status.log

Last week we had some issues in which our production server was experiencing short outages. Upon review we noticed that we were receiving an unusual amount of traffic (most of it from crawlers submitting bad requests). One of the tools that we used to validate the status of our server was the passenger_status.log file created via the aforementioned cron job.

The key piece of information that we use is the “Requests in queue” value shown in the sample output above. We parsed this value out of the passenger_status.log file to see how it changed over the last 30 days. The result showed that, although we have had a couple of outages recently, the number of “requests in queue” increased dramatically about two weeks ago and has stayed high ever since.

The graph below shows what we found. Notice how after August 19th the value of “requests in queue” has been constantly high, whereas before August 19th it was almost always zero or below 10.

We looked closely at our Apache and Rails logs and identified the traffic that was causing the problem. We took a few steps to handle it and now our servers are behaving normally again. Notice how we are back to zero requests in queue on August 31st in the graph above.

The Ruby code that we use to parse our passenger_status.log file is pretty simple. It grabs the line with the date and the line with the number of requests in queue, parses their values, and outputs the result to a tab-delimited file that we can then use to create a graph in Excel or RAWGraphs. Below is the Ruby code:

require "date"

log_file = "passenger_status.log"
excel_date = true

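# Extracts the timestamp from a line such as
# "Date : Thu Aug 30 14:00:01 -0400 2018".
# Returns nil when the line has no colon.
def date_from_line(line, excel_date)
  index = line.index(":")
  return nil if index.nil?
  date_as_text = line[index+2..-1].strip # Thu Aug 30 14:00:01 -0400 2018
  datetime = DateTime.parse(date_as_text).to_s # 2018-08-30T14:00:01-04:00
  if excel_date
    # Keep only the date and HH:MM parts
    return datetime[0..9] + " " + datetime[11..15] # 2018-08-30 14:00
  end
  datetime
end

# Extracts the number from a line such as "Requests in queue: 3".
def count_from_line(line)
  line.gsub("Requests in queue:", "").to_i
end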

puts "timestamp\trequest_in_queue"
date = "N/A"
File.readlines(log_file).each do |line|
  if line.start_with?("Date ")
    date = date_from_line(line, excel_date)
  elsif line.include?("Requests in queue:")
    request_count = count_from_line(line)
    puts "\"#{date}\"\t#{request_count}"
  end
end
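
If a graphing tool is not at hand, the same tab-delimited output can be reduced to one peak value per day, which is often enough to spot the kind of jump described above. The following is only a minimal sketch and not part of our original script: the daily_peaks method and the file name in the usage comment are hypothetical, and it assumes the rows were produced with excel_date = true.

```ruby
# Hypothetical helper: reduce the tab-delimited rows to one peak value per day.
# Expects rows such as: "2018-08-30 14:00"<TAB>3
def daily_peaks(lines)
  peaks = Hash.new(0)
  lines.each do |line|
    timestamp, count = line.chomp.split("\t")
    # Skip blank lines and the "timestamp<TAB>request_in_queue" header
    next if count.nil? || timestamp == "timestamp"
    day = timestamp.delete('"')[0, 10] # first 10 characters, e.g. 2018-08-30
    peaks[day] = [peaks[day], count.to_i].max
  end
  peaks
end

# Usage (the file name is an assumption):
#   ruby daily_peaks.rb requests_in_queue.tsv
ARGV.each do |path|
  daily_peaks(File.readlines(path)).sort.each do |day, peak|
    puts "#{day}\t#{peak}"
  end
end
```

Because passenger-status is logged every 5 minutes, a daily peak hides short spikes within the day, so this complements the full graph rather than replacing it.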

In this particular case the high number of requests in queue was caused by bad/unwanted traffic. If the increase in traffic had been legitimate we would have taken a different route, like adding more processes to our Passenger instance to handle the traffic.

Configuring Ruby, Bundler, and Passenger

Recently we upgraded several of our applications to a newer version of Ruby, which was relatively simple to do on our local development machines. However, we ran into complications once we started deploying the updated applications to our development and production servers. The problems that we ran into highlighted issues in the way we had configured our applications and Passenger on the servers. This blog post elaborates on the final configurations that we arrived at (at least for now) and explains the rationale for the settings that worked for us.

Our setup

We deploy our applications using a “system account” (e.g. appuser) so that execution permissions and file ownership are not tied to the account of the developer doing the deployment.

We use Apache as our web server and Phusion Passenger as the application server to handle our Ruby applications.

And last but not least, we use Bundler to manage gems in our Ruby applications.

Application-level configuration

We perform all the steps to deploy a new version of our applications with the “system account” for the application (e.g. appuser).

Since we sometimes have more than one version of Ruby on our servers, we use chruby to switch between versions when we are logged in as the appuser. However, we have learned that it is better not to select a particular version of Ruby as part of this user’s bash profile. Executing ruby -v as this user upon login will typically show the version that came with the operating system (e.g. “ruby 1.8.7”).

By leaving the system Ruby as the default we are forced to select the proper version of Ruby for each application. This has the advantage that the configuration for each application is explicit about what version of Ruby it needs. It also makes applications less likely to break when we install a newer version of Ruby on the server. This is particularly useful on our development server, where we have many Ruby applications running and each of them might be using a different version of Ruby.

If we want to do something for a particular application (say install gems or run a rake task) then we switch to the version of Ruby (via chruby) that we need for the application before executing the required commands.

We have also found it useful to configure Bundler to install application gems inside the application folder rather than in a global folder. We do this via Bundler’s --path parameter. The only gem that we install globally (i.e. in GEM_HOME) is bundler.

A typical deployment script looks more or less like this.

$ ssh our-production-machine

Switch to our system account on the remote server (notice that it references the Ruby that came with the operating system):

$ su - appuser

$ ruby -v
# => ruby 1.8.7 (2013-06-27 patchlevel 374) [x86_64-linux]
 
$ which ruby
# => /usr/bin/ruby

Activate the version of Ruby that we want for this app (notice that it references the Ruby that we installed):

$ source /opt/local/chruby/share/chruby/chruby.sh
$ chruby ruby-2.3.6

$ ruby -v 
# => ruby 2.3.6p384 (2017-12-14 revision 61254) [x86_64-linux] 
 
$ which ruby
# => ~/rubies/ruby-2.3.6/bin/ruby
 
$ env | grep GEM
# => GEM_HOME=/opt/local/.gem/ruby/2.3.6
# => GEM_ROOT=/opt/local/rubies/ruby-2.3.6/lib/ruby/gems/2.3.0
# => GEM_PATH=/opt/local/.gem/ruby/2.3.6:/opt/local/rubies/ruby-2.3.6/lib/ruby/gems/2.3.0

Install bundler (this is only needed the first time, notice how it is installed in GEM_HOME):

$ gem install bundler
$ gem list bundler -d
# => Installed at: /opt/local/.gem/ruby/2.3.6

Update the app, install its gems, and execute some rake tasks (notice that Bundler will indicate that gems are being installed locally to ./vendor/bundle):

$ cd /path/to/appOne
$ git pull

$ RAILS_ENV=production bundle install --path vendor/bundle
# => Bundled gems are installed into `./vendor/bundle`

$ RAILS_ENV=production bundle exec rake assets:precompile

Passenger configuration

Our default Passenger configuration is rather bare-bones and specifies only a few settings. For example, our /etc/httpd/conf.d/passenger.conf looks more or less like this:

LoadModule passenger_module /opt/local/.gem/gems/passenger-5.1.12/buildout/apache2/mod_passenger.so

<IfModule mod_passenger.c>
  PassengerRoot /opt/local/.gem/gems/passenger-5.1.12
  PassengerUser appuser
  PassengerStartTimeout 300
</IfModule>

Include /path/to/appOne/http/project_passenger.conf
Include /path/to/appTwo/http/project_passenger.conf

Notice that no Ruby-specific settings appear above. The Ruby-specific settings are in the individual project_passenger.conf file for each application.

If we look at the passenger config for one of the apps (say /path/to/appOne/http/project_passenger.conf) it would look more or less like this:

<Location /appOne>
  PassengerBaseURI /appOne
  PassengerRuby /opt/local/rubies/ruby-2.3.6/bin/ruby
  PassengerAppRoot /path/to/appOne/
  SetEnv GEM_PATH /opt/local/.gem/ruby/2.3.6/
</Location>

Notice that this configuration indicates both the path to the Ruby version that we want for this application (PassengerRuby) and also where to find (global) gems for this application (GEM_PATH).

The value for PassengerRuby matches the path returned by the which ruby command above (/opt/local/rubies/ruby-2.3.6/bin/ruby) and clearly indicates that we are using version 2.3.6 for this application.

The GEM_PATH setting is very important since this is what allows Passenger to find bundler when loading our application. Not setting this value results in the application not loading and Apache logging the following error:

Could not spawn process for application /path/to/AppOne: An error occurred while starting up the preloader.
Error ID: dd0dcbd4
Error details saved to: /tmp/passenger-error-3OKItz.html
Message from application: cannot load such file -- bundler/setup (LoadError)
/opt/local/rubies/ruby-2.3.6/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
/opt/local/rubies/ruby-2.3.6/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'

Notice that we set the GEM_PATH value to the path returned by gem list bundler -d above. This is a bit tricky: if you look closely, we are setting GEM_PATH to the value that GEM_HOME reported above (/opt/local/.gem/ruby/2.3.6/), not to the value that GEM_PATH reported. I suspect we could have set GEM_PATH to /opt/local/.gem/ruby/2.3.6:/opt/local/rubies/ruby-2.3.6/lib/ruby/gems/2.3.0 to match the GEM_PATH above, but we didn’t try that.

UPDATE: The folks at Phusion recommend setting GEM_HOME as well (even if Passenger does not need it) because some gems might need it.
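
When the “cannot load such file -- bundler/setup” error shows up, it can help to ask RubyGems directly where it is looking for gems. The script below is a minimal sketch rather than something we ran as part of this work; the gem_environment_report method and the check_gems.rb file name are hypothetical, and it relies only on the standard RubyGems calls Gem.path and Gem::Specification.find_by_name.

```ruby
require "rubygems"

# Hypothetical diagnostic: report which Ruby is running, where RubyGems is
# looking for gems, and whether it can find bundler on those paths.
def gem_environment_report
  bundler = begin
    Gem::Specification.find_by_name("bundler")
  rescue Gem::LoadError
    nil # bundler is not installed on any of the paths in Gem.path
  end

  {
    ruby_version: RUBY_VERSION,
    gem_paths: Gem.path,
    bundler_version: bundler && bundler.version.to_s,
    bundler_location: bundler && bundler.base_dir
  }
end

gem_environment_report.each do |key, value|
  puts "#{key}: #{value.inspect}"
end
```

Running it with the Ruby binary and GEM_PATH from the Passenger configuration (for example, GEM_PATH=/opt/local/.gem/ruby/2.3.6/ /opt/local/rubies/ruby-2.3.6/bin/ruby check_gems.rb) only approximates what Passenger sees, since a login shell may carry other variables such as GEM_HOME. A nil bundler_version is still a strong hint that the GEM_PATH value is not pointing where bundler was installed.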

— Hector Correa & Joe Mancino

Python/Django warnings

I recently updated a Django project from 1.8 to 1.11. In the process, I started turning warnings into errors. The Django docs recommend resolving any deprecation warnings with the current version before upgrading to a new version of Django. In this case, I didn’t start my upgrade work by resolving warnings, but I did run the tests with warnings enabled for part of the process.

Here’s how to enable all warnings when you’re running your tests:

From the CLI
- use -Werror to raise Exceptions for all warnings (e.g. python -Werror manage.py test)
- use -Wall to print all warnings

In the code
- import warnings; warnings.filterwarnings('error') (raises Exceptions on all warnings)
- import warnings; warnings.filterwarnings('always') (prints all warnings)

If a project runs with no warnings on a Django LTS release, it’ll (generally) run on the next LTS release as well. This is because Django intentionally tries to keep compatibility shims until after an LTS release, so that third-party applications can more easily support multiple LTS releases.

Enabling warnings is nice because you see warnings from Python or other packages, so you can address whatever problems they’re warning about, or at least know that they will be an issue in the future.

LibGuides + Canvas

LibGuides is a content management system developed by Springshare, a software-as-a-service company that makes several packages aimed at library customers. The Brown University Library uses LibGuides to publish online guides that help students and researchers discover appropriate library resources for their scholarly work. LibGuides is a common platform used by thousands of libraries^[1], with basic templates and simple tools that allow librarians to share their knowledge, without having to learn the ins and outs of web publishing.

Instructional Design Librarian Sarah Evelyn asked me if there was any technological way that we could make LibGuides content show up in online Canvas^[2] courses. Canvas supports the Learning Tools Interoperability (LTI) standard, but the version of LibGuides we use does not, nor does it have an open API.^[3]

We met with Ed Casey and Marc Mestre, of CIS’ Instructional Technology Group, to develop a plan that would allow us to systematically include LibGuides content into Canvas. CIS shared with us information about how Canvas courses are constructed, including how standard course/SIS IDs from the Registrar’s Office are used to programmatically generate course identifiers. Each department has a three- or four-letter identifier unique to the department’s area of study (for example, Physics courses use PHYS as an identifier). We decided that we could use these identifiers in LibGuides to create a connection to Canvas.

The Library publishes two types of guides: subject guides and course guides. A subject guide includes links to journals, databases, and other resources on a particular topic. For example, we have a subject guide for Italian Studies (ITAL). A course guide is developed in collaboration with an instructor and highlights resources that are relevant to that particular course; for example, we have a guide for an Italian Studies course on Machiavelli (ITAL0981).

I developed a schema to tag each guide with its SIS identifier, prepending S- for subject guides or C- for course guides, and Sarah added these tags to each relevant LibGuide (using the examples above, the Italian Studies subject guide was tagged S-ITAL, and the Machiavelli course guide was tagged C-ITAL0981). Library programmer Yvonne Federowicz developed a system to harvest the tags from LibGuides into a database, and wrote a programmatic service that makes those tags available to Canvas. Marc, in turn, wrote a feature in Canvas that uses that data service to create a direct link from any course to its related subject guide and, if available, its specific course guide. The new feature is available to any course when an instructor simply drags the “Brown Library Resources” item into the Course Navigation tab in Canvas Settings.
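
To make the tagging schema concrete, here is a small illustration of how the two tags could be derived from a course code. This is only a sketch and not the harvesting service or the Canvas feature described above: the guide_tags_for method is hypothetical, and it assumes that a course code is a three- or four-letter department identifier optionally followed by a course number.

```ruby
# Illustration only: derives the LibGuides tags for a course code.
# Assumes codes look like "ITAL0981" (department letters plus a number)
# or just "ITAL" (department only). Returns nil for anything else.
def guide_tags_for(course_code)
  match = course_code.strip.upcase.match(/\A([A-Z]{3,4})(\d+\w*)?\z/)
  return nil if match.nil?

  department = match[1]
  {
    subject: "S-#{department}",                              # subject guide tag
    course: match[2] ? "C-#{department}#{match[2]}" : nil    # course guide tag, if any
  }
end

p guide_tags_for("ITAL0981")
p guide_tags_for("PHYS")
```

With a lookup like this, a course can always fall back to its department’s subject guide when no course guide has been tagged for it.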

We hope that this feature will help students to easily discover high-quality scholarly resources, and more readily connect with the subject librarians in their areas of study.

---

[1] https://www.springshare.com/libguides/
[2] Canvas is a learning management system used widely at Brown.
[3] Springshare’s LibGuides CMS product does support LTI, but we use the lighter LibGuides v2 product, so the LTI was not available to us.

WordPress for Exhibits

The Exhibits committee assembled a subcommittee to explore tools and processes to publish online exhibits, whether analogs to physical exhibits, or exhibits that only exist online.

The group considered many tools, including more display-focused packages such as Creativist and Google Open Gallery, more metadata-driven tools such as Omeka, Collective Access, and Collection Space, and more interactive tools such as Open Exhibit and Viewshare.

Ultimately the group decided that our needs were more display-focused, as metadata would generally be handled by the BDR, but the tools we’d examined didn’t meet our needs for dynamic and varied display. Instead, we decided to work on developing a WordPress theme that would be flexible enough for the project’s requirements. The first example is the exhibit The Unicorn Found.

Brown Library Web

The Library’s web site is a constantly evolving tool, with a goal of developing and maintaining accurate, informative, and interesting content for library patrons. Library Web Services works to develop the content and infrastructure that power the Library’s main site.