This is a guest post by Brown University Library’s Web Archiving Intern, Christina Cahoon. Christina is currently finishing her Master of Library and Information Science degree at the University of Rhode Island.
After the recent passing of Brown University alumnus and Library staff member Mark Baumer MFA ‘11, the Brown University Library has tasked itself with preserving his prolific web presence. I’m working toward that goal with Digital Preservation Librarian Kevin Powell. Baumer was a poet and environmental activist who worked in the Digital Technologies Department as a Web Content Specialist. This past October, Baumer began his Barefoot Across America campaign, planning to walk barefoot from Rhode Island to California to raise money for environmental preservation and to support the FANG Collective. The journey was tragically cut short on January 21, 2017, when Baumer was struck by a vehicle and killed while walking along a highway in Florida.
Baumer was an avid social media user who posted to several platforms multiple times a day, so recording and archiving his web presence is a large task, and not free of complications. Currently, we are using Archive-It to crawl Baumer’s social media accounts and news sites covering his campaign, including notices of his passing. While Archive-It does a fairly good job of capturing news sites, it runs into various issues with social media content, including content embedded in news articles. As you can imagine, this makes it difficult to capture the bulk of Baumer’s presence on the web.
Archive-It’s help center offers several suggestions for capturing social media sites, and these proved useful for Baumer’s Twitter feed; for other platforms such as YouTube, Instagram, and Medium, however, the suggestions have either not helped or do not exist. The problems with crawling these sites range from capturing far too much content, as with YouTube, where our tests captured every referred video file from every video in the playlist, to capturing only the first few pages of dynamically loading content, as with Instagram and Medium. We are reconfiguring our approach to YouTube after viewing Archive-It’s recent Archiving Video webinar, but the software does not yet have solutions for Instagram and Medium.
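Over-capture of this kind is usually tamed with scoping rules that keep the crawler from following every “related video” link off a playlist page. As a rough illustration only, here is a minimal Python sketch of that idea; the playlist ID and URLs below are invented, and Archive-It’s actual scoping is configured through its web interface, not through code like this:

```python
import re

# Hypothetical example: a crawler has discovered these URLs while
# visiting a playlist seed. The playlist ID "PLbarefoot" and all URLs
# are made up for illustration.
discovered = [
    "https://www.youtube.com/playlist?list=PLbarefoot",
    "https://www.youtube.com/watch?v=abc123&list=PLbarefoot",
    "https://www.youtube.com/watch?v=unrelated1",  # a "related" video
    "https://www.youtube.com/watch?v=unrelated2",  # another one
]

# A scope rule: only keep URLs that reference the target playlist,
# so the crawl does not wander off into YouTube's recommendations.
in_scope = re.compile(r"list=PLbarefoot")

kept = [url for url in discovered if in_scope.search(url)]
# Only the two playlist URLs survive the filter.
```

The same pattern, in reverse, can express exclusions (drop anything matching a regex) rather than inclusions; which direction works better depends on how the target site structures its links.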
These issues have led us to re-evaluate our options for capturing Baumer’s work. We have tested how WebRecorder handles sites like Flickr and Instagram, and we are still encountering problems where images and videos are not captured. It seems there will not be one solution to our problem; we will have to use multiple services to sufficiently capture all of Baumer’s social media accounts.
The problems encountered here are not rare in the field of digital preservation. Ultimately, we must continue testing different preservation methods to find what works best in this situation, and it is likely that no single service will capture everything necessary to build this collection. For now, the task remains finding the best methods to properly capture Baumer’s work.