Last week I attended Hydra Connect 2016 in Boston, with a team of three others from the Brown University Library. Our team consisted of a Repository Developer, Discovery Systems Developer, Metadata Specialist, and Repository Manager. Here are some notes and thoughts related to the conference from my perspective as a repository programmer.
IPFS
There was a poster about IPFS, which is a peer-to-peer hypermedia protocol for creating a distributed web. It’s an interesting idea, and I’d like to look into it more.
APIs and Architecture
There was a lot of discussion about the architecture of Hydra, and Tom Cramer mentioned APIs specifically in his keynote address. In the Brown Digital Repository, we use a set of APIs that clients can access from any programming language. This architecture lets us define layers in the repository: the innermost layer is Fedora and Solr, the next layer is our set of APIs, and the outer layer is the Studio UI, image/book viewers, and custom websites built on the BDR data. There is some overlap in our layers (e.g. the Studio UI hits Solr directly instead of going through the APIs), but I still think it improves the architecture to think about these layers and try not to cross multiple boundaries. Besides having clients written in Python, Ruby, and PHP, this API layer may be useful when we migrate to Fedora 4 – we can use our APIs to communicate with both Fedora 3 and Fedora 4, and any client that only hits the APIs wouldn't need to be changed to handle content in Fedora 4.
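To make the layering concrete, here is a minimal sketch of a client that depends only on the API layer. The class name, base URL, and endpoint paths are all hypothetical, not the BDR's actual routes; the point is that nothing in the client knows or cares whether Fedora 3 or Fedora 4 sits behind the APIs.

```python
# Hypothetical thin client for a repository API layer.
# Endpoint paths and the base URL are invented for illustration.

class RepositoryAPIClient:
    """Clients depend only on the API layer, never on Fedora or Solr
    directly, so the storage backend can change (e.g. Fedora 3 to
    Fedora 4) without any client code changing."""

    def __init__(self, base_url):
        self.base_url = base_url.rstrip('/')

    def item_url(self, pid):
        # Item metadata comes from the API, not from Fedora directly.
        return f"{self.base_url}/api/items/{pid}/"

    def search_url(self, query):
        # The API layer fronts Solr; clients never build Solr queries.
        return f"{self.base_url}/api/search/?q={query}"


client = RepositoryAPIClient("https://repository.example.edu")
print(client.item_url("bdr:123"))
print(client.search_url("maps"))
```

A Ruby or PHP client would look much the same: a handful of URL-building and HTTP calls, with no repository internals leaking across the boundary.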
I would be interested in seeing a similar architecture in Hydra-land (note: this is an outsider’s perspective – I don’t currently work on CurationConcerns, Sufia, or other Hydra gems). A clear boundary between “business logic” or processing and the User Interface or data presentation seems like good architecture to me.
Data Modeling and PCDM
Monday was workshop day at Hydra Connect 2016, and I went to the Data Modeling workshop in the morning and the PCDM In-depth workshop in the afternoon. In the morning session, someone mentioned that we shouldn't have data modeling differences without good reason (i.e. does a book at one institution really have to be modeled differently from a book at another institution?). I think that's a good point – if we can model our data the same way, that would help with interoperability. PCDM, as a standard for how our data objects are modeled, might be a great way to promote interoperability between applications and institutions. In the BDR, we could start using PCDM vocabulary and modeling techniques, even while our data is in Fedora 3 and our code is written in Python. I also think it would be helpful to define and document what interoperability should look like between institutions, or between different applications at the same institution.
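As a rough illustration of what "using PCDM vocabulary" could look like even outside a Hydra/Fedora 4 stack, here is a sketch that describes a two-page book as JSON-LD-style data using the `pcdm:Object` type and the `pcdm:hasMember` predicate. The object URIs are made up; only the `pcdm:` namespace and terms come from the PCDM vocabulary (note that PCDM also handles page ordering and files via other terms not shown here).

```python
import json

# The PCDM namespace; pcdm:Object and pcdm:hasMember are real PCDM
# terms. The repository URIs below are invented for illustration.
PCDM = "http://pcdm.org/models#"

def pcdm_object(uri, members=None):
    """Build a JSON-LD-style dict for a pcdm:Object, optionally
    linking it to member objects via pcdm:hasMember."""
    obj = {"@id": uri, "@type": PCDM + "Object"}
    if members:
        obj[PCDM + "hasMember"] = members
    return obj

pages = [pcdm_object(f"https://repo.example.edu/objects/book1/page{n}")
         for n in (1, 2)]
book = pcdm_object("https://repo.example.edu/objects/book1", members=pages)

print(json.dumps(book, indent=2))
```

Two institutions that both expose a book this way could exchange or aggregate objects without first reconciling two ad-hoc book models.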
Imitate IIIF?
It seems like the IIIF community has a good solution to image interoperability. The IIIF community has defined a set of APIs, and then it lists various clients and servers that implement those APIs. I wonder if the Hydra community would benefit from more of a focus on APIs and specifications, so that there could be various "Hydra-compliant" servers and clients. Of course, the Hydra community should continue to work on code as well, but a well-defined specification and API might improve the Hydra code and allow the development of other Hydra-compliant code (e.g. code in other programming languages, different UIs using the same API, …).
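What makes the IIIF approach work is that the spec pins down the request format exactly, so any client can talk to any conforming server. The IIIF Image API defines image request URLs as `{scheme}://{server}{/prefix}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`. A small sketch of building such a URL (the server hostname and prefix below are invented):

```python
from urllib.parse import quote

def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Build a IIIF Image API request URL following the pattern
    {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}.
    The identifier must be URL-encoded per the spec."""
    ident = quote(identifier, safe='')
    return (f"{base.rstrip('/')}/{ident}/"
            f"{region}/{size}/{rotation}/{quality}.{fmt}")


# Hypothetical image server; "bdr:123" is an example identifier.
print(iiif_image_url("https://images.example.edu/iiif", "bdr:123"))
# https://images.example.edu/iiif/bdr%3A123/full/full/0/default.jpg
```

Because every parameter slot is specified, a viewer written against this pattern works against any IIIF image server. A "Hydra-compliant" API could aim for the same property: multiple independent implementations, one contract.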