Going to the Research Data Mountain: Behind the Lab Walls of Brown’s BioMed

Post by Andrew Creamer, Scientific Data Management Specialist 

When my colleagues and I talk to other information professionals about taking on roles in research data management, we are often asked how we got our foot in the door. They want to know how we found the opportunities to provide these services and to partner with student and faculty researchers. How to get started providing research data management services, and strategies for reaching out to student and faculty researchers and building relationships on campus, have become staples of library conferences related to research data management. However, I do feel that these questions, hoary chestnuts to some, should be continually asked because new opportunities present themselves every day on research campuses across the globe. Getting a foot in the door requires action. Outreach opportunities are rarely going to come to us. The majority of the campus has no idea I am here in the library, so I have to look for ways to get involved with research on campus. Thus, to recast Francis Bacon and his famous mountain anecdote in his essay Of Boldness (http://goo.gl/yVvbO0), the research data mountain will not come to us information professionals; we must go to the research data mountain.

Being new on this campus requires me to be extra vigilant and aware of potential outreach opportunities. I have had a few meet-and-greets with administrators and faculty in the sciences, great opportunities for me to make my elevator pitch so that they will hopefully consider making room on their syllabi and projects for me and research data management instruction. But I have also been searching for opportunities to meet undergraduate and graduate student researchers. I want to see just what type of science is happening behind the walls of the labs on campus so that I can see what tools they are using to create data. By observing and learning about their research I can better plan for ways to help them to manage and share their data. Last week such an opportunity for student outreach opened up. I saw a posting on Brown’s Division of Biology and Medicine’s (BioMed) weekly email bulletin advertising a four-week module aimed at introducing students to the resources, tools and techniques in biomedical research. I quickly registered.

This week was our first meeting. I felt nervous, like a student on the first day attending a new school. As the participants went around the room introducing themselves and their reasons for enrolling in the module, I gradually felt more at ease because each of us related a similar sentiment: we wanted to know what other people are doing, what research, what tools, what resources are in the labs that we pass by or work next to or even work in every day. The students were looking for opportunities for their own outreach and collaboration.

Our first speaker was Dr. Pam Swiatek, Director of Research Operations and Major Proposal Coordination for Brown’s BioMed. She shared an ambitious project called CoresRI (http://coresri.org/). The premise of CoresRI is to be a resource for research institutions across Rhode Island (labs, universities, hospitals, etc.) to list the facilities, tools and resources they have so that other researchers in the state requiring such instruments or expertise can contact them to get what they need. The resource is open to any scientist looking to arrange access to needed facilities for research purposes.

Our second speaker was Dr. Ed Hawrot, Associate Dean for Biology and Director of the Rhode Island BioBank. He introduced us to the BioBank, a cryogenic repository that gives researchers access to samples of human tissue (blood, brain, spinal fluid, etc.) representative of the state’s population. The significance of gathering these samples (obtained by consent) is that the BioBank can link them back to the donors’ electronic medical records. Such a link provides insight into a variety of data points: diet, biometrics, medications, DNA, etc. Repositories like the BioBank make it possible to conduct genome-wide association studies (GWAS) and phenome-wide association studies (PheWAS) by looking at single nucleotide polymorphisms (SNPs), which are single-base variations in our DNA, and their association with the prevalence or absence of disease, as well as with biomarkers associated with disease.

Preserving such valuable human tissue requires strategies and a large investment of time and money. An operation like the BioBank requires a great deal of planning to manage a large amount of physical and digital data. The BioBank has to keep track of hundreds of thousands of physical specimens (whole, halved, and sectioned) distributed among a network of freezers, and link them with the terabytes of warehoused clinical data resulting from digital assays and other analyses. There are also consent and related IRB forms associated with each donor and their specimens. For this management the program relies on special software called Freezerworks™, as well as a system of failure alarms and backup systems in case of freezer emergencies that could threaten the integrity of valuable specimens. Dr. Hawrot reminded students of the 2012 incident at Harvard, where a large repository of brains and human tissue samples was lost to a catastrophic failure of a freezer and its backup systems (http://blogs.nature.com/news/2012/06/brains-thaw-at-harvard-repository.html).

The last speaker was Dr. Robbert Creton, Associate Professor of Medical Science and organizer of the module. Professor Creton took us on a tour of the Leduc Bioimaging Facility (http://www.brown.edu/Facilities/Leduc_Bioimaging_Facility/). We visited the facilities for the zebrafish used as model organisms for imaging in developmental research. He described the microscopes in the facility that are available to researchers. We saw the confocal microscope, the transmission electron microscope (TEM), and the scanning electron microscope (SEM). We also looked at the microtome tools used for slicing thin sections of specimens and a visualization screen set up in the facility for researchers to visualize and analyze fluorescence and slide imaging in 3D.

I took a great deal away with me from this first class. In addition to learning about the tools available through CoresRI and the data available to researchers in the RI BioBank, I learned a great deal about the types of image files (TIFF) produced by the microscopes, and about ImageJ, the open, Java-based image-processing program developed at the National Institutes of Health (NIH). Researchers use the program to open their downloaded files and work with their images. Professor Creton taught me about the average file size created by each scope, the standard file name and extension assigned by these instruments, the standard metadata embedded by the instrumentation, the typical storage options, and the typical practices for integrating images with paper lab notebooks. This knowledge has made me more aware of the gaps that I could help fill by helping students manage their data files and lab notebooks. For example, if a student were to save image files without changing the file names assigned by the instrument, a lot of time could be wasted sorting through anonymous files to find the right image. Worse are the data management horror scenarios: having to repeat an experiment and re-image because the student lost or overwrote an image file that was needed or that could have advanced the project.
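To make the file-naming point concrete, here is a minimal Python sketch of one way a student might copy an instrument-named image to a descriptive name without risking an overwrite. The naming pattern (project, sample, scope, date, sequence number) and the function names are my own assumptions for illustration, not a convention used by the Leduc facility or by ImageJ.

```python
import shutil
from datetime import date
from pathlib import Path


def descriptive_name(project: str, sample: str, scope: str,
                     imaged_on: date, sequence: int,
                     suffix: str = ".tif") -> str:
    """Build a file name from hypothetical fields: project, sample, scope, date, sequence."""
    parts = [project, sample, scope,
             imaged_on.strftime("%Y%m%d"), f"{sequence:03d}"]
    # Lowercase each part and replace spaces so the name is portable across systems.
    cleaned = [p.strip().lower().replace(" ", "-") for p in parts]
    return "_".join(cleaned) + suffix


def copy_with_descriptive_name(source: Path, dest_dir: Path, new_name: str) -> Path:
    """Copy source into dest_dir under new_name; refuse to replace an existing file."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / new_name
    if target.exists():
        raise FileExistsError(f"{target} already exists; not overwriting")
    # copy2 keeps file timestamps; the instrument's original file is left untouched.
    shutil.copy2(source, target)
    return target
```

A student might call `descriptive_name("Zebrafish Heart", "S01", "confocal", date(2014, 4, 16), 3)` and pass the result to `copy_with_descriptive_name`. Copying rather than renaming keeps the instrument's original file intact, and the existence check guards against the overwrite scenario described above.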

The largest benefit of joining this module has been building relationships with students. By enrolling in this module I was able to get my foot in the door and let them know about my and my colleagues’ researcher support services. After class I was able to set up an appointment with a doctoral student from Brown’s Molecular Pharmacology, Physiology and Biotechnology department to assist him with locating and uploading his dataset to a data repository so that he can get a DOI and URL for his dataset. This will allow him and others to better access his data with a genome browser. He will also be able to link his dataset with any related article publications and have other people cite his data, and it will one day help him to better measure the impact of all his research products, including his data.

So the next time I get asked by someone how to get his or her foot in the door to provide research data management library services, my response will be: be bold; go to the research data mountain. Resources, Tools, and Techniques in Biomedical Research is offered until May 14th, 2014. For more information please contact Professor Creton.