Andrew Creamer, Scientific Data Management Specialist, Brown University Library
This week I attended my first Data Curation Group meeting. The group consists of the Library’s data and digital specialists, representing the humanities, social sciences, and sciences. Since I am new in my position as the Library’s data management specialist for the sciences, and new to Brown, I have been trying to learn the campus contacts, resources, and tools available for the University’s undergraduate, graduate, and faculty researchers, and the relevant policies that pertain to managing, archiving, and sharing their data. I have been exploring ways to update our group’s online research guide using campus-specific items. I thought I would begin my search for these items using some guiding questions, reflecting the various aspects of data management planning. For example, one of the NSF recommendations for a data management plan is for researchers to describe their plans for storing, backing up, and securing their data. So in my first meeting, I asked my colleagues to consider one of my questions as if a student or faculty researcher were asking it: “What are my options for storing data at Brown?” To my surprise, it provoked a lively debate.
“This question would not be asked!” responded one of my colleagues. “You’re asking the wrong question!”
“Ok, then think of this question more as a way for me to get to know what is available here.” I tried starting a list.
“There is no one University solution!” a colleague countered. “Your questions,” my colleagues argued, “should be: Where is my data now? Where should my data go?”
“Those are great additional questions; but if we could just answer the first question, locating the options that are available to…”
“But that should not be the first question!” exclaimed my colleagues.
“Ok, then please don’t consider this question as part of a sequence; think of it as a category,” I offered helplessly.
“Your question does not consider the data life cycle,” a colleague added.
“Alright, then what are some of the options available at Brown for any part of the research and data life cycles? For example, what storage options does Brown offer for researchers collecting and staging data during the project?” I pleaded.
In the midst of this Socratic exchange with my experienced and knowledgeable colleagues, it finally dawned on me just how ineffectual and reductive it was for me to attempt to frame the structure of our research guide, or my orientation to the University’s services, by excluding the researchers’ unique, and sometimes messy, circumstances, thereby intimating that there exist prescriptive, black-and-white answers. Of course, my colleagues were right; it is unlikely that anyone would ever ask my question as originally worded, disconnected from any context.
One of the reasons I joined Brown’s University Library, the CDS, and the Data Curation Group was the fact that I yearned to be challenged by, learn from, and collaborate with data and digital specialists representing the different domains of scholarship. My first meeting with the Data Curation Group was an example of the immense benefits one receives from having such multidisciplinary perspectives. I walked away from this meeting considerably more inspired, and with better ideas for customizing our research guide, than I would have been had we just made a laundry list of Brown’s storage options. This post should serve as a reminder that in-person meetings should be reserved for such dialogue and for negotiating multiple perspectives, the real return on investment of our team’s time and attention. Asking my colleagues to make a list would have been a waste of their time, something we could have done on a shared Google Doc.
The most common answer to research data management (RDM) questions is: it depends. How long should I retain this data? It depends. Can I share this data set? It depends. Where can I store this data? It depends. How should I de-identify this data set? It depends. Who owns my data? It depends. MIT and the University of Wisconsin-Madison have already done an outstanding job creating library research guides that explain the nuts and bolts of data management. We do not want to reinvent the wheel. Rather, the RDM guides with the potential to be the most helpful for our users will be ones that function as Choose-Your-Own-Adventure guides, customized to the research ecology of our institutions. The answers to RDM questions depend on the specific intentions of researchers and the many variables and idiosyncrasies of their particular research projects.
Take storage, for example. If a researcher were to look for data storage solutions, the factors influencing his or her options would include the size of the data set, its format, its expected rate of growth, its restrictions, the need for access, the funds available, etc. A colleague gave us two examples of unique storage situations that had come up at Brown. One faculty member needed a storage solution that would accommodate off-campus collaborators who would need to be able to access the data stored at Brown. Another faculty member had a grant that required a storage option that would provide an off-site copy of the data.
Considering the lessons I learned above, I am now going to work with my colleagues to update and frame our LibGuide around use-case-style scenarios: I want to find my data; I want to know where I can store human subjects data; I want to store small data sets; I want to store my data on campus but make it accessible to collaborators off-campus, etc.
So if you’re starting a research project, consider stopping by the second floor of the Rock and letting us help you choose your next data management adventure. What will you get out of it? It depends.