LibGuides: Research Data Management: Storing & sharing data

Storing & archiving data after your project is done

It is important to keep your data around for a number of years (often specified in your project plan, funder requirements, or the law). Safely stored or archived data backs up your research and protects your scholarly reputation. You may also want to reuse the data later on.

Storing data involves arranging a secure solution using servers and/or devices, both during and after the data collection process. Sensitive data must be encrypted. Storage also involves some human considerations: Who will be in charge of the stored data? Who will be trusted with access? How long should the data be stored, and who will securely delete it?

Archiving data entails more care: you will provide documentation, prepare the data for long-term storage, and arrange institutional or organizational support. You may have to work out an archiving solution with your department or institution, or you may choose to submit the data to a trusted repository. The method of archiving your data may therefore also be how you share your data.

Repositories: CUNY Academic Works and more

A repository is an online vault for materials contributed by many people. Depending on the repository, its contents can be publicly available, restricted to certain people, or embargoed (delayed for a number of months).

The repositories below accept data. You may have to apply to be able to contribute your materials. All contributions are credited to you and belong to you. Repositories not listed here may accept articles only, or may accept supplemental data only when it is bundled with an article.

CUNY Academic Works

John Jay's institutional repository, CUNY Academic Works, accepts publications, data sets, creative works, and more. Submissions are open access with the possibility of embargoing. See submission policies. Questions about Academic Works? Email Prof. Ellen Sexton (esextonATjjay.cuny.edu).

Subject-specific data repositories:

ICPSR: social science data
National Archive of Criminal Justice Data
Resource Center for Minority Data
DataOne: earth data
Digital history and classics studies: list of repositories for historians. Includes archeology, epigraphy, manuscript studies.
Zenodo: universal repository, launched May 2013 by CERN
Mega-list from re3data: searchable, browsable catalog of data repositories. (Formerly Databib.)
Mega-list from UM: sorted by subject area.
Mega-list from DataCite: this Google Drive spreadsheet lists many repositories by subject, location, and deposit requirements.

Why should you share your data?

How do researchers share their data?

Many scholars choose to make their research data available to the scholarly community or to the public by:

submitting the data to a repository: open (public), restricted (not public), or embargoed (delayed for a number of months)
including the data as supplemental material in a journal article
giving their data a DOI and citing it, and/or
personally providing the data on request

• • • • •

Why would researchers share their data?

Reproducibility: a scientifically responsible project enables reproducibility by providing documentation and data (and, when applicable, code)

Impact: authors who share their data are cited more*

Funder or journal requirements or recommendations: some funders, like NIH, and some journals, like PLoS, require that data be shared.
In 2013, the White House issued a memo directing that research funded by major federal agencies be publicly accessible, and that those agencies strengthen their policies on managing and sharing scientific data.

For science! / For the humanities!: Shared data can be reused (and cited, of course) and can contribute to the progress of human knowledge.

* More impact — references:

Pienta A. M., Alter G., Lyle J. (2010) The Enduring Value of Social Science Research: The Use and Reuse of Primary Research Data. ICPSR. hdl.handle.net/2027.42/78307
Piwowar H., Day R., Fridsma D. (2007) Sharing Detailed Research Data Is Associated with Increased Citation Rate. PLoS ONE 2(3): e308. doi:10.1371/journal.pone.0000308

Before sharing data, consider...

Is there any private or revealing information in these data?
Are the data in their final (not preliminary) state?
Are your data properly licensed and credited to you / your project?
Have your project collaborators agreed that the data should be shared?
Have you included the necessary documentation and metadata?
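
Parts of this checklist can be spot-checked automatically before you upload anything. The following is a minimal Python sketch, not a repository requirement: the function name presharing_report, the file names it looks for (README, LICENSE), and the "draft"-style markers are all assumptions chosen for illustration. It cannot judge privacy or collaborator agreement; those questions still need a human.

```python
from pathlib import Path

# Hypothetical naming conventions; adjust them to your project or repository.
DOC_NAMES = {"readme", "readme.txt", "readme.md"}
LICENSE_NAMES = {"license", "license.txt", "license.md"}
DRAFT_MARKERS = ("draft", "prelim", "tmp")


def presharing_report(folder):
    """Return a list of warnings for a dataset folder that is about to be shared."""
    files = sorted(p for p in Path(folder).rglob("*") if p.is_file())
    names = {p.name.lower() for p in files}
    warnings = []
    # Documentation and license are checked by file name only.
    if not names & DOC_NAMES:
        warnings.append("No README found: add documentation.")
    if not names & LICENSE_NAMES:
        warnings.append("No LICENSE found: state how the data may be used.")
    # File names that suggest the data are not in their final state.
    for p in files:
        if any(marker in p.name.lower() for marker in DRAFT_MARKERS):
            warnings.append(f"Possibly preliminary file: {p.name}")
    return warnings
```

Calling presharing_report("my_dataset") on a folder of your own returns an empty list when none of these simple checks fail.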

Simple Open Data is a list of helpful hints to make sharing your data simple and effective.

Make your data easy to reuse

Putting your data in a repository or on a storage server does not guarantee that it will be useful to those who want to use it in the future. Without documentation and organization, future researchers looking at your data may be confused about its context or usage.

The core guiding principle is simple: Someone unfamiliar with your project should be able to look at your computer files and understand in detail what you did and why […] Most commonly, however, that “someone” is you. A few months from now, you may not remember what you were up to when you created a particular set of files, or you may not remember what conclusions you drew. You will either have to then spend time reconstructing your previous experiments or lose whatever insights you gained from those experiments.

(William Stafford Noble [2009] A Quick Guide to Organizing Computational Biology Projects. PLoS Comput Biol 5(7): e1000424. doi:10.1371/journal.pcbi.1000424. Emphasis mine.)

Read this short lesson, Preserving Your Research Data by James Baker, for a good summary of general best practices. (It's also the source of the above quote.)
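
One small way to help "someone unfamiliar with your project" is to keep a file inventory alongside the data. The sketch below is only an illustration and is not taken from the lesson above: the function name write_manifest, the file name MANIFEST.csv, and the column names are assumptions. It lists every file in a folder with its size and last-modified date and leaves a description column for you to fill in by hand.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path


def write_manifest(folder, manifest_name="MANIFEST.csv"):
    """Write a CSV inventory of the files under `folder` and return its rows."""
    root = Path(folder)
    rows = []
    for p in sorted(root.rglob("*")):
        # Skip directories and the manifest itself.
        if p.is_file() and p.name != manifest_name:
            stat = p.stat()
            modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
            rows.append({
                "path": p.relative_to(root).as_posix(),
                "bytes": stat.st_size,
                "modified_utc": modified.strftime("%Y-%m-%d"),
                "description": "",  # to be written by the researcher
            })
    fields = ["path", "bytes", "modified_utc", "description"]
    with open(root / manifest_name, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

Note that re-running the function overwrites MANIFEST.csv, including any descriptions typed into it, so keep hand-written descriptions in a separate copy or extend the sketch to merge them.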

See the Metadata tab in this guide for more about formatting your data itself.

Further resources

The Open Data Commons wants all research data to be open and accessible. They provide a guide to licensing your data to protect the data creators and to let users know what can and can't be done to/with the data.

The University of Minnesota has a great guide to citing data, whether you're using someone else's data or providing a citation for your own.

Our Guide to Faculty Scholarship Resources highlights other ways to track and measure your scholarly activities.