Lloyd Sealy Library
John Jay College of Criminal Justice

Research Data Management

Resources and tools for managing your research data. (For faculty and graduate students.)

What is metadata?

Metadata are the who, what, when, where, why, and how of your research. For research data, metadata come in two flavors:

  • Disciplinary standards: your area of research likely has a standardized way of organizing and describing data, such as specifications for data collection and analysis. Follow community convention as much as possible so that others can understand and reuse your data. See: list of standards by discipline from the Digital Curation Centre (DCC)
  • Discovery: when you submit your data to a repository, for example, you will provide information such as creator, date, subject, and more. Other users who search the repository on any of these fields will then be able to find your dataset.
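
As a rough illustration of discovery metadata, the sketch below builds a small record with the creator, date, and subject fields mentioned above and checks a search term against it. The field names, the example values, and the `matches` helper are all hypothetical; real repositories define their own schemas and search behavior.

```python
import json

# Hypothetical discovery record; field names and values are illustrative only.
record = {
    "creator": "Jane Researcher",
    "date": "2020-01-15",
    "subject": ["criminal justice", "survey data"],
    "title": "Example survey dataset",
}


def matches(record, term):
    """Return True if the term appears in any field value (case-insensitive)."""
    term = term.lower()
    for value in record.values():
        # Treat single values and lists of values the same way.
        values = value if isinstance(value, list) else [value]
        if any(term in str(v).lower() for v in values):
            return True
    return False


print(json.dumps(record, indent=2))
print(matches(record, "survey"))
```

The more fields you fill in, the more search terms can lead another researcher to your dataset.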

The University of Minnesota has compiled a helpful guide to data documentation and metadata.

At minimum, include a readme.txt file that gives other researchers more information and context about the data, formats, and collection methods. Spreadsheets should include labels (for example, column headers). The more documentation you provide, the less chance there is of misuse or misunderstanding of your data.
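
One way to make the readme habit routine is to generate the file from a short script. The sketch below is a minimal example, not a required template: the `build_readme` function, its section names, and the sample values are assumptions chosen to mirror the items listed above (data, formats, collection methods, and spreadsheet labels). It writes into a temporary folder that stands in for your data directory.

```python
import tempfile
from pathlib import Path


def build_readme(title, description, formats, collection_methods, column_labels):
    """Assemble plain-text readme content from basic dataset documentation."""
    lines = [
        f"Title: {title}",
        "",
        "Description:",
        description,
        "",
        "Formats:",
        *[f"  - {fmt}" for fmt in formats],
        "",
        "Collection methods:",
        collection_methods,
        "",
        "Column labels:",
        *[f"  - {name}: {meaning}" for name, meaning in column_labels.items()],
    ]
    return "\n".join(lines) + "\n"


# Hypothetical example values; replace them with your own documentation.
text = build_readme(
    title="Example survey dataset",
    description="Responses to an example survey.",
    formats=["CSV (UTF-8)"],
    collection_methods="Online questionnaire.",
    column_labels={"resp_id": "Respondent identifier", "age": "Age in years"},
)

# A temporary folder stands in for the folder that holds your data files.
data_dir = Path(tempfile.mkdtemp())
(data_dir / "readme.txt").write_text(text, encoding="utf-8")
print(text)
```

Keeping the readme in the same folder as the data files means the documentation travels with the data when it is shared or deposited.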

Protect your reputation and your data

From Towards effective and rewarding data sharing:

Detailed metadata—descriptions of data including protocols and analytic specifications—are required to understand what the primary data meant in its original context. In the absence of such metadata, analyses of data by an outside investigator are open to misinterpretation. Such misreading could lead to the publication of unwarranted results that might improperly cast doubt upon the conclusions of the original work, or impugn unfairly the competence or scientific integrity of the original investigators.

Source: Gardner D., et al. (2003) Towards effective and rewarding data sharing. Neuroinformatics 1:289–295. doi:10.1385/NI:1:3:289

This article is a collective response to NIH policy. It voices researchers' concerns and offers recommendations to both research communities and organizations building institutional repositories.

Metadata standards