
Data Management & Sharing: Organizing your Data:
Metadata, File Naming, and Data Cleaning

The GSU library can help you manage your working data, write data management plans for grants, and make your data accessible for future researchers.

FAIR Principles and C-U-R-A-T-E Steps

When preparing your data for your own use and for reuse by future researchers:

Use the FAIR Principles, promoted by the GO FAIR Initiative, as a guide to making your data Findable, Accessible, Interoperable, and Reusable.

Also consult the C-U-R-A-T-E Steps created by the Data Curation Network to guide preparing your data for curation.

Metadata Standards

Never heard of metadata and wondering what it's all about? Metadata is information about your data (e.g., codebooks, data documentation) that goes beyond the raw data files and helps you, your research team, and other researchers readily access, understand, and use the data.

Grant funding agencies and data repositories recommend using established metadata standards whenever possible. Some metadata schemas and standards are discipline-specific, such as Darwin Core (biology) or DDI (social and behavioral sciences), while others are designed for a particular type of resource or can be applied across disciplines.

A metadata schema is a set of metadata elements with the name and meaning of each element specifically defined. A schema may also define rules for content, allowable data values, syntax, and/or other rules for recording and encoding data.
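To make the definition above concrete, here is a minimal sketch of a schema expressed in Python. The element names, meanings, and rules are hypothetical and invented for illustration; they are not taken from Darwin Core, DDI, or any other published standard. The sketch only shows the idea of defined elements with rules for required content, syntax, and allowable values.

```python
import re

# Hypothetical mini-schema (illustration only, not a real standard):
# each element has a defined meaning plus optional rules.
SCHEMA = {
    "title": {"meaning": "Name given to the dataset", "required": True},
    "creator": {"meaning": "Person or group who produced the data", "required": True},
    "date_collected": {
        "meaning": "Date the data were collected",
        "required": True,
        "pattern": r"\d{4}-\d{2}-\d{2}",  # syntax rule: YYYY-MM-DD
    },
    "access_level": {
        "meaning": "Who may access the data",
        "required": False,
        "allowed": {"open", "restricted", "closed"},  # allowable values
    },
}


def validate_record(record, schema=SCHEMA):
    """Return a list of problems found when checking a record against the schema."""
    problems = []
    for name, rules in schema.items():
        value = record.get(name)
        if value is None or value == "":
            if rules.get("required"):
                problems.append(f"{name}: missing required element")
            continue
        if "pattern" in rules and not re.fullmatch(rules["pattern"], value):
            problems.append(f"{name}: value {value!r} does not match expected syntax")
        if "allowed" in rules and value not in rules["allowed"]:
            problems.append(f"{name}: value {value!r} is not an allowed value")
    # Flag elements the schema does not define.
    for name in record:
        if name not in schema:
            problems.append(f"{name}: element not defined in schema")
    return problems
```

Calling `validate_record` on a metadata record returns an empty list when the record follows the schema, and a list of readable problem descriptions otherwise. Established standards are far richer than this, but the same principle applies: agreed element names, agreed meanings, and agreed rules.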


File Naming - Best Practices

Following standardized, consistent practices for naming your files will help you and your research team readily access your data while you are still involved in the project. It will also help others if and when you share your data for replication and transparency purposes or for reuse by other researchers. Check out Stanford Libraries' recommended best practices for file naming.
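One way to keep names consistent is to generate them rather than type them by hand. The sketch below assumes a convention of date, project, description, and version joined by underscores, with a YYYYMMDD date, no spaces or special characters, and a zero-padded version number. That pattern and the function name `build_filename` are illustrative assumptions, not a quotation of any particular guide; adapt them to whatever convention your team agrees on.

```python
import re
from datetime import date


def build_filename(project, description, when, version, extension):
    """Build a name like 20240305_sleep-study_survey-responses-raw_v02.csv."""

    def clean(part):
        # Replace runs of anything other than letters and digits with a hyphen,
        # then trim stray hyphens and lowercase the result.
        return re.sub(r"[^A-Za-z0-9]+", "-", part).strip("-").lower()

    stamp = when.strftime("%Y%m%d")      # sorts chronologically as plain text
    ext = extension.lstrip(".").lower()  # accept ".CSV" or "csv"
    return f"{stamp}_{clean(project)}_{clean(description)}_v{version:02d}.{ext}"


name = build_filename("Sleep Study", "survey responses (raw)", date(2024, 3, 5), 2, ".CSV")
print(name)  # 20240305_sleep-study_survey-responses-raw_v02.csv
```

Putting the date first in YYYYMMDD form means an alphabetical file listing is also a chronological one, and the zero-padded version keeps v02 ahead of v10.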


Data Cleaning Tools

Raw data is often messy and needs to be cleaned up before analysis can be performed. Linked below are video tutorials on several data cleaning tools, along with links to download them for free.
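For a sense of what "cleaning" typically involves, here is a small sketch using only Python's standard library rather than any particular tool. It assumes the data arrive as CSV text, and the list of strings treated as missing values is an assumption you would adjust for your own data. It tidies column names, trims stray whitespace, standardizes missing values, and drops exact duplicate rows.

```python
import csv
import io

# Assumed spellings of "missing" in the raw data; adjust for your dataset.
MISSING = {"", "na", "n/a", "null", "none", "-"}


def clean_rows(raw_text):
    """Return a list of cleaned row dictionaries from raw CSV text."""
    reader = csv.DictReader(io.StringIO(raw_text))
    seen = set()
    cleaned = []
    for row in reader:
        tidy = {}
        for key, value in row.items():
            if key is None:
                continue  # skip extra cells beyond the header columns
            key = key.strip().lower().replace(" ", "_")  # consistent column names
            value = (value or "").strip()                # trim stray whitespace
            tidy[key] = None if value.lower() in MISSING else value
        fingerprint = tuple(tidy.items())
        if fingerprint in seen:
            continue  # drop exact duplicates (after tidying)
        seen.add(fingerprint)
        cleaned.append(tidy)
    return cleaned


raw = "Participant ID, Age ,City\n 001 , 34 ,Atlanta\n001,34,Atlanta\n002,N/A, Decatur \n"
rows = clean_rows(raw)
```

In this example the first two data rows differ only in whitespace, so only one survives, and the `N/A` age becomes `None`. Dedicated cleaning tools offer the same kinds of operations through a graphical interface, along with more advanced ones such as clustering similar spellings.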
