UMass Amherst Libraries

Data Management Plan Guidance

Types of Data

Data are the recorded factual material commonly accepted in the scientific community as necessary to validate research findings (OMB). This refers not just to published summary statistics or tables, but also to the data on which those summaries are based (NIH). Data can include material such as documents, lab notebooks, questionnaires and responses, photos, audio or video files, models and algorithms, database content, software, and so on.

Things to Think About

In a data management plan, the data your research generates should be described in some detail. The description can include:

What kind of data is it? (from MIT Libraries)

What format(s) does it include?  (from MIT Libraries)

Data formats and the storage file formats they include:

Text: ASCII, Word, PDF
Numerical: ASCII, SPSS, Stata, Excel, Access, MySQL
Multimedia: JPEG, TIFF, DICOM, MPEG, QuickTime
Models: 3D, statistical
Software: Java, C
Discipline-specific: FITS in astronomy, CIF in chemistry
Instrument-specific: Olympus Confocal Microscope Data Format

Also, (from our DMP template PDF)

Best Practices Tip:  File Formats for Long-Term Access 

(From MIT Libraries)

The file format in which you keep your data is a primary factor in whether you or others will be able to use your data in the future. Because technology changes continually, researchers should plan for both hardware and software obsolescence. How will your data be read if the software used to produce them becomes unavailable?

Formats more likely to be accessible in the future are:

Consider migrating your data into a format with the above characteristics, in addition to keeping a copy in the original software format.           
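As a rough illustration of the migration advice above, the sketch below keeps an untouched copy of a file and writes a plain UTF-8 .csv version of it. It assumes the source is a delimited text export (for example, tab-separated output from an instrument or statistics package). The function name `migrate_to_csv`, the `original` and `migrated` folder names, and the default delimiter and encoding are hypothetical choices made for this example, not part of the MIT or UMass guidance. Proprietary binary formats such as Excel or SPSS files would need their own export step first.

```python
import csv
import shutil
from pathlib import Path


def migrate_to_csv(source, archive_dir, delimiter="\t", encoding="latin-1"):
    """Keep the original file and write a UTF-8, comma-separated copy.

    Hypothetical helper: assumes `source` is a delimited text export.
    Returns the paths of the preserved original and the new .csv file.
    """
    source = Path(source)
    original_dir = Path(archive_dir) / "original"   # assumed folder layout
    migrated_dir = Path(archive_dir) / "migrated"
    original_dir.mkdir(parents=True, exist_ok=True)
    migrated_dir.mkdir(parents=True, exist_ok=True)

    # 1. Preserve the original file unchanged, in its original format.
    original_copy = original_dir / source.name
    shutil.copy2(source, original_copy)

    # 2. Re-write the same rows as comma-separated UTF-8 text.
    csv_copy = migrated_dir / (source.stem + ".csv")
    with open(source, newline="", encoding=encoding) as src, \
            open(csv_copy, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src, delimiter=delimiter):
            writer.writerow(row)

    return original_copy, csv_copy
```

A call such as `migrate_to_csv("counts.txt", "archive")` (file names are placeholders) would leave `archive/original/counts.txt` and `archive/migrated/counts.csv` side by side, so both the original and the migrated copy are retained.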

Examples of preferred format choices:

Example language

  1. How data is created/acquired and recorded; file formats; required software:
    Every two days, we will subsample E. affinis populations growing under our treatment conditions. We will use a microscope to identify the life-stage and sex of the subsampled individuals. We will document the information first in a laboratory notebook and then copy the data into an Excel spreadsheet. For quality control, values will be entered separately by two different people to ensure accuracy. The Excel spreadsheet will be saved as a comma-separated value (.csv) file daily and backed up to a server. After all data are collected, the Excel spreadsheet will be saved as a .csv file and imported into the program R for statistical analysis.
    (https://www.dataone.org/sites/all/documents/DMP_Copepod_Formatted.pdf)
  2. Size of data sets
    The final data product distributed to most users will occupy less than 500 KB; raw and ancillary data, which will be distributed on request, comprise less than 10 MB.
    (https://www.dataone.org/sites/all/documents/DMP_MaunaLoa_Formatted.pdf)
  3. Non-standard data
    This project is designed primarily as an educational intervention rather than a research project per se. However, because the goal is to provide a foundation for future research studies, the data will be managed as for a research project.... Data will consist of notes and transcriptions of discussions and focus groups, reports and reviews, summaries, curricular materials, and both quantitative and qualitative evaluations of the capacity-building workshops and the impact of implementation on trainees of the faculty participants in the workshops. Materials will all be created de novo or transcribed into standard Microsoft Office applications (Word, Excel, and PowerPoint). For the purpose of wider, long-term access, primary documents will be converted at regular intervals into PDF documents.
    (http://rci.ucsd.edu/_files/DMP%20Example%20Michael%20Kalichman.doc)
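The double-entry quality control described in example 1 (values entered separately by two different people) could be checked with a short script once both entries are saved as .csv files. The sketch below is only one possible approach; the example plan does not say how the two entries are compared, and the function names `read_rows` and `compare_entries` are hypothetical. It assumes both files are UTF-8 text with the same column order.

```python
import csv


def read_rows(path):
    """Read a .csv file into a list of rows (each row is a list of strings)."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle))


def compare_entries(path_a, path_b):
    """Return (row, column, value_a, value_b) for every cell that differs.

    Row and column numbers are 1-based, matching what a spreadsheet shows.
    A cell that is missing from one file is reported with the value None.
    """
    rows_a, rows_b = read_rows(path_a), read_rows(path_b)
    mismatches = []
    for r in range(max(len(rows_a), len(rows_b))):
        row_a = rows_a[r] if r < len(rows_a) else []
        row_b = rows_b[r] if r < len(rows_b) else []
        for c in range(max(len(row_a), len(row_b))):
            # Ignore stray leading/trailing spaces; compare the text itself.
            cell_a = row_a[c].strip() if c < len(row_a) else None
            cell_b = row_b[c].strip() if c < len(row_b) else None
            if cell_a != cell_b:
                mismatches.append((r + 1, c + 1, cell_a, cell_b))
    return mismatches
```

An empty result means the two entries agree cell for cell; any reported mismatch can then be resolved against the laboratory notebook before the data are analyzed.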


Last Edited: 17 July 2013