Research Commons

Research Data Management - Best Practices

Introduction to Research Data Management

File Format Basics

Your choice of file format affects whether you will be able to open a file at a later date. Proprietary file formats require the proper version of the proprietary software. Non-proprietary, or open, formats are more interoperable and thus more durable. Saving your data in open, unencrypted, and uncompressed formats will keep it usable for years to come. If you can’t save your data in an open format, consider including the software name, version, and parent company in the accompanying readme.txt file for future users.
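
As a rough illustration of the readme.txt suggestion, the following Python sketch writes the software name, version, and parent company next to a data file. The function name `write_format_readme`, the field labels, and the example values are assumptions made for this sketch; they are not a required readme layout.

```python
from pathlib import Path


def write_format_readme(directory, data_file, software, version, company):
    """Write a readme.txt recording the software needed to open data_file.

    The field labels below are illustrative, not a prescribed standard.
    """
    lines = [
        f"File: {data_file}",
        f"Software: {software}",
        f"Version: {version}",
        f"Parent company: {company}",
    ]
    readme = Path(directory) / "readme.txt"
    # Plain UTF-8 text keeps the readme itself in an open format.
    readme.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return readme


if __name__ == "__main__":
    # Hypothetical file and software names, for illustration only.
    path = write_format_readme(".", "survey.sav", "ExampleStats", "1.0", "Example Corp")
    print(f"Wrote {path}")
```

Note that this sketch overwrites any existing readme.txt in the target directory, so in practice you may prefer to append to a readme you already maintain.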

For a more in-depth discussion, see the Library of Congress’ Sustainability of Digital Formats website.

Recommended Formats by Type of Data
Text
  • Plain text (.txt)
  • Portable Document Format (.pdf)
  • LaTeX documents (.tex)
  • Hypertext Markup Language (.html)
  • Open Document Format (.odt)
  • Extensible Markup Language (.xml)
Tables, spreadsheets, and databases
  • Tab-separated tables (.txt — sometimes .tsv or .tab)
  • Comma-separated tables (.csv or .txt)
  • Other standard delimiter (e.g. colon, pipe)
  • Fixed-width
  • OpenDocument Spreadsheet (.ods)
  • OpenDocument Database (.odb)
Image Files
  • TIFF (.tiff or .tif)
  • JPEG (.jpg) and JPEG 2000 (.jp2)
  • Portable Network Graphics (.png)
  • Scalable Vector Graphics (.svg)
  • Portable Document Format (.pdf)
  • Graphics Interchange Format (.gif)
  • Microsoft Windows Bitmap Format (.bmp)
Sound Files
  • WAVE (.wav)
  • FLAC (.flac)
  • MP3, formally MPEG-1/2 Audio Layer III (.mp3; usually suitable for human voice and moderate-quality audio, but may not be suitable for high-fidelity audio)
  • Audio Interchange File Format (.aiff)
Video Files
  • MPEG-4 (.mp4)
  • Material Exchange Format (.mxf)
Databases
  • Extensible Markup Language (.xml)
  • Comma-separated tables (.csv)
Geospatial Data
  • Georeferenced TIFF, or GeoTIFF (.tif or .tiff)
  • ESRI Shapefile (.shp, .shx, .dbf)
  • Keyhole Markup Language (.kml)
  • Network Common Data Format (.nc)
Web Data
  • JavaScript Object Notation (.json)
  • Extensible Markup Language (.xml)
  • Hypertext Markup Language (.html)
Web Archive
  • Web ARChive (.warc)
Multidimensional Arrays
  • Common Data Format (.cdf)
  • Network Common Data Format (.nc)
  • Hierarchical Data Format (usually .hdf or .h5)
E-books
  • Electronic Publication (.epub)
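
One way to put the list above to work is to scan a set of file names and flag any whose extension does not appear in it. The Python sketch below is a minimal illustration under stated assumptions: the extension set is transcribed from the list above, and the names `RECOMMENDED_EXTENSIONS` and `files_needing_review` are hypothetical. An extension match does not prove a file is truly in an open format (a .txt file can hold anything), so treat the result as a prompt for manual review.

```python
from pathlib import Path

# Extensions transcribed from the recommended-formats list above.
RECOMMENDED_EXTENSIONS = {
    ".txt", ".pdf", ".tex", ".html", ".odt", ".xml",
    ".tsv", ".tab", ".csv", ".ods", ".odb",
    ".tiff", ".tif", ".jpg", ".jp2", ".png", ".svg", ".gif", ".bmp",
    ".wav", ".flac", ".mp3", ".aiff",
    ".mp4", ".mxf",
    ".shp", ".shx", ".dbf", ".kml", ".nc",
    ".json", ".warc", ".cdf", ".hdf", ".h5", ".epub",
}


def files_needing_review(filenames):
    """Return the names whose extension is not in the recommended set.

    Matching is case-insensitive; names with no extension are also flagged.
    """
    return [
        name
        for name in filenames
        if Path(name).suffix.lower() not in RECOMMENDED_EXTENSIONS
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    sample = ["data.csv", "notes.DOCX", "scan.TIF", "model.sav"]
    print(files_needing_review(sample))
```

Files flagged this way are candidates either for conversion to an open format or for a readme.txt entry naming the software needed to open them.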

File Format Support

File Format Considerations. Research Data Service. University Library. University of Illinois at Urbana-Champaign. https://www.library.illinois.edu/rds/file-formats/


© The Ohio State University - University Libraries
