We begin our deep-dive series on Trust Markers with one of the Trust Markers available in Dimensions Research Integrity: the Data Availability Statement (DAS). This post explains what a DAS is, why it is important, where to find it in a scientific publication, what information it should include, and how to formulate one. So let’s dive in!
What is a data availability statement (DAS)?
A data availability statement (DAS) is an essential component of a scientific article, separate from the main body of text. It indicates whether and how a study’s research data can be accessed and tested. The DAS plays a crucial role in connecting the findings presented in a paper to the supporting evidence. Including a DAS in a manuscript reinforces the study’s credibility, research transparency, and overall trust in the study. Although not all journals or funders require one, a DAS significantly improves the quality of a manuscript and makes the underlying data easier to cite.
A DAS is not only for open, accessible data, though. There are circumstances in which making data available is neither feasible nor responsible, such as when human subjects or other personally identifiable information must be protected. In such cases, the DAS can explain why the author(s) decided to limit access to their data.
Where does the data availability statement go?
The DAS should be clearly stated either at the beginning or at the end of the manuscript, under a heading such as “Availability of Data and Materials” or simply “Data Availability Statement.” The section should be distinct from other supplementary materials, so its title should explicitly mention “data”.
What kind of data must be described in a data availability statement?
A DAS references any research data needed to replicate or reuse the work. This includes, but is not limited to, the following forms: data collected, data downloaded and analyzed (but not manipulated), and data generated.
How to write a data availability statement
The length and wording of the DAS will vary depending on several factors, but a good DAS consists of four core elements:
- Data collected: These are the data collected or used to help answer the study objective. Listing these datasets helps the reader quickly understand what they will find in each one and is particularly helpful when dealing with multiple datasets.
- Data location or repository name/archive: Name where the data are held. Datasets can be either physical or digital.
- Link to dataset/repository (if applicable): Where is the dataset stored? Include a link to your dataset or repository to help the reader find the data more easily.
- PIDs (Persistent Identifiers): These may be represented as digital object identifiers (DOIs), reference numbers, ARKs, Handles, PURLs, etc.
Because most DASs will be published in a web-based format, include hyperlinks wherever possible.
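One way to act on the hyperlink advice is to turn a bare DOI into a resolvable link before pasting it into the statement. The sketch below is only an illustration: the function name `doi_to_url` is made up for this post, and it assumes the usual convention that a DOI resolves when appended to `https://doi.org/`.

```python
def doi_to_url(doi: str) -> str:
    """Turn a DOI into a resolvable https://doi.org/ hyperlink.

    Hypothetical helper for illustration; it only normalizes the text
    and does not check that the DOI is actually registered.
    """
    cleaned = doi.strip()
    # Drop common prefixes so that already-prefixed input also works.
    prefixes = (
        "https://doi.org/",
        "http://doi.org/",
        "https://dx.doi.org/",
        "http://dx.doi.org/",
        "doi:",
    )
    for prefix in prefixes:
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    # Every DOI starts with the directory indicator "10."
    if not cleaned.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    return f"https://doi.org/{cleaned}"


# Example with the Dryad DOI cited in this post.
print(doi_to_url("doi:10.5061/dryad.hp2fr73"))
```

Other identifier types, such as accession numbers, have their own resolvers and are deliberately left out of this sketch.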
Basic data availability statement template
The dataset [title or type] used in this study is publicly available online in the [repository name] [link to data/repository] repository: [PIDs].
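For authors who manage several datasets, the template can be filled in programmatically. The following is a minimal sketch, not an official tool: the `DatasetRecord` class and `render_das` function are hypothetical names, the fields simply mirror the four core elements listed above, and wrapping the link in parentheses is a readability choice rather than part of the template.

```python
from dataclasses import dataclass


@dataclass
class DatasetRecord:
    """Hypothetical container for the four core elements of a DAS."""
    title: str       # data collected (dataset title or type)
    repository: str  # data location or repository name
    link: str        # link to the dataset or repository
    pid: str         # persistent identifier, for example a DOI


def render_das(record: DatasetRecord) -> str:
    """Fill the basic template with the fields of one dataset."""
    return (
        f"The dataset {record.title} used in this study is publicly "
        f"available online in the {record.repository} ({record.link}) "
        f"repository: {record.pid}."
    )


# Example using the Dryad dataset cited in this post.
example = DatasetRecord(
    title="Data from: Pluripotency and the origin of animal multicellularity",
    repository="Dryad",
    link="https://doi.org/10.5061/dryad.hp2fr73",
    pid="doi:10.5061/dryad.hp2fr73",
)
print(render_das(example))
```

The generated sentence is a starting point; journals may ask for different wording, so it should still be edited by hand.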
Poor example of a data availability statement
All datasets used in this study are publicly available through an open repository.
The example above states the availability status of the data used in the study, but it lacks the specific identifiers needed to locate and identify those data.
Good example of a data availability statement
All cell-type transcriptome data are available in the NCBI SRA database under accession number PRJNA412708. Additional supplementary data are available from the Dryad Digital Repository: https://doi.org/10.5061/dryad.hp2fr73. (Sogabe, Hatleberg, Kocot, et al. 2019)
As outlined in the graphic below, this DAS includes the necessary information for identifying and locating all data used in the study: the names of the repositories (NCBI SRA and Dryad), an accession number, and a DOI link.
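The difference between the poor and the good example can also be checked mechanically. The sketch below is a rough heuristic under stated assumptions, not how any particular tool detects a DAS: the function names are hypothetical, and the regular expressions (in particular the accession-number pattern, which just looks for capital letters followed by digits) are simplifications chosen for illustration.

```python
import re

# Heuristic patterns; assumptions for illustration, not a formal standard.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/\S+")
URL_PATTERN = re.compile(r"https?://\S+")
# Matches identifiers shaped like "PRJNA412708": capitals, then digits.
ACCESSION_PATTERN = re.compile(r"\b[A-Z]{2,6}\d{5,}\b")


def find_identifiers(das: str) -> dict[str, list[str]]:
    """Collect DOIs, URLs, and accession-like strings found in a DAS."""
    trailing = ".,;)"  # sentence punctuation that may cling to a match
    return {
        "dois": [m.rstrip(trailing) for m in DOI_PATTERN.findall(das)],
        "urls": [m.rstrip(trailing) for m in URL_PATTERN.findall(das)],
        "accessions": ACCESSION_PATTERN.findall(das),
    }


def has_locator(das: str) -> bool:
    """Return True if the statement contains at least one identifier."""
    return any(find_identifiers(das).values())


poor = (
    "All datasets used in this study are publicly available "
    "through an open repository."
)
good = (
    "All cell-type transcriptome data are available in the NCBI SRA "
    "database under accession number PRJNA412708. Additional "
    "supplementary data are available from the Dryad Digital "
    "Repository: https://doi.org/10.5061/dryad.hp2fr73."
)

print(has_locator(poor))
print(find_identifiers(good))
```

A check like this only confirms that something identifier-shaped is present. It cannot tell whether the identifier resolves or whether it points to the right data.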
Further reading for DAS:
- Data Availability Statement, Springer Nature
- Archiving and sharing data: Data access statements, University of Bath
- Availability Statement Examples, American Meteorological Society
Data availability statement used as an example: Sogabe, Shunsuke et al. (2019), Data from: Pluripotency and the origin of animal multicellularity, Dryad, Dataset, https://doi.org/10.5061/dryad.hp2fr73
Author credit: This article originally appeared in the Ripeta blog and was authored by August DeVore. August was the Communications Specialist on the team. Ripeta is now part of the Dimensions family, known as Dimensions Research Integrity.