Sharing all research outputs, including data, code, materials, and other information beyond the traditional research paper, has the potential to advance scientific progress generally and to benefit individual researchers by adding transparency to their research process and potentially increasing citations to their work (Piwowar & Vision, 2013).
To maximize these effects, we encourage sharing all data, materials, and software code whenever possible. How much any particular research output contributes to these benefits depends on many factors, including how complete it is, how thoroughly it is documented, where it is stored, and how it is shared.
Planning from the beginning
As mentioned in our Data Management Planning guide, planning for the publication of research outputs from the beginning of the research process can have a significant impact on the quantity of material available for sharing as well as the quality of documentation and metadata for those materials.
Because results are most commonly shared months or years after the point when data is collected and processed, producing the documentation, metadata, and other elements important to sharing research outputs may require more work when done at the end of a research process than it would have required if they were planned for from the beginning. This potential reduction in the work required to share research outputs makes Data Management Planning an important supporting practice for the sharing of research outputs.
Likewise, setting the intention from the beginning of a project that research outputs will eventually be made public may support teams during the process of data management. The assumption that work will be made public may provide additional motivation for researchers to maintain the metadata, documentation, and version management processes agreed upon at the outset of research, or to add descriptions and quality control tests to custom software and analysis scripts.
Making your outputs findable and usable
When sharing research outputs, a variety of factors influence how readily those materials can be found, understood, and used. In addition to the previously mentioned issues of metadata collection, data anonymization, workflow documentation, and reproducible software environments, important considerations include choosing an appropriate copyright license (Butler, 2017) and selecting interoperable formats.
For a more comprehensive look at the range of issues involved and how they may relate to your research outputs, refer to the “FAIR Guiding Principles for scientific data management and stewardship” (Wilkinson et al., 2016). Depending on the type of research outputs you produce, the particular considerations for making them FAIR will vary. Regardless of the type of output, we suggest favoring publishing tools that enable unique citations for each type of research output and for as many of the individual elements inside that type as you might wish to refer to directly.
Importance of unique citations
Having a clear citation available for each class of research output (data, materials, code, or other) makes it easier to track which elements of your research are being cited and may help reinforce the norm that each of these types of output should be cited if used in new research.
Unique and persistent identifiers at the file level additionally enable you to tie your research outputs directly to relevant portions of your research papers. This can take the form of links inside figure captions to the source data and analysis scripts that created the figure, links from methods sections to more complete protocol information, links from representative image selections to the complete collection of images, or other similar connections.
These direct connections add information to your research findings and help your research paper serve as a map to the additional research outputs you make public. In one study evaluating open data (Roche, Kruuk, Lanfear, & Binning, 2015), this behavior of direct linking was a notable feature of the datasets that simultaneously scored highly for both completeness and reusability.
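The direct linking described above can be sketched in a few lines of Python. This is only an illustration of the idea, not a tool used by the studies cited: the `LinkedOutput` class, the `caption_with_links` function, the figure text, and the `example.org` addresses are all hypothetical placeholders standing in for whatever persistent identifiers your repository assigns.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class LinkedOutput:
    """A research output that has its own persistent identifier (hypothetical type)."""
    label: str  # e.g. "source data" or "analysis script"
    url: str    # persistent link to the individual file or component


def caption_with_links(caption: str, outputs: list[LinkedOutput]) -> str:
    """Append one labeled link per research output to a figure caption."""
    if not outputs:
        return caption
    links = "; ".join(f"{o.label}: {o.url}" for o in outputs)
    return f"{caption} ({links})"


# Placeholder identifiers for illustration only, not real records.
figure_2 = caption_with_links(
    "Figure 2. Response time by condition.",
    [
        LinkedOutput("source data", "https://example.org/abc12"),
        LinkedOutput("analysis script", "https://example.org/def34"),
    ],
)
print(figure_2)
```

The same pattern applies to methods sections or image collections: keep the identifier next to the text that depends on it, so a reader can move from the paper to the underlying output in one step.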
Publishing research outputs on OSF
OSF generates citations for each component in a project and assigns every project, component, and file a short URL that is globally unique and whose persistence is guaranteed by a data preservation fund currently sufficient to provide 50+ years of public access. An OSF project with separate components for each type of research output you intend to share (e.g., "Data", "Materials", "Protocol", "Analysis Scripts") will give you an organizational structure and a set of unique citations conducive to sharing all of your research outputs.
Because projects on OSF are private by default, you can create this project structure and use it to actively manage your research materials privately during the course of your research, then simply make whichever of those components you wish to share public. OSF's granular permission system enables you to pick and choose only those portions of your project you wish to make public, while keeping the rest only accessible to your research team.
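One way to plan this structure before creating it is to write the intended components and their visibility down explicitly. The following is a minimal planning sketch, assuming nothing about OSF's own software: it does not call OSF, and the `Component` class and `components_to_share` function are hypothetical names introduced only for illustration. The default of `public=False` mirrors the private-by-default behavior described above, and the choice of which components to mark public is an arbitrary example.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """One planned project component (hypothetical planning record, not an OSF object)."""
    title: str
    public: bool = False  # private unless explicitly marked for sharing


def components_to_share(components):
    """Return the titles of the components marked public, in their original order."""
    return [c.title for c in components if c.public]


# Example plan using the component names suggested in the text.
project = [
    Component("Data", public=True),
    Component("Materials", public=True),
    Component("Protocol"),  # stays private in this example
    Component("Analysis Scripts", public=True),
]
print(components_to_share(project))
```

Keeping a plan like this alongside your Data Management Plan makes the decision about what to release a deliberate one rather than something settled at the end of the project.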
- Piwowar, H. A., & Vision, T. J. (2013). Data reuse and the open data citation advantage. PeerJ, 1, e175. https://doi.org/10.7717/peerj.175
- Wilkinson, M. D., Dumontier, M., Aalbersberg, Ij. J., Appleton, G., Axton, M., Baak, A., … Mons, B. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, 3, 160018. https://doi.org/10.1038/sdata.2016.18
- Roche, D. G., Kruuk, L. E. B., Lanfear, R., & Binning, S. A. (2015). Public Data Archiving in Ecology and Evolution: How Well Are We Doing? PLOS Biology, 13(11), e1002295. https://doi.org/10.1371/journal.pbio.1002295