Data Management Plan Guidance
Introduction
Data management plans (DMPs) depend heavily on the research field, project, and mode of inquiry, but every DMP should address several key topics: roles and responsibilities; ethical, legal, and Indigenous data management considerations (if applicable); data collection; documentation and metadata; storage, security, and access; and data retention, deposit, and sharing.
Typically, an ideal DMP will be complete, precise, and in line with disciplinary best practices. Shortcomings in a DMP will normally stem from lacking one of these features – e.g., it will not discuss data deposit (incomplete), or it will not say where data will be deposited (imprecise), or the chosen repository is poorly suited for the data (not in line with disciplinary best practices). CIHR recognizes that for many research fields, data management practices are in development and disciplinary best practices have not yet been established (e.g., preferred repositories, metadata standards, etc.).
DMPs should describe how data will be FAIR – findable, accessible, interoperable, and reusable. This does not mean that the DMP needs to include a section specifically devoted to making data FAIR. Rather, if each section is addressed completely, precisely, and in line with disciplinary best practices (where they exist), the DMP will describe how the data will be FAIR. CIHR recognizes that the extent to which data can be FAIR could be constrained by infrastructure limitations (e.g., lack of suitable repositories) and disciplinary practices (e.g., metadata standards have not been established). CIHR also recognizes that ‘accessible’ data is not synonymous with ‘open’ data. In many instances, due to ethical, commercial or legal obligations, access to data will need to be controlled; and in some instances, access to data cannot be provided at all. See CIHR’s guidance on how to make data FAIR.
For research conducted by and with First Nations, Inuit and Métis communities, DMPs should be co-developed with these communities, in accordance with research data management principles that they accept, such as the CARE (collective benefit, authority to control, responsibility, and ethics) principles, the First Nations principles of OCAP® (ownership, control, access and possession), the National Inuit Strategy on Research Principles of Inuit Qaujimajatuqangit, or the Manitoba Métis principles of OCAS.
Data Management Plans – Guidance on Specific Sections
This DMP format closely aligns with the Digital Research Alliance of Canada’s Simplified DMP Format, which is accessible to researchers through DMP Assistant. DMP Assistant is a national, online, bilingual data management planning tool that assists researchers in preparing DMPs.
Introductory Context – Roles and Responsibilities, Ethical and Legal Considerations, Indigenous Research Data Management (RDM)
- Identify who will be responsible for managing the project's data during and after the project, and the major data management tasks for which they will be responsible.
- Describe any ethical, legal or commercial constraints the data are subject to. If the project includes sensitive data, describe how ethical obligations preliminary to the research project (e.g., participant consent to collect and use data) will be met.
- If the project involves research conducted by and with First Nations, Inuit or Métis communities, explain how Indigenous RDM principles will be followed. Explain how the DMP was co-developed with the Indigenous partner(s) involved in the research (including who was engaged and when, and how the partnership informed the DMP).
Possible Shortcomings to Avoid in This Section
- The plan mentions privacy risks but does not explain how they will be mitigated.
- The plan explains that the project will include Indigenous groups and commits to following Indigenous RDM principles, but it does not describe the approach for managing Indigenous research data throughout the course of the project and beyond – for example, the plan does not explain the governance structures for ensuring Indigenous ownership and control of Indigenous data.
Data Collection
- Explain what data will be collected, created or used, and how – e.g., through observational studies, experiments, simulations, and the use of specific software or tools (e.g., REDCap).
Possible Shortcomings to Avoid in This Section
- The plan describes what data will be created but does not identify the software or platform being used to generate the data.
Documentation and Metadata
- Explain what information will be provided so that others can understand and reuse the data – e.g., a ‘readme’ text file, code books, or lab notebooks. Ideally, dataset documentation should be provided in machine-readable, openly accessible formats (e.g., .csv, .txt file formats).
- Where possible and applicable, state what metadata standard will be followed. The standard can be general (e.g., Dublin Core), but will ideally be domain-specific (in which case, it may be supported by the repository where you plan to deposit the data). Stating the metadata standard will be particularly pertinent to research teams planning to establish a data platform or hub.
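As an illustration only, the sketch below writes a plain-text ‘readme’ file and a small metadata record keyed by Dublin Core element names. The dataset details, the output file names, and the helper names `build_readme`, `build_dublin_core_record` and `write_documentation` are hypothetical placeholders, not part of CIHR's guidance; a domain repository may require a different or richer standard.

```python
import json
from pathlib import Path

# Hypothetical dataset description; every value here is a placeholder.
# The keys are element names from the Dublin Core element set.
DATASET = {
    "title": "Example survey responses",
    "creator": "Example Research Team",
    "description": "Anonymized responses to a hypothetical survey.",
    "date": "2024-01-31",
    "format": "text/csv",
    "language": "en",
    "rights": "CC BY 4.0",
}

# Column-level documentation (a minimal code book); also placeholders.
VARIABLES = {
    "participant_id": "Random identifier, not linked to personal data",
    "age_group": "Age band: 18-34, 35-54, 55+",
    "score": "Total questionnaire score (0-40)",
}


def build_readme(dataset, variables):
    """Return plain-text readme content describing the dataset and its variables."""
    lines = [
        f"Title: {dataset['title']}",
        f"Creator: {dataset['creator']}",
        f"Description: {dataset['description']}",
        f"Date: {dataset['date']}",
        f"License: {dataset['rights']}",
        "",
        "Variables:",
    ]
    lines += [f"  {name}: {meaning}" for name, meaning in variables.items()]
    return "\n".join(lines) + "\n"


def build_dublin_core_record(dataset):
    """Return a dict whose keys are Dublin Core element names with a 'dc:' prefix."""
    return {f"dc:{element}": value for element, value in dataset.items()}


def write_documentation(directory, dataset=DATASET, variables=VARIABLES):
    """Write README.txt and metadata.json into the given directory."""
    out = Path(directory)
    out.mkdir(parents=True, exist_ok=True)
    (out / "README.txt").write_text(build_readme(dataset, variables), encoding="utf-8")
    record = build_dublin_core_record(dataset)
    (out / "metadata.json").write_text(json.dumps(record, indent=2), encoding="utf-8")
```

Calling `write_documentation("my_dataset")` would place both files next to the data, so the documentation travels with the dataset when it is deposited.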
Possible Shortcomings to Avoid in This Section
- The plan commits to adequately documenting the data but does not explain how this will be done – e.g., through a ‘readme’ text file or a code book.
- The plan does not specify whether a metadata standard will be used, or why.
Storage, Security and Access
- Explain where and how data will be stored and secured during the research project, and who will have access to the data (e.g., which members of the research team). Oftentimes data will be stored at your institution through the course of the project, but sometimes this storage will be supplied by external providers, particularly if you plan to collaborate with industry or other non-academic partners.
Possible Shortcomings to Avoid in This Section
- The plan says that data will be securely stored during the project, but it does not specify where the data will be stored or who will have access to the data through the course of the project.
Retention, Deposit and Sharing
- Explain where data will be retained and deposited following completion of the research project, and for how long. Sometimes researchers will decide to keep the data in their institution’s repository; sometimes they will deposit the data in an external repository; and sometimes they will choose to keep some data at their institution (e.g., sensitive data), and deposit other data in an external repository (e.g., data that can be openly accessible). This is important to plan from the outset of the project so that there is a clear plan in place in case the NPI retires or moves to another institution.
- Explain which data will be shared, and in what form (e.g., raw, processed). If applicable, explain whether data are subject to access controls or limitations due to confidentiality, privacy, and/or intellectual property considerations.
- If applicable, describe the data access procedures.
Possible Shortcomings to Avoid in This Section
- The plan indicates that data will be deposited, but it does not specify where and/or for how long; or it names the repository, but the repository is inadequate or not aligned with disciplinary best practices.
- The plan says that data cannot be shared due to privacy or research ethics board (REB) requirements but does not explain why, or the explanation is not compelling.
- The plan does not say how data will be made available to others – e.g., whether the repository makes the data available to anyone on the web, whether access will be controlled, or whether the data will be assigned a persistent identifier to be included in publications.
How to Make Your Data FAIR – 5 Essential Steps
Researchers should follow international best practices for ensuring that research data are shared in a manner such that they are Findable, Accessible, Interoperable and Reusable, referred to as the FAIR principles. This document provides guidance on how to make research data FAIR.
The FAIR principles do not concern, and this document does not provide guidance on, ethical issues and approaches that must be considered to determine whether and how access should be provided to research data. For information and guidance on ethical issues related to data sharing, please reach out to data librarians and/or research ethics officers at your institution.
For research conducted by and with First Nations, Métis and Inuit communities, data should be made FAIR only with the knowledge and consent of the Indigenous community and in accordance with research data management principles the community accepts, such as the CARE (collective benefit, authority to control, responsibility, and ethics) and OCAP® (ownership, control, access and possession) principles.
The box below includes a quick summary of how to make data FAIR in five steps. For more detailed and technical guidance, consult the FAIR Principles webpage at GO FAIR.
Making Your Data FAIR – 5 Essential Steps
- Create or save a version of your data using a commonly understood and non-proprietary file format (e.g., .txt, .csv). If your research field has data standards or expectations on data formats, follow them.
- Deposit your data in a domain repository that is recognized in your research field. If no domain repository exists, choose a generalist one (e.g., FRDR, Zenodo, Dryad). The repository should be indexed in leading database aggregators (e.g., OpenAIRE, Google Dataset Search, DataMed).
- The repository should use a metadata standard; follow it when completing the metadata record for the dataset. You can use online tools such as the CEDAR Workbench to help you complete the metadata record without mistakes.
- Choose an appropriate license for re-use of your data.
- The repository should assign a persistent identifier (PID) to the dataset and/or metadata record – for example, a Digital Object Identifier (DOI). When you publish a paper that relies on the data, ensure that the paper has a data availability statement and includes the PID in the statement.
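Two of the steps above lend themselves to a short sketch: saving tabular data in a non-proprietary format, and citing the dataset's PID in a data availability statement. The example below is a minimal illustration using only Python's standard library. The function names, the repository name passed in, the wording of the statement, and the DOI `10.1234/example` are hypothetical placeholders, not values prescribed by this guidance.

```python
import csv
import io


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dicts to CSV text (a non-proprietary format)."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def doi_to_url(doi):
    """Turn a bare DOI, a 'doi:' string, or a doi.org link into an https://doi.org/ URL."""
    doi = doi.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):].strip()
            break
    # DOIs start with the directory indicator "10."
    if not doi.startswith("10."):
        raise ValueError(f"Not a DOI: {doi!r}")
    return f"https://doi.org/{doi}"


def data_availability_statement(repository, doi, license_name):
    """Compose a one-sentence statement that embeds the resolvable DOI link."""
    return (
        f"The data supporting this article are available in {repository} "
        f"at {doi_to_url(doi)}, under a {license_name} license."
    )
```

For example, `data_availability_statement("Zenodo", "doi:10.1234/example", "CC BY 4.0")` produces a sentence containing the link `https://doi.org/10.1234/example`. Journals and repositories often have their own preferred wording, so treat the sentence template as a starting point.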