Over the last 18 months, we have discussed the future of Research Data Management (RDM) in Germany and approaches for a National Research Data Infrastructure for Chemistry with key players in infrastructure development, learned societies, and the chemistry community. Our concept paper presented the current state of RDM in chemistry together with initial ideas, and laid the foundation for further discussion. Our key objectives and the decision to focus on molecules and their characterisation data are the result of several workshops, detailed discussions and a meta-analysis of surveys on chemistry RDM.
Key insights can be summarised as follows:
- Institutional Research Data Management is poorly implemented.
- Electronic Laboratory Notebooks (ELNs) and Laboratory Information Management Systems (LIMS) for early capture of (meta)data in electronic form are desired but rarely used.
- Scientists in chemistry like to re-use data but rarely share their data.
- The concept of electronic metadata annotation is poorly understood, and such annotations are rarely used.
- Training in basic principles of RDM, covering the data workflow from the bench via repositories to peer-reviewed publications as well as basic concepts such as FAIR principles, is urgently needed.
Based on our analysis, we have designed a work programme to achieve a breakthrough in RDM in chemistry. Our six task areas (TA) each address one or more of the issues identified. We will integrate the existing lighthouses of RDM in our research domain, fill gaps both in the repository landscape and in the underlying standards, develop and disseminate powerful tools to enable early digital data and metadata capture in the lab, and develop a strong training programme that helps chemists understand and adopt the concepts developed in NFDI4Chem. We also dedicate a full task area to leveraging the synergies between NFDI4Chem and the NFDI as a whole.
TA1 Management and Coordination will ensure the lean and efficient financial and organisational management of the project.
TA2 Smart Laboratory (Smart Lab) focuses on implementing and adapting existing IT components, and on developing those that are still missing, embedded in a flexible work environment. These components are necessary to capture data early in the life cycle and to further manage, analyse and store associated information. TA2 enables a digital change in chemistry by supporting scientists with a digital infrastructure of tools, services and repositories that are interoperable within the NFDI infrastructure.
TA3 Repositories enables the reliable storage, dissemination and archival of all relevant research data at each stage of the data life cycle. This includes raw data in diverse formats as well as curated datasets. TA3 will adapt major existing chemistry repositories and databases to common standards and interfaces, thus fostering interoperability and FAIRness and making it easier to store, disclose, search and re-use research data across distributed data sources.
TA4 Metadata, Data Standards and Publication Standards creates and maintains the specification and documentation of the standards required for archival, publication and exchange of data and metadata on molecule characterisation and reactions, together with reference implementations and data validation. Ontologies are used where possible, and missing terminological artefacts are added.
TA5 Community Involvement and Training acts as the interface between the community and the infrastructure units: the community’s requirements are collected, analysed and channelled. Equally, dissemination and training at all levels are organised, starting in early undergraduate studies, and discipline-specific training material is developed. TA5 also fosters the community’s awareness of RDM and offers incentives for innovations.
TA6 Synergies coordinates the activities of NFDI4Chem with the other NFDI consortia. TA6 is responsible for coordinating cross-cutting topics, including cross-domain metadata standards, semantic data annotation for cross-domain mapping of ontologies, provision of terminology services, and legal aspects of FAIR RDM. Harmonisation will be sought by working with international bodies such as the Research Data Alliance (RDA) and the International Union of Pure and Applied Chemistry (IUPAC). TA6 develops an overarching search service and a terminology service, both of which will be linked to the NFDI.