National research data infrastructure for chemistry

Workshop report

On 30 October 2018, more than 50 participants met at the TIB for the workshop “National Research Data Infrastructure for Chemistry (NFDI4Chem)”. Scientists from university and non-university research organisations, together with representatives of infrastructure facilities and funding institutions, accepted the invitation from TIB and the German Chemical Society (GDCh) to discuss the future of research data management in academic chemical research.

The prehistory – the idea of the National Research Data Infrastructure

In her article “Research data – step by step. But where is the journey going?”, Janna Neumann gave an outlook on developments in science policy for 2018. In March 2018, the Council for Information Infrastructures (RfII) set out its ideas and recommendations for a National Research Data Infrastructure (NFDI) in more concrete terms in its latest discussion paper, “Cooperation as an opportunity – Second impulse for discussion on the design of a National Research Data Infrastructure (NFDI) for science in Germany”. The RfII formulates the idea of the NFDI as follows:

Here, scientific communities and specialised communities, i.e. the research users who require certain research data management services, should form on a sufficiently broad basis and […] join forces with infrastructure partners […]. The resulting […] consortia receive resources that are intended to create a permanent basis for the necessary research data management solutions, particularly in terms of organisation and personnel. The consortia are the main players in the organisation of research data management – and also in the gradual development of the NFDI.

The proposals for the NFDI have now been taken up and commented on by the Joint Science Conference (GWK) and numerous research organisations.

First steps towards an NFDI4Chem

Back in April 2018, TIB and GDCh organised the first “National Research Data in Chemistry (NFDI4Chem)” expert discussion. Almost 20 participants from science, infrastructure facilities and funding institutions discussed the status quo of research data management (RDM) in chemistry at academic research institutions and the challenges for the establishment of an NFDI4Chem.

This stocktaking revealed a very heterogeneous picture of how research data is currently handled at research institutes. Chemical experiments are still documented predominantly in paper laboratory notebooks, observations and measurements are scattered across analogue and digital media, and data is evaluated in both analogue and digital form. At national level, there are few data repositories in which chemists can store their data, and some of the open data formats available for this purpose are outdated. Furthermore, researchers are not sufficiently aware of research data management, and the available tools and options are often unknown to them. Data publications play only a minor role in scientific reputation.

The central thesis of NFDI4Chem is therefore nothing less than a call for digital change in chemical research, combined with the demand to raise awareness of research data management among scientists and to win them over to it. Successful research data management in an increasingly digital age requires all processes to be digitised and interlinked, from data collection through processing and analysis to publication. Electronic laboratory notebooks (ELNs) can become an important tool for supporting scientists in preparing their research data. The findings of this discussion were published in the NFDI4Chem thesis paper.

NFDI4Chem Workshop

The NFDI4Chem discussion continued on 30 October. The participation of numerous scientists from a wide range of fields in chemistry, together with representatives of professional societies such as the GDCh and its specialist groups, the German Pharmaceutical Society (DPhG) and the Bunsen Society, clearly showed that the “scientific communities and specialised communities, i.e. the research users who require certain research data management services” want to play a major role in the design of an NFDI4Chem. Important infrastructure facilities were represented as well, by TIB, FIZ Karlsruhe, and university libraries and computing centres. Angelina Kraft and Oliver Koepler opened with an introduction to the current status of the NFDI and NFDI4Chem discussion, followed by a brief recap of the FAIR principles. In the afternoon, four groups discussed the topics “Digital Change”, “Data Repositories”, “Tools, Services and Networking” and “Community Involvement” and worked out possible focal points for an NFDI4Chem. The significance of each subject area and field of action for the individual disciplines of chemistry was analysed in depth.

Digital change in chemistry encompasses a whole range of aspects. These include the end-to-end digitalisation and interlinking of all processes, e.g. through the use of ELNs and other tools. Because of the large number of device and data types, a networked infrastructure can only be interoperable if open data formats are developed and applied. Annotating and enriching data with metadata must be intuitive at every point in the process chain and must be supported by devices and software.

Repositories with German participation such as Chemotion, nmrshiftDB, NOMAD, Göbench or RADAR were identified as interesting, possible building blocks for the development of an NFDI4Chem, although international repositories are of course just as important. There was lively discussion about which data, ranging from raw data to curated data, should be stored in data repositories and how extensively it should be described. Metadata standards and controlled vocabularies were recognised as important, and minimum information standards were mentioned. The goals should be reproducibility, comparability and good scientific practice. Participants also asked what it means for RDM if future re-use cannot be predicted. In connection with data publication in repositories, several further aspects were identified as important: licensing and licence standards for research data, role and rights management, quality standards, and the provenance of research data (hash or blockchain methods).

A number of measures were identified to improve the status of research data management in the specialist community. In general, the digital handling of data should be anchored in the curriculum and become a matter of course in everyday laboratory work through appropriate training and clear rules. Added value, such as the quality assurance of research data, should be emphasised. Universities can gain impetus from exchanging experience with industry (Industry 4.0). The professional associations have an important role to play in developing recommendations and standards for training. Research funders can also provide impetus through guidelines on RDM, as can publishers through their requirements for submitted publications. It is important to strike a balance between incentives and obligations.

International networking is essential for an NFDI4Chem. Data standards and policies should be developed in dialogue with international initiatives such as the RDA Data Interest Group/Chemistry, which consists of representatives of the International Union of Pure and Applied Chemistry, the Research Data Alliance and the GO FAIR Chemistry Implementation Network, among others. Connecting to international repositories is just as relevant as developing new publication processes with international journals.

What happens next?

The aim of the workshop was to define the thematic focus and fields of work for a National Research Data Infrastructure more concretely and to recruit active participants for the formation of an NFDI4Chem consortium. This goal has been achieved: a draft structure plan is available, working groups are being formed, and invitations to participate are being sent out. Newly identified stakeholders are being approached and invited to take part. The findings of the workshop will also be discussed further within the professional organisations. The NFDI4Chem itself and some of its future members are in contact with international initiatives in order to prepare the harmonisation of data standards and policies. The cross-cutting role of specialised fields such as materials science or biochemistry, and the resulting need for networking with the NFDI4Ing or NFDI4Life, were also identified. And last but not least: the next workshop is expected to take place at the beginning of 2019.

Are you interested in the NFDI4Chem and in actively helping to shape it? Contact us at