4th Ontologies4Chem Workshop

The 4th Ontologies4Chem Workshop, organised by NFDI4Chem in collaboration with NFDI4Cat, the Beilstein-Institut, and the Physical Sciences Data Infrastructure (PSDI), took place from 11 to 13 November 2025 in Limburg an der Lahn, Hesse, Germany. It was the first time the workshop was held as an in-person event. The event combined presentations with extensive hands-on ‘hacking’ sessions in the afternoon.

The primary aim of the workshop was to advance a shared, community-governed “ontology canon”, understood as a set of recommended chemical ontologies for metadata annotation and FAIR data management across the chemistry community. Beyond this core objective, the workshop also placed strong emphasis on networking, providing an important platform for bringing together relevant stakeholders and fostering exchange and collaboration within the community.

Key Themes & Discussions

Toward a Chemistry Ontology Canon

Across the presentations and discussions, a recurring theme was the need for a shared, community-agreed ontology canon to reduce fragmentation and enhance interoperability across chemical data. Building on contributions from NFDI4Chem and the insights gained during the discussions, the NFDI4Chem Terminology Service (TS) was proposed as a suitable environment for documenting and governing such a canon. The TS provides integrated issue tracking linked directly to ontology repositories, as well as user-specific note-taking to capture metadata on ontology usage, including which projects use particular ontologies, how terms are applied, and where gaps remain.

Participants were invited to contribute their own real data or context, such as annotated chemical data or descriptions of their ontology usage, barriers, and missing ontologies, to stress-test and improve the canon.

Ontological Representation of Chemical Reactions

During the “chemical reaction” track of the workshop, discussions focused on how to represent reactions more richly in ontologies, going beyond just reactants and products.

While existing resources (e.g. reaction ontologies and databases) were acknowledged as useful foundations, there was a consensus that the semantic scope needed to be expanded in order to capture reaction roles (e.g. catalyst or reagent), bond changes, mechanisms, intermediates, and reaction conditions. This would enable more nuanced, machine-actionable reaction descriptions.
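To make the idea of role-aware reaction descriptions concrete, the following is a minimal sketch of a reaction record that goes beyond reactants and products. It is not taken from any ontology presented at the workshop: the class names `Role`, `Participant` and `Reaction`, the `with_role` helper and the condition keys are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    """Hypothetical set of reaction roles; a real ontology would define these as terms."""
    REACTANT = "reactant"
    PRODUCT = "product"
    CATALYST = "catalyst"
    REAGENT = "reagent"
    INTERMEDIATE = "intermediate"


@dataclass(frozen=True)
class Participant:
    """A chemical entity together with the role it plays in one reaction."""
    name: str
    role: Role


@dataclass
class Reaction:
    """A reaction described by its participants and free-text conditions."""
    participants: list[Participant] = field(default_factory=list)
    conditions: dict[str, str] = field(default_factory=dict)

    def with_role(self, role: Role) -> list[str]:
        """Return the names of all participants that have the given role."""
        return [p.name for p in self.participants if p.role is role]


# Illustrative record: acid-catalysed esterification of acetic acid with ethanol.
example = Reaction(
    participants=[
        Participant("acetic acid", Role.REACTANT),
        Participant("ethanol", Role.REACTANT),
        Participant("sulfuric acid", Role.CATALYST),
        Participant("ethyl acetate", Role.PRODUCT),
        Participant("water", Role.PRODUCT),
    ],
    conditions={"heating": "reflux"},  # assumed condition key, for illustration only
)

print(example.with_role(Role.CATALYST))
```

In an ontology-backed version, the plain strings used here for names, roles and conditions would presumably be replaced by term identifiers, which is what would make such a record machine-actionable.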

The idea of a next-generation reaction ontology was discussed, which could potentially enable more comprehensive reaction metadata, better FAIR annotation of reaction datasets, and improved interoperability.

Ontological Representation of Chemical Entities

In the chemical entity track, the community revisited the status and future of widely used ontologies, such as ChEBI (Chemical Entities of Biological Interest). While ChEBI was recognised as a core reference ontology, discussion focused on extending its scope beyond biological molecules to better represent mixtures, materials, salts, racemic mixtures, macromolecules, and so forth.

The presentation of CHEMROF showcased a semantic schema for representing individual chemical entities (e.g. elements, molecules, ions), their groupings (e.g. substance classes, mixture components), and reaction-related constructs. The framework targets not only pure compounds but also mixtures and reaction-centric information, so it can support use cases spanning small-molecule chemistry and materials, and can serve as a higher-level schema for ontologies like ChEBI.

Another contribution was an ensemble approach called Chebifier 2, which combines rule-based, deep-learning and LLM-based classification to automatically classify chemical entities into ChEBI classes (currently over 1,700).
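How Chebifier 2 combines its component models is not described in this report. As a generic illustration of ensemble classification only, the sketch below uses a simple majority vote over the class labels proposed by several classifiers. The function name `majority_vote`, the `min_votes` threshold, the model names and the example predictions are all assumptions and do not reflect the tool's actual method or output.

```python
from collections import Counter


def majority_vote(predictions: dict[str, set[str]], min_votes: int = 2) -> set[str]:
    """Keep every class label proposed by at least `min_votes` classifiers.

    `predictions` maps a classifier name to the set of class labels it assigned.
    """
    votes = Counter(label for labels in predictions.values() for label in labels)
    return {label for label, count in votes.items() if count >= min_votes}


# Hypothetical predictions for a single molecule from three kinds of model.
predictions = {
    "rule_based": {"carboxylic acid"},
    "deep_learning": {"carboxylic acid", "aromatic compound"},
    "llm_based": {"carboxylic acid"},
}

agreed = majority_vote(predictions)
print(agreed)
```

Raising `min_votes` trades coverage for precision: fewer labels survive, but each one is backed by more of the models.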

Furthermore, the launch of the improved submission portal for ChEBI was described. The portal enables community curation, which is a key step towards keeping the ontology up to date and responsive to the needs of the community.

Collaboration, Tools & Infrastructure Integration

The workshop emphasised that developing and maintaining a robust ontology canon is a social task as well as a technical one, requiring coordinated stewardship, governance, and community buy-in.

The integration of ontologies with research data infrastructures and data management tools was another recurring theme. One example is linking ontology use with metadata frameworks so that research data (experimental and computational) can be annotated consistently and made FAIR from the outset.

Participants were encouraged to use real-world data to test ontologies, file issues, propose missing ontologies and contribute to the TS, thereby making the canon more practical and community-driven.

Significance & Outlook

Putting together a community-maintained ontology canon for chemistry, alongside modern ontology-engineering tools and infrastructure, is a major step toward improved semantic interoperability in chemical research data. A robust, curated canon plus an evolving ChEBI (and related ontologies) would enable:

  • more reliable integration of chemical data from different sources (experiments, computation, literature),
  • richer and more precise modelling of reactions, entities, materials, and mixtures,
  • easier FAIR annotation, sharing, and reuse of chemical datasets, and
  • greater readiness for AI-driven research and machine learning on semantic chemical data.

With growing collaboration across NFDI4Chem, NFDI4Cat, PSDI and other partners, the community is well-positioned to make significant progress, and the 4th Workshop makes clear that ontology development in chemistry is entering a new, more mature phase.

The workshop report will be published in the near future, and planning for the 5th Ontologies4Chem Workshop in 2026 has already begun.