Hacking for Metabolomics & Mass Spectrometry format interoperability

European BioHackathon 2025

The ELIXIR BioHackathon Europe took place 3–7 November 2025 in Bad Saarow near Berlin. Among the roughly 30 projects was the “Metabolomics and proteomics file format interoperability fest”, co-organised by Steffen Neumann. Our project focused on the mzTab-M format, its specification, and its implementations.

Around 40 participants, based in at least eleven countries, joined the Slack channel, and up to ten sat around the table(s) on site. Zoom, Slack, and shared online documents connected the two groups.

The mzTab family of standards goes back more than ten years, and captures the results of Proteomics, Metabolomics and Lipidomics experiments [10.1074/mcp.O113.036681, 10.1021/acs.analchem.8b04310]. Since then, mzTab-M has been picked up by a number of metabolomics and lipidomics tools and resources. In fact, obtaining an overview of available implementations was an initial goal and precondition for the project. Work in the project itself was organised in several departments.

In the Software and Libraries Department, we surveyed implementations and characterised their features (and limitations), obtained example files from real-world metabolomics and lipidomics studies, and pushed them through the jmzTab-M validator. Example files in the HUPO-PSI/mzTab-M repository are now validated by GitHub Actions workflows developed during the hackathon.

The real interoperability fest was then to test whether, and how well, the readers could import these files and perform their respective analyses. We had outputs from mzmine, MS-Dial, xcms, LipidDataAnalyzer and LipidCompass, Progenesis QI, MetaboScape, and MASSter. These were pushed to GNPS, LipidCompass, MetFamily, MetaboLights, and MetaboAnalyst. And there were the good, the bad, and (a few of) the ugly. In the end, we covered 20 combinations, and nearly all of them mostly succeeded.

Steffen Neumann (right) talking with other participants

The focus of the second half was therefore to improve the state of the software implementations. Luckily (and thanks to all our participants!), nearly all software ecosystems that create or accept mzTab-M were represented. Hence, some issues could be solved right on the spot; for others, we created clear issue reports to be addressed in a timely fashion.

In the mzTab-M Improvements Department, we discussed how real-world usage and the above insights can lead to a (careful) evolution of the mzTab-M specification. One of the most pressing issues is the encoding of multifactorial experimental designs, which were uncommon in the proteomics world two decades ago but are the norm in today’s metabolomics world. Another unclear corner of the specification is the handling of ambiguity, e.g. when multiple molecular structures are proposed as a result of different adduct hypotheses or other sources of uncertainty. The model now also accepts a more relaxed representation of charges for chemical features. And of course, documentation, examples, and recommendations in the specification always have room for improvement.

Finally, you are currently reading one of the products of the Training and Outreach Department 🙂 In addition to this article, we are working on a follow-up manuscript describing the real-world adoption and benefits of mzTab-M.

Once again, the ELIXIR BioHackathon proved the benefit of bringing together experts from different bioinformatics projects, with experience in various languages and frameworks, but also users with experience from metabolomics infrastructures and repositories: this combination helped to validate and improve data publishing, exchange, and interoperability.


Posted on 28/11/2025 by Steffen Neumann