October 6, 2023

Watch out for negative data bias in machine learning

Machine learning, with its potential to revolutionise the field of immunology, isn’t immune to the pitfalls of biased data. That is the cautionary lesson of a new academic publication by ImmuneWatch co-founders Prof. Kris Laukens and Prof. Pieter Meysman.

In this paper, they discuss recent work by Gao et al., who sought to predict TCR–epitope binding for epitopes not seen during training, reporting a ROC-AUC of 70.8%. Yet a new evaluation by the team at the University of Antwerp showed that biases in the negative data used to train the model caused performance to drop to random levels when the model was tested in real-world scenarios.

For context, biases in machine learning data aren’t a new phenomenon. An illustrative example is a classifier designed to identify malignant skin lesions, which instead learned to identify rulers in images. Such biases, whether obvious or subtle, compromise the efficacy of algorithms in real-world applications.
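
To make the mechanism concrete, here is a minimal toy simulation in Python. It is not the model of Gao et al. or the evaluation from the paper; it is a sketch under stated assumptions. Every sample carries a single hypothetical "shortcut" feature that reflects only where the data came from, not whether binding occurs. The names `shortcut_model`, `biased_negatives` and `unbiased_negatives`, and the Gaussian distributions, are invented for illustration.

```python
import random


def roc_auc(pos_scores, neg_scores):
    """ROC-AUC as the probability that a random positive outscores
    a random negative (ties count as half a win)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


rng = random.Random(0)  # fixed seed so the run is repeatable
N = 500

# Assumption: the shortcut feature encodes only the data source.
# Positives come from source A (feature centred on 1.0).
positives = [rng.gauss(1.0, 1.0) for _ in range(N)]
# Biased negatives are drawn from a different source B (centred on 0.0).
biased_negatives = [rng.gauss(0.0, 1.0) for _ in range(N)]
# Unbiased negatives come from the same source A as the positives.
unbiased_negatives = [rng.gauss(1.0, 1.0) for _ in range(N)]


def shortcut_model(feature):
    """Hypothetical model that learned the source artefact, not binding."""
    return feature


pos_scores = [shortcut_model(x) for x in positives]
biased_auc = roc_auc(pos_scores, [shortcut_model(x) for x in biased_negatives])
unbiased_auc = roc_auc(pos_scores, [shortcut_model(x) for x in unbiased_negatives])

print(f"AUC with biased negatives:   {biased_auc:.3f}")
print(f"AUC with unbiased negatives: {unbiased_auc:.3f}")
```

In this toy setting the first score sits well above chance, while the second hovers around 0.5, because the model has nothing but the source artefact to rely on. Evaluating against negatives that share the positives' origin is what exposes the shortcut.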

At ImmuneWatch, we have no doubt that accurate TCR–epitope binding prediction tools hold immense potential for the development of T-cell-based diagnostics and therapeutics, which will ultimately benefit the patient. Rigorous validation of prediction scores and the exclusion of negative data bias are an important part of bringing these models to reality.

The pitfalls of negative data bias for the T-cell epitope specificity challenge
Nature Machine Intelligence
https://doi.org/10.1038/s42256-023-00727-0

