Efficient Model Learning from Data with Partially Incorrect Labels

Principal Investigators:

Alan Akbik

Team Members:

Christoph Alt (Postdoctoral researcher)
Elena Merdjanovska (Doctoral researcher)

Learning from noisy labels

Research Unit 3, SCIoI Project 44

All intelligent systems that learn by example must be able to deal with incorrectly provided examples, i.e., data points for which the label is incorrect. Consider a child (or robotic agent) pointing at a potted plant to ask for its name, while the parent mistakenly believes the child is pointing at the window behind the plant and thus provides the wrong name. Ideally, the child should notice the misunderstanding in this situation and conclude that the provided label actually belongs to a different type of object.

Learning from noisy labels is an important research area in machine learning that distinguishes between instance-independent and instance-dependent noise. For the former, a multitude of strategies have been proposed, including noise adaptation layers, loss correction, and reweighting, whereas the latter is comparatively less researched despite its higher relevance for real-world applications. In particular, existing strategies build on complex multi-model architectures to counter confirmation bias and semantic drift, and thus do not scale to applications such as online learning.
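
To make the instance-independent case concrete, the following is a minimal sketch of forward loss correction, one known variant of the loss-correction family mentioned above. It assumes a known noise transition matrix whose entry [i][j] is the probability that a data point of clean class i is observed with noisy label j; the function name, the matrix values, and the example probabilities are illustrative assumptions, not part of this project. The noise is instance-independent because the matrix does not depend on the input itself.

```python
import math
from typing import Sequence


def forward_corrected_nll(
    clean_probs: Sequence[float],
    noisy_label: int,
    transition: Sequence[Sequence[float]],
) -> float:
    """Negative log-likelihood of a noisy label under forward correction.

    clean_probs: the model's predicted distribution over clean classes.
    transition:  hypothetical noise matrix, transition[i][j] = P(noisy=j | clean=i).
    """
    # Push the clean prediction through the noise model to get the
    # probability of observing the (possibly wrong) label we were given.
    noisy_prob = sum(
        clean_probs[i] * transition[i][noisy_label]
        for i in range(len(clean_probs))
    )
    return -math.log(noisy_prob)


# Illustrative values only: class 0 flips to class 1 with probability 0.2,
# class 1 flips to class 0 with probability 0.3.
example_transition = [[0.8, 0.2], [0.3, 0.7]]
example_loss = forward_corrected_nll([0.9, 0.1], 1, example_transition)
```

With an identity transition matrix (no noise), this reduces to the ordinary cross-entropy on the given label. For instance-dependent noise, no single matrix of this kind applies, since the flip probability varies with the input.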

With this project, we aim to devise strategies for intelligent systems to deal with such incorrectly labeled data points while maintaining a single model of the world. This entails (1) identifying possibly incorrect data points based on the model’s current understanding of the world, (2) learning to ignore incorrectly labeled data points, and (3) potentially overwriting incorrectly labeled data points with pseudo-labels and using these as supervision. We argue that such a mechanism will enable model learning in the presence of noisy data, and more effectively scale to realistic scenarios that involve large amounts of streaming and partially labeled data.
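
The three steps above can be sketched for a single model as follows. This is only an illustration under stated assumptions, not the method developed in the project: the function name `triage_example`, the `Decision` record, and both thresholds are hypothetical, and the model's "current understanding" is stood in for by its predicted class probabilities for one data point.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Decision:
    suspicious: bool  # step 1: was the given label flagged as possibly wrong?
    weight: float     # step 2: 0.0 means the data point is ignored in training
    label: int        # step 3: the given label, or a pseudo-label replacing it


def triage_example(
    probs: Sequence[float],
    given_label: int,
    suspect_below: float = 0.2,   # hypothetical threshold
    relabel_above: float = 0.9,   # hypothetical threshold
) -> Decision:
    """Decide how one labeled data point is used, given the model's own prediction."""
    top_label = max(range(len(probs)), key=lambda k: probs[k])

    # Step 1: flag the label if the model assigns it very low probability.
    suspicious = probs[given_label] < suspect_below
    if not suspicious:
        return Decision(False, 1.0, given_label)

    # Step 3: overwrite with a pseudo-label only if the model is very confident.
    if probs[top_label] >= relabel_above:
        return Decision(True, 1.0, top_label)

    # Step 2: otherwise ignore the data point by giving it zero weight.
    return Decision(True, 0.0, given_label)


relabeled = triage_example([0.05, 0.92, 0.03], given_label=0)
ignored = triage_example([0.10, 0.50, 0.40], given_label=0)
kept = triage_example([0.70, 0.20, 0.10], given_label=0)
```

Because the same model both flags and relabels data points, a sketch like this is exposed to the confirmation bias discussed earlier; the fixed thresholds here do nothing to counter it.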


Related Publications

Ziletti, A., Akbik, A., Berns, C., Herold, T., Legler, M., & Viell, M. (2022). Medical Coding with Biomedical Transformer Ensembles and Zero/Few-shot Learning. NAACL, 176–187. https://doi.org/10.18653/v1/2022.naacl-industry.21
Weber, L., Sänger, M., Garda, S., Barth, F., Alt, C., & Leser, U. (2021). Humboldt@ DrugProt: Chemical-Protein Relation Extraction with Pretrained Transformers and Entity Descriptions. Proceedings of the BioCreative VII challenge evaluation workshop. https://biocreative.bioinformatics.udel.edu/media/store/files/2021/Track1_pos_2_BC7_submission_172.pdf
Harbecke, D., Chen, Y., Hennig, L., & Alt, C. (2022). Why only Micro-F1? Class Weighting of Measures for Relation Classification. Proceedings of the 1st Workshop on Efficient Benchmarking in NLP. https://doi.org/10.18653/v1/2022.nlppower-1.4


Photo by DeepMind on Unsplash