Detecting Out-Of-Distribution Labels in Image Datasets With Pre-trained Networks

dc.contributor.author: Wulz, Susanne
dc.contributor.author: Krispel, Ulrich
dc.date.accessioned: 2025-07-30T08:15:37Z
dc.date.available: 2025-07-30T08:15:37Z
dc.date.issued: 2025
dc.description.abstract-translated: Ensuring the correctness of annotations in training datasets is one way to increase the trustworthiness and reliability of machine learning. This study aims to detect semantic shifts in datasets using feature-based Out-Of-Distribution and outlier detection methods, under the assumption that Out-Of-Distribution samples lie far from In-Distribution data. The experiments start with distance-based methods, such as k-Nearest Neighbours and the Mahalanobis distance, and then add feature pyramids and dimensionality reduction techniques to address the challenges of high-dimensional features. The results show that the k-Nearest Neighbours detector performed robustly, achieving 100% AUROC when using ResNet50 on the Caltech-101 dataset, while the Mahalanobis detector was unstable, with scores close to 50%. Moreover, selecting the right backbone model and feature levels, particularly low-level features from ResNet50, improved performance, achieving an AUROC score of 96% on the DelftBikes dataset for both k-Nearest Neighbours and Local Outlier Factor. The study highlights that k-Nearest Neighbours and Local Outlier Factor, combined with feature pyramids and dimensionality reduction, constitute an effective setup for Out-Of-Distribution detection, but that optimal performance depends on configurations tailored to the data conditions. [en]
dc.format: 6 s. [cs]
dc.format.mimetype: application/pdf
dc.identifier.doi: http://www.doi.org/10.24132/JWSCG.2025-10
dc.identifier.issn: 1213-6972 (print)
dc.identifier.issn: 1213-6964 (online)
dc.identifier.uri: http://hdl.handle.net/11025/62204
dc.language.iso: en [en]
dc.publisher: Václav Skala - UNION Agency [cs]
dc.rights: © Václav Skala - UNION Agency [en]
dc.rights.access: openAccess [en]
dc.subject: strojové učení [cs]
dc.subject: neuronové sítě [cs]
dc.subject: konvoluční neuronové sítě [cs]
dc.subject: detekce odchylek od distribuce na základě příznaků [cs]
dc.subject: metody založené na vzdálenosti · detekce odlehlých hodnot [cs]
dc.subject: faktor lokálních odlehlých hodnot [cs]
dc.subject.translated: machine learning [en]
dc.subject.translated: neural networks [en]
dc.subject.translated: convolutional neural networks [en]
dc.subject.translated: feature-based out-of-distribution detection [en]
dc.subject.translated: distance-based methods · outlier detection [en]
dc.subject.translated: local outlier factor [en]
dc.title: Detecting Out-Of-Distribution Labels in Image Datasets With Pre-trained Networks [en]
dc.type: článek [cs]
dc.type: article [en]
dc.type.status: Peer-reviewed [en]
dc.type.version: publishedVersion [en]
local.files.count: 1
local.files.size: 1055363
local.has.files: yes

Files

Original bundle
Name: C41-1-6.pdf
Size: 1.01 MB
Format: Adobe Portable Document Format