Detecting Out-Of-Distribution Labels in Image Datasets With Pre-trained Networks
| dc.contributor.author | Wulz, Susanne | |
| dc.contributor.author | Krispel, Ulrich | |
| dc.date.accessioned | 2025-07-30T08:15:37Z | |
| dc.date.available | 2025-07-30T08:15:37Z | |
| dc.date.issued | 2025 | |
| dc.description.abstract-translated | Ensuring the correctness of annotations in training datasets is one way to increase the trustworthiness and reliability of Machine Learning. This study aims to detect semantic shifts in datasets using Feature-Based Out-Of-Distribution and outlier detection methods, assuming that Out-Of-Distribution samples lie far from In-Distribution data. The experiments began with distance-based methods, such as k-Nearest Neighbours and the Mahalanobis distance, followed by feature pyramids and dimensionality reduction techniques to address the challenges of high-dimensional features. The results showed that the k-Nearest Neighbours detector performed robustly, achieving 100% AUROC when using ResNet50 on the Caltech-101 dataset, while the Mahalanobis detector was unstable, with scores close to 50%. Moreover, selecting the right backbone model and feature levels, particularly low-level features from ResNet50, improved performance, achieving an AUROC score of 96% on the DelftBikes dataset for both k-Nearest Neighbours and Local Outlier Factor. The study highlights that k-Nearest Neighbours and Local Outlier Factor, combined with feature pyramids and dimensionality reduction, constitute an effective setup for Out-Of-Distribution detection, but that optimal performance depends on configurations tailored to the data conditions at hand. | en |
| dc.format | 6 s. | cs |
| dc.format.mimetype | application/pdf | |
| dc.identifier.doi | http://www.doi.org/10.24132/JWSCG.2025-10 | |
| dc.identifier.issn | 1213-6972 (print) | |
| dc.identifier.issn | 1213-6964 (online) | |
| dc.identifier.uri | http://hdl.handle.net/11025/62204 | |
| dc.language.iso | en | en |
| dc.publisher | Václav Skala - UNION Agency | cs |
| dc.rights | © Václav Skala - UNION Agency | en |
| dc.rights.access | openAccess | en |
| dc.subject | strojové učení | cs |
| dc.subject | neuronové sítě | cs |
| dc.subject | konvoluční neuronové sítě | cs |
| dc.subject | detekce odchylek od distribuce na základě příznaků | cs |
| dc.subject | metody založené na vzdálenosti | cs |
| dc.subject | detekce odlehlých hodnot | cs |
| dc.subject | faktor lokálních odlehlých hodnot | cs |
| dc.subject.translated | machine learning | en |
| dc.subject.translated | neural networks | en |
| dc.subject.translated | convolutional neural networks | en |
| dc.subject.translated | feature-based out-of-distribution detection | en |
| dc.subject.translated | distance-based methods | en |
| dc.subject.translated | outlier detection | en |
| dc.subject.translated | local outlier factor | en |
| dc.title | Detecting Out-Of-Distribution Labels in Image Datasets With Pre-trained Networks | en |
| dc.type | článek | cs |
| dc.type | article | en |
| dc.type.status | Peer-reviewed | en |
| dc.type.version | publishedVersion | en |
| local.files.count | 1 | * |
| local.files.size | 1055363 | * |
| local.has.files | yes | * |
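
The abstract describes scoring samples by their distance to In-Distribution features and evaluating the detector with AUROC. The following is a minimal sketch of that idea, not the authors' implementation. The function names `knn_ood_scores` and `auroc` are hypothetical, the L2 normalisation step and the value of `k` are assumptions, and random vectors stand in for features that would come from a pre-trained backbone such as ResNet50.

```python
import numpy as np


def knn_ood_scores(train_feats, query_feats, k=5):
    """Score each query by its distance to the k-th nearest training feature."""
    # Assumption: L2-normalise so distances depend on direction, not magnitude.
    train = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    query = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    # Pairwise Euclidean distances, shape (n_query, n_train).
    dists = np.linalg.norm(query[:, None, :] - train[None, :, :], axis=2)
    # Distance to the k-th nearest neighbour; larger means more likely OOD.
    return np.sort(dists, axis=1)[:, k - 1]


def auroc(scores, is_ood):
    """AUROC as the probability that an OOD sample outscores an ID sample."""
    scores = np.asarray(scores, dtype=float)
    is_ood = np.asarray(is_ood, dtype=bool)
    pos, neg = scores[is_ood], scores[~is_ood]
    # Count pairwise wins; ties count as half a win.
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))


# Synthetic stand-in for backbone features: two well-separated clusters.
rng = np.random.default_rng(0)
dim = 16
id_mean = np.zeros(dim)
id_mean[0] = 5.0
ood_mean = np.zeros(dim)
ood_mean[1] = 5.0

train_id = id_mean + 0.5 * rng.standard_normal((200, dim))
test_id = id_mean + 0.5 * rng.standard_normal((50, dim))
test_ood = ood_mean + 0.5 * rng.standard_normal((50, dim))

queries = np.vstack([test_id, test_ood])
labels = np.concatenate([np.zeros(50), np.ones(50)])  # 1 marks OOD

scores = knn_ood_scores(train_id, queries, k=5)
result = auroc(scores, labels)
print(f"AUROC on synthetic data: {result:.3f}")
```

On real data the separation between In-Distribution and Out-Of-Distribution features is rarely this clean, which is consistent with the abstract's remark that the best configuration depends on the dataset and the feature level used.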