Towards Zero-Shot Camera Trap Image Categorization

dc.contributor.author: Vyskočil, Jiří
dc.contributor.author: Picek, Lukáš
dc.date.accessioned: 2026-03-24T19:05:25Z
dc.date.available: 2026-03-24T19:05:25Z
dc.date.issued: 2025
dc.date.updated: 2026-03-24T19:05:25Z
dc.description.abstract: This paper explores alternative approaches to the automatic categorization of camera trap images. First, we benchmark state-of-the-art classifiers using a single model for all images. Next, we evaluate methods that combine MegaDetector with one or more classifiers and Segment Anything to assess their impact on reducing location-specific overfitting. Last, we propose and test two approaches that use large language models and foundation models, such as DINOv2, BioCLIP, BLIP, and ChatGPT, in a zero-shot scenario. Evaluation on two publicly available datasets (WCT from New Zealand, CCT20 from the Southwestern US) and a private dataset (CEF from Central Europe) revealed that combining MegaDetector with two separate classifiers achieves the highest accuracy. Relative to a single BEiTV2 classifier, this approach reduced the error by approximately 42% on CCT20, 48% on CEF, and 75% on WCT. In addition, because the background is removed, the error in new locations is halved. The proposed zero-shot pipeline based on DINOv2 and FAISS achieved competitive results (1.0% and 4.7% lower on CCT20 and CEF, respectively), which highlights the potential of zero-shot approaches for camera trap image categorization. [en]
dc.format: 17
dc.identifier.doi: 10.1007/978-3-031-92387-6_3
dc.identifier.isbn: 978-3-031-92386-9
dc.identifier.issn: 0302-9743
dc.identifier.obd: 43944169
dc.identifier.orcid: Vyskočil, Jiří 0000-0002-6443-2051
dc.identifier.orcid: Picek, Lukáš 0000-0002-6041-9722
dc.identifier.uri: http://hdl.handle.net/11025/67353
dc.language.iso: en
dc.project.ID: SS05010008
dc.publisher: Springer
dc.relation.ispartofseries: Workshops that were held in conjunction with the 18th European Conference on Computer Vision, ECCV 2024
dc.subject: camera traps [en]
dc.subject: classification [en]
dc.subject: retrieval [en]
dc.subject: BLIP [en]
dc.subject: DINOv2 [en]
dc.subject: zero-shot [en]
dc.subject: vision and language [en]
dc.subject: ChatGPT [en]
dc.subject: SAM [en]
dc.subject: MegaDetector [en]
dc.title: Towards Zero-Shot Camera Trap Image Categorization [en]
dc.type: Proceedings paper (D)
dc.type: PROCEEDINGS PAPER
dc.type.status: Published Version
local.files.count: 1 [*]
local.files.size: 3339477 [*]
local.has.files: yes [*]
local.identifier.eid: 2-s2.0-105007140291

Files

Original bundle
Name: 978-3-031-92387-6_3.pdf
Size: 3.18 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission