Weakly supervised bounding-box generation for camera-trap image based animal detection

Puxuan Xie, Renwu Gao, Weizeng Lu, Linlin Shen

Research output: Journal Publication › Article › peer-review

Abstract

In ecology, deep learning is improving the performance of camera-trap image based wild animal analysis. However, the high cost of labelling remains a major challenge, as it requires a large amount of human annotation. For example, the Snapshot Serengeti (SS) dataset contains over 900,000 images, of which only 322,653 contain valid animals; 68,000 volunteers were recruited to provide image-level labels such as species, the number of animals and five behaviour attributes such as standing, resting and moving. In contrast, the Gold Standard SS Bounding-Box Coordinates (GSBBC) dataset contains only 4011 images for training object detection algorithms, because annotating bounding-boxes for the animals in an image is much more costly. This number of training images is clearly insufficient. To address this, the authors propose a method to generate bounding-boxes for a larger dataset using a limited number of manually labelled images. The authors first train a wild animal detector on a small, manually labelled dataset (e.g. GSBBC) to locate animals in images; then apply this detector to a larger dataset (e.g. SS) to generate bounding-boxes; and finally remove false detections according to the existing label information of the images. Experiments show that a detector trained with images whose bounding-boxes are generated by the proposed method outperforms existing camera-trap image based animal detection in terms of mean average precision (mAP). Compared with the traditional data augmentation method, the proposed method improved the mAP by 21.3% and 44.9% for rare species, also alleviating the long-tail issue in the data distribution. In addition, detectors trained with the proposed method achieve promising results when applied to classification and counting tasks, which are commonly required in wildlife research.
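
The final step of the pipeline described above, removing false detections using existing image-level labels, might look roughly like the following Python sketch. The abstract does not state the exact filtering rule, so everything here is an assumption made for illustration: the `Detection` record, the `filter_detections` function, the `min_score` threshold, and the rule itself (keep only boxes whose predicted species matches the image-level species label, then keep at most as many boxes as the labelled animal count).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """Hypothetical record for one detector output on one image."""
    species: str
    score: float
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def filter_detections(
    detections: List[Detection],
    label_species: str,
    label_count: int,
    min_score: float = 0.5,  # assumed threshold, not taken from the paper
) -> List[Detection]:
    """Keep detections consistent with the image-level label.

    Assumed rule: discard boxes whose species disagrees with the label or
    whose confidence is below min_score, then keep the label_count
    highest-scoring boxes that remain.
    """
    matching = [
        d for d in detections
        if d.species == label_species and d.score >= min_score
    ]
    matching.sort(key=lambda d: d.score, reverse=True)
    return matching[:label_count]


# Illustrative detections for one image labelled "zebra" with a count of 2.
detections = [
    Detection("zebra", 0.91, (10.0, 20.0, 110.0, 140.0)),
    Detection("lion", 0.80, (200.0, 50.0, 260.0, 120.0)),
    Detection("zebra", 0.62, (150.0, 30.0, 240.0, 150.0)),
    Detection("zebra", 0.30, (5.0, 5.0, 15.0, 15.0)),
]
kept = filter_detections(detections, label_species="zebra", label_count=2)
```

Under these assumptions the lion box is dropped because it contradicts the image-level species label, and the low-confidence zebra box is dropped by the threshold; the surviving boxes would then serve as generated annotations for training.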

Original language: English
Article number: e12332
Journal: IET Computer Vision
Volume: 19
Issue number: 1
Publication status: Published - 1 Jan 2025

Keywords

  • computer vision
  • image classification
  • learning (artificial intelligence)
  • object detection

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition
