Abstract
Few-shot learning seeks to generalize well to unseen tasks from only a few labeled samples. Existing works achieve generalization by exploring inter-class discrimination, but their performance is limited because sample discrimination is neglected. In this work, we propose a metric-based few-shot approach, referred to as SSL-ProtoNet, that combines self-supervised learning, prototypical networks, and knowledge distillation to exploit sample discrimination. SSL-ProtoNet consists of three stages: a pre-training stage, a fine-tuning stage, and a self-distillation stage. In the pre-training stage, self-supervised learning clusters each sample with its augmented variants to enhance sample discrimination. The learned representation then serves as the starting point for the next stage. In the fine-tuning stage, the model weights transferred from the pre-training stage are fine-tuned on the target few-shot tasks. A self-supervised loss and a few-shot loss are combined to prevent overfitting during few-shot task adaptation and to maintain embedding diversity. In the self-distillation stage, the model is arranged in a teacher–student architecture, in which the teacher model guides the training of the student model to reduce overfitting and further improve performance. Experimental results show that SSL-ProtoNet outperforms state-of-the-art few-shot image classification methods on three benchmark few-shot datasets: miniImageNet, tieredImageNet, and CIFAR-FS. The source code for the proposed method is available at https://github.com/Jityan/sslprotonet.
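The abstract names the loss terms but does not give their exact form. The following is a minimal sketch under stated assumptions, not the authors' implementation. It uses the standard prototypical-network formulation (class-mean prototypes, softmax over negative squared Euclidean distances) for the few-shot loss. The weighted sum in `combined_loss`, the KL-based `distillation_loss`, and the parameters `weight` and `temperature` are illustrative assumptions, and every function name here is hypothetical.

```python
import math


def prototypes(support, labels, n_classes):
    """Mean embedding per class, computed from the labeled support set."""
    dim = len(support[0])
    sums = [[0.0] * dim for _ in range(n_classes)]
    counts = [0] * n_classes
    for vec, y in zip(support, labels):
        counts[y] += 1
        for d, v in enumerate(vec):
            sums[y][d] += v
    return [[s / counts[c] for s in sums[c]] for c in range(n_classes)]


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]


def proto_logits(query, protos):
    """Logits are negative squared Euclidean distances to each prototype."""
    return [-sum((q - p) ** 2 for q, p in zip(query, proto)) for proto in protos]


def few_shot_loss(queries, query_labels, protos):
    """Average cross-entropy of query embeddings against the prototypes."""
    total = 0.0
    for q, y in zip(queries, query_labels):
        total -= log_softmax(proto_logits(q, protos))[y]
    return total / len(queries)


def combined_loss(fs_loss, ssl_loss, weight=1.0):
    """Assumed fine-tuning objective: few-shot loss plus weighted SSL loss."""
    return fs_loss + weight * ssl_loss


def distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """Assumed distillation term: KL(teacher || student) on softened outputs."""
    t = log_softmax([z / temperature for z in teacher_logits])
    s = log_softmax([z / temperature for z in student_logits])
    return sum(math.exp(lt) * (lt - ls) for lt, ls in zip(t, s))
```

In this sketch the embeddings are plain lists of floats. In practice they would come from the pre-trained backbone, and the self-supervised loss passed to `combined_loss` would be computed on augmented views of the same samples.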
Original language | English |
---|---|
Article number | 122173 |
Journal | Expert Systems with Applications |
Volume | 238 |
DOIs | |
Publication status | Published - 15 Mar 2024 |
Externally published | Yes |
Keywords
- Few-shot classification
- Few-shot learning
- Knowledge distillation
- Prototypical networks
- Self-supervised learning
ASJC Scopus subject areas
- General Engineering
- Computer Science Applications
- Artificial Intelligence