Point Clouds are Specialized Images: A Knowledge Transfer Approach for 3D Understanding

Jiachen Kang, Wenjing Jia, Xiangjian He, Kin Man Lam

Research output: Journal PublicationArticlepeer-review

Abstract

Self-supervised representation learning (SSRL) has gained increasing attention in point cloud understanding, in addressing the challenges posed by 3D data scarcity and high annotation costs. This paper presents PCExpert, a novel SSRL approach that reinterprets point clouds as “specialized images”. This conceptual shift allows PCExpert to leverage knowledge derived from large-scale image modality in a more direct and deeper manner, via extensively sharing the parameters with a pretrained image encoder in a multi-way Transformer architecture. The parameter sharing strategy, combined with an additional pretext task for pre-training, i.e., transformation estimation, empowers PCExpert to outperform the state of the arts in a variety of tasks, with a remarkable reduction in the number of trainable parameters. Notably, PCExpert’s performance under LINEAR fine-tuning (e.g., yielding a 90.02% overall accuracy on ScanObjectNN) has already closely approximated the results obtained with FULL model fine-tuning (92.66%), demonstrating its effective representation capability.

Original languageEnglish
Pages (from-to)10755-10765
Number of pages11
JournalIEEE Transactions on Multimedia
Volume26
DOIs
Publication statusPublished - 2024

Keywords

  • Cross-modal learning
  • point cloud understanding
  • self-supervision
  • transfer learning

ASJC Scopus subject areas

  • Signal Processing
  • Media Technology
  • Computer Science Applications
  • Electrical and Electronic Engineering

Fingerprint

Dive into the research topics of 'Point Clouds are Specialized Images: A Knowledge Transfer Approach for 3D Understanding'. Together they form a unique fingerprint.

Cite this