Marker-Free Multi-Modal Motion Capture for 6-DoF Object Position and Orientation Estimation

Research output: Chapter in Book/Conference proceeding › Conference contribution › peer-review

1 Citation (Scopus)

Abstract

In this work, we present a novel multi-modal, end-to-end, marker-free motion capture framework designed to estimate the six-degree-of-freedom (6-DoF) states of objects. Traditional motion capture systems often rely on infrared optical, inertial, or magnetic markers to identify and track objects. However, in many application scenarios, such as outdoor environments and robotics development, the use of markers interferes with system operation, and the markers themselves are prone to environmental interference. Our proposed framework tackles these challenges using a two-stage approach that leverages multi-modal sensor fusion techniques. The framework integrates cameras and Light Detection and Ranging (LiDAR) sensors around the workspace, each operating at a different frequency. A data synchronizer controls the triggering of these sensors, ensuring synchronized data collection from multiple sensor streams. In stage I, the framework focuses on multi-modal feature extraction, utilizing multiple modules to process the sensor data streams and extract spatial features. In stage II, the position and pose extraction module calculates the spatial state of the object by combining the extracted features with the object's spatial state context from previous frames. We validate the framework through experiments on the NVIDIA Isaac digital twin platform and in real-world environments, demonstrating its feasibility and robustness across a variety of test objects. This approach provides a reliable and flexible solution for motion capture in complex environments, eliminating the need for invasive markers. High real-time performance is achieved at each stage and within each submodule by using lightweight neural networks and time-aligned data synchronization. By integrating multi-modal sensor fusion and context-based spatial state computation, the proposed method ensures high recognition accuracy, even in challenging cases involving symmetrical objects.
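The abstract does not say how the previous-frame context enters the stage II computation. As a purely illustrative sketch (not the authors' method), one simple way such context can resolve a symmetric object's ambiguous orientation is to choose, among the symmetry-equivalent angles, the one closest to the previous frame's estimate. The function names, the single-axis yaw simplification, and the `symmetry_order` parameter are all assumptions introduced here.

```python
import math


def wrap_angle(a: float) -> float:
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (a + math.pi) % (2.0 * math.pi) - math.pi


def disambiguate_yaw(measured_yaw: float, previous_yaw: float,
                     symmetry_order: int) -> float:
    """Hypothetical helper: pick the symmetry-equivalent yaw nearest to the
    previous frame's yaw.

    An object with n-fold rotational symmetry about its vertical axis looks
    identical at yaw, yaw + 2*pi/n, yaw + 4*pi/n, and so on. A single frame
    cannot tell these apart, but a smoothly moving object rarely jumps
    between them from one frame to the next.
    """
    if symmetry_order < 1:
        raise ValueError("symmetry_order must be at least 1")
    step = 2.0 * math.pi / symmetry_order
    candidates = [wrap_angle(measured_yaw + k * step)
                  for k in range(symmetry_order)]
    # Compare on the circle, so that -pi and +pi count as neighbours.
    return min(candidates, key=lambda c: abs(wrap_angle(c - previous_yaw)))


if __name__ == "__main__":
    # A square-like object (4-fold symmetry): the detector reports a yaw
    # near 0, but the previous frame had the object at roughly 90 degrees.
    previous = math.pi / 2 + 0.05
    print(disambiguate_yaw(0.1, previous, symmetry_order=4))
```

A full 6-DoF version would apply the same idea to rotations in 3D (for example, comparing quaternion candidates), which this sketch leaves out for brevity.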

Original language: English
Title of host publication: 2025 IEEE Symposium on Computational Intelligence in Image, Signal Processing and Synthetic Media, CISM 2025
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798331508357
Publication status: Published - 2025
Event: 2025 IEEE Symposium on Computational Intelligence in Image, Signal Processing and Synthetic Media, CISM 2025 - Trondheim, Norway
Duration: 17 Mar 2025 → 20 Mar 2025

Publication series

Name: 2025 IEEE Symposium on Computational Intelligence in Image, Signal Processing and Synthetic Media, CISM 2025

Conference

Conference: 2025 IEEE Symposium on Computational Intelligence in Image, Signal Processing and Synthetic Media, CISM 2025
Country/Territory: Norway
City: Trondheim
Period: 17/03/25 → 20/03/25

Keywords

  • 6-DoF estimation
  • Marker-free motion capture
  • Sensor fusion

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications
  • Mathematical Physics
