T-MAE: Temporal Masked Autoencoders for Point Cloud Representation Learning

Open Access
Authors
Publication date 2025
Host editors
  • A. Leonardis
  • E. Ricci
  • S. Roth
  • O. Russakovsky
  • T. Sattler
  • G. Varol
Book title Computer Vision – ECCV 2024
Book subtitle 18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings
ISBN
  • 9783031732461
ISBN (electronic)
  • 9783031732478
Series Lecture Notes in Computer Science
Event The 18th European Conference on Computer Vision ECCV 2024
Volume XI
Pages (from-to) 178–195
Publisher Cham: Springer
Organisations
  • Faculty of Science (FNWI) - Institute for Biodiversity and Ecosystem Dynamics (IBED)
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
The scarcity of annotated data hinders effective representation learning for LiDAR point cloud understanding, which has prompted active research into effective self-supervised pre-training paradigms. However, the temporal information inherent in LiDAR point cloud sequences is consistently overlooked. To better exploit this property, we propose an effective pre-training strategy, Temporal Masked Auto-Encoders (T-MAE), which takes temporally adjacent frames as input and learns temporal dependencies. A SiamWCA backbone, comprising a Siamese encoder and a windowed cross-attention (WCA) module, is established for the two-frame input. Because the ego-vehicle's movement changes the view of the same instance, temporal modeling also acts as a robust and natural data augmentation that improves the comprehension of target objects. SiamWCA is a powerful architecture but relies heavily on annotated data; the T-MAE pre-training strategy alleviates this demand. Comprehensive experiments demonstrate that T-MAE achieves the best performance among competitive self-supervised approaches on both the Waymo and ONCE datasets.
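To illustrate the two-frame design described in the abstract, the sketch below shows the general pattern of a shared (Siamese) encoder applied to a past and a current LiDAR frame, followed by cross-attention that lets current-frame tokens query past-frame tokens. This is a minimal, assumption-laden sketch, not the paper's implementation: the toy MLP encoder, layer sizes, and the use of global (rather than windowed, voxel-based) cross-attention are placeholders chosen for brevity.

# Minimal PyTorch sketch of the Siamese-encoder + cross-attention pattern
# described in the abstract. All module details are illustrative assumptions;
# the actual SiamWCA backbone uses a windowed, sparse point cloud encoder.
import torch
import torch.nn as nn


class SiamCrossAttentionSketch(nn.Module):
    def __init__(self, in_dim: int = 3, embed_dim: int = 256, num_heads: int = 8):
        super().__init__()
        # Shared ("Siamese") encoder applied to both frames (toy MLP stand-in).
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, embed_dim),
            nn.ReLU(),
            nn.Linear(embed_dim, embed_dim),
        )
        # Cross-attention: queries from the current frame, keys/values from the
        # past frame (a global stand-in for the paper's windowed cross-attention).
        self.cross_attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(embed_dim)

    def forward(self, past_pts: torch.Tensor, curr_pts: torch.Tensor) -> torch.Tensor:
        # past_pts, curr_pts: (B, N, in_dim) point coordinates or features.
        past_tokens = self.encoder(past_pts)
        curr_tokens = self.encoder(curr_pts)
        # Fuse temporal context from the past frame into current-frame tokens.
        fused, _ = self.cross_attn(query=curr_tokens, key=past_tokens, value=past_tokens)
        return self.norm(curr_tokens + fused)


if __name__ == "__main__":
    model = SiamCrossAttentionSketch()
    past = torch.randn(2, 1024, 3)   # previous LiDAR frame (random toy data)
    curr = torch.randn(2, 1024, 3)   # current LiDAR frame (random toy data)
    out = model(past, curr)
    print(out.shape)                 # torch.Size([2, 1024, 256])

In the paper's pre-training setting, the current frame would additionally be masked and reconstructed by a decoder; the sketch only covers the two-frame fusion step.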
Document type Conference contribution
Language English
Published at https://doi.org/10.1007/978-3-031-73247-8_11
Downloads
T-MAE (Final published version)
Supplementary materials