Learning Spatial-Aware Cross-View Embeddings for Ground-to-Aerial Geolocalization

Rui Cao, Jiasong Zhu, Qing Li, Qian Zhang, Qingquan Li, Bozhi Liu, Guoping Qiu

Research output: Chapter in Book/Conference proceeding › Conference contribution › peer-review

2 Citations (Scopus)


Image-based geolocalization is an important alternative to GPS-based localization in GPS-denied situations. Among image-based approaches, ground-to-aerial geolocalization is particularly promising but also difficult, owing to the drastic viewpoint and appearance differences between ground and aerial images. In this paper, we propose a novel spatial-aware Siamese-like network that addresses this issue by using a spatial transformer layer to alleviate the large view variation and to learn location-discriminative embeddings from the cross-view images. Furthermore, we propose combining the triplet ranking loss with a simple and effective location identity loss to further improve performance. We test our method on a publicly available dataset, and the results show that it outperforms the state of the art by a large margin.
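As a rough illustration of how a triplet ranking loss can be combined with a location identity loss, the sketch below uses NumPy. It is not the authors' implementation: the abstract does not give the exact formulation, so the batch-wise hinge form, the softmax cross-entropy used for the identity term, the names `triplet_ranking_loss`, `location_identity_loss` and `combined_loss`, and the `margin` and `weight` parameters are all assumptions made for illustration. The spatial transformer layer and the network itself are omitted.

```python
import numpy as np


def l2_normalize(x):
    # Scale each embedding (row) to unit length.
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def triplet_ranking_loss(ground, aerial, margin=0.2):
    """Hinge triplet loss over a batch (assumed form, ground-to-aerial only).

    Row i of `ground` and row i of `aerial` are taken to show the same
    location; every other aerial row in the batch serves as a negative.
    """
    g, a = l2_normalize(ground), l2_normalize(aerial)
    # Squared Euclidean distance between every ground/aerial pair.
    dist = np.sum((g[:, None, :] - a[None, :, :]) ** 2, axis=2)
    pos = np.diag(dist)[:, None]  # distance to the matching aerial image
    hinge = np.maximum(0.0, pos - dist + margin)
    np.fill_diagonal(hinge, 0.0)  # a matched pair is not its own negative
    n = len(g)
    return hinge.sum() / (n * (n - 1))


def location_identity_loss(logits, location_ids):
    """Softmax cross-entropy that treats each location as a class (assumed)."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(location_ids)), location_ids].mean()


def combined_loss(ground, aerial, logits, location_ids, margin=0.2, weight=1.0):
    # `weight` balances the two terms; its value here is hypothetical.
    return (triplet_ranking_loss(ground, aerial, margin)
            + weight * location_identity_loss(logits, location_ids))
```

In this sketch the ranking term pulls matching ground and aerial embeddings together relative to non-matching ones, while the identity term encourages embeddings to be separable by location; how the two are actually weighted and implemented in the paper is not stated in the abstract.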

Original language: English
Title of host publication: Image and Graphics - 10th International Conference, ICIG 2019, Proceedings, Part 1
Editors: Yao Zhao, Chunyu Lin, Nick Barnes, Baoquan Chen, Rüdiger Westermann, Xiangwei Kong
Number of pages: 11
ISBN (Print): 9783030341190
Publication status: Published - 2019
Event: 10th International Conference on Image and Graphics, ICIG 2019 - Beijing, China
Duration: 23 Aug 2019 to 25 Aug 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11901 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 10th International Conference on Image and Graphics, ICIG 2019


  • Cross-view geolocalization
  • Deep metric learning
  • Image retrieval
  • Image-based localization

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science

