Digital Library

A study on tackling visual odometry by a transformer architecture

**Author(s):** Wen, Xiaowei
Contributor(s)	Di Stefano, Luigi; De Luigi, Luca
Date(s)	06/10/2022
Abstract	This dissertation presents an in-depth study of the Visual Odometry (VO) problem tackled with transformer architectures. Existing VO algorithms rely on heavily hand-crafted features and do not generalize well to new environments. To train them, the hyper-parameters and the network architecture must be carefully fine-tuned. We propose to tackle the VO problem with a transformer because it is a general-purpose architecture and because it was designed to transform sequences of data from one domain to another, which is the case for the VO problem. Our first goal is to create a synthetic dataset using the BlenderProc2 framework to mitigate the problem of dataset scarcity. The second goal is to tackle the VO problem with different versions of the transformer architecture, which are pre-trained on the synthetic dataset and fine-tuned on a real dataset, KITTI. Our approach is defined as follows: we use a feature extractor to extract feature embeddings from a sequence of images, then we feed this sequence of embeddings to the transformer architecture, and finally an MLP predicts the sequence of camera poses.
Format	application/pdf
Identifier	http://amslaurea.unibo.it/26920/1/tesi.pdf Wen, Xiaowei (2022) A study on tackling visual odometry by a transformer architecture. [Master's degree thesis], Università di Bologna, Degree Programme in Artificial intelligence [LM-DM270] <http://amslaurea.unibo.it/view/cds/CDS9063/>, Restricted-access document.
Language(s)	en
Publisher	Alma Mater Studiorum - Università di Bologna
Relation	http://amslaurea.unibo.it/26920/
Rights	Free to read
Keywords	Visual Odometry, Transformer, Deep learning, Artificial intelligence [LM-DM270]
Type	PeerReviewed; info:eu-repo/semantics/masterThesis
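
The approach described in the abstract (feature extractor, then transformer, then MLP) can be illustrated with a minimal, dependency-free sketch. It is not the thesis implementation: the chunk-averaging feature extractor, the single self-attention layer standing in for the transformer, the random untrained weights, the 6-value pose encoding (3 translation plus 3 rotation values), and all names and sizes below are assumptions made only to show how data flows through the three stages.

```python
import math
import random

EMBED_DIM = 8   # assumed embedding size per frame
HIDDEN_DIM = 16 # assumed MLP hidden size
POSE_DIM = 6    # assumed pose encoding: 3 translation + 3 rotation values


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def extract_features(image, dim=EMBED_DIM):
    """Stand-in feature extractor: average a flat pixel list in `dim` chunks."""
    chunk = len(image) // dim
    return [sum(image[i * chunk:(i + 1) * chunk]) / chunk for i in range(dim)]


def self_attention(x, wq, wk, wv):
    """One scaled dot-product self-attention layer with a residual connection."""
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]  # transpose keys
    weights = [softmax([s / scale for s in row]) for row in matmul(q, k_t)]
    attended = matmul(weights, v)
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(x, attended)]


def mlp_head(x, w1, w2):
    """Two-layer MLP with ReLU mapping each embedding to one pose vector."""
    hidden = [[max(0.0, h) for h in row] for row in matmul(x, w1)]
    return matmul(hidden, w2)


def init_params(seed=0):
    """Random, untrained weights; a real model would learn these."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "wq": mat(EMBED_DIM, EMBED_DIM),
        "wk": mat(EMBED_DIM, EMBED_DIM),
        "wv": mat(EMBED_DIM, EMBED_DIM),
        "w1": mat(EMBED_DIM, HIDDEN_DIM),
        "w2": mat(HIDDEN_DIM, POSE_DIM),
    }


def predict_poses(images, params):
    """Images -> embeddings -> attention -> one pose vector per frame."""
    embeddings = [extract_features(img) for img in images]
    encoded = self_attention(embeddings, params["wq"], params["wk"], params["wv"])
    return mlp_head(encoded, params["w1"], params["w2"])


# Usage: four synthetic "frames" of 32 pixels each.
rng = random.Random(1)
frames = [[rng.random() for _ in range(32)] for _ in range(4)]
poses = predict_poses(frames, init_params())
print(len(poses), len(poses[0]))  # one 6-value pose per input frame
```

In practice the feature extractor would typically be a learned image encoder and the weights would come from the pre-training and fine-tuning stages the abstract mentions; the sketch only fixes the shapes, with a sequence of frames going in and a sequence of pose vectors of the same length coming out.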
