2 results for Visual immersive environments
in AMS Tesi di Laurea - Alm@DL - Università di Bologna
Abstract:
Since the long winter of virtual reality (VR) ended at the beginning of the 2010s, hardware technologies and software platforms have improved considerably in both performance and cost. Many expect this trend to continue, pushing the penetration rate of VR headsets to skyrocket at some point in the future, just as happened with mobile platforms. In the meantime, virtual reality is slowly transitioning from a specialized, laboratory-only technology to a consumer electronics appliance, opening interesting opportunities and challenges. In this transition, two interesting research questions are how 2D-based content and applications may benefit from (or be hurt by) the adoption of 3D-based immersive environments, and how to effectively support such integration. Acknowledging the relevance of the former, we consider here the latter question, focusing our attention on the diversified family of PC-based simulation tools and platforms. VR-based visualization is, in fact, widely understood and appreciated in the simulation arena, but mainly confined to high-performance computing laboratories. Our contribution aims at characterizing the simulation tools that could benefit from immersive interfaces, and at providing a general framework and a preliminary implementation that may be put to good use to support their transition from purely 2D to blended 2D/3D environments.
Abstract:
This dissertation presents an in-depth study of the Visual Odometry (VO) problem tackled with transformer architectures. Existing VO algorithms are based on heavily hand-crafted features and do not generalize well to new environments. To train them, one needs to carefully fine-tune the hyper-parameters and the network architecture. We propose to tackle the VO problem with a transformer because it is a general-purpose architecture and because it was designed to transform sequences of data from one domain to another, which is the case for the VO problem. Our first goal is to create a synthetic dataset using the BlenderProc2 framework to mitigate the problem of dataset scarcity. The second goal is to tackle the VO problem using different versions of the transformer architecture, which will be pre-trained on the synthetic dataset and fine-tuned on a real dataset, the KITTI dataset. Our approach is defined as follows: we use a feature extractor to extract feature embeddings from a sequence of images, then we feed this sequence of embeddings to the transformer architecture, and finally an MLP is used to predict the sequence of camera poses.
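The three-stage approach described in this abstract (feature extractor, transformer, MLP head) can be illustrated with a minimal NumPy sketch of the data flow. This is not the dissertation's implementation: the linear feature extractor, the single self-attention layer standing in for the transformer, the random untrained weights, the 6-value pose layout (3 translation and 3 rotation components), and all function and parameter names are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_params(frame_size, d_model=32, d_hidden=64, pose_dim=6):
    """Random (untrained) weights; all sizes are illustrative assumptions."""
    def mat(rows, cols):
        return rng.normal(0.0, 1.0 / np.sqrt(rows), size=(rows, cols))
    return {
        "w_feat": mat(frame_size, d_model),
        "w_q": mat(d_model, d_model),
        "w_k": mat(d_model, d_model),
        "w_v": mat(d_model, d_model),
        "w1": mat(d_model, d_hidden), "b1": np.zeros(d_hidden),
        "w2": mat(d_hidden, pose_dim), "b2": np.zeros(pose_dim),
    }

def extract_features(frames, w_feat):
    """Stand-in feature extractor: flatten each frame, then project linearly."""
    num_frames = frames.shape[0]
    return frames.reshape(num_frames, -1) @ w_feat  # (T, d_model)

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention with a residual connection."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])  # (T, T) frame-to-frame weights
    return x + softmax(scores) @ v           # (T, d_model)

def mlp_head(x, w1, b1, w2, b2):
    """Two-layer MLP with ReLU mapping each embedding to a pose vector."""
    hidden = np.maximum(x @ w1 + b1, 0.0)
    return hidden @ w2 + b2                  # (T, pose_dim)

def predict_poses(frames, params):
    """Images -> feature embeddings -> attention layer -> per-frame poses."""
    emb = extract_features(frames, params["w_feat"])
    ctx = self_attention(emb, params["w_q"], params["w_k"], params["w_v"])
    return mlp_head(ctx, params["w1"], params["b1"], params["w2"], params["b2"])

# Usage: a sequence of 5 random 8x8 grayscale "frames".
frames = rng.random((5, 8, 8))
poses = predict_poses(frames, init_params(frame_size=8 * 8))
print(poses.shape)  # (5, 6)
```

With untrained weights the predicted poses are meaningless; the sketch only shows how a sequence of images becomes a sequence of embeddings and then a sequence of pose vectors of the same length.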