878 results for stereo 3D


Relevance:

30.00%

Publisher:

Abstract:

The aim of this thesis is to create an FPGA architecture capable of extracting 3D information from a pair of stereo sensors. The pipeline was built on the Zynq System-on-Chip, which allows tight interaction between the hardware implemented in the FPGA and the CPU. After a preliminary study of the hardware and software tools, the base architecture for writing images to and reading them from the Zynq's DDR memory was built. Attention then shifted to implementing stereo algorithms (rectification and stereo matching) on the FPGA and to building a pipeline that produces accurate disparity maps in real time from images acquired by a stereo camera.
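As a rough software illustration of the stereo matching step, the sketch below computes disparities along one rectified scanline with sum-of-absolute-differences (SAD) block matching. The abstract does not say which matching algorithm the FPGA pipeline implements, so SAD, the function name `sad_disparity_row` and its parameters are all assumptions chosen for simplicity.

```python
def sad_disparity_row(left_row, right_row, max_disp, half_win=2):
    """Disparity per pixel along one rectified scanline (hypothetical helper).

    A pixel at column x in the left row is searched at columns x - d in the
    right row, for d in 0..max_disp. Pixels without a full window get -1.
    """
    width = len(left_row)
    disparities = [-1] * width
    for x in range(half_win, width - half_win):
        best_cost, best_d = None, -1
        for d in range(max_disp + 1):
            if x - d - half_win < 0:
                break  # candidate window would fall off the right image
            # Sum of absolute differences over a 1D window of 2*half_win + 1 pixels
            cost = sum(
                abs(left_row[x + k] - right_row[x - d + k])
                for k in range(-half_win, half_win + 1)
            )
            if best_cost is None or cost < best_cost:
                best_cost, best_d = cost, d
        disparities[x] = best_d
    return disparities
```

A hardware pipeline would typically evaluate all candidate disparities in parallel on a 2D window; the nested loops here only make the cost function explicit.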

Relevance:

30.00%

Publisher:

Abstract:

This thesis extends a software framework for detecting and tracking people in a scene captured by a stereoscopic camera. First, the need for manual offline calibration of the system is removed by using algorithms that identify, from a frame acquired by the camera, the plane on which the tracked subjects move. In addition, a deep-learning-based software module is introduced to improve tracking accuracy. This component, which detects the heads present in a frame, makes it possible to restrict the analysed data to the neighbourhood of a person's actual position, excluding objects that the tracking algorithm would otherwise mistake for people.
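One simple way to recover a ground plane from the 3D points of a single stereo frame is a least-squares fit, sketched below. The abstract does not name the algorithm actually used, so this is only an assumed stand-in; `fit_plane` and `det3` are hypothetical helpers, and the model z = a*x + b*y + c assumes the plane is not parallel to the z axis.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c to (x, y, z) points."""
    n = len(points)
    sx = sum(p[0] for p in points)
    sy = sum(p[1] for p in points)
    sz = sum(p[2] for p in points)
    sxx = sum(p[0] * p[0] for p in points)
    syy = sum(p[1] * p[1] for p in points)
    sxy = sum(p[0] * p[1] for p in points)
    sxz = sum(p[0] * p[2] for p in points)
    syz = sum(p[1] * p[2] for p in points)
    # Normal equations A @ [a, b, c] = rhs
    A = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]
    det = det3(A)
    if abs(det) < 1e-12:
        raise ValueError("points are degenerate (e.g. collinear)")
    coeffs = []
    for col in range(3):
        # Cramer's rule: replace one column of A with the right-hand side
        m = [row[:] for row in A]
        for r in range(3):
            m[r][col] = rhs[r]
        coeffs.append(det3(m) / det)
    return tuple(coeffs)
```

Real depth data contains people and clutter as outliers, so a robust variant (for example wrapping this fit in RANSAC) would usually be preferred in practice.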

Relevance:

30.00%

Publisher:

Abstract:

This dissertation presents a study and experimental research on asymmetric coding of stereoscopic video. A review of 3D technologies, video formats and coding is first presented, and then particular emphasis is given to asymmetric coding of 3D content and to methods, based on subjective measures, for evaluating the performance of asymmetric coding. The research objective was defined as an extension of the current concept of asymmetric coding for stereo video. To achieve this objective, the first step consists in defining regions in the spatial dimension of the auxiliary view that have different perceptual relevance within the stereo pair, which are identified by a binary mask. These regions are then encoded with better quality (lower quantisation) for the most relevant ones and worse quality (higher quantisation) for those with lower perceptual relevance. The actual estimation of the relevance of a given region is based on a measure of disparity according to the absolute difference between views. To allow encoding of a stereo sequence using this method, a reference H.264/MVC encoder (JM) has been modified to accept additional configuration parameters and inputs. The final encoder is still standard compliant. To show the viability of the method, subjective assessment tests were performed over a wide range of objective qualities of the auxiliary view. The results of these tests establish three main findings. First, the proposed method can be more efficient than traditional asymmetric coding when encoding stereo video at higher qualities/rates. Second, the method can be used to extend the threshold at which uniform asymmetric coding methods start to have an impact on the subjective quality perceived by observers. Finally, the issue of eye dominance is addressed: results from stereo still images displayed over a short period of time showed that it has little or no impact on the proposed method.
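The region-based idea can be sketched as follows: build a binary mask from the absolute difference between the two views, then give each block of the auxiliary view a quantisation parameter depending on how much of it the mask covers. This is an illustration only. The abstract does not state whether a large difference marks a region as more or less relevant, so the direction used here is an assumption, as are the function names, the threshold, the block size and the QP values.

```python
def relevance_mask(main_view, aux_view, threshold):
    """Binary mask: 1 where |main - aux| exceeds the (assumed) threshold."""
    return [
        [1 if abs(m - a) > threshold else 0 for m, a in zip(main_row, aux_row)]
        for main_row, aux_row in zip(main_view, aux_view)
    ]


def block_qp_map(mask, block, qp_relevant, qp_other, min_fraction=0.5):
    """Assign one QP per block: lower QP (better quality) for relevant blocks."""
    height, width = len(mask), len(mask[0])
    qp_rows = []
    for by in range(0, height, block):
        qp_row = []
        for bx in range(0, width, block):
            cells = [
                mask[y][x]
                for y in range(by, min(by + block, height))
                for x in range(bx, min(bx + block, width))
            ]
            # A block counts as relevant if enough of its pixels are masked
            relevant = sum(cells) / len(cells) >= min_fraction
            qp_row.append(qp_relevant if relevant else qp_other)
        qp_rows.append(qp_row)
    return qp_rows
```

In an H.264 setting the block size would naturally be the 16x16 macroblock, with the resulting map fed to the encoder as an extra input.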

Relevance:

30.00%

Publisher:

Abstract:

Acquiring 3D shape from images is a classic problem in Computer Vision occupying researchers for at least 20 years. Only recently however have these ideas matured enough to provide highly accurate results. We present a complete algorithm to reconstruct 3D objects from images using the stereo correspondence cue. The technique can be described as a pipeline of four basic building blocks: camera calibration, image segmentation, photo-consistency estimation from images, and surface extraction from photo-consistency. In this Chapter we will put more emphasis on the latter two: namely how to extract geometric information from a set of photographs without explicit camera visibility, and how to combine different geometry estimates in an optimal way. © 2010 Springer-Verlag Berlin Heidelberg.
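Photo-consistency is commonly scored by comparing the appearance of a candidate 3D point in two images. The sketch below uses normalised cross-correlation (NCC) between two flattened patches as one such score; the abstract does not specify the measure the authors use, so NCC and the function name `ncc` are assumptions.

```python
import math


def ncc(patch_a, patch_b):
    """Normalised cross-correlation of two equal-length intensity lists.

    Returns a value in [-1, 1]; 0.0 is returned for textureless patches,
    where the score is undefined.
    """
    n = len(patch_a)
    mean_a = sum(patch_a) / n
    mean_b = sum(patch_b) / n
    da = [a - mean_a for a in patch_a]
    db = [b - mean_b for b in patch_b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0:
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom
```

Because NCC subtracts the mean and divides by the patch contrast, it is insensitive to gain and offset changes between cameras, which is one reason it is a popular choice for this role.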

Relevance:

30.00%

Publisher:

Abstract:

With the introduction of new input devices, such as multi-touch surface displays, the Nintendo WiiMote, the Microsoft Kinect, and the Leap Motion sensor, among others, the field of Human-Computer Interaction (HCI) finds itself at an important crossroads that requires solving new challenges. Given the amount of three-dimensional (3D) data available today, 3D navigation plays an important role in 3D User Interfaces (3DUI). This dissertation deals with multi-touch, 3D navigation, and how users can explore 3D virtual worlds using a multi-touch, non-stereo, desktop display. The contributions of this dissertation include a feature-extraction algorithm for multi-touch displays (FETOUCH), a multi-touch and gyroscope interaction technique (GyroTouch), a theoretical model for multi-touch interaction using high-level Petri Nets (PeNTa), an algorithm to resolve ambiguities in the multi-touch gesture classification process (Yield), a proposed technique for navigational experiments (FaNS), a proposed gesture (Hold-and-Roll), and an experiment prototype for 3D navigation (3DNav). The verification experiment for 3DNav was conducted with 30 human subjects of both genders. The experiment used the 3DNav prototype to present a pseudo-universe, where each user was required to find five objects using the multi-touch display and five objects using a game controller (GamePad). For the multi-touch display, 3DNav used a commercial library called GestureWorks in conjunction with Yield to resolve the ambiguity posed by the multiplicity of gestures reported by the initial classification. The experiment compared both devices. The task completion time with multi-touch was slightly shorter, but the difference was not statistically significant. The experimental design also included an equation that determined the subjects' level of video game console expertise, which was used to divide users into two groups: casual users and experienced users. The study found that experienced gamers performed significantly faster with the GamePad than casual users. When looking at the groups separately, casual gamers performed significantly better using the multi-touch display than the GamePad. Additional results are found in this dissertation.

Relevance:

30.00%

Publisher:

Abstract:

The case study of the octagonal vestibule of Villa Adriana offered the opportunity to apply digital restitution and three-dimensional modelling techniques, based on geometric modelling software, to a building of considerable historical and artistic value, with the aim of generating a photorealistic, multi-purpose digital 3D model. In the specific case of the vestibule, a three-dimensional model of this kind is useful for documentation, for supporting construction hypotheses, and as a tool for evaluating restoration work. The work made it possible to assess the critical issues in the three-dimensional acquisition, modelling and photo-modelling techniques applied in archaeology. These techniques are also routinely used in fields such as architecture and industrial design, as well as in cinema (special effects and animated films) and video games, with different objectives: in industrial design and engineering, Reverse Modeling is used for quality control and for checking that the final product meets its tolerances, while in cinema and video games (in combination with other software) it enables the creation of realistic models, of people as well as objects, to be inserted into films or games. Generating a three-dimensional model through Reverse Modeling is the reverse of the design process and can follow several strategies, each with specific advantages and disadvantages that make it better suited to some cases than to others. This study analysed three-dimensional acquisitions made with Laser Scan and with Structure from Motion/Dense Stereo View applications.

Relevance:

30.00%

Publisher:

Abstract:

Automating the detection and identification of animals is a task of interest in several areas of biological research as well as in the development of electronic surveillance systems. The author presents a detection and identification system based on stereo computer vision. Several criteria are used to identify the animals, but the emphasis is on harmonic analysis of the real-time reconstruction of the animals' 3D shape. The result of the analysis is compared with others stored in an evolving knowledge base.

Relevance:

30.00%

Publisher:

Abstract:

In this thesis, we propose several advances in the numerical and computational algorithms that are used to determine tomographic estimates of physical parameters in the solar corona. We focus on methods for both global dynamic estimation of the coronal electron density and estimation of local transient phenomena, such as coronal mass ejections, from empirical observations acquired by instruments onboard the STEREO spacecraft. We present a first look at tomographic reconstructions of the solar corona from multiple points-of-view, which motivates the developments in this thesis. In particular, we propose a method for linear equality constrained state estimation that leads toward more physical global dynamic solar tomography estimates. We also present a formulation of the local static estimation problem, i.e., the tomographic estimation of local events and structures like coronal mass ejections, that couples the tomographic imaging problem to a phase field based level set method. This formulation will render feasible the 3D tomography of coronal mass ejections from limited observations. Finally, we develop a scalable algorithm for ray tracing dense meshes, which allows efficient computation of many of the tomographic projection matrices needed for the applications in this thesis.
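To give a feel for what a linear equality constraint does to an estimate, the sketch below projects an unconstrained state vector onto a single constraint a·x = b using the Euclidean (identity-weighted) projection. This is a deliberately reduced illustration and not the thesis's method: a full constrained state estimator would handle several constraints and weight the projection by the state covariance. The function name and arguments are hypothetical.

```python
def project_onto_constraint(x, a, b):
    """Closest point to x (in the Euclidean sense) satisfying sum(a_i * x_i) == b."""
    aa = sum(ai * ai for ai in a)
    if aa == 0:
        raise ValueError("constraint vector must be non-zero")
    # How far the current estimate violates the constraint
    residual = sum(ai * xi for ai, xi in zip(a, x)) - b
    # Move along the constraint normal just enough to cancel the violation
    return [xi - ai * residual / aa for xi, ai in zip(x, a)]
```

For example, a constraint with all coefficients equal to one would force the components of the estimate to sum to a known total, the kind of physically motivated condition that equality constraints can encode.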

Relevance:

30.00%

Publisher:

Abstract:

This work aims to develop a neurogeometric model of stereo vision, based on the cortical architectures involved in 3D perception and on the neural mechanisms generated by retinal disparities. First, we provide a sub-Riemannian geometry for stereo vision, inspired by the work on the stereo problem by Zucker (2006), and using sub-Riemannian tools introduced by Citti-Sarti (2006) for monocular vision. We present a mathematical interpretation of the neural mechanisms underlying the behavior of binocular cells, which integrate monocular inputs. The natural compatibility between stereo geometry and neurophysiological models shows that these binocular cells are sensitive to position and orientation. Therefore, we model their action in the space R³ × S² equipped with a sub-Riemannian metric. Integral curves of the sub-Riemannian structure model neural connectivity and can be related to the 3D analog of the psychophysical association fields for the 3D process of regular contour formation. Then, we identify 3D perceptual units in the visual scene: they emerge as a consequence of the random cortico-cortical connection of binocular cells. Considering a suitable stochastic version of the integral curves, we generate a family of kernels. These kernels represent the probability of interaction between binocular cells, and they are implemented as facilitation patterns to define the evolution in time of neural population activity at a point. This activity is usually modeled through a mean field equation, whose stable steady-state solutions lead us to consider the associated eigenvalue problem. We show that three-dimensional perceptual units naturally arise from the discrete version of the eigenvalue problem associated with the integro-differential equation of the population activity.

Relevance:

30.00%

Publisher:

Abstract:

In Stereo Vision, a branch of Computer Vision, the goal is to reconstruct scene depth from pairs of RGB images. Most algorithms used for this task assume that all surfaces in the scene are Lambertian. When non-Lambertian (reflective or transparent) surfaces are present, existing stereo algorithms mispredict depth. To address this problem, a dataset containing transparent and reflective objects was created during an internship to serve as the basis for training the network. The objects in the scenes are associated with 3D annotations used to train the network. This thesis, in turn, uses RAFT-Stereo [1], a state-of-the-art network for stereo vision, and analyses how its performance (disparity prediction) changes when a module for semantic segmentation of objects is inserted into it. This additional layer is introduced because finding the correspondence between two points on non-Lambertian surfaces is very difficult for an ordinary network. The aim is to use semantic information to recognise these kinds of surfaces and thereby improve their disparity. This neural architecture was chosen because, during the internship on the creation of the Booster dataset [2], it proved to be the best performer on that dataset. The ultimate goal of this work is to determine whether the recognition of non-Lambertian surfaces by the semantic module improves disparity prediction. In stereo vision, reflective and transparent elements are extremely difficult to analyse, but they remain an active research topic given their many application areas, such as autonomous driving and robotics.

Relevance:

30.00%

Publisher:

Abstract:

Depth estimation from images has long been regarded as a preferable alternative to expensive and intrusive active sensors, such as LiDAR and ToF. The topic has attracted the attention of an increasingly wide audience thanks to its large number of application domains, such as autonomous driving, robotic navigation and 3D reconstruction. Among the various techniques employed for depth estimation, stereo matching is one of the most widespread, owing to its robustness, speed and simplicity of setup. Recent developments have been aided by the abundance of annotated stereo images, which has allowed deep learning to thrive in a research area where deep networks can reach state-of-the-art sub-pixel precision in most cases. Despite these findings, stereo matching still poses many open challenges, two of which are finding pixel correspondences in the presence of objects that exhibit non-Lambertian behaviour and processing high-resolution images. Recently, a novel dataset named Booster, which contains high-resolution stereo pairs featuring a large collection of labeled non-Lambertian objects, has been released. That work showed that training state-of-the-art deep neural networks on such data improves their generalization capabilities in the presence of non-Lambertian surfaces as well. Although it is a further step toward tackling the aforementioned challenges, Booster includes a rather small number of annotated images and thus cannot satisfy the intensive training requirements of deep learning. This thesis investigates novel view synthesis techniques to augment the Booster dataset, with the ultimate goal of improving stereo matching reliability on high-resolution images that display non-Lambertian surfaces.
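Stereo matching yields depth through the standard rectified-stereo relation Z = f·B / d, where d is the disparity in pixels, f the focal length in pixels and B the baseline between the cameras. The sketch below applies it to single values and to a disparity row; the function names and the choice to return `None` for invalid disparities are illustrative assumptions.

```python
def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Depth in metres from a disparity in pixels, for a rectified stereo pair.

    Returns None when the disparity is zero or negative (no valid match,
    or a point at infinity).
    """
    if disparity_px <= 0:
        return None
    return focal_px * baseline_m / disparity_px


def depth_row(disparities, focal_px, baseline_m):
    """Convert a row of disparities to a row of depths."""
    return [disparity_to_depth(d, focal_px, baseline_m) for d in disparities]
```

Since depth is inversely proportional to disparity, a fixed sub-pixel disparity error translates into a depth error that grows with the square of the distance, which is why sub-pixel precision matters for far objects.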

Relevance:

30.00%

Publisher:

Abstract:

Stereo Vision is a popular research topic in Computer Vision; it consists of using two images of the same scene, produced by two different cameras, to extract 3D information. The basic idea of Stereo Vision is to simulate human binocular vision: the two cameras are placed side by side horizontally to act as "eyes" looking at the 3D scene. By comparing the two resulting images, information about the positions of objects in the scene can be obtained. This report presents a Stereo Vision algorithm: a parallel algorithm whose goal is to trace the contour lines of a geographical area. The algorithm was originally implemented for the Connection Machine CM-2, a supercomputer developed in the 1980s, and was written in *Lisp, a language derived from Lisp and designed for that machine. The report also covers the translation and implementation of the algorithm in CUDA, a hardware architecture for parallel computing developed by NVIDIA that allows parallel code to run on GPUs. It also looks at the difficulties encountered in translating from *Lisp to CUDA.

Relevance:

20.00%

Publisher:

Abstract:

An important approach to cancer therapy is the design of small molecule modulators that interfere with microtubule dynamics through their specific binding to the β-subunit of tubulin. In the present work, comparative molecular field analysis (CoMFA) studies were conducted on a series of discodermolide analogs with antimitotic properties. Significant correlation coefficients were obtained (CoMFA(i), q² = 0.68, r² = 0.94; CoMFA(ii), q² = 0.63, r² = 0.91), indicating the good internal and external consistency of the models generated using two independent structural alignment strategies. The models were externally validated employing a test set, and the predicted values were in good agreement with the experimental results. The final QSAR models and the 3D contour maps provided important insights into the chemical and structural basis involved in the molecular recognition process of this family of discodermolide analogs, and should be useful for the design of new specific β-tubulin modulators with potent anticancer activity.
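The two statistics quoted above can be illustrated on a toy model: r² measures the fit on the training data, while q² is the leave-one-out cross-validated counterpart, q² = 1 − PRESS / SS_total. CoMFA models are built with partial least squares on field descriptors; the sketch below swaps in a one-descriptor ordinary least-squares line purely to keep the arithmetic visible, and all function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys):
    """Coefficient of determination of the line fitted on all data."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def q_squared_loo(xs, ys):
    """Leave-one-out cross-validated q2 = 1 - PRESS / SS_total."""
    mean_y = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        # Refit without sample i, then predict the held-out sample
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - press / ss_tot
```

For a least-squares model the held-out errors are never smaller than the fitted residuals, so q² cannot exceed r², consistent with the pairs of values reported in the abstract.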

Relevance:

20.00%

Publisher:

Abstract:

The aim of this study was to evaluate the stress distribution in the cervical region of a sound upper central incisor in two clinical situations, standard and maximum masticatory forces, by means of a 3D model with the highest possible level of fidelity to the anatomic dimensions. Two models with 331,887 linear tetrahedral elements that represent a sound upper central incisor with periodontal ligament, cortical and trabecular bones were loaded at 45° in relation to the tooth's long axis. All structures were considered to be homogeneous and isotropic, with the exception of the enamel (anisotropic). A standard masticatory force (100 N) was simulated on one of the models, while a maximum masticatory force (235.9 N) was simulated on the other. The software used was PATRAN for pre- and post-processing and Nastran for processing. In the cementoenamel junction area, tensile stresses reached 14.7 MPa in the 100 N model and 40.2 MPa in the 235.9 N model, the latter exceeding the enamel's tensile strength (16.7 MPa). The fact that the stress concentration in the amelodentinal junction exceeded the enamel's tensile strength under simulated conditions of maximum masticatory force suggests the possibility of the occurrence of non-carious cervical lesions such as abfractions.

Relevance:

20.00%

Publisher:

Abstract:

We analyze the breaking of Lorentz invariance in a 3D model of fermion fields self-coupled through four-fermion interactions. The low-energy limit of the theory contains various submodels which are similar to those used in the study of graphene or in the description of irrational charge fractionalization.