954 results for Video-camera
Abstract:
This paper proposes a target tracking algorithm that identifies the position of moving targets in digital video sequences and pursues them. The approach aims to track moving targets inside the field of view of a digital camera. The position and trajectory of the target are identified by a neural network that uses a competitive learning technique. The winning neuron is trained to approach the target and then pursue it. A digital camera provides a sequence of images, and the algorithm processes those frames in real time, tracking the moving target. The algorithm is run on both black-and-white and multi-colored images to simulate real-world situations. Results show the effectiveness of the proposed algorithm, since the neurons tracked the moving targets even without any image pre-processing. Single and multiple moving targets are followed in real time.
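The competitive-learning idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the threshold-based target detection, the learning rate `lr`, the number of neurons and the synthetic moving disk are all assumptions made for illustration.

```python
import numpy as np

def target_pixels_from_frame(frame, threshold=0.5):
    """Return (x, y) coordinates of pixels brighter than `threshold`.
    Thresholding is an assumed stand-in for however the target is sensed."""
    ys, xs = np.nonzero(frame > threshold)
    return np.stack([xs, ys], axis=1).astype(float)

def track_step(neurons, target_pixels, lr=0.5):
    """One winner-take-all pass: for each target pixel, only the nearest
    neuron (the winner) moves a fraction `lr` toward that pixel."""
    neurons = neurons.copy()
    for p in target_pixels:
        winner = np.argmin(np.linalg.norm(neurons - p, axis=1))
        neurons[winner] += lr * (p - neurons[winner])
    return neurons

def synthetic_frame(center, size=64, radius=3):
    """Hypothetical test frame: a bright disk on a dark background."""
    yy, xx = np.mgrid[0:size, 0:size]
    inside = (xx - center[0]) ** 2 + (yy - center[1]) ** 2 <= radius ** 2
    return inside.astype(float)

def run_demo(n_frames=30, n_neurons=4, seed=0):
    """Track a disk moving on a straight line; return the final disk
    center and the distance from it to the closest neuron."""
    rng = np.random.default_rng(seed)
    neurons = rng.uniform(0, 64, size=(n_neurons, 2))
    center = np.array([10.0, 20.0])
    for t in range(n_frames):
        center = np.array([10.0 + t, 20.0 + 0.5 * t])
        pixels = target_pixels_from_frame(synthetic_frame(center))
        neurons = track_step(neurons, pixels, lr=0.5)
    closest = np.linalg.norm(neurons - center, axis=1).min()
    return center, closest
```

Calling `run_demo()` returns the last target center and the distance of the closest neuron to it; in this toy setting the winning neuron stays within a few pixels of the disk from frame to frame.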
Abstract:
The aim of this work is to evaluate the influence of image point measurements made with subpixel accuracy and their contribution to the calibration of digital cameras. The effect of subpixel measurements on the 3D coordinates of check points in object space is also evaluated. For this purpose, an algorithm that provides subpixel accuracy was implemented for the semi-automatic determination of points of interest, based on the Förstner operator. Experiments were carried out with a block of images acquired with the DuncanTech MS3100-CIR multispectral camera. The influence of subpixel measurements on the adjustment by the Least Squares Method (LSM) was evaluated by comparing the estimated standard deviations of the parameters in two situations: manual measurement (pixel accuracy) and subpixel estimation. The influence of subpixel measurements on the 3D reconstruction was also analyzed. Based on the results obtained, i.e., the quantified reduction in the standard deviation of the Inner Orientation Parameters (IOP) and in the relative error of the 3D reconstruction, it was shown that measurements with subpixel accuracy are relevant for some tasks in Photogrammetry, mainly those in which metric quality is of great importance, such as camera calibration.
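As an illustration of subpixel point measurement, the sketch below implements the least-squares corner localisation step commonly associated with the Förstner operator: the corner is the point closest, in a gradient-weighted sense, to all edge tangent lines inside a window. This is a simplified sketch, not the algorithm implemented in the paper; the window size, the synthetic test corner and the function names are assumptions.

```python
import numpy as np

def forstner_subpixel(image, x0, y0, half_window=7):
    """Refine an integer corner guess (x0, y0) to subpixel accuracy.

    Solves (sum g g^T) p = sum (g g^T) x_i over the window, where g is the
    image gradient at pixel x_i. Returns the estimate as array([x, y])."""
    gy, gx = np.gradient(image.astype(float))  # derivatives along rows, cols
    ys, xs = np.mgrid[y0 - half_window:y0 + half_window + 1,
                      x0 - half_window:x0 + half_window + 1]
    gxw, gyw = gx[ys, xs].ravel(), gy[ys, xs].ravel()
    xs, ys = xs.ravel().astype(float), ys.ravel().astype(float)
    # Normal matrix (structure tensor summed over the window)
    A = np.array([[np.sum(gxw * gxw), np.sum(gxw * gyw)],
                  [np.sum(gxw * gyw), np.sum(gyw * gyw)]])
    # Right-hand side: gradient outer products applied to pixel positions
    b = np.array([np.sum(gxw * gxw * xs + gxw * gyw * ys),
                  np.sum(gxw * gyw * xs + gyw * gyw * ys)])
    return np.linalg.solve(A, b)

def synthetic_corner(cx, cy, size=41, sharpness=1.5):
    """Hypothetical test image: a smooth checkerboard-style corner whose
    true location (cx, cy) may fall between pixels."""
    yy, xx = np.mgrid[0:size, 0:size].astype(float)
    return np.tanh((xx - cx) / sharpness) * np.tanh((yy - cy) / sharpness)
```

For example, `forstner_subpixel(synthetic_corner(20.3, 19.6), 20, 20)` starts from the pixel-accurate guess (20, 20) and returns an estimate close to the true subpixel position.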
Abstract:
Bilayer segmentation of live video in uncontrolled environments is an essential task for home applications in which the original background of the scene must be replaced, as in video chats or traditional videoconferencing. The main challenge in such conditions is to overcome the difficulties posed by problem situations (e.g., illumination changes, distracting events such as elements moving in the background, and camera shake) that may occur while the video is being captured. This paper presents a survey of segmentation methods for background substitution applications, describes the main concepts and identifies events that may cause errors. Our analysis shows that the most robust methods rely on specific devices (multiple cameras or sensors that generate depth maps) which aid the process. In order to achieve the same results using conventional devices (monocular video cameras), most current research relies on energy minimization frameworks, in which temporal and spatial information is probabilistically combined with color and contrast information.
Abstract:
This thesis was carried out in collaboration with the electrophysiology laboratory of the Cardiology Operating Unit, Cardiovascular Department, "S. Maria delle Croci" hospital in Ravenna, Azienda Unità Sanitaria Locale della Romagna. Its goal is to develop a method for identifying the left atrium in sequences of intracardiac echographic images acquired during transcatheter cardiac ablation procedures for the treatment of atrial fibrillation. Locating the posterior wall of the left atrium in intracardiac echocardiographic images is essential when the position of the esophagus relative to that wall must be monitored in order to reduce the risk of atrio-esophageal fistula formation. The intracardiac echography images were acquired during the cardiac ablation procedure and exported directly from the ultrasound scanner in Audio Video Interleave (AVI) format. The individual frames were extracted with a purpose-built Matlab program, yielding the data set on which the atrial wall identification method was implemented. Because of excessive noise inside the atrial chamber in some data sets, two different methods were developed for automatically tracing the contour of the left atrial wall. The first, used for the "cleaner" images, is based on the Chan-Vese model, a region-based level-set segmentation method, while the second, effective in the presence of noise, uses K-means clustering. Both methods identify the atrium automatically, without the clinician providing any information about its position, and use morphological operators to eliminate spurious regions.
The results were evaluated qualitatively by superimposing the detected contour on the echographic image and judging the quality of the tracing. In addition, for two data sets, segmented with the two different methods, a quantitative evaluation was performed by comparing them with the manual tracing made by the clinician.
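To illustrate the K-means step mentioned in the abstract above, here is a minimal Python sketch that clusters pixel intensities into two groups and keeps the darker cluster as a candidate chamber mask. The thesis itself used Matlab; the choice of two clusters, the assumption that the chamber is the darker region, and the synthetic test image are illustrative assumptions. The morphological clean-up described in the abstract is not shown.

```python
import numpy as np

def kmeans_intensity(image, k=2, iters=20):
    """Plain K-means on pixel intensities.
    Returns a label image and the cluster centers (sorted from dark to
    bright by construction of the initial centers)."""
    pixels = image.ravel().astype(float)
    centers = np.linspace(pixels.min(), pixels.max(), k)
    for _ in range(iters):
        # Assignment step: nearest center for every pixel
        labels = np.argmin(np.abs(pixels[:, None] - centers[None, :]), axis=1)
        # Update step: each center becomes the mean of its pixels
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pixels[labels == j].mean()
    labels = np.argmin(np.abs(pixels[:, None] - centers[None, :]), axis=1)
    return labels.reshape(image.shape), centers

def synthetic_echo(size=64, radius=15, noise=0.1, seed=0):
    """Hypothetical noisy frame: a dark disk (chamber) in brighter tissue."""
    rng = np.random.default_rng(seed)
    yy, xx = np.mgrid[0:size, 0:size]
    chamber = (xx - size // 2) ** 2 + (yy - size // 2) ** 2 <= radius ** 2
    image = np.where(chamber, 0.1, 0.8) + rng.normal(0.0, noise, (size, size))
    return image, chamber

def segment_chamber(image):
    """Candidate chamber mask: pixels assigned to the darkest cluster."""
    labels, centers = kmeans_intensity(image, k=2)
    return labels == np.argmin(centers)
```

On the synthetic frame, `segment_chamber(image)` recovers the dark disk almost exactly; on real ultrasound data a single intensity feature would likely be too weak on its own, which is consistent with the abstract's use of morphological operators afterwards.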
Abstract:
Adding virtual objects to real environments plays an important role in today's computer graphics: typical examples are virtual furniture in a real room and virtual characters in real movies. For a believable appearance, consistent lighting of the virtual objects is required. We present an augmented reality system that displays virtual objects with consistent illumination and shadows in the image of a simple webcam. We use two high dynamic range video cameras with fisheye lenses that continuously record the environment illumination. A sampling algorithm selects a few bright parts in one of the wide-angle images and the corresponding points in the second camera image. The 3D position of each selected light can then be calculated using epipolar geometry. Finally, the selected point lights are used in a multi-pass algorithm to draw the virtual object with shadows. To validate our approach, we compare the appearance and shadows of the synthetic objects with real objects.
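The 3D position of a light seen by both cameras can be recovered by intersecting the two viewing rays. The sketch below uses midpoint triangulation, one common way of doing this; the abstract does not say which method the authors use. It assumes the camera centers and the ray directions toward the bright spot are already known, which in practice would require a calibrated fisheye model.

```python
import numpy as np

def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint triangulation of two rays c1 + s*d1 and c2 + t*d2.

    Finds the ray parameters s, t that minimise the distance between the
    two rays and returns the midpoint of the shortest connecting segment."""
    c1, d1 = np.asarray(c1, float), np.asarray(d1, float)
    c2, d2 = np.asarray(c2, float), np.asarray(d2, float)
    d1 = d1 / np.linalg.norm(d1)
    d2 = d2 / np.linalg.norm(d2)
    # s*d1 - t*d2 = c2 - c1, solved in the least-squares sense
    A = np.stack([d1, -d2], axis=1)          # shape (3, 2)
    (s, t), *_ = np.linalg.lstsq(A, c2 - c1, rcond=None)
    p1 = c1 + s * d1
    p2 = c2 + t * d2
    return 0.5 * (p1 + p2)
```

With noise-free rays the two closest points coincide and the function returns the exact intersection; with noisy directions it returns a compromise between the two rays. Nearly parallel rays make the problem ill-conditioned, so a wider camera baseline gives more stable light positions.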
Abstract:
This paper presents different application scenarios for which the registration of sub-sequence reconstructions or multi-camera reconstructions is essential for successful camera motion estimation and 3D reconstruction from video. The registration is achieved by merging unconnected feature point tracks between the reconstructions. One application is drift removal for sequential camera motion estimation of long sequences. The state of the art in drift removal is to apply a RANSAC approach to find unconnected feature point tracks. In this paper, an alternative spectral algorithm for pairwise matching of unconnected feature point tracks is used. It is then shown that the algorithms can be combined and applied to novel scenarios where independent camera motion estimations must be registered into a common global coordinate system. In the first scenario, multiple moving cameras that capture the same scene simultaneously are registered. A second new scenario occurs in situations where the tracking of feature points during sequential camera motion estimation fails completely, e.g., due to large occluding objects in the foreground, and the unconnected tracks of the independent reconstructions must be merged. In the third scenario, image sequences of the same scene that are captured under different illuminations are registered. Several experiments with challenging real video sequences demonstrate that the presented techniques work in practice.
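Once feature tracks from two independent reconstructions have been matched, the reconstructions can be brought into a common coordinate system. The abstract does not state how this is done; the sketch below shows one standard possibility, a closed-form least-squares similarity transform (scale, rotation, translation) between matched 3D points in the style of Umeyama's method. The function name and the assumption of outlier-free correspondences are illustrative.

```python
import numpy as np

def similarity_transform(src, dst):
    """Least-squares similarity transform with dst ≈ scale * R @ src + t.

    src, dst: (N, 3) arrays of matched 3D points, assumed outlier-free.
    Returns (scale, R, t)."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    xs, xd = src - mu_s, dst - mu_d
    cov = xd.T @ xs / len(src)               # cross-covariance matrix
    U, S, Vt = np.linalg.svd(cov)
    D = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        D[2, 2] = -1.0                        # avoid returning a reflection
    R = U @ D @ Vt
    var_s = (xs ** 2).sum() / len(src)        # variance of the source points
    scale = np.trace(np.diag(S) @ D) / var_s
    t = mu_d - scale * R @ mu_s
    return scale, R, t
```

A scale factor is included because reconstructions from monocular video are only defined up to scale. In a real pipeline this estimate would typically be wrapped in a robust scheme, since matched tracks can contain outliers.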
Abstract:
OBJECTIVE To analyze speech reading through Internet video calls by profoundly hearing-impaired individuals and cochlear implant (CI) users. METHODS Speech reading skills of 14 deaf adults and 21 CI users were assessed using the Hochmair Schulz Moser (HSM) sentence test. We presented video simulations using different video resolutions (1280 × 720, 640 × 480, 320 × 240 and 160 × 120 px), frame rates (30, 20, 10, 7 and 5 frames per second (fps)), speech velocities (three different speakers), webcams (Logitech Pro9000, C600 and C500) and image/sound delays (0-500 ms). All video simulations were presented with and without sound and in two screen sizes. Additionally, scores for a live Skype™ video connection and live face-to-face communication were assessed. RESULTS Higher frame rate (>7 fps), higher camera resolution (>640 × 480 px) and shorter picture/sound delay (<100 ms) were associated with increased speech perception scores. Scores were strongly dependent on the speaker but were not influenced by physical properties of the camera optics or by full-screen mode. There was a significant median gain of +8.5%pts (p = 0.009) in speech perception for all 21 CI users when visual cues were additionally shown. CI users with poor open-set speech perception scores (n = 11) showed the greatest benefit under combined audio-visual presentation (median speech perception +11.8%pts, p = 0.032). CONCLUSION Webcams have the potential to improve telecommunication for hearing-impaired individuals.
Abstract:
In free viewpoint applications, the images are captured by an array of cameras that acquire a scene of interest from different perspectives. Any intermediate viewpoint not included in the camera array can be virtually synthesized by the decoder, at a quality that depends on the distance between the virtual view and the camera views available at the decoder. Hence, it is beneficial for any user to receive camera views that are close to each other for synthesis. This is, however, not always feasible in bandwidth-limited overlay networks, where every node may ask for different camera views. In this work, we propose an optimized delivery strategy for free viewpoint streaming over overlay networks. We introduce the concept of layered quality-of-experience (QoE), which describes the level of interactivity offered to clients. Based on these levels of QoE, camera views are organized into layered subsets. These subsets are then delivered to clients through a prioritized network coding streaming scheme, which accommodates network and client heterogeneity and effectively exploits the resources of the overlay network. Simulation results show that, in a scenario with limited bandwidth or channel reliability, the proposed method outperforms baseline network coding approaches in which the different levels of QoE are not taken into account in the delivery strategy optimization.