954 results for "video images"
Abstract:
This paper proposes a target tracking algorithm able to identify the position of moving targets in digital video sequences and to pursue them. The proposed approach aims to track moving targets inside the field of view of a digital camera. The position and trajectory of the target are identified using a neural network with a competitive learning technique: the winning neuron is trained to approach the target and then pursue it. A digital camera provides a sequence of images, and the algorithm processes those frames in real time, tracking the moving target. The algorithm is evaluated on both black-and-white and multi-colored images to simulate real-world situations. Results show the effectiveness of the proposed algorithm, since the neurons tracked the moving targets even without any image pre-processing. Single and multiple moving targets are followed in real time.
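A minimal sketch of the winner-take-all update described above, assuming the target shows up as the brightest pixel of each grayscale frame; the neuron count, learning rate, and crude detector are illustrative choices, not taken from the paper:

```python
import numpy as np

def track_competitive(frames, n_neurons=8, lr=0.3, seed=0):
    """Winner-take-all tracking: each frame, the neuron closest to the
    detected target moves toward it; all other neurons stay put."""
    rng = np.random.default_rng(seed)
    h, w = frames[0].shape
    neurons = rng.uniform(0, 1, size=(n_neurons, 2)) * (h, w)  # 2D positions
    trajectory = []
    for frame in frames:
        # Crude detector: the target is the brightest pixel of the frame.
        target = np.array(np.unravel_index(frame.argmax(), frame.shape), float)
        winner = np.argmin(np.linalg.norm(neurons - target, axis=1))
        neurons[winner] += lr * (target - neurons[winner])  # competitive update
        trajectory.append(neurons[winner].copy())
    return np.array(trajectory)

# Toy sequence: a bright dot drifting across otherwise black frames.
frames = []
for t in range(20):
    f = np.zeros((64, 64))
    f[10 + t, (5 + 2 * t) % 64] = 1.0
    frames.append(f)
print(track_competitive(frames)[-1])  # ends near the last target position
```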
Abstract:
Accurate placement of lesions is crucial for the effectiveness and safety of a retinal laser photocoagulation treatment. Computer assistance can improve both treatment accuracy and execution time. The idea is to use video frames acquired from a scanning digital ophthalmoscope (SDO) to compensate for retinal motion during laser treatment. This paper presents a method for the multimodal registration of the initial frame from an SDO retinal video sequence to a retinal composite image, which may contain a treatment plan. The retinal registration procedure comprises the following steps: 1) detection of vessel centerline points and identification of the optic disc; 2) prealignment of the video frame and the composite image based on optic disc parameters; and 3) iterative matching of the detected vessel centerline points in expanding matching regions. This registration algorithm was designed to initialize a real-time registration procedure that registers the subsequent video frames to the composite image. The algorithm demonstrated its capability to register various pairs of SDO video frames and composite images acquired from patients.
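Step 3) can be illustrated with a toy version of iterative matching in expanding regions; a translation-only alignment is assumed here purely to keep the sketch short (the paper's transform model and robust estimation are richer):

```python
import numpy as np

def match_expanding(frame_pts, composite_pts, r0=5.0, growth=1.5, iters=6):
    """Iteratively match frame centerline points to composite centerline
    points inside an expanding radius, refining a translation estimate."""
    t = np.zeros(2)  # current translation: frame -> composite
    radius = r0
    for _ in range(iters):
        moved = frame_pts + t
        # Nearest composite point for each (translated) frame point.
        d = np.linalg.norm(moved[:, None, :] - composite_pts[None, :, :], axis=2)
        nn = d.argmin(axis=1)
        ok = d[np.arange(len(moved)), nn] < radius  # matches inside the region
        if ok.any():
            t += (composite_pts[nn[ok]] - moved[ok]).mean(axis=0)
        radius *= growth  # expand the matching region
    return t

rng = np.random.default_rng(1)
composite = rng.uniform(0, 200, size=(40, 2))
frame = composite + np.array([8.0, -3.0]) + rng.normal(0, 0.3, size=(40, 2))
print(match_expanding(frame, composite))  # ~ [-8, 3]
```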
Abstract:
OBJECTIVE: To assess the intra-reader and inter-reader reliabilities of interpreting ultrasonography by several experts using video clips. METHOD: 99 video clips of healthy and rheumatic joints were recorded and delivered to 17 physician sonographers in two rounds. The intra-reader and inter-reader reliabilities of interpreting the ultrasound results were calculated using a dichotomous system (normal/abnormal) and a graded semiquantitative scoring system. RESULTS: The video reading method worked well. 70% of the readers could classify at least 70% of the cases correctly as normal or abnormal. The distribution of readers answering correctly was wide. The most difficult joints to assess were the elbow, wrist, metacarpophalangeal (MCP) and knee joints. The intra-reader and inter-reader agreements on interpreting dynamic ultrasound images as normal or abnormal, as well as detecting and scoring a Doppler signal were moderate to good (kappa = 0.52-0.82). CONCLUSIONS: Dynamic image assessment (video clips) can be used as an alternative method in ultrasonography reliability studies. The intra-reader and inter-reader reliabilities of ultrasonography in dynamic image reading are acceptable, but more definitions and training are needed to improve sonographic reproducibility.
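The agreement figures quoted above (kappa = 0.52-0.82) are standard Cohen's kappa values; for the dichotomous normal/abnormal system the computation reduces to the following (reader data hypothetical):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two categorical ratings of the same cases:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    ca, cb = Counter(rater_a), Counter(rater_b)
    chance = sum((ca[l] / n) * (cb[l] / n) for l in set(ca) | set(cb))
    return (observed - chance) / (1 - chance)

# Hypothetical normal/abnormal readings of the same clips in two rounds.
round1 = ["abn", "nor", "abn", "abn", "nor", "nor", "abn", "nor"]
round2 = ["abn", "nor", "abn", "nor", "nor", "nor", "abn", "abn"]
print(round(cohens_kappa(round1, round2), 2))  # intra-reader agreement
```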
Abstract:
Adding virtual objects to real environments plays an important role in todays computer graphics: Typical examples are virtual furniture in a real room and virtual characters in real movies. For a believable appearance, consistent lighting of the virtual objects is required. We present an augmented reality system that displays virtual objects with consistent illumination and shadows in the image of a simple webcam. We use two high dynamic range video cameras with fisheye lenses permanently recording the environment illumination. A sampling algorithm selects a few bright parts in one of the wide angle images and the corresponding points in the second camera image. The 3D position can then be calculated using epipolar geometry. Finally, the selected point lights are used in a multi pass algorithm to draw the virtual object with shadows. To validate our approach, we compare the appearance and shadows of the synthetic objects with real objects.
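The 3D light-source position from two corresponding image points can be recovered with standard linear (DLT) triangulation; the sketch below assumes simple pinhole projection matrices rather than calibrated fisheye models:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation: the homogeneous 3D point X satisfies
    x ~ P X in both views, giving four linear constraints A X = 0."""
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]

# Two hypothetical cameras 1 m apart; a bright point at (0.5, 1.0, 4.0).
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.5, 1.0, 4.0, 1.0])
x1 = P1 @ X_true; x1 = x1[:2] / x1[2]
x2 = P2 @ X_true; x2 = x2[:2] / x2[2]
print(triangulate(P1, P2, x1, x2))  # ~ [0.5, 1.0, 4.0]
```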
Abstract:
Visual fixation is employed by humans and some animals to keep a specific 3D location at the center of the visual gaze. Inspired by this phenomenon in nature, this paper explores the idea of transferring this mechanism to the context of video stabilization for a handheld video camera. A novel approach is presented that stabilizes a video by fixating on automatically extracted 3D target points. This approach differs from existing automatic solutions that stabilize the video by smoothing. To determine the 3D target points, the recorded scene is analyzed with a state-of-the-art structure-from-motion algorithm, which estimates camera motion and reconstructs a 3D point cloud of the static scene objects. Special algorithms are presented that search for either virtual or real 3D target points, which back-project close to the center of the image for as long a period of time as possible. The stabilization algorithm then transforms the original images of the sequence so that these 3D target points are kept exactly in the center of the image, which, in the case of real 3D target points, produces a perfectly stable result at the image center. Furthermore, different methods of additional user interaction are investigated. It is shown that the stabilization process can easily be controlled and that it can be combined with state-of-the-art tracking techniques to obtain a powerful image stabilization tool. The approach is evaluated on a variety of videos taken with a hand-held camera in natural scenes.
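A sketch of the target-point search: project each candidate 3D point into every frame and keep the one whose back-projection stays near the image center for the longest consecutive run. Cameras, radius, and candidates here are toy assumptions:

```python
import numpy as np

def longest_centered_run(point, cams, center, radius):
    """Length of the longest consecutive run of frames in which the
    3D point back-projects within `radius` of the image center."""
    best = run = 0
    for P in cams:
        x = P @ np.append(point, 1.0)
        proj = x[:2] / x[2]
        run = run + 1 if np.linalg.norm(proj - center) < radius else 0
        best = max(best, run)
    return best

def pick_target(points, cams, center, radius):
    """Choose the candidate that stays near the center the longest."""
    runs = [longest_centered_run(p, cams, center, radius) for p in points]
    return points[int(np.argmax(runs))]

# Toy setup: cameras translating along x, looking down +z.
cams = [np.hstack([np.eye(3), -np.array([[0.05 * t], [0.0], [0.0]])])
        for t in range(30)]
pts = np.array([[0.0, 0.0, 5.0], [2.0, 0.0, 5.0]])
print(pick_target(pts, cams, center=np.zeros(2), radius=0.2))
```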
Abstract:
We present an algorithm for estimating dense image correspondences. Our versatile approach lends itself to various tasks typical for video post-processing, including image morphing, optical flow estimation, stereo rectification, disparity/depth reconstruction, and baseline adjustment. We incorporate recent advances in feature matching, energy minimization, stereo vision, and data clustering into our approach. At the core of our correspondence estimation we use Efficient Belief Propagation for energy minimization. While state-of-the-art algorithms only work on thumbnail-sized images, our novel feature downsampling scheme, in combination with a simple yet efficient data-term compression, can cope with high-resolution data. The incorporation of SIFT (Scale-Invariant Feature Transform) features into the data-term computation further resolves matching ambiguities, making long-range correspondence estimation possible. We detect occluded areas by evaluating the correspondence symmetry and further apply geodesic matting to automatically determine plausible values in these regions.
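The correspondence-symmetry test for occlusions is commonly implemented as a forward-backward consistency check, sketched below under the assumption that dense forward and backward flow fields are already available:

```python
import numpy as np

def occlusion_mask(flow_fw, flow_bw, tol=1.0):
    """Mark pixels whose forward-backward round trip does not return
    to the start: correspondence symmetry fails where occlusion occurs."""
    h, w = flow_fw.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    # Where each pixel lands in the second image under the forward flow.
    x2 = np.clip(xs + flow_fw[..., 0], 0, w - 1).astype(int)
    y2 = np.clip(ys + flow_fw[..., 1], 0, h - 1).astype(int)
    # Round trip: add the backward flow sampled at the landing point.
    round_x = x2 + flow_bw[y2, x2, 0]
    round_y = y2 + flow_bw[y2, x2, 1]
    err = np.hypot(round_x - xs, round_y - ys)
    return err > tol  # True = likely occluded / inconsistent

# Consistent constant shift: nothing flagged away from clipped borders.
fw = np.full((16, 16, 2), 2.0)
bw = np.full((16, 16, 2), -2.0)
print(occlusion_mask(fw, bw)[2:-2, 2:-2].sum())  # 0 in the interior
```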
Abstract:
In free viewpoint applications, the images are captured by an array of cameras that acquire a scene of interest from different perspectives. Any intermediate viewpoint not included in the camera array can be virtually synthesized by the decoder, at a quality that depends on the distance between the virtual view and the camera views available at the decoder. Hence, it is beneficial for any user to receive camera views that are close to each other for synthesis. This is, however, not always feasible in bandwidth-limited overlay networks, where every node may ask for different camera views. In this work, we propose an optimized delivery strategy for free viewpoint streaming over overlay networks. We introduce the concept of layered quality-of-experience (QoE), which describes the level of interactivity offered to clients. Based on these levels of QoE, camera views are organized into layered subsets. These subsets are then delivered to clients through a prioritized network coding streaming scheme, which accommodates network and client heterogeneity and effectively exploits the resources of the overlay network. Simulation results show that, in a scenario with limited bandwidth or channel reliability, the proposed method outperforms baseline network coding approaches, where the different levels of QoE are not taken into account in the delivery strategy optimization.
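One way to picture the layered view subsets: a coarse base layer of evenly spaced cameras, with each further layer halving the view spacing. The grouping below is an illustrative assumption, not the paper's optimization:

```python
def layer_views(view_ids, n_layers):
    """Assign camera views to QoE layers: layer 0 is a coarse, evenly
    spaced base subset; each further layer halves the view spacing,
    shrinking the distance from any virtual view to a received camera."""
    layers = [[] for _ in range(n_layers)]
    assigned = set()
    for k in range(n_layers):
        stride = 2 ** (n_layers - 1 - k)
        for v in view_ids[::stride]:
            if v not in assigned:
                layers[k].append(v)
                assigned.add(v)
    return layers

# Eight views, three layers: base layer first, refinements after.
print(layer_views(list(range(8)), 3))
# [[0, 4], [2, 6], [1, 3, 5, 7]]
```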
Abstract:
OBJECTIVE Vestibular neuritis is often mimicked by stroke (pseudoneuritis). Vestibular eye movements help discriminate the two conditions. We report vestibulo-ocular reflex (VOR) gain measures in neuritis and in stroke presenting as acute vestibular syndrome (AVS). METHODS Prospective cross-sectional study of AVS (acute continuous vertigo/dizziness lasting >24 h) at two academic centers. We measured horizontal head impulse test (HIT) VOR gains in 26 AVS patients using a video HIT device (ICS Impulse). All patients were assessed within 1 week of symptom onset. Diagnoses were confirmed by clinical examinations, brain magnetic resonance imaging with diffusion-weighted images, and follow-up. Brainstem and cerebellar strokes were classified by vascular territory: posterior inferior cerebellar artery (PICA) or anterior inferior cerebellar artery (AICA). RESULTS Diagnoses were vestibular neuritis (n = 16) and posterior fossa stroke (PICA, n = 7; AICA, n = 3). Mean HIT VOR gains (ipsilesional [standard error of the mean], contralesional [standard error of the mean]) were as follows: vestibular neuritis (0.52 [0.04], 0.87 [0.04]); PICA stroke (0.94 [0.04], 0.93 [0.04]); AICA stroke (0.84 [0.10], 0.74 [0.10]). VOR gains were asymmetric in neuritis (unilateral vestibulopathy) and symmetric in PICA stroke (bilaterally normal VOR), whereas gains in AICA stroke were heterogeneous (asymmetric, bilaterally low, or normal). In vestibular neuritis, borderline gains ranged from 0.62 to 0.73. Twenty patients (12 neuritis, six PICA strokes, two AICA strokes) had at least five interpretable HIT trials (for both ears), allowing an appropriate classification based on mean VOR gains per ear. Classifying AVS patients with bilateral VOR mean gains of 0.70 or more as suspected strokes yielded a total diagnostic accuracy of 90%, with stroke sensitivity of 88% and specificity of 92%. CONCLUSION Video HIT VOR gains differ between peripheral and central causes of AVS. PICA strokes were readily separated from neuritis using gain measures, but AICA strokes were at risk of being misclassified based on VOR gain alone.
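The reported decision rule (bilateral mean VOR gains of 0.70 or more suggest stroke) is simple enough to state directly; the example inputs reuse the group means from the abstract:

```python
def suspect_stroke(gain_left, gain_right, cutoff=0.70):
    """Abstract's rule: bilaterally normal/high VOR gains in AVS point
    away from vestibular neuritis and toward stroke."""
    return gain_left >= cutoff and gain_right >= cutoff

# Mean per-ear video-HIT gains reported for each group:
print(suspect_stroke(0.52, 0.87))  # False -> pattern fits neuritis
print(suspect_stroke(0.94, 0.93))  # True  -> suspected (PICA-like) stroke
```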
Abstract:
Purpose To this day, the slit lamp remains the first tool used by an ophthalmologist to examine patient eyes. Imaging of the retina, however, poses a variety of problems: a shallow depth of focus, reflections from the optical system, a small field of view, and non-uniform illumination. The use of slit lamp images for documentation and analysis thus remains extremely challenging for ophthalmologists due to large image artifacts. For this reason, we propose an automatic retinal slit lamp video mosaicking method, which enlarges the field of view and reduces the amount of noise and reflections, thus enhancing image quality. Methods Our method is composed of three parts: (i) viable content segmentation, (ii) global registration, and (iii) image blending. Frame content is segmented using gradient boosting with custom pixel-wise features. Speeded-up robust features are used for finding pair-wise translations between frames, with robust random sample consensus estimation and graph-based simultaneous localization and mapping for global bundle adjustment. Foreground-aware blending based on feathering merges video frames into comprehensive mosaics. Results Foreground is segmented successfully with an area under the receiver operating characteristic curve of 0.9557. Mosaicking results from our method and from state-of-the-art methods were compared and rated by ophthalmologists, showing a strong preference for the large field of view provided by our method. Conclusions The proposed method for the global registration of retinal slit lamp images into comprehensive mosaics improves over state-of-the-art methods and is preferred qualitatively.
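Feathering-based blending can be sketched as follows: per-frame weights fall off toward the boundary of the usable area, and overlapping frames are merged by weighted averaging. Masks, offsets, and frames below are synthetic placeholders:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def feather_weight(mask):
    """Weight map: large deep inside the usable (foreground) area,
    falling off toward its boundary. Padding guarantees a zero border
    so the distance transform is well defined for an all-true mask."""
    d = distance_transform_edt(np.pad(mask, 1))[1:-1, 1:-1]
    return d / d.max()

def blend(frames, masks, offsets, mosaic_shape):
    """Accumulate feather-weighted frames into the mosaic, then normalize."""
    acc = np.zeros(mosaic_shape)
    wsum = np.zeros(mosaic_shape)
    for img, m, (oy, ox) in zip(frames, masks, offsets):
        w = feather_weight(m)
        h, wd = img.shape
        acc[oy:oy + h, ox:ox + wd] += w * img
        wsum[oy:oy + h, ox:ox + wd] += w
    return np.where(wsum > 0, acc / np.maximum(wsum, 1e-9), 0)

# Two overlapping synthetic frames placed on a small mosaic canvas.
f1, f2 = np.full((32, 32), 0.3), np.full((32, 32), 0.9)
m = np.ones((32, 32), bool)
mosaic = blend([f1, f2], [m, m], [(0, 0), (0, 16)], (32, 48))
print(mosaic[16, ::8].round(2))  # ~0.3 left, smooth transition, ~0.9 right
```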
Abstract:
Near-bottom zooplankton communities have rarely been studied despite numerous reports of high zooplankton concentrations, probably due to methodological constraints. In Kongsfjorden, Svalbard, the near-bottom layer was studied for the first time by combining daytime deployments of a remotely operated vehicle (ROV), the moored on-sight key species investigation (MOKI) optical zooplankton sensor, and Tucker trawl sampling. ROV data from the fjord entrance and the inner fjord showed high near-bottom abundances of euphausiids with a mean concentration of 17.3 ± 3.5 ind/100 m**3. With the MOKI system, we observed varying numbers of euphausiids, amphipods, chaetognaths, and copepods on the seafloor at six stations. Light-induced zooplankton swarms reached densities on the order of 90,000 (euphausiids), 120,000 (amphipods), and 470,000 ind/m**3 (chaetognaths), whereas older copepodids of Calanus hyperboreus and C. glacialis did not respond to light. They were abundant at the seafloor and 5 m above it and showed a maximum abundance of 65,000 ind/m**3. Tucker trawl data provided an overview of the seasonal vertical distribution of euphausiids. The most abundant species, Thysanoessa inermis, reached near-bottom concentrations of 270 ind/m**3. Regional distribution was related neither to depth nor to location in the fjord. The taxa observed were all part of the pelagic community. Our observations suggest the presence of near-bottom macrozooplankton in other regions as well and challenge the current view of bentho-pelagic coupling. Neglecting this community may cause severe underestimates of the stock of pelagic zooplankton, especially predatory species, which link secondary production with higher trophic levels.
Abstract:
This article presents a probabilistic method for vehicle detection and tracking through the analysis of monocular images obtained from a vehicle-mounted camera. The method is designed to address the main shortcomings of traditional particle filtering approaches, namely Bayesian methods based on importance sampling, for use in traffic environments. These methods do not scale well when the dimensionality of the feature space grows, which creates significant limitations when tracking multiple objects. Alternatively, the proposed method is based on a Markov chain Monte Carlo (MCMC) approach, which allows efficient sampling of the feature space. The method involves important contributions in both the motion and the observation models of the tracker. Indeed, as opposed to particle filter-based tracking methods in the literature, which typically resort to observation models based on appearance or template matching, in this study a likelihood model that combines appearance analysis with information from motion parallax is introduced. Regarding the motion model, a new interaction treatment is defined based on Markov random fields (MRF) that allows for the handling of possible inter-dependencies in vehicle trajectories. As for vehicle detection, the method relies on a supervised classification stage using support vector machines (SVM). The contribution in this field is twofold. First, a new descriptor based on the analysis of gradient orientations in concentric rectangles is defined. This descriptor involves a much smaller feature space compared to traditional descriptors, which are too costly for real-time applications. Second, a new vehicle image database is generated to train the SVM and made public. The proposed vehicle detection and tracking method is proven to outperform existing methods and to successfully handle challenging situations in the test sequences.
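The MCMC alternative to importance sampling can be illustrated with a one-dimensional Metropolis-Hastings step, where a Gaussian likelihood stands in for the paper's combined appearance/motion-parallax observation model:

```python
import numpy as np

def mh_track_step(prev_state, likelihood, n_steps=200, prop_sigma=2.0, seed=0):
    """One tracking step: run a Metropolis-Hastings chain started at the
    previous state and return the posterior-mean estimate."""
    rng = np.random.default_rng(seed)
    x = prev_state
    lx = likelihood(x)
    samples = []
    for _ in range(n_steps):
        cand = x + rng.normal(0, prop_sigma)     # symmetric proposal
        lc = likelihood(cand)
        if rng.uniform() < lc / max(lx, 1e-300):  # accept with likelihood ratio
            x, lx = cand, lc
        samples.append(x)
    return np.mean(samples[n_steps // 2:])        # discard burn-in

# Toy observation model: the vehicle is really near x = 12.
obs = lambda x: np.exp(-0.5 * ((x - 12.0) / 1.5) ** 2)
print(mh_track_step(prev_state=10.0, likelihood=obs))  # ~12
```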
Abstract:
In the context of aerial imagery, one of the first steps toward a coherent processing of the information contained in multiple images is geo-registration, which consists of assigning geographic 3D coordinates to the pixels of the image. This enables accurate alignment and geo-positioning of multiple images, detection of moving objects, and fusion of data acquired from multiple sensors. Existing approaches to this problem require, in addition to a precise characterization of the camera sensor, high-resolution referenced images or terrain elevation models, which are usually not publicly available or are out of date. Building upon the idea of developing technology that does not need a reference terrain elevation model, we propose a geo-registration technique that applies variational methods to obtain a dense and coherent surface elevation model that is used to replace the reference model. The surface elevation model is built by interpolation of scattered 3D points, which are obtained in a two-step process following a classical stereo pipeline: first, coherent disparity maps between image pairs of a video sequence are estimated, and then image point correspondences are back-projected. The proposed variational method enforces continuity of the disparity map not only along epipolar lines (as done by previous geo-registration techniques) but also across them, in the full 2D image domain. In the experiments, aerial images from synthetic video sequences have been used to validate the proposed technique.
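Once matched image points are back-projected into scattered 3D points, a dense elevation model can be interpolated from them; scipy's griddata is used below as a simple stand-in for the paper's variational interpolation:

```python
import numpy as np
from scipy.interpolate import griddata

# Scattered (x, y, z) points, as produced by back-projecting matches.
rng = np.random.default_rng(0)
xy = rng.uniform(0, 100, size=(200, 2))
z = 50 + 0.1 * xy[:, 0] + 5 * np.sin(xy[:, 1] / 10)  # synthetic terrain

# Dense elevation grid by interpolation of the scattered samples.
gx, gy = np.meshgrid(np.linspace(0, 100, 64), np.linspace(0, 100, 64))
dem = griddata(xy, z, (gx, gy), method='linear')
# Fill gaps outside the convex hull with nearest-neighbor values.
nearest = griddata(xy, z, (gx, gy), method='nearest')
dem = np.where(np.isnan(dem), nearest, dem)
print(dem.shape, dem.mean().round(1))
```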
Abstract:
A real-time, large-scale, part-to-part video matching algorithm, based on the cross correlation of the intensity of motion curves, is proposed with a view to originality recognition, video database cleansing, copyright enforcement, video tagging, and video result re-ranking. Moreover, it is suggested how the most representative hashes and distance functions (strada, discrete cosine transformation, Marr-Hildreth, and radial) should be integrated in order for the matching algorithm to be invariant against blur, compression, and rotation distortions: blur with (R, σ) ∈ [1, 20] × [1, 8], compression from 512×512 down to 32×32 pixels², and rotation from 10° to 180°. The DCT hash is invariant against blur and compression down to 64×64 pixels². Nevertheless, although its performance against rotation is the best, with a success rate of up to 70%, it should be combined with the Marr-Hildreth distance function. With the latter, the image selected by the DCT hash should be at a distance lower than 1.15 times the Marr-Hildreth minimum distance.
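The DCT hash referred to above is commonly computed pHash-style: resize to 32×32, take the 2D DCT, keep the low-frequency 8×8 corner, and threshold at its median to get 64 bits compared by Hamming distance. This generic variant is an assumption, not necessarily the paper's exact construction:

```python
import numpy as np
from scipy.fft import dctn

def dct_hash(img32):
    """pHash-style DCT hash of a 32x32 grayscale image: keep the 8x8
    low-frequency corner and threshold it at its median -> 64 bits."""
    low = dctn(img32.astype(float), norm='ortho')[:8, :8]
    return (low > np.median(low)).ravel()

def hamming(h1, h2):
    return int(np.count_nonzero(h1 != h2))

rng = np.random.default_rng(0)
img = rng.uniform(0, 255, size=(32, 32))
noisy = img + rng.normal(0, 5, size=(32, 32))   # mild distortion
other = rng.uniform(0, 255, size=(32, 32))      # unrelated image
print(hamming(dct_hash(img), dct_hash(noisy)))  # small distance
print(hamming(dct_hash(img), dct_hash(other)))  # ~32 on average
```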