926 results for cameras and camera accessories
Abstract:
SANTANA, André M.; SANTIAGO, Gutemberg S.; MEDEIROS, Adelardo A. D. Real-Time Visual SLAM Using Pre-Existing Floor Lines as Landmarks and a Single Camera. In: CONGRESSO BRASILEIRO DE AUTOMÁTICA, 2008, Juiz de Fora, MG. Anais... Juiz de Fora: CBA, 2008.
Abstract:
Estimating the relative orientation and position of a camera is one of the integral topics in the field of computer vision. The accuracy of a certain Finnish technology company’s traffic sign inventory and localization process can be improved by applying this concept. The company’s localization process uses video data produced by a vehicle-mounted camera. The accuracy of the estimated traffic sign locations depends on the relative orientation between the camera and the vehicle. This thesis proposes a computer-vision-based software solution that estimates a camera’s orientation relative to the vehicle’s direction of movement from video data. The task was solved using feature-based methods and open-source software. On simulated data sets, the camera orientation estimates had an average absolute error of 0.31 degrees. The software solution can be integrated into the traffic sign localization pipeline of the company in question.
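As an illustration of how video features can reveal a camera's orientation relative to the direction of travel, the sketch below assumes pure forward translation, a pinhole camera with known focal length and principal point, and feature tracks that have already been matched between two frames. Under those assumptions the tracks radiate from the focus of expansion (FOE), and the FOE's offset from the principal point gives the yaw and pitch of the motion direction in the camera frame. The function names and the least-squares formulation are illustrative assumptions, not the thesis's implementation.

```python
import math


def focus_of_expansion(tracks):
    """Least-squares intersection of the lines through each feature track.

    tracks: list of ((x0, y0), (x1, y1)) pixel positions in two frames.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (x0, y0), (x1, y1) in tracks:
        dx, dy = x1 - x0, y1 - y0
        norm = math.hypot(dx, dy)
        if norm == 0.0:
            continue  # a static feature carries no direction information
        nx, ny = -dy / norm, dx / norm  # unit normal to the flow line
        c = nx * x0 + ny * y0           # line equation: n . p = c
        a11 += nx * nx
        a12 += nx * ny
        a22 += ny * ny
        b1 += nx * c
        b2 += ny * c
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("tracks are parallel or too few to locate the FOE")
    # Solve the 2x2 normal equations by Cramer's rule.
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


def motion_direction_angles(foe, focal_px, cx, cy):
    """Yaw (positive to the right) and pitch (positive downward) in degrees."""
    yaw = math.degrees(math.atan2(foe[0] - cx, focal_px))
    pitch = math.degrees(math.atan2(foe[1] - cy, focal_px))
    return yaw, pitch
```

A real pipeline would also need outlier rejection and handling of vehicle rotation, both of which this sketch ignores.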
Abstract:
Ocean environmental monitoring and seafloor exploitation need in situ sensors and optical devices (cameras, lights) in various locations and on various carriers in order to initiate and calibrate environmental models or to supervise underwater industrial processes. For more than 10 years, Ifremer has deployed in situ monitoring systems for various seawater parameters and in situ observation systems based on lights and HD cameras. To be economically operational, these systems must be equipped with biofouling protection dedicated to the sensors and optical devices used in situ. Indeed, in less than 15 days [1], biofouling will modify the transducing interfaces of the sensors and cause unacceptable bias in the measurements provided by the in situ monitoring system. In the same way, biofouling will degrade the optical properties of windows, thus altering the lighting and the quality of the images recorded by the camera.
Abstract:
Recent advances in mobile phone cameras have poised them to take over from compact hand-held cameras as the consumer’s preferred camera option. Along with advances in the number of pixels, motion blur removal, face-tracking, and noise reduction algorithms have significant roles in the internal processing of the devices. An undesired effect of severe noise reduction is the loss of texture (i.e. low-contrast fine details) of the original scene. Currently established methods for resolution measurement fail to accurately portray the texture loss incurred in a camera system. The development of an accurate objective method to identify the texture preservation or texture reproduction capability of a camera device is important in this regard. The ‘Dead Leaves’ target has been used extensively as a method to measure the modulation transfer function (MTF) of cameras that employ highly non-linear noise-reduction methods. This stochastic model consists of a series of overlapping circles with radii r distributed as r^{-3} and uniformly distributed gray levels, which gives an accurate model of occlusion in a natural setting and hence mimics a natural scene. This target can be used to model the texture transfer through a camera system when a natural scene is captured. In the first part of our study we identify various factors that affect the MTF measured using the ‘Dead Leaves’ chart. These include variations in illumination, distance, exposure time and ISO sensitivity, among others. We discuss the main differences between this method and existing resolution measurement techniques and identify its advantages. In the second part of this study, we propose an improvement to the current texture MTF measurement algorithm. High-frequency residual noise in the processed image contains the same frequency content as fine texture detail, and is sometimes reported as such, thereby leading to inaccurate results.
A wavelet-thresholding-based denoising technique is used to model the noise present in the final captured image. This updated noise model is then used to calculate an accurate texture MTF. We present comparative results for both algorithms under various image capture conditions.
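The dead leaves model described above can be sketched in a few lines. The example draws radii from a density proportional to r^{-3} by inverse-transform sampling and paints discs with uniformly distributed gray levels, each new disc occluding the earlier ones. Truncating the radius to a range [r_min, r_max] is an assumption made here so that the distribution can be normalised; the function names, image size and disc count are likewise illustrative and not taken from the study.

```python
import random


def sample_radius(u, r_min, r_max):
    """Map u in [0, 1] to a radius with density proportional to r**-3."""
    a, b = r_min ** -2, r_max ** -2
    # Inverse of the CDF F(r) = (a - r**-2) / (a - b) on [r_min, r_max].
    return (a - u * (a - b)) ** -0.5


def dead_leaves(size, n_discs, r_min, r_max, seed=0):
    """Return a size x size image (nested lists) with gray levels in [0, 1]."""
    rng = random.Random(seed)
    image = [[0.5] * size for _ in range(size)]
    for _ in range(n_discs):
        r = sample_radius(rng.random(), r_min, r_max)
        cx, cy = rng.uniform(0, size), rng.uniform(0, size)
        gray = rng.random()
        for y in range(max(0, int(cy - r)), min(size, int(cy + r) + 1)):
            for x in range(max(0, int(cx - r)), min(size, int(cx + r) + 1)):
                if (x - cx) ** 2 + (y - cy) ** 2 <= r * r:
                    image[y][x] = gray  # later discs occlude earlier ones
    return image
```

Because the density falls off as r^{-3}, most discs are close to r_min, which is what gives the target its fine, low-contrast texture.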
Abstract:
Tumor functional volume (FV) and mean activity concentration (mAC) are quantities derived from positron emission tomography (PET). These quantities are used for estimating radiation dose for a therapy, evaluating the progression of a disease, and as prognostic indicators for predicting outcome. PET images have low resolution and high noise, and are affected by the partial volume effect (PVE). Manually segmenting each tumor is very cumbersome and very hard to reproduce. To solve this problem I developed an algorithm called iterative deconvolution thresholding segmentation (IDTS); the algorithm segments the tumor, measures the FV, corrects for the PVE and calculates the mAC. The algorithm corrects for the PVE without the need to estimate the camera’s point spread function (PSF); it also does not require optimizing for a specific camera. My algorithm was tested in physical phantom studies, where hollow spheres (0.5-16 ml) were used to represent tumors with a homogeneous activity distribution. It was also tested on irregularly shaped tumors with a heterogeneous activity profile, which were acquired using physical and simulated phantoms. The physical phantom studies were performed with different signal-to-background ratios (SBR) and with different acquisition times (1-5 min). The algorithm was applied to ten clinical data sets, where the results were compared with manual segmentation and with the fixed-percentage thresholding methods T50 and T60, in which 50% and 60% of the maximum intensity, respectively, is used as the threshold. The average errors in FV and mAC calculation were 30% and -35% for the 0.5 ml tumor. The average errors in FV and mAC calculation were ~5% for the 16 ml tumor. The overall FV error was ~10% for heterogeneous tumors in physical and simulated phantom data. The FV and mAC errors for clinical images compared to manual segmentation were around -17% and 15%, respectively.
In summary, my algorithm has the potential to be applied to data acquired from different cameras, as it does not depend on knowing the camera’s PSF. The algorithm can also improve dose estimation and treatment planning.
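The fixed-percentage baselines mentioned above (T50 and T60) are simple enough to sketch. The example below is not the IDTS algorithm; it only shows how a threshold at a fraction of the maximum intensity yields a functional volume and a mean activity concentration. The flat list of voxel values, the voxel volume and the function name are assumptions chosen for illustration.

```python
def threshold_segment(voxels, voxel_volume_ml, fraction=0.5):
    """Segment voxels at `fraction` of the peak value.

    Returns (functional volume in ml, mean activity concentration).
    fraction=0.5 corresponds to T50 and fraction=0.6 to T60.
    """
    peak = max(voxels)
    selected = [v for v in voxels if v >= fraction * peak]
    fv = len(selected) * voxel_volume_ml
    mac = sum(selected) / len(selected)  # never empty: the peak always qualifies
    return fv, mac
```

Raising the fraction shrinks the segmented volume and raises the mean, which is why such methods are sensitive to the chosen percentage.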
Abstract:
Fully articulated hand tracking promises to enable fundamentally new interactions with virtual and augmented worlds, but the limited accuracy and efficiency of current systems has prevented widespread adoption. Today's dominant paradigm uses machine learning for initialization and recovery followed by iterative model-fitting optimization to achieve a detailed pose fit. We follow this paradigm, but make several changes to the model-fitting, namely using: (1) a more discriminative objective function; (2) a smooth-surface model that provides gradients for non-linear optimization; and (3) joint optimization over both the model pose and the correspondences between observed data points and the model surface. While each of these changes may actually increase the cost per fitting iteration, we find a compensating decrease in the number of iterations. Further, the wide basin of convergence means that fewer starting points are needed for successful model fitting. Our system runs in real-time on CPU only, which frees up the commonly over-burdened GPU for experience designers. The hand tracker is efficient enough to run on low-power devices such as tablets. We can track up to several meters from the camera to provide a large working volume for interaction, even using the noisy data from current-generation depth cameras. Quantitative assessments on standard datasets show that the new approach exceeds the state of the art in accuracy. Qualitative results take the form of live recordings of a range of interactive experiences enabled by this new approach.
Abstract:
This paper presents a prototype system for tracking people in enclosed indoor environments where there is a high rate of occlusions. The system uses a stereo camera for acquisition, and is capable of disambiguating occlusions using a combination of depth map analysis, a two-step ellipse-fitting people detection process, motion models and Kalman filters, and a novel fit metric based on computationally simple object statistics. Testing shows that our fit metric outperforms commonly used position-based and histogram-based metrics, resulting in more accurate tracking of people.
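To show how a motion model and a Kalman filter can carry a track through an occlusion, here is a minimal one-dimensional constant-velocity filter. It is a generic textbook sketch, not the paper's tracker: the class name, the noise parameters `q` and `r`, and the diagonal process noise are all assumptions.

```python
class ConstantVelocityKalman:
    """1-D Kalman filter with state [position, velocity]."""

    def __init__(self, pos, vel=0.0, p=1.0, q=0.01, r=0.25):
        self.x = [pos, vel]
        self.P = [[p, 0.0], [0.0, p]]  # state covariance
        self.q, self.r = q, r          # process and measurement noise

    def predict(self, dt=1.0):
        (p00, p01), (p10, p11) = self.P
        self.x[0] += self.x[1] * dt
        # P <- F P F^T + q*I with F = [[1, dt], [0, 1]]
        self.P = [[p00 + dt * (p10 + p01) + dt * dt * p11 + self.q, p01 + dt * p11],
                  [p10 + dt * p11, p11 + self.q]]
        return self.x[0]

    def update(self, z):
        (p00, p01), (p10, p11) = self.P
        s = p00 + self.r               # innovation variance (H = [1, 0])
        k0, k1 = p00 / s, p10 / s      # Kalman gain
        y = z - self.x[0]              # innovation
        self.x[0] += k0 * y
        self.x[1] += k1 * y
        # P <- (I - K H) P
        self.P = [[(1 - k0) * p00, (1 - k0) * p01],
                  [p10 - k1 * p00, p11 - k1 * p01]]
```

While a person is occluded, the tracker can call `predict()` alone and resume calling `update()` once a detection is available again.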
Abstract:
For robots to operate in human environments they must be able to make their own maps because it is unrealistic to expect a user to enter a map into the robot’s memory; existing floorplans are often incorrect; and human environments tend to change. Traditionally robots have used sonar, infra-red or laser range finders to perform the mapping task. Digital cameras have become very cheap in recent years and they have opened up new possibilities as a sensor for robot perception. Any robot that must interact with humans can reasonably be expected to have a camera for tasks such as face recognition, so it makes sense to also use the camera for navigation. Cameras have advantages over other sensors such as colour information (not available with any other sensor), better immunity to noise (compared to sonar), and not being restricted to operating in a plane (like laser range finders). However, there are disadvantages too, with the principal one being the effect of perspective. This research investigated ways to use a single colour camera as a range sensor to guide an autonomous robot and allow it to build a map of its environment, a process referred to as Simultaneous Localization and Mapping (SLAM). An experimental system was built using a robot controlled via a wireless network connection. Using the on-board camera as the only sensor, the robot successfully explored and mapped indoor office environments. The quality of the resulting maps is comparable to those that have been reported in the literature for sonar or infra-red sensors. Although the maps are not as accurate as ones created with a laser range finder, the solution using a camera is significantly cheaper and is more appropriate for toys and early domestic robots.
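One common way to use a single camera as a range sensor, shown here only as a hedged sketch, is to assume a flat floor and a camera of known height and downward tilt; the image row at which the floor meets an obstacle then determines the distance. The abstract does not state that this is the method used, and the function name and parameters below are assumptions.

```python
import math


def ground_range(v, cy, focal_px, cam_height_m, tilt_deg):
    """Distance along the floor to the floor point seen at image row v.

    v and cy are pixel rows (increasing downward); tilt_deg is the camera's
    downward tilt from horizontal.
    """
    angle = math.radians(tilt_deg) + math.atan2(v - cy, focal_px)
    if angle <= 0:
        raise ValueError("pixel is at or above the horizon")
    return cam_height_m / math.tan(angle)
```

The formula also illustrates the perspective effect mentioned above: rows near the horizon map to rapidly growing distances, so a one-pixel error there produces a much larger range error than the same error near the bottom of the image.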
Abstract:
Automatic detection of suspicious activities in CCTV camera feeds is crucial to the success of video surveillance systems. Such a capability can help transform dumb CCTV cameras into smart surveillance tools for fighting crime and terror. Learning and classification of basic human actions is a precursor to detecting suspicious activities. Most current approaches rely on the unrealistic assumption that a complete dataset of normal human actions is available. This paper presents a different approach to the problem of understanding human actions in video when no prior information is available. This is achieved by working with an incomplete dataset of basic actions that is continuously updated. Initially, all video segments are represented by the Bag-of-Words (BoW) method using only Term Frequency-Inverse Document Frequency (TF-IDF) features. Then, a data-stream clustering algorithm is applied to update the system's knowledge from the incoming video feeds. Finally, all the actions are classified into different sets. Experiments and comparisons are conducted on the well-known Weizmann and KTH datasets to show the efficacy of the proposed approach.
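The TF-IDF weighting named above can be sketched as follows, treating each video segment as a "document" of visual-word identifiers. The logarithmic IDF used here, log(N / df), is one common variant and is an assumption, as is the function name; the paper's exact weighting is not given in the abstract.

```python
import math
from collections import Counter


def tf_idf(documents):
    """documents: list of lists of visual-word ids. Returns one dict per document."""
    n = len(documents)
    # Document frequency: in how many documents each word appears.
    df = Counter(word for doc in documents for word in set(doc))
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        vectors.append({w: (c / len(doc)) * math.log(n / df[w])
                        for w, c in counts.items()})
    return vectors
```

Words that occur in every segment receive a weight of zero, so the representation emphasises the visual words that distinguish one action from another.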
Abstract:
In public venues, crowd size is a key indicator of crowd safety and stability. Crowding levels can be detected using holistic image features; however, this requires a large amount of training data to capture the wide variations in crowd distribution. If a crowd counting algorithm is to be deployed across a large number of cameras, such a large and burdensome training requirement is far from ideal. In this paper we propose an approach that uses local features to count the number of people in each foreground blob segment, so that the total crowd estimate is the sum of the group sizes. This results in an approach that is scalable to crowd volumes not seen in the training data, and can be trained on a very small data set. As a local approach is used, the proposed algorithm can easily be used to estimate crowd density throughout different regions of the scene and in a multi-camera environment. A unique localised approach to ground truth annotation that reduces the required training data is also presented, as a localised approach to crowd counting has different training requirements to a holistic one. Testing on a large pedestrian database compares the proposed technique to existing holistic techniques and demonstrates improved accuracy, and superior performance when test conditions are unseen in the training set or a minimal training set is used.
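The idea of summing per-blob group sizes can be illustrated with a deliberately small sketch. It assumes a single local feature per blob (foreground pixel area) and a straight-line relationship between area and group size, fitted by least squares on a handful of annotated blobs. The paper's actual features and regression model are not specified in the abstract, so everything named below is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def estimate_crowd(blob_areas, slope, intercept):
    """Return (total estimate, per-blob group sizes) for one frame."""
    sizes = [max(0.0, slope * area + intercept) for area in blob_areas]
    return sum(sizes), sizes
```

Because each blob is estimated independently, a frame with more blobs than any training frame still produces a sensible total, and the per-blob sizes can be summed over any region of interest.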
Abstract:
Surveillance networks are typically monitored by a few people, viewing several monitors displaying the camera feeds. It is then very difficult for a human operator to effectively detect events as they happen. Recently, computer vision research has begun to address ways to automatically process some of this data, to assist human operators. Object tracking, event recognition, crowd analysis and human identification at a distance are being pursued as a means to aid human operators and improve the security of areas such as transport hubs. The task of object tracking is key to the effective use of more advanced technologies. To recognize an event, people and objects must be tracked. Tracking also enhances the performance of tasks such as crowd analysis or human identification. Before an object can be tracked, it must be detected. Motion segmentation techniques, widely employed in tracking systems, produce a binary image in which objects can be located. However, these techniques are prone to errors caused by shadows and lighting changes. Detection routines often fail, either due to erroneous motion caused by noise and lighting effects, or due to the detection routines being unable to split occluded regions into their component objects. Particle filters can be used as a self-contained tracking system, and make it unnecessary for the task of detection to be carried out separately except for an initial (often manual) detection to initialise the filter. Particle filters use one or more extracted features to evaluate the likelihood of an object existing at a given point each frame. Such systems, however, do not easily allow for multiple objects to be tracked robustly, and do not explicitly maintain the identity of tracked objects. This dissertation investigates improvements to the performance of object tracking algorithms through improved motion segmentation and the use of a particle filter.
A novel hybrid motion segmentation / optical flow algorithm, capable of simultaneously extracting multiple layers of foreground and optical flow in surveillance video frames, is proposed. The algorithm is shown to perform well in the presence of adverse lighting conditions, and the optical flow is capable of extracting a moving object. The proposed algorithm is integrated within a tracking system and evaluated using the ETISEO (Evaluation du Traitement et de l'Interpretation de Sequences vidEO - Evaluation for video understanding) database, and a significant improvement in detection and tracking performance is demonstrated when compared to a baseline system. A Scalable Condensation Filter (SCF), a particle filter designed to work within an existing tracking system, is also developed. The creation and deletion of modes and the maintenance of identity are handled by the underlying tracking system, and the tracking system is able to benefit from the improved performance in uncertain conditions arising from occlusion and noise provided by a particle filter. The system is evaluated using the ETISEO database. The dissertation then investigates fusion schemes for multi-spectral tracking systems. Four fusion schemes for combining a thermal and a visual colour modality are evaluated using the OTCBVS (Object Tracking and Classification in and Beyond the Visible Spectrum) database. It is shown that a middle fusion scheme yields the best results and demonstrates a significant improvement in performance when compared to a system using either mode individually. Findings from the thesis contribute to improving the performance of semi-automated video processing and therefore improve security in areas under surveillance.
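For readers unfamiliar with particle filters, the following is a minimal one-dimensional bootstrap (condensation-style) filter step. It is a generic sketch and not the Scalable Condensation Filter described above; the random-walk motion model, the Gaussian likelihood, and the parameter names are assumptions.

```python
import math
import random


def particle_filter_step(particles, measurement, rng, motion_std=1.0, meas_std=2.0):
    """Advance a 1-D particle set by one frame and return the new set."""
    # Predict: diffuse each particle with a random-walk motion model.
    moved = [p + rng.gauss(0.0, motion_std) for p in particles]
    # Weight: Gaussian likelihood of the measurement given each particle.
    weights = [math.exp(-0.5 * ((measurement - p) / meas_std) ** 2) for p in moved]
    if sum(weights) == 0.0:
        return moved  # measurement is far from every particle; keep the prediction
    # Resample: draw particles in proportion to their weights.
    return rng.choices(moved, weights=weights, k=len(moved))
```

In a tracker, the measurement would be replaced by a feature-based likelihood evaluated at each particle's hypothesised object location, and the mean or mode of the particle set gives the position estimate for the frame.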