126 results for Multi-Exposure Plate Images Processing
Abstract:
Features derived from the trispectra of DFT magnitude slices are used for multi-font digit recognition. These features are insensitive to translation, rotation, or scaling of the input. They are also robust to noise. Classification accuracy tests were conducted on a common database of 256 × 256 pixel bilevel images of digits in 9 fonts. Randomly rotated and translated noisy versions were used for training and testing. The results indicate that the trispectral features are better than moment invariants and affine moment invariants. They achieve a classification accuracy of 95% compared to about 81% for Hu's (1962) moment invariants and 39% for the Flusser and Suk (1994) affine moment invariants on the same data in the presence of 1% impulse noise using a 1-NN classifier. For comparison, a multilayer perceptron with no normalization for rotations and translations yields 34% accuracy on 16 × 16 pixel low-pass filtered and decimated versions of the same data.
Abstract:
Road surface macrotexture is identified as one of the factors contributing to the surface's skid resistance. Existing methods of quantifying the surface macrotexture, such as the sand patch test and the laser profilometer test, are either expensive or intrusive, requiring traffic control. High-resolution cameras have made it possible to acquire good quality images from roads for the automated analysis of texture depth. In this paper, a granulometric method based on image processing is proposed to estimate road surface texture coarseness distribution from their edge profiles. More than 1300 images were acquired from two different sites, extending to a total of 2.96 km. The images were acquired using camera orientations of 60 and 90 degrees. The road surface is modeled as a texture of particles, and the size distribution of these particles is obtained from chord lengths across edge boundaries. The mean size from each distribution is compared with the sensor-measured texture depth obtained using a laser profilometer. By tuning the edge detector parameters, a coefficient of determination of up to R² = 0.94 between the proposed method and the laser profilometer method was obtained. The high correlation is also confirmed by robust calibration parameters that enable the method to be used for unseen data after the method has been calibrated over road surface data with similar surface characteristics and under similar imaging conditions.
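The chord-length step described in this abstract can be illustrated with a minimal sketch. It assumes a binary edge map (1 marks an edge pixel) scanned row by row, and treats the pixel gap between consecutive edge pixels as one chord. The scan direction, the handling of thick edges, and the function names are assumptions made for illustration, not details taken from the paper.

```python
def chord_lengths(edge_row):
    """Return the distances between consecutive edge pixels on one scan line.

    Assumption: adjacent edge pixels (distance 1) belong to the same thick
    edge and are skipped rather than counted as a chord.
    """
    positions = [i for i, value in enumerate(edge_row) if value]
    return [b - a for a, b in zip(positions, positions[1:]) if b - a > 1]


def mean_chord_length(edge_map):
    """Mean chord length over all rows of a binary edge map (0.0 if none)."""
    lengths = [c for row in edge_map for c in chord_lengths(row)]
    return sum(lengths) / len(lengths) if lengths else 0.0


# Hypothetical 2-row edge map, for illustration only.
edge_map = [
    [1, 0, 0, 1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0, 1, 0, 1, 0],
]
print(mean_chord_length(edge_map))
```

In a setup like the one the abstract describes, the mean value per image would then be regressed against the profilometer texture depth; that calibration step is not shown here.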
Abstract:
Investigates the use of temporal lip information, in conjunction with speech information, for robust, text-dependent speaker identification. We propose that significant speaker-dependent information can be obtained from moving lips, enabling speaker recognition systems to be highly robust in the presence of noise. The fusion structure for the audio and visual information is based around the use of multi-stream hidden Markov models (MSHMM), with audio and visual features forming two independent data streams. Recent work with multi-modal MSHMMs has been performed successfully for the task of speech recognition. The use of temporal lip information for speaker identification has been performed previously (T.J. Wark et al., 1998); however, this has been restricted to output fusion via single-stream HMMs. We present an extension to this previous work, and show that an MSHMM is a valid structure for multi-modal speaker identification.
Abstract:
CCTV and surveillance networks are increasingly being used for operational as well as security tasks. One emerging area of technology that lends itself to operational analytics is soft biometrics. Soft biometrics can be used to describe a person and detect them throughout a sparse multi-camera network. This enables them to be used to perform tasks such as determining the time taken to get from point to point, and the paths taken through an environment by detecting and matching people across disjoint views. However, in a busy environment where there are hundreds if not thousands of people, such as an airport, attempting to monitor everyone is highly unrealistic. In this paper we propose an average soft biometric that can be used to identify people who look distinct, and are thus suitable for monitoring through a large, sparse camera network. We demonstrate how an average soft biometric can be used to identify unique people to calculate operational measures such as the time taken to travel from point to point.
Abstract:
In this paper, we seek to expand the use of direct methods in real-time applications by proposing a vision-based strategy for pose estimation of aerial vehicles. The vast majority of approaches make use of features to estimate motion. Conversely, the strategy we propose is based on a multi-resolution (MR) implementation of an image registration technique (Inverse Compositional Image Alignment, ICIA) using direct methods. An on-board camera in a downwards-looking configuration, and the assumption of planar scenes, are the bases of the algorithm. The motion between frames (rotation and translation) is recovered by decomposing the frame-to-frame homography obtained by the ICIA algorithm applied to a patch that covers around 80% of the image. When the visual estimation is required (e.g. GPS drop-out), this motion is integrated with the previous known estimation of the vehicle’s state, obtained from the on-board sensors (GPS/IMU), and the subsequent estimations are based only on the vision-based motion estimations. The proposed strategy is tested with real flight data in representative stages of a flight: cruise, landing, and take-off, two of which (take-off and landing) are considered critical. The performance of the pose estimation strategy is analyzed by comparing it with the GPS/IMU estimations. Results show correlation between the visual estimation obtained with the MR-ICIA and the GPS/IMU data, demonstrating that the visual estimation can be used to provide a good approximation of the vehicle’s state when it is required (e.g. GPS drop-outs). In terms of performance, the proposed strategy is able to maintain an estimation of the vehicle’s state for more than one minute at real-time frame rates, based only on visual information.
Abstract:
Virtual methods to assess the fitting of a fracture fixation plate were proposed recently, but with limitations such as simplified fit criteria or manual data processing. This study aims to automate a fit analysis procedure using clinical-based criteria, and then to analyse the results further for borderline fit cases. Three-dimensional (3D) models of 45 bones and of a precontoured distal tibial plate were utilized to assess the fitting of the plate automatically. A Matlab program was developed to automatically measure the shortest distance between the bone and the plate at three regions of interest and a plate-bone angle. The measured values including the fit assessment results were recorded in a spreadsheet as part of the batch-process routine. An automated fit analysis procedure will enable the processing of larger bone datasets in a significantly shorter time, which will provide more representative data of the target population for plate shape design and validation. As a result, better fitting plates can be manufactured and made available to surgeons, thereby reducing the risk and cost associated with complications or corrective procedures. This, in turn, is expected to improve patients' quality of life.
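The shortest-distance measurement mentioned in this abstract can be sketched as a brute-force search over two sets of 3D surface points. This is a Python illustration, not the authors' Matlab program. The point lists, the function names, and the `max_gap` threshold are hypothetical, since the abstract does not state the clinical fit criteria.

```python
import math


def shortest_distance(bone_points, plate_points):
    """Smallest Euclidean distance between any bone point and any plate point."""
    return min(math.dist(b, p) for b in bone_points for p in plate_points)


def region_fits(bone_points, plate_points, max_gap):
    """Hypothetical fit check: the region fits if the gap is within max_gap.

    max_gap is an assumed, caller-supplied threshold (same units as the
    points); the source does not give the actual criteria.
    """
    return shortest_distance(bone_points, plate_points) <= max_gap


# Illustrative coordinates only, not data from the study.
bone_region = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
plate_region = [(0.0, 0.0, 2.0), (1.0, 0.0, 3.0)]
print(shortest_distance(bone_region, plate_region))
```

The brute-force search is quadratic in the number of points; for dense surface meshes a spatial index would be the usual choice, but that is beyond this sketch.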
Practical improvements to simultaneous computation of multi-view geometry and radial lens distortion
Abstract:
This paper discusses practical issues related to the use of the division model for lens distortion in multi-view geometry computation. A data normalisation strategy is presented, which has been absent from previous discussions on the topic. The convergence properties of the Rectangular Quadric Eigenvalue Problem solution for computing division model distortion are examined. It is shown that the existing method can require more than 1000 iterations when dealing with severe distortion. A method is presented for accelerating convergence to less than 10 iterations for any amount of distortion. The new method is shown to produce equivalent or better results than the existing method with up to two orders of magnitude reduction in iterations. Through detailed simulation it is found that the number of data points used to compute geometry and lens distortion has a strong influence on convergence speed and solution accuracy. It is recommended that more than the minimal number of data points be used when computing geometry using a robust estimator such as RANSAC. Adding two to four extra samples improves the convergence rate and accuracy sufficiently to compensate for the increased number of samples required by the RANSAC process.
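For readers unfamiliar with the division model, a minimal sketch of the one-parameter form follows: a distorted point at squared radius r² from the distortion centre is mapped to its undistorted position by dividing by (1 + λr²). The function name, the default centre at the origin, and the assumption that coordinates are already normalised are illustrative choices; the paper's normalisation strategy and eigenvalue solver are not reproduced here.

```python
def undistort_division(x_d, y_d, lam, cx=0.0, cy=0.0):
    """Undistort one point with the one-parameter division model.

    (x_d, y_d): distorted point; lam: distortion parameter (lambda);
    (cx, cy): assumed distortion centre. Coordinates are assumed to be
    normalised already.
    """
    dx, dy = x_d - cx, y_d - cy
    r2 = dx * dx + dy * dy            # squared distorted radius
    scale = 1.0 / (1.0 + lam * r2)    # division-model correction factor
    return cx + dx * scale, cy + dy * scale


# With a negative lambda the point moves away from the centre.
print(undistort_division(1.0, 0.0, -0.5))
```

With λ = 0 the mapping is the identity, which makes a convenient sanity check when wiring such a function into a larger estimation loop.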
Abstract:
In Chapter 10, Adam and Dougherty describe the application of medical image processing to the assessment and treatment of spinal deformity, with a focus on the surgical treatment of idiopathic scoliosis. The natural history of spinal deformity and current approaches to surgical and non-surgical treatment are briefly described, followed by an overview of current clinically used imaging modalities. The key metrics currently used to assess the severity and progression of spinal deformities from medical images are presented, followed by a discussion of the errors and uncertainties involved in manual measurements. This provides the context for an analysis of automated and semi-automated image processing approaches to measure spinal curve shape and severity in two and three dimensions.
Abstract:
Time-varying bispectra, computed using a classical sliding window short-time Fourier approach, are analyzed for scalp EEG potentials evoked by an auditory stimulus and new observations are presented. A single, short duration tone is presented from the left or the right, direction unknown to the test subject. The subject responds by moving the eyes to the direction of the sound. EEG epochs sampled at 200 Hz for repeated trials are processed between -70 ms and +1200 ms with reference to the stimulus. It is observed that for an ensemble of correctly recognized cases, the best matching time-varying bispectra at (8 Hz, 8 Hz) are for PZ-FZ channels and this is also largely the case for grand averages but not for power spectra at 8 Hz. Out of 11 subjects, the only exception for time-varying bispectral match was a subject with family history of Alzheimer’s disease and the difference was in bicoherence, not biphase.
Abstract:
Motorcyclists are the most crash-prone road-user group in many Asian countries including Singapore; however, factors influencing motorcycle crashes are still not well understood. This study examines the effects of various roadway characteristics, traffic control measures and environmental factors on motorcycle crashes at different location types including expressways and intersections. Using techniques of categorical data analysis, this study has developed a set of log-linear models to investigate multi-vehicle motorcycle crashes in Singapore. Motorcycle crash risks in different circumstances have been calculated after controlling for the exposure estimated by the induced exposure technique. Results show that night-time influence increases crash risks of motorcycles particularly during merging and diverging manoeuvres on expressways, and turning manoeuvres at intersections. Riders appear to exercise more care while riding on wet road surfaces, particularly at night. Many hazardous interactions at intersections tend to be related to the failure of drivers to notice a motorcycle as well as to judge correctly the speed/distance of an oncoming motorcycle. Roadside conflicts due to stopping/waiting vehicles and interactions with opposing traffic on undivided roads have been found to be detrimental to motorcycle safety along arterial, main and local roads away from intersections. Based on the findings of this study, several targeted countermeasures in the form of legislation, rider training, and safety awareness programmes have been recommended.
Abstract:
Several track-before-detection approaches for image based aircraft detection have recently been examined in an important automated aircraft collision detection application. A particularly popular approach is a two stage processing paradigm which involves: a morphological spatial filter stage (which aims to emphasize the visual characteristics of targets) followed by a temporal or track filter stage (which aims to emphasize the temporal characteristics of targets). In this paper, we propose new spot detection techniques for this two stage processing paradigm that fuse together raw and morphological images or fuse together various different morphological images (we call these approaches morphological reinforcement). On the basis of flight test data, the proposed morphological reinforcement operations are shown to offer superior signal-to-noise characteristics when compared to standard spatial filter options (such as the close-minus-open and adaptive contour morphological operations). However, system operating characteristic curves, which examine detection versus false alarm characteristics after both processing stages, illustrate that system performance is very data dependent.
Abstract:
Person re-identification involves recognising individuals in different locations across a network of cameras and is a challenging task due to a large number of varying factors such as pose (both subject and camera) and ambient lighting conditions. Existing databases do not adequately capture these variations, making evaluations of proposed techniques difficult. In this paper, we present a new challenging multi-camera surveillance database designed for the task of person re-identification. This database consists of 150 unscripted sequences of subjects travelling in a building environment through up to eight camera views, appearing from various angles and in varying illumination conditions. A flexible XML-based evaluation protocol is provided to allow a highly configurable evaluation setup, enabling a variety of scenarios relating to pose and lighting conditions to be evaluated. A baseline person re-identification system consisting of colour, height and texture models is demonstrated on this database.
Abstract:
In this paper a real-time vision based power line extraction solution is investigated for active UAV guidance. The line extraction algorithm starts from ridge points detected by steerable filters. A collinear line segment fitting algorithm follows, which considers global and local information together with multiple collinear measurements. A GPU-boosted implementation of the algorithm is also investigated in the experiment. The experimental results show that the proposed algorithm outperforms two baseline line detection algorithms and is able to fit long collinear line segments. The low computational cost of the algorithm makes it suitable for real-time applications.
Abstract:
Laboratory-based studies of human dietary behaviour benefit from highly controlled conditions; however, this approach can lack ecological validity. Identifying a reliable method to capture and quantify natural dietary behaviours represents an important challenge for researchers. In this study, we scrutinised cafeteria-style meals in the ‘Restaurant of the Future.’ Self-selected meals were weighed and photographed, both before and after consumption. Using standard portions of the same foods, these images were independently coded to produce accurate and reliable estimates of (i) initial self-served portions, and (ii) food remaining at the end of the meal. Plate cleaning was extremely common; in 86% of meals at least 90% of self-selected calories were consumed. Males ate a greater proportion of their self-selected meals than did females. Finally, when participants visited the restaurant more than once, the correspondence between selected portions was better predicted by the weight of the meal than by its energy content. These findings illustrate the potential benefits of meal photography in this context. However, they also highlight significant limitations, in particular, the need to exclude large amounts of data when one food obscures another.
Abstract:
Real-world AI systems have been recently deployed which can automatically analyze the plan and tactics of tennis players. As the game-state is updated regularly at short intervals (i.e. point-level), a library of successful and unsuccessful plans of a player can be learnt over time. Given the relative strengths and weaknesses of a player’s plans, a set of proven plans or tactics from the library that characterize a player can be identified. For low-scoring, continuous team sports like soccer, such analysis for multi-agent teams does not exist as the game is not segmented into “discretized” plays (i.e. plans), making it difficult to obtain a library that characterizes a team’s behavior. Additionally, as player tracking data is costly and difficult to obtain, we only have partial team tracings in the form of ball actions which makes this problem even more difficult. In this paper, we propose a method to overcome these issues by representing team behavior via play-segments, which are spatio-temporal descriptions of ball movement over fixed windows of time. Using these representations we can characterize team behavior from entropy maps, which give a measure of predictability of team behaviors across the field. We show the efficacy and applicability of our method on the 2010-2011 English Premier League soccer data.
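The entropy-map idea in this abstract can be illustrated with a short sketch: for each field cell, compute the Shannon entropy of the play-segment labels observed there, so that low values indicate predictable behaviour. The cell keys, the string labels, and the function names are hypothetical; how the authors cluster play-segments and partition the field is not given in the abstract.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy in bits of the empirical distribution of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_map(segments_by_cell):
    """Map each field cell to the entropy of its play-segment labels.

    segments_by_cell: dict from a cell identifier (assumed here to be a
    (row, col) tuple) to a list of play-segment labels; empty cells are
    skipped.
    """
    return {cell: shannon_entropy(labels)
            for cell, labels in segments_by_cell.items() if labels}


# Illustrative labels only, not derived from the league data.
observed = {
    (0, 0): ["long_pass", "long_pass", "long_pass", "long_pass"],
    (0, 1): ["long_pass", "short_pass", "dribble", "cross"],
    (1, 0): [],
}
print(entropy_map(observed))
```

A cell where one play-segment dominates yields entropy near zero, while a cell with four equally likely play-segments yields two bits.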