948 results for biometrics, fingerprints, minutiae extraction, ground truth
Abstract:
Background: Timely diagnosis and reporting of patient symptoms in hospital emergency departments (ED) is a critical component of health services delivery. However, due to dispersed information resources and a vast amount of manual processing of unstructured information, accurate point-of-care diagnosis is often difficult. Aims: This research reports an initial experimental evaluation of a clinician-informed automated method for addressing initial misdiagnoses associated with delayed receipt of unstructured radiology reports. Method: A method was developed that resembles clinical reasoning for identifying limb abnormalities. The method consists of a gazetteer of keywords related to radiological findings; it classifies an X-ray report as abnormal if the report contains evidence listed in the gazetteer. A set of 99 narrative reports of radiological findings was sourced from a tertiary hospital. Reports were manually assessed by two clinicians, and discrepancies were validated by a third expert ED clinician; the final manual classification generated by the expert ED clinician was used as ground truth to empirically evaluate the approach. Results: The automated method, which attempts to identify limb abnormalities by searching for keywords expressed by clinicians, achieved an F-measure of 0.80 and an accuracy of 0.80. Conclusion: While the automated clinician-driven method achieved promising performance, a number of avenues for improvement were identified using advanced natural language processing (NLP) and machine learning techniques.
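The gazetteer-based classification and its evaluation against clinician ground truth can be sketched roughly as follows. This is a minimal illustration and not the study's implementation: the keyword list, the function names `classify_report` and `f_measure`, and the sample reports are all hypothetical. Plain keyword matching also ignores negation, so a report such as "No fracture seen" would be flagged as abnormal.

```python
import re

# Hypothetical gazetteer: these keywords are illustrative only and are
# not taken from the study.
GAZETTEER = {"fracture", "dislocation", "avulsion", "displaced"}


def classify_report(text, gazetteer=GAZETTEER):
    """Return 'abnormal' if any gazetteer keyword occurs in the report."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return "abnormal" if tokens & gazetteer else "normal"


def f_measure(predicted, truth, positive="abnormal"):
    """Balanced F-measure (F1) of the positive class against ground truth."""
    pairs = list(zip(predicted, truth))
    tp = sum(p == positive and t == positive for p, t in pairs)
    fp = sum(p == positive and t != positive for p, t in pairs)
    fn = sum(p != positive and t == positive for p, t in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

A caller would classify each narrative report with `classify_report` and pass the predictions, together with the expert labels, to `f_measure`.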
Abstract:
This thesis presents an approach to vertical infrastructure inspection using a vertical take-off and landing (VTOL) unmanned aerial vehicle and shared autonomy. Inspecting vertical structures such as light and power distribution poles is a difficult task. Developing such an inspection system involves challenges such as flying in close proximity to a target while maintaining a fixed stand-off distance from it. The contributions of this thesis fall into three main areas. Firstly, an approach to vehicle dynamic modeling is evaluated in simulation and experiments. Secondly, EKF-based state estimators are demonstrated, as well as estimator-free approaches such as image-based visual servoing (IBVS), validated with motion capture ground truth data. Thirdly, an integrated pole inspection system comprising a VTOL platform with human-in-the-loop control (shared autonomy) is demonstrated. These contributions are comprehensively explained through a series of published papers.
Abstract:
In this paper we propose the hybrid use of illuminant invariant and RGB images to perform image classification of urban scenes despite challenging variation in lighting conditions. Coping with lighting change (and the shadows thereby invoked) is a non-negotiable requirement for long term autonomy using vision. One aspect of this is the ability to reliably classify scene components in the presence of marked and often sudden changes in lighting. This is the focus of this paper. Posed with the task of classifying all parts in a scene from a full colour image, we propose that lighting invariant transforms can reduce the variability of the scene, resulting in a more reliable classification. We leverage the ideas of “data transfer” for classification, beginning with full colour images for obtaining candidate scene-level matches using global image descriptors. This is commonly followed by superpixel-level matching with local features. However, we show that if the RGB images are subjected to an illuminant invariant transform before computing the superpixel-level features, classification is significantly more robust to scene illumination effects. The approach is evaluated using three datasets. The first is our own dataset and the second is the KITTI dataset, using manually generated ground truth for quantitative analysis. We qualitatively evaluate the method on a third custom dataset over a 750 m trajectory.
Abstract:
Clustering is an important technique in organising and categorising web scale documents. The main challenges faced in clustering the billions of documents available on the web are the processing power required and the sheer size of the datasets available. More importantly, it is nigh impossible to generate the labels for a general web document collection containing billions of documents and a vast taxonomy of topics. However, document clusters are most commonly evaluated by comparison to a ground truth set of labels for documents. This paper presents a clustering and labeling solution in which Wikipedia is clustered and hundreds of millions of web documents in ClueWeb12 are mapped onto those clusters. This solution is based on the assumption that Wikipedia contains such a wide range of diverse topics that it represents a small scale web. We found that it was possible to perform the web scale document clustering and labeling process on one desktop computer in under a couple of days for the Wikipedia clustering solution containing about 1000 clusters. It takes longer to execute a solution with finer granularity clusters, such as 10,000 or 50,000 clusters. These results were evaluated using a set of external data.
Abstract:
Advances in neural network language models have demonstrated that these models can effectively learn representations of word meaning. In this paper, we explore a variation of neural language models that can learn from concepts taken from structured ontologies and extracted from free-text, rather than directly from terms in free-text. This model is employed for the task of measuring semantic similarity between medical concepts, a task that is central to a number of techniques in medical informatics and information retrieval. The model is built with two medical corpora (journal abstracts and patient records) and empirically validated on two ground-truth datasets of human-judged concept pairs assessed by medical professionals. Empirically, our approach correlates closely with expert human assessors ($\approx$ 0.9) and outperforms a number of state-of-the-art benchmarks for medical semantic similarity. The demonstrated superiority of this model for providing an effective semantic similarity measure is promising in that this may translate into effectiveness gains for techniques in medical information retrieval and medical informatics (e.g., query expansion and literature-based discovery).
Abstract:
In this paper, we develop and validate a new Statistically Assisted Fluid Registration Algorithm (SAFIRA) for brain images. A non-statistical version of this algorithm was first implemented in [2] and re-formulated using Lagrangian mechanics in [3]. Here we extend this algorithm to 3D: given 3D brain images from a population, vector fields and their corresponding deformation matrices are computed in a first round of registrations using the non-statistical implementation. Covariance matrices for both the deformation matrices and the vector fields are then obtained and incorporated (separately or jointly) in the regularizing (i.e., the non-conservative Lagrangian) terms, creating four versions of the algorithm. We evaluated the accuracy of each algorithm variant using the manually labeled LPBA40 dataset, which provides us with ground truth anatomical segmentations. We also compared the power of the different algorithms using tensor-based morphometry (a technique to analyze local volumetric differences in brain structure) applied to 46 3D brain scans from healthy monozygotic twins.
Abstract:
Over the last few decades, there has been a significant land cover (LC) change across the globe due to the increasing demand of the burgeoning population and urban sprawl. In order to take account of the change, there is a need for accurate and up-to-date LC maps. Mapping and monitoring of LC in India is being carried out at national level using multi-temporal IRS AWiFS data. Multispectral data such as IKONOS, Landsat-TM/ETM+, IRS-1C/D LISS-III/IV, AWiFS and SPOT-5, etc. have adequate spatial resolution (~1 m to 56 m) for LC mapping to generate 1:50,000 maps. However, for developing countries and those with large geographical extent, seasonal LC mapping is prohibitive with data from commercial sensors of limited spatial coverage. Superspectral data from the MODIS sensor are freely available and have better temporal (8 day composites) and spectral information. MODIS pixels typically contain a mixture of various LC types (due to coarse spatial resolution of 250, 500 and 1000 m), especially in more fragmented landscapes. In this context, linear spectral unmixing would be useful for mapping patchy land covers, such as those that characterise much of the Indian subcontinent. This work evaluates the existing unmixing technique for LC mapping using MODIS data, using end-members that are extracted through Pixel Purity Index (PPI), Scatter plot and N-dimensional visualisation. The abundance maps were generated for agriculture, built up, forest, plantations, waste land/others and water bodies. The assessment of the results using ground truth and a LISS-III classified map shows 86% overall accuracy, suggesting the potential for broad-scale applicability of the technique with superspectral data for natural resource planning and inventory applications. Index Terms-Remote sensing, digital
Abstract:
Cereal grain is one of the main export commodities of Australian agriculture. Over the past decade, crop yield forecasts for wheat and sorghum have shown appreciable utility for industry planning at shire, state, and national scales. There is now an increasing drive from industry for more accurate and cost-effective crop production forecasts. In order to generate production estimates, accurate crop area estimates are needed by the end of the cropping season. Multivariate methods for analysing remotely sensed Enhanced Vegetation Index (EVI) from 16-day Moderate Resolution Imaging Spectroradiometer (MODIS) satellite imagery within the cropping period (i.e. April-November) were investigated to estimate crop area for wheat, barley, chickpea, and total winter cropped area for a case study region in NE Australia. Each pixel classification method was trained on ground truth data collected from the study region. Three approaches to pixel classification were examined: (i) cluster analysis of trajectories of EVI values from consecutive multi-date imagery during the crop growth period; (ii) harmonic analysis of the time series (HANTS) of the EVI values; and (iii) principal component analysis (PCA) of the time series of EVI values. Images classified using these three approaches were compared with each other, and with a classification based on the single MODIS image taken at peak EVI. Imagery for the 2003 and 2004 seasons was used to assess the ability of the methods to determine wheat, barley, chickpea, and total cropped area estimates. The accuracy at pixel scale was determined by the percent correct classification metric by contrasting all pixel scale samples with independent pixel observations. At a shire level, aggregated total crop area estimates were compared with surveyed estimates. All multi-temporal methods showed significant overall capability to estimate total winter crop area. 
There was high accuracy at pixel scale (>98% correct classification) for identifying overall winter cropping. However, discrimination among crops was less accurate. Although the use of single-date EVI data produced high accuracy for estimates of wheat area at shire scale, the result contradicted the poor pixel-scale accuracy associated with this approach, due to fortuitous compensating errors. Further studies are needed to extrapolate the multi-temporal approaches to other geographical areas and to improve the lead time for deriving cropped-area estimates before harvest.
A method for mapping the distribution and density of rabbits and other vertebrate pests in Australia
Abstract:
The European wild rabbit has been considered Australia’s worst vertebrate pest and yet little effort appears to have gone into producing maps of rabbit distribution and density. Mapping the distribution and density of pests is an important step in effective management. A map is essential for estimating the extent of damage caused and for efficiently planning and monitoring the success of pest control operations. This paper describes the use of soil type and point data to prepare a map showing the distribution and density of rabbits in Australia. The potential for the method to be used for mapping other vertebrate pests is explored. The approach used to prepare the map is based on that used for rabbits in Queensland (Berman et al. 1998). An index of rabbit density was determined using the number of Spanish rabbit fleas released per square kilometre for each Soil Map Unit (Atlas of Australian Soils). Spanish rabbit fleas were released into active rabbit warrens at 1606 sites in the early 1990s as an additional vector for myxoma virus and the locations of the releases were recorded using a Global Positioning System (GPS). Releases were predominantly in arid areas but some fleas were released in south east Queensland and the New England Tablelands of New South Wales. The map produced appears to reflect well the distribution and density of rabbits, at least in the areas where Spanish fleas were released. Rabbit pellet counts conducted in 2007 at 54 sites across an area of south east South Australia, south eastern Queensland, and parts of New South Wales (New England Tablelands and south west) in soil Map Units where Spanish fleas were released, provided a preliminary means to ground truth the map. There was a good relationship between mean pellet count score and the index of abundance for soil Map Units. 
Rabbit pellet counts may allow extension of the map into other parts of Australia where no Spanish rabbit fleas were released and where there may be no other consistent information on rabbit location and density. The recent Equine Influenza outbreak provided a further test of the value of this mapping method. The distribution and density of domestic horses were mapped to provide estimates of the number of horses in various regions. These estimates were close to the actual numbers of horses subsequently determined from vaccination records and registrations. The soil Map Units are not simply soil types: they contain information on land use and vegetation, and the soil classification is relatively localised. These properties make this mapping method useful, not only for rabbits, but also for other species that are not so dependent on soil type for survival.
Abstract:
The use of remote sensing imagery as auxiliary data in forest inventory is based on the correlation between features extracted from the images and the ground truth. Bidirectional reflectance and radial displacement cause image features to vary between different segments of the image even when the forest characteristics remain the same. The variation has so far been diminished by different radiometric corrections. In this study the use of sun azimuth based converted image co-ordinates was examined to supplement auxiliary data extracted from digitised aerial photographs. The method was considered as an alternative to radiometric corrections. Additionally, the usefulness of multi-image interpretation of digitised aerial photographs in regression estimation of forest characteristics was studied. The state-owned study area was located in Leivonmäki, Central Finland, and the study material consisted of five digitised and ortho-rectified colour-infrared (CIR) aerial photographs and field measurements of 388 plots, out of which 194 were relascope (Bitterlich) plots and 194 were concentric circular plots. Both the image data and the field measurements were from the year 1999. When examining the effect of the location of the image point on pixel values and texture features of Finnish forest plots in digitised CIR photographs, the clearest differences were found between front- and back-lighted image halves. Inside the image half, the differences between different blocks were clearly bigger on the front-lighted half than on the back-lighted half. The strength of the phenomenon varied by forest category. The differences between pixel values extracted from different image blocks were greatest in developed and mature stands and smallest in young stands. The differences between texture features were greatest in developing stands and smallest in young and mature stands. 
The logarithm of timber volume per hectare and the angular transformation of the proportion of broadleaved trees of the total volume were used as dependent variables in regression models. Five different trend surfaces based on converted image co-ordinates were used in models in order to diminish the effect of the bidirectional reflectance. The reference model of total volume, in which the location of the image point had been ignored, resulted in an RMSE of 1.268 calculated from test material. The best of the trend surfaces was the complete third order surface, which resulted in an RMSE of 1.107. The reference model of the proportion of broadleaved trees resulted in an RMSE of 0.4292, and the second order trend surface was the best, resulting in an RMSE of 0.4270. The trend surface method is applicable, but it has to be applied by forest category and by variable. The usefulness of multi-image interpretation of digitised aerial photographs was studied by building comparable regression models using either the front-lighted image features, back-lighted image features or both. The two-image model turned out to be slightly better than the one-image models in total volume estimation. The best one-image model resulted in an RMSE of 1.098 and the two-image model resulted in an RMSE of 1.090. The homologous features did not improve the models of the proportion of broadleaved trees. The overall result gives motivation for further research on multi-image interpretation. The focus may be on improving regression estimation and feature selection, or on examining the stratification used in two-phase sampling inventory techniques. Keywords: forest inventory, digitised aerial photograph, bidirectional reflectance, converted image coordinates, regression estimation, multi-image interpretation, pixel value, texture, trend surface
Abstract:
Mesoscale weather phenomena, such as the sea breeze circulation or lake effect snow bands, are typically too large to be observed at one point, yet too small to be caught in a traditional network of weather stations. Hence, the weather radar is one of the best tools for observing, analyzing and understanding their behavior and development. A weather radar network is a complex system, which has many structural and technical features to be tuned, from the location of each radar to the number of pulses averaged in the signal processing. These design parameters have no universal optimal values; their selection depends on the nature of the weather phenomena to be monitored as well as on the applications for which the data will be used. The priorities and critical values are different for forest fire forecasting, aviation weather service or the planning of snow ploughing, to name a few radar-based applications. The main objective of the work performed within this thesis has been to combine knowledge of technical properties of the radar systems and our understanding of weather conditions in order to produce better applications able to efficiently support decision making in service duties for modern society related to weather and safety in northern conditions. When a new application is developed, it must be tested against ground truth. Two new verification approaches for radar-based hail estimates are introduced in this thesis. For mesoscale applications, finding a representative reference can be challenging since these phenomena are by definition difficult to catch with surface observations. Hence, almost any valuable information that can be distilled from unconventional data sources such as newspapers and holiday snapshots is welcome. However, obtaining estimates of data quality, and judging to what extent the two disparate information sources can be compared, is as important as getting the data. 
The presented new applications do not rely on radar data alone, but ingest information from auxiliary sources such as temperature fields. The author concludes that in the future the radar will continue to be a key source of data and information especially when used together in an effective way with other meteorological data.
Abstract:
Over the last few decades, there has been a significant land cover (LC) change across the globe due to the increasing demand of the burgeoning population and urban sprawl. In order to take account of the change, there is a need for accurate and up-to-date LC maps. Mapping and monitoring of LC in India is being carried out at national level using multi-temporal IRS AWiFS data. Multispectral data such as IKONOS, Landsat-TM/ETM+, IRS-1C/D LISS-III/IV, AWiFS and SPOT-5, etc. have adequate spatial resolution (~1 m to 56 m) for LC mapping to generate 1:50,000 maps. However, for developing countries and those with large geographical extent, seasonal LC mapping is prohibitive with data from commercial sensors of limited spatial coverage. Superspectral data from the MODIS sensor are freely available and have better temporal (8 day composites) and spectral information. MODIS pixels typically contain a mixture of various LC types (due to coarse spatial resolution of 250, 500 and 1000 m), especially in more fragmented landscapes. In this context, linear spectral unmixing would be useful for mapping patchy land covers, such as those that characterise much of the Indian subcontinent. This work evaluates the existing unmixing technique for LC mapping using MODIS data, using end-members that are extracted through Pixel Purity Index (PPI), Scatter plot and N-dimensional visualisation. The abundance maps were generated for agriculture, built up, forest, plantations, waste land/others and water bodies. The assessment of the results using ground truth and a LISS-III classified map shows 86% overall accuracy, suggesting the potential for broad-scale applicability of the technique with superspectral data for natural resource planning and inventory applications.
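For readers unfamiliar with linear spectral unmixing, the standard linear mixing model can be written as below. The notation is an assumption introduced here for illustration and does not come from the abstract: $\mathbf{x}$ is the observed pixel spectrum, $\mathbf{e}_i$ the spectrum of the $i$-th end-member, $a_i$ its abundance, $M$ the number of end-members (six land cover classes in this work) and $\boldsymbol{\varepsilon}$ a residual term. Whether the evaluated technique enforces both abundance constraints is not stated in the abstract.

```latex
\mathbf{x} = \sum_{i=1}^{M} a_i \, \mathbf{e}_i + \boldsymbol{\varepsilon},
\qquad a_i \ge 0 \ \ (i = 1, \dots, M),
\qquad \sum_{i=1}^{M} a_i = 1
```

Each abundance map then corresponds to the per-pixel estimate of one $a_i$.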
Abstract:
This paper describes a semi-automatic tool for annotation of multi-script text from natural scene images. To our knowledge, this is the first tool that deals with multi-script text or arbitrary orientation. The procedure involves manual seed selection followed by a region growing process to segment each word present in the image. The threshold for region growing can be varied by the user so as to ensure pixel-accurate character segmentation. The text present in the image is tagged word-by-word. A virtual keyboard interface has also been designed for entering the ground truth in ten Indic scripts, besides English. The keyboard interface can easily be generated for any script, thereby expanding the scope of the toolkit. Optionally, each segmented word can further be labeled into its constituent characters/symbols. Polygonal masks are used to split or merge the segmented words into valid characters/symbols. The ground truth is represented by a pixel-level segmented image and a '.txt' file that contains information about the number of words in the image, word bounding boxes, script and ground truth Unicode. The toolkit, developed using MATLAB, can be used to generate ground truth and annotation for any generic document image. Thus, it is useful for researchers in the document image processing community for evaluating the performance of document analysis and recognition techniques. The multi-script annotation toolkit (MAST) is available for free download.
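Seeded region growing with a user-adjustable threshold can be sketched as follows. This is an illustrative Python sketch, not the MATLAB toolkit's code: the function name `region_grow`, the 4-connectivity and the similarity criterion (absolute difference from the seed intensity) are all assumptions, since the abstract does not specify them.

```python
from collections import deque


def region_grow(image, seed, threshold):
    """Grow a 4-connected region from `seed` over a 2D grayscale grid.

    A pixel joins the region when its intensity differs from the seed
    intensity by at most `threshold` (an assumed similarity criterion).
    Returns the set of (row, col) coordinates in the region.
    """
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region
                    and abs(image[nr][nc] - seed_value) <= threshold):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region
```

Raising `threshold` lets the region absorb more dissimilar pixels and lowering it tightens the segmentation, which mirrors the user-controlled adjustment described above.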
Abstract:
We propose a set of metrics that evaluate the uniformity, sharpness, continuity, noise, stroke width variance, pulse width ratio, transient pixel density, entropy and variance of components to quantify the quality of a document image. The measures are intended to be used in any optical character recognition (OCR) engine to estimate a priori the expected performance of the OCR. The suggested measures have been evaluated on many document images in different scripts. The quality of each document image is manually annotated by users to create a ground truth. The idea is to correlate the values of the measures with the user-annotated data. If the calculated measure matches the annotated description, the metric is accepted; otherwise it is rejected. Of the proposed metrics, some are accepted and the rest are rejected. We have defined metrics that are easy to estimate. The metrics proposed in this paper are based on feedback from home-grown OCR engines for Indic (Tamil and Kannada) languages. The metrics are independent of the scripts, and depend only on the quality and age of the paper and the printing. Experiments and results for each proposed metric are discussed. Actual recognition of the printed text is not performed to evaluate the proposed metrics. Sometimes a document image containing broken characters is rated as a good document image by the evaluated metrics, which remains an unsolved challenge. The proposed measures work on grayscale document images and fail to provide reliable information on binarized document images.
Abstract:
We propose an iterative algorithm to detect transient segments in audio signals. The short-time Fourier transform (STFT) is used to detect rapid local changes in the audio signal. The algorithm iterates over two steps: (a) calculate a function of the STFT and (b) build a transient signal. A dynamic thresholding scheme is used to locate the potential positions of transients in the signal. The iterative procedure ensures that genuine transients are built up while localised spectral noise is suppressed using an energy criterion. The extracted transient signal is later compared to a ground truth dataset. The algorithm performed well on two databases. On the EBU-SQAM database of monophonic sounds, the algorithm achieved an F-measure of 90%, while on our database of polyphonic audio it achieved an F-measure of 91%. This technique is being used as a preprocessing step for a tempo analysis algorithm and a TSR (Transients + Sines + Residue) decomposition scheme.
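The general idea of locating transients from STFT changes with a dynamic threshold can be illustrated with spectral flux, a simpler, non-iterative technique that is swapped in here because the abstract does not specify the paper's STFT function or energy criterion. All names and parameter values below (`stft_magnitudes`, `spectral_flux`, `detect_transients`, `frame_len`, `hop`, `half_width`, `scale`, `floor`) are assumptions made for this sketch.

```python
import cmath
import math


def stft_magnitudes(signal, frame_len=16, hop=8):
    """Magnitude spectra of Hann-windowed frames, computed with a naive DFT."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


def spectral_flux(frames):
    """Sum of positive magnitude increases between consecutive frames."""
    flux = [0.0]
    for prev, cur in zip(frames, frames[1:]):
        flux.append(sum(max(c - p, 0.0) for p, c in zip(prev, cur)))
    return flux


def detect_transients(flux, half_width=3, scale=1.5, floor=1e-6):
    """Frame indices where flux is a local maximum above a moving threshold.

    The threshold is `scale` times the local mean of the flux, with an
    absolute `floor` so that numerical noise is never reported.
    """
    hits = []
    for i, value in enumerate(flux):
        lo, hi = max(0, i - half_width), min(len(flux), i + half_width + 1)
        neighbourhood = flux[lo:hi]
        threshold = max(scale * sum(neighbourhood) / len(neighbourhood), floor)
        if value == max(neighbourhood) and value > threshold:
            hits.append(i)
    return hits
```

On a signal made of silence followed by a steady tone, the reported frame should fall near the boundary between the two. Comparing such detections with annotated onsets is what an F-measure evaluation like the one reported above would require.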