188 results for Histograms
Abstract:
In machine learning and pattern recognition tasks, feature discretization techniques may offer several advantages. The discretized features may hold enough information for the learning task at hand while ignoring minor fluctuations that are irrelevant or harmful to that task, and their more compact representations may yield both better accuracy and lower training time than the original features. However, in many cases, especially with medium- and high-dimensional data, the large number of features usually implies some redundancy among them. Thus, we may further apply feature selection (FS) techniques to the discrete data, keeping the most relevant features while discarding the irrelevant and redundant ones. In this paper, we propose relevance and redundancy criteria for supervised feature selection on discrete data. These criteria are applied to the bin-class histograms of the discrete features. The experimental results, on public benchmark data, show that the proposed criteria can achieve better accuracy than widely used relevance and redundancy criteria such as mutual information and the Fisher ratio.
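As an illustration of the quantities involved, the following minimal Python sketch builds a bin-class histogram for one discretized feature and scores it with mutual information, one of the baseline relevance criteria mentioned above; the paper's own criteria are not reproduced here, and all data and names are illustrative.

```python
import numpy as np

def bin_class_histogram(x, y, n_bins, n_classes):
    """Joint counts of (feature bin, class label) for one discretized feature."""
    hist = np.zeros((n_bins, n_classes))
    for b, c in zip(x, y):
        hist[b, c] += 1
    return hist

def mutual_information(hist):
    """Mutual information between bins and classes, computed from the joint counts."""
    p = hist / hist.sum()
    px = p.sum(axis=1, keepdims=True)   # marginal over bins
    py = p.sum(axis=0, keepdims=True)   # marginal over classes
    nz = p > 0
    return float((p[nz] * np.log(p[nz] / (px @ py)[nz])).sum())

# Rank features by relevance and keep the top k (a crude filter-style FS baseline).
rng = np.random.default_rng(0)
X = rng.integers(0, 8, size=(200, 5))   # 5 features discretized into 8 bins (synthetic)
y = rng.integers(0, 3, size=200)        # 3 classes (synthetic)
scores = [mutual_information(bin_class_histogram(X[:, j], y, 8, 3)) for j in range(X.shape[1])]
top_k = np.argsort(scores)[::-1][:2]
```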
Abstract:
The application of forecast ensembles to probabilistic weather prediction has spurred considerable interest in their evaluation. Such ensembles are commonly interpreted as Monte Carlo ensembles, meaning that the ensemble members are perceived as random draws from a distribution. Under this interpretation, a reasonable property to ask for is statistical consistency, which demands that the ensemble members and the verification behave like draws from the same distribution. A widely used technique to assess the statistical consistency of a historical dataset is the rank histogram, whose criterion is the number of times the verification falls between consecutive members of the ordered ensemble. Ensemble evaluation is rendered more specific by stratification, which means that ensembles satisfying a certain condition (e.g., a certain meteorological regime) are evaluated separately. Fundamental relationships between Monte Carlo ensembles, their rank histograms, and random sampling from the probability simplex according to the Dirichlet distribution are pointed out. Furthermore, the possible benefits and complications of ensemble stratification are discussed. The main conclusion is that a stratified Monte Carlo ensemble might appear inconsistent with the verification even though the original (unstratified) ensemble is consistent. The apparent inconsistency is merely a result of stratification. Stratified rank histograms are thus not necessarily flat. This result is demonstrated by perfect ensemble simulations and supplemented by mathematical arguments. Possible methods to avoid or remove artifacts that stratification induces in the rank histogram are suggested.
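A minimal sketch of the rank histogram computation described above (rank of the verification among the ordered ensemble members, assuming no ties); the data are synthetic draws from a consistent ensemble, so the resulting histogram should be approximately flat.

```python
import numpy as np

def rank_histogram(ensemble, verification):
    """Rank histogram of a set of forecast cases.

    ensemble: (n_cases, n_members) array, verification: (n_cases,) array.
    Returns counts over the n_members + 1 possible ranks (ties ignored).
    """
    n_cases, n_members = ensemble.shape
    ranks = (ensemble < verification[:, None]).sum(axis=1)
    return np.bincount(ranks, minlength=n_members + 1)

# A consistent ("perfect") ensemble: members and verification drawn from the same distribution.
rng = np.random.default_rng(1)
draws = rng.normal(size=(10000, 11))
hist = rank_histogram(draws[:, :10], draws[:, 10])   # should be roughly flat
```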
Abstract:
I consider the possibility that respondents to the Survey of Professional Forecasters round their probability forecasts of the event that real output will decline in the future, as well as their reported output growth probability distributions. I make various plausible assumptions about respondents’ rounding practices, and show how these impinge upon the apparent mismatch between probability forecasts of a decline in output and the probabilities of this event implied by the annual output growth histograms. I find that rounding accounts for about a quarter of the inconsistent pairs of forecasts.
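The following sketch illustrates the kind of consistency check involved, under two simple assumptions: a histogram bin straddling zero is attributed proportionally (uniform within the bin), and both forecasts are rounded to a 5-point grid. These are illustrative assumptions, not the rounding practices examined in the paper.

```python
def histogram_implied_decline_prob(bin_edges, bin_probs):
    """Probability of negative growth implied by a reported output-growth histogram.
    A bin straddling zero is attributed proportionally (uniform within the bin)."""
    p = 0.0
    for lo, hi, q in zip(bin_edges[:-1], bin_edges[1:], bin_probs):
        if hi <= 0:
            p += q
        elif lo < 0 < hi:
            p += q * (0 - lo) / (hi - lo)
    return p

def consistent_under_rounding(point_prob, implied_prob, grid=0.05):
    """Do the two forecasts agree once both are rounded to the same grid (e.g. 5 points)?"""
    return round(point_prob / grid) == round(implied_prob / grid)

edges = [-2.0, -0.5, 0.5, 1.5, 2.5, 3.5]                 # hypothetical growth bins (percentage points)
probs = [0.05, 0.20, 0.40, 0.25, 0.10]                   # reported histogram probabilities
implied = histogram_implied_decline_prob(edges, probs)   # 0.05 + 0.20 * 0.5 = 0.15
print(consistent_under_rounding(0.15, implied))          # True under this rounding assumption
```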
Abstract:
Most face recognition approaches require prior training in which a given distribution of faces is assumed in order to predict the identity of test faces. Such approaches may have difficulty identifying faces drawn from distributions different from the one provided during training. A face recognition technique that performs well regardless of training is therefore interesting to consider as a basis for more sophisticated methods. In this work, the Census Transform is applied to describe the faces. Based on a scanning window that extracts local histograms of Census features, we present a method that directly matches face samples. With this simple technique, 97.2% of the faces in the FERET fa/fb test were correctly recognized. Although this is an easy test set, we have found no other approach in the literature based on direct face comparisons with such performance. Room for further improvement is also identified: among other techniques, we demonstrate how the use of SVMs over the Census histogram representation can increase recognition performance.
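A minimal sketch of the described pipeline, assuming a 3x3 Census Transform, non-overlapping scanning windows, and histogram intersection as the matching score; the window size and similarity measure are illustrative choices, not necessarily those of the paper.

```python
import numpy as np

def census_transform(img):
    """3x3 Census Transform: each pixel becomes an 8-bit code comparing its
    eight neighbours with the centre pixel (bit set where neighbour >= centre)."""
    h, w = img.shape
    centre = img[1:-1, 1:-1]
    code = np.zeros((h - 2, w - 2), dtype=np.int32)
    bit = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            neigh = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
            code = code | ((neigh >= centre).astype(np.int32) << bit)
            bit += 1
    return code

def census_histograms(img, win=16):
    """Concatenated 256-bin histograms of census codes over a grid of windows."""
    codes = census_transform(img)
    feats = []
    for y in range(0, codes.shape[0] - win + 1, win):
        for x in range(0, codes.shape[1] - win + 1, win):
            block = codes[y:y + win, x:x + win]
            feats.append(np.bincount(block.ravel(), minlength=256))
    return np.concatenate(feats)

def match_score(f1, f2):
    """Histogram intersection as a simple similarity for direct face matching."""
    return int(np.minimum(f1, f2).sum())

rng = np.random.default_rng(3)
face_a, face_b = rng.random((128, 128)), rng.random((128, 128))   # synthetic stand-ins for face images
print(match_score(census_histograms(face_a), census_histograms(face_b)))
```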
Abstract:
Surface defects are extremely important in the mechanical characterization of several different materials. Therefore, the analysis of surface finishing is essential for the subsequent simulation of surface mechanical properties in a customized materials science and technology project. One of the methods commonly employed for this purpose is the statistical mapping of different sample surface regions using the depth-from-focus technique. The analysis is usually performed directly on the elevation maps obtained from digital image processing. In this paper, the possibility of quantifying the surface heterogeneity of silicon carbide porous ceramics by elevation map histograms is presented. The advantage of this technique is that it allows qualitative or quantitative assessment of the entire surface image field, which cannot be achieved with the Surface Plot plugin of the ImageJ™ platform commonly used in digital image processing.
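A minimal sketch of the general idea, assuming a synthetic elevation map split into a grid of regions whose elevation histograms (plus a simple spread statistic) quantify heterogeneity; the grid, bin count, and statistic are illustrative, not the paper's exact procedure.

```python
import numpy as np

def region_histograms(elev, n=4, bins=64):
    """Split the elevation map into an n x n grid and histogram elevations per region."""
    rows = np.array_split(np.arange(elev.shape[0]), n)
    cols = np.array_split(np.arange(elev.shape[1]), n)
    lo, hi = elev.min(), elev.max()
    return np.array([np.histogram(elev[np.ix_(r, c)], bins=bins, range=(lo, hi))[0]
                     for r in rows for c in cols])

elevation = np.random.rand(512, 512) * 40.0   # hypothetical depth-from-focus elevation map (micrometres)
hists = region_histograms(elevation)

# A simple global heterogeneity index from the elevation histogram: interquartile range.
iqr = np.percentile(elevation, 75) - np.percentile(elevation, 25)
```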
Abstract:
Histograms of Oriented Gradients (HOG) provide excellent results in object detection and verification. However, their demanding processing requirements limit their applicability in some critical real-time scenarios, such as video-based on-board vehicle detection systems. In this work, an efficient HOG configuration for pose-based on-board vehicle verification is proposed, which reduces both the processing requirements and the feature vector length without reducing classification performance. The impact of some critical configuration and processing parameters on classification is analyzed in depth to propose a baseline efficient descriptor. Based on the analysis of the contribution of its cells to classification, new view-dependent cell-configuration patterns are proposed, resulting in reduced descriptors that provide an excellent balance between performance and computational requirements, yielding higher verification rates than other works in the literature.
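A minimal sketch of a reduced HOG configuration using scikit-image, in the spirit of the abstract: coarser cells and fewer orientation bins shorten the feature vector. The specific parameter values and patch size are illustrative, not the configurations evaluated in the paper.

```python
import numpy as np
from skimage.feature import hog

def vehicle_descriptor(patch):
    """Reduced HOG: coarse cells and few orientation bins keep the vector short."""
    return hog(patch,
               orientations=6,             # fewer bins than the common default of 9
               pixels_per_cell=(16, 16),   # coarse cells -> fewer cells per patch
               cells_per_block=(2, 2),
               block_norm='L2-Hys',
               feature_vector=True)

patch = np.random.rand(64, 64)              # hypothetical grayscale vehicle patch
print(vehicle_descriptor(patch).shape)      # (216,) for this configuration
```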
Abstract:
Summarizing topological relations is fundamental to many spatial applications, including spatial query optimization. In this paper, we present several novel techniques to effectively construct cell-density-based spatial histograms for range (window) summarizations restricted to the four most important topological relations: contains, contained, overlap, and disjoint. We first present a novel framework to construct a multiscale histogram composed of multiple Euler histograms, with the guarantee of exact summarization results for aligned windows in constant time. Then we present an approximate algorithm, with approximation ratio 19/12, to minimize the storage space of such multiscale Euler histograms, although the problem is generally NP-hard. To conform to a limited storage space where only k Euler histograms are allowed, an effective algorithm is presented to construct multiscale histograms that achieve high accuracy. Finally, we present a new constant-time approximate algorithm for querying an Euler histogram when exact answers cannot be guaranteed. Our extensive experiments on both synthetic and real-world datasets demonstrate that the approximate multiscale histogram techniques may improve the accuracy of existing techniques by several orders of magnitude while retaining cost efficiency, and that the exact multiscale histogram technique requires only a storage space linearly proportional to the number of cells for the real datasets.
Abstract:
Resources created at the University of Southampton for the module Remote Sensing for Earth Observation
Abstract:
A method using the ring-oven technique for pre-concentration in filter paper discs and near-infrared hyperspectral imaging is proposed to identify four detergent and dispersant additives and to determine their concentration in gasoline. Different approaches were used to select the best image data processing in order to gather the relevant spectral information. This was attained by selecting the pixels of the region of interest (ROI) with a pre-calculated threshold on the PCA scores arranged as histograms to select the spectra set; summing the selected spectra to achieve representativeness; and compensating for the superimposed filter paper spectral information, also supported by score histograms for each individual sample. The best classification model was achieved using linear discriminant analysis with a genetic algorithm (LDA/GA), whose correct classification rate in the external validation set was 92%. Prior classification of the type of additive present in the gasoline is necessary to define the PLS model required for its quantitative determination. Considering that two of the additives studied show high spectral similarity, a single PLS regression model was constructed to predict their content in gasoline, while two additional models were used for the remaining additives. The external validation of these regression models showed a mean percentage prediction error ranging from 5 to 15%.
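A minimal sketch of the pixel-selection step described above (thresholding a PCA score histogram to pick the ROI and summing the selected spectra), assuming a synthetic hyperspectral cube; the threshold rule and data shapes are assumptions, not the paper's exact processing.

```python
import numpy as np
from sklearn.decomposition import PCA

cube = np.random.rand(64, 64, 200)             # hypothetical NIR image: H x W x wavelengths
spectra = cube.reshape(-1, cube.shape[-1])     # one spectrum per pixel

scores = PCA(n_components=1).fit_transform(spectra)[:, 0]
threshold = np.percentile(scores, 75)          # e.g. upper quartile of the score histogram
roi = scores > threshold                       # pixels selected as the region of interest

roi_spectrum = spectra[roi].sum(axis=0)        # sum selected pixel spectra for representativeness
```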
Abstract:
The citrus greening (or huanglongbing) disease has caused serious problems in citrus crops around the world. An early diagnostic method to detect this malady is needed owing to the rapid dissemination of Candidatus Liberibacter asiaticus (CLas) in the field. This analytical study investigated the fluorescence responses of leaves from healthy citrus plants and from plants inoculated with CLas, using images from a stereomicroscope, and evaluated their potential for early diagnosis of the infection caused by this bacterium. The plants were measured monthly, and the evolution of the bacteria in inoculated plants was monitored by real-time quantitative polymerase chain reaction (RT-qPCR) amplification of CLas sequences. A statistical method was used to analyse the data: the selection of variables from the colour histograms (colourgrams) of the images was optimized using a paired Student's t-test. The intensity counts for green colours in the fluorescence images showed clearly smaller variations for healthy plants than for diseased ones; darker green colours indicated healthy plants, whereas lighter colours indicated diseased ones. The fluorescence-imaging method is a novel way of fingerprinting healthy and diseased plants and provides an alternative to the current approach of PCR and visual inspection. A new, objective and non-destructive method of analysis is thus introduced that can reduce the time and cost of analyses.
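A minimal sketch of the colourgram idea and the paired-t-test variable selection, assuming per-channel histograms of RGB images and a pairing of the same plants imaged before and after inoculation; all data and thresholds here are illustrative.

```python
import numpy as np
from scipy.stats import ttest_rel

def colourgram(img_rgb, bins=32):
    """Concatenated per-channel colour histograms ('colourgram') of an RGB image."""
    return np.concatenate([np.histogram(img_rgb[..., c], bins=bins, range=(0, 256))[0]
                           for c in range(3)])

# Hypothetical pairing: the same plants imaged before and after inoculation.
rng = np.random.default_rng(4)
before = np.stack([colourgram(rng.integers(0, 256, (128, 128, 3))) for _ in range(10)])
after = np.stack([colourgram(rng.integers(0, 256, (128, 128, 3))) for _ in range(10)])

# Paired t-test per histogram variable; keep the most discriminating counts.
t_stat, p_val = ttest_rel(before, after, axis=0)
selected = np.where(p_val < 0.05)[0]
```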
Abstract:
A network can be analyzed at different topological scales, ranging from single nodes to motifs, communities, and up to the complete structure. We propose a novel approach that extends from the single-node to the whole-network level by considering non-overlapping subgraphs (i.e. connected components) and their interrelationships and distribution throughout the network. Though such subgraphs can be completely general, our methodology focuses on cases in which the nodes of these subgraphs share some special feature, such as being critical for the proper operation of the network. The methodology of subgraph characterization involves two main aspects: (i) the generation of histograms of subgraph sizes and of distances between subgraphs, and (ii) a merging algorithm, developed to assess the relevance of nodes outside subgraphs by progressively merging subgraphs until the whole network is covered. The latter procedure complements the histograms by taking into account the nodes lying between subgraphs, as well as their relevance to the overall subgraph interconnectivity. Experiments were carried out on four types of network models and five instances of real-world networks, illustrating how subgraph characterization can complement complex-network-based studies.
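A minimal sketch of aspect (i) using networkx, assuming the 'special' nodes are given: connected components of the induced subgraph provide the size histogram, and minimum shortest-path distances between components provide the distance histogram. The merging algorithm of aspect (ii) is not sketched, and the example network and node set are illustrative.

```python
import itertools
import networkx as nx

def subgraph_histogram_data(G, special_nodes):
    """Sizes of the connected subgraphs induced by the special nodes, and the
    minimum shortest-path distance in G between each pair of those subgraphs."""
    components = list(nx.connected_components(G.subgraph(special_nodes)))
    sizes = [len(c) for c in components]
    distances = [
        min(nx.shortest_path_length(G, u, v) for u in a for v in b)
        for a, b in itertools.combinations(components, 2)
    ]
    return sizes, distances

G = nx.barabasi_albert_graph(60, 2, seed=0)   # connected synthetic network
special = set(range(0, 60, 4))                # hypothetical 'critical' nodes
sizes, dists = subgraph_histogram_data(G, special)
```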
Abstract:
In this paper, the Galerkin method and the Askey-Wiener scheme are used to obtain approximate solutions to the stochastic displacement response of Kirchhoff plates with uncertain parameters. Theoretical and numerical results are presented. The Lax-Milgram lemma is used to express the conditions for existence and uniqueness of the solution. Uncertainties in plate and foundation stiffness are modeled respecting these conditions, hence using Legendre polynomials indexed in uniform random variables. The space of approximate solutions is built using results on density between the space of continuous functions and Sobolev spaces. Approximate Galerkin solutions are compared with Monte Carlo simulation results in terms of first- and second-order moments and of histograms of the displacement response. Numerical results for two example problems show very fast convergence to the exact solution, with excellent accuracy. The Askey-Wiener Galerkin scheme developed herein is able to reproduce the histogram of the displacement response and is shown to be a theoretically sound and efficient method for the solution of stochastic problems in engineering.
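A minimal sketch of how a displacement histogram can be recovered from a Legendre (uniform-germ) Askey-Wiener expansion at a single point: sample the uniform random variable and evaluate the truncated series. The coefficients below are illustrative, not solutions of the plate problem.

```python
import numpy as np
from numpy.polynomial.legendre import legval

coeffs = np.array([1.0, 0.15, 0.03, 0.005])   # hypothetical chaos coefficients a_k (illustrative)
rng = np.random.default_rng(2)
xi = rng.uniform(-1.0, 1.0, size=100_000)     # samples of the uniform germ
w = legval(xi, coeffs)                        # w(xi) = sum_k a_k * P_k(xi), truncated Legendre series
hist, edges = np.histogram(w, bins=50, density=True)   # histogram of the displacement response
```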