970 resultados para Naïve Bayesian Classification


Relevância:

30.00% 30.00%

Publicador:

Resumo:

The number of remote sensing platforms and sensors rises almost every year, yet much work on the interpretation of land cover is still carried out using either single images or images from the same source taken at different dates. Two questions could be asked of this proliferation of images: can the information contained in different scenes be used to improve the classification accuracy and, what is the best way to combine the different imagery? Two of these multiple image sources are MODIS on the Terra platform and ETM+ on board Landsat7, which are suitably complementary. Daily MODIS images with 36 spectral bands in 250-1000 m spatial resolution and seven spectral bands of ETM+ with 30m and 16 days spatial and temporal resolution respectively are available. In the UK, cloud cover may mean that only a few ETM+ scenes may be available for any particular year and these may not be at the time of year of most interest. The MODIS data may provide information on land cover over the growing season, such as harvest dates, that is not present in the ETM+ data. Therefore, the primary objective of this work is to develop a methodology for the integration of medium spatial resolution Landsat ETM+ image, with multi-temporal, multi-spectral, low-resolution MODIS \Terra images, with the aim of improving the classification of agricultural land. Additionally other data may also be incorporated such as field boundaries from existing maps. When classifying agricultural land cover of the type seen in the UK, where crops are largely sown in homogenous fields with clear and often mapped boundaries, the classification is greatly improved using the mapped polygons and utilising the classification of the polygon as a whole as an apriori probability in classifying each individual pixel using a Bayesian approach. When dealing with multiple images from different platforms and dates it is highly unlikely that the pixels will be exactly co-registered and these pixels will contain a mixture of different real world land covers. Similarly the different atmospheric conditions prevailing during the different days will mean that the same emission from the ground will give rise to different sensor reception. Therefore, a method is presented with a model of the instantaneous field of view and atmospheric effects to enable different remote sensed data sources to be integrated.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This thesis presents an investigation into the application of methods of uncertain reasoning to the biological classification of river water quality. Existing biological methods for reporting river water quality are critically evaluated, and the adoption of a discrete biological classification scheme advocated. Reasoning methods for managing uncertainty are explained, in which the Bayesian and Dempster-Shafer calculi are cited as primary numerical schemes. Elicitation of qualitative knowledge on benthic invertebrates is described. The specificity of benthic response to changes in water quality leads to the adoption of a sensor model of data interpretation, in which a reference set of taxa provide probabilistic support for the biological classes. The significance of sensor states, including that of absence, is shown. Novel techniques of directly eliciting the required uncertainty measures are presented. Bayesian and Dempster-Shafer calculi were used to combine the evidence provided by the sensors. The performance of these automatic classifiers was compared with the expert's own discrete classification of sampled sites. Variations of sensor data weighting, combination order and belief representation were examined for their effect on classification performance. The behaviour of the calculi under evidential conflict and alternative combination rules was investigated. Small variations in evidential weight and the inclusion of evidence from sensors absent from a sample improved classification performance of Bayesian belief and support for singleton hypotheses. For simple support, inclusion of absent evidence decreased classification rate. The performance of Dempster-Shafer classification using consonant belief functions was comparable to Bayesian and singleton belief. Recommendations are made for further work in biological classification using uncertain reasoning methods, including the combination of multiple-expert opinion, the use of Bayesian networks, and the integration of classification software within a decision support system for water quality assessment.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Sentiment analysis has long focused on binary classification of text as either positive or negative. There has been few work on mapping sentiments or emotions into multiple dimensions. This paper studies a Bayesian modeling approach to multi-class sentiment classification and multidimensional sentiment distributions prediction. It proposes effective mechanisms to incorporate supervised information such as labeled feature constraints and document-level sentiment distributions derived from the training data into model learning. We have evaluated our approach on the datasets collected from the confession section of the Experience Project website where people share their life experiences and personal stories. Our results show that using the latent representation of the training documents derived from our approach as features to build a maximum entropy classifier outperforms other approaches on multi-class sentiment classification. In the more difficult task of multi-dimensional sentiment distributions prediction, our approach gives superior performance compared to a few competitive baselines. © 2012 ACM.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents a comparative study of three closely related Bayesian models for unsupervised document level sentiment classification, namely, the latent sentiment model (LSM), the joint sentiment-topic (JST) model, and the Reverse-JST model. Extensive experiments have been conducted on two corpora, the movie review dataset and the multi-domain sentiment dataset. It has been found that while all the three models achieve either better or comparable performance on these two corpora when compared to the existing unsupervised sentiment classification approaches, both JST and Reverse-JST are able to extract sentiment-oriented topics. In addition, Reverse-JST always performs worse than JST suggesting that the JST model is more appropriate for joint sentiment topic detection.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

2000 Mathematics Subject Classification: 62F15.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The real purpose of collecting big data is to identify causality in the hope that this will facilitate credible predictivity . But the search for causality can trap one into infinite regress, and thus one takes refuge in seeking associations between variables in data sets. Regrettably, the mere knowledge of associations does not enable predictivity. Associations need to be embedded within the framework of probability calculus to make coherent predictions. This is so because associations are a feature of probability models, and hence they do not exist outside the framework of a model. Measures of association, like correlation, regression, and mutual information merely refute a preconceived model. Estimated measures of associations do not lead to a probability model; a model is the product of pure thought. This paper discusses these and other fundamentals that are germane to seeking associations in particular, and machine learning in general. ACM Computing Classification System (1998): H.1.2, H.2.4., G.3.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

2010 Mathematics Subject Classification: 94A17.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

2000 Mathematics Subject Classification: 62E16, 65C05, 65C20.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

2000 Mathematics Subject Classification: 62E16,62F15, 62H12, 62M20.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Requirement engineering is a key issue in the development of a software project. Like any other development activity it is not without risks. This work is about the empirical study of risks of requirements by applying machine learning techniques, specifically Bayesian networks classifiers. We have defined several models to predict the risk level for a given requirement using three dataset that collect metrics taken from the requirement specifications of different projects. The classification accuracy of the Bayesian models obtained is evaluated and compared using several classification performance measures. The results of the experiments show that the Bayesians networks allow obtaining valid predictors. Specifically, a tree augmented network structure shows a competitive experimental performance in all datasets. Besides, the relations established between the variables collected to determine the level of risk in a requirement, match with those set by requirement engineers. We show that Bayesian networks are valid tools for the automation of risks assessment in requirement engineering.