247 results for inductive inference

in Queensland University of Technology - ePrints Archive


Relevance: 60.00%

Abstract:

We generalize the classical notion of Vapnik–Chervonenkis (VC) dimension to ordinal VC-dimension, in the context of logical learning paradigms. Logical learning paradigms encompass the numerical learning paradigms commonly studied in Inductive Inference. A logical learning paradigm is defined as a set W of structures over some vocabulary, and a set D of first-order formulas that represent data. The sets of models of ϕ in W, where ϕ varies over D, generate a natural topology W over W. We show that if D is closed under boolean operators, then the notion of ordinal VC-dimension offers a perfect characterization for the problem of predicting the truth of the members of D in a member of W, with an ordinal bound on the number of mistakes. This shows that the notion of VC-dimension has a natural interpretation in Inductive Inference, when cast into a logical setting. We also study the relationships between predictive complexity, selective complexity (a variation on predictive complexity) and mind change complexity. The assumptions that D is closed under boolean operators and that W is compact often play a crucial role in establishing connections between these concepts. We then consider a computable setting with effective versions of the complexity measures, and show that the equivalence between ordinal VC-dimension and predictive complexity fails. More precisely, we prove that the effective ordinal VC-dimension of a paradigm can be defined when all other effective notions of complexity are undefined. On a better note, when W is compact, all effective notions of complexity are defined, though they are not related as in the noncomputable version of the framework.
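The ordinal VC-dimension in the abstract above generalizes the classical finite notion. As a point of reference only, the following minimal Python sketch computes the classical VC dimension of a finite concept class over a finite domain by brute force. The names `is_shattered` and `vc_dimension` and the two toy classes are illustrative assumptions, not taken from the paper, and the sketch does not attempt the ordinal or logical version.

```python
from itertools import combinations


def is_shattered(points, concepts):
    """Return True if every subset of `points` equals (concept ∩ points) for some concept."""
    points = tuple(points)
    traces = {frozenset(p for p in points if p in c) for c in concepts}
    return len(traces) == 2 ** len(points)


def vc_dimension(domain, concepts):
    """Size of the largest subset of `domain` shattered by a non-empty class `concepts`."""
    domain = list(domain)
    concepts = [frozenset(c) for c in concepts]
    best = 0
    for k in range(1, len(domain) + 1):
        if any(is_shattered(s, concepts) for s in combinations(domain, k)):
            best = k
        else:
            # Every subset of a shattered set is shattered, so no larger set can be.
            break
    return best


# Toy classes over the domain {0, 1, 2, 3} (assumed for illustration).
domain = range(4)
thresholds = [set(range(t, 4)) for t in range(5)]                      # {x : x >= t}
intervals = [set(range(a, b)) for a in range(5) for b in range(a, 5)]  # {x : a <= x < b}

print(vc_dimension(domain, thresholds))  # 1
print(vc_dimension(domain, intervals))   # 2
```

The brute-force search is exponential in the domain size, so it is only meant to make the definition of shattering concrete on small examples.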

Relevance: 60.00%

Abstract:

The present paper motivates the study of mind change complexity for learning minimal models of length-bounded logic programs. It establishes ordinal mind change complexity bounds for learnability of these classes both from positive facts and from positive and negative facts. Building on Angluin’s notion of finite thickness and Wright’s work on finite elasticity, Shinohara defined the property of bounded finite thickness to give a sufficient condition for learnability of indexed families of computable languages from positive data. This paper shows that an effective version of Shinohara’s notion of bounded finite thickness gives sufficient conditions for learnability with ordinal mind change bound, both in the context of learnability from positive data and for learnability from complete (both positive and negative) data. Let Omega be a notation for the first limit ordinal. Then, it is shown that if a language-defining framework yields a uniformly decidable family of languages and has effective bounded finite thickness, then for each natural number m > 0, the class of languages defined by formal systems of length ≤ m:
• is identifiable in the limit from positive data with a mind change bound of Omega^m;
• is identifiable in the limit from both positive and negative data with an ordinal mind change bound of Omega × m.
The above sufficient conditions are employed to give an ordinal mind change bound for learnability of minimal models of various classes of length-bounded Prolog programs, including Shapiro’s linear programs, Arimura and Shinohara’s depth-bounded linearly covering programs, and Krishna Rao’s depth-bounded linearly moded programs. It is also noted that the bound for learning from positive data is tight for the example classes considered.

Relevance: 60.00%

Abstract:

In the context of learning paradigms of identification in the limit, we address the question: why is uncertainty sometimes desirable? We use mind change bounds on the output hypotheses as a measure of uncertainty and interpret ‘desirable’ as reduction in data memorization, also defined in terms of mind change bounds. The resulting model is closely related to iterative learning with bounded mind change complexity, but the dual use of mind change bounds, for hypotheses and for data, is a key distinctive feature of our approach. We show that situations exist where the more mind changes the learner is willing to accept, the less data it needs to remember in order to converge to the correct hypothesis. We also investigate relationships between our model and learning from good examples, set-driven, monotonic and strong-monotonic learners, as well as class-comprising versus class-preserving learnability.

Relevance: 20.00%

Abstract:

Background: The problem of silent multiple comparisons is one of the most difficult statistical problems faced by scientists. It is a particular problem for investigating a one-off cancer cluster reported to a health department because any one of hundreds, or possibly thousands, of neighbourhoods, schools, or workplaces could have reported a cluster, which could have been for any one of several types of cancer or any one of several time periods. Methods: This paper contrasts the frequentist approach with a Bayesian approach for dealing with silent multiple comparisons in the context of a one-off cluster reported to a health department. Two published cluster investigations were re-analysed using the Dunn-Šidák method to adjust frequentist p-values and confidence intervals for silent multiple comparisons. Bayesian methods were based on the Gamma distribution. Results: Bayesian analysis with non-informative priors produced results similar to the frequentist analysis, and suggested that both clusters represented a statistical excess. In the frequentist framework, the statistical significance of both clusters was extremely sensitive to the number of silent multiple comparisons, which can only ever be a subjective "guesstimate". The Bayesian approach is also subjective: whether there is an apparent statistical excess depends on the specified prior. Conclusion: In cluster investigations, the frequentist approach is just as subjective as the Bayesian approach, but the Bayesian approach is less ambitious in that it treats the analysis as a synthesis of data and personal judgements (possibly poor ones), rather than objective reality. Bayesian analysis is (arguably) a useful tool to support complicated decision-making, because it makes the uncertainty associated with silent multiple comparisons explicit.
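To make the sensitivity to the number of silent comparisons concrete, here is a minimal Python sketch of the standard Dunn-Šidák adjustment, which maps an unadjusted p-value p and k comparisons to 1 - (1 - p)^k. The function names, the example p-value of 0.001 and the candidate values of k are assumptions chosen for illustration; they are not figures from the paper.

```python
def sidak_adjusted_p(p, k):
    """Dunn-Sidak adjusted p-value for one test out of k independent comparisons."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if k < 1:
        raise ValueError("k must be at least 1")
    return 1.0 - (1.0 - p) ** k


def sidak_confidence_level(level, k):
    """Per-interval confidence level needed so that k intervals jointly reach `level`."""
    if not 0.0 < level < 1.0:
        raise ValueError("level must lie strictly between 0 and 1")
    if k < 1:
        raise ValueError("k must be at least 1")
    return level ** (1.0 / k)


# Hypothetical unadjusted p-value, re-evaluated under different guesses for k.
unadjusted_p = 0.001
for k in (1, 10, 100, 1000):
    print(k, round(sidak_adjusted_p(unadjusted_p, k), 4))
```

Because k is a guess, the same unadjusted p-value can look highly significant or unremarkable depending on that guess, which is the subjectivity the abstract points to.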

Relevance: 20.00%

Abstract:

Image annotation is a significant step towards semantic-based image retrieval. Ontologies are a popular approach to semantic representation and have been intensively studied for multimedia analysis. However, relations among concepts are seldom used to extract higher-level semantics. Moreover, ontology inference is often crisp. This paper aims to enable sophisticated semantic querying of images, and contributes 1) an ontology framework that contains both visual and contextual knowledge, and 2) a probabilistic inference approach for reasoning about high-level concepts based on different sources of information. An experiment on a natural scene dataset from the LabelMe database shows encouraging results.

Relevance: 20.00%

Abstract:

Automatic recognition of semantic information such as salient objects and mid-level concepts from images remains a challenging task. Since real-world objects tend to exist in a context within their environment, computer vision researchers have increasingly incorporated contextual information to improve object recognition. In this paper, we present a method to build a visual contextual ontology from salient object descriptions for image annotation. The ontologies include not only partOf/kindOf relations, but also spatial and co-occurrence relations. A two-step image annotation algorithm based on ontology relations and probabilistic inference is also proposed. Unlike most existing work, we specifically explore how to combine ontology representation, contextual knowledge and probabilistic inference. The experiments show that image annotation results are improved on the LabelMe dataset.

Relevance: 20.00%

Abstract:

Phase-type distributions represent the time to absorption for a finite state Markov chain in continuous time, generalising the exponential distribution and providing a flexible and useful modelling tool. We present a new reversible jump Markov chain Monte Carlo scheme for performing a fully Bayesian analysis of the popular Coxian subclass of phase-type models; the convenient Coxian representation involves fewer parameters than a more general phase-type model. The key novelty of our approach is that we model covariate dependence in the mean whilst using the Coxian phase-type model as a very general residual distribution. Such incorporation of covariates into the model has not previously been attempted in the Bayesian literature. A further novelty is that we also propose a reversible jump scheme for investigating structural changes to the model brought about by the introduction of Erlang phases. Our approach addresses more questions of inference than previous Bayesian treatments of this model and is automatic in nature. We analyse an example dataset comprising lengths of hospital stays of a sample of patients collected from two Australian hospitals to produce a model for a patient's expected length of stay which incorporates the effects of several covariates. This leads to interesting conclusions about what contributes to length of hospital stay with implications for hospital planning. We compare our results with an alternative classical analysis of these data.
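As a rough illustration of the Coxian structure mentioned above, the Python sketch below computes the mean time to absorption and draws samples from a Coxian phase-type distribution. The parameterization (a total exit rate per phase plus a probability of continuing to the next phase) and the names `coxian_mean` and `sample_coxian` are assumptions made for this sketch; it does not reproduce the paper's reversible jump scheme or its covariate model.

```python
import random


def coxian_mean(rates, cont_probs):
    """Mean time to absorption of a Coxian phase-type distribution.

    rates[i]      : total exit rate of phase i (phases are visited in order).
    cont_probs[i] : probability of moving from phase i to phase i + 1
                    rather than being absorbed; the last phase always absorbs.
    """
    if len(cont_probs) != len(rates) - 1:
        raise ValueError("need exactly one continuation probability per non-final phase")
    mean, reach = 0.0, 1.0  # reach = probability that phase i is visited at all
    for i, rate in enumerate(rates):
        mean += reach / rate
        if i < len(cont_probs):
            reach *= cont_probs[i]
    return mean


def sample_coxian(rates, cont_probs, rng):
    """Draw one absorption time by walking through the phases."""
    t = 0.0
    for i, rate in enumerate(rates):
        t += rng.expovariate(rate)  # exponential sojourn in phase i
        last_phase = i == len(rates) - 1
        if last_phase or rng.random() >= cont_probs[i]:
            break  # absorbed
    return t


# One phase reduces to the exponential distribution; continuation probabilities
# of 1 with equal rates give an Erlang distribution.
rng = random.Random(0)
erlang_rates, erlang_probs = [2.0, 2.0, 2.0], [1.0, 1.0]
draws = [sample_coxian(erlang_rates, erlang_probs, rng) for _ in range(20000)]
print(coxian_mean(erlang_rates, erlang_probs), sum(draws) / len(draws))
```

The two special cases in the closing comment show how the Coxian family contains both the exponential and the Erlang distributions.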

Relevance: 20.00%

Abstract:

We present a novel approach for developing summary statistics for use in approximate Bayesian computation (ABC) algorithms by using indirect inference. ABC methods are useful for posterior inference in the presence of an intractable likelihood function. In the indirect inference approach to ABC the parameters of an auxiliary model fitted to the data become the summary statistics. Although applicable to any ABC technique, we embed this approach within a sequential Monte Carlo algorithm that is completely adaptive and requires very little tuning. This methodological development was motivated by an application involving data on macroparasite population evolution modelled by a trivariate stochastic process for which there is no tractable likelihood function. The auxiliary model here is based on a beta–binomial distribution. The main objective of the analysis is to determine which parameters of the stochastic model are estimable from the observed data on mature parasite worms.
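The idea of using fitted auxiliary-model parameters as ABC summary statistics can be sketched in a few lines of Python. This is a deliberately simplified illustration under stated assumptions: it uses plain rejection ABC rather than the adaptive sequential Monte Carlo algorithm described above, a hypothetical binomial toy simulator rather than the macroparasite model, and a normal auxiliary model rather than the beta–binomial one. All function names are invented for the sketch.

```python
import math
import random
import statistics


def simulate(theta, n, rng):
    """Hypothetical toy simulator: n counts, each Binomial(20, theta)."""
    return [sum(rng.random() < theta for _ in range(20)) for _ in range(n)]


def auxiliary_fit(data):
    """Fit the auxiliary model (assumed normal here) by maximum likelihood.

    The fitted parameters (mean, standard deviation) serve as summary statistics.
    """
    return (statistics.fmean(data), statistics.pstdev(data))


def abc_rejection(observed, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summaries land close to the observed ones."""
    s_obs = auxiliary_fit(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.random()  # Uniform(0, 1) prior, an assumption of this sketch
        s_sim = auxiliary_fit(simulate(theta, len(observed), rng))
        if math.dist(s_obs, s_sim) <= tolerance:
            accepted.append(theta)
    return accepted


rng = random.Random(1)
observed = simulate(0.3, 50, rng)  # pretend these are the observed data
posterior_sample = abc_rejection(observed, n_draws=2000, tolerance=0.5, rng=rng)
print(len(posterior_sample), statistics.fmean(posterior_sample))
```

The likelihood of the simulator is never evaluated; only the ability to simulate from it and to fit the auxiliary model is required, which is what makes the approach usable when the likelihood is intractable.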

Relevance: 20.00%

Abstract:

This paper presents a fault diagnosis method based on an adaptive neuro-fuzzy inference system (ANFIS) in combination with decision trees. Classification and regression tree (CART), one of the decision tree methods, is used as a feature selection procedure to select pertinent features from the data set. The crisp rules obtained from the decision tree are then converted to fuzzy if-then rules that are employed to identify the structure of the ANFIS classifier. A hybrid of the back-propagation and least-squares algorithms is used to tune the parameters of the membership functions. To evaluate the proposed algorithm, data sets obtained from vibration signals and current signals of induction motors are used. The results indicate that the CART–ANFIS model has potential for fault diagnosis of induction motors.