30 resultados para COMBINING CLASSIFIERS
em Universidad Politécnica de Madrid
Resumo:
In the recent years, the computer vision community has shown great interest on depth-based applications thanks to the performance and flexibility of the new generation of RGB-D imagery. In this paper, we present an efficient background subtraction algorithm based on the fusion of multiple region-based classifiers that processes depth and color data provided by RGB-D cameras. Foreground objects are detected by combining a region-based foreground prediction (based on depth data) with different background models (based on a Mixture of Gaussian algorithm) providing color and depth descriptions of the scene at pixel and region level. The information given by these modules is fused in a mixture of experts fashion to improve the foreground detection accuracy. The main contributions of the paper are the region-based models of both background and foreground, built from the depth and color data. The obtained results using different database sequences demonstrate that the proposed approach leads to a higher detection accuracy with respect to existing state-of-the-art techniques.
Resumo:
This paper presents the 2005 Miracle’s team approach to the Ad-Hoc Information Retrieval tasks. The goal for the experiments this year was twofold: to continue testing the effect of combination approaches on information retrieval tasks, and improving our basic processing and indexing tools, adapting them to new languages with strange encoding schemes. The starting point was a set of basic components: stemming, transforming, filtering, proper nouns extraction, paragraph extraction, and pseudo-relevance feedback. Some of these basic components were used in different combinations and order of application for document indexing and for query processing. Second-order combinations were also tested, by averaging or selective combination of the documents retrieved by different approaches for a particular query. In the multilingual track, we concentrated our work on the merging process of the results of monolingual runs to get the overall multilingual result, relying on available translations. In both cross-lingual tracks, we have used available translation resources, and in some cases we have used a combination approach.
Resumo:
This article proposes a MAS architecture for network diagnosis under uncertainty. Network diagnosis is divided into two inference processes: hypothesis generation and hypothesis confirmation. The first process is distributed among several agents based on a MSBN, while the second one is carried out by agents using semantic reasoning. A diagnosis ontology has been defined in order to combine both inference processes. To drive the deliberation process, dynamic data about the influence of observations are taken during diagnosis process. In order to achieve quick and reliable diagnoses, this influence is used to choose the best action to perform. This approach has been evaluated in a P2P video streaming scenario. Computational and time improvements are highlight as conclusions.
Resumo:
This paper discusses a novel hybrid approach for text categorization that combines a machine learning algorithm, which provides a base model trained with a labeled corpus, with a rule-based expert system, which is used to improve the results provided by the previous classifier, by filtering false positives and dealing with false negatives. The main advantage is that the system can be easily fine-tuned by adding specific rules for those noisy or conflicting categories that have not been successfully trained. We also describe an implementation based on k-Nearest Neighbor and a simple rule language to express lists of positive, negative and relevant (multiword) terms appearing in the input text. The system is evaluated in several scenarios, including the popular Reuters-21578 news corpus for comparison to other approaches, and categorization using IPTC metadata, EUROVOC thesaurus and others. Results show that this approach achieves a precision that is comparable to top ranked methods, with the added value that it does not require a demanding human expert workload to train
Resumo:
Ontologies and taxonomies are widely used to organize concepts providing the basis for activities such as indexing, and as background knowledge for NLP tasks. As such, translation of these resources would prove useful to adapt these systems to new languages. However, we show that the nature of these resources is significantly different from the "free-text" paradigm used to train most statistical machine translation systems. In particular, we see significant differences in the linguistic nature of these resources and such resources have rich additional semantics. We demonstrate that as a result of these linguistic differences, standard SMT methods, in particular evaluation metrics, can produce poor performance. We then look to the task of leveraging these semantics for translation, which we approach in three ways: by adapting the translation system to the domain of the resource; by examining if semantics can help to predict the syntactic structure used in translation; and by evaluating if we can use existing translated taxonomies to disambiguate translations. We present some early results from these experiments, which shed light on the degree of success we may have with each approach
Resumo:
This article considers static analysis based on abstract interpretation of logic programs over combined domains. It is known that analyses over combined domains provide more information potentially than obtained by the independent analyses. However, the construction of a combined analysis often requires redefining the basic operations for the combined domain. A practical approach to maintain precision in combined analyses of logic programs which reuses the individual analyses and does not redefine the basic operations is illustrated. The advantages of the approach are that proofs of correctness for the new domains are not required and implementations can be reused. The approach is demonstrated by showing that a combined sharing analysis — constructed from "old" proposals — compares well with other "new" proposals suggested in recent literature both from the point of view of efficiency and accuracy.
Resumo:
All-terrain robot locomotion is an active topic of research. Search and rescue maneuvers and exploratory missions could benefit from robots with the abilities of real animals. However, technological barriers exist to ultimately achieving the actuation system, which is able to meet the exigent requirements of these robots. This paper describes the locomotioncontrol of a leg prototype, designed and developed to make a quadruped walk dynamically while exhibiting compliant interaction with the environment. The actuation system of the leg is based on the hybrid use of series elasticity and magneto-rheological dampers, which provide variable compliance for natural-looking motion and improved interaction with the ground. The locomotioncontrol architecture has been proposed to exploit natural leg dynamics in order to improve energy efficiency. Results show that the controller achieves a significant reduction in energy consumption during the leg swing phase thanks to the exploitation of inherent leg dynamics. Added to this, experiments with the real leg prototype show that the combined use of series elasticity and magneto-rheologicaldamping at the knee provide a 20 % reduction in the energy wasted in braking the knee during its extension in the leg stance phase.
Resumo:
The main purpose of a gene interaction network is to map the relationships of the genes that are out of sight when a genomic study is tackled. DNA microarrays allow the measure of gene expression of thousands of genes at the same time. These data constitute the numeric seed for the induction of the gene networks. In this paper, we propose a new approach to build gene networks by means of Bayesian classifiers, variable selection and bootstrap resampling. The interactions induced by the Bayesian classifiers are based both on the expression levels and on the phenotype information of the supervised variable. Feature selection and bootstrap resampling add reliability and robustness to the overall process removing the false positive findings. The consensus among all the induced models produces a hierarchy of dependences and, thus, of variables. Biologists can define the depth level of the model hierarchy so the set of interactions and genes involved can vary from a sparse to a dense set. Experimental results show how these networks perform well on classification tasks. The biological validation matches previous biological findings and opens new hypothesis for future studies
Resumo:
Effective static analyses have been proposed which infer bounds on the number of resolutions. These have the advantage of being independent from the platform on which the programs are executed and have been shown to be useful in a number of applications, such as granularity control in parallel execution. On the other hand, in distributed computation scenarios where platforms with different capabilities come into play, it is necessary to express costs in metrics that include the characteristics of the platform. In particular, it is specially interesting to be able to infer upper and lower bounds on actual execution times. With this objective in mind, we propose an approach which combines compile-time analysis for cost bounds with a one-time profiling of a given platform in order to determine the valúes of certain parameters for that platform. These parameters calibrate a cost model which, from then on, is able to compute statically time bound functions for procedures and to predict with a significant degree of accuracy the execution times of such procedures in that concrete platform. The approach has been implemented and integrated in the CiaoPP system.
Resumo:
Nondeterminism and partially instantiated data structures give logic programming expressive power beyond that of functional programming. However, functional programming often provides convenient syntactic features, such as having a designated implicit output argument, which allow function cali nesting and sometimes results in more compact code. Functional programming also sometimes allows a more direct encoding of lazy evaluation, with its ability to deal with infinite data structures. We present a syntactic functional extensión, used in the Ciao system, which can be implemented in ISO-standard Prolog systems and covers function application, predefined evaluable functors, functional definitions, quoting, and lazy evaluation. The extensión is also composable with higher-order features and can be combined with other extensions to ISO-Prolog such as constraints. We also highlight the features of the Ciao system which help implementation and present some data on the overhead of using lazy evaluation with respect to eager evaluation.
Resumo:
OGOLOD is a Linked Open Data dataset derived from different biomedical resources by an automated pipeline, using a tailored ontology as a scaffold. The key contribution of OGOLOD is that it links, in new RDF triples, genetic human diseases and orthologous genes, paving the way for a more efficient translational biomedical research exploiting the Linked Open Data cloud.
Resumo:
Effective static analyses have been proposed which allow inferring functions which bound the number of resolutions or reductions. These have the advantage of being independent from the platform on which the programs are executed and such bounds have been shown useful in a number of applications, such as granularity control in parallel execution. On the other hand, in certain distributed computation scenarios where different platforms come into play, with each platform having different capabilities, it is more interesting to express costs in metrics that include the characteristics of the platform. In particular, it is specially interesting to be able to infer upper and lower bounds on actual execution time. With this objective in mind, we propose a method which allows inferring upper and lower bounds on the execution times of procedures of a program in a given execution platform. The approach combines compile-time cost bounds analysis with a one-time profiling of the platform in order to determine the values of certain constants for that platform. These constants calibrate a cost model which from then on is able to compute statically time bound functions for procedures and to predict with a significant degree of accuracy the execution times of such procedures in the given platform. The approach has been implemented and integrated in the CiaoPP system.
Resumo:
Multi-dimensional Bayesian network classifiers (MBCs) are probabilistic graphical models recently proposed to deal with multi-dimensional classification problems, where each instance in the data set has to be assigned to more than one class variable. In this paper, we propose a Markov blanket-based approach for learning MBCs from data. Basically, it consists of determining the Markov blanket around each class variable using the HITON algorithm, then specifying the directionality over the MBC subgraphs. Our approach is applied to the prediction problem of the European Quality of Life-5 Dimensions (EQ-5D) from the 39-item Parkinson’s Disease Questionnaire (PDQ-39) in order to estimate the health-related quality of life of Parkinson’s patients. Fivefold cross-validation experiments were carried out on randomly generated synthetic data sets, Yeast data set, as well as on a real-world Parkinson’s disease data set containing 488 patients. The experimental study, including comparison with additional Bayesian network-based approaches, back propagation for multi-label learning, multi-label k-nearest neighbor, multinomial logistic regression, ordinary least squares, and censored least absolute deviations, shows encouraging results in terms of predictive accuracy as well as the identification of dependence relationships among class and feature variables.
Resumo:
In this position paper, we claim that the need for time consuming data preparation and result interpretation tasks in knowledge discovery, as well as for costly expert consultation and consensus building activities required for ontology building can be reduced through exploiting the interplay of data mining and ontology engineering. The aim is to obtain in a semi-automatic way new knowledge from distributed data sources that can be used for inference and reasoning, as well as to guide the extraction of further knowledge from these data sources. The proposed approach is based on the creation of a novel knowledge discovery method relying on the combination, through an iterative ?feedbackloop?, of (a) data mining techniques to make emerge implicit models from data and (b) pattern-based ontology engineering to capture these models in reusable, conceptual and inferable artefacts.
Resumo:
The boundary element method (BEM) has been applied successfully to many engineering problems during the last decades. Compared with domain type methods like the finite element method (FEM) or the finite difference method (FDM) the BEM can handle problems where the medium extends to infinity much easier than domain type methods as there is no need to develop special boundary conditions (quiet or absorbing boundaries) or infinite elements at the boundaries introduced to limit the domain studied. The determination of the dynamic stiffness of arbitrarily shaped footings is just one of these fields where the BEM has been the method of choice, especially in the 1980s. With the continuous development of computer technology and the available hardware equipment the size of the problems under study grew and, as the flop count for solving the resulting linear system of equations grows with the third power of the number of equations, there was a need for the development of iterative methods with better performance. In [1] the GMRES algorithm was presented which is now widely used for implementations of the collocation BEM. While the FEM results in sparsely populated coefficient matrices, the BEM leads, in general, to fully or densely populated ones, depending on the number of subregions, posing a serious memory problem even for todays computers. If the geometry of the problem permits the surface of the domain to be meshed with equally shaped elements a lot of the resulting coefficients will be calculated and stored repeatedly. The present paper shows how these unnecessary operations can be avoided reducing the calculation time as well as the storage requirement. To this end a similar coefficient identification algorithm (SCIA), has been developed and implemented in a program written in Fortran 90. The vertical dynamic stiffness of a single pile in layered soil has been chosen to test the performance of the implementation. The results obtained with the 3-d model may be compared with those obtained with an axisymmetric formulation which are considered to be the reference values as the mesh quality is much better. The entire 3D model comprises more than 35000 dofs being a soil region with 21168 dofs the biggest single region. Note that the memory necessary to store all coefficients of this single region is about 6.8 GB, an amount which is usually not available with personal computers. In the problem under study the interface zone between the two adjacent soil regions as well as the surface of the top layer may be meshed with equally sized elements. In this case the application of the SCIA leads to an important reduction in memory requirements. The maximum memory used during the calculation has been reduced to 1.2 GB. The application of the SCIA thus permits problems to be solved on personal computers which otherwise would require much more powerful hardware.