Biblioteca Digital

871 resultados para Artificial Intelligence|Computer Science

On learning context-free and context-sensitive languages

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The long short-term memory (LSTM) is not the only neural network which learns a context sensitive language. Second-order sequential cascaded networks (SCNs) are able to induce means from a finite fragment of a context-sensitive language for processing strings outside the training set. The dynamical behavior of the SCN is qualitatively distinct from that observed in LSTM networks. Differences in performance and dynamics are discussed.

Model-based clustering in gene expression microarrays: An application to breast cancer data

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In microarray studies, the application of clustering techniques is often used to derive meaningful insights into the data. In the past, hierarchical methods have been the primary clustering tool employed to perform this task. The hierarchical algorithms have been mainly applied heuristically to these cluster analysis problems. Further, a major limitation of these methods is their inability to determine the number of clusters. Thus there is a need for a model-based approach to these. clustering problems. To this end, McLachlan et al. [7] developed a mixture model-based algorithm (EMMIX-GENE) for the clustering of tissue samples. To further investigate the EMMIX-GENE procedure as a model-based -approach, we present a case study involving the application of EMMIX-GENE to the breast cancer data as studied recently in van 't Veer et al. [10]. Our analysis considers the problem of clustering the tissue samples on the basis of the genes which is a non-standard problem because the number of genes greatly exceed the number of tissue samples. We demonstrate how EMMIX-GENE can be useful in reducing the initial set of genes down to a more computationally manageable size. The results from this analysis also emphasise the difficulty associated with the task of separating two tissue groups on the basis of a particular subset of genes. These results also shed light on why supervised methods have such a high misallocation error rate for the breast cancer data.

Mapping the royal road and other hierarchical functions

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper we present a technique for visualising hierarchical and symmetric, multimodal fitness functions that have been investigated in the evolutionary computation literature. The focus of this technique is on landscapes in moderate-dimensional, binary spaces (i.e., fitness functions defined over {0, 1}(n), for n less than or equal to 16). The visualisation approach involves an unfolding of the hyperspace into a two-dimensional graph, whose layout represents the topology of the space using a recursive relationship, and whose shading defines the shape of the cost surface defined on the space. Using this technique we present case-study explorations of three fitness functions: royal road, hierarchical-if-and-only-if (H-IFF), and hierarchically decomposable functions (HDF). The visualisation approach provides an insight into the properties of these functions, particularly with respect to the size and shape of the basins of attraction around each of the local optima.

A Survey on Graphical Methods for Classification Predictive Performance Evaluation

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Predictive performance evaluation is a fundamental issue in design, development, and deployment of classification systems. As predictive performance evaluation is a multidimensional problem, single scalar summaries such as error rate, although quite convenient due to its simplicity, can seldom evaluate all the aspects that a complete and reliable evaluation must consider. Due to this, various graphical performance evaluation methods are increasingly drawing the attention of machine learning, data mining, and pattern recognition communities. The main advantage of these types of methods resides in their ability to depict the trade-offs between evaluation aspects in a multidimensional space rather than reducing these aspects to an arbitrarily chosen (and often biased) single scalar measure. Furthermore, to appropriately select a suitable graphical method for a given task, it is crucial to identify its strengths and weaknesses. This paper surveys various graphical methods often used for predictive performance evaluation. By presenting these methods in the same framework, we hope this paper may shed some light on deciding which methods are more suitable to use in different situations.

Efficiency issues of evolutionary k-means

Relevância:

100.00% 100.00%

Publicador:

Resumo:

One of the top ten most influential data mining algorithms, k-means, is known for being simple and scalable. However, it is sensitive to initialization of prototypes and requires that the number of clusters be specified in advance. This paper shows that evolutionary techniques conceived to guide the application of k-means can be more computationally efficient than systematic (i.e., repetitive) approaches that try to get around the above-mentioned drawbacks by repeatedly running the algorithm from different configurations for the number of clusters and initial positions of prototypes. To do so, a modified version of a (k-means based) fast evolutionary algorithm for clustering is employed. Theoretical complexity analyses for the systematic and evolutionary algorithms under interest are provided. Computational experiments and statistical analyses of the results are presented for artificial and text mining data sets. (C) 2010 Elsevier B.V. All rights reserved.

OntoMap: an ontology-based architecture to perform the semantic mapping between an interlingua and software components

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper is about the use of natural language to communicate with computers. Most researches that have pursued this goal consider only requests expressed in English. A way to facilitate the use of several languages in natural language systems is by using an interlingua. An interlingua is an intermediary representation for natural language information that can be processed by machines. We propose to convert natural language requests into an interlingua [universal networking language (UNL)] and to execute these requests using software components. In order to achieve this goal, we propose OntoMap, an ontology-based architecture to perform the semantic mapping between UNL sentences and software components. OntoMap also performs component search and retrieval based on semantic information formalized in ontologies and rules.

On the efficiency of evolutionary fuzzy clustering

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper tackles the problem of showing that evolutionary algorithms for fuzzy clustering can be more efficient than systematic (i.e. repetitive) approaches when the number of clusters in a data set is unknown. To do so, a fuzzy version of an Evolutionary Algorithm for Clustering (EAC) is introduced. A fuzzy cluster validity criterion and a fuzzy local search algorithm are used instead of their hard counterparts employed by EAC. Theoretical complexity analyses for both the systematic and evolutionary algorithms under interest are provided. Examples with computational experiments and statistical analyses are also presented.

Generating Segmented Quality Meshes from Images

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Techniques devoted to generating triangular meshes from intensity images either take as input a segmented image or generate a mesh without distinguishing individual structures contained in the image. These facts may cause difficulties in using such techniques in some applications, such as numerical simulations. In this work we reformulate a previously developed technique for mesh generation from intensity images called Imesh. This reformulation makes Imesh more versatile due to an unified framework that allows an easy change of refinement metric, rendering it effective for constructing meshes for applications with varied requirements, such as numerical simulation and image modeling. Furthermore, a deeper study about the point insertion problem and the development of geometrical criterion for segmentation is also reported in this paper. Meshes with theoretical guarantee of quality can also be obtained for each individual image structure as a post-processing step, a characteristic not usually found in other methods. The tests demonstrate the flexibility and the effectiveness of the approach.

Topological triangle characterization with application to object detection from images

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A novel mathematical framework inspired on Morse Theory for topological triangle characterization in 2D meshes is introduced that is useful for applications involving the creation of mesh models of objects whose geometry is not known a priori. The framework guarantees a precise control of topological changes introduced as a result of triangle insertion/removal operations and enables the definition of intuitive high-level operators for managing the mesh while keeping its topological integrity. An application is described in the implementation of an innovative approach for the detection of 2D objects from images that integrates the topological control enabled by geometric modeling with traditional image processing techniques. (C) 2008 Published by Elsevier B.V.

EXPLOITING A GENERIC APPROACH TO CONSTRUCT COMPONENT-BASED SYSTEMS SOFTWARE IN LINUX ENVIRONMENTS

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Component-based software engineering has recently emerged as a promising solution to the development of system-level software. Unfortunately, current approaches are limited to specific platforms and domains. This lack of generality is particularly problematic as it prevents knowledge sharing and generally drives development costs up. In the past, we have developed a generic approach to component-based software engineering for system-level software called OpenCom. In this paper, we present OpenComL an instantiation of OpenCom to Linux environments and show how it can be profiled to meet a range of system-level software in Linux environments. For this, we demonstrate its application to constructing a programmable router platform and a middleware for parallel environments.

Inference From Aging Information

Relevância:

100.00% 100.00%

Publicador:

Resumo:

For many learning tasks the duration of the data collection can be greater than the time scale for changes of the underlying data distribution. The question we ask is how to include the information that data are aging. Ad hoc methods to achieve this include the use of validity windows that prevent the learning machine from making inferences based on old data. This introduces the problem of how to define the size of validity windows. In this brief, a new adaptive Bayesian inspired algorithm is presented for learning drifting concepts. It uses the analogy of validity windows in an adaptive Bayesian way to incorporate changes in the data distribution over time. We apply a theoretical approach based on information geometry to the classification problem and measure its performance in simulations. The uncertainty about the appropriate size of the memory windows is dealt with in a Bayesian manner by integrating over the distribution of the adaptive window size. Thus, the posterior distribution of the weights may develop algebraic tails. The learning algorithm results from tracking the mean and variance of the posterior distribution of the weights. It was found that the algebraic tails of this posterior distribution give the learning algorithm the ability to cope with an evolving environment by permitting the escape from local traps.

Medical image retrieval based on complexity analysis

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Texture is one of the most important visual attributes used in image analysis. It is used in many content-based image retrieval systems, where it allows the identification of a larger number of images from distinct origins. This paper presents a novel approach for image analysis and retrieval based on complexity analysis. The approach consists of a texture segmentation step, performed by complexity analysis through BoxCounting fractal dimension, followed by the estimation of complexity of each computed region by multiscale fractal dimension. Experiments have been performed with MRI database in both pattern recognition and image retrieval contexts. Results show the accuracy of the method and also indicate how the performance changes as the texture segmentation process is altered.

A non-parametric vessel detection method for complex vascular structures

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Modern medical imaging techniques enable the acquisition of in vivo high resolution images of the vascular system. Most common methods for the detection of vessels in these images, such as multiscale Hessian-based operators and matched filters, rely on the assumption that at each voxel there is a single cylinder. Such an assumption is clearly violated at the multitude of branching points that are easily observed in all, but the Most focused vascular image studies. In this paper, we propose a novel method for detecting vessels in medical images that relaxes this single cylinder assumption. We directly exploit local neighborhood intensities and extract characteristics of the local intensity profile (in a spherical polar coordinate system) which we term as the polar neighborhood intensity profile. We present a new method to capture the common properties shared by polar neighborhood intensity profiles for all the types of vascular points belonging to the vascular system. The new method enables us to detect vessels even near complex extreme points, including branching points. Our method demonstrates improved performance over standard methods on both 2D synthetic images and 3D animal and clinical vascular images, particularly close to vessel branching regions. (C) 2008 Elsevier B.V. All rights reserved.

Particle Competition and Cooperation in Networks for Semi-Supervised Learning

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Semi-supervised learning is one of the important topics in machine learning, concerning with pattern classification where only a small subset of data is labeled. In this paper, a new network-based (or graph-based) semi-supervised classification model is proposed. It employs a combined random-greedy walk of particles, with competition and cooperation mechanisms, to propagate class labels to the whole network. Due to the competition mechanism, the proposed model has a local label spreading fashion, i.e., each particle only visits a portion of nodes potentially belonging to it, while it is not allowed to visit those nodes definitely occupied by particles of other classes. In this way, a "divide-and-conquer" effect is naturally embedded in the model. As a result, the proposed model can achieve a good classification rate while exhibiting low computational complexity order in comparison to other network-based semi-supervised algorithms. Computer simulations carried out for synthetic and real-world data sets provide a numeric quantification of the performance of the method.

Efficient Forest Data Structure for Evolutionary Algorithms Applied to Network Design

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The design of a network is a solution to several engineering and science problems. Several network design problems are known to be NP-hard, and population-based metaheuristics like evolutionary algorithms (EAs) have been largely investigated for such problems. Such optimization methods simultaneously generate a large number of potential solutions to investigate the search space in breadth and, consequently, to avoid local optima. Obtaining a potential solution usually involves the construction and maintenance of several spanning trees, or more generally, spanning forests. To efficiently explore the search space, special data structures have been developed to provide operations that manipulate a set of spanning trees (population). For a tree with n nodes, the most efficient data structures available in the literature require time O(n) to generate a new spanning tree that modifies an existing one and to store the new solution. We propose a new data structure, called node-depth-degree representation (NDDR), and we demonstrate that using this encoding, generating a new spanning forest requires average time O(root n). Experiments with an EA based on NDDR applied to large-scale instances of the degree-constrained minimum spanning tree problem have shown that the implementation adds small constants and lower order terms to the theoretical bound.

«
1
2
3
4
5
6
7
8
...
58
59
»