Biblioteca Digital

948 resultados para Data structures (Computer science)

Hierarchical Prediction Machines and Big Data Analytics

Relevância:

100.00% 100.00%

Publicador:

Resumo:

An emerging consensus in cognitive science views the biological brain as a hierarchically-organized predictive processing system. This is a system in which higher-order regions are continuously attempting to predict the activity of lower-order regions at a variety of (increasingly abstract) spatial and temporal scales. The brain is thus revealed as a hierarchical prediction machine that is constantly engaged in the effort to predict the flow of information originating from the sensory surfaces. Such a view seems to afford a great deal of explanatory leverage when it comes to a broad swathe of seemingly disparate psychological phenomena (e.g., learning, memory, perception, action, emotion, planning, reason, imagination, and conscious experience). In the most positive case, the predictive processing story seems to provide our first glimpse at what a unified (computationally-tractable and neurobiological plausible) account of human psychology might look like. This obviously marks out one reason why such models should be the focus of current empirical and theoretical attention. Another reason, however, is rooted in the potential of such models to advance the current state-of-the-art in machine intelligence and machine learning. Interestingly, the vision of the brain as a hierarchical prediction machine is one that establishes contact with work that goes under the heading of 'deep learning'. Deep learning systems thus often attempt to make use of predictive processing schemes and (increasingly abstract) generative models as a means of supporting the analysis of large data sets. But are such computational systems sufficient (by themselves) to provide a route to general human-level analytic capabilities? I will argue that they are not and that closer attention to a broader range of forces and factors (many of which are not confined to the neural realm) may be required to understand what it is that gives human cognition its distinctive (and largely unique) flavour. The vision that emerges is one of 'homomimetic deep learning systems', systems that situate a hierarchically-organized predictive processing core within a larger nexus of developmental, behavioural, symbolic, technological and social influences. Relative to that vision, I suggest that we should see the Web as a form of 'cognitive ecology', one that is as much involved with the transformation of machine intelligence as it is with the progressive reshaping of our own cognitive capabilities.

Data assurance in opaque computations

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The chess endgame is increasingly being seen through the lens of, and therefore effectively defined by, a data ‘model’ of itself. It is vital that such models are clearly faithful to the reality they purport to represent. This paper examines that issue and systems engineering responses to it, using the chess endgame as the exemplar scenario. A structured survey has been carried out of the intrinsic challenges and complexity of creating endgame data by reviewing the past pattern of errors during work in progress, surfacing in publications and occurring after the data was generated. Specific measures are proposed to counter observed classes of error-risk, including a preliminary survey of techniques for using state-of-the-art verification tools to generate EGTs that are correct by construction. The approach may be applied generically beyond the game domain.

Chess endgames: 6-man data and strategy

Relevância:

100.00% 100.00%

Publicador:

Resumo:

While Nalimov’s endgame tables for Western Chess are the most used today, their Depth-to-Mate metric is not the most efficient or effective in use. The authors have developed and used new programs to create tables to alternative metrics and recommend better strategies for endgame play.

A new linear initialization in SOM for biomolecular data

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In the past decade, the amount of data in biological field has become larger and larger; Bio-techniques for analysis of biological data have been developed and new tools have been introduced. Several computational methods are based on unsupervised neural network algorithms that are widely used for multiple purposes including clustering and visualization, i.e. the Self Organizing Maps (SOM). Unfortunately, even though this method is unsupervised, the performances in terms of quality of result and learning speed are strongly dependent from the neuron weights initialization. In this paper we present a new initialization technique based on a totally connected undirected graph, that report relations among some intersting features of data input. Result of experimental tests, where the proposed algorithm is compared to the original initialization techniques, shows that our technique assures faster learning and better performance in terms of quantization error.

Automated pre-processing strategies for species occurrence data used in biodiversity modelling

Relevância:

100.00% 100.00%

Publicador:

Resumo:

To construct Biodiversity richness maps from Environmental Niche Models (ENMs) of thousands of species is time consuming. A separate species occurrence data pre-processing phase enables the experimenter to control test AUC score variance due to species dataset size. Besides, removing duplicate occurrences and points with missing environmental data, we discuss the need for coordinate precision, wide dispersion, temporal and synonymity filters. After species data filtering, the final task of a pre-processing phase should be the automatic generation of species occurrence datasets which can then be directly ’plugged-in’ to the ENM. A software application capable of carrying out all these tasks will be a valuable time-saver particularly for large scale biodiversity studies.

Modula-2 applied

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Programming is a skill which requires knowledge of both the basic constructs of the computer language used and techniques employing these constructs. How these are used in any given application is determined intuitively, and this intuition is based on experience of programs already written. One aim of this book is to describe the techniques and give practical examples of the techniques in action - to provide some experience. Another aim of the book is to show how a program should be developed, in particular how a relatively large program should be tackled in a structured manner. These aims are accomplished essentially by describing the writing of one large program, a diagram generator package, in which a number of useful programming techniques are employed. Also, the book provides a useful program, with an in-built manual describing not only how the program works, but also how it does it, with full source code listings. This means that the user can, if required, modify the package to meet particular requirements. A floppy disk is available from the publishers containing the program, including listings of the source code. All the programs are written in Modula-2, using JPI's Top Speed Modula-2 system running on IBM-PCs and compatibles. This language was chosen as it is an ideal language for implementing large programs and it is the main language taught in the Cybernetics Department at the University of Reading. There are some aspects of the Top Speed implementation which are not standard, so suitable comments are given when these occur. Although implemented in Modula-2, many of the techniques described here are appropriate to other languages, like Pascal of C, for example. The book and programs are based on a second year undergraduate course taught at Reading to Cybernetics students, entitled Algorithms and Data Structures. Useful techniques are described for the reader to use, applications where they are appropriate are recommended, but detailed analyses of the techniques are not given.

Surface based hypothesis verification in intensity images using geometric and appearance data

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper we discuss current work concerning Appearance-based and CAD-based vision; two opposing vision strategies. CAD-based vision is geometry based, reliant on having complete object centred models. Appearance-based vision builds view dependent models from training images. Existing CAD-based vision systems that work with intensity images have all used one and zero dimensional features, for example lines, arcs, points and corners. We describe a system we have developed for combining these two strategies. Geometric models are extracted from a commercial CAD library of industry standard parts. Surface appearance characteristics are then learnt automatically by observing actual object instances. This information is combined with geometric information and is used in hypothesis evaluation. This augmented description improves the systems robustness to texture, specularities and other artifacts which are hard to model with geometry alone, whilst maintaining the advantages of a geometric description.

Virtual reality as a visualisation tool: benefits and constraints

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The benefits and applications of virtual reality (VR) in the construction industry have been investigated for almost a decade. However, the practical implementation of VR in the construction industry has yet to reach maturity owing to technical constraints. The need for effective information management presents challenges: both transfer of building data to, and organisation of building information within, the virtual environment require consideration. This paper reviews the applications and benefits of VR in the built environment field and reports on a collaboration between Loughborough University and South Bank University to overcome constraints on the use of the overall VR model for whole lifecycle visualisation. The work at each research centre is concerned with an aspect of information management within VR applications for the built environment, and both data transfer and internal data organisation have been investigated. In this paper, similarities and differences between computer-aided design (CAD) and VR packages are first discussed. Three different approaches to the creation of VR models during the design stage are identified and described, with a view to providing sharing understanding across the interdiscipliary groups involved. The suitable organisation of building information within the virtual environment is then further investigated. This work focused on the visualisation of the degradation of a building, through its lifespan, with the view to provide a visual aid for developing an effective and economic project maintenance programme. Finally consideration is given to the potential of emerging standards to facilitate an integrated use of VR. The convergence towards similar data structures in VR and other construction packages may enable visualisation to be better utilised in the overall lifecycle model.

Collaboration in data mining virtual organisation

Relevância:

100.00% 100.00%

Publicador:

Linear-wavelet networks

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper proposes a nonlinear regression structure comprising a wavelet network and a linear term. The introduction of the linear term is aimed at providing a more parsimonious interpolation in high-dimensional spaces when the modelling samples are sparse. A constructive procedure for building such structures, termed linear-wavelet networks, is described. For illustration, the proposed procedure is employed in the framework of dynamic system identification. In an example involving a simulated fermentation process, it is shown that a linear-wavelet network yields a smaller approximation error when compared with a wavelet network with the same number of regressors. The proposed technique is also applied to the identification of a pressure plant from experimental data. In this case, the results show that the introduction of wavelets considerably improves the prediction ability of a linear model. Standard errors on the estimated model coefficients are also calculated to assess the numerical conditioning of the identification process.

The Climate-G testbed: towards large scale distributed data management for climate change

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Climate-G is a large scale distributed testbed devoted to climate change research. It is an unfunded effort started in 2008 and involving a wide community both in Europe and US. The testbed is an interdisciplinary effort involving partners from several institutions and joining expertise in the field of climate change and computational science. Its main goal is to allow scientists carrying out geographical and cross-institutional data discovery, access, analysis, visualization and sharing of climate data. It represents an attempt to address, in a real environment, challenging data and metadata management issues. This paper presents a complete overview about the Climate-G testbed highlighting the most important results that have been achieved since the beginning of this project.

Distributed classification for pocket data mining

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Distributed and collaborative data stream mining in a mobile computing environment is referred to as Pocket Data Mining PDM. Large amounts of available data streams to which smart phones can subscribe to or sense, coupled with the increasing computational power of handheld devices motivates the development of PDM as a decision making system. This emerging area of study has shown to be feasible in an earlier study using technological enablers of mobile software agents and stream mining techniques [1]. A typical PDM process would start by having mobile agents roam the network to discover relevant data streams and resources. Then other (mobile) agents encapsulating stream mining techniques visit the relevant nodes in the network in order to build evolving data mining models. Finally, a third type of mobile agents roam the network consulting the mining agents for a final collaborative decision, when required by one or more users. In this paper, we propose the use of distributed Hoeffding trees and Naive Bayes classifers in the PDM framework over vertically partitioned data streams. Mobile policing, health monitoring and stock market analysis are among the possible applications of PDM. An extensive experimental study is reported showing the effectiveness of the collaborative data mining with the two classifers.

Homogeneous and heterogeneous distributed classification for pocket data mining

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Pocket Data Mining (PDM) describes the full process of analysing data streams in mobile ad hoc distributed environments. Advances in mobile devices like smart phones and tablet computers have made it possible for a wide range of applications to run in such an environment. In this paper, we propose the adoption of data stream classification techniques for PDM. Evident by a thorough experimental study, it has been proved that running heterogeneous/different, or homogeneous/similar data stream classification techniques over vertically partitioned data (data partitioned according to the feature space) results in comparable performance to batch and centralised learning techniques.

Sequence-specific reconstruction from fragmentary databases using seed sequences: implementation and validation on SAGE, proteome and generic sequencing data

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Motivation: DNA assembly programs classically perform an all-against-all comparison of reads to identify overlaps, followed by a multiple sequence alignment and generation of a consensus sequence. If the aim is to assemble a particular segment, instead of a whole genome or transcriptome, a target-specific assembly is a more sensible approach. GenSeed is a Perl program that implements a seed-driven recursive assembly consisting of cycles comprising a similarity search, read selection and assembly. The iterative process results in a progressive extension of the original seed sequence. GenSeed was tested and validated on many applications, including the reconstruction of nuclear genes or segments, full-length transcripts, and extrachromosomal genomes. The robustness of the method was confirmed through the use of a variety of DNA and protein seeds, including short sequences derived from SAGE and proteome projects.

Slicing the metric space to provide quick indexing of complex data in the main memory

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Searching in a dataset for elements that are similar to a given query element is a core problem in applications that manage complex data, and has been aided by metric access methods (MAMs). A growing number of applications require indices that must be built faster and repeatedly, also providing faster response for similarity queries. The increase in the main memory capacity and its lowering costs also motivate using memory-based MAMs. In this paper. we propose the Onion-tree, a new and robust dynamic memory-based MAM that slices the metric space into disjoint subspaces to provide quick indexing of complex data. It introduces three major characteristics: (i) a partitioning method that controls the number of disjoint subspaces generated at each node; (ii) a replacement technique that can change the leaf node pivots in insertion operations; and (iii) range and k-NN extended query algorithms to support the new partitioning method, including a new visit order of the subspaces in k-NN queries. Performance tests with both real-world and synthetic datasets showed that the Onion-tree is very compact. Comparisons of the Onion-tree with the MM-tree and a memory-based version of the Slim-tree showed that the Onion-tree was always faster to build the index. The experiments also showed that the Onion-tree significantly improved range and k-NN query processing performance and was the most efficient MAM, followed by the MM-tree, which in turn outperformed the Slim-tree in almost all the tests. (C) 2010 Elsevier B.V. All rights reserved.

«
1
2
...
28
29
30
31
32
33
34
...
63
64
»