948 results for Data structures (Computer science)
Abstract:
Normal mixture models are being increasingly used to model the distributions of a wide variety of random phenomena and to cluster sets of continuous multivariate data. However, for a set of data containing a group or groups of observations with longer than normal tails or atypical observations, the use of normal components may unduly affect the fit of the mixture model. In this paper, we consider a more robust approach by modelling the data by a mixture of t distributions. The use of the ECM algorithm to fit this t mixture model is described and examples of its use are given in the context of clustering multivariate data in the presence of atypical observations in the form of background noise.
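The robustness described above can be illustrated with the observation weight commonly used in the E-step when fitting t distributions. The abstract does not state this formula, so the form below, along with the names `t_weight`, `nu`, `p` and `mahalanobis_sq`, is an assumption made for illustration only. It shows how an observation far from a component mean receives a small weight and therefore has less influence on the fit.

```python
def t_weight(nu: float, p: int, mahalanobis_sq: float) -> float:
    """Weight given to one observation by one t component.

    Assumed form: (nu + p) / (nu + d2), where nu is the degrees of freedom,
    p the data dimension, and d2 the squared Mahalanobis distance of the
    observation from the component mean.
    """
    return (nu + p) / (nu + mahalanobis_sq)


# An observation at the component mean versus a distant, atypical one.
central = t_weight(nu=4.0, p=2, mahalanobis_sq=0.0)
outlier = t_weight(nu=4.0, p=2, mahalanobis_sq=20.0)
print(central, outlier)
```

As `nu` grows large the weight tends to 1 for every observation, which corresponds to the normal component case.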
Abstract:
Continuous-valued recurrent neural networks can learn mechanisms for processing context-free languages. The dynamics of such networks are usually based on damped oscillation around fixed points in state space and require that the dynamical components be arranged in certain ways. It is shown that qualitatively similar dynamics with similar constraints hold for a^n b^n c^n, a context-sensitive language. The additional difficulty with a^n b^n c^n, compared with the context-free language a^n b^n, consists of 'counting up' and 'counting down' letters simultaneously. The network solution is to oscillate in two principal dimensions, one for counting up and one for counting down. This study focuses on the dynamics employed by the sequential cascaded network, in contrast to the simple recurrent network, and the use of backpropagation through time. The solutions found generalize well beyond the training data; however, learning is not reliable. The contribution of this study lies in demonstrating how the dynamics in recurrent neural networks that process context-free languages can also be employed in processing some context-sensitive languages (traditionally thought of as requiring additional computation resources). This continuity of mechanism between language classes contributes to our understanding of neural networks in modelling language learning and processing.
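The 'counting up' and 'counting down' requirement can be made concrete with a discrete recognizer that keeps two counters. This is only a symbolic analogue under assumed names (`accepts_anbncn`, `up`, `down`); it is not the network dynamics studied in the paper. While reading the b's, one counter is decremented and the other incremented at the same time.

```python
def accepts_anbncn(s: str) -> bool:
    """Return True if s has the form a^n b^n c^n for some n >= 1."""
    up = 0      # incremented by each 'a', decremented by each 'b'
    down = 0    # incremented by each 'b', decremented by each 'c'
    phase = "a"  # which block of letters is currently being read
    for ch in s:
        if ch == "a" and phase == "a":
            up += 1
        elif ch == "b" and phase in ("a", "b") and up > 0:
            phase = "b"
            up -= 1    # count down the a's ...
            down += 1  # ... while counting up the b's
        elif ch == "c" and phase in ("b", "c") and up == 0 and down > 0:
            phase = "c"
            down -= 1
        else:
            return False
    return phase == "c" and up == 0 and down == 0


print(accepts_anbncn("aabbcc"), accepts_anbncn("aabbc"))
```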
Abstract:
The movement of chemicals through the soil to groundwater, or their discharge to surface waters, represents a degradation of these resources. In many cases, serious human and stock health implications are associated with this form of pollution. The chemicals of interest include nutrients, pesticides, salts, and industrial wastes. Recent studies have shown that current models and methods do not adequately describe the leaching of nutrients through soil, often underestimating the risk of groundwater contamination by surface-applied chemicals, and overestimating the concentration of resident solutes. This inaccuracy results primarily from ignoring soil structure and nonequilibrium between soil constituents, water, and solutes. A multiple sample percolation system (MSPS), consisting of 25 individual collection wells, was constructed to study the effects of localized soil heterogeneities on the transport of nutrients (NO₃⁻, Cl⁻, PO₄³⁻) in the vadose zone of an agricultural soil dominated by clay. Very significant variations in drainage patterns across a small spatial scale were observed (one-way ANOVA, p < 0.001), indicating considerable heterogeneity in water flow patterns and nutrient leaching. Using data collected from the multiple sample percolation experiments, this paper compares the performance of two mathematical models for predicting solute transport: the advective-dispersion model with a reaction term (ADR), and a two-region preferential flow model (TRM) suitable for modelling nonequilibrium transport. These results have implications for modelling solute transport and predicting nutrient loading on a larger scale. (C) 2001 Elsevier Science Ltd. All rights reserved.
Abstract:
The World Wide Web (WWW) is useful for distributing scientific data. Most existing web data resources organize their information either in structured flat files or relational databases with basic retrieval capabilities. For databases with one or a few simple relations, these approaches are successful, but they can be cumbersome when there is a data model involving multiple relations between complex data. We believe that knowledge-based resources offer a solution in these cases. Knowledge bases have explicit declarations of the concepts in the domain, along with the relations between them. They are usually organized hierarchically, and provide a global data model with a controlled vocabulary. We have created the OWEB architecture for building online scientific data resources using knowledge bases. OWEB provides a shell for structuring data, providing secure and shared access, and creating computational modules for processing and displaying data. In this paper, we describe the translation of the online immunological database MHCPEP into an OWEB system called MHCWeb. This effort involved building a conceptual model for the data, creating a controlled terminology for the legal values for different types of data, and then translating the original data into the new structure. The OWEB environment allows for flexible access to the data by both users and computer programs.
Abstract:
Summary: Plant biologists in fields of ecology, evolution, genetics and breeding frequently use multivariate methods. This paper illustrates Principal Component Analysis (PCA) and Gabriel's biplot as applied to microarray expression data from plant pathology experiments. Availability: An example program in the publicly distributed statistical language R is available from the web site (www.tpp.uq.edu.au) and by e-mail from the contact. Contact: scott.chapman@csiro.au.
Abstract:
This paper is concerned with the use of scientific visualization methods for the analysis of feedforward neural networks (NNs). Inevitably, the kinds of data associated with the design and implementation of neural networks are of very high dimensionality, presenting a major challenge for visualization. A method is described using the well-known statistical technique of principal component analysis (PCA). This is found to be an effective and useful method of visualizing the learning trajectories of many learning algorithms such as back-propagation and can also be used to provide insight into the learning process and the nature of the error surface.
Abstract:
In this paper, a genetic algorithm (GA) is applied to the optimum design of reinforced concrete liquid retaining structures, which involves three discrete design variables: slab thickness, reinforcement diameter and reinforcement spacing. GA, being a search technique based on the mechanics of natural genetics, couples a Darwinian survival-of-the-fittest principle with a random yet structured information exchange amongst a population of artificial chromosomes. As a first step, a penalty-based strategy is used to transform the constrained design problem into an unconstrained problem, which is appropriate for GA application. A numerical example is then used to demonstrate the strength and capability of the GA in this problem domain. It is shown that near-optimal solutions are obtained after exploring only a minute portion of the search space, with very fast convergence. The method can be extended to even more complex optimization problems in other domains.
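The penalty-based transformation mentioned above can be sketched as follows. The abstract does not specify the penalty form, so the quadratic exterior penalty, the function name `penalized_cost`, and the default `penalty` factor are all assumptions chosen for illustration. Constraints are assumed to be written as g(x) <= 0, so only positive values count as violations.

```python
def penalized_cost(cost: float, constraint_values: list[float],
                   penalty: float = 1000.0) -> float:
    """Fold constraint violations into a single unconstrained objective.

    Each entry of constraint_values is g(x) for a constraint g(x) <= 0;
    a positive entry is a violation and is penalized quadratically
    (an assumed penalty form, not taken from the paper).
    """
    violation = sum(max(0.0, g) ** 2 for g in constraint_values)
    return cost + penalty * violation


# A feasible design keeps its raw cost; an infeasible one is penalized.
print(penalized_cost(10.0, [-1.0, 0.0]))
print(penalized_cost(10.0, [0.5], penalty=100.0))
```

A GA can then rank chromosomes by this single value, so infeasible designs are not discarded outright but become progressively less likely to survive selection.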
Abstract:
While multimedia data, image data in particular, is an integral part of most websites and web documents, our quest for information so far is still restricted to text-based search. To explore the World Wide Web more effectively, especially its rich repository of truly multimedia information, we are facing a number of challenging problems. Firstly, we face the ambiguous and highly subjective nature of defining image semantics and similarity. Secondly, multimedia data could come from highly diversified sources, as a result of automatic image capturing and generation processes. Finally, multimedia information exists in decentralised sources over the Web, making it difficult to use conventional content-based image retrieval (CBIR) techniques for effective and efficient search. In this special issue, we present a collection of five papers on visual and multimedia information management and retrieval topics, addressing some aspects of these challenges. These papers have been selected from the conference proceedings (Kluwer Academic Publishers, ISBN: 1-4020-7060-8) of the Sixth IFIP 2.6 Working Conference on Visual Database Systems (VDB6), held in Brisbane, Australia, on 29–31 May 2002.
Abstract:
Purpose - The purpose of this paper is to provide a framework for radio frequency identification (RFID) technology adoption considering company size and five dimensions of analysis: RFID applications, expected benefits, business drivers or motivations, barriers and inhibitors, and organizational factors. Design/methodology/approach - A framework for RFID adoption derived from the literature and practical experience on the subject is developed. This framework provides a conceptual basis for analyzing a survey conducted with 114 companies in Brazil. Findings - Many companies have been developing RFID initiatives in order to identify potential applications and map benefits associated with their implementation. The survey highlights the importance of business drivers in the RFID implementation stage, and shows that companies implement RFID focusing on a few specific applications. However, there is only a weak association of expected benefits and business challenges with the current level of RFID technology adoption in Brazil. Research limitations/implications - The paper is not exhaustive, since RFID adoption in Brazil was at an early stage during the survey period. Originality/value - The main contribution of the paper is that it yields a framework for analyzing RFID technology adoption. The authors use this framework to analyze RFID adoption in Brazil, where it proved useful for identifying key issues for technology adoption. The paper is useful to any researchers or practitioners who are focused on technology adoption, in particular, RFID technology.
Abstract:
Purpose - The purpose of this paper is to verify if Brazilian companies are adopting environmental requirements in the supplier selection process. Further, this paper intends to analyze whether there is a relation between the level of environmental management maturity and the inclusion of environmental criteria in the companies' selection of suppliers. Design/methodology/approach - The paper reviews mainstream literature on environmental management, traditional criteria in the supplier selection process and the incorporation of environmental requirements in this context. The empirical study's strategy is based on five Brazilian case studies with industrial companies. Data were gathered through face-to-face interviews, informal conversations and e-mail clarifications with representatives from the purchasing, environmental management, logistics and other areas, together with observation and the collection of company documents. Findings - Based on the cases, it is concluded that companies still use traditional criteria to select suppliers, such as quality and cost, and do not adopt environmental requirements in the supplier selection process in a uniform manner. Evidence found shows that the level of environmental management maturity influences the depth with which companies adopt environmental criteria when selecting suppliers. Thus, a company with more advanced environmental management adopts more formal procedures for selecting environmentally appropriate suppliers than others. Originality/value - This is the first known study to verify if Brazilian companies are adopting environmental requirements in the supplier selection process.
Abstract:
Purpose - The purpose of this research is to shed light on the main barriers faced by Mozambican micro and small enterprises (MSEs) and their implications with respect to the support policies available for these enterprises. Design/methodology/approach - A literature review was conducted on the barriers faced by MSEs and on the policies and governmental instruments of assistance available to them. The research was then carried out in two phases. The first phase consisted of collecting data from 21 MSEs in Mozambique, mainly by means of interviews, in which the main barriers faced by the interviewees were identified. The second phase consisted of interviewing governmental and support entities in order to learn what they had done to minimize the barriers identified by the entrepreneurs. Findings - The results show that financial and competitive barriers are the main barriers faced by the analyzed MSEs. These barriers vary according to the field of activity of the enterprises. Originality/value - This study serves to enrich the state of the art on the subject of smaller enterprises in Africa and will especially help to fill the gap in academic research on Mozambique.
Abstract:
We discuss the expectation propagation (EP) algorithm for approximate Bayesian inference using a factorizing posterior approximation. For neural network models, we use a central limit theorem argument to make EP tractable when the number of parameters is large. For two types of models, we show that EP can achieve optimal generalization performance when data are drawn from a simple distribution.
Abstract:
PHWAT is a new model that couples a geochemical reaction model (PHREEQC-2) with a density-dependent groundwater flow and solute transport model (SEAWAT) using the split-operator approach. PHWAT was developed to simulate multi-component reactive transport in variable density groundwater flow. Fluid density in PHWAT depends not only on the concentration of a single species, as in SEAWAT, but also on the concentrations of other dissolved chemicals that can be subject to reactive processes. Simulation results of PHWAT and PHREEQC-2 were compared in their predictions of effluent concentration from a column experiment. Both models produced identical results, showing that PHWAT has correctly coupled the sub-packages. PHWAT was then applied to the simulation of a tank experiment in which seawater intrusion was accompanied by cation exchange. The density dependence of the intrusion and the snow-plough effect in the breakthrough curves were reflected in the model simulations, which were in good agreement with the measured breakthrough data. Comparison simulations that, in turn, excluded density effects and reactions allowed us to quantify the marked effect of ignoring these processes. Next, we explored numerical issues involved in the practical application of PHWAT using the example of a dense plume flowing into a tank containing fresh water. It was shown that PHWAT could model physically unstable flow and that numerical instabilities were suppressed. Physical instability developed in the model in accordance with the increase of the modified Rayleigh number for density-dependent flow, in agreement with previous research. (c) 2004 Elsevier Ltd. All rights reserved.
Abstract:
Test templates and a test template framework are introduced as useful concepts in specification-based testing. The framework can be defined using any model-based specification notation and used to derive tests from model-based specifications; in this paper, it is demonstrated using the Z notation. The framework formally defines test data sets and their relation to the operations in a specification and to other test data sets, providing structure to the testing process. Flexibility is preserved, so that many testing strategies can be used. Important application areas of the framework are discussed, including refinement of test data, regression testing, and test oracles.
Abstract:
In order to separate the effects of experience from other characteristics of word frequency (e.g., orthographic distinctiveness), computer science and psychology students rated their experience with computer science technical items and nontechnical items from a wide range of word frequencies prior to being tested for recognition memory of the rated items. For nontechnical items, there was a curvilinear relationship between recognition accuracy and word frequency for both groups of students. The usual superiority of low-frequency words was demonstrated and high-frequency words were recognized least well. For technical items, a similar curvilinear relationship was evident for the psychology students, but for the computer science students, recognition accuracy was inversely related to word frequency. The ratings data showed that subjective experience rather than background word frequency was the better predictor of recognition accuracy.