908 resultados para Data Structure Operations
Resumo:
The design of a network is a solution to several engineering and science problems. Several network design problems are known to be NP-hard, and population-based metaheuristics like evolutionary algorithms (EAs) have been largely investigated for such problems. Such optimization methods simultaneously generate a large number of potential solutions to investigate the search space in breadth and, consequently, to avoid local optima. Obtaining a potential solution usually involves the construction and maintenance of several spanning trees, or more generally, spanning forests. To efficiently explore the search space, special data structures have been developed to provide operations that manipulate a set of spanning trees (population). For a tree with n nodes, the most efficient data structures available in the literature require time O(n) to generate a new spanning tree that modifies an existing one and to store the new solution. We propose a new data structure, called node-depth-degree representation (NDDR), and we demonstrate that using this encoding, generating a new spanning forest requires average time O(root n). Experiments with an EA based on NDDR applied to large-scale instances of the degree-constrained minimum spanning tree problem have shown that the implementation adds small constants and lower order terms to the theoretical bound.
Resumo:
The objective of the present study was to investigate the effect of data structure on estimated genetic parameters and predicted breeding values of direct and maternal genetic effects for weaning weight (WW) and weight gain from birth to weaning (BWG), including or not the genetic covariance between direct and maternal effects. Records of 97,490 Nellore animals born between 1993 and 2006, from the Jacarezinho cattle raising farm, were used. Two different data sets were analyzed: DI_all, which included all available progenies of dams without their own performance; DII_all, which included DI_all + 20% of recorded progenies with maternal phenotypes. Two subsets were obtained from each data set (DI_all and DII_all): DI_1 and DII_1, which included only dams with three or fewer progenies; DI_5 and DII_5, which included only dams with five or more progenies. (Co)variance components and heritabilities were estimated by Bayesian inference through Gibbs sampling using univariate animal models. In general, for the population and traits studied, the proportion of dams with known phenotypic information and the number of progenies per dam influenced direct and maternal heritabilities, as well as the contribution of maternal permanent environmental variance to phenotypic variance. Only small differences were observed in the genetic and environmental parameters when the genetic covariance between direct and maternal effects was set to zero in the data sets studied. Thus, the inclusion or not of the genetic covariance between direct and maternal effects had little effect on the ranking of animals according to their breeding values for WW and BWG. Accurate estimation of genetic correlations between direct and maternal genetic effects depends on the data structure. Thus, this covariance should be set to zero in Nellore data sets in which the proportion of dams with phenotypic information is low, the number of progenies per dam is small, and pedigree relationships are poorly known. (c) 2012 Elsevier B.V. All rights reserved.
Resumo:
This paper aims to present the use of a learning object (CADILAG), developed to facilitate understanding data structure operations by using visual presentations and animations. The CADILAG allows visualizing the behavior of algorithms usually discussed during Computer Science and Information System courses. For each data structure it is possible visualizing its content and its operation dynamically. Its use was evaluated an the results are presented. © 2012 AISTI.
Resumo:
Population measures for genetic programs are defined and analysed in an attempt to better understand the behaviour of genetic programming. Some measures are simple, but do not provide sufficient insight. The more meaningful ones are complex and take extra computation time. Here we present a unified view on the computation of population measures through an information hypertree (iTree). The iTree allows for a unified and efficient calculation of population measures via a basic tree traversal. © Springer-Verlag 2004.
Resumo:
Abstract not available
Resumo:
We present building blocks for algorithms for the efficient reduction of square factor, i.e. direct repetitions in strings. So the basic problem is this: given a string, compute all strings that can be obtained by reducing factors of the form zz to z. Two types of algorithms are treated: an offline algorithm is one that can compute a data structure on the given string in advance before the actual search for the square begins; in contrast, online algorithms receive all input only at the time when a request is made. For offline algorithms we treat the following problem: Let u and w be two strings such that w is obtained from u by reducing a square factor zz to only z. If we further are given the suffix table of u, how can we derive the suffix table for w without computing it from scratch? As the suffix table plays a key role in online algorithms for the detection of squares in a string, this derivation can make the iterated reduction of squares more efficient. On the other hand, we also show how a suffix array, used for the offline detection of squares, can be adapted to the new string resulting from the deletion of a square. Because the deletion is a very local change, this adaption is more eficient than the computation of the new suffix array from scratch.
Resumo:
This thesis work describes the creation of a pipework data structure for design system integration. Work is completed in pulp and paper plant delivery company with global engineering network operations in mind. User case of process design to 3D pipework design is introduced with influence of subcontracting engineering offices. Company data element list is gathered by using key person interviews and results are processed into a pipework data element list. Inter-company co-operation is completed in standardization association and common standard for pipework data elements is found. As result inter-company created pipework data element list is introduced. Further list usage, development and relations to design software vendors are evaluated.
Resumo:
Data from civil engineering projects can inform the operation of built infrastructure. This paper captures lessons for such data handover, from projects into operations, through interviews with leading clients and their supply chain. Clients are found to value receiving accurate and complete data. They recognise opportunities to use high quality information in decision-making about capital and operational expenditure; as well as in ensuring compliance with regulatory requirements. Providing this value to clients is a motivation for information management in projects. However, data handover is difficult as key people leave before project completion; and different data formats and structures are used in project delivery and operations. Lessons learnt from leading practice include defining data requirements at the outset, getting operations teams involved early, shaping the evolution of interoperable systems and standards, developing handover processes to check data rather than documentation, and fostering skills to use and update project data in operations
Resumo:
Spatial data warehouses (SDWs) allow for spatial analysis together with analytical multidimensional queries over huge volumes of data. The challenge is to retrieve data related to ad hoc spatial query windows according to spatial predicates, avoiding the high cost of joining large tables. Therefore, mechanisms to provide efficient query processing over SDWs are essential. In this paper, we propose two efficient indices for SDW: the SB-index and the HSB-index. The proposed indices share the following characteristics. They enable multidimensional queries with spatial predicate for SDW and also support predefined spatial hierarchies. Furthermore, they compute the spatial predicate and transform it into a conventional one, which can be evaluated together with other conventional predicates by accessing a star-join Bitmap index. While the SB-index has a sequential data structure, the HSB-index uses a hierarchical data structure to enable spatial objects clustering and a specialized buffer-pool to decrease the number of disk accesses. The advantages of the SB-index and the HSB-index over the DBMS resources for SDW indexing (i.e. star-join computation and materialized views) were investigated through performance tests, which issued roll-up operations extended with containment and intersection range queries. The performance results showed that improvements ranged from 68% up to 99% over both the star-join computation and the materialized view. Furthermore, the proposed indices proved to be very compact, adding only less than 1% to the storage requirements. Therefore, both the SB-index and the HSB-index are excellent choices for SDW indexing. Choosing between the SB-index and the HSB-index mainly depends on the query selectivity of spatial predicates. While low query selectivity benefits the HSB-index, the SB-index provides better performance for higher query selectivity.
Resumo:
Managing large medical image collections is an increasingly demanding important issue in many hospitals and other medical settings. A huge amount of this information is daily generated, which requires robust and agile systems. In this paper we present a distributed multi-agent system capable of managing very large medical image datasets. In this approach, agents extract low-level information from images and store them in a data structure implemented in a relational database. The data structure can also store semantic information related to images and particular regions. A distinctive aspect of our work is that a single image can be divided so that the resultant sub-images can be stored and managed separately by different agents to improve performance in data accessing and processing. The system also offers the possibility of applying some region-based operations and filters on images, facilitating image classification. These operations can be performed directly on data structures in the database.
Resumo:
The schema of an information system can significantly impact the ability of end users to efficiently and effectively retrieve the information they need. Obtaining quickly the appropriate data increases the likelihood that an organization will make good decisions and respond adeptly to challenges. This research presents and validates a methodology for evaluating, ex ante, the relative desirability of alternative instantiations of a model of data. In contrast to prior research, each instantiation is based on a different formal theory. This research theorizes that the instantiation that yields the lowest weighted average query complexity for a representative sample of information requests is the most desirable instantiation for end-user queries. The theory was validated by an experiment that compared end-user performance using an instantiation of a data structure based on the relational model of data with performance using the corresponding instantiation of the data structure based on the object-relational model of data. Complexity was measured using three different Halstead metrics: program length, difficulty, and effort. For a representative sample of queries, the average complexity using each instantiation was calculated. As theorized, end users querying the instantiation with the lower average complexity made fewer semantic errors, i.e., were more effective at composing queries. (c) 2005 Elsevier B.V. All rights reserved.
Resumo:
With the proliferation of relational database programs for PC's and other platforms, many business end-users are creating, maintaining, and querying their own databases. More importantly, business end-users use the output of these queries as the basis for operational, tactical, and strategic decisions. Inaccurate data reduce the expected quality of these decisions. Implementing various input validation controls, including higher levels of normalisation, can reduce the number of data anomalies entering the databases. Even in well-maintained databases, however, data anomalies will still accumulate. To improve the quality of data, databases can be queried periodically to locate and correct anomalies. This paper reports the results of two experiments that investigated the effects of different data structures on business end-users' abilities to detect data anomalies in a relational database. The results demonstrate that both unnormalised and higher levels of normalisation lower the effectiveness and efficiency of queries relative to the first normal form. First normal form databases appear to provide the most effective and efficient data structure for business end-users formulating queries to detect data anomalies.
Resumo:
This paper presents the SmartClean tool. The purpose of this tool is to detect and correct the data quality problems (DQPs). Compared with existing tools, SmartClean has the following main advantage: the user does not need to specify the execution sequence of the data cleaning operations. For that, an execution sequence was developed. The problems are manipulated (i.e., detected and corrected) following that sequence. The sequence also supports the incremental execution of the operations. In this paper, the underlying architecture of the tool is presented and its components are described in detail. The tool's validity and, consequently, of the architecture is demonstrated through the presentation of a case study. Although SmartClean has cleaning capabilities in all other levels, in this paper are only described those related with the attribute value level.