803 results for Computer science, Information systems
Abstract:
This paper proposes and describes an architecture that allows both the engineer and the programmer to define and quantify which peripherals of a microcontroller are important to a particular project. Each application requires different types of peripherals. In this study, we verified the possibility of emulating the behavior of peripherals in specific CPUs. These CPUs hold a RAM memory into which code written specifically for them, representing the behavior of some target peripheral, is loaded and then executed. We believe that the proposed architecture will provide greater flexibility in the use of microcontrollers, since these "dedicated hardware components" do not perform a single special function but are hardware capable of adapting itself to the needs of each project. This research was founded on a comparative study of four current microcontrollers. Preliminary tests using VHDL and FPGAs were performed.
Abstract:
Thymidine monophosphate kinase (TMPK) has emerged as an attractive target for developing inhibitors of Mycobacterium tuberculosis growth. In this study the receptor-independent (RI) 4D-QSAR formalism has been used to develop QSAR models and corresponding 3D-pharmacophores for a set of 5'-thiourea-substituted alpha-thymidine inhibitors. Models were developed for the entire training set and for a subset of the training set consisting of the most potent inhibitors. The optimized (RI) 4D-QSAR models are statistically significant (r² = 0.90, q² = 0.83 for the entire set; r² = 0.86, q² = 0.80 for the high-potency subset) and also possess good predictivity based on test set predictions. The most and least potent inhibitors, in their respective postulated active conformations derived from the models, were docked in the active site of the TMPK crystallographic structure. There is a solid consistency between the 3D-pharmacophore sites defined by the QSAR models and interactions with binding site residues. This model identifies new regions of the inhibitors that contain pharmacophore sites, such as the sugar-pyrimidine ring structure and the region of the 5'-arylthiourea moiety. These new regions of the ligands can be further explored and possibly exploited to identify new and perhaps better antituberculosis inhibitors of TMPKmt. Furthermore, the 3D-pharmacophores defined by these models can be used as a starting point for future receptor-dependent antituberculosis drug design, as well as to elucidate candidate sites for substituent addition to optimize ADMET properties of analog inhibitors.
Abstract:
Recently, we have built a classification model that is capable of assigning a given sesquiterpene lactone (STL) into exactly one tribe of the plant family Asteraceae from which the STL has been isolated. Although many plant species are able to biosynthesize a set of peculiar compounds, the occurrence of the same secondary metabolites in more than one tribe of Asteraceae is frequent. Building on our previous work, in this paper, we explore the possibility of assigning an STL to more than one tribe (class) simultaneously. When an object may belong to more than one class simultaneously, it is called multilabeled. In this work, we present a general overview of the techniques available to examine multilabeled data. The problem of evaluating the performance of a multilabeled classifier is discussed. Two particular multilabeled classification methods, cross-training with support vector machines (ct-SVM) and multilabeled k-nearest neighbors (M-L-kNN), were applied to the classification of the STLs into seven tribes from the plant family Asteraceae. The results are compared to a single-label classification and are analyzed from a chemotaxonomic point of view. The multilabeled approach allowed us to (1) model reality as closely as possible, (2) improve our understanding of the relationship between the secondary metabolite profiles of different Asteraceae tribes, and (3) significantly decrease the number of plant sources to be considered for finding a certain STL. The presented classification models are useful for the targeted collection of plants with the objective of finding plant sources of natural compounds that are biologically active or possess other specific properties of interest.
Abstract:
Coset enumeration is one of the most important procedures for investigating finitely presented groups. We present a practical parallel procedure for coset enumeration on shared memory processors. The shared memory architecture is particularly interesting because such parallel computation is both faster and cheaper. The lower cost comes when the program requires large amounts of memory, and additional CPUs allow us to lower the time that the expensive memory is being used. Rather than report on a suite of test cases, we take a single, typical case and analyze the performance factors in depth. The parallelization is achieved through a master-slave architecture. This results in an interesting phenomenon, whereby the CPU time is divided into a sequential and a parallel portion, and the parallel part demonstrates a speedup that is linear in the number of processors. We describe an early version for which only 40% of the program was parallelized, and we describe how this was modified to achieve 90% parallelization while using 15 slave processors and a master. In the latter case, a sequential time of 158 seconds was reduced to 29 seconds using 15 slaves.
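The split into a sequential and a parallel portion can be checked against the reported timings with a simple Amdahl's-law model. The sketch below is not taken from the paper: it assumes the parallel portion divides evenly across the slave processors and ignores communication overhead, and the function names are hypothetical.

```python
def amdahl_time(t_seq: float, parallel_fraction: float, workers: int) -> float:
    """Predicted run time when `parallel_fraction` of the work is split
    evenly over `workers` processors (Amdahl's law, no overhead assumed)."""
    return t_seq * ((1.0 - parallel_fraction) + parallel_fraction / workers)


def implied_parallel_fraction(t_seq: float, t_par: float, workers: int) -> float:
    """Parallel fraction that would explain an observed speedup under the
    same idealised model."""
    speedup = t_seq / t_par
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / workers)


# Figures quoted in the abstract: 158 s sequential, 29 s with 15 slaves.
predicted = amdahl_time(158.0, 0.90, 15)
implied = implied_parallel_fraction(158.0, 29.0, 15)
```

Under these assumptions, 90% parallelization on 15 slaves predicts about 25.3 seconds, and the measured 29 seconds corresponds to a parallel fraction of roughly 0.87, which is broadly in line with the figures reported.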
Abstract:
This paper develops an interactive approach for exploratory spatial data analysis. Measures of attribute similarity and spatial proximity are combined in a clustering model to support the identification of patterns in spatial information. Relationships between the developed clustering approach, spatial data mining and choropleth display are discussed. Analysis of property crime rates in Brisbane, Australia is presented. A surprising finding in this research is that there are substantial inconsistencies in standard choropleth display options found in two widely used commercial geographical information systems, both in terms of definition and performance. The comparative results demonstrate the usefulness and appeal of the developed approach in a geographical information system environment for exploratory spatial data analysis.
Abstract:
The World Wide Web (WWW) is useful for distributing scientific data. Most existing web data resources organize their information either in structured flat files or relational databases with basic retrieval capabilities. For databases with one or a few simple relations, these approaches are successful, but they can be cumbersome when there is a data model involving multiple relations between complex data. We believe that knowledge-based resources offer a solution in these cases. Knowledge bases have explicit declarations of the concepts in the domain, along with the relations between them. They are usually organized hierarchically, and provide a global data model with a controlled vocabulary. We have created the OWEB architecture for building online scientific data resources using knowledge bases. OWEB provides a shell for structuring data, providing secure and shared access, and creating computational modules for processing and displaying data. In this paper, we describe the translation of the online immunological database MHCPEP into an OWEB system called MHCWeb. This effort involved building a conceptual model for the data, creating a controlled terminology for the legal values for different types of data, and then translating the original data into the new structure. The OWEB environment allows for flexible access to the data by both users and computer programs.
Abstract:
The principle of using induction rules based on spatial environmental data to model a soil map has previously been demonstrated. Whilst the general pattern of classes of large spatial extent and those with close association with geology were delineated, small classes and the detailed spatial pattern of the map were less well rendered. Here we examine several strategies to improve the quality of the soil map models generated by rule induction. Terrain attributes that are better suited to landscape description at a resolution of 250 m are introduced as predictors of soil type. A map sampling strategy is developed. Classification error is reduced by using boosting rather than cross validation to improve the model. Further, the benefit of incorporating the local spatial context for each environmental variable into the rule induction is examined. The best model was achieved by sampling in proportion to the spatial extent of the mapped classes, boosting the decision trees, and using spatial contextual information extracted from the environmental variables.
Abstract:
Minimal perfect hash functions are used for memory efficient storage and fast retrieval of items from static sets. We present an infinite family of efficient and practical algorithms for generating order preserving minimal perfect hash functions. We show that almost all members of the family construct space and time optimal order preserving minimal perfect hash functions, and we identify the one with minimum constants. Members of the family generate a hash function in two steps. First a special kind of function into an r-graph is computed probabilistically. Then this function is refined deterministically to a minimal perfect hash function. We give strong theoretical evidence that the first step uses linear random time. The second step runs in linear deterministic time. The family not only has theoretical importance, but also offers the fastest known method for generating perfect hash functions.
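The two-step construction can be illustrated for ordinary graphs (the r = 2 case). The sketch below is an assumption-laden illustration, not the authors' implementation: the seeded BLAKE2b hash, the vertex ratio `c = 3.0`, and all function names are hypothetical choices. Step one repeatedly maps keys to random edges until the graph is acyclic; step two walks each tree once to assign vertex values `g` so that the two endpoint values of key number i sum to i modulo n.

```python
import hashlib
import random


def _vertex(key: str, seed: int, m: int) -> int:
    """Map a key to a vertex in [0, m) with a seeded hash (illustrative choice)."""
    digest = hashlib.blake2b(key.encode(), digest_size=8,
                             salt=seed.to_bytes(8, "little")).digest()
    return int.from_bytes(digest, "little") % m


def _assign(adj, n):
    """Step 2 (deterministic): return vertex values g, or None if a cycle exists."""
    g = [None] * len(adj)
    for root in range(len(adj)):
        if g[root] is not None:
            continue
        g[root] = 0
        stack = [(root, -1)]            # (vertex, index of edge used to reach it)
        while stack:
            u, via = stack.pop()
            for v, i in adj[u]:
                if i == via:            # do not walk back along the same edge
                    continue
                if g[v] is not None:    # reached a visited vertex: cycle found
                    return None
                g[v] = (i - g[u]) % n   # forces g[u] + g[v] == i (mod n)
                stack.append((v, i))
    return g


def build_opmphf(keys, c=3.0, max_tries=1000, rng=None):
    """Return h with h(keys[i]) == i for every i (order preserving)."""
    n = len(keys)
    if n == 0:
        raise ValueError("need at least one key")
    rng = rng or random.Random(0)
    m = int(c * n) + 1                  # number of graph vertices
    for _ in range(max_tries):          # Step 1 (probabilistic): retry until acyclic
        s1, s2 = rng.getrandbits(63), rng.getrandbits(63)
        adj = [[] for _ in range(m)]
        for i, key in enumerate(keys):
            u, v = _vertex(key, s1, m), _vertex(key, s2, m)
            adj[u].append((v, i))
            adj[v].append((u, i))
        g = _assign(adj, n)
        if g is not None:
            def h(key, g=g, s1=s1, s2=s2):
                return (g[_vertex(key, s1, m)] + g[_vertex(key, s2, m)]) % n
            return h
    raise RuntimeError("no acyclic graph found; try a larger c")


months = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]
h = build_opmphf(months)
```

Here `h("jan")` is 0 and `h("dec")` is 11, matching the input order. For keys outside the original set the function still returns some value in the range, so membership must be checked separately if it matters.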
Abstract:
In this paper, we propose a method based on association rule-mining to enhance the diagnosis of medical images (mammograms). It combines low-level features automatically extracted from images and high-level knowledge from specialists to search for patterns. Our method analyzes medical images and automatically generates suggestions of diagnoses employing mining of association rules. The suggestions of diagnosis are used to accelerate the image analysis performed by specialists as well as to provide them an alternative to work on. The proposed method uses two new algorithms, PreSAGe and HiCARe. The PreSAGe algorithm combines, in a single step, feature selection and discretization, and reduces the mining complexity. Experiments performed on PreSAGe show that this algorithm is highly suitable to perform feature selection and discretization in medical images. HiCARe is a new associative classifier. The HiCARe algorithm has an important property that makes it unique: it assigns multiple keywords per image to suggest a diagnosis with high values of accuracy. Our method was applied to real datasets, and the results show high sensitivity (up to 95%) and accuracy (up to 92%), allowing us to claim that the use of association rules is a powerful means to assist in the diagnosing task.
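For readers unfamiliar with association rules, the standard support and confidence measures can be sketched as follows. This is not PreSAGe or HiCARe: the item names and the four toy records are invented purely for illustration, and the helper names are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = frozenset(antecedent) | frozenset(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Made-up records: discretized image features plus a diagnosis keyword.
records = [
    frozenset({"texture=high", "shape=irregular", "malignant"}),
    frozenset({"texture=high", "shape=irregular", "malignant"}),
    frozenset({"texture=high", "shape=round", "benign"}),
    frozenset({"texture=low", "shape=round", "benign"}),
]

rule_support = support(records, {"texture=high", "shape=irregular", "malignant"})
rule_confidence = confidence(records, {"texture=high", "shape=irregular"}, {"malignant"})
```

On this toy data the rule {texture=high, shape=irregular} -> {malignant} has support 0.5 and confidence 1.0. An associative classifier would keep rules above chosen thresholds and use their consequents as suggested keywords.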
Abstract:
An important feature of some conceptual modelling grammars is the constructs they provide to allow database designers to show that real-world things may or may not possess a particular attribute or relationship. In the entity-relationship model, for example, the fact that a thing may not possess an attribute can be represented by using a special symbol to indicate that the attribute is optional. Similarly, the fact that a thing may or may not be involved in a relationship can be represented by showing the minimum cardinality of the relationship as zero. Whether these practices should be followed, however, is a contentious issue. An alternative approach is to eliminate optional attributes and relationships from conceptual schema diagrams by using subtypes that have only mandatory attributes and relationships. In this paper, we first present a theory that led us to predict that optional attributes and relationships should be used in conceptual schema diagrams only when users of the diagrams require a surface-level understanding of the domain being represented by the diagrams. When users require a deep-level understanding, however, optional attributes and relationships should not be used because they undermine users' abilities to grasp important domain semantics. We then describe three experiments that we undertook to test our predictions. The results of the experiments support our predictions.
Abstract:
With the proliferation of relational database programs for PCs and other platforms, many business end-users are creating, maintaining, and querying their own databases. More importantly, business end-users use the output of these queries as the basis for operational, tactical, and strategic decisions. Inaccurate data reduce the expected quality of these decisions. Implementing various input validation controls, including higher levels of normalisation, can reduce the number of data anomalies entering the databases. Even in well-maintained databases, however, data anomalies will still accumulate. To improve the quality of data, databases can be queried periodically to locate and correct anomalies. This paper reports the results of two experiments that investigated the effects of different data structures on business end-users' abilities to detect data anomalies in a relational database. The results demonstrate that both unnormalised structures and higher levels of normalisation lower the effectiveness and efficiency of queries relative to the first normal form. First normal form databases appear to provide the most effective and efficient data structure for business end-users formulating queries to detect data anomalies.
Abstract:
Map algebra is a data model and simple functional notation to study the distribution and patterns of spatial phenomena. It uses a uniform representation of space as discrete grids, which are organized into layers. This paper discusses extensions to map algebra to handle neighborhood operations with a new data type called a template. Templates provide general windowing operations on grids to enable spatial models for cellular automata, mathematical morphology, and local spatial statistics. A programming language for map algebra that incorporates templates and special processing constructs is described. The programming language is called MapScript. Example program scripts are presented to perform diverse and interesting neighborhood analyses for descriptive, model-based and process-based analysis.
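The idea of a template as a windowing operation on a grid can be sketched in a few lines. The example below is written in Python, not MapScript, and is only an illustration under stated assumptions: the template is a small matrix of weights centred on each cell, zero weights mean "not in the neighborhood", cells outside the grid are ignored, and the name `focal` is hypothetical.

```python
def focal(grid, template, reduce_fn=sum):
    """Apply a neighborhood template to every cell of a grid layer.

    For each cell, the weighted values under the non-zero template entries
    are collected and combined with `reduce_fn` (sum, max, min, ...).
    """
    rows, cols = len(grid), len(grid[0])
    t_rows, t_cols = len(template), len(template[0])
    centre_r, centre_c = t_rows // 2, t_cols // 2
    out = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            values = []
            for i in range(t_rows):
                for j in range(t_cols):
                    rr, cc = r + i - centre_r, c + j - centre_c
                    inside = 0 <= rr < rows and 0 <= cc < cols
                    if template[i][j] and inside:
                        values.append(grid[rr][cc] * template[i][j])
            out[r][c] = reduce_fn(values)
    return out


layer = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
cross = [[0, 1, 0],        # cell plus its four edge neighbors
         [1, 1, 1],
         [0, 1, 0]]

focal_sum = focal(layer, cross)
focal_max = focal(layer, cross, max)
```

With the cross-shaped template, the focal sum at the centre cell is 2 + 4 + 5 + 6 + 8 = 25. Swapping `reduce_fn` for `max` or `min` gives a simple dilation or erosion in the sense of mathematical morphology, and a template with a zero centre counts neighbors as needed for a cellular automaton.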
Abstract:
Most Internet search engines are keyword-based. They are not efficient for queries where geographical location is important, such as finding hotels within an area or close to a place of interest. A natural interface for spatial searching is a map, which can be used not only to display locations of search results but also to assist in forming search conditions. A map-based search engine requires a well-designed visual interface that is intuitive to use yet flexible and expressive enough to support various types of spatial queries as well as aspatial queries. Similar to hyperlinks for text and images in an HTML page, spatial objects in a map should support hyperlinks. Such an interface needs to be scalable with the size of the geographical regions and the number of websites it covers. Despite typically handling a very large amount of spatial data, a map-based search interface should meet the expectation of fast response time for interactive applications. In this paper we discuss general requirements and the design for a new map-based web search interface, focusing on integration with the WWW and the visual spatial query interface. A number of current and future research issues are discussed, and a prototype for the University of Queensland is presented. (C) 2001 Published by Elsevier Science Ltd.
Abstract:
A data warehouse is a data repository which collects and maintains a large amount of data from multiple distributed, autonomous and possibly heterogeneous data sources. Often the data is stored in the form of materialized views in order to provide fast access to the integrated data. One of the most important decisions in designing a data warehouse is the selection of views for materialization. The objective is to select an appropriate set of views that minimizes the total query response time with the constraint that the total maintenance time for these materialized views is within a given bound. This view selection problem is totally different from the view selection problem under the disk space constraint. In this paper the view selection problem under the maintenance time constraint is investigated. Two efficient, heuristic algorithms for the problem are proposed. The key to devising the proposed algorithms is to define good heuristic functions and to reduce the problem to some well-solved optimization problems. As a result, an approximate solution of the known optimization problem will give a feasible solution of the original problem. (C) 2001 Elsevier Science B.V. All rights reserved.