969 results for Databases - Duplicate tuples
Abstract:
To ensure greater reliability and consistency of the data stored in a database, the data cleaning stage is placed early in the Knowledge Discovery in Databases (KDD) process. It is responsible for eliminating problems and adjusting the data for the later stages, especially data mining. Such problems occur at both the instance and schema levels and include missing values, null values, duplicate tuples, and values outside the domain, among others. Several algorithms have been developed to perform the cleaning step in databases; some were designed specifically to work with the phonetics of words, since a word can be written in different ways. Within this perspective, this work presents as its original contribution an optimization of an algorithm for detecting duplicate tuples in databases through phonetics, based on multithreading and requiring no training data, together with a language-independent environment to support it. © 2011 IEEE.
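The idea of phonetic duplicate detection with multithreading can be illustrated with a minimal sketch. This is not the algorithm from the abstract, which is not described in enough detail here to reproduce. It swaps in classic Soundex (an English-oriented code, so it is not language independent) and Python's standard thread pool. The names `soundex`, `phonetic_key`, and `find_duplicates`, and the sample records, are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Soundex consonant classes (assumption: classic American Soundex).
_CODES = {}
for _digit, _letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                         "4": "l", "5": "mn", "6": "r"}.items():
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word):
    """Return the 4-character Soundex code of a word ('' if no letters)."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    encoded = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = _CODES.get(ch, "")
        if code and code != prev:
            encoded += code
        prev = code  # vowels reset prev, so a repeated code is kept
    return (encoded + "000")[:4]


def phonetic_key(record):
    """Build a comparison key from the phonetic code of every field."""
    return tuple(" ".join(soundex(tok) for tok in field.split())
                 for field in record)


def find_duplicates(records, workers=4):
    """Group tuples whose fields all sound alike; return groups of 2+."""
    # Keys are computed in a thread pool. In CPython the GIL limits the
    # speed-up for pure-Python work, so this only shows the structure.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        keys = list(pool.map(phonetic_key, records))
    groups = defaultdict(list)
    for record, key in zip(records, keys):
        groups[key].append(record)
    return [group for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    sample = [("Robert", "Smith"), ("Rupert", "Smyth"), ("Alice", "Jones")]
    print(find_duplicates(sample))
```

Grouping by an exact phonetic key needs no training data, which matches the spirit of the abstract, but it reports only candidate duplicates; a real cleaning step would normally confirm them with a second comparison.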
Abstract:
Graduate Program in Computer Science - IBILCE
Abstract:
Graduate Program in Computer Science - IBILCE
Abstract:
Dissertation submitted for the degree of Master in Informatics Engineering
Impact of Commercial Search Engines and International Databases on Engineering Teaching and Research
Abstract:
For the last three decades, the engineering higher education and professional environments have been completely transformed by the "electronic/digital information revolution", which has included the introduction of the personal computer, the development of email and the World Wide Web, and broadband Internet connections at home. Herein the writer compares the performance of several digital tools with traditional library resources. While new specialised search engines and open access digital repositories may fill a gap between conventional search engines and traditional references, they should not be confused with real libraries and international scientific databases that encompass textbooks and peer-reviewed scholarly works. The absence of a work from some Internet search listings, databases and repositories is not an indication of its standing. Researchers, engineers and academics should remember these key differences when assessing the quality of bibliographic "research" based solely upon Internet searches.
Abstract:
The new technologies for Knowledge Discovery from Databases (KDD) and data mining promise to bring new insights into a large and growing volume of biological data. KDD technology is complementary to laboratory experimentation and helps speed up biological research. This article contains an introduction to KDD, a review of data mining tools, and their biological applications. We discuss the domain concepts related to biological data and databases, as well as current KDD and data mining developments in biology.
Abstract:
Over recent years databases have become an extremely important resource for biomedical research. Immunology research is increasingly dependent on access to extensive biological databases to extract existing information, plan experiments, and analyse experimental results. This review describes 15 immunological databases that have appeared over the last 30 years. In addition, important issues regarding database design and the potential for misuse of information contained within these databases are discussed. Access pointers are provided for the major immunological databases and also for a number of other immunological resources accessible over the World Wide Web (WWW). (C) 2000 Elsevier Science B.V. All rights reserved.
Abstract:
A significant number of chimeric 16S rDNA sequences of diverse origin were identified in the public databases by partial treeing analysis. This suggests that chimeric sequences, representing phylogenetically novel non-existent organisms, are routinely being overlooked in molecular phylogenetic surveys despite a general awareness of PCR-generated artefacts amongst researchers.
Abstract:
Allergies represent a significant medical and industrial problem. Molecular and clinical data on allergens are growing exponentially and in this article we have reviewed nine specialized allergen databases and identified data sources related to protein allergens contained in general purpose molecular databases. An analysis of allergens contained in public databases indicates a high level of redundancy of entries and a relatively low coverage of allergens by individual databases. From this analysis we identify current database needs for allergy research and, in particular, highlight the need for a centralized reference allergen database.
Abstract:
Spatial data is now used extensively in the Web environment, providing online customized maps and supporting map-based applications. The full potential of Web-based spatial applications, however, has yet to be achieved because of performance issues related to the large size and high complexity of spatial data. In this paper, we introduce a multiresolution approach to spatial data management and query processing such that the database server can choose spatial data at the right resolution level for different Web applications. One highly desirable property of the proposed approach is that the server-side processing cost and network traffic can be reduced when the level of resolution required by applications is low. Another advantage is that our approach pushes complex multiresolution structures and algorithms into the spatial database engine, so the developer of spatial Web applications need not be concerned with such complexity. This paper explains the basic idea, technical feasibility and applications of multiresolution spatial databases.
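A small sketch can make the "right resolution level" idea concrete. The abstract does not specify the multiresolution structure it uses, so the following is only an assumption: each line geometry is stored at several precomputed levels, built here with Douglas-Peucker simplification as a stand-in, and the server returns the coarsest level whose error stays within what the application requests. The class `MultiresolutionLine` and its method `at_resolution` are hypothetical names.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Douglas-Peucker: drop points closer than tolerance to the chord."""
    if len(points) < 3:
        return list(points)
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], points[0], points[-1])
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [points[0], points[-1]]
    left = simplify(points[:index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right  # the split point appears in both halves


class MultiresolutionLine:
    """Hypothetical store holding one line at several precomputed levels."""

    def __init__(self, points, tolerances):
        # Levels are ordered from finest (smallest tolerance) to coarsest.
        self.levels = [(tol, simplify(points, tol))
                       for tol in sorted(tolerances)]

    def at_resolution(self, max_error):
        """Return the coarsest level whose tolerance is within max_error.

        Falls back to the finest stored level if none qualifies.
        """
        chosen = self.levels[0][1]
        for tol, pts in self.levels:
            if tol <= max_error:
                chosen = pts
        return chosen


if __name__ == "__main__":
    line = [(0, 0), (1, 1), (2, 0), (3, 1), (4, 0)]
    store = MultiresolutionLine(line, tolerances=[0.0, 2.0])
    print(len(store.at_resolution(0.5)), len(store.at_resolution(5.0)))
```

Because the levels are computed once on the server side, a low-resolution request transfers fewer points without any extra work in the client application, which is the cost and traffic saving the abstract describes.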
Abstract:
There is a considerable body of new information on Gynecology and Obstetrics. To help keep gynecologists up to date, renowned periodicals publish review articles. Review articles enable the reader to obtain the best evidence for clinical or research issues from several individual articles, which in turn enables the professional to make clinical decisions in the light of current knowledge. The present article discusses the different types of reviews and the databases that may be used to prepare them. It is suggested that future reviews on Gynecology and Obstetrics include articles published in languages other than English and that a larger number of databases be searched. Reviews will thus be not only more inclusive but also more representative of the international literature.
Abstract:
This publication is a support and resource document for the "National Action Plan for Promotion, Prevention and Early Intervention for Mental Health 2000". It includes indicators, measurement tools and databases relevant to assessing the implementation of the outcomes and strategies identified in the action plan.
Abstract:
The changes introduced into the European Higher Education Area (EHEA) by the Bologna Process, together with renewed pedagogical and methodological practices, have created a new teaching-learning paradigm: Student-Centred Learning. In addition, the last few years have been characterized by the application of Information Technologies, especially the Semantic Web, not only to the teaching-learning process but also to administrative processes within learning institutions. The aim of this study was twofold: to present a model for identifying and classifying Competencies and Learning Outcomes, and to develop computer applications of this information management model, namely a relational Database and an Ontology.