937 results for Relational Databases
Abstract:
The increase in the amount of data generated in recent years, which has come to be called Big Data, exposed weaknesses in relational technology for storing and handling that data and led to the emergence of NoSQL databases. These are divided into four distinct types: key/value, document, graph and column-family stores. This article focuses on column-based databases and analyzes the two systems of this type considered most relevant: Cassandra and HBase.
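For readers unfamiliar with the column-family model shared by Cassandra and HBase, the following minimal Python sketch illustrates the wide-column idea: rows are addressed by a key and group their data into column families, inside which each row may hold a different, sparse set of columns. The structure and sample data are purely hypothetical and only meant to illustrate the data model discussed in the abstract, not the APIs of either system.

```python
# Minimal in-memory illustration of the column-family (wide-column) data model
# used by Cassandra and HBase: row key -> column family -> {column: value}.
# All names and values are hypothetical.
from collections import defaultdict

class ColumnFamilyStore:
    def __init__(self, column_families):
        # One dict of rows per declared column family.
        self.data = {cf: defaultdict(dict) for cf in column_families}

    def put(self, row_key, family, column, value):
        # Rows are sparse: each row may carry a different set of columns.
        self.data[family][row_key][column] = value

    def get_row(self, row_key):
        # Collect the row's columns across all families.
        return {cf: dict(rows[row_key]) for cf, rows in self.data.items()
                if row_key in rows}

# Example usage with a hypothetical "users" table.
store = ColumnFamilyStore(["profile", "activity"])
store.put("user:42", "profile", "name", "Ana")
store.put("user:42", "profile", "email", "ana@example.com")
store.put("user:42", "activity", "last_login", "2015-03-01")
print(store.get_row("user:42"))
```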
Abstract:
Modeling Extract-Transform-Load (ETL) processes of a Data Warehousing System has always been a challenge. The heterogeneity of the sources, the quality of the data obtained and the reconciliation process are some of the issues that must be addressed in the design phase of this critical component. Commercial ETL tools often provide proprietary diagrammatic components and modeling languages that are not standard, thus not providing the ideal separation between a modeling platform and an execution platform. This separation, in conjunction with the use of standard notations and languages, is critical in a system that tends to evolve over time and which should not be undermined by a typically expensive tool that turns out to be an unsatisfactory component. In this paper we demonstrate the application of Relational Algebra as a modeling language for an ETL system, in an effort to standardize operations and provide a basis for unconventional ETL execution platforms.
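As a hedged illustration of the idea (the relation and attribute names below are invented for the example, not drawn from the paper), a single ETL step that filters valid source customer records, keeps only the attributes of interest and conforms them against a lookup dimension could be modeled in Relational Algebra as:

$$
T \;=\; \pi_{id,\,name,\,region\_key}\bigl(\sigma_{email \neq \mathrm{NULL}}(\mathit{SrcCustomers})\bigr) \;\bowtie_{region\_key = key}\; \mathit{DimRegion}
$$

Because each operator has precise, standard semantics, such an expression can be mapped onto different execution platforms without the proprietary-notation problem the abstract points out.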
Abstract:
The MAP-i Doctoral Programme in Informatics, of the Universities of Minho, Aveiro and Porto
Abstract:
Doctoral thesis in Basic Psychology
Abstract:
...This dissertation shows how we can build database management systems that use heterogeneous processors efficiently and reliably to accelerate query processing. To this end, we examine typical design decisions of co-processor-accelerated database management systems and, building on them, derive a generic architecture for such systems. Our investigations show that one of the most important problems for such database management systems is deciding which operators of a query should be executed on which processor...
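The operator-placement problem mentioned in the abstract can be illustrated with a small, hypothetical cost-based heuristic: for each operator of a query plan, an estimated execution cost per processor (including a data-transfer penalty for the co-processor) is compared and the cheaper device is chosen. This is only a sketch under assumed cost figures, not the dissertation's actual algorithm.

```python
# Hypothetical sketch of cost-based operator placement between a CPU and a GPU.
# Cost figures and the transfer penalty are invented for illustration only.

def place_operators(plan, transfer_cost_per_mb=0.05):
    """Assign each operator of a query plan to the cheaper processor."""
    placement = {}
    for op in plan:
        cpu_cost = op["cpu_cost"]
        # Using the co-processor requires shipping the input data over PCIe.
        gpu_cost = op["gpu_cost"] + op["input_mb"] * transfer_cost_per_mb
        placement[op["name"]] = "GPU" if gpu_cost < cpu_cost else "CPU"
    return placement

# A toy three-operator plan: scan, join, aggregate.
plan = [
    {"name": "scan_orders", "cpu_cost": 12.0, "gpu_cost": 4.0, "input_mb": 100},
    {"name": "hash_join",   "cpu_cost": 30.0, "gpu_cost": 8.0, "input_mb": 150},
    {"name": "aggregate",   "cpu_cost": 5.0,  "gpu_cost": 4.5, "input_mb": 10},
]
print(place_operators(plan))
# {'scan_orders': 'GPU', 'hash_join': 'GPU', 'aggregate': 'CPU'}
```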
Abstract:
This paper considers a long-term relationship between two agents who both undertake a costly action or investment that together produces a joint benefit. Agents have an opportunity to expropriate some of the joint benefit for their own use. Two cases are considered: (i) where agents are risk neutral and are subject to limited liability constraints and (ii) where agents are risk averse, have quasi-linear preferences in consumption and actions but where limited liability constraints do not bind. The question asked is how to structure the investments and division of the surplus over time so as to avoid expropriation. In the risk-neutral case, there may be an initial phase in which one agent overinvests and the other underinvests. However, both actions and surplus converge monotonically to a stationary state in which there is no overinvestment and surplus is at its maximum subject to the constraints. In the risk-averse case, there is no overinvestment. For this case, we establish that dynamics may or may not be monotonic depending on whether or not it is possible to sustain a first-best allocation. If the first-best allocation is not sustainable, then there is a trade-off between risk sharing and surplus maximization. In general, surplus will not be at its constrained maximum even in the long run.
Abstract:
High throughput genome (HTG) and expressed sequence tag (EST) sequences are currently the most abundant nucleotide sequence classes in the public database. The large volume, high degree of fragmentation and lack of gene structure annotations prevent efficient and effective searches of HTG and EST data for protein sequence homologies by standard search methods. Here, we briefly describe three newly developed resources that should make discovery of interesting genes in these sequence classes easier in the future, especially to biologists not having access to a powerful local bioinformatics environment. trEST and trGEN are regularly regenerated databases of hypothetical protein sequences predicted from EST and HTG sequences, respectively. Hits is a web-based data retrieval and analysis system providing access to precomputed matches between protein sequences (including sequences from trEST and trGEN) and patterns and profiles from Prosite and Pfam. The three resources can be accessed via the Hits home page (http://hits.isb-sib.ch).
Abstract:
The DNA microarray technology has arguably caught the attention of the worldwide life science community and is now systematically supporting major discoveries in many fields of study. The majority of the initial technical challenges of conducting experiments are being resolved, only to be replaced with new informatics hurdles, including statistical analysis, data visualization, interpretation, and storage. Two systems of databases, one containing expression data and one containing annotation data, are quickly becoming essential knowledge repositories of the research community. The present paper surveys several databases, which are considered "pillars" of research and important nodes in the network. This paper focuses on a generalized workflow scheme typical for microarray experiments using two examples related to cancer research. The workflow is used to reference appropriate databases and tools for each step in the process of array experimentation. Additionally, benefits and drawbacks of current array databases are addressed, and suggestions are made for their improvement.
Abstract:
The SIB Swiss Institute of Bioinformatics (www.isb-sib.ch) was created in 1998 as an institution to foster excellence in bioinformatics. It is renowned worldwide for its databases and software tools, such as UniProtKB/Swiss-Prot, PROSITE, SWISS-MODEL and STRING, all accessible on ExPASy.org, SIB's Bioinformatics Resource Portal. This article provides an overview of the scientific and training resources SIB has consistently been offering to the life science community for more than 15 years.
Abstract:
In the last decade microsatellites have become one of the most useful genetic markers used in a large number of organisms due to their abundance and high level of polymorphism. Microsatellites have been used for individual identification, paternity tests, forensic studies and population genetics. Data on microsatellite abundance comes preferentially from microsatellite-enriched libraries and DNA sequence databases. We have conducted a search in GenBank of more than 16,000 Schistosoma mansoni ESTs and 42,000 BAC sequences. In addition, we obtained 300 sequences from CA and AT microsatellite-enriched genomic libraries. The sequences were searched for simple repeats using the RepeatMasker software. Of 16,022 ESTs, we detected 481 (3%) sequences that contained 622 microsatellites (434 perfect, 164 imperfect and 24 compounds). Of the 481 ESTs, 194 were grouped into 63 clusters containing 2 to 15 ESTs per cluster. Polymorphisms were observed in 16 clusters. The 287 remaining ESTs were orphan sequences. Of the 42,017 BAC end sequences, 1,598 (3.8%) contained microsatellites (2,335 perfect, 287 imperfect and 79 compounds). Of the 1,598 BAC end sequences, 80 were grouped into 17 clusters containing 3 to 17 BAC end sequences per cluster. Microsatellites were present in 67 out of 300 sequences from microsatellite-enriched libraries (55 perfect, 38 imperfect and 15 compounds). From all of the observed loci, 55 were selected for having the longest perfect repeats and flanking regions that allowed the design of primers for PCR amplification. Additionally, we describe two new polymorphic microsatellite loci.
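As a rough illustration of the kind of simple-repeat detection the abstract describes (the actual study used RepeatMasker; the regular-expression approach, motifs and threshold below are only a simplified stand-in), perfect dinucleotide microsatellites such as (CA)n or (AT)n can be located in a sequence as follows:

```python
# Simplified detection of perfect dinucleotide microsatellites (e.g. (CA)n, (AT)n).
# The real study used RepeatMasker; this regex sketch only illustrates the idea.
import re

# At least 6 consecutive copies of a 2-bp motif, a common (but arbitrary) threshold.
DINUCLEOTIDE_REPEAT = re.compile(r"((?:CA){6,}|(?:AT){6,}|(?:GT){6,}|(?:TA){6,})")

def find_microsatellites(sequence):
    """Return (start, end, repeat) tuples for perfect dinucleotide repeats."""
    return [(m.start(), m.end(), m.group(1))
            for m in DINUCLEOTIDE_REPEAT.finditer(sequence.upper())]

# Hypothetical EST fragment containing a (CA)8 repeat.
est = "GGCTTACACACACACACACACATTGCGA"
print(find_microsatellites(est))   # [(6, 22, 'CACACACACACACACA')]
```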
Abstract:
A company hires us because it needs to manage the maintenance of all the equipment installed in its 32 work centres (such as fire extinguishers, boilers, air conditioning, photocopiers, scanners, printers, plotters, etc.), with maintenance carried out by several specialised service companies. This management, and access to all the necessary information, will be handled through database procedures. The database to be created must be able to manage the maintenance of all the equipment in each of the company's centres, and must provide tools that, on the one hand, correctly handle the information on the different types of maintenance (preventive, corrective, adaptive) that can be applied to the different devices and, on the other, allow future needs to be incorporated into the database, i.e. make it scalable. This database will also include mechanisms to deal with possible integration problems with the rest of the system, such as a log of the different actions performed on the database, or mechanisms for running various tests of the system's functionality. Besides maintenance management, a data warehouse will also be built to analyse the information contained in the aforementioned database, with the aim of supporting decisions about the data available at a given moment. The project is divided into two main blocks: in the first, a relational database is designed and implemented, including requirements definition, analysis and design, development of database procedures and tests to verify correct operation; in the second, a data warehouse is designed and implemented to exploit the data held in the relational database.
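A minimal sketch of the kind of relational schema the project describes might look as follows; the table and column names are invented for illustration, and Python's standard sqlite3 module merely stands in for whatever database engine the project actually uses.

```python
# Hypothetical sketch of the maintenance-management schema described above,
# using sqlite3 only as a stand-in engine. Names are invented for illustration.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE work_centre (
    centre_id   INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE equipment (
    equipment_id INTEGER PRIMARY KEY,
    centre_id    INTEGER NOT NULL REFERENCES work_centre(centre_id),
    kind         TEXT NOT NULL          -- e.g. extinguisher, boiler, printer
);
CREATE TABLE maintenance_action (
    action_id    INTEGER PRIMARY KEY,
    equipment_id INTEGER NOT NULL REFERENCES equipment(equipment_id),
    kind         TEXT CHECK (kind IN ('preventive', 'corrective', 'adaptive')),
    performed_on TEXT,                  -- ISO date
    provider     TEXT                   -- external service company
);
-- A simple action log, as the abstract suggests, kept up to date via a trigger.
CREATE TABLE action_log (
    logged_at TEXT DEFAULT CURRENT_TIMESTAMP,
    message   TEXT
);
CREATE TRIGGER log_maintenance AFTER INSERT ON maintenance_action
BEGIN
    INSERT INTO action_log(message)
    VALUES ('maintenance ' || NEW.kind || ' on equipment ' || NEW.equipment_id);
END;
""")
```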
Abstract:
The fundamental objective of this final-year project (TFC) is the creation and deployment of a data warehouse, starting from information held in transactional databases.
Abstract:
The aim of this project is to produce a set of tools that allow an external entity to interact with a relational database without requiring specific database knowledge. Interaction here covers everything from creating the database to querying the information, as well as updating that database.
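A small, hypothetical sketch of what such tooling could look like is shown below: a thin Python wrapper that hides the SQL needed to create a table, insert rows and query them. The class, its methods and the use of sqlite3 are assumptions made for the sake of a runnable example; they are not implied by the project itself.

```python
# Hypothetical sketch of a thin wrapper that lets a caller create, populate and
# query a relational table without writing SQL. sqlite3 is only an example engine.
import sqlite3

class SimpleTable:
    def __init__(self, db_path, name, columns):
        self.conn = sqlite3.connect(db_path)
        self.name = name
        cols = ", ".join(f"{c} TEXT" for c in columns)
        self.conn.execute(f"CREATE TABLE IF NOT EXISTS {name} ({cols})")

    def add(self, **values):
        # Insert one row; column names come from the keyword arguments.
        cols = ", ".join(values)
        marks = ", ".join("?" for _ in values)
        self.conn.execute(
            f"INSERT INTO {self.name} ({cols}) VALUES ({marks})",
            list(values.values()),
        )
        self.conn.commit()

    def find(self, **criteria):
        # Query by equality on any subset of columns.
        where = " AND ".join(f"{c} = ?" for c in criteria) or "1 = 1"
        cur = self.conn.execute(
            f"SELECT * FROM {self.name} WHERE {where}", list(criteria.values())
        )
        return cur.fetchall()

# Usage: no SQL visible to the caller.
people = SimpleTable(":memory:", "people", ["name", "city"])
people.add(name="Laia", city="Girona")
print(people.find(city="Girona"))    # [('Laia', 'Girona')]
```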
Abstract:
The objective is to study the object-oriented features of the SQL:1999 standard and to put them to the test with a commercial product that supports them.
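The object-oriented extensions referred to include user-defined structured types, type inheritance and typed tables. A hedged sketch of how these look in SQL:1999 syntax is given below as Python string constants; the exact syntax accepted varies between the commercial ORDBMSs that implement the standard, so this should be read as illustrative rather than as vendor-ready DDL.

```python
# Illustrative SQL:1999 object-relational DDL, kept as plain strings because the
# exact dialect depends on the commercial ORDBMS chosen for such a study.

CREATE_TYPE = """
CREATE TYPE person_t AS (
    name      VARCHAR(50),
    birthdate DATE
) NOT FINAL;
"""

# Type inheritance: a subtype adds attributes to its supertype.
CREATE_SUBTYPE = """
CREATE TYPE employee_t UNDER person_t AS (
    salary DECIMAL(9, 2)
) NOT FINAL;
"""

# A typed table whose rows are instances of the structured type.
CREATE_TYPED_TABLE = """
CREATE TABLE employees OF employee_t
    (REF IS employee_ref SYSTEM GENERATED);
"""

if __name__ == "__main__":
    # In practice these statements would be sent to an ORDBMS connection;
    # here we only print them.
    for ddl in (CREATE_TYPE, CREATE_SUBTYPE, CREATE_TYPED_TABLE):
        print(ddl.strip())
```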