988 resultados para distributed databases


Relevância:

60.00% 60.00%

Publicador:

Resumo:

A significant set of information stored in different databases around the world, can be shared through peer-topeer databases. With that, is obtained a large base of knowledge, without the need for large investments because they are used existing databases, as well as the infrastructure in place. However, the structural characteristics of peer-topeer, makes complex the process of finding such information. On the other side, these databases are often heterogeneous in their schemas, but semantically similar in their content. A good peer-to-peer databases systems should allow the user access information from databases scattered across the network and receive only the information really relate to your topic of interest. This paper proposes to use ontologies in peer-to-peer database queries to represent the semantics inherent to the data. The main contribution of this work is enable integration between heterogeneous databases, improve the performance of such queries and use the algorithm of optimization Ant Colony to solve the problem of locating information on peer-to-peer networks, which presents an improve of 18% in results. © 2011 IEEE.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The development of new technologies that use peer-to-peer networks grows every day, with the object to supply the need of sharing information, resources and services of databases around the world. Among them are the peer-to-peer databases that take advantage of peer-to-peer networks to manage distributed knowledge bases, allowing the sharing of information semantically related but syntactically heterogeneous. However, it is a challenge to ensure the efficient search for information without compromising the autonomy of each node and network flexibility, given the structural characteristics of these networks. On the other hand, some studies propose the use of ontology semantics by assigning standardized categorization of information. The main original contribution of this work is the approach of this problem with a proposal for optimization of queries supported by the Ant Colony algorithm and classification though ontologies. The results show that this strategy enables the semantic support to the searches in peer-to-peer databases, aiming to expand the results without compromising network performance. © 2011 IEEE.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Las aplicaciones distribuidas que precisan de un servicio multipunto fiable son muy numerosas, y entre otras es posible citar las siguientes: bases de datos distribuidas, sistemas operativos distribuidos, sistemas de simulación interactiva distribuida y aplicaciones de distribución de software, publicaciones o noticias. Aunque en sus orígenes el dominio de aplicación de tales sistemas distribuidos estaba reducido a una única subred (por ejemplo una Red de Área Local) posteriormente ha surgido la necesidad de ampliar su aplicabilidad a interredes. La aproximación tradicional al problema del multipunto fiable en interredes se ha basado principalmente en los dos siguientes puntos: (1) proporcionar en un mismo protocolo muchas garantías de servicio (por ejemplo fiabilidad, atomicidad y ordenación) y a su vez algunas de éstas en distintos grados, sin tener en cuenta que muchas aplicaciones multipunto que precisan fiabilidad no necesitan otras garantías; y (2) extender al entorno multipunto las soluciones ya adoptadas en el entorno punto a punto sin considerar las características diferenciadoras; y de aquí, que se haya tratado de resolver el problema de la fiabilidad multipunto con protocolos extremo a extremo (protocolos de transporte) y utilizando esquemas de recuperación de errores, centralizados (las retransmisiones se hacen desde un único punto, normalmente la fuente) y globales (los paquetes solicitados se vuelven a enviar al grupo completo). En general, estos planteamientos han dado como resultado protocolos que son ineficientes en tiempo de ejecución, tienen problemas de escalabilidad, no hacen un uso óptimo de los recursos de red y no son adecuados para aplicaciones sensibles al retardo. En esta Tesis se investiga el problema de la fiabilidad multipunto en interredes operando en modo datagrama y se presenta una forma novedosa de enfocar el problema: es más óptimo resolver el problema de la fiabilidad multipunto a nivel de red y separar la fiabilidad de otras garantías de servicio, que pueden ser proporcionadas por un protocolo de nivel superior o por la propia aplicación. Siguiendo este nuevo enfoque se ha diseñado un protocolo multipunto fiable que opera a nivel de red (denominado RMNP). Las características más representativas del RMNP son las siguientes; (1) sigue una aproximación orientada al emisor, lo cual permite lograr un grado muy alto de fiabilidad; (2) plantea un esquema de recuperación de errores distribuido (las retransmisiones se hacen desde ciertos encaminadores intermedios que siempre estarán más cercanos a los miembros que la propia fuente) y de ámbito restringido (el alcance de las retransmisiones está restringido a un cierto número de miembros). Este esquema hace posible optimizar el retardo medio de distribución y disminuir la sobrecarga introducida por las retransmisiones; (3) incorpora en ciertos encaminadores funciones de agregación y filtrado de paquetes de control, que evitan problemas de implosión y reducen el tráfico que fluye hacia la fuente. Con el fin de evaluar el comportamiento del protocolo diseñado, se han realizado pruebas de simulación obteniéndose como principales conclusiones que, el RMNP escala correctamente con el tamaño del grupo, hace un uso óptimo de los recursos de red y es adecuado para aplicaciones sensibles al retardo.---ABSTRACT---There are many distributed applications that require a reliable multicast service, including: distributed databases, distributed operating systems, distributed interactive simulation systems and distribution applications of software, publications or news. Although the application domain of distributed systems of this type was originally confíned to a single subnetwork (for example, a Local Área Network), it later became necessary extend their applicability to internetworks. The traditional approach to the reliable multicast problem in internetworks is based mainly on the following two points: (1) provide a lot of service guarantees in one and the same protocol (for example, reliability, atomicity and ordering) and different levéis of guarantee in some cases, without taking into account that many multicast applications that require reliability do not need other guarantees, and (2) extend solutions adopted in the unicast environment to the multicast environment without taking into account their distinctive characteristics. So, the attempted solutions to the multicast reliability problem were end-to-end protocols (transport protocols) and centralized error recovery schemata (retransmissions made from a single point, normally the source) and global error retrieval schemata (the requested packets are retransmitted to the whole group). Generally, these approaches have resulted in protocols that are inefficient in execution time, have scaling problems, do not make optimum use of network resources and are not suitable for delay-sensitive applications. Here, the multicast reliability problem is investigated in internetworks operating in datagram mode and a new way of approaching the problem is presented: it is better to solve to the multicast reliability problem at network level and sepárate reliability from other service guarantees that can be supplied by a higher protocol or the application itself. A reliable multicast protocol that operates at network level (called RMNP) has been designed on the basis of this new approach. The most representative characteristics of the RMNP are as follows: (1) it takes a transmitter-oriented approach, which provides for a very high reliability level; (2) it provides for an error retrieval schema that is distributed (the retransmissions are made from given intermedíate routers that will always be closer to the members than the source itself) and of restricted scope (the scope of the retransmissions is confined to a given number of members), and this schema makes it possible to optimize the mean distribution delay and reduce the overload caused by retransmissions; (3) some routers include control packet aggregation and filtering functions that prevent implosión problems and reduce the traffic flowing towards the source. Simulation test have been performed in order to evalúate the behaviour of the protocol designed. The main conclusions are that the RMNP scales correctly with group size, makes optimum use of network resources and is suitable for delay-sensitive applications.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The authors analyse some of the research outcomes achieved during the implementation of the EC GUIDE research project “Creating an European Identity Management Architecture for eGovernment”, as well as their personal experience. The project goals and achievements are however considered in a broader context. The key role of Identity in the Information Society was emphasised, that the research and development in this field is in its initial phase. The scope of research related to Identity, including the one related to Identity Management and Interoperability of Identity Management Systems, is expected to be further extended. The authors analyse the abovementioned issues in the context established by the EC European Interoperability Framework (EIF) as a reference document on interoperability for the Interoperable Delivery of European eGovernment Services to Public Administrations, Business and Citizens (IDABC) Work Programme. This programme aims at supporting the pan-European delivery of electronic government services.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

A replicação de base de dados tem como objectivo a cópia de dados entre bases de dados distribuídas numa rede de computadores. A replicação de dados é importante em várias situações, desde a realização de cópias de segurança da informação, ao balanceamento de carga, à distribuição da informação por vários locais, até à integração de sistemas heterogéneos. A replicação possibilita uma diminuição do tráfego de rede, pois os dados ficam disponíveis localmente possibilitando também o seu acesso no caso de indisponibilidade da rede. Esta dissertação baseia-se na realização de um trabalho que consistiu no desenvolvimento de uma aplicação genérica para a replicação de bases de dados a disponibilizar como open source software. A aplicação desenvolvida possibilita a integração de dados entre vários sistemas, com foco na integração de dados heterogéneos, na fragmentação de dados e também na possibilidade de adaptação a várias situações. ABSTRACT: Data replication is a mechanism to synchronize and integrate data between distributed databases over a computer network. Data replication is an important tool in several situations, such as the creation of backup systems, load balancing between various nodes, distribution of information between various locations, integration of heterogeneous systems. Replication enables a reduction in network traffic, because data remains available locally even in the event of a temporary network failure. This thesis is based on the work carried out to develop an application for database replication to be made accessible as open source software. The application that was built allows for data integration between various systems, with particular focus on, amongst others, the integration of heterogeneous data, the fragmentation of data, replication in cascade, data format changes between replicas, master/slave and multi master synchronization.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The principles of organization of the distributed system of databases on properties of inorganic substances and materials based on the use of a special reference database are considered. The last includes not only information on a site of the data about the certain substance in other databases but also brief information on the most widespread properties of inorganic substances. The proposed principles were successfully realized at the creation of the distributed system of databases on properties of inorganic compounds developed by A.A.Baikov Institute of Metallurgy and Materials Science of the Russian Academy of Sciences.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper presents a distributed hierarchical multiagent architecture for detecting SQL injection attacks against databases. It uses a novel strategy, which is supported by a Case-Based Reasoning mechanism, which provides to the classifier agents with a great capacity of learning and adaptation to face this type of attack. The architecture combines strategies of intrusion detection systems such as misuse detection and anomaly detection. It has been tested and the results are presented in this paper.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

With the emergence of Internet, the global connectivity of computers has become a reality. Internet has progressed to provide many user-friendly tools like Gopher, WAIS, WWW etc. for information publishing and access. The WWW, which integrates all other access tools, also provides a very convenient means for publishing and accessing multimedia and hypertext linked documents stored in computers spread across the world. With the emergence of WWW technology, most of the information activities are becoming Web-centric. Once the information is published on the Web, a user can access this information from any part of the world. A Web browser like Netscape or Internet Explorer is used as a common user interface for accessing information/databases. This will greatly relieve a user from learning the search syntax of individual information systems. Libraries are taking advantage of these developments to provide access to their resources on the Web. CDS/ISIS is a very popular bibliographic information management software used in India. In this tutorial we present details of integrating CDS/ISIS with the WWW. A number of tools are now available for making CDS/ISIS database accessible on the Internet/Web. Some of these are 1) the WAIS_ISIS Server. 2) the WWWISIS Server 3) the IQUERY Server. In this tutorial, we have explained in detail the steps involved in providing Web access to an existing CDS/ISIS database using the freely available software, WWWISIS. This software is developed, maintained and distributed by BIREME, the Latin American & Caribbean Centre on Health Sciences Information. WWWISIS acts as a server for CDS/ISIS databases in a WWW client/server environment. It supports functions for searching, formatting and data entry operations over CDS/ISIS databases. WWWISIS is available for various operating systems. We have tested this software on Windows '95, Windows NT and Red Hat Linux release 5.2 (Appolo) Kernel 2. 0. 36 on an i686. The testing was carried out using IISc's main library's OPAC containing more than 80,000 records and Current Contents issues (bibliographic data) containing more than 25,000 records. WWWISIS is fully compatible with CDS/ISIS 3.07 file structure. However, on a system running Unix or its variant, there is no guarantee of this compatibility. It is therefore safe to recreate the master and the inverted files, using utilities provided by BIREME, under Unix environment.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Practical usage of machine learning is gaining strategic importance in enterprises looking for business intelligence. However, most enterprise data is distributed in multiple relational databases with expert-designed schema. Using traditional single-table machine learning techniques over such data not only incur a computational penalty for converting to a flat form (mega-join), even the human-specified semantic information present in the relations is lost. In this paper, we present a practical, two-phase hierarchical meta-classification algorithm for relational databases with a semantic divide and conquer approach. We propose a recursive, prediction aggregation technique over heterogeneous classifiers applied on individual database tables. The proposed algorithm was evaluated on three diverse datasets. namely TPCH, PKDD and UCI benchmarks and showed considerable reduction in classification time without any loss of prediction accuracy. (C) 2012 Elsevier Ltd. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper shows how knowledge, in the form of fuzzy rules, can be derived from a self-organizing supervised learning neural network called fuzzy ARTMAP. Rule extraction proceeds in two stages: pruning removes those recognition nodes whose confidence index falls below a selected threshold; and quantization of continuous learned weights allows the final system state to be translated into a usable set of rules. Simulations on a medical prediction problem, the Pima Indian Diabetes (PID) database, illustrate the method. In the simulations, pruned networks about 1/3 the size of the original actually show improved performance. Quantization yields comprehensible rules with only slight degradation in test set prediction performance.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Sharing of information with those in need of it has always been an idealistic goal of networked environments. With the proliferation of computer networks, information is so widely distributed among systems, that it is imperative to have well-organized schemes for retrieval and also discovery. This thesis attempts to investigate the problems associated with such schemes and suggests a software architecture, which is aimed towards achieving a meaningful discovery. Usage of information elements as a modelling base for efficient information discovery in distributed systems is demonstrated with the aid of a novel conceptual entity called infotron.The investigations are focused on distributed systems and their associated problems. The study was directed towards identifying suitable software architecture and incorporating the same in an environment where information growth is phenomenal and a proper mechanism for carrying out information discovery becomes feasible. An empirical study undertaken with the aid of an election database of constituencies distributed geographically, provided the insights required. This is manifested in the Election Counting and Reporting Software (ECRS) System. ECRS system is a software system, which is essentially distributed in nature designed to prepare reports to district administrators about the election counting process and to generate other miscellaneous statutory reports.Most of the distributed systems of the nature of ECRS normally will possess a "fragile architecture" which would make them amenable to collapse, with the occurrence of minor faults. This is resolved with the help of the penta-tier architecture proposed, that contained five different technologies at different tiers of the architecture.The results of experiment conducted and its analysis show that such an architecture would help to maintain different components of the software intact in an impermeable manner from any internal or external faults. The architecture thus evolved needed a mechanism to support information processing and discovery. This necessitated the introduction of the noveI concept of infotrons. Further, when a computing machine has to perform any meaningful extraction of information, it is guided by what is termed an infotron dictionary.The other empirical study was to find out which of the two prominent markup languages namely HTML and XML, is best suited for the incorporation of infotrons. A comparative study of 200 documents in HTML and XML was undertaken. The result was in favor ofXML.The concept of infotron and that of infotron dictionary, which were developed, was applied to implement an Information Discovery System (IDS). IDS is essentially, a system, that starts with the infotron(s) supplied as clue(s), and results in brewing the information required to satisfy the need of the information discoverer by utilizing the documents available at its disposal (as information space). The various components of the system and their interaction follows the penta-tier architectural model and therefore can be considered fault-tolerant. IDS is generic in nature and therefore the characteristics and the specifications were drawn up accordingly. Many subsystems interacted with multiple infotron dictionaries that were maintained in the system.In order to demonstrate the working of the IDS and to discover the information without modification of a typical Library Information System (LIS), an Information Discovery in Library Information System (lDLIS) application was developed. IDLIS is essentially a wrapper for the LIS, which maintains all the databases of the library. The purpose was to demonstrate that the functionality of a legacy system could be enhanced with the augmentation of IDS leading to information discovery service. IDLIS demonstrates IDS in action. IDLIS proves that any legacy system could be augmented with IDS effectively to provide the additional functionality of information discovery service.Possible applications of IDS and scope for further research in the field are covered.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Tycho was conceived in 2003 in response to a need by the GridRM [1] resource-monitoring project for a ldquolight-weightrdquo, scalable and easy to use wide-area distributed registry and messaging system. Since Tycho's first release in 2006 a number of modifications have been made to the system to make it easier to use and more flexible. Since its inception, Tycho has been utilised across a number of application domains including widearea resource monitoring, distributed queries across archival databases, providing services for the nodes of a Cray supercomputer, and as a system for transferring multi-terabyte scientific datasets across the Internet. This paper provides an overview of the initial Tycho system, describes a number of applications that utilise Tycho, discusses a number of new utilities, and how the Tycho infrastructure has evolved in response to experience of building applications with it.