961 results for Client-server distributed databases
Abstract:
Methods for accessing data on the Web have been the focus of active research over the past few years. In this thesis we propose a method for representing Web sites as data sources. We designed Data Extractor, a data retrieval solution that allows us to define queries to Web sites and process the resulting data sets. Data Extractor is being integrated into the MSemODB heterogeneous database management system. With its help, database queries can be distributed over both local and Web data sources within the MSemODB framework. Data Extractor treats Web sites as data sources, controlling query execution and data retrieval. It works as an intermediary between the applications and the sites. Data Extractor utilizes a two-fold "custom wrapper" approach for information retrieval. Wrappers for the majority of sites are easily built using a powerful and expressive scripting language, while complex cases are processed using Java-based wrappers that utilize a specially designed library of data retrieval, parsing and Web access routines. In addition to wrapper development, we thoroughly investigate issues associated with Web site selection, analysis and processing. Data Extractor is designed to act as a data retrieval server, as well as an embedded data retrieval solution. We also use it to create mobile agents that are shipped over the Internet to the client's computer to perform data retrieval on behalf of the user. This approach allows Data Extractor to distribute and scale well. This study confirms the feasibility of building custom wrappers for Web sites. This approach provides accuracy of data retrieval, and power and flexibility in handling complex cases.
Abstract:
Today, databases have become an integral part of information systems. In the past two decades, we have seen different database systems being developed independently and used in different application domains. Today's interconnected networks and advanced applications, such as data warehousing, data mining and knowledge discovery, and intelligent data access to information on the Web, have created a need for integrated access to such heterogeneous, autonomous, distributed database systems. Heterogeneous/multidatabase research has focused on this issue, resulting in many different approaches. However, no single, generally accepted methodology has emerged in academia or industry that provides ubiquitous intelligent data access from heterogeneous, autonomous, distributed information sources. This thesis describes a heterogeneous database system being developed at the High Performance Database Research Center (HPDRC). A major impediment to ubiquitous deployment of multidatabase technology is the difficulty in resolving semantic heterogeneity, that is, identifying related information sources for integration and querying purposes. Our approach considers the semantics of the meta-data constructs in resolving this issue. The major contributions of the thesis work include: (i) a scalable, easy-to-implement architecture for developing a heterogeneous multidatabase system, utilizing the Semantic Binary Object-oriented Data Model (Sem-ODM) and the Semantic SQL query language to capture the semantics of the data sources being integrated and to provide an easy-to-use query facility; (ii) a methodology for semantic heterogeneity resolution that investigates the extents of the meta-data constructs of component schemas, and that is shown to be correct, complete and unambiguous; (iii) a semi-automated technique for identifying semantic relations, which is the basis of semantic knowledge for integration and querying, using shared ontologies for context mediation; (iv) resolutions for schematic conflicts and a language for defining global views from a set of component Sem-ODM schemas; (v) the design of a knowledge base for storing and manipulating meta-data and knowledge acquired during the integration process, which acts as the interface between the integration and query processing modules; (vi) techniques for Semantic SQL query processing and optimization based on semantic knowledge in a heterogeneous database environment; and (vii) a framework for intelligent computing and communication on the Internet applying the concepts of our work.
Abstract:
This project is about retrieving data by range without allowing the server to read it, when the database is stored on the server. Our goal is to build a database that allows the client to maintain the confidentiality of the stored data, even though all of it is stored in a different location from the client's hard disk. This means that all the information written on that remote disk can easily be read by another person, who can do anything with it. We therefore need to protect the data from eavesdroppers and other parties, who could sell it or use it to log into accounts and steal money or identities. To achieve this, we encrypt the data stored on the hard drive, so that only the holder of the key can read the stored information, while everyone else sees only encrypted data. Consequently, all data management must be done by the client; otherwise any malicious person could easily retrieve the data and misuse it. All the methods analysed here rely on encrypting the stored data. At the end of this project we analyse two theoretical and practical methods for creating such databases, and then we test them with 3 datasets and with 10, 100 and 1000 queries. The aim of this work is to identify a trend that can be useful for future work based on this project.
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-08
Abstract:
Database schemas, in many organizations, are considered one of the critical assets to be protected. From database schemas, it is possible to infer not only the information being collected but also the way organizations manage their businesses and/or activities. One of the ways to disclose database schemas is through the Create, Read, Update and Delete (CRUD) expressions. Their use can follow strict security rules or be unregulated and exploited by malicious users. In the first case, users are required to master database schemas. This can be critical when applications that access the database directly, which we call database interface applications (DIA), are developed by third-party organizations via outsourcing. In the second case, users can partially or totally disclose database schemas by following malicious algorithms based on CRUD expressions. To overcome this vulnerability, we propose a new technique where CRUD expressions can no longer be directly manipulated by DIAs. Whenever a DIA starts up, the associated database server generates a random codified token for each CRUD expression and sends these tokens to the DIA; the database server can then use a token to execute the corresponding CRUD expression. In order to validate our proposal, we present a conceptual architectural model and a proof of concept.
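The token exchange described in this abstract might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `TokenServer` class, its method names, the `users` table and the use of an in-memory SQLite database are all hypothetical and introduced only for the example.

```python
import secrets
import sqlite3


class TokenServer:
    """Hypothetical server-side registry hiding CRUD expressions behind random tokens."""

    def __init__(self, crud_expressions):
        # Assumption: an in-memory SQLite database stands in for the real DBMS.
        self._db = sqlite3.connect(":memory:")
        self._db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
        self._crud = dict(crud_expressions)  # logical name -> SQL text, kept on the server
        self._by_token = {}

    def start_session(self):
        """Called when a DIA starts up: issue a fresh random token per CRUD expression."""
        tokens = {}
        for name, sql in self._crud.items():
            token = secrets.token_hex(16)
            self._by_token[token] = sql
            tokens[name] = token
        return tokens  # the client receives names and tokens only, never SQL text

    def execute(self, token, params=()):
        """Run the CRUD expression bound to the token; unknown tokens are rejected."""
        sql = self._by_token.get(token)
        if sql is None:
            raise PermissionError("unknown token")
        cursor = self._db.execute(sql, params)
        self._db.commit()
        return cursor.fetchall()


server = TokenServer({
    "add_user": "INSERT INTO users (name) VALUES (?)",
    "list_users": "SELECT name FROM users ORDER BY id",
})
tokens = server.start_session()
server.execute(tokens["add_user"], ("alice",))
print(server.execute(tokens["list_users"]))
```

In this sketch the client-side code holds only opaque tokens, so inspecting it reveals neither table nor column names; whether the original proposal binds tokens per session in the same way is not stated in the abstract.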
Abstract:
In database applications, access control security layers are mostly developed from tools provided by vendors of database management systems and deployed in the same servers containing the data to be protected. This solution has several drawbacks. Among them we emphasize: 1) if policies are complex, their enforcement can degrade the performance of database servers; 2) when modifications in the established policies imply modifications in the business logic (usually deployed at the client side), there is no alternative but to modify the business logic in advance; and, finally, 3) malicious users can systematically issue CRUD expressions against the DBMS, expecting to identify any security gap. In order to overcome these drawbacks, in this paper we propose an access control stack with the following characteristics: most of the mechanisms are deployed at the client side; whenever security policies evolve, the security mechanisms are automatically updated at runtime; and, finally, client-side applications do not handle CRUD expressions directly. We also present an implementation of the proposed stack to prove its feasibility. This paper thus presents a new approach to enforcing access control in database applications, which we expect to contribute positively to the state of the art in the field.
Abstract:
Nowadays reinforcement learning has proven to be very effective in machine learning across a variety of fields, such as games, speech recognition and many others. We therefore decided to apply reinforcement learning to allocation problems, both because this research area has not yet been studied with this technique and because these problems encompass in their formulation a large set of sub-problems with similar characteristics, so that a solution for one of them extends to each of these sub-problems. In this project we built an application called Service Broker which, through reinforcement learning, learns how to distribute the execution of tasks across asynchronous, distributed workers. The analogy is that of a cloud data center, which has internal resources (possibly distributed across the server farm), receives tasks from its clients and executes them on these resources. The goal of the application, and hence of the data center, is to allocate these tasks so as to minimize the execution cost. In addition, in order to test the reinforcement learning agents developed, an environment (a simulator) was created that made it possible to concentrate on developing the components the agents need, without also having to deal with implementation aspects required in a real data center, such as communication with the various nodes and its latency. The results obtained confirmed the theory studied, achieving better performance than some of the classical methods for task allocation.
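As a rough illustration of learning a task allocation that minimizes execution cost, the following sketch uses a simple epsilon-greedy bandit. It is a deliberately simplified stand-in and not the agents or the simulator built in the thesis: the class name `EpsilonGreedyBroker`, the fixed per-worker costs and all parameter values are assumptions made for the example.

```python
import random


class EpsilonGreedyBroker:
    """Hypothetical broker that learns which worker executes tasks at the lowest cost."""

    def __init__(self, n_workers, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # Estimates start at 0, which is optimistic when costs are positive,
        # so every untried worker looks attractive to the greedy rule.
        self.estimates = [0.0] * n_workers
        self.counts = [0] * n_workers

    def best(self):
        """Index of the worker with the lowest estimated cost."""
        return min(range(len(self.estimates)), key=self.estimates.__getitem__)

    def choose(self):
        """Explore a random worker with probability epsilon, otherwise exploit."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.estimates))
        return self.best()

    def update(self, worker, cost):
        """Fold the observed cost into the running mean for that worker."""
        self.counts[worker] += 1
        self.estimates[worker] += (cost - self.estimates[worker]) / self.counts[worker]


def simulate(costs, steps=500):
    """Toy environment (assumption): worker i always charges costs[i] per task."""
    broker = EpsilonGreedyBroker(len(costs))
    for _ in range(steps):
        worker = broker.choose()
        broker.update(worker, costs[worker])
    return broker


broker = simulate([5.0, 2.0, 8.0])
print(broker.best(), broker.estimates)
```

A real data center would have stochastic, state-dependent costs, which is where a full reinforcement learning agent, rather than a stateless bandit, becomes relevant.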
Abstract:
The main objective of my thesis work is to exploit Kubeflow, the open-source platform originating at Google, and specifically Kubeflow Pipelines, to execute a scalable Federated Learning (FL) ML process in a simplified, 5G-like test architecture hosting a Kubernetes cluster. The work applies the widely adopted FedAVG algorithm and FedProx, an optimization of it, relying on the ML platform's ability to ease the development and production cycle of this specific FL process. FL algorithms are increasingly promising and increasingly adopted, both in Cloud application development and in 5G communication enhancement, where data coming from the monitoring of the underlying telco infrastructure is used, and training and data aggregation are executed at edge nodes, to optimize the global model of the algorithm (which could be used, for example, for resource provisioning to reach an agreed QoS for the underlying network slice). After a study of the available papers and scientific articles related to FL, carried out with the help of the CTTC, which suggested that I study and use Kubeflow to support the algorithm, we found that this approach to deploying the whole FL cycle was not documented and might be worth investigating in more depth. This study may help demonstrate the efficiency of the Kubeflow platform itself for developing new FL algorithms that will support new applications, and in particular it tests the performance of the FedAVG algorithm in a simulated client-to-cloud communication, using an MNIST dataset for FL as a benchmark.
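The aggregation step at the heart of FedAVG, a weighted average of client model parameters by local sample count, can be sketched as below. This is an illustrative, framework-free sketch and not the Kubeflow pipeline from the thesis: the function name `fedavg` and the flat-list representation of model parameters are assumptions made for the example.

```python
def fedavg(client_weights, client_sizes):
    """Average client parameter vectors, weighting each by its local sample count.

    client_weights: list of equally long lists of floats (one per client).
    client_sizes:   number of local training samples held by each client.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Two hypothetical clients; the second holds three times as much data.
clients = [[1.0, 2.0], [3.0, 4.0]]
sizes = [1, 3]
print(fedavg(clients, sizes))  # [2.5, 3.5]
```

In a full FL round, each client would first train locally from the current global model and send back its updated parameters; only the server-side averaging is shown here.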
Abstract:
New DNA-based predictive tests for physical characteristics and inference of ancestry are highly informative tools that are being increasingly used in forensic genetic analysis. Two eye colour prediction models, a Bayesian classifier (Snipper) and a multinomial logistic regression (MLR) system for the Irisplex assay, have been described for the analysis of unadmixed European populations. Since multiple SNPs in combination contribute in varying degrees to eye colour predictability in Europeans, it is likely that these predictive tests will perform differently amongst admixed populations that have European co-ancestry, compared to unadmixed Europeans. In this study we examined 99 individuals from two admixed South American populations, comparing eye colour with ancestry in order to reveal a direct correlation of light eye colour phenotypes with European co-ancestry in admixed individuals. Additionally, six eye colour prediction models, using varying numbers of SNPs and based on Snipper and MLR, were applied to the study populations. Furthermore, patterns of eye colour prediction have been inferred for a set of publicly available admixed and globally distributed populations from the HGDP-CEPH panel and 1000 Genomes databases, with a special emphasis on admixed American populations similar to those of the study samples.
Abstract:
Background: Recent studies have reported the clinical importance of CYP2C19 and ABCB1 polymorphisms in an individualized approach to clopidogrel treatment. The aims of this study were to evaluate the frequencies of CYP2C19 and ABCB1 polymorphisms and to identify the clopidogrel-predicted metabolic phenotypes according to ethnic groups in a sample of individuals representative of a highly admixed population. Methods: One hundred and eighty-three Amerindians and 1,029 subjects of the general population of 4 regions of the country were included. Genotypes for the ABCB1 c.C3435T (rs1045642), CYP2C19*2 (rs4244285), CYP2C19*3 (rs4986893), CYP2C19*4 (rs28399504), CYP2C19*5 (rs56337013), and CYP2C19*17 (rs12248560) polymorphisms were detected by polymerase chain reaction followed by high resolution melting analysis. The CYP2C19*3, CYP2C19*4 and CYP2C19*5 variants were genotyped in a subsample of subjects (300 samples randomly selected). Results: The CYP2C19*3 and CYP2C19*5 variant alleles were not detected and the CYP2C19*4 variant allele presented a frequency of 0.3%. The allelic frequencies for the ABCB1 c.C3435T, CYP2C19*2 and CYP2C19*17 polymorphisms were differently distributed according to ethnicity: Amerindian (51.4%, 10.4%, 15.8%); Caucasian descent (43.2%, 16.9%, 18.0%); Mulatto (35.9%, 16.5%, 21.3%); and African descent (32.8%, 20.2%, 26.3%) individuals, respectively. As a result, self-referred ethnicity was able to predict significantly different prevalences of clopidogrel-predicted metabolic phenotypes, even for a highly admixed population. Conclusion: Our findings indicate the existence of inter-ethnic differences in the ABCB1 and CYP2C19 variant allele frequencies in the Brazilian general population plus Amerindians. This information could help in stratifying individuals from this population regarding clopidogrel-predicted metabolic phenotypes and in designing more cost-effective programs towards individualization of clopidogrel therapy.
Abstract:
Background: Various neuroimaging studies, both structural and functional, have provided support for the proposal that a distributed brain network is likely to be the neural basis of intelligence. The theory of Distributed Intelligent Processing Systems (DIPS), first developed in the field of Artificial Intelligence, was proposed to adequately model distributed neural intelligent processing. In addition, the neural efficiency hypothesis suggests that individuals with higher intelligence display more focused cortical activation during cognitive performance, resulting in lower total brain activation when compared with individuals who have lower intelligence. This may be understood as a property of the DIPS. Methodology and Principal Findings: In our study, a new EEG brain mapping technique, based on the neural efficiency hypothesis and the notion of the brain as a Distributed Intelligent Processing System, was used to investigate the correlations between IQ evaluated with WAIS (Wechsler Adult Intelligence Scale) and WISC (Wechsler Intelligence Scale for Children), and the brain activity associated with visual and verbal processing, in order to test the validity of a distributed neural basis for intelligence. Conclusion: The present results support these claims and the neural efficiency hypothesis.
Abstract:
Recent advances in energy generation technology and new directions in electricity regulation have made distributed generation (DG) more widespread, with consequent significant impacts on the operational characteristics of distribution networks. For this reason, new methods for identifying such impacts are needed, together with research and development of new tools and resources to maintain and facilitate continued expansion towards DG. This paper presents a study aimed at determining appropriate DG sites for distribution systems. The main considerations which determine DG sites are also presented, together with an account of the advantages gained from correct DG placement. The paper defines some quantitative and qualitative parameters evaluated by Digsilent (R), GARP3 (R) and DSA-GD software. A multi-objective approach based on the Bellman-Zadeh algorithm and fuzzy logic is used to determine appropriate DG sites. The study also aims to find acceptable DG locations both for distribution system feeders and for nodes inside a given feeder. (C) 2010 Elsevier Ltd. All rights reserved.
Abstract:
This paper presents a novel graphical approach to adjust and evaluate frequency-based relays employed in anti-islanding protection schemes of distributed synchronous generators, in order to meet the anti-islanding and abnormal frequency variation requirements simultaneously. The proposed method defines a region in the power mismatch space inside which the relay non-detection zone should be located if the above-mentioned requirements are to be met. This region is called the power imbalance application region. Results show that this method can help protection engineers adjust frequency-based relays to improve the anti-islanding capability and to minimize false operation occurrences, while keeping the utility's abnormal frequency variation requirements satisfied. Moreover, the proposed method can be employed to coordinate different types of frequency-based relays, with the aim of improving the overall performance of the distributed generator frequency protection scheme. (C) 2011 Elsevier B.V. All rights reserved.
Abstract:
Wireless Sensor Networks (WSNs) have a vast field of applications, including deployment in hostile environments. Thus, the adoption of security mechanisms is fundamental. However, the extremely constrained nature of sensors and the potentially dynamic behavior of WSNs hinder the use of key management mechanisms commonly applied in modern networks. For this reason, many lightweight key management solutions have been proposed to overcome these constraints. In this paper, we review the state of the art of these solutions and evaluate them based on metrics adequate for WSNs. We focus on pre-distribution schemes well-adapted for homogeneous networks (since this is a more general network organization), thus identifying generic features that can improve some of these metrics. We also discuss some challenges in the area and future research directions. (C) 2010 Elsevier B.V. All rights reserved.