938 results for Optimistic data replication system
Abstract:
Nowadays, the data available to and used by companies is growing very fast, creating the need to use and manage it as efficiently as possible. To this end, data is replicated over multiple datacenters using different replication protocols, chosen according to needs such as higher availability or a stronger consistency level. The costs associated with full data replication can be very high, and most of the time full replication is not needed, since information can be logically partitioned. Another problem is that, by using datacenters to store and process information, clients become heavily dependent on them. We propose a partial replication protocol called ParTree, which replicates data to clients and organizes clients in a hierarchy, using communication between them to propagate information. This solution addresses some of these problems, namely by supporting partial data replication and an offline execution mode. Given the complexity of the protocol, formal verification is crucial to ensure two correctness properties of the protocol: causal consistency and preservation of data. The use of the TLA+ language and tools to formally specify and verify the proposed protocol is also described.
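The abstract does not give ParTree's algorithm in detail; as a rough illustration only, the Python sketch below (with invented names such as Client and local_write, not ParTree's API) shows how updates can be propagated down a client hierarchy under partial replication while version vectors track causal history.

    # Toy hierarchy: each client replicates only the keys it is interested in,
    # and updates are pushed from parent to children with a version vector.
    class Client:
        def __init__(self, name, partitions):
            self.name = name
            self.partitions = set(partitions)   # partial replication: keys stored here
            self.children = []
            self.clock = {}                     # version vector: writer -> counter
            self.store = {}

        def local_write(self, key, value):
            self.clock[self.name] = self.clock.get(self.name, 0) + 1
            update = (key, value, dict(self.clock))
            self.apply(update)
            return update

        def apply(self, update):
            key, value, vv = update
            if key in self.partitions:
                self.store[key] = value
            # merge version vectors so causal history is kept even for keys
            # this client does not replicate
            for node, count in vv.items():
                self.clock[node] = max(self.clock.get(node, 0), count)
            for child in self.children:
                child.apply(update)

    root = Client("root", {"a", "b"})
    leaf = Client("leaf", {"a"})
    root.children.append(leaf)
    root.local_write("a", 1)
    print(leaf.store, leaf.clock)   # {'a': 1} {'root': 1}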
Abstract:
Over the last few years, many research efforts have been made to improve the design of ETL (Extract-Transform-Load) systems. ETL systems are considered very time-consuming, error-prone, and complex, involving several participants from different knowledge domains. ETL processes are among the most important components of a data warehousing system and are strongly influenced by the complexity of business requirements and by their change and evolution. These aspects influence not only the structure of a data warehouse but also the structures of the data sources involved. To minimize the negative impact of such variables, we propose the use of ETL patterns to build specific ETL packages. In this paper, we formalize this approach using BPMN (Business Process Model and Notation) for modelling ETL workflows at a more conceptual level, mapping them to real execution primitives through a domain-specific language that allows for the generation of specific instances that can be executed in a commercial ETL tool.
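As a purely hypothetical illustration of the pattern idea, the sketch below expands a conceptual ETL pattern into parameterised execution primitives; the pattern name and primitives are invented and are not the paper's DSL.

    # A conceptual pattern is a named template; instantiating it fills in the
    # parameters and yields concrete execution primitives for an ETL tool.
    PATTERNS = {
        "change_data_capture": [
            ("extract", {"source": "{source_table}", "mode": "incremental"}),
            ("compare", {"against": "{staging_table}"}),
            ("load",    {"target": "{staging_table}", "strategy": "upsert"}),
        ],
    }

    def instantiate(pattern_name, **params):
        """Expand a conceptual pattern into parameterised execution primitives."""
        steps = []
        for op, args in PATTERNS[pattern_name]:
            steps.append((op, {k: v.format(**params) for k, v in args.items()}))
        return steps

    for step in instantiate("change_data_capture",
                            source_table="sales", staging_table="stg_sales"):
        print(step)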
Abstract:
Integrated master's dissertation in Information Systems Engineering and Management
Abstract:
In this final project, the high-availability options for the PostgreSQL database management system were explored and evaluated. The primary objective of the project was to find a reliable replication system and implement it in a production environment. The secondary objective was to explore different load balancing methods and compare their performance. The potential replication methods were thoroughly examined, and the most promising one was implemented in a database system gathering weather information in Lithuania. The different load balancing methods were performance-tested under different load scenarios and the results were analysed. As a result of this project, a functioning PostgreSQL database replication system was built at the Lithuanian Hydrometeorological Service's headquarters, and definite guidelines for future load balancing needs were produced. This study includes the actual implementation of a replication system in a demanding production environment, but only guidelines for building a load balancing system in the same production environment.
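The thesis does not prescribe one load balancing implementation; as a minimal sketch of a common approach only, the snippet below (assuming psycopg2 and placeholder host names and table) routes writes to the primary and rotates read-only queries across streaming replicas.

    import itertools
    import psycopg2

    PRIMARY_DSN  = "host=db-primary dbname=weather user=app"
    REPLICA_DSNS = ["host=db-replica1 dbname=weather user=app",
                    "host=db-replica2 dbname=weather user=app"]
    _replicas = itertools.cycle(REPLICA_DSNS)

    def get_connection(readonly=False):
        """Route reads round-robin over the replicas, writes to the primary."""
        dsn = next(_replicas) if readonly else PRIMARY_DSN
        return psycopg2.connect(dsn)

    with get_connection(readonly=True) as conn:
        with conn.cursor() as cur:
            cur.execute("SELECT count(*) FROM observations")  # hypothetical table
            print(cur.fetchone())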
Abstract:
The Iowa Correctional Offender Network (ICON) is a data collection system that was first deployed in community corrections in 2000 after two years of planning, and was integrated with the institutions in 2004. The purpose of ICON is to collect and organize the data necessary to make informed decisions.
Abstract:
This project is part of an effort conducted by the Justice Research and Statistics Association (JRSA) under a grant whose objective is to provide states with descriptions of existing methodologies for collecting Domestic Violence (DV) and Sexual Assault (SA) data. JRSA has identified three different methodologies for collecting such data:
· Incident-based reporting as part of the Uniform Crime Reports
· Specialized data collection from law enforcement through a separate data collection system
· Specialized data collection coming directly from service providers
One state has been selected as an example of each type of data collection, with Iowa selected as a representative of states with incident-based reporting (IBR) as part of the UCR system.
Abstract:
Efficient initiation of SV40 DNA replication requires transcription factors that bind auxiliary sequences flanking the minimally required origin. To evaluate the possibility that transcription factors may activate SV40 replication by acting on the chromatin structure of the origin, we used an in vivo replication system in which we targeted GAL4 fusion proteins to the minimally required origin. We found that the proline-rich transcriptional activation domain of nuclear factor I (NF-I), which has been previously shown to interact with histone H3, specifically activates replication. Evaluation of a series of deletion and point mutants of NF-I indicates that the H3-binding domain and the replication activity coincide perfectly. Assays with other transcription factors, such as Sp1, confirmed the correlation between the interaction with H3 and the activation of replication. These findings imply that transcription factors such as NF-I can activate SV40 replication via direct interaction with chromatin components, thereby contributing to the relief of nucleosomal repression at the SV40 origin.
Abstract:
In this paper we introduce a highly efficient reversible data hiding system. It is based on dividing the image into tiles and shifting the histogram of each image tile between its minimum and maximum frequency. Data are then inserted at the pixel value with the largest frequency to maximize data hiding capacity. The scheme exploits the special properties of medical images, where the histograms of their non-overlapping image tiles mostly peak around a few grey values and the rest of the spectrum is mainly empty. The zeros (or minima) and peaks (maxima) of the histograms of the image tiles are then relocated to embed the data. The grey values of some pixels are therefore modified. High capacity, high fidelity, reversibility and multiple data insertions are the key requirements of data hiding in medical images. We show how the histograms of image tiles of medical images can be exploited to achieve these requirements. Compared with data hiding methods applied to the whole image, our scheme can yield a 30%-200% capacity improvement with better image quality, depending on the medical image content. Additional advantages of the proposed method include hiding data in regions of non-interest and better exploitation of spatial masking.
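As a generic, textbook-style illustration of histogram-shift embedding in a single tile (not the authors' exact scheme), the following sketch assumes 8-bit grey values and NumPy.

    import numpy as np

    def embed_tile(tile, bits):
        """Embed bits in one tile by shifting its histogram between the peak
        (most frequent grey value) and a zero/minimum bin."""
        hist = np.bincount(tile.ravel(), minlength=256)
        peak = int(hist.argmax())
        zero = int(hist.argmin())
        out = tile.copy()
        if zero > peak:
            out[(tile > peak) & (tile < zero)] += 1   # open an empty bin at peak + 1
            step = 1
        else:
            out[(tile > zero) & (tile < peak)] -= 1   # open an empty bin at peak - 1
            step = -1
        # pixels at the peak value carry one bit each: unchanged = 0, shifted = 1
        idx = np.flatnonzero(tile == peak)[:len(bits)]
        mask = np.asarray(bits[:len(idx)], dtype=bool)
        flat = out.reshape(-1)
        flat[idx[mask]] += step
        return out, peak, zero

    tile = np.random.randint(90, 110, size=(64, 64)).astype(np.int16)
    marked, peak, zero = embed_tile(tile, [1, 0, 1, 1])

Recording the peak and zero values per tile is what makes the embedding reversible: the decoder reads the bits at the peak and its neighbour, then shifts the intermediate grey values back.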
Abstract:
The subject of this research is forecasting the capacity needs of the Fenix information system developed by TietoEnator Oy. The goal of the work is to become familiar with the different sub-areas of the Fenix system, find a way to separate and model the impact of each sub-area on system load, and determine, as a first approximation, which parameters affect the load generated by those sub-areas. Part of this work is to study different alternatives for simulation and to assess their suitability for modelling complex systems. Based on the collected information, a simulation model describing the load on the system's data warehouse is created. Using information obtained from the model together with measurements from the production system, the model is refined to correspond ever more closely to the behaviour of the real system. From the model, for example, the simulated system load and the behaviour of the queues are examined. From the production system, changes in the behaviour of the different load sources are measured, for example in relation to the number of users and the time of day. The results of this work are intended to serve as a basis for later follow-up research in which the parameterisation of the sub-areas is refined further, the model's ability to describe the real system is improved, and the scope of the model is extended.
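As a minimal illustration of the kind of queueing simulation described, the sketch below simulates a single service queue with assumed arrival and service rates (not Fenix's parameters) and reports utilisation and mean waiting time.

    import random

    def simulate(arrival_rate, service_rate, n_jobs=10000, seed=1):
        """Single-server queue with exponential interarrival and service times."""
        random.seed(seed)
        clock = 0.0
        server_free_at = 0.0
        busy_time = 0.0
        total_wait = 0.0
        for _ in range(n_jobs):
            clock += random.expovariate(arrival_rate)   # next job arrives
            start = max(clock, server_free_at)          # wait if the server is busy
            service = random.expovariate(service_rate)
            total_wait += start - clock
            busy_time += service
            server_free_at = start + service
        return busy_time / server_free_at, total_wait / n_jobs

    utilisation, mean_wait = simulate(arrival_rate=8.0, service_rate=10.0)
    print(f"utilisation={utilisation:.2f}, mean wait={mean_wait:.3f}")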
Abstract:
Research on power-line communications has concentrated on home automation, broadband indoor communications, and broadband data transfer in the low voltage distribution network between the home and the transformer station. Little research has focused on the high frequency characteristics of industrial low voltage distribution networks. The industrial low voltage distribution network may be utilised as a communication channel for the data transfer required by on-line condition monitoring of electric motors. The advantage of power-line data transfer is that it does not require the installation of new cables. In the first part of this work, the characteristics of industrial low voltage distribution network components and of a pilot distribution network are measured and modelled for power-line communication frequencies up to 30 MHz. The distributed inductances, capacitances and attenuation of MCMK type low voltage power cables are measured in the frequency band 100 kHz - 30 MHz, and an attenuation formula for the cables is derived from the measurements. The input impedances of electric motors (15-250 kW) are measured using several signal couplings, and a measurement-based input impedance model for an electric motor with a slotted stator is formed. The model is designed for the frequency band 10 kHz - 30 MHz. Next, the effect of a DC (direct current) voltage link inverter on power-line data transfer is briefly analysed. Finally, a pilot distribution network is formed and signal attenuation in the communication channels of the pilot environment is measured. The results are compared with simulations carried out using the developed models and the measured parameters for cables and motors. In the second part of this work, a narrowband power-line data transfer system is developed for the data transfer required by on-line condition monitoring of electric motors. It is built using standard integrated circuits. The system is tested in the pilot environment, and its applicability to the data transfer required by on-line condition monitoring of electric motors is analysed.
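The fitted attenuation formula itself is given in the thesis, not here; purely as an illustration of the general shape of such cable models, the sketch below evaluates the widely used exp(-(a0 + a1*f**k)*d) attenuation law with made-up parameter values.

    import math

    def attenuation_db(freq_hz, length_m, a0=9.4e-3, a1=4.2e-7, k=0.7):
        """Attenuation in dB for the model A(f, d) = exp(-(a0 + a1 * f**k) * d).
        Parameter values are illustrative only, not the thesis's fitted ones."""
        alpha = a0 + a1 * freq_hz ** k          # attenuation coefficient in 1/m
        return 20 * math.log10(math.e) * alpha * length_m

    for f in (1e6, 10e6, 30e6):                 # 1, 10 and 30 MHz
        print(f"{f/1e6:>4.0f} MHz: {attenuation_db(f, 200):6.1f} dB over 200 m")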
Abstract:
The research focus of this thesis is to explore options for building systems for business-critical web applications. Business criticality here includes requirements for data protection and system availability. The focus is on open source software. The goals are to identify robust technologies and engineering practices for implementing such systems. The research methods include experiments with sample systems built around chosen software packages that represent certain technologies. The main research focused on finding a good method for database data replication, a key functionality for high-availability, database-driven web applications. The research also included collecting engineering best practices from books written by administrators of high-traffic web applications. The database replication experiment showed that the block-level synchronous replication offered by the DRBD replication software provides considerably more robust data protection and high-availability functionality than the leading open source database product MySQL and its built-in asynchronous replication. For master-master database setups, block-level replication is the more recommended way to build high availability into the system. Based on the thesis research, building high-availability web applications is possible using a combination of open source software and engineering best practices for data protection, availability planning and scaling.
Abstract:
Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
This Master's thesis project concerns Big Data transfer over parallel data links, and its main objective is to assist the Saint-Petersburg National Research University ITMO research team in accomplishing this project and applying Green IT methods to the data transfer system. The goal of the team is to transfer Big Data over parallel data links using an SDN OpenFlow approach. My task as a team member was to compare existing data transfer applications in order to determine which achieves the highest data transfer speed under which conditions, and to explain the reasons. In the context of this thesis work, a comparison of five utilities was carried out: Fast Data Transfer (FDT), BBCP, BBFTP, GridFTP, and FTS3. A number of scripts were developed to create random (incompressible) binary data so that the utilities could be compared fairly, execute the utilities with specified parameters, collect log files, results and system parameters, and plot graphs to compare the results. Transferring such enormous amounts of data can take a long time; hence the need to reduce energy consumption and make the transfers greener. In the context of the Green IT approach, our team used the cloud computing infrastructure OpenStack. It is more efficient to allocate a specific amount of hardware resources to test different scenarios rather than using all the resources of our testbed. Testing our implementation on the OpenStack infrastructure shows that the virtual channel carries no other traffic, so we can achieve the highest possible throughput. After receiving the final results we are in a position to identify which utilities produce faster data transfers in different scenarios with specific TCP parameters, and we can use them on real network data links.
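As a sketch of the measurement idea only, the snippet below writes incompressible random data and times a transfer; the scp command is just a placeholder for the real FDT/BBCP/BBFTP/GridFTP/FTS3 invocations and their parameters, and the host name is invented.

    import os
    import subprocess
    import time

    def make_random_file(path, size_mb):
        """Random bytes compress poorly, so the utilities are compared fairly."""
        with open(path, "wb") as f:
            for _ in range(size_mb):
                f.write(os.urandom(1024 * 1024))

    def timed_transfer(command):
        start = time.time()
        subprocess.run(command, check=True)
        return time.time() - start

    make_random_file("testdata.bin", size_mb=100)
    elapsed = timed_transfer(["scp", "testdata.bin", "user@remote-host:/tmp/"])  # placeholder command
    print(f"throughput: {100 / elapsed:.1f} MB/s")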
Abstract:
A remote data acquisition and analysis system developed for fisheries and related environmental studies is reported. It consists of three units. The first unit, the multichannel remote data acquisition system, is installed at the remote site and powered by a rechargeable battery. It acquires and stores 16-channel environmental data in battery-backed RAM. The second unit, the field data analyser, is used for in-situ display and analysis of the data stored in the backed-up RAM. The third unit, the laboratory data analyser, is an IBM-compatible PC-based unit for detailed analysis and interpretation of the data after the RAM unit is brought to the laboratory. The data collected using the system have been analysed and presented in the form of a graph. The system timer, operating at negligibly low current, switches on power to the entire remotely operated system at a prefixed time interval of 2 hours. Data storage at the remote site in low-power battery-backed RAM, together with retrieval and analysis of the data using a PC, are the specialities of the system. The remotely operated system takes about 7 seconds, including the 5-second stabilization time, to acquire and store data, and is ideal for remote operation on a rechargeable battery. The system can store 16-channel data scanned at 2-hour intervals for 10 days in 2K of backed-up RAM, with a memory expansion facility for 8K RAM.
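A quick back-of-the-envelope check (assuming one byte per channel sample, which the abstract does not state) shows why 10 days of 16-channel data scanned every 2 hours fit in the 2K battery-backed RAM:

    channels = 16
    scans_per_day = 24 // 2          # one scan every 2 hours
    days = 10
    bytes_needed = channels * scans_per_day * days
    print(bytes_needed)              # 1920 bytes, i.e. just under 2K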
Abstract:
Natural systems are inherently non-linear, and recurrent behaviours are typical of them. Recurrence is a fundamental property of non-linear dynamical systems that can be exploited to characterize system behaviour effectively. Cross-recurrence-based analysis of sensor signals from non-linear dynamical systems is presented in this thesis. The mutual dependency among relatively independent components of a system is referred to as coupling. The analysis is carried out for a mechanically coupled system specifically designed for the experiment. Further, the cross recurrence method is extended to the actual machining process in a lathe to characterize chatter during turning, and the result is verified by the permutation entropy method. Conventional linear methods or models are incapable of capturing the critical and strange behaviours associated with the dynamical process. Hence any effective feature extraction methodology should gather information through nonlinear time series analysis. The sensor signals from the dynamical system normally contain noise and non-stationarity. In an effort to overcome these two issues to the maximum possible extent, this work adopts the cross recurrence quantification analysis (CRQA) methodology, since it is found to be robust against noise and non-stationarity in the signals. The study reveals that CRQA is capable of characterizing even weak coupling among system signals. It also reveals unambiguously the dependence of certain CRQA variables, such as percent determinism, percent recurrence and entropy, on chatter. The surrogate data test shows that the results obtained by CRQA are true properties of the temporal evolution of the dynamics and contain a degree of deterministic structure. The results are verified using permutation entropy (PE) to detect the onset of chatter from the time series. The present study ascertains that this CRP-based methodology is capable of recognizing the transition from regular cutting to chatter cutting irrespective of the machining parameters or workpiece material. The results establish this methodology as feasible for the detection of chatter in metal cutting operations in a lathe.
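As an illustration only (not the thesis's code), the sketch below builds a cross recurrence plot for two scalar signals without embedding and computes two simple CRQA measures, percent recurrence and percent determinism.

    import numpy as np

    def cross_recurrence(x, y, eps):
        """CR[i, j] = 1 when |x[i] - y[j]| < eps (scalar signals, no embedding)."""
        return (np.abs(x[:, None] - y[None, :]) < eps).astype(int)

    def crqa_measures(cr, lmin=2):
        """Percent recurrence and percent determinism (diagonals of length >= lmin)."""
        rec = cr.sum()
        percent_rec = 100.0 * rec / cr.size
        det_points = 0
        n = cr.shape[0]
        for k in range(-n + 1, cr.shape[1]):          # walk every diagonal
            diag = np.diagonal(cr, offset=k)
            run = 0
            for v in list(diag) + [0]:                # trailing 0 flushes the last run
                if v:
                    run += 1
                else:
                    if run >= lmin:
                        det_points += run
                    run = 0
        percent_det = 100.0 * det_points / rec if rec else 0.0
        return percent_rec, percent_det

    t = np.linspace(0, 8 * np.pi, 400)
    x, y = np.sin(t), np.sin(t + 0.3)                 # weakly coupled test signals
    print(crqa_measures(cross_recurrence(x, y, eps=0.1)))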