991 results for database performance


Relevance: 80.00%

Abstract:

Modern software applications are becoming more dependent on database management systems (DBMSs), which software developers usually treat as black boxes. For example, Object-Relational Mapping (ORM) is one of the most popular database abstraction approaches in use today. With ORM, objects in object-oriented languages are mapped to records in the database, and object manipulations are automatically translated into SQL queries. As a result of this conceptual abstraction, developers do not need deep knowledge of databases; all too often, however, the abstraction leads to inefficient and incorrect database access code. This thesis therefore proposes a series of approaches to improve the performance of database-centric software applications implemented using ORM. Our approaches focus on troubleshooting and detecting inefficient database accesses (i.e., performance problems) in the source code, and we rank the detected problems by severity. We first conduct an empirical study on the maintenance of ORM code in both open source and industrial applications. We find that ORM performance-related configurations are rarely tuned in practice, and that there is a need for tools that can help improve and tune the performance of ORM-based applications. We therefore propose approaches along two dimensions: 1) helping developers write more performant ORM code; and 2) helping developers tune ORM configurations. To provide tooling support, we first propose static analysis approaches that detect performance anti-patterns in the source code and automatically rank the detected anti-pattern instances by their performance impact. Our study finds that resolving the detected anti-patterns improves application performance by 34% on average. We then discuss our experience and lessons learned when integrating our anti-pattern detection tool into industrial practice, in the hope that this experience can improve the industrial adoption of future research tools. However, since static analysis approaches are prone to false positives and lack runtime information, we also propose dynamic analysis approaches to further help developers improve the performance of their database access code. We propose automated approaches to detect redundant data access anti-patterns in database access code, and our study finds that resolving such anti-patterns improves application performance by 17% on average. Finally, we propose an automated approach that tunes performance-related ORM configurations using both static and dynamic analysis. Our study shows that this approach can improve application throughput by 27--138%. Through case studies on real-world applications, we show that all of our proposed approaches provide valuable support to developers and help improve application performance significantly.
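To make the kind of anti-pattern concrete, here is a minimal sketch of the classic "N+1 selects" inefficiency, one well-known form of inefficient ORM access code. It uses SQLAlchemy in Python purely for illustration; the schema and names are made up, and this is not the thesis's own tooling or subject system.

```python
# Illustrative sketch (not from the thesis): the "N+1 selects" ORM
# anti-pattern and its eager-loading resolution, using SQLAlchemy 2.0.
from sqlalchemy import ForeignKey, create_engine, select
from sqlalchemy.orm import (DeclarativeBase, Mapped, Session,
                            mapped_column, relationship, selectinload)

class Base(DeclarativeBase):
    pass

class Author(Base):
    __tablename__ = "author"
    id: Mapped[int] = mapped_column(primary_key=True)
    name: Mapped[str]
    books: Mapped[list["Book"]] = relationship(back_populates="author")

class Book(Base):
    __tablename__ = "book"
    id: Mapped[int] = mapped_column(primary_key=True)
    title: Mapped[str]
    author_id: Mapped[int] = mapped_column(ForeignKey("author.id"))
    author: Mapped[Author] = relationship(back_populates="books")

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Author(name="Wolfram", books=[Book(title="Parzival")]))
    session.commit()

    # Anti-pattern: each access to a.books lazily issues one extra SELECT,
    # so fetching N authors with their books costs N+1 queries.
    for a in session.scalars(select(Author)):
        print(a.name, len(a.books))

    # Resolution: load the collections eagerly in a single extra query.
    for a in session.scalars(select(Author).options(selectinload(Author.books))):
        print(a.name, len(a.books))
```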

Relevance: 80.00%

Abstract:

Thesis (Ph.D., Computing) -- Queen's University, 2016-09-30.

Relevance: 60.00%

Abstract:

Current computer systems have evolved from featuring only a single processing unit and limited RAM, on the order of kilobytes or a few megabytes, to including several multicore processors, offering on the order of several tens of concurrent execution contexts, and main memory on the order of several tens to hundreds of gigabytes. This makes it possible to keep all the data of many applications in main memory, leading to the development of in-memory databases. Compared to disk-backed databases, in-memory databases (IMDBs) are expected to provide better performance by incurring less I/O overhead. In this dissertation, we present a scalability study of two general-purpose IMDBs on multicore systems. The results show that current general-purpose IMDBs do not scale on multicores, due to contention among threads running concurrent transactions. In this work, we explore different directions to overcome the scalability issues of IMDBs on multicores, while enforcing strong isolation semantics. First, we present a solution that requires no modification to either the database systems or the applications, called MacroDB. MacroDB replicates the database among several engines, using a master-slave replication scheme, where update transactions execute on the master, while read-only transactions execute on the slaves. This reduces contention, allowing MacroDB to offer scalable performance under read-only workloads, although update-intensive workloads suffer a performance loss compared to the standalone engine. Second, we delve into the database engine and identify the concurrency control mechanism used by the storage sub-component as a scalability bottleneck. We then propose a new locking scheme that allows the removal of such mechanisms from the storage sub-component. This modification offers performance improvements under all workloads compared to the standalone engine, while scalability remains limited to read-only workloads. Next, we address the scalability limitations for update-intensive workloads and propose reducing the locking granularity from the table level to the attribute level. This further improves performance for intensive and moderate update workloads, at a slight cost for read-only workloads, with scalability limited to read-intensive and read-only workloads. Finally, we investigate the impact applications have on the performance of database systems, by studying how the order of operations inside transactions influences database performance. We then propose a Read before Write (RbW) interaction pattern, under which transactions perform all read operations before executing write operations. The RbW pattern allowed TPC-C to achieve scalable performance on our modified engine for all workloads. Additionally, the RbW pattern allowed our modified engine to achieve scalable performance on multicores, almost up to the total number of cores, while enforcing strong isolation.
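As a rough illustration of the RbW idea, the following sketch groups all reads ahead of all writes inside a single transaction. It uses Python's sqlite3 module with a made-up `account` table; it is only a sketch of the interaction pattern, not the dissertation's modified engine.

```python
# Minimal sketch of the Read-before-Write (RbW) interaction pattern.
# The table and function names are illustrative, not from the dissertation.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])

def transfer_rbw(conn, src, dst, amount):
    """Perform ALL reads first, then ALL writes, so the transaction never
    interleaves a read after its first write."""
    with conn:  # one transaction: commits on success, rolls back on error
        # Read phase: gather everything the transaction needs up front.
        src_balance = conn.execute(
            "SELECT balance FROM account WHERE id = ?", (src,)).fetchone()[0]
        dst_balance = conn.execute(
            "SELECT balance FROM account WHERE id = ?", (dst,)).fetchone()[0]
        if src_balance < amount:
            raise ValueError("insufficient funds")
        # Write phase: no further reads after the first write.
        conn.execute("UPDATE account SET balance = ? WHERE id = ?",
                     (src_balance - amount, src))
        conn.execute("UPDATE account SET balance = ? WHERE id = ?",
                     (dst_balance + amount, dst))

transfer_rbw(conn, 1, 2, 30)
```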

Relevance: 60.00%

Abstract:

As the storage capacity of computers and the speed of both computers and networks grow, users' expectations grow as well. Ever more data is stored, and ever more complex reports are produced from that data. As reports grow more complex, however, the time required to gather the data they need should not increase substantially. The purpose of this work is to study and improve the performance of the reporting database of an international forest-industry company's sales and logistics system, measured in particular by the time spent gathering report data. The work focuses on identifying the bottlenecks of the current system and on improving its performance. Since more performance will nevertheless be needed in the future, the work also examines replacing the current general-purpose database with a database designed specifically for reporting. As a result of this work, the time spent gathering report data was reduced and the worst bottlenecks were identified. As the number of users grows, however, the limits of the database's performance will soon be reached, and the database will eventually have to be replaced with one designed specifically for reporting.

Relevance: 60.00%

Abstract:

The performance of active and passive fund management has been studied extensively, especially in the US. This thesis focuses on the performance of active and passive fund management in the Finnish and European stock markets during the five-year span from 3/2011 to 3/2016. The aim of this study is to find out which strategy results in better returns for the small-scale investor. The thesis also asks which strategy leads to a better profit-risk ratio and how well fund managers perform in creating added value. The data of the study consist of 44 active Finnish funds and two passive exchange-traded funds available to Finnish investors. Indexes of both the Finnish and European markets and a risk-free rate are used to support the analysis. The data for the thesis are collected from the DataStream database. The performance indicators used in the study are return, volatility, the Sharpe ratio and Jensen's alpha. Based on the results of this study, it can be concluded that in the Finnish stock market the passive strategy yielded slightly better returns than the average of the active funds. In the European stock market, the returns of the passive fund were significantly better than the average of the active funds. Considering the profit-risk ratio, neither strategy outperformed the other. The results of this thesis are in line with previous studies, which encourage favoring the passive strategy.
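The two risk-adjusted measures named above have standard textbook definitions. The sketch below computes them in Python on made-up monthly return series; the `fund`, `market` and `rf` numbers are hypothetical, and `statistics.covariance` requires Python 3.10+.

```python
# Standard definitions of the Sharpe ratio and Jensen's alpha; the return
# series below are invented for illustration, not data from the thesis.
import statistics

fund = [0.02, -0.01, 0.03, 0.015, -0.005]   # monthly fund returns (hypothetical)
market = [0.015, -0.012, 0.025, 0.01, 0.0]  # monthly market returns (hypothetical)
rf = 0.001                                  # monthly risk-free rate (hypothetical)

# Sharpe ratio: mean excess return divided by the volatility of returns.
excess = [r - rf for r in fund]
sharpe = statistics.mean(excess) / statistics.stdev(fund)

# Jensen's alpha: return not explained by market exposure under CAPM,
#   alpha = mean(R_fund - rf) - beta * mean(R_market - rf),
# with beta = cov(fund, market) / var(market).
beta = statistics.covariance(fund, market) / statistics.variance(market)
alpha = statistics.mean(excess) - beta * (statistics.mean(market) - rf)

print(f"Sharpe = {sharpe:.3f}, beta = {beta:.3f}, alpha = {alpha:.5f}")
```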

Relevance: 60.00%

Abstract:

[EN] In this paper, we address the challenge of gender classification using large databases of images, with two goals. The first objective is to evaluate whether the error rate decreases compared to smaller databases. The second goal is to determine whether the classifier that provides the best classification rate for one database also improves the classification results for other databases, that is, the cross-database performance.
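A cross-database evaluation of this kind boils down to fitting on one database and scoring on another. The sketch below assumes scikit-learn and uses random stand-in features in place of real image descriptors; it illustrates the protocol only, not the paper's classifiers or data.

```python
# Sketch of the cross-database protocol: train on database A, test on
# database B. The features and labels are random stand-ins.
import numpy as np
from sklearn.svm import LinearSVC

def error_rate(train_X, train_y, test_X, test_y):
    """Fit on one database, report the error rate on another."""
    clf = LinearSVC().fit(train_X, train_y)
    return 1.0 - clf.score(test_X, test_y)

rng = np.random.default_rng(0)
db_a = (rng.normal(size=(500, 64)), rng.integers(0, 2, 500))  # hypothetical DB A
db_b = (rng.normal(size=(300, 64)), rng.integers(0, 2, 300))  # hypothetical DB B

within_err = error_rate(*db_a, *db_a)   # same-database baseline
cross_err = error_rate(*db_a, *db_b)    # cross-database performance
print(f"within: {within_err:.2%}, cross: {cross_err:.2%}")
```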

Relevance: 60.00%

Abstract:

Acquired brain injury (ABI) is one of the leading causes of death and disability in the world and is associated with high health care costs as a result of the acute treatment and long-term rehabilitation involved. Different algorithms and methods have been proposed to predict the effectiveness of rehabilitation programs. In general, research has focused on predicting the overall improvement of patients with ABI. The purpose of this study is the novel application of data mining (DM) techniques to predict the outcomes of cognitive rehabilitation in patients with ABI. We generate three predictive models that allow us to obtain new knowledge to evaluate and improve the effectiveness of the cognitive rehabilitation process. Decision tree (DT), multilayer perceptron (MLP) and general regression neural network (GRNN) models were used to construct the predictors. 10-fold cross-validation was carried out to test the algorithms, using the Institut Guttmann Neurorehabilitation Hospital (IG) patient database. The performance of the models was tested through specificity, sensitivity and accuracy analysis and confusion matrix analysis. The experimental results obtained by the DT are clearly superior, with an average prediction accuracy of 90.38%, while the MLP and GRNN obtained 78.7% and 75.96%, respectively. This study increases our knowledge of the factors contributing to the recovery of ABI patients and helps to estimate treatment efficacy in individual patients.
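As a rough sketch of the evaluation protocol described above (not the study's actual IG data or models), the following trains a decision tree under 10-fold cross-validation with scikit-learn and derives accuracy, sensitivity and specificity from the pooled confusion matrix.

```python
# Hedged sketch of 10-fold cross-validation of a decision tree, scored via
# the confusion matrix. The dataset is synthetic, standing in for IG data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

tn = fp = fn = tp = 0
for train_idx, test_idx in StratifiedKFold(n_splits=10).split(X, y):
    clf = DecisionTreeClassifier(random_state=0).fit(X[train_idx], y[train_idx])
    c = confusion_matrix(y[test_idx], clf.predict(X[test_idx]))
    tn, fp, fn, tp = tn + c[0, 0], fp + c[0, 1], fn + c[1, 0], tp + c[1, 1]

accuracy = (tp + tn) / (tp + tn + fp + fn)
sensitivity = tp / (tp + fn)   # true-positive rate
specificity = tn / (tn + fp)   # true-negative rate
print(f"acc={accuracy:.2%} sens={sensitivity:.2%} spec={specificity:.2%}")
```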

Relevance: 40.00%

Abstract:

As end-user computing becomes more pervasive, an organization's success increasingly depends on the ability of end-users, usually in managerial positions, to extract appropriate data from both internal and external sources. Many of these data sources include, or are derived from, the organization's accounting information systems. Managerial end-users with different personal characteristics and approaches are likely to compose queries of differing levels of accuracy when searching the data contained within these accounting information systems. This research investigates how the cognitive style elements of personality influence managerial end-user performance in database querying tasks. A laboratory experiment was conducted in which participants generated queries to retrieve information from an accounting information system to satisfy typical information requirements. The experiment investigated the influence of personality on the accuracy of queries of varying degrees of complexity. Based on the Myers–Briggs personality instrument, the results show that perceiving individuals (as opposed to judging individuals) who rely on intuition (as opposed to sensing) composed queries more accurately. As expected, query complexity and academic performance also explain the success of data extraction tasks.

Relevance: 40.00%

Abstract:

The poster demonstrates the preparatory steps of a digital multi-text edition, abstracted from the experience gained in the Parzival Project, based at the University of Bern, the Berlin-Brandenburg Academy of Sciences and the University of Erlangen. This edition of Wolfram von Eschenbach's German Grail novel, written shortly after 1200 and transmitted over several centuries in about a hundred witnesses, has now been completed for more than half of the textual corpus. As the text is transmitted in medieval manuscripts, the witnesses have to be transcribed according to specific encoding rules. The transcriptions are then collated following certain ideas and concepts of how the transmission process could have developed. The transcriptions and collations finally have to be transferred into a digital edition that allows users to explore the characteristics of single witnesses as well as the history of a text that is delivered in variants and in different versions. A dynamically organized database offering various components and adapted to the needs of diverse user profiles is nowadays the right tool for this purpose.

Relevance: 40.00%

Abstract:

The poster demonstrates the preparatory steps of a digital multi-text edition, abstracted from the experience gained in the Parzival Project, based at the University of Bern, the Berlin-Brandenburg Academy of Sciences and the University of Erlangen. This edition of Wolfram von Eschenbach's German Grail novel, written shortly after 1200 and transmitted over several centuries in about a hundred witnesses, has now been completed for more than half of the textual corpus.

Relevance: 40.00%

Abstract:

The CoastColour Round Robin (CCRR) project (http://www.coastcolour.org), funded by the European Space Agency (ESA), was designed to bring together a variety of reference datasets and to use these to test algorithms and assess their accuracy for retrieving water quality parameters. This information was then developed to help end-users of remote sensing products select the most accurate algorithms for their coastal region. To facilitate this, an inter-comparison of the performance of algorithms for the retrieval of in-water properties over coastal waters was carried out. The comparison used three types of datasets on which ocean colour algorithms were tested. The description and comparison of the three datasets are the focus of this paper; they include Medium Resolution Imaging Spectrometer (MERIS) Level 2 match-ups, in situ reflectance measurements and data generated by a radiative transfer model (HydroLight). The datasets mainly consist of 6,484 marine reflectance spectra associated with various geometrical (sensor viewing and solar angles) and sky conditions and water constituents: Total Suspended Matter (TSM) and Chlorophyll-a (CHL) concentrations, and the absorption of Coloured Dissolved Organic Matter (CDOM). Inherent optical properties were also provided in the simulated datasets (5,000 simulations) and from the 3,054 match-up locations. The distributions of reflectance at selected MERIS bands and band ratios, and of CHL and TSM as a function of reflectance, from the three datasets are compared. Match-up and in situ sites where deviations occur are identified. The distributions of the three reflectance datasets are also compared to the simulated and in situ reflectances used previously by the International Ocean Colour Coordinating Group (IOCCG, 2006) for algorithm testing, showing a clear extension of the CCRR data, which cover more turbid waters.

Relevance: 40.00%

Abstract:

One of the most demanding needs in cloud computing is that of having scalable and highly available databases. One way to address this need is to leverage the scalable replication techniques developed in the last decade, which increase both the availability and the scalability of databases. Many replication protocols have been proposed during the last decade; the main research challenge was how to scale under the eager replication model, the one that provides consistency across replicas. In this paper, we examine three eager database replication systems available today: Middle-R, C-JDBC and MySQL Cluster, using the TPC-W benchmark. We analyze their architectures and replication protocols, and compare their performance both in the absence of failures and when failures occur.
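The eager model the paper examines can be summarized in a toy sketch: an update is applied on every replica before the commit is acknowledged, while reads may be served by any replica. The Python class below is a deliberately simplified illustration, not the actual protocol of Middle-R, C-JDBC or MySQL Cluster.

```python
# Toy sketch of eager (synchronous) replication: writes reach ALL replicas
# inside the commit path, so any replica can serve consistent reads.
import sqlite3

class EagerReplicatedDB:
    def __init__(self, n_replicas):
        self.replicas = [sqlite3.connect(":memory:") for _ in range(n_replicas)]
        self.execute_everywhere("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")

    def execute_everywhere(self, sql, params=()):
        # Eager model: apply the update on every replica, then commit on
        # every replica, before acknowledging the operation.
        for r in self.replicas:
            r.execute(sql, params)
        for r in self.replicas:
            r.commit()

    def read_any(self, key, replica=0):
        # Reads can be served by any replica; this is where read
        # scalability comes from.
        row = self.replicas[replica].execute(
            "SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
        return row[0] if row else None

db = EagerReplicatedDB(n_replicas=3)
db.execute_everywhere("INSERT INTO kv VALUES (?, ?)", ("a", "1"))
print(db.read_any("a", replica=2))  # every replica already sees the write
```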

Relevance: 40.00%

Abstract:

One of the most demanding needs in cloud computing and big data is that of having scalable and highly available databases. One way to address this need is to leverage the scalable replication techniques developed in the last decade, which increase both the availability and the scalability of databases. Many replication protocols have been proposed during the last decade; the main research challenge was how to scale under the eager replication model, the one that provides consistency across replicas. This thesis provides an in-depth study of three eager database replication systems based on relational systems: Middle-R, C-JDBC and MySQL Cluster, and three systems based on in-memory data grids: JBoss Data Grid, Oracle Coherence and Terracotta Ehcache. The thesis explores these systems in terms of their architecture, replication protocols, fault tolerance and various other functionalities. It also provides an experimental analysis of these systems using state-of-the-art benchmarks: TPC-C and TPC-W (for the relational systems) and the Yahoo! Cloud Serving Benchmark (for the in-memory data grids). The thesis also discusses three graph databases, Neo4j, Titan and Sparksee, in terms of their architecture and transactional capabilities, and highlights the weaker transactional consistency these systems provide. It describes an implementation of snapshot isolation in the Neo4j graph database to provide stronger isolation guarantees for transactions.
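For intuition about the snapshot isolation guarantee mentioned above, here is a generic first-committer-wins sketch in Python: each transaction reads from a private snapshot and commits only if no key it wrote was committed after its snapshot was taken. This is the textbook mechanism, not the thesis's Neo4j implementation.

```python
# Minimal sketch of snapshot isolation with first-committer-wins
# validation; names and structure are illustrative only.
class SIStore:
    def __init__(self):
        self.data = {}          # key -> committed value
        self.last_commit = {}   # key -> commit timestamp
        self.clock = 0

    def begin(self):
        self.clock += 1
        return {"start": self.clock, "snapshot": dict(self.data), "writes": {}}

    def read(self, txn, key):
        # Reads see the transaction's own writes, else its private snapshot.
        return txn["writes"].get(key, txn["snapshot"].get(key))

    def write(self, txn, key, value):
        txn["writes"][key] = value   # buffered until commit

    def commit(self, txn):
        # First committer wins: abort if any written key was committed by
        # another transaction after this one took its snapshot.
        for key in txn["writes"]:
            if self.last_commit.get(key, 0) > txn["start"]:
                raise RuntimeError("write-write conflict: abort")
        self.clock += 1
        for key, value in txn["writes"].items():
            self.data[key] = value
            self.last_commit[key] = self.clock

store = SIStore()
t1, t2 = store.begin(), store.begin()
store.write(t1, "x", 1)
store.write(t2, "x", 2)
store.commit(t1)
try:
    store.commit(t2)
except RuntimeError as e:
    print(e)  # t1 committed "x" after t2 took its snapshot, so t2 aborts
```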