933 resultados para Databases as Topic


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Projeto de Pós-Graduação/Dissertação apresentado à Universidade Fernando Pessoa como parte dos requisitos para obtenção do grau de Mestre em Medicina Dentária

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, we propose a new class of Concurrency Control Algorithms that is especially suited for real-time database applications. Our approach relies on the use of (potentially) redundant computations to ensure that serializable schedules are found and executed as early as possible, thus, increasing the chances of a timely commitment of transactions with strict timing constraints. Due to its nature, we term our concurrency control algorithms Speculative. The aforementioned description encompasses many algorithms that we call collectively Speculative Concurrency Control (SCC) algorithms. SCC algorithms combine the advantages of both Pessimistic and Optimistic Concurrency Control (PCC and OCC) algorithms, while avoiding their disadvantages. On the one hand, SCC resembles PCC in that conflicts are detected as early as possible, thus making alternative schedules available in a timely fashion in case they are needed. On the other hand, SCC resembles OCC in that it allows conflicting transactions to proceed concurrently, thus avoiding unnecessary delays that may jeopardize their timely commitment.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Various concurrency control algorithms differ in the time when conflicts are detected, and in the way they are resolved. In that respect, the Pessimistic and Optimistic Concurrency Control (PCC and OCC) alternatives represent two extremes. PCC locking protocols detect conflicts as soon as they occur and resolve them using blocking. OCC protocols detect conflicts at transaction commit time and resolve them using rollbacks (restarts). For real-time databases, blockages and rollbacks are hazards that increase the likelihood of transactions missing their deadlines. We propose a Speculative Concurrency Control (SCC) technique that minimizes the impact of blockages and rollbacks. SCC relies on the use of added system resources to speculate on potential serialization orders and to ensure that if such serialization orders materialize, the hazards of blockages and roll-backs are minimized. We present a number of SCC-based algorithms that differ in the level of speculation they introduce, and the amount of system resources (mainly memory) they require. We show the performance gains (in terms of number of satisfied timing constraints) to be expected when a representative SCC algorithm (SCC-2S) is adopted.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we discuss a new type of query in Spatial Databases, called Trip Planning Query (TPQ). Given a set of points P in space, where each point belongs to a category, and given two points s and e, TPQ asks for the best trip that starts at s, passes through exactly one point from each category, and ends at e. An example of a TPQ is when a user wants to visit a set of different places and at the same time minimize the total travelling cost, e.g. what is the shortest travelling plan for me to visit an automobile shop, a CVS pharmacy outlet, and a Best Buy shop along my trip from A to B? The trip planning query is an extension of the well-known TSP problem and therefore is NP-hard. The difficulty of this query lies in the existence of multiple choices for each category. In this paper, we first study fast approximation algorithms for the trip planning query in a metric space, assuming that the data set fits in main memory, and give the theory analysis of their approximation bounds. Then, the trip planning query is examined for data sets that do not fit in main memory and must be stored on disk. For the disk-resident data, we consider two cases. In one case, we assume that the points are located in Euclidean space and indexed with an Rtree. In the other case, we consider the problem of points that lie on the edges of a spatial network (e.g. road network) and the distance between two points is defined using the shortest distance over the network. Finally, we give an experimental evaluation of the proposed algorithms using synthetic data sets generated on real road networks.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We describe a method for shape-based image database search that uses deformable prototypes to represent categories. Rather than directly comparing a candidate shape with all shape entries in the database, shapes are compared in terms of the types of nonrigid deformations (differences) that relate them to a small subset of representative prototypes. To solve the shape correspondence and alignment problem, we employ the technique of modal matching, an information-preserving shape decomposition for matching, describing, and comparing shapes despite sensor variations and nonrigid deformations. In modal matching, shape is decomposed into an ordered basis of orthogonal principal components. We demonstrate the utility of this approach for shape comparison in 2-D image databases.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We propose and evaluate an admission control paradigm for RTDBS, in which a transaction is submitted to the system as a pair of processes: a primary task, and a recovery block. The execution requirements of the primary task are not known a priori, whereas those of the recovery block are known a priori. Upon the submission of a transaction, an Admission Control Mechanism is employed to decide whether to admit or reject that transaction. Once admitted, a transaction is guaranteed to finish executing before its deadline. A transaction is considered to have finished executing if exactly one of two things occur: Either its primary task is completed (successful commitment), or its recovery block is completed (safe termination). Committed transactions bring a profit to the system, whereas a terminated transaction brings no profit. The goal of the admission control and scheduling protocols (e.g., concurrency control, I/O scheduling, memory management) employed in the system is to maximize system profit. We describe a number of admission control strategies and contrast (through simulations) their relative performance.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This report summarizes the technical presentations and discussions that took place during RTDB'96: the First International Workshop on Real-Time Databases, which was held on March 7 and 8, 1996 in Newport Beach, California. The main goals of this project were to (1) review recent advances in real-time database systems research, (2) to promote interaction among real-time database researchers and practitioners, and (3) to evaluate the maturity and directions of real-time database technology.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Nearest neighbor retrieval is the task of identifying, given a database of objects and a query object, the objects in the database that are the most similar to the query. Retrieving nearest neighbors is a necessary component of many practical applications, in fields as diverse as computer vision, pattern recognition, multimedia databases, bioinformatics, and computer networks. At the same time, finding nearest neighbors accurately and efficiently can be challenging, especially when the database contains a large number of objects, and when the underlying distance measure is computationally expensive. This thesis proposes new methods for improving the efficiency and accuracy of nearest neighbor retrieval and classification in spaces with computationally expensive distance measures. The proposed methods are domain-independent, and can be applied in arbitrary spaces, including non-Euclidean and non-metric spaces. In this thesis particular emphasis is given to computer vision applications related to object and shape recognition, where expensive non-Euclidean distance measures are often needed to achieve high accuracy. The first contribution of this thesis is the BoostMap algorithm for embedding arbitrary spaces into a vector space with a computationally efficient distance measure. Using this approach, an approximate set of nearest neighbors can be retrieved efficiently - often orders of magnitude faster than retrieval using the exact distance measure in the original space. The BoostMap algorithm has two key distinguishing features with respect to existing embedding methods. First, embedding construction explicitly maximizes the amount of nearest neighbor information preserved by the embedding. Second, embedding construction is treated as a machine learning problem, in contrast to existing methods that are based on geometric considerations. The second contribution is a method for constructing query-sensitive distance measures for the purposes of nearest neighbor retrieval and classification. In high-dimensional spaces, query-sensitive distance measures allow for automatic selection of the dimensions that are the most informative for each specific query object. It is shown theoretically and experimentally that query-sensitivity increases the modeling power of embeddings, allowing embeddings to capture a larger amount of the nearest neighbor structure of the original space. The third contribution is a method for speeding up nearest neighbor classification by combining multiple embedding-based nearest neighbor classifiers in a cascade. In a cascade, computationally efficient classifiers are used to quickly classify easy cases, and classifiers that are more computationally expensive and also more accurate are only applied to objects that are harder to classify. An interesting property of the proposed cascade method is that, under certain conditions, classification time actually decreases as the size of the database increases, a behavior that is in stark contrast to the behavior of typical nearest neighbor classification systems. The proposed methods are evaluated experimentally in several different applications: hand shape recognition, off-line character recognition, online character recognition, and efficient retrieval of time series. In all datasets, the proposed methods lead to significant improvements in accuracy and efficiency compared to existing state-of-the-art methods. In some datasets, the general-purpose methods introduced in this thesis even outperform domain-specific methods that have been custom-designed for such datasets.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In an outsourced database system the data owner publishes information through a number of remote, untrusted servers with the goal of enabling clients to access and query the data more efficiently. As clients cannot trust servers, query authentication is an essential component in any outsourced database system. Clients should be given the capability to verify that the answers provided by the servers are correct with respect to the actual data published by the owner. While existing work provides authentication techniques for selection and projection queries, there is a lack of techniques for authenticating aggregation queries. This article introduces the first known authenticated index structures for aggregation queries. First, we design an index that features good performance characteristics for static environments, where few or no updates occur to the data. Then, we extend these ideas and propose more involved structures for the dynamic case, where the database owner is allowed to update the data arbitrarily. Our structures feature excellent average case performance for authenticating queries with multiple aggregate attributes and multiple selection predicates. We also implement working prototypes of the proposed techniques and experimentally validate the correctness of our ideas.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper shows how knowledge, in the form of fuzzy rules, can be derived from a self-organizing supervised learning neural network called fuzzy ARTMAP. Rule extraction proceeds in two stages: pruning removes those recognition nodes whose confidence index falls below a selected threshold; and quantization of continuous learned weights allows the final system state to be translated into a usable set of rules. Simulations on a medical prediction problem, the Pima Indian Diabetes (PID) database, illustrate the method. In the simulations, pruned networks about 1/3 the size of the original actually show improved performance. Quantization yields comprehensible rules with only slight degradation in test set prediction performance.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

INTRODUCTION: Accessing new knowledge as the evidence base for hospice and palliative care grows has specific challenges for the discipline. This study aimed to describe conversion rates of palliative and hospice care conference abstracts to journal articles and to highlight that some palliative care literature may not be retrievable because it is not indexed on bibliographic databases. METHODS: Substudy A tracked the journal publication of conference abstracts selected for inclusion in a gray literature database on www.caresearch.com.au . Abstracts were included in the gray literature database following handsearching of proceedings of over 100 Australian conferences likely to have some hospice or palliative care content that were held between 1980 and 1999. Substudy B looked at indexing from first publication until 2001 of three international hospice and palliative care journals in four widely available bibliographic databases through systematic tracing of all original papers in the journals. RESULTS: Substudy A showed that for the 1338 abstracts identified only 15.9% were published (compared to an average in health of 45%). Published abstracts were found in 78 different journals. Multiauthor abstracts and oral presentations had higher rates of conversion. Substudy B demonstrated lag time between first publication and bibliographic indexing. Even after listing, idiosyncratic noninclusions were identified. DISCUSSION: There are limitations to retrieval of all possible literature through electronic searching of bibliographic databases. Encouraging publication in indexed journals of studies presented at conferences, promoting selection of palliative care journals for database indexing, and searching more than one bibliographic database will improve the accessibility of existing and new knowledge in hospice and palliative care.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

BACKGROUND: Dropouts and missing data are nearly-ubiquitous in obesity randomized controlled trails, threatening validity and generalizability of conclusions. Herein, we meta-analytically evaluate the extent of missing data, the frequency with which various analytic methods are employed to accommodate dropouts, and the performance of multiple statistical methods. METHODOLOGY/PRINCIPAL FINDINGS: We searched PubMed and Cochrane databases (2000-2006) for articles published in English and manually searched bibliographic references. Articles of pharmaceutical randomized controlled trials with weight loss or weight gain prevention as major endpoints were included. Two authors independently reviewed each publication for inclusion. 121 articles met the inclusion criteria. Two authors independently extracted treatment, sample size, drop-out rates, study duration, and statistical method used to handle missing data from all articles and resolved disagreements by consensus. In the meta-analysis, drop-out rates were substantial with the survival (non-dropout) rates being approximated by an exponential decay curve (e(-lambdat)) where lambda was estimated to be .0088 (95% bootstrap confidence interval: .0076 to .0100) and t represents time in weeks. The estimated drop-out rate at 1 year was 37%. Most studies used last observation carried forward as the primary analytic method to handle missing data. We also obtained 12 raw obesity randomized controlled trial datasets for empirical analyses. Analyses of raw randomized controlled trial data suggested that both mixed models and multiple imputation performed well, but that multiple imputation may be more robust when missing data are extensive. CONCLUSION/SIGNIFICANCE: Our analysis offers an equation for predictions of dropout rates useful for future study planning. Our raw data analyses suggests that multiple imputation is better than other methods for handling missing data in obesity randomized controlled trials, followed closely by mixed models. We suggest these methods supplant last observation carried forward as the primary method of analysis.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

BACKGROUND: With the globalization of clinical trials, large developing nations have substantially increased their participation in multi-site studies. This participation has raised ethical concerns, among them the fear that local customs, habits and culture are not respected while asking potential participants to take part in study. This knowledge gap is particularly noticeable among Indian subjects, since despite the large number of participants, little is known regarding what factors affect their willingness to participate in clinical trials. METHODS: We conducted a meta-analysis of all studies evaluating the factors and barriers, from the perspective of potential Indian participants, contributing to their participation in clinical trials. We searched both international as well as Indian-specific bibliographic databases, including Pubmed, Cochrane, Openjgate, MedInd, Scirus and Medknow, also performing hand searches and communicating with authors to obtain additional references. We enrolled studies dealing exclusively with the participation of Indians in clinical trials. Data extraction was conducted by three researchers, with disagreement being resolved by consensus. RESULTS: Six qualitative studies and one survey were found evaluating the main themes affecting the participation of Indian subjects. Themes included Personal health benefits, Altruism, Trust in physicians, Source of extra income, Detailed knowledge, Methods for motivating participants as factors favoring, while Mistrust on trial organizations, Concerns about efficacy and safety of trials, Psychological reasons, Trial burden, Loss of confidentiality, Dependency issues, Language as the barriers. CONCLUSION: We identified factors that facilitated and barriers that have negative implications on trial participation decisions in Indian subjects. Due consideration and weightage should be assigned to these factors while planning future trials in India.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Pharmacogenomics (PGx) offers the promise of utilizing genetic fingerprints to predict individual responses to drugs in terms of safety, efficacy and pharmacokinetics. Early-phase clinical trial PGx applications can identify human genome variations that are meaningful to study design, selection of participants, allocation of resources and clinical research ethics. Results can inform later-phase study design and pipeline developmental decisions. Nevertheless, our review of the clinicaltrials.gov database demonstrates that PGx is rarely used by drug developers. Of the total 323 trials that included PGx as an outcome, 80% have been conducted by academic institutions after initial regulatory approval. Barriers for the application of PGx are discussed. We propose a framework for the role of PGx in early-phase drug development and recommend PGx be universally considered in study design, result interpretation and hypothesis generation for later-phase studies, but PGx results from underpowered studies should not be used by themselves to terminate drug-development programs.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Despite a large and multifaceted effort to understand the vast landscape of phenotypic data, their current form inhibits productive data analysis. The lack of a community-wide, consensus-based, human- and machine-interpretable language for describing phenotypes and their genomic and environmental contexts is perhaps the most pressing scientific bottleneck to integration across many key fields in biology, including genomics, systems biology, development, medicine, evolution, ecology, and systematics. Here we survey the current phenomics landscape, including data resources and handling, and the progress that has been made to accurately capture relevant data descriptions for phenotypes. We present an example of the kind of integration across domains that computable phenotypes would enable, and we call upon the broader biology community, publishers, and relevant funding agencies to support efforts to surmount today's data barriers and facilitate analytical reproducibility.