909 results for Keys to Database Searching


Relevance:

30.00%

Publisher:

Abstract:

This report addresses the problem of fault tolerance to system failures for database systems that are to run on highly concurrent computers. It assumes that, in general, an application may have a wide distribution in the lifetimes of its transactions. Logging remains the method of choice for ensuring fault tolerance. Generational garbage collection techniques manage the limited disk space reserved for log information; they do not require periodic checkpoints and are well suited to applications with a broad range of transaction lifetimes. An arbitrarily large collection of parallel log streams provides the necessary disk bandwidth.

Relevance:

30.00%

Publisher:

Abstract:

Sponsorship: This research was made possible by a grant from the Economic and Social Research Council (ESRC Grant No. 000-22-0323)

Relevance:

30.00%

Publisher:

Abstract:

Whelan, K. E. and King, R. D. Using a logical model to predict the growth of yeast. BMC Bioinformatics 2008, 9:97

Relevance:

30.00%

Publisher:

Abstract:

The SIEGE (Smoking Induced Epithelial Gene Expression) database is a clinical resource for compiling and analyzing gene expression data from epithelial cells of the human intra-thoracic airway. This database supports a translational research study whose goal is to profile the changes in airway gene expression that are induced by cigarette smoke. RNA is isolated from airway epithelium obtained at bronchoscopy from current-, former- and never-smoker subjects, and hybridized to Affymetrix HG-U133A Genechips, which measure the level of expression of ~22 500 human transcripts. The microarray data generated, along with relevant patient information, are uploaded to SIEGE by study administrators using the database's web interface, found at http://pulm.bumc.bu.edu/siegeDB. PERL-coded scripts integrated with SIEGE perform various quality control functions including the processing, filtering and formatting of stored data. The R statistical package is used to import database expression values and execute a number of statistical analyses including t-tests, correlation coefficients and hierarchical clustering. Values from all statistical analyses can be queried through CGI-based tools and web forms found on the “Search” section of the database website. Query results are embedded with graphical capabilities as well as with links to other databases containing valuable gene resources, including Entrez Gene, GO, Biocarta, GeneCards, dbSNP and the NCBI Map Viewer.

Relevance:

30.00%

Publisher:

Abstract:

BACKGROUND: Short (~5 nucleotides) interspersed repeats regulate several aspects of post-transcriptional gene expression. Previously we developed an algorithm (REPFIND) that assigns P-values to all repeated motifs in a given nucleic acid sequence and reliably identifies clusters of short CAC-containing motifs required for mRNA localization in Xenopus oocytes.
DESCRIPTION: In order to facilitate the identification of genes possessing clusters of repeats that regulate post-transcriptional aspects of gene expression in mammalian genes, we used REPFIND to create a database of all repeated motifs in the 3' untranslated regions (UTR) of genes from the Mammalian Gene Collection (MGC). The MGC database includes seven vertebrate species: human, cow, rat, mouse and three non-mammalian vertebrate species. A web-based application was developed to search this database of repeated motifs to generate species-specific lists of genes containing specific classes of repeats in their 3'-UTRs. This computational tool is called 3'-UTR SIRF (Short Interspersed Repeat Finder), and it reveals that hundreds of human genes contain an abundance of short CAC-rich and CAG-rich repeats in their 3'-UTRs that are similar to those found in mRNAs localized to the neurites of neurons. We tested four candidate mRNAs for localization in rat hippocampal neurons by in situ hybridization. Our results show that two candidate CAC-rich (Syntaxin 1B and Tubulin beta4) and two candidate CAG-rich (Sec61alpha and Syntaxin 1A) mRNAs are localized to distal neurites, whereas two control mRNAs lacking repeated motifs in their 3'-UTR remain primarily in the cell body.
CONCLUSION: Computational data generated with 3'-UTR SIRF indicate that hundreds of mammalian genes have an abundance of short CA-containing motifs that may direct mRNA localization in neurons. In situ hybridization shows that four candidate mRNAs are localized to distal neurites of cultured hippocampal neurons. These data suggest that short CA-containing motifs may be part of a widely utilized genetic code that regulates mRNA localization in vertebrate cells. The use of 3'-UTR SIRF to search for new classes of motifs that regulate other aspects of gene expression should yield important information in future studies addressing cis-regulatory information located in 3'-UTRs.

Relevance:

30.00%

Publisher:

Abstract:

Estimation of 3D hand pose is useful in many gesture recognition applications, ranging from human-computer interaction to automated recognition of sign languages. In this paper, 3D hand pose estimation is treated as a database indexing problem. Given an input image of a hand, the most similar images in a large database of hand images are retrieved. The hand pose parameters of the retrieved images are used as estimates for the hand pose in the input image. Lipschitz embeddings of edge images into a Euclidean space are used to improve the efficiency of database retrieval. In order to achieve interactive retrieval times, similarity queries are initially performed in this Euclidean space. The paper describes ongoing work that focuses on how to best choose reference images, in order to improve retrieval accuracy.

Relevance:

30.00%

Publisher:

Abstract:

The design of programs for broadcast disks which incorporate real-time and fault-tolerance requirements is considered. A generalized model for real-time fault-tolerant broadcast disks is defined. It is shown that designing programs for broadcast disks specified in this model is closely related to the scheduling of pinwheel task systems. Some new results in pinwheel scheduling theory are derived, which facilitate the efficient generation of real-time fault-tolerant broadcast disk programs.
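
As a hedged illustration of the pinwheel connection, the Python sketch below checks whether a cyclic broadcast program meets a set of pinwheel-style requirements. It assumes each requirement is a pair (a, b) meaning "at least a slots in every window of b consecutive slots"; this reading, the function name `satisfies_pinwheel`, and the file names are assumptions made for illustration and are not taken from the paper.

```python
def satisfies_pinwheel(schedule, requirements):
    """Check a cyclic schedule against pinwheel-style requirements.

    schedule: list of item names, repeated forever in the same order.
    requirements: dict mapping item -> (a, b), meaning the item must occupy
    at least `a` slots in every window of `b` consecutive slots.
    """
    n = len(schedule)
    for item, (a, b) in requirements.items():
        # The schedule is periodic, so checking the n distinct window
        # start positions covers every window of the infinite schedule.
        for start in range(n):
            window = [schedule[(start + k) % n] for k in range(b)]
            if window.count(item) < a:
                return False
    return True

program = ["A", "B", "A", "C"]
ok = satisfies_pinwheel(program, {"A": (1, 2), "B": (1, 4), "C": (1, 4)})
too_tight = satisfies_pinwheel(program, {"A": (1, 2), "B": (1, 3), "C": (1, 4)})
```

Here `ok` is true because file A appears in every window of two slots and files B and C in every window of four. `too_tight` is false because the window A, C, A contains no B. Requiring more than one slot per window (a > 1) is one hypothetical way to model the extra transmissions that a fault-tolerance requirement would add.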

Relevance:

30.00%

Publisher:

Abstract:

Transport protocols are an integral part of the inter-process communication (IPC) service used by application processes to communicate over the network infrastructure. With almost 30 years of research on transport, one would have hoped that we have a good handle on the problem. Unfortunately, that is not true. As the Internet continues to grow, new network technologies and new applications continue to emerge putting transport protocols in a never-ending flux as they are continuously adapted for these new environments. In this work, we propose a clean-slate transport architecture that renders all possible transport solutions as simply combinations of policies instantiated on a single common structure. We identify a minimal set of mechanisms that once instantiated with the appropriate policies allows any transport solution to be realized. Given our proposed architecture, we contend that there are no more transport protocols to design—only policies to specify. We implement our transport architecture in a declarative language, Network Datalog (NDlog), making the specification of different transport policies easy, compact, reusable, dynamically configurable and potentially verifiable. In NDlog, transport state is represented as database relations, state is updated/queried using database operations, and transport policies are specified using declarative rules. We identify limitations with NDlog that could potentially threaten the correctness of our specification. We propose several language extensions to NDlog that would significantly improve the programmability of transport policies.

Relevance:

30.00%

Publisher:

Abstract:

An improved method for deformable shape-based image indexing and retrieval is described. A pre-computed index tree is used to improve the speed of our previously reported on-line model fitting method; simple shape features are used as keys in a pre-generated index tree of model instances. In addition, a coarse to fine indexing scheme is used at different levels of the tree to further improve speed while maintaining matching accuracy. Experimental results show that the speedup is significant, while accuracy of shape-based indexing is maintained. A method for shape population-based retrieval is also described. The method allows query formulation based on the population distributions of shapes in each image. Results of population-based image queries for a database of blood cell micrographs are shown.

Relevance:

30.00%

Publisher:

Abstract:

In outsourced database (ODB) systems the database owner publishes its data through a number of remote servers, with the goal of enabling clients at the edge of the network to access and query the data more efficiently. As servers might be untrusted or can be compromised, query authentication becomes an essential component of ODB systems. Existing solutions for this problem concentrate mostly on static scenarios and are based on idealistic properties for certain cryptographic primitives. In this work, first we define a variety of essential and practical cost metrics associated with ODB systems. Then, we analytically evaluate a number of different approaches, in search of a solution that best leverages all metrics. Most importantly, we look at solutions that can handle dynamic scenarios, where owners periodically update the data residing at the servers. Finally, we discuss query freshness, a new dimension in data authentication that has not been explored before. A comprehensive experimental evaluation of the proposed and existing approaches is used to validate the analytical models and verify our claims. Our findings show that the proposed solutions improve performance substantially over existing approaches, both for static and dynamic environments.

Relevance:

30.00%

Publisher:

Abstract:

Nearest neighbor retrieval is the task of identifying, given a database of objects and a query object, the objects in the database that are the most similar to the query. Retrieving nearest neighbors is a necessary component of many practical applications, in fields as diverse as computer vision, pattern recognition, multimedia databases, bioinformatics, and computer networks. At the same time, finding nearest neighbors accurately and efficiently can be challenging, especially when the database contains a large number of objects, and when the underlying distance measure is computationally expensive. This thesis proposes new methods for improving the efficiency and accuracy of nearest neighbor retrieval and classification in spaces with computationally expensive distance measures. The proposed methods are domain-independent, and can be applied in arbitrary spaces, including non-Euclidean and non-metric spaces. In this thesis particular emphasis is given to computer vision applications related to object and shape recognition, where expensive non-Euclidean distance measures are often needed to achieve high accuracy. The first contribution of this thesis is the BoostMap algorithm for embedding arbitrary spaces into a vector space with a computationally efficient distance measure. Using this approach, an approximate set of nearest neighbors can be retrieved efficiently - often orders of magnitude faster than retrieval using the exact distance measure in the original space. The BoostMap algorithm has two key distinguishing features with respect to existing embedding methods. First, embedding construction explicitly maximizes the amount of nearest neighbor information preserved by the embedding. Second, embedding construction is treated as a machine learning problem, in contrast to existing methods that are based on geometric considerations. 
The second contribution is a method for constructing query-sensitive distance measures for the purposes of nearest neighbor retrieval and classification. In high-dimensional spaces, query-sensitive distance measures allow for automatic selection of the dimensions that are the most informative for each specific query object. It is shown theoretically and experimentally that query-sensitivity increases the modeling power of embeddings, allowing embeddings to capture a larger amount of the nearest neighbor structure of the original space. The third contribution is a method for speeding up nearest neighbor classification by combining multiple embedding-based nearest neighbor classifiers in a cascade. In a cascade, computationally efficient classifiers are used to quickly classify easy cases, and classifiers that are more computationally expensive and also more accurate are only applied to objects that are harder to classify. An interesting property of the proposed cascade method is that, under certain conditions, classification time actually decreases as the size of the database increases, a behavior that is in stark contrast to the behavior of typical nearest neighbor classification systems. The proposed methods are evaluated experimentally in several different applications: hand shape recognition, off-line character recognition, online character recognition, and efficient retrieval of time series. In all datasets, the proposed methods lead to significant improvements in accuracy and efficiency compared to existing state-of-the-art methods. In some datasets, the general-purpose methods introduced in this thesis even outperform domain-specific methods that have been custom-designed for such datasets.
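
The cascade idea in the third contribution can be sketched in a few lines of Python. This is a generic illustration under assumptions, not the thesis's method: each stage is taken to return a label together with a confidence score, and the names `cascade_classify`, `cheap`, and `expensive` are hypothetical, as is the toy task of labelling integers relative to 50.

```python
def cascade_classify(obj, stages, final_classifier):
    """Try cheap classifiers first; fall back to the expensive one.

    stages: list of (classify, min_confidence) pairs, cheapest first, where
    classify(obj) returns (label, confidence).
    final_classifier: the most accurate and most expensive classifier,
    which always returns a label.
    """
    for classify, min_confidence in stages:
        label, confidence = classify(obj)
        if confidence >= min_confidence:
            return label  # easy case: stop early
    return final_classifier(obj)  # hard case: pay for the expensive classifier

expensive_calls = []  # records which objects reached the expensive stage

def cheap(x):
    """Toy cheap classifier: confident only far from the boundary at 50."""
    label = "large" if x >= 50 else "small"
    return label, min(abs(x - 50) / 50, 1.0)

def expensive(x):
    """Toy stand-in (assumption) for an accurate but costly classifier."""
    expensive_calls.append(x)
    return "large" if x >= 50 else "small"

easy = cascade_classify(5, [(cheap, 0.5)], expensive)
hard = cascade_classify(48, [(cheap, 0.5)], expensive)
```

In this toy run the object 5 is settled by the cheap stage, and only 48, which lies close to the decision boundary, reaches the expensive classifier.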

Relevance:

30.00%

Publisher:

Abstract:

SyNAPSE program of the Defense Advanced Research Projects Agency (HRL Laboratories LLC, subcontract #801881-BS under DARPA prime contract HR0011-09-C-0001); CELEST, a National Science Foundation Science of Learning Center (SBE-0354378)

Relevance:

30.00%

Publisher:

Abstract:

Genetic Algorithms (GAs) make use of an internal representation of a given system in order to perform optimization functions. The actual structural layout of this representation, called a genome, has a crucial impact on the outcome of the optimization process. The purpose of this paper is to study the effects of different internal representations in a GA that generates neural networks. A second GA was used to optimize the genome structure. This structure produces an optimized system within a shorter time interval.

Relevance:

30.00%

Publisher:

Abstract:

The aim of this research, which focused on the Irish adult population, was to generate information for policymakers by applying statistical analyses and current technologies to oral health administrative and survey databases. Objectives included identifying socio-demographic influences on oral health and utilisation of dental services, comparing epidemiologically-estimated dental treatment need with treatment provided, and investigating the potential of a dental administrative database to provide information on utilisation of services and the volume and types of treatment provided over time. Information was extracted from the claims databases for the Dental Treatment Benefit Scheme (DTBS) for employed adults and the Dental Treatment Services Scheme (DTSS) for less-well-off adults, the National Surveys of Adult Oral Health, and the 2007 Survey of Lifestyle Attitudes and Nutrition in Ireland. Factors associated with utilisation and retention of natural teeth were analysed using count data models and logistic regression. The chi-square test and Student’s t-test were used to compare epidemiologically-estimated need in a representative sample of adults with treatment provided. Differences were found in dental care utilisation and tooth retention by socio-economic status. An analysis of the five-year utilisation behaviour of a 2003 cohort of DTBS dental attendees revealed that age and being female were positively associated with visiting annually and number of treatments. The number of adults using the DTBS increased, and the mean number of treatments per patient decreased, between 1997 and 2008. As a percentage of overall treatments, restorations, dentures, and extractions decreased, while prophylaxis increased. Differences were found between epidemiologically-estimated treatment need and treatment provided for those using the DTBS and DTSS. This research confirms the utility of survey and administrative data to generate knowledge for policymakers. 
Public administrative databases have not been designed for research purposes, but they have the potential to provide a wealth of knowledge on treatments provided and utilisation patterns.

Relevance:

30.00%

Publisher:

Abstract:

Introduction: Copayments for prescriptions are associated with decreased adherence to medicines resulting in increased health service utilisation, morbidity and mortality. In October 2010 a 50c copayment per prescription item was introduced on the General Medical Services (GMS) scheme in Ireland, the national public health insurance programme for low-income and older people. The copayment was increased to €1.50 per prescription item in January 2013. To date, the impact of these copayments on adherence to prescription medicines on the GMS scheme has not been assessed. Given that the GMS population comprises more than 40% of the Irish population, this presents an important public health problem. The aim of this thesis was to assess the impact of two prescription copayments, 50c and €1.50, on adherence to medicines.
Methods: In Chapter 2 the published literature was systematically reviewed with meta-analysis to a) develop evidence on cost-sharing for prescriptions and adherence to medicines and b) develop evidence for an alternative policy option: removal of copayments. The core research question of this thesis was addressed by a large before and after longitudinal study, with comparator group, using the national pharmacy claims database. New users of essential and less-essential medicines were included in the study with sample sizes ranging from 7,007 to 136,111 individuals in different medication groups. Segmented regression was used with generalised estimating equations to allow for correlations between repeated monthly measurements of adherence. A qualitative study involving 24 individuals was conducted to assess patient attitudes towards the 50c copayment policy. The qualitative and quantitative findings were integrated in the discussion chapter of the thesis. The vast majority of the literature on this topic area is generated in North America; therefore, a test of generalisability was carried out in Chapter 5 by comparing the impact of two similar copayment interventions on adherence, one in the U.S. and one in Ireland. The method used to measure adherence in Chapters 3 and 5 was validated in Chapter 6.
Results: The systematic review with meta-analysis demonstrated an 11% (95% CI 1.09 to 1.14) increased odds of non-adherence when publicly insured populations were exposed to copayments. The second systematic review found moderate but variable improvements in adherence after removal/reduction of copayments in a general population. The core paper of this thesis found that both the 50c and €1.50 copayments on the GMS scheme were associated with larger reductions in adherence to less-essential medicines than essential medicines directly after the implementation of policies. An important exception to this pattern was observed: adherence to anti-depressant medications declined by a larger extent than adherence to other essential medicines after both copayments. The cross-country comparison indicated that North American evidence on cost-sharing for prescriptions is not automatically generalisable to the Irish setting. Irish patients had greater immediate decreases of -5.3% (95% CI -6.9 to -3.7) and -2.8% (95% CI -4.9 to -0.7) in adherence to anti-hypertensives and anti-hyperlipidaemic medicines, respectively, directly after the policy changes, relative to their U.S. counterparts. In the long term, however, the U.S. and Irish populations had similar behaviours. The concordance study highlighted the possibility of a measurement bias occurring for the measurement of adherence to non-steroidal anti-inflammatory drugs in Chapter 3.
Conclusions: This thesis has presented two reviews of international cost-sharing policies, an assessment of the generalisability of international evidence and both qualitative and quantitative examinations of cost-sharing policies for prescription medicines on the GMS scheme in Ireland. It was found that the introduction of a 50c copayment and its subsequent increase to €1.50 on the GMS scheme had a larger impact on adherence to less-essential medicines relative to essential medicines, with the exception of anti-depressant medications. This is in line with policy objectives to reduce moral hazard and is therefore demonstrative of the value of such policies. There are, however, some caveats. The copayment now stands at €2.50 per prescription item. The impact of this increase in copayment has yet to be assessed, which is an obvious point for future research. Careful monitoring for adverse effects in socio-economically disadvantaged groups within the GMS population is also warranted. International evidence can be applied to the Irish setting to aid in future decision making in this area, but not without placing it in the local context first. Patients accepted the introduction of the 50c charge but voiced concerns over a rising price. The challenge for policymakers is to find the ‘optimal copayment’ – whereby moral hazard is decreased, but access to essential chronic disease medicines that provide advantages at the population level is not deterred. The evidence presented in this thesis will be usable for future policy-making in Ireland.