980 results for Untouchable Databases


Relevance:

10.00%

Publisher:

Abstract:

Postgraduate project/dissertation submitted to Universidade Fernando Pessoa in partial fulfillment of the requirements for the degree of Master in Dental Medicine

Relevance:

10.00%

Publisher:

Abstract:

Postgraduate project/dissertation submitted to Universidade Fernando Pessoa in partial fulfillment of the requirements for the degree of Master in Dental Medicine

Relevance:

10.00%

Publisher:

Abstract:

Postgraduate project/dissertation submitted to Universidade Fernando Pessoa in partial fulfillment of the requirements for the degree of Master in Dental Medicine

Relevance:

10.00%

Publisher:

Abstract:

Dissertation submitted to Universidade Fernando Pessoa in partial fulfillment of the requirements for the degree of Master in Computer Engineering, Information Systems and Multimedia branch

Relevance:

10.00%

Publisher:

Abstract:

Postgraduate project/dissertation submitted to Universidade Fernando Pessoa in partial fulfillment of the requirements for the degree of Master in Pharmaceutical Sciences

Relevance:

10.00%

Publisher:

Abstract:

The SIEGE (Smoking Induced Epithelial Gene Expression) database is a clinical resource for compiling and analyzing gene expression data from epithelial cells of the human intra-thoracic airway. This database supports a translational research study whose goal is to profile the changes in airway gene expression that are induced by cigarette smoke. RNA is isolated from airway epithelium obtained at bronchoscopy from current-, former- and never-smoker subjects, and hybridized to Affymetrix HG-U133A Genechips, which measure the level of expression of ~22 500 human transcripts. The microarray data generated along with relevant patient information is uploaded to SIEGE by study administrators using the database's web interface, found at http://pulm.bumc.bu.edu/siegeDB. PERL-coded scripts integrated with SIEGE perform various quality control functions including the processing, filtering and formatting of stored data. The R statistical package is used to import database expression values and execute a number of statistical analyses including t-tests, correlation coefficients and hierarchical clustering. Values from all statistical analyses can be queried through CGI-based tools and web forms found on the "Search" section of the database website. Query results are embedded with graphical capabilities as well as with links to other databases containing valuable gene resources, including Entrez Gene, GO, Biocarta, GeneCards, dbSNP and the NCBI Map Viewer.

Relevance:

10.00%

Publisher:

Abstract:

BACKGROUND: In the current climate of high-throughput computational biology, the inference of a protein's function from related measurements, such as protein-protein interaction relations, has become a canonical task. Most existing technologies pursue this task as a classification problem, on a term-by-term basis, for each term in a database, such as the Gene Ontology (GO) database, a popular rigorous vocabulary for biological functions. However, ontology structures are essentially hierarchies, with certain top-to-bottom annotation rules which protein function predictions should in principle follow. Currently, the most common approach to imposing these hierarchical constraints on network-based classifiers is to apply transitive closure to the predictions. RESULTS: We propose a probabilistic framework to integrate information in relational data, in the form of a protein-protein interaction network, and a hierarchically structured database of terms, in the form of the GO database, for the purpose of protein function prediction. At the heart of our framework is a factorization of local neighborhood information in the protein-protein interaction network across successive ancestral terms in the GO hierarchy. We introduce a classifier within this framework, with a computationally efficient implementation, that produces GO-term predictions that naturally obey a hierarchical 'true-path' consistency from root to leaves, without the need for further post-processing. CONCLUSION: A cross-validation study, using data from the yeast Saccharomyces cerevisiae, shows our method offers substantial improvements over both standard 'guilt-by-association' (i.e., Nearest-Neighbor) and more refined Markov random field methods, whether in their original form or when post-processed to artificially impose 'true-path' consistency. Further analysis of the results indicates that these improvements are associated with increased predictive capabilities (i.e., increased positive predictive value), and that this increase holds uniformly across GO-term depths. Additional in silico validation on a collection of new annotations recently added to GO confirms the advantages suggested by the cross-validation study. Taken as a whole, our results show that a hierarchical approach to network-based protein function prediction that exploits the ontological structure of protein annotation databases in a principled manner can offer substantial advantages over the successive application of 'flat' network-based methods.
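
The transitive-closure post-processing that the abstract names as the common baseline can be illustrated with a small sketch. This is not the authors' probabilistic classifier; it only shows one plausible way to force 'true-path' consistency onto independently produced per-term scores, by raising every ancestor's score to at least that of its descendants. The function name, the term labels and the `parents` mapping are hypothetical.

```python
def enforce_true_path(scores, parents):
    """Return scores in which every ancestor scores at least as high as its descendants.

    scores  : dict mapping a term to a score from a 'flat' per-term classifier
    parents : dict mapping a term to its direct parent terms (assumed acyclic)
    """
    closed = dict(scores)

    def lift(term, value):
        # Push the value upward until an ancestor already scores at least as high.
        for parent in parents.get(term, ()):
            if closed.get(parent, 0.0) < value:
                closed[parent] = value
                lift(parent, value)

    for term, value in scores.items():
        lift(term, value)
    return closed


# Hypothetical three-term chain: leaf -> mid -> root.
parents = {"leaf": ["mid"], "mid": ["root"]}
flat_scores = {"root": 0.2, "mid": 0.1, "leaf": 0.9}
consistent = enforce_true_path(flat_scores, parents)
```

In this toy chain the confident leaf prediction pulls both ancestors up to the same score, which is the kind of after-the-fact correction the proposed framework is said to make unnecessary.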

Relevance:

10.00%

Publisher:

Abstract:

Paper published in PLoS Medicine in 2007.

Relevance:

10.00%

Publisher:

Abstract:

We describe our work on shape-based image database search using the technique of modal matching. Modal matching employs a deformable shape decomposition that allows users to select example objects and have the computer efficiently sort the set of objects based on the similarity of their shape. Shapes are compared in terms of the types of nonrigid deformations (differences) that relate them. The modal decomposition provides deformation "control knobs" for flexible matching and thus allows for selecting weighted subsets of shape parameters that are deemed significant for a particular category or context. We demonstrate the utility of this approach for shape comparison in 2-D image databases; however, the general formulation is applicable to signals of any dimensionality.

Relevance:

10.00%

Publisher:

Abstract:

Recent work in sensor databases has focused extensively on distributed query problems, notably distributed computation of aggregates. Existing methods for computing aggregates broadcast queries to all sensors and use in-network aggregation of responses to minimize messaging costs. In this work, we focus on uniform random sampling across nodes, which can serve both as an alternative building block for aggregation and as an integral component of many other useful randomized algorithms. Prior to our work, the best existing proposals for uniform random sampling of sensors involve contacting all nodes in the network. We propose a practical method which is only approximately uniform, but contacts a number of sensors proportional to the diameter of the network instead of its size. The approximation achieved is tunably close to exact uniform sampling, and only relies on well-known existing primitives, namely geographic routing, distributed computation of Voronoi regions and von Neumann's rejection method. Ultimately, our sampling algorithm has the same worst-case asymptotic cost as routing a point-to-point message, and thus it is asymptotically optimal among request/reply-based sampling methods. We provide experimental results demonstrating the effectiveness of our algorithm on both synthetic and real sensor topologies.
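
The role of von Neumann's rejection method in this scheme can be sketched as follows. The sketch rests on simplifying assumptions that are not in the abstract: sensors lie in the unit square, each sensor's Voronoi cell area is already known exactly, and "routing to a random point" is simulated by a brute-force nearest-node search. The function and parameter names are hypothetical.

```python
import random


def sample_node(nodes, areas, rng):
    """Pick a node close to uniformly at random using rejection sampling.

    nodes : list of (x, y) sensor positions in the unit square
    areas : list of Voronoi cell areas, one per node (assumed known)
    rng   : random.Random instance
    """
    min_area = min(areas)
    while True:
        # Route to a random location; the node whose Voronoi cell contains it answers.
        x, y = rng.random(), rng.random()
        owner = min(
            range(len(nodes)),
            key=lambda i: (nodes[i][0] - x) ** 2 + (nodes[i][1] - y) ** 2,
        )
        # A node is reached with probability equal to its cell area, so accepting
        # with probability min_area / area cancels the bias toward large cells.
        if rng.random() < min_area / areas[owner]:
            return owner


# Two nodes whose cells split the unit square at x = 0.3 (areas 0.3 and 0.7).
rng = random.Random(0)
nodes = [(0.1, 0.5), (0.5, 0.5)]
areas = [0.3, 0.7]
draws = [sample_node(nodes, areas, rng) for _ in range(4000)]
```

Without the acceptance test the second node would be returned about 70% of the time; with it, both nodes should appear in roughly equal proportion, at the cost of some rejected attempts.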

Relevance:

10.00%

Publisher:

Abstract:

A problem with Speculative Concurrency Control (SCC) algorithms and other common concurrency control schemes that use forward validation is that committing a transaction as soon as it finishes validating may result in a value loss to the system. Haritsa showed that making a lower-priority transaction wait after it is validated increases the number of transactions meeting their deadlines, which may result in a higher value added to the system. SCC-based protocols can benefit from such delays by giving optimistic shadows with a high value added more time to execute and commit, instead of aborting them in favor of other validating transactions whose value added to the system is lower. In this paper we present and evaluate an extension to SCC algorithms that allows for commit deferments.

Relevance:

10.00%

Publisher:

Abstract:

The design of programs for broadcast disks which incorporate real-time and fault-tolerance requirements is considered. A generalized model for real-time fault-tolerant broadcast disks is defined. It is shown that designing programs for broadcast disks specified in this model is closely related to the scheduling of pinwheel task systems. Some new results in pinwheel scheduling theory are derived, which facilitate the efficient generation of real-time fault-tolerant broadcast disk programs.
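
To make the link to pinwheel scheduling concrete, the sketch below checks whether a cyclic broadcast program satisfies a set of pinwheel conditions, using the usual definition in which a task (a, b) must be served in at least a of every b consecutive slots. How the paper maps files, deadlines and fault-tolerance levels onto (a, b) pairs is not stated in the abstract, so the mapping used here (one task per file) is an assumption, as are the function name and the example data.

```python
def satisfies_pinwheel(schedule, tasks):
    """Check a cyclic schedule against pinwheel conditions.

    schedule : list of task names, one per slot, repeated forever
    tasks    : dict mapping a task name to (a, b), meaning the task must
               occupy at least a slots in every window of b consecutive slots
    """
    n = len(schedule)
    for task, (a, b) in tasks.items():
        # Because the schedule is cyclic, it suffices to test the n windows
        # that start at each slot of one period.
        for start in range(n):
            window = [schedule[(start + k) % n] for k in range(b)]
            if window.count(task) < a:
                return False
    return True


# Hypothetical broadcast program over three files.
program = ["A", "B", "A", "C"]
feasible = satisfies_pinwheel(program, {"A": (1, 2), "B": (1, 4), "C": (1, 4)})
too_tight = satisfies_pinwheel(program, {"A": (1, 2), "B": (1, 4), "C": (1, 3)})
```

Under this reading, raising a for a file would correspond to broadcasting more blocks of it per window, which is one way a fault-tolerance requirement could be expressed.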

Relevance:

10.00%

Publisher:

Abstract:

ImageRover is a search-by-image-content navigation tool for the World Wide Web. To gather images expediently, the image collection subsystem utilizes a distributed fleet of WWW robots running on different computers. The image robots gather information about the images they find, compute the appropriate image decompositions and indices, and store this extracted information in vector form for searches based on image content. At search time, users can iteratively guide the search through the selection of relevant examples. Search is made efficient through the use of an approximate, optimized k-d tree algorithm. The system employs a novel relevance feedback algorithm that selects the distance metrics appropriate for a particular query.

Relevance:

10.00%

Publisher:

Abstract:

There is an increased interest in using broadcast disks to support mobile access to real-time databases. However, previous work has only considered the design of real-time immutable broadcast disks, the contents of which do not change over time. This paper considers the design of programs for real-time mutable broadcast disks - broadcast disks whose contents are occasionally updated. Recent scheduling-theoretic results relating to pinwheel scheduling and pfair scheduling are used to design algorithms for the efficient generation of real-time mutable broadcast disk programs.

Relevance:

10.00%

Publisher:

Abstract:

The problem of discovering frequent poly-regions (i.e. regions of high occurrence of a set of items or patterns of a given alphabet) in a sequence is studied, and three efficient approaches are proposed to solve it. The first one is entropy-based and applies a recursive segmentation technique that produces a set of candidate segments which may potentially lead to a poly-region. The key idea of the second approach is the use of a set of sliding windows over the sequence. Each sliding window covers a sequence segment and keeps a set of statistics that mainly include the number of occurrences of each item or pattern in that segment. Combining these statistics efficiently yields the complete set of poly-regions in the given sequence. The third approach applies a technique based on the majority vote, achieving linear running time with a minimal number of false negatives. After identifying the poly-regions, the sequence is converted to a sequence of labeled intervals (each one corresponding to a poly-region). An efficient algorithm for mining frequent arrangements of intervals is applied to the converted sequence to discover frequently occurring arrangements of poly-regions in different parts of DNA, including coding regions. The proposed algorithms are tested on various DNA sequences producing results of significant biological meaning.
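
The sliding-window idea in the second approach can be illustrated with a much-simplified sketch. It tracks a single item rather than a set of items or patterns, uses one fixed window length, and treats a window as dense when the item count reaches a fixed threshold; none of these choices comes from the abstract, and the function name and parameters are hypothetical.

```python
def dense_regions(sequence, item, window, min_count):
    """Return merged (start, end) index ranges, inclusive, covered by dense windows.

    A window of `window` consecutive symbols is dense when it contains
    at least `min_count` occurrences of `item`.
    """
    regions = []
    count = 0
    for end, symbol in enumerate(sequence):
        count += symbol == item  # symbol entering the window
        start = end - window + 1
        if start > 0:
            count -= sequence[start - 1] == item  # symbol leaving the window
        if start >= 0 and count >= min_count:
            if regions and start <= regions[-1][1]:
                # Overlaps the previous dense window: extend that region.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions


# Toy DNA-like sequence with two A-rich stretches.
found = dense_regions("AAAATTTTAAAA", "A", window=4, min_count=3)
```

Each symbol is added to and removed from the running count once, so the scan is linear in the sequence length; the resulting ranges play the role of the labeled intervals that the later interval-mining step would consume.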