923 results for Teorema de Bayes (Bayes' theorem)


Relevance: 10.00%

Abstract:

Structural genomics aims to solve a large number of protein structures that represent the protein space. Currently, an exhaustive solution for all structures seems prohibitively expensive, so the challenge is to define a relatively small set of proteins with new, currently unknown folds. This paper presents a method that assigns each protein a probability of having an unsolved fold. The method makes extensive use of ProtoMap, a sequence-based classification, and SCOP, a structure-based classification. According to ProtoMap, the protein space encodes the relationship among proteins as a graph whose vertices correspond to 13,354 clusters of proteins. A representative fold for a cluster with at least one solved protein is determined after superposition of all SCOP (release 1.37) folds onto ProtoMap clusters. Distances within the ProtoMap graph are computed from each representative fold to the neighboring folds. The distribution of these distances is used to create a statistical model for distances among those folds that are already known and those that have yet to be discovered. The distributions of distances for solved and unsolved proteins differ significantly. This difference makes it possible to use Bayes' rule to derive a statistical estimate that any protein has a yet undetermined fold. Proteins that score the highest probability of representing a new fold constitute the target list for structural determination. Our predicted probabilities for unsolved proteins correlate very well with the proportion of new folds among recently solved structures (new SCOP 1.39 records) that are disjoint from our original training set.
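
The Bayes' rule step in this abstract can be illustrated with a minimal sketch. It is not the authors' model: the normal likelihoods, their parameters, the prior, and the names `normal_pdf` and `posterior_new_fold` are all assumptions chosen only to show how two different distance distributions turn into a posterior probability of a new fold.

```python
import math


def normal_pdf(x, mean, std):
    """Normal density, used here only as a stand-in likelihood (assumption)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posterior_new_fold(distance, prior_new, lik_new, lik_known):
    """P(new fold | distance) by Bayes' rule for two competing hypotheses."""
    num = lik_new(distance) * prior_new
    den = num + lik_known(distance) * (1.0 - prior_new)
    return num / den


# Hypothetical distance models: proteins with unsolved folds are assumed to
# sit farther from known folds in the cluster graph than solved ones.
lik_new = lambda d: normal_pdf(d, mean=6.0, std=2.0)
lik_known = lambda d: normal_pdf(d, mean=2.0, std=2.0)

for d in (1.0, 4.0, 8.0):
    p = posterior_new_fold(d, prior_new=0.5, lik_new=lik_new, lik_known=lik_known)
    print(f"distance={d:.1f}  P(new fold | distance)={p:.3f}")
```

Ranking proteins by this posterior and taking the top of the list corresponds to the target-selection idea described above; in practice the likelihoods would be estimated from the observed distance distributions rather than assumed.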

Relevance: 10.00%

Abstract:

Three-phase induction motors are the main elements for converting electrical energy into mechanical motive power in many productive sectors. Identifying a defect in a motor while it is operating, before it fails, can provide greater confidence in maintenance decisions, reduce costs, and increase availability. This thesis first presents a literature review and the general methodology for reproducing defects in the motors and for applying the technique of discretizing the current and voltage signals in the time domain. It also develops a comparative study of pattern classification methods for identifying defects in these machines: Naive Bayes, k-Nearest Neighbor, Support Vector Machine (Sequential Minimal Optimization), Artificial Neural Network (Multilayer Perceptron), Repeated Incremental Pruning to Produce Error Reduction, and C4.5 Decision Tree. The concept of Multi-Agent Systems (MAS) was also applied to support the distributed use of multiple concurrent methods for recognizing defect patterns in faulty bearings, broken bars in the rotor squirrel cage, and short circuits between the coils of the stator winding of three-phase induction motors. In addition, some strategies for determining the severity of these defects were explored, including an investigation of how voltage unbalance in the machine's supply influences the determination of these anomalies. The experimental data were acquired on a laboratory test bench with 1 and 2 cv (metric horsepower) motors connected directly to the electrical grid, operating under various conditions of voltage unbalance and with varying mechanical load applied to the motor shaft.

Relevance: 10.00%

Abstract:

The finite element method is the most widely used numerical method in structural analysis. Over the last decades, numerous finite elements have been formulated for the analysis of shells and plates. Finite element formulations handle the displacement field well, but tests that can validate the results obtained for the stress field are generally lacking. This work analyzes the T6-3i finite element, a six-node triangular element proposed within a geometrically exact formulation, with respect to its stress results. These are compared with analytical plate theories, with tabulated results for calculating moments in rectangular plates, and with ANSYS, a commercial structural analysis package, showing that the T6-3i can give unsatisfactory results. In the second part of this work, the capabilities of the T6-3i are extended by proposing a dynamic formulation for the nonlinear analysis of shells. An updated Lagrangian model is used, and the weak form is obtained from the Virtual Work Theorem. Numerical simulations are carried out of the deformation of thin domes that exhibit several snap-throughs and snap-backs, including domes with curved creases, showing the robustness, simplicity, and versatility of the element in its formulation and in the generation of the unstructured meshes required for the simulations.

Relevance: 10.00%

Abstract:

Geographic annotation of documents consists of adopting metadata to identify place names and the positions of their occurrences in the text. This information is useful, for example, for search engines. From the toponyms mentioned in a text, it is possible to identify the spatial context of its subject, which makes it possible to group documents that refer to the same context by assigning each document a geographic scope. This Master's dissertation presents a new method, named Geofier, for determining the geographic scope of documents. The novelty of Geofier is that it can identify the geographic scope of a document using machine learning classifiers trained without a gazetteer and without assumptions about the language of the analyzed texts. Wikipedia was used as the source of a set of geographically annotated documents for training a hierarchy of Naive Bayes classifiers and Support Vector Machines (SVMs). The performance of Geofier was compared with that of a reimplementation of the Web-a-Where system in determining the geographic scope of Wikipedia texts. The Geofier hierarchy was trained and evaluated in two ways: using toponyms from the same gazetteer as Web-a-Where, and using n-grams extracted from the training documents. As a result, Geofier maintained performance superior to that of the Web-a-Where reimplementation.
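
To make the gazetteer-free, n-gram-based classification idea concrete, here is a minimal sketch of a single multinomial Naive Bayes classifier over character n-grams. It is not Geofier itself: the class name `NaiveBayesScope`, the trigram features, the Laplace smoothing constant, and the four toy documents are assumptions for illustration, and the hierarchy and the SVMs described in the abstract are omitted.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Lower-cased character n-grams; no gazetteer or language model needed."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class NaiveBayesScope:
    """Multinomial Naive Bayes with Laplace smoothing (illustrative only)."""

    def __init__(self, n=3, alpha=1.0):
        self.n = n
        self.alpha = alpha
        self.doc_counts = Counter()              # documents per label
        self.feat_counts = defaultdict(Counter)  # n-gram counts per label
        self.vocab = set()

    def fit(self, docs, labels):
        for doc, label in zip(docs, labels):
            grams = char_ngrams(doc, self.n)
            self.doc_counts[label] += 1
            self.feat_counts[label].update(grams)
            self.vocab.update(grams)
        return self

    def predict(self, doc):
        # Ignore n-grams never seen in training.
        grams = [g for g in char_ngrams(doc, self.n) if g in self.vocab]
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.feat_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for g in grams:
                score += math.log((counts[g] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy training set standing in for geo-annotated articles.
docs = [
    "Lisbon is the capital of Portugal",
    "Porto is a city in northern Portugal",
    "Sao Paulo is the largest city in Brazil",
    "Rio de Janeiro is a city in Brazil",
]
labels = ["Portugal", "Portugal", "Brazil", "Brazil"]
model = NaiveBayesScope().fit(docs, labels)
print(model.predict("a trip to Brazil"))
```

A hierarchy in the spirit of the abstract could chain such classifiers, for example one per level of a region tree, but how Geofier structures its hierarchy is not specified here.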

Relevance: 10.00%

Abstract:

Notes in HTML format covering the following topics from the simulation part of the course "Simulation and Optimization of Chemical Processes":
TOPIC 1. Introduction
1.1 Introduction.
1.2 Historical development of process simulation. Relationship between process simulation, optimization, and synthesis.
1.3 Types of simulators: sequential modular, simultaneous modular, equation-based.
TOPIC 2. Sequential Modular Simulation
2.1 Decomposition of flow diagrams (flowsheeting).
2.2 Methods based on Boolean matrices. Locating maximal cyclic networks. Sargent and Westerberg algorithm. Tarjan algorithm.
2.3 Selection of tear streams:
2.3.1 General case: formulation as a set-covering problem (Pho and Lapidus algorithm).
2.3.2 Minimum number of tear streams (Barkley and Motard algorithm).
2.3.3 Non-redundant set of tear streams (Upadhye and Grens algorithm).
TOPIC 3. Simultaneous Modular Simulation
3.1 Effect of quasi-Newton strategies on the convergence of flowsheets.
TOPIC 4. Equation-Based Simulation
4.1 Introduction. Sparse matrix factorization methods. A priori methods and local methods.
4.2 Local methods: Markowitz criterion.
4.3 A priori methods:
4.3.1 Block triangularization: a. Admissible output set (complete transversal). b. Application of the Sargent and Tarjan algorithms to sparse matrices. c. Reordering.
4.3.2 Transformation into a bordered triangular matrix.
4.4 Numerical phase. RANKI algorithm.
4.5 Comparison of the different simulation systems. Advantages and disadvantages.
TOPIC 5. Degrees of freedom and design variables of a flowsheet
5.1 Duhem's theorem and the phase rule.
5.2 Degrees of freedom of a unit.
5.3 Degrees of freedom of a flowsheet.
5.4 Choice of design variables.

Relevance: 10.00%

Abstract:

Blind deconvolution is the problem of recovering a sharp image and a blur kernel from a noisy blurry image. Recently, there has been significant effort to understand the basic mechanisms for solving blind deconvolution. While this effort resulted in the deployment of effective algorithms, the theoretical findings generated contrasting views on why these approaches worked. On the one hand, one could observe experimentally that alternating energy minimization algorithms converge to the desired solution. On the other hand, it has been shown that such alternating minimization algorithms should fail to converge and that one should instead use a so-called Variational Bayes approach. To clarify this conundrum, recent work showed that a good image and blur prior is instead what makes a blind deconvolution algorithm work. Unfortunately, this analysis did not apply to algorithms based on total variation regularization. In this manuscript, we provide both analysis and experiments to get a clearer picture of blind deconvolution. Our analysis reveals the very reason why an algorithm based on total variation works. We also introduce an implementation of this algorithm and show that, in spite of its extreme simplicity, it is very robust and achieves performance comparable to that of the top-performing algorithms.
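
For orientation, a commonly used way of writing the blur model and a total-variation-regularized alternating minimization is sketched below. The abstract does not give its formulation, so the notation (observed image $y$, sharp image $x$, kernel $k$, noise $n$, weight $\lambda$), the quadratic data term, and the kernel constraints are all assumptions rather than the authors' exact energy.

```latex
% Assumed notation: y = observed blurry image, x = sharp image,
% k = blur kernel, n = noise, \lambda > 0 = regularization weight.
\begin{align}
  y &= k * x + n, \\
  (\hat{x}, \hat{k}) &= \arg\min_{x,\,k}\; \tfrac{1}{2}\,\| k * x - y \|_2^2
      + \lambda\, \| \nabla x \|_1
      \quad \text{s.t. } k \ge 0,\ \| k \|_1 = 1, \\
  x^{t+1} &= \arg\min_{x}\; \tfrac{1}{2}\,\| k^{t} * x - y \|_2^2
      + \lambda\, \| \nabla x \|_1, \\
  k^{t+1} &= \arg\min_{k \ge 0,\ \| k \|_1 = 1}\;
      \tfrac{1}{2}\,\| k * x^{t+1} - y \|_2^2 .
\end{align}
```

The last two lines are the alternating steps: the image is updated with the kernel held fixed, then the kernel is updated with the image held fixed.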

Relevance: 10.00%

Abstract:

"This report is based on Teresa Bayes' and Sandra Hough's masters theses ... "--Pref.

Relevance: 10.00%

Abstract:

Mode of access: Internet.

Relevance: 10.00%

Abstract:

The quality of reporting of studies of diagnostic accuracy is less than optimal. Complete and accurate reporting is necessary to enable readers to assess the potential for bias in the study and to evaluate the generalisability of the results. A group of scientists and editors has developed the STARD (Standards for Reporting of Diagnostic Accuracy) statement to improve the quality of reporting of studies of diagnostic accuracy. The statement consists of a checklist of 25 items and a flow diagram that authors can use to ensure that all relevant information is present. This explanatory document aims to facilitate the use, understanding and dissemination of the checklist. The document contains a clarification of the meaning, rationale and optimal use of each item on the checklist, as well as a short summary of the available evidence on bias and applicability. The STARD statement, checklist, flowchart and this explanation and elaboration document should be useful resources to improve reporting of diagnostic accuracy studies. Complete and informative reporting can only lead to better decisions in healthcare.

Relevance: 10.00%

Abstract:

Euastacus crayfish are endemic to freshwater ecosystems of the eastern coast of Australia. While recent evolutionary studies have focused on a few of these species, here we provide a comprehensive phylogenetic estimate of relationships among the species within the genus. We sequenced three mitochondrial gene regions (COI, 16S, and 12S) and one nuclear region (28S) from 40 species of the genus Euastacus, as well as one undescribed species. Using these data, we estimated the phylogenetic relationships within the genus using maximum-likelihood, parsimony, and Bayesian Markov Chain Monte Carlo analyses. Using Bayes factors to test different model hypotheses, we found that the best phylogeny supports monophyletic groupings of all but two recognized species and suggests a widespread ancestor that diverged by vicariance. We also show that Euastacus and Astacopsis are most likely monophyletic sister genera. We use the resulting phylogeny as a framework to test biogeographic hypotheses relating to the diversification of the genus. (c) 2005 Elsevier Inc. All rights reserved.

Relevance: 10.00%

Abstract:

An important and common problem in microarray experiments is the detection of genes that are differentially expressed in a given number of classes. As this problem concerns the selection of significant genes from a large pool of candidate genes, it needs to be carried out within the framework of multiple hypothesis testing. In this paper, we focus on the use of mixture models to handle the multiplicity issue. With this approach, a measure of the local false discovery rate is provided for each gene, and it can be implemented so that the implied global false discovery rate is bounded as with the Benjamini-Hochberg methodology based on tail areas. The latter procedure is too conservative, unless it is modified according to the prior probability that a gene is not differentially expressed. An attractive feature of the mixture model approach is that it provides a framework for the estimation of this probability and its subsequent use in forming a decision rule. The rule can also be formed to take the false negative rate into account.
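
The Benjamini-Hochberg tail-area procedure mentioned in this abstract, together with the prior-probability adjustment it alludes to, can be sketched as follows. The function name `benjamini_hochberg` and the optional `pi0` argument (an estimate of the proportion of non-differentially-expressed genes, assumed to be supplied by the user) are illustrative choices; the mixture-model estimate of that proportion described in the abstract is not implemented here.

```python
def benjamini_hochberg(p_values, q=0.05, pi0=1.0):
    """Step-up FDR procedure.

    Returns a list of booleans, True where the hypothesis is rejected.
    With pi0 = 1.0 this is the standard Benjamini-Hochberg rule; a value
    pi0 < 1 (an assumed estimate of the null proportion) relaxes the
    thresholds, which makes the procedure less conservative.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= k * q / (m * pi0).
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / (m * pi0):
            k_max = rank
    # Reject every hypothesis whose p-value rank is at most k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for four genes.
p = [0.039, 0.001, 0.6, 0.008]
print(benjamini_hochberg(p, q=0.05))
print(benjamini_hochberg(p, q=0.05, pi0=0.5))
```

With these toy p-values the adjusted call rejects one more hypothesis than the standard one, which illustrates the conservativeness the abstract refers to.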

Relevance: 10.00%

Abstract:

Motivation: An important problem in microarray experiments is the detection of genes that are differentially expressed in a given number of classes. We provide a straightforward and easily implemented method for estimating the posterior probability that an individual gene is null. The problem can be expressed in a two-component mixture framework, using an empirical Bayes approach. Current methods of implementing this approach either have limitations because of the minimal assumptions they make or, when more specific assumptions are made, are computationally intensive. Results: By converting to a z-score the value of the test statistic used to test the significance of each gene, we propose a simple two-component normal mixture that adequately models the distribution of this score. The usefulness of our approach is demonstrated on three real datasets.
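
A minimal sketch of a two-component normal mixture on z-scores follows. It assumes, for illustration only, that the null component is fixed at the standard normal and that the remaining parameters are fitted by a plain EM loop; the names `fit_two_component` and `posterior_null`, the starting values, and the idealized synthetic z-scores are not taken from the paper.

```python
import math
from statistics import NormalDist


def phi(z, mu=0.0, sigma=1.0):
    """Normal density with mean mu and standard deviation sigma."""
    u = (z - mu) / sigma
    return math.exp(-0.5 * u * u) / (sigma * math.sqrt(2.0 * math.pi))


def posterior_null(z, pi0, mu1, sigma1):
    """Posterior probability that a gene with z-score z is null."""
    f0 = pi0 * phi(z)
    f1 = (1.0 - pi0) * phi(z, mu1, sigma1)
    return f0 / (f0 + f1)


def fit_two_component(z_scores, pi0=0.9, mu1=1.0, sigma1=1.0, n_iter=200):
    """EM for pi0 * N(0, 1) + (1 - pi0) * N(mu1, sigma1^2), null held fixed."""
    n = len(z_scores)
    for _ in range(n_iter):
        # E-step: responsibility of the null component for each z-score.
        tau0 = [posterior_null(z, pi0, mu1, sigma1) for z in z_scores]
        w1 = [1.0 - t for t in tau0]
        s1 = sum(w1)
        # M-step: update the null proportion and the non-null component.
        pi0 = sum(tau0) / n
        mu1 = sum(w * z for w, z in zip(w1, z_scores)) / s1
        var1 = sum(w * (z - mu1) ** 2 for w, z in zip(w1, z_scores)) / s1
        sigma1 = max(math.sqrt(var1), 1e-3)  # guard against collapse
    return pi0, mu1, sigma1


# Idealized synthetic z-scores (quantiles, not real data):
# 900 null genes from N(0, 1) and 100 non-null genes from N(3, 1).
std_normal = NormalDist()
null_z = [std_normal.inv_cdf((i + 0.5) / 900) for i in range(900)]
alt_z = [3.0 + std_normal.inv_cdf((i + 0.5) / 100) for i in range(100)]
z_scores = null_z + alt_z

pi0, mu1, sigma1 = fit_two_component(z_scores)
print(f"pi0={pi0:.3f} mu1={mu1:.3f} sigma1={sigma1:.3f}")
print(f"P(null | z=4) = {posterior_null(4.0, pi0, mu1, sigma1):.4f}")
```

Genes can then be ranked by `posterior_null`, with small values flagging likely differential expression.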

Relevance: 10.00%

Abstract:

In this paper we demonstrate that it is possible to gradually improve the performance of support vector machine (SVM) classifiers by using a genetic algorithm to select a sequence of training subsets from the available data. Performance improvement is possible because the SVM solution generally lies some distance away from the Bayes optimal in the space of learning parameters. We illustrate performance improvements on a number of benchmark data sets.

Relevance: 10.00%

Abstract:

A novel approach, based on statistical mechanics, to analyzing the typical performance of optimum code-division multiple-access (CDMA) multiuser detectors is reviewed. A 'black-box' view of the basic CDMA channel is introduced, based on which the CDMA multiuser detection problem is regarded as a 'learning-from-examples' problem of the 'binary linear perceptron' in the neural network literature. Adopting a Bayesian framework, analysis of the performance of the optimum CDMA multiuser detectors is reduced to evaluating the average of the cumulant generating function of a relevant posterior distribution. This average is evaluated, by formal analogy with a similar calculation in the spin glass theory of statistical mechanics, using the replica method, which was developed in spin glass theory.

Relevance: 10.00%

Abstract:

This paper addresses the problem of novelty detection when the observed data are a mixture of a known 'background' process contaminated with an unknown other process, which generates the outliers, or novel observations. The framework we describe here is quite general, employing univariate classification with incomplete information, based on knowledge of the distribution (the probability density function, 'pdf') of the data generated by the 'background' process. The relative proportion of this 'background' component (the 'prior' 'background' probability) and the 'pdf' and 'prior' probabilities of all other components are assumed unknown. The main contribution is a new classification scheme that identifies the maximum proportion of observed data following the known 'background' distribution. The method exploits the Kolmogorov-Smirnov test to estimate the proportions, after which the data are Bayes-optimally separated. Results, demonstrated with synthetic data, show that this approach can produce more reliable results than a standard novelty detection scheme. The classification algorithm is then applied to the problem of identifying outliers in the SIC2004 data set, in order to detect the radioactive release simulated in the 'joker' data set. We propose this method as a reliable means of novelty detection in emergency situations, which can also be used to identify outliers prior to the application of a more general automatic mapping algorithm. © Springer-Verlag 2007.