868 results for Bayesian algorithm
Abstract:
BACKGROUND: Surveillance of multiple congenital anomalies is considered to be more sensitive for the detection of new teratogens than surveillance of all or of isolated congenital anomalies. The current literature proposes manual review of all cases for classification into isolated or multiple congenital anomalies. METHODS: Multiple anomalies were defined as two or more major congenital anomalies, excluding sequences and syndromes. A computer algorithm for classification of major congenital anomaly cases in the EUROCAT database according to International Classification of Diseases, 10th revision (ICD-10) codes was programmed, further developed, and implemented for one year's data (2004) from 25 registries. The group of cases classified as potentially having multiple congenital anomalies was manually reviewed by three geneticists to reach a final agreed classification as "multiple congenital anomaly" cases. RESULTS: A total of 17,733 cases with major congenital anomalies were reported, giving an overall prevalence of major congenital anomalies of 2.17%. The computer algorithm classified 10.5% of all cases as "potentially multiple congenital anomalies". After manual review of these cases, 7% were agreed to have true multiple congenital anomalies. Furthermore, the algorithm classified 15% of all cases as having chromosomal anomalies, 2% as monogenic syndromes, and 76% as isolated congenital anomalies. The proportion of multiple anomalies varies by congenital anomaly subgroup, reaching up to 35% of cases with bilateral renal agenesis. CONCLUSIONS: The implementation of the EUROCAT computer algorithm is a feasible, efficient, and transparent way to improve classification of congenital anomalies for surveillance and research.
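The abstract does not list the algorithm's actual rules; purely as an illustration of how an ICD-10-based case classifier of this kind can be structured, a minimal sketch is given below. The code prefixes and helper names (CHROMOSOMAL_PREFIXES, MINOR_OR_EXCLUDED_PREFIXES, classify_case) are hypothetical placeholders, not the EUROCAT rules.

```python
# Hypothetical sketch of an ICD-10-based case classifier in the spirit of the
# EUROCAT algorithm described above; the code prefixes are illustrative only.

CHROMOSOMAL_PREFIXES = ("Q90", "Q91", "Q92", "Q93", "Q95", "Q96", "Q97", "Q98", "Q99")
MONOGENIC_SYNDROME_PREFIXES = ("Q87",)      # placeholder for recognised syndromes
MINOR_OR_EXCLUDED_PREFIXES = ("Q82.5",)     # placeholder for minor anomalies / sequences

def classify_case(icd10_codes):
    """Classify one case from its list of ICD-10 congenital anomaly codes."""
    if any(code.startswith(CHROMOSOMAL_PREFIXES) for code in icd10_codes):
        return "chromosomal"
    if any(code.startswith(MONOGENIC_SYNDROME_PREFIXES) for code in icd10_codes):
        return "monogenic syndrome"
    # Count major anomalies after excluding minor anomalies and sequences.
    major = [c for c in icd10_codes if not c.startswith(MINOR_OR_EXCLUDED_PREFIXES)]
    if len(set(major)) >= 2:
        return "potential multiple congenital anomaly"   # flagged for manual review
    return "isolated"

print(classify_case(["Q21.0", "Q60.1"]))   # -> potential multiple congenital anomaly
```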
Abstract:
The atomic force microscope is not only a very convenient tool for studying the topography of different samples, but it can also be used to measure specific binding forces between molecules. For this purpose, one type of molecule is attached to the tip and the other to the substrate. Bringing the tip close to the substrate allows the molecules to bind together; retracting the tip breaks the newly formed bond. The rupture of a specific bond appears in the force-distance curve as a spike from which the binding force can be deduced. In this article we present an algorithm to automatically process force-distance curves in order to obtain bond-strength histograms. The algorithm is based on a fuzzy logic approach that assigns a "quality" score to every event and makes the detection procedure much faster than manual selection. The software has been applied to measure the binding strength between tubulin and microtubule-associated proteins.
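The article's membership functions are not given in the abstract; the following is a minimal, hypothetical sketch of how a fuzzy quality score could be attached to each candidate rupture event in a retract force-distance curve. The feature names and thresholds are illustrative assumptions, not the published algorithm.

```python
import numpy as np

def trapezoid(x, a, b):
    """Simple fuzzy membership: 0 below a, 1 above b, linear in between."""
    return float(np.clip((x - a) / (b - a), 0.0, 1.0))

def event_quality(force, index, baseline_noise):
    """Score a candidate rupture (a snap-back at `index`) between 0 and 1.

    Hypothetical fuzzy criteria: the upward jump back towards the baseline
    should be large relative to the noise, and the force should settle near
    the baseline afterwards.
    """
    jump = force[index + 1] - force[index]                  # step height at the spike
    relaxed = abs(force[index + 1]) < 3 * baseline_noise
    q_height = trapezoid(jump / baseline_noise, 2.0, 6.0)   # "clearly above noise"
    q_relax = 1.0 if relaxed else 0.3
    return min(q_height, q_relax)                           # fuzzy AND (minimum)

# Usage on a schematic retract curve with one rupture event (forces in nN).
force = np.concatenate([np.linspace(0, -0.8, 50), [0.02] * 50])
noise = 0.05
scores = [event_quality(force, i, noise) for i in range(len(force) - 1)]
print("best candidate index:", int(np.argmax(scores)), "quality:", max(scores))
```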
Abstract:
Background: The first AO comprehensive pediatric long bone fracture classification system was established following a structured path of development and validation with experienced pediatric surgeons. Methods: A follow-up series of agreement studies was conducted to specify and evaluate a grading system for displacement of pediatric supracondylar fractures. An iterative process comprising an international group of 5 experienced pediatric surgeons (Phase 1), followed by a pragmatic multicenter agreement study involving 26 raters (Phase 2), was used. The final evaluations were conducted on a consecutive collection of 154 supracondylar fractures documented by standard anteroposterior and lateral radiographs. Results: Fractures were classified according to 1 of 4 grades: I = incomplete fracture with no or minimal displacement; II = incomplete fracture with continuity of the posterior (extension fracture) or anterior cortex (flexion fracture); III = lack of bone continuity (broken cortex) but still some contact between the fracture planes; IV = complete fracture with no bone continuity (broken cortex) and no contact between the fracture planes. A diagnostic algorithm to support the practical application of the grading system in a clinical setting was proposed, together with an aid consisting of a circle placed over the capitellum. The overall kappa coefficients were 0.68 and 0.61 in the Phase 1 and Phase 2 studies, respectively. In the Phase 1 study, fracture grades I, II, III, and IV were classified with median accuracies of 91%, 82%, 83%, and 99.5%, respectively. Similar median accuracies of 86% (grade I), 73% (grade II), 83% (grade III), and 92% (grade IV) were reported for the Phase 2 study. Reliability was high in distinguishing complete, unstable fractures from stable injuries (kappa coefficients of 0.84 in Phase 1 and 0.83 in Phase 2); in Phase 2, surgeons' accuracies in classifying complete fractures were all above 85%. Conclusions: With clear and unambiguous definitions, this new grading system for supracondylar fracture displacement has proved sufficiently reliable and accurate when applied by pediatric surgeons in the framework of clinical routine as well as research.
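As a reading aid only, the four grades quoted above can be written as a small decision function. The boolean feature names are hypothetical, and the real assessment is of course made on the radiographs (with the circle-over-the-capitellum aid), not from pre-extracted flags.

```python
def grade_supracondylar_fracture(complete_cortical_break: bool,
                                 fragments_in_contact: bool,
                                 displacement: str) -> str:
    """Map the grading rules quoted in the abstract onto grades I-IV.

    displacement: 'none_or_minimal' or 'displaced' (hypothetical encoding).
    """
    if not complete_cortical_break:
        # Incomplete fracture: the anterior or posterior cortex is still in continuity.
        return "I" if displacement == "none_or_minimal" else "II"
    # Complete fracture: broken cortex on both sides.
    return "III" if fragments_in_contact else "IV"

print(grade_supracondylar_fracture(False, True, "none_or_minimal"))  # -> I
print(grade_supracondylar_fracture(True, False, "displaced"))        # -> IV
```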
Abstract:
Background: Research in epistasis or gene-gene interaction detection for human complex traits has grown over the last few years. It has been marked by promising methodological developments, improved translation efforts of statistical epistasis to biological epistasis and attempts to integrate different omics information sources into the epistasis screening to enhance power. The quest for gene-gene interactions poses severe multiple-testing problems. In this context, the maxT algorithm is one technique to control the false-positive rate. However, the memory needed by this algorithm rises linearly with the amount of hypothesis tests. Gene-gene interaction studies will require a memory proportional to the squared number of SNPs. A genome-wide epistasis search would therefore require terabytes of memory. Hence, cache problems are likely to occur, increasing the computation time. In this work we present a new version of maxT, requiring an amount of memory independent from the number of genetic effects to be investigated. This algorithm was implemented in C++ in our epistasis screening software MBMDR-3.0.3. We evaluate the new implementation in terms of memory efficiency and speed using simulated data. The software is illustrated on real-life data for Crohn’s disease. Results: In the case of a binary (affected/unaffected) trait, the parallel workflow of MBMDR-3.0.3 analyzes all gene-gene interactions with a dataset of 100,000 SNPs typed on 1000 individuals within 4 days and 9 hours, using 999 permutations of the trait to assess statistical significance, on a cluster composed of 10 blades, containing each four Quad-Core AMD Opteron(tm) Processor 2352 2.1 GHz. In the case of a continuous trait, a similar run takes 9 days. Our program found 14 SNP-SNP interactions with a multiple-testing corrected p-value of less than 0.05 on real-life Crohn’s disease (CD) data. Conclusions: Our software is the first implementation of the MB-MDR methodology able to solve large-scale SNP-SNP interactions problems within a few days, without using much memory, while adequately controlling the type I error rates. A new implementation to reach genome-wide epistasis screening is under construction. In the context of Crohn’s disease, MBMDR-3.0.3 could identify epistasis involving regions that are well known in the field and could be explained from a biological point of view. This demonstrates the power of our software to find relevant phenotype-genotype higher-order associations.
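The paper's data structures are not given in the abstract; the sketch below only illustrates the general idea of a maxT permutation procedure whose memory footprint does not grow with the number of tests: instead of storing the full permutation-by-test matrix, only the maximum statistic of each permutation is kept. The function name and the toy test statistic are assumptions, not the MB-MDR code.

```python
import numpy as np

def maxt_adjusted_pvalues(genotypes, trait, n_perm=999, rng=None):
    """maxT-style multiple-testing correction keeping one number per permutation.

    genotypes: (n_individuals, n_tests) matrix, one column per (toy) genetic effect.
    trait:     binary 0/1 vector. The per-test statistic here is a simple absolute
               mean difference between trait groups -- a stand-in for the MB-MDR
               statistic, chosen only to keep the sketch short.
    """
    if rng is None:
        rng = np.random.default_rng(0)

    def statistics(y):
        return np.abs(genotypes[y == 1].mean(axis=0) - genotypes[y == 0].mean(axis=0))

    observed = statistics(trait)

    # Memory footprint independent of the number of tests: store only the maximum
    # statistic of each permutation, not the full permutation-by-test matrix.
    perm_max = np.empty(n_perm)
    for b in range(n_perm):
        perm_max[b] = statistics(rng.permutation(trait)).max()
    perm_max.sort()

    # Adjusted p-value: count permutations whose maximum is at least the observed value.
    exceed = n_perm - np.searchsorted(perm_max, observed, side="left")
    return (1 + exceed) / (n_perm + 1)

# Toy usage: 200 individuals, 50 effects, no true signal.
g = np.random.default_rng(1).integers(0, 3, size=(200, 50)).astype(float)
y = np.random.default_rng(2).integers(0, 2, size=200)
print(maxt_adjusted_pvalues(g, y).min())
```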
Abstract:
Both Bayesian networks and probabilistic evaluation are gaining increasingly widespread use within many professional branches, including forensic science. Nevertheless, they are subtle topics with definitional details that require careful study. While many sophisticated developments of probabilistic approaches to the evaluation of forensic findings may readily be found in the published literature, there remains a gap with respect to writings that focus on foundational aspects and on how these may be acquired by interested scientists new to these topics. This paper takes this as a starting point to report on how a class of master's students in forensic science learned about Bayesian networks for likelihood ratio based probabilistic inference procedures. The presentation uses an example that relies on a casework scenario drawn from the published literature, involving a questioned signature. A complicating aspect of that case study - proposed to students in a teaching scenario - is the need to consider multiple competing propositions, a setting that may not readily be approached within a likelihood ratio based framework without drawing attention to some additional technical details. Using generic Bayesian network fragments from the existing literature on the topic, course participants were able to track the probabilistic underpinnings of the proposed scenario correctly, both in terms of likelihood ratios and of posterior probabilities. In addition, further study of the example allowed students to derive an alternative Bayesian network structure with a computational output equivalent to existing probabilistic solutions. This practical experience underlines the potential of Bayesian networks to support and clarify foundational principles of probabilistic procedures for forensic evaluation.
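The abstract alludes to the technical detail that, with more than two competing propositions, a single likelihood ratio no longer summarizes the problem; the standard way to make this explicit (a textbook relation, not a formula quoted from the paper) is to carry the full set of propositions through Bayes' theorem:

```latex
% Posterior probability of proposition H_i given findings E,
% for k mutually exclusive, exhaustive propositions H_1, ..., H_k:
\[
  \Pr(H_i \mid E) \;=\;
  \frac{\Pr(E \mid H_i)\,\Pr(H_i)}{\sum_{j=1}^{k} \Pr(E \mid H_j)\,\Pr(H_j)},
\]
% while any pair of propositions can still be compared through a likelihood ratio:
\[
  \mathrm{LR}_{i,j} \;=\; \frac{\Pr(E \mid H_i)}{\Pr(E \mid H_j)} .
\]
```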
Abstract:
The relationship between source separation and blind deconvolution is well known: if a filtered version of an unknown i.i.d. signal is observed, temporal independence between samples can be used to retrieve the original signal, in the same way that spatial independence is used for source separation. In this paper we propose the use of a Genetic Algorithm (GA) to blindly invert linear channels. The use of a GA is justified when the number of samples is small, where gradient-based methods fail because of the poor estimation of the statistics.
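No implementation details are given in the abstract; the sketch below shows one plausible way such a GA could be set up, with the fitness measuring non-Gaussianity (absolute excess kurtosis) of the equalizer output as a proxy for the i.i.d./independence criterion. Population size, mutation scale, and filter length are arbitrary illustrative choices, not the authors' settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(w, x):
    """Non-Gaussianity (absolute excess kurtosis) of the equalized signal;
    restoring an i.i.d. non-Gaussian source tends to maximize this criterion."""
    y = np.convolve(x, w, mode="same")
    y = (y - y.mean()) / (y.std() + 1e-12)
    return abs(np.mean(y**4) - 3.0)

def ga_blind_equalizer(x, filt_len=7, pop_size=40, n_gen=100):
    """Tiny GA evolving the coefficients of an inverse (equalizing) filter."""
    pop = rng.normal(size=(pop_size, filt_len))
    for _ in range(n_gen):
        scores = np.array([fitness(w, x) for w in pop])
        parents = pop[np.argsort(scores)[-pop_size // 2:]]       # truncation selection
        cross = parents[rng.integers(len(parents), size=(pop_size, 2))]
        mask = rng.random((pop_size, filt_len)) < 0.5             # uniform crossover
        pop = np.where(mask, cross[:, 0], cross[:, 1])
        pop += rng.normal(scale=0.05, size=pop.shape)             # Gaussian mutation
    return max(pop, key=lambda w: fitness(w, x))

# Toy usage: a uniform (non-Gaussian) i.i.d. source filtered by an unknown channel.
s = rng.uniform(-1, 1, size=2000)
x = np.convolve(s, [1.0, 0.6, -0.3], mode="same")
w_hat = ga_blind_equalizer(x)
print("kurtosis criterion of equalized output:", round(fitness(w_hat, x), 3))
```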
Abstract:
Significant progress has been made with regard to the quantitative integration of geophysical and hydrological data at the local scale. However, extending the corresponding approaches to the regional scale represents a major, and as yet largely unresolved, challenge. To address this problem, we have developed an upscaling procedure based on a Bayesian sequential simulation approach. This method is then applied to the stochastic integration of low-resolution, regional-scale electrical resistivity tomography (ERT) data in combination with high-resolution, local-scale downhole measurements of the hydraulic and electrical conductivities. Finally, the overall viability of this upscaling approach is tested and verified by performing and comparing flow and transport simulations through the original and the upscaled hydraulic conductivity fields. Our results indicate that the proposed procedure does indeed allow us to obtain remarkably faithful estimates of the regional-scale hydraulic conductivity structure and correspondingly reliable predictions of the transport characteristics over relatively long distances.
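The abstract does not detail the algorithm; purely as an illustrative sketch, the fragment below shows the core of a drastically simplified, 1-D Bayesian sequential simulation step: at each cell along a random path, a Gaussian likelihood derived from the collocated geophysical attribute is combined with a Gaussian prior conditioned on the nearest already-simulated cell, and a value of log hydraulic conductivity is drawn from the resulting posterior. The regression coefficients, correlation length, and nearest-neighbor prior (in place of full kriging) are assumptions made only for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def bayesian_sequential_sim(log_res, a, b, s2_lik, m, s2, corr_len, dx=1.0):
    """Simplified 1-D Bayesian sequential simulation of log-conductivity.

    log_res : collocated log-resistivity (e.g. from ERT) on the grid
    a, b, s2_lik : calibrated linear relation log_K ~ a*log_res + b, residual variance
    m, s2   : global prior mean and variance of log_K
    corr_len: correlation length of the (exponential) spatial model
    """
    n = len(log_res)
    sim = np.full(n, np.nan)
    for i in rng.permutation(n):                      # random visiting path
        # Likelihood from the collocated geophysical attribute.
        mu_lik, var_lik = a * log_res[i] + b, s2_lik
        # Prior from the nearest already-simulated cell (stand-in for kriging).
        done = np.where(~np.isnan(sim))[0]
        if done.size:
            j = done[np.argmin(np.abs(done - i))]
            rho = np.exp(-abs(i - j) * dx / corr_len)
            mu_pri, var_pri = m + rho * (sim[j] - m), s2 * (1 - rho**2)
        else:
            mu_pri, var_pri = m, s2
        # Combine the two Gaussians and draw the local value.
        prec = 1 / var_lik + 1 / var_pri
        mu_post = (mu_lik / var_lik + mu_pri / var_pri) / prec
        sim[i] = rng.normal(mu_post, np.sqrt(1 / prec))
    return sim

print(bayesian_sequential_sim(np.linspace(2.0, 1.0, 10), a=-1.5, b=0.0,
                              s2_lik=0.1, m=-3.0, s2=1.0, corr_len=5.0)[:3])
```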
Abstract:
In this demonstration we present our web services to perform Bayesian learning for classification tasks.
Abstract:
A statewide study was performed to develop regional regression equations for estimating selected annual exceedance-probability statistics for ungaged stream sites in Iowa. The study area comprises streamgages located within Iowa and 50 miles beyond the State's borders. Annual exceedance-probability estimates were computed for 518 streamgages by using the expected moments algorithm to fit a Pearson Type III distribution to the logarithms of annual peak discharges for each streamgage, using annual peak-discharge data through 2010. The estimation of the selected statistics included a Bayesian weighted least-squares/generalized least-squares regression analysis to update regional skew coefficients for the 518 streamgages. Low-outlier and historical information were incorporated into the annual exceedance-probability analyses, and a generalized Grubbs-Beck test was used to detect multiple potentially influential low flows. Also, geographic information system software was used to measure 59 selected basin characteristics for each streamgage. Regional regression analysis, using generalized least-squares regression, was used to develop a set of equations for each flood region in Iowa for estimating discharges for ungaged stream sites with 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probabilities, which are equivalent to annual flood-frequency recurrence intervals of 2, 5, 10, 25, 50, 100, 200, and 500 years, respectively. A total of 394 streamgages were included in the development of regional regression equations for three flood regions (regions 1, 2, and 3) that were defined for Iowa based on landform regions and soil regions. Average standard errors of prediction range from 31.8 to 45.2 percent for flood region 1, from 19.4 to 46.8 percent for flood region 2, and from 26.5 to 43.1 percent for flood region 3. The pseudo coefficients of determination for the generalized least-squares equations range from 90.8 to 96.2 percent for flood region 1, from 91.5 to 97.9 percent for flood region 2, and from 92.4 to 96.0 percent for flood region 3. The regression equations are applicable only to stream sites in Iowa with flows not significantly affected by regulation, diversion, channelization, backwater, or urbanization and with basin characteristics within the range of those used to develop the equations. These regression equations will be implemented within the U.S. Geological Survey StreamStats Web-based geographic information system tool. StreamStats allows users to click on any ungaged site on a river and compute estimates of the eight selected statistics; 90-percent prediction intervals and the measured basin characteristics for the ungaged site are also provided. StreamStats also allows users to click on any streamgage in Iowa and obtain the estimates of these eight selected statistics computed for that streamgage.
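Two standard relations sit behind this abstract (given here as textbook forms, not equations quoted from the report): the quantile of a Pearson Type III distribution fitted to the logarithms of the annual peaks, and the typical form of a regional regression equation linking a flood quantile to basin characteristics.

```latex
% Log-Pearson Type III quantile for annual exceedance probability p
% (X = log Q; the mean \bar{X}, standard deviation S, and skew G of the log peaks
% are estimated, here via the expected moments algorithm, and K_p(G) is the
% frequency factor):
\[
  \log Q_p \;=\; \bar{X} \;+\; K_p(G)\, S .
\]
% Typical form of a generalized least-squares regional regression equation,
% with basin characteristics A_1, ..., A_m (e.g. drainage area):
\[
  Q_p \;=\; 10^{\,c_0}\, A_1^{\,c_1} A_2^{\,c_2} \cdots A_m^{\,c_m} .
\]
```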
Abstract:
Context: Ovarian tumor (OT) typing is a competency expected from pathologists, with significant clinical implications. OT, however, come in numerous different types, some rather rare, with the consequence that some departments have few opportunities for practice. Aim: Our aim was to design a tool for pathologists to train in the typing of less common OT. Method and Results: Representative slides of 20 less common OT were scanned (NanoZoomer Digital, Hamamatsu®) and the diagnostic algorithm proposed by Young and Scully (Young RH and Scully RE, Seminars in Diagnostic Pathology 2001, 18: 161-235) was applied to each case, covering: recognition of morphological pattern(s); shortlisting of differential diagnoses; and proposition of relevant immunohistochemical markers. The next steps of this project will be: evaluation of the tool in several post-graduate training centers in Europe and Québec; improvement of its design based on the evaluation results; and diffusion to a larger public. Discussion: In clinical medicine, solving many cases is recognized as being of utmost importance for a novice to become an expert. This project relies on virtual slide technology to provide pathologists with a learning tool aimed at increasing their skills in OT typing. After due evaluation, this model might be extended to other uncommon tumors.
Abstract:
In this paper, a hybrid simulation-based algorithm is proposed for the Stochastic Flow Shop Problem. The main idea of the methodology is to transform the stochastic problem into a deterministic problem and then apply simulation to the latter. To achieve this goal, we rely on Monte Carlo simulation and an adapted version of a deterministic heuristic. This approach aims to provide flexibility and simplicity, since it is not constrained by any prior assumptions and relies on well-tested heuristics.
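The abstract stays at a high level; a minimal sketch of the general pattern it describes - replace the random processing times by deterministic surrogates, sequence the jobs with a deterministic heuristic, then evaluate the resulting permutation by Monte Carlo simulation - might look as follows. The surrogate (expected values), the heuristic (a simple NEH-like insertion), and all numbers are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def makespan(seq, times):
    """Completion time of the last job in a permutation flow shop."""
    n_machines = times.shape[1]
    c = np.zeros(n_machines)
    for j in seq:
        for m in range(n_machines):
            c[m] = max(c[m], c[m - 1] if m else 0) + times[j, m]
    return c[-1]

def neh_like_sequence(times):
    """Simple insertion heuristic applied to the deterministic surrogate."""
    order = np.argsort(-times.sum(axis=1))          # longest total time first
    seq = []
    for j in order:
        candidates = [seq[:k] + [j] + seq[k:] for k in range(len(seq) + 1)]
        seq = min(candidates, key=lambda s: makespan(s, times))
    return [int(j) for j in seq]

# Stochastic instance: 6 jobs x 3 machines with uncertain processing times.
mean_times = rng.uniform(1, 10, size=(6, 3))
sequence = neh_like_sequence(mean_times)            # deterministic surrogate = mean times

# Monte Carlo evaluation of the chosen sequence under processing-time uncertainty.
samples = [makespan(sequence, rng.lognormal(np.log(mean_times), 0.2)) for _ in range(500)]
print("sequence:", sequence, "expected makespan ~", round(float(np.mean(samples)), 2))
```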
Abstract:
3 Summary
3.1 English
The pharmaceutical industry has been facing several challenges during the last years, and the optimization of its drug discovery pipeline is believed to be the only viable solution. High-throughput techniques do participate actively in this optimization, especially when complemented by computational approaches aiming at rationalizing the enormous amount of information that they can produce. In silico techniques, such as virtual screening or rational drug design, are now routinely used to guide drug discovery. Both heavily rely on the prediction of the molecular interaction (docking) occurring between drug-like molecules and a therapeutically relevant target. Several software packages are available to this end, but despite the very promising picture drawn in most benchmarks, they still hold several hidden weaknesses. As pointed out in several recent reviews, the docking problem is far from being solved, and there is now a need for methods able to identify binding modes with high accuracy, which is essential to reliably compute the binding free energy of the ligand. This quantity is directly linked to its affinity and can be related to its biological activity. Accurate docking algorithms are thus critical for both the discovery and the rational optimization of new drugs. In this thesis, a new docking software aiming at this goal is presented, EADock. It uses a hybrid evolutionary algorithm with two fitness functions, in combination with a sophisticated management of diversity. EADock is interfaced with the CHARMM package for energy calculations and coordinate handling. A validation was carried out on 37 crystallized protein-ligand complexes featuring 11 different proteins. The search space was defined as a sphere of 15 Å around the center of mass of the ligand position in the crystal structure, and contrary to other benchmarks, our algorithm was fed with optimized ligand positions up to 10 Å root mean square deviation (RMSD) from the crystal structure. This validation illustrates the efficiency of our sampling heuristic, as correct binding modes, defined by an RMSD to the crystal structure lower than 2 Å, were identified and ranked first for 68% of the complexes. The success rate increases to 78% when considering the five best-ranked clusters, and to 92% when all clusters present in the last generation are taken into account. Most failures in this benchmark could be explained by the presence of crystal contacts in the experimental structure. EADock has been used to understand molecular interactions involved in the regulation of the Na,K-ATPase and in the activation of the nuclear hormone receptor peroxisome proliferator-activated receptor α (PPARα). It also helped to understand the action of common pollutants (phthalates) on PPARγ, and the impact of biotransformations of the anticancer drug Imatinib (Gleevec®) on its binding mode to the Bcr-Abl tyrosine kinase. Finally, a fragment-based rational drug design approach using EADock was developed, and led to the successful design of new peptidic ligands for the α5β1 integrin and for human PPARα. In both cases, the designed peptides presented activities comparable to those of well-established ligands, such as the anticancer drug Cilengitide and Wy14,643, respectively.
3.2 French
The recent difficulties of the pharmaceutical industry seem solvable only through optimization of its drug development process. This optimization increasingly relies on so-called "high-throughput" techniques, which are particularly effective when coupled with computational tools capable of managing the mass of data they produce. In silico approaches such as virtual screening or the rational design of new molecules are now routinely used. Both rest on the ability to predict the details of the molecular interaction between a drug-like molecule and a target protein of therapeutic interest. Benchmarks of the software addressing this prediction are flattering, but several problems remain. The recent literature tends to question their reliability and asserts an emerging need for more accurate approaches to the binding mode. This accuracy is essential for computing the binding free energy, which is directly related to the affinity of the potential drug for the target protein and indirectly related to its biological activity. An accurate prediction is of particular importance for the discovery and optimization of new active molecules. This thesis presents a new software package, EADock, designed for such accuracy. This hybrid evolutionary algorithm uses two selection pressures combined with a sophisticated management of diversity. EADock relies on CHARMM for energy calculations and the handling of atomic coordinates. Its validation was carried out on 37 crystallized protein-ligand complexes, including 11 different proteins. The search space was extended to a sphere of 15 Å radius around the center of mass of the crystallized ligand, and contrary to the usual benchmarks, the algorithm started from optimized solutions with an RMSD of up to 10 Å from the crystal structure. This validation demonstrated the efficiency of our search heuristic, since binding modes with an RMSD of less than 2 Å from the crystal structure were ranked first for 68% of the complexes. When the five best solutions are taken into account, the success rate climbs to 78%, and to 92% when the entire last generation is taken into account. Most prediction errors are attributable to the presence of crystal contacts. Since then, EADock has been used to understand the molecular mechanisms involved in the regulation of the Na,K-ATPase and in the activation of the peroxisome proliferator-activated receptor α (PPARα). It has also made it possible to describe the interaction of commonly encountered pollutants with PPARγ, as well as the influence of the metabolization of Imatinib (an anticancer drug) on its binding to the Bcr-Abl kinase. An approach based on predicting the interactions of molecular fragments with the target protein is also proposed. It led to the discovery of new peptide ligands of PPARα and of the α5β1 integrin. In both cases, the activity of these new peptides is comparable to that of well-established ligands, such as Wy14,643 for the former and Cilengitide (an anticancer drug) for the latter.
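For reference, the success criterion quoted in both summaries (a binding mode within 2 Å of the crystal structure) is based on the usual root mean square deviation over the ligand atoms; this is the standard definition, not a formula taken from the thesis.

```latex
% RMSD between a predicted pose and the crystallographic pose,
% over the N (heavy) atoms of the ligand:
\[
  \mathrm{RMSD} \;=\;
  \sqrt{\frac{1}{N}\sum_{i=1}^{N}\bigl\lVert \mathbf{r}_i^{\text{pred}} - \mathbf{r}_i^{\text{xtal}} \bigr\rVert^{2}}
  \;,\qquad \text{success if } \mathrm{RMSD} < 2\ \text{\AA}.
\]
```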
Abstract:
This thesis studies means of formalization that can assist the forensic expert in managing the factors that influence the evaluation of scientific evidence, while respecting established and acceptable inference procedures. According to a view advocated by a majority of the forensic and legal literature - adopted here without reservation as a starting point - the conceptualization of an evaluative procedure is said to be 'coherent' when it rests on a systematic implementation of probability theory. Often, however, the implementation of probabilistic reasoning does not follow automatically and can run into problems of complexity, due, for example, to limited knowledge of the domain in question or to the large number of factors that may come into play. In order to handle this kind of complication, the present work proposes to investigate a formalization of probability theory by means of a graphical environment known as Bayesian networks. The main hypothesis this research sets out to examine is that Bayesian networks, together with certain accessory concepts (such as qualitative and sensitivity analyses), constitute a key resource available to the forensic expert for approaching inference problems coherently, both conceptually and practically. From this working hypothesis, individual problems were extracted, articulated, and addressed in a series of distinct but interconnected studies, whose results - published in peer-reviewed journals - are presented as appendices. From a general point of view, this work provides three categories of results. A first group of results demonstrates, on the basis of numerous examples from various forensic domains, the fit, in terms of compatibility and complementarity, between Bayesian network models and existing probabilistic evaluation procedures. Building on these indications, the two other categories of results show, respectively, that Bayesian networks also make it possible to address domains previously largely unexplored from a probabilistic point of view, and that the availability of so-called 'hard' numerical data is not an indispensable condition for implementing the approaches proposed in this work. The present work discusses these results with respect to the current literature and concludes by proposing Bayesian networks as a means of exploring new research avenues, such as the study of various forms of evidence combination and the analysis of decision making. For this last aspect, the assessment of probabilities constitutes, in the way it is advocated in this work, a fundamental preliminary step as well as an operational tool.
Abstract:
Well-developed experimental procedures currently exist for retrieving and analyzing particle evidence from the hands of individuals suspected of being associated with the discharge of a firearm. Although analytical approaches (e.g., automated Scanning Electron Microscopy with Energy Dispersive X-ray microanalysis, SEM-EDS) allow the determination of the presence of elements typically found in gunshot residue (GSR) particles, such analyses provide no information about a given particle's actual source. Possible origins for which scientists may need to account are a primary exposure to the discharge of a firearm or a secondary transfer due to a contaminated environment. In order to approach such sources of uncertainty in the context of evidential assessment, this paper studies the construction and practical implementation of graphical probability models (i.e., Bayesian networks). These can assist forensic scientists in making the issue tractable within a probabilistic perspective. The proposed models focus on likelihood ratio calculations at various levels of detail, as well as on case pre-assessment.
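The paper's actual network structures are not reproduced in the abstract; the toy example below only illustrates the kind of calculation such a model supports - a likelihood ratio for finding GSR-like particles on a suspect's hands under a "discharged a firearm" proposition versus a "did not fire, secondary transfer only" proposition - computed by brute-force enumeration over a three-node network. All probability values are invented for illustration.

```python
from itertools import product

# Toy Bayesian network: H (fired a gun) -> D (primary deposit); E (contaminated
# environment) -> S (secondary transfer); GSR is found if D or S. Numbers invented.
P_E = 0.05                               # prior probability of a contaminated environment
P_D_given_H = {True: 0.85, False: 0.0}   # primary deposit only if the suspect fired
P_S_given_E = {True: 0.30, False: 0.0}   # secondary transfer only from a contaminated environment

def prob_gsr_found(h: bool) -> float:
    """P(GSR found on hands | H = h), summing over the unobserved nodes."""
    total = 0.0
    for e, d, s in product([True, False], repeat=3):
        p = P_E if e else 1 - P_E
        p *= P_D_given_H[h] if d else 1 - P_D_given_H[h]
        p *= P_S_given_E[e] if s else 1 - P_S_given_E[e]
        if d or s:                       # the finding: GSR-like particles present
            total += p
    return total

lr = prob_gsr_found(True) / prob_gsr_found(False)
print(f"Likelihood ratio for 'fired a gun' vs 'did not fire': {lr:.1f}")
```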