40 results for Bayesian belief network

in Helda - Digital Repository of University of Helsinki


Relevance:

40.00%

Publisher:

Abstract:

This doctoral dissertation introduces an algorithm for constructing the most probable Bayesian network from data for small domains. The algorithm is used to show that a popular goodness criterion for Bayesian networks has a severe sensitivity problem. The dissertation then proposes an information-theoretic criterion that avoids the problem.

Relevance:

30.00%

Publisher:

Abstract:

Bacteria play an important role in many ecological systems. The molecular characterization of bacteria using either cultivation-dependent or cultivation-independent methods reveals the large scale of bacterial diversity in natural communities, and the vastness of subpopulations within a species or genus. Understanding how bacterial diversity varies across different environments and also within populations should provide insights into many important questions of bacterial evolution and population dynamics. This thesis presents novel statistical methods for analyzing bacterial diversity using widely employed molecular fingerprinting techniques. The first objective of this thesis was to develop Bayesian clustering models to identify bacterial population structures. Bacterial isolates were identified using multilocus sequence typing (MLST), and Bayesian clustering models were used to explore the evolutionary relationships among isolates. Our method involves the inference of genetic population structures via an unsupervised clustering framework where the dependence between loci is represented using graphical models. The population dynamics that generate such a population stratification were investigated using a stochastic model, in which homologous recombination between subpopulations can be quantified within a gene flow network. The second part of the thesis focuses on cluster analysis of community compositional data produced by two different cultivation-independent analyses: terminal restriction fragment length polymorphism (T-RFLP) analysis, and fatty acid methyl ester (FAME) analysis. The cluster analysis aims to group bacterial communities that are similar in composition, which is an important step for understanding the overall influences of environmental and ecological perturbations on bacterial diversity.
A common feature of T-RFLP and FAME data is zero-inflation, which means that a zero value is observed much more frequently than would be expected, for example, from a Poisson distribution in the discrete case, or a Gaussian distribution in the continuous case. We provided two strategies for modeling zero-inflation in the clustering framework, which were validated on both synthetic and empirical complex data sets. We show in the thesis that our model, which takes into account dependencies between loci in MLST data, can produce better clustering results than methods that assume independent loci. Furthermore, computer algorithms that are efficient in analyzing large-scale data were adopted to meet the increasing computational need. Our method that detects homologous recombination in subpopulations may provide a theoretical criterion for defining bacterial species. The clustering of bacterial community data, including T-RFLP and FAME data, provides an initial effort toward discovering the evolutionary dynamics that structure and maintain bacterial diversity in the natural environment.
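The notion of zero-inflation described in this abstract can be illustrated with a minimal sketch. It compares the observed share of zeros in a count profile with the share a Poisson distribution of the same mean would predict. The function name `zero_excess` and the toy counts are hypothetical and do not come from the thesis, which models zero-inflation inside a clustering framework rather than with this simple check.

```python
import math

def zero_excess(counts):
    """Return (observed zero fraction, zero fraction expected under Poisson with the same mean)."""
    n = len(counts)
    mean = sum(counts) / n
    observed = sum(1 for c in counts if c == 0) / n
    expected = math.exp(-mean)  # P(X = 0) for a Poisson variable with this mean
    return observed, expected

# Hypothetical toy abundance profile: many zeros plus a few large counts.
toy = [0, 0, 0, 0, 0, 0, 7, 9, 0, 4]
obs, exp_ = zero_excess(toy)
print(f"observed zeros: {obs:.2f}, Poisson expectation: {exp_:.2f}")
```

A large gap between the two numbers is the pattern the abstract calls zero-inflation.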

Relevance:

30.00%

Publisher:

Abstract:

The purpose of this research is to draw up a clear construction of an anticipatory communicative decision-making process and a successful implementation of a Bayesian application that can be used as an anticipatory communicative decision-making support system. This study is a decision-oriented and constructive research project, and it includes examples of simulated situations. As a basis for further methodological discussion about different approaches to management research, a decision-oriented approach is used in this research; it is based on mathematics and logic and is intended to develop problem-solving methods. The approach is theoretical and characteristic of normative management science research. The approach of this study is also constructive. An essential part of the constructive approach is to tie the problem to its solution with theoretical knowledge. Firstly, the basic definitions and behaviours of anticipatory management and managerial communication are provided. These descriptions include discussions of the research environment and the management processes formed. These issues define and explain the background to further research. Secondly, the study proceeds to managerial communication and anticipatory decision-making based on preparation, problem solution, and solution search, which are also related to risk management analysis. After that, a solution to the decision-making support application is formed, using four different Bayesian methods, as follows: the Bayesian network, the influence diagram, the qualitative probabilistic network, and the time-critical dynamic network. The purpose of the discussion is not to discuss different theories but to explain the theories which are being implemented. Finally, an application of Bayesian networks to the research problem is presented. The usefulness of the prepared model for examining a problem is shown, together with the results of the research.
The theoretical contribution includes definitions and a model of anticipatory decision-making. The main theoretical contribution of this study has been to develop a process for anticipatory decision-making that includes management with communication, problem-solving, and the improvement of knowledge. The practical contribution includes a Bayesian Decision Support Model, which is based on Bayesian influence diagrams. The main contributions of this research are two developed processes, one for anticipatory decision-making, and the other to produce a model of a Bayesian network for anticipatory decision-making. In summary, this research contributes to decision-making support by being one of the few publicly available academic descriptions of the anticipatory decision support system, by representing a Bayesian model that is grounded on firm theoretical discussion, by publishing algorithms suitable for decision-making support, and by defining the idea of anticipatory decision-making for a parallel version. Finally, according to the results of research, an analysis of anticipatory management for planned decision-making is presented, which is based on observation of environment, analysis of weak signals, and alternatives to creative problem solving and communication.

Relevance:

30.00%

Publisher:

Abstract:

We propose an efficient and parameter-free scoring criterion, the factorized conditional log-likelihood (f̂CLL), for learning Bayesian network classifiers. The proposed score is an approximation of the conditional log-likelihood criterion. The approximation is devised in order to guarantee decomposability over the network structure, as well as efficient estimation of the optimal parameters, achieving the same time and space complexity as the traditional log-likelihood scoring criterion. The resulting criterion has an information-theoretic interpretation based on interaction information, which exhibits its discriminative nature. To evaluate the performance of the proposed criterion, we present an empirical comparison with state-of-the-art classifiers. Results on a large suite of benchmark data sets from the UCI repository show that f̂CLL-trained classifiers achieve at least as good accuracy as the best compared classifiers, using significantly fewer computational resources.

Relevance:

30.00%

Publisher:

Abstract:

Bayesian networks are compact, flexible, and interpretable representations of a joint distribution. When the network structure is unknown but there are observational data at hand, one can try to learn the network structure. This is called structure discovery. This thesis contributes to two areas of structure discovery in Bayesian networks: space-time tradeoffs and learning ancestor relations. The fastest exact algorithms for structure discovery in Bayesian networks are based on dynamic programming and use excessive amounts of space. Motivated by the space usage, several schemes for trading space against time are presented. These schemes are presented in a general setting for a class of computational problems called permutation problems; structure discovery in Bayesian networks is seen as a challenging variant of the permutation problems. The main contribution in the area of space-time tradeoffs is the partial order approach, in which the standard dynamic programming algorithm is extended to run over partial orders. In particular, a certain family of partial orders called parallel bucket orders is considered. A partial order scheme that provably yields an optimal space-time tradeoff within parallel bucket orders is presented. Practical issues concerning parallel bucket orders are also discussed. Learning ancestor relations, that is, directed paths between nodes, is motivated by the need for robust summaries of the network structures when there are unobserved nodes at work. Ancestor relations are nonmodular features and hence learning them is more difficult than learning modular features. A dynamic programming algorithm is presented for computing posterior probabilities of ancestor relations exactly. Empirical tests suggest that ancestor relations can be learned from observational data almost as accurately as arcs even in the presence of unobserved nodes.
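The dynamic programming idea mentioned in this abstract can be sketched as follows. This is the textbook subset-based recursion for exact structure search, not the partial order scheme the thesis contributes. The names `best_network` and `toy_score` are hypothetical, and `toy_score` merely stands in for a real decomposable score such as a marginal likelihood. The table indexed by node subsets has 2^n entries, which is the space cost the abstract refers to.

```python
from functools import lru_cache

def best_network(n, local_score):
    """Exact structure search by dynamic programming over node subsets.

    local_score(v, parents) is an assumed decomposable score (higher is better)
    for node v given a frozenset of parents.
    Returns (total score, {node: frozenset of parents}).
    """
    @lru_cache(maxsize=None)
    def best_parents(v, mask):
        # Best-scoring parent set for v among subsets of the candidates in mask.
        cand = frozenset(u for u in range(n) if mask >> u & 1)
        best = (local_score(v, cand), cand)
        for u in cand:
            sub = best_parents(v, mask & ~(1 << u))
            if sub[0] > best[0]:
                best = sub
        return best

    # table[mask] = (best score over the nodes in mask, chosen sink, sink's parents)
    table = {0: (0.0, None, None)}
    for mask in range(1, 1 << n):
        best = None
        for v in range(n):
            if mask >> v & 1:
                rest = mask & ~(1 << v)
                score, parents = best_parents(v, rest)
                total = table[rest][0] + score
                if best is None or total > best[0]:
                    best = (total, v, parents)
        table[mask] = best

    # Recover the network by peeling off the chosen sink of each subset.
    result, mask = {}, (1 << n) - 1
    while mask:
        _, v, parents = table[mask]
        result[v] = parents
        mask &= ~(1 << v)
    return table[(1 << n) - 1][0], result

def toy_score(v, parents):
    # Hypothetical score that rewards the chain 0 -> 1 -> 2 and penalises extra parents.
    target = {0: frozenset(), 1: frozenset({0}), 2: frozenset({1})}
    return 1.0 if parents == target[v] else -float(len(parents))

score, parents = best_network(3, toy_score)
print(score, parents)
```

Because every subset of nodes is stored, memory grows exponentially with the number of nodes, which motivates trading space against time.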

Relevance:

20.00%

Publisher:

Abstract:

This study deals with language change and variation in the correspondence of the eighteenth-century Bluestocking circle, a social network which provided learned men and women with an informal environment for the pursuit of scholarly entertainment. Elizabeth Montagu (1718–1800), a notable social hostess and a Shakespearean scholar, was one of their key figures. The study presents the reconstruction of Elizabeth Montagu's social networks from her youth to her later years with a special focus on the Bluestocking circle, and linguistic research on private correspondence between Montagu and her Bluestocking friends and family members between the years 1738–1778. The epistolary language use is investigated using the methods and frameworks of corpus linguistics, historical sociolinguistics, and social network analysis. The approach is diachronic and concerns real-time language change. The research is based on a selection of manuscript letters which I have edited and compiled into an electronic corpus (Bluestocking Corpus). I have also devised a network strength scale in order to quantify the strength of network ties and to compare the results of the linguistic research with the network analysis. The studies range from the reconstruction and analysis of Elizabeth Montagu's most prominent social networks to the analysis of changing morphosyntactic features and spelling variation in Montagu's and her network members' correspondence. The linguistic studies look at the use of the progressive construction, preposition stranding and pied piping, and spelling variation in terms of preterite and past participle endings in the regular paradigm (-ed, -'d, -d, -'t, -t) and full / contracted spellings of auxiliary verbs. The results are analysed in terms of social network membership, sociolinguistic variables of the correspondents, and, when relevant, aspects of eighteenth-century linguistic prescriptivism.
The studies showed a slight diachronic increase in the use of the progressive, a significant decrease of the stigmatised preposition stranding and increase of pied piping, and relatively informal but socially controlled epistolary spelling. Certain significant changes in Elizabeth Montagu's language use over the years could be attributed to her increasingly prominent social standing and the changes in her social networks, and the strength of ties correlated strongly with the use of the progressive in the Bluestocking Corpus. Gender, social rank, and register in terms of kinship/friendship had a significant influence on language use, and an effect of prescriptivism could also be detected. Elizabeth Montagu's network ties resulted in language variation in terms of network membership, her own position in a given network, and the social factors that controlled eighteenth-century interaction. When all the network ties are strong, linguistic variation seems to be essentially linked to the social variables of the informants.

Relevance:

20.00%

Publisher:

Abstract:

Distinct endogenous network events, generated independently of sensory input, are a general feature of various structures of the immature central nervous system. In the immature hippocampus, these types of events are seen as "giant depolarizing potentials" (GDPs) in intracellular recordings in vitro. GABA, the major inhibitory neurotransmitter of the adult brain, has a depolarizing action in immature neurons, and GDPs have been proposed to be driven by GABAergic transmission. Moreover, GDPs have been thought to reflect an early pattern that disappears during development in parallel with the maturation of hyperpolarizing GABAergic inhibition. However, the adult hippocampus in vivo also generates endogenous network events known as sharp (positive) waves (SPWs), which reflect synchronous discharges of CA3 pyramidal neurons and are thought to be involved in cognitive functions. In this thesis, mechanisms of GDP generation were studied with intra- and extracellular recordings in the neonatal rat hippocampus in vitro and in vivo. Immature CA3 pyramidal neurons were found to generate intrinsic bursts of spikes and to act as cellular pacemakers for GDP activity whereas depolarizing GABAergic signalling was found to have a temporally non-patterned facilitatory role in the generation of the network events. Furthermore, the data indicate that the intrinsic bursts of neonatal CA3 pyramidal neurons and, consequently, GDPs are driven by a persistent Na+ current and terminated by a slow Ca2+-dependent K+ current. Gramicidin-perforated patch recordings showed that the depolarizing driving force for GABAA receptor-mediated actions is provided by Cl- uptake via the Na-K-Cl cotransporter, NKCC1, in the immature CA3 pyramids. A specific blocker of NKCC1, bumetanide, inhibited SPWs and GDPs in the neonatal rat hippocampus in vivo and in vitro, respectively.
Finally, pharmacological blockade of the GABA transporter-1 prolonged the decay of the large GDP-associated GABA transients but not of single postsynaptic GABAA receptor-mediated currents. As a whole, the data in this thesis indicate that the mechanism of GDP generation, based on the interconnected network of bursting CA3 pyramidal neurons, is similar to that involved in adult SPW activity. Hence, GDPs do not reflect a network pattern that disappears during development but are the in vitro counterpart of neonatal SPWs.

Relevance:

20.00%

Publisher:

Abstract:

The blood and lymphatic vascular systems are essential for life, but they may become harnessed for sinister purposes in pathological conditions. For example, tumors learn to grow a network of blood vessels (angiogenesis), securing a source of oxygen and nutrients for sustained growth. On the other hand, damage to the lymph nodes and the collecting lymphatic vessels may lead to lymphedema, a debilitating condition characterized by peripheral edema and susceptibility to infections. Promoting the growth of new lymphatic vessels (lymphangiogenesis) is an attractive approach to treat lymphedema patients. Angiopoietin-1 (Ang1) is a ligand for the endothelial receptor tyrosine kinases Tie1 and Tie2. The Ang1/Tie2 pathway has previously been implicated in promoting endothelial stability and integrity of EC monolayers. The studies presented here elucidate a novel function for Ang1 as a lymphangiogenic factor. Ang1 is known to decrease the permeability of blood vessels, and could thus act as a more global antagonist of plasma leakage and tissue edema by promoting growth of lymphatic vessels and thereby facilitating removal of excess fluid and other plasma components from the interstitium. These findings reinforce the idea that Ang1 may have therapeutic value in conditions of tissue edema. VEGFR-3 is present on all endothelia during development, but in the adult its expression becomes restricted to the lymphatic endothelium. VEGF-C and VEGF-D are ligands for VEGFR-3, and potently promote lymphangiogenesis in adult tissues, with direct and remarkably specific effects on the lymphatic endothelium. The data presented here show that VEGF-C and VEGF-D therapy can restore collecting lymphatic vessels in a novel orthotopic model of breast cancer-related lymphedema. Furthermore, the study introduces a novel approach to improve VEGF-C/VEGF-D therapy by using engineered heparin-binding forms of VEGF-C, which induced the rapid formation of organized lymphatic vessels.
Importantly, VEGF-C therapy also greatly improved the survival and integration of lymph node transplants. The combination of lymph node transplantation and VEGF-C therapy provides a basis for future therapy of lymphedema. In adults, VEGFR-3 expression is restricted to the lymphatic endothelium and the fenestrated endothelia of certain endocrine organs. These results show that VEGFR-3 is induced at the onset of angiogenesis in the tip cells that lead the formation of new vessel sprouts, providing a tumor-specific vascular target. VEGFR-3 acts downstream of VEGF/VEGFR-2 signals, but, once induced, can sustain angiogenesis when VEGFR-2 signaling is inhibited. The data presented here implicate VEGFR-3 as a novel regulator of sprouting angiogenesis along with its role in regulating lymphatic vessel growth. Targeting VEGFR-3 may provide added efficacy to currently available anti-angiogenic therapeutics, which typically target the VEGF/VEGFR-2 pathway.

Relevance:

20.00%

Publisher:

Abstract:

The aim of this thesis is to develop a fully automatic lameness detection system that operates in a milking robot. The instrumentation, measurement software, algorithms for data analysis and a neural network model for lameness detection were developed. Automatic milking has become a common practice in dairy husbandry, and in 2006 about 4000 farms worldwide used over 6000 milking robots. There is a worldwide movement with the objective of fully automating every process from feeding to milking. Increase in automation is a consequence of increasing farm sizes, the demand for more efficient production and the growth of labour costs. As the level of automation increases, the time that the cattle keeper uses for monitoring animals often decreases. This has created a need for systems for automatically monitoring the health of farm animals. The popularity of milking robots also offers a new and unique possibility to monitor animals in a single confined space up to four times daily. Lameness is a crucial welfare issue in the modern dairy industry. Limb disorders cause serious welfare, health and economic problems especially in loose housing of cattle. Lameness causes losses in milk production and leads to early culling of animals. These costs could be reduced with early identification and treatment. At present, only a few methods for automatically detecting lameness have been developed, and the most common methods used for lameness detection and assessment are various visual locomotion scoring systems. The problem with locomotion scoring is that it needs experience to be conducted properly, it is labour intensive as an on-farm method and the results are subjective. A four-balance system for measuring the leg load distribution of dairy cows during milking in order to detect lameness was developed and set up at the University of Helsinki research farm Suitia.
The leg weights of 73 cows were successfully recorded during almost 10,000 robotic milkings over a period of 5 months. The cows were locomotion scored weekly, and the lame cows were inspected clinically for hoof lesions. Unsuccessful measurements, caused by cows standing outside the balances, were removed from the data with a special algorithm, and the mean leg loads and the number of kicks during milking were calculated. In order to develop an expert system to automatically detect lameness cases, a model was needed. A probabilistic neural network (PNN) classifier model was chosen for the task. The data was divided into two parts, and 5,074 measurements from 37 cows were used to train the model. The operation of the model was evaluated for its ability to detect lameness in the validation dataset, which had 4,868 measurements from 36 cows. The model was able to classify 96% of the measurements correctly as sound or lame cows, and 100% of the lameness cases in the validation data were identified. The number of measurements causing false alarms was 1.1%. The developed model has the potential to be used for on-farm decision support and can be used in a real-time lameness monitoring system.
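A probabilistic neural network is, in essence, a Parzen-window classifier: each class is scored by the average Gaussian kernel between the query and that class's training examples. The sketch below shows that decision rule under stated assumptions. The feature vectors (fractions of body weight on each of four legs), the smoothing parameter `sigma`, and the function name `pnn_classify` are all hypothetical. The abstract does not specify the features or settings of the thesis model beyond mean leg loads and kick counts.

```python
import math

def pnn_classify(x, training, sigma=1.0):
    """Classify x with a Parzen-window (probabilistic neural network) rule.

    training maps a class label to a list of feature vectors. Each class score
    is the mean Gaussian kernel between x and that class's examples.
    """
    scores = {}
    for label, examples in training.items():
        total = 0.0
        for e in examples:
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, e))
            total += math.exp(-sq_dist / (2 * sigma ** 2))
        scores[label] = total / len(examples)
    return max(scores, key=scores.get), scores

# Hypothetical training data: share of body weight carried by each of four legs.
training = {
    "sound": [(0.26, 0.24, 0.25, 0.25), (0.25, 0.25, 0.24, 0.26)],
    "lame": [(0.15, 0.30, 0.27, 0.28), (0.17, 0.29, 0.27, 0.27)],
}
label, scores = pnn_classify((0.16, 0.30, 0.27, 0.27), training, sigma=0.05)
print(label, scores)
```

The smoothing parameter controls how far each training example's influence reaches, and in practice it would need tuning on held-out data.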

Relevance:

20.00%

Publisher:

Abstract:

In this thesis the use of the Bayesian approach to statistical inference in fisheries stock assessment is studied. The work was conducted in collaboration with the Finnish Game and Fisheries Research Institute by using the problem of monitoring and prediction of the juvenile salmon population in the River Tornionjoki as an example application. The River Tornionjoki is the largest salmon river flowing into the Baltic Sea. This thesis tackles the issues of model formulation and model checking as well as computational problems related to Bayesian modelling in the context of fisheries stock assessment. Each article of the thesis provides a novel method either for extracting information from data obtained via a particular type of sampling system or for integrating the information about the fish stock from multiple sources in terms of a population dynamics model. Mark-recapture and removal sampling schemes and a random catch sampling method are covered for the estimation of the population size. In addition, a method for estimating the stock composition of a salmon catch based on DNA samples is also presented. For most of the articles, Markov chain Monte Carlo (MCMC) simulation has been used as a tool to approximate the posterior distribution. Problems arising from the sampling method are also briefly discussed and potential solutions for these problems are proposed. Special emphasis in the discussion is given to the philosophical foundation of the Bayesian approach in the context of fisheries stock assessment. It is argued that the role of subjective prior knowledge needed in practically all parts of a Bayesian model should be recognized and consequently fully utilised in the process of model formulation.
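As a rough illustration of Bayesian population-size estimation from mark-recapture data, the sketch below evaluates a posterior over the population size N on a grid. It assumes a hypergeometric likelihood and a uniform prior up to a chosen bound. This is a deliberately simple stand-in: the thesis uses MCMC and richer models, and the function name, the prior, and the numbers here are all hypothetical.

```python
from math import comb

def population_posterior(marked, caught, recaptured, n_max):
    """Grid posterior over population size N with a uniform prior up to n_max.

    Likelihood: hypergeometric probability of finding `recaptured` marked
    animals in a second sample of size `caught` when `marked` animals carry marks.
    """
    n_min = marked + caught - recaptured  # smallest N consistent with the data
    weights = {}
    for n in range(n_min, n_max + 1):
        weights[n] = (comb(marked, recaptured)
                      * comb(n - marked, caught - recaptured) / comb(n, caught))
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}

# Hypothetical survey: 50 fish marked, 40 caught later, 10 of them marked.
posterior = population_posterior(marked=50, caught=40, recaptured=10, n_max=1000)
mode = max(posterior, key=posterior.get)
print("posterior mode:", mode)
```

With a flat prior the mode sits close to the classical ratio estimate marked × caught / recaptured. An informative prior, which the abstract argues should be used deliberately, would shift it.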

Relevance:

20.00%

Publisher:

Abstract:

Advancements in analysis techniques have led to a rapid accumulation of biological data in databases. Such data are often in the form of sequences of observations, examples including DNA sequences and amino acid sequences of proteins. The scale and quality of the data promise answers to various biologically relevant questions in more detail than has been possible before. For example, one may wish to identify areas in an amino acid sequence that are important for the function of the corresponding protein, or investigate how characteristics on the level of DNA sequence affect the adaptation of a bacterial species to its environment. Many of the interesting questions are intimately associated with the understanding of the evolutionary relationships among the items under consideration. The aim of this work is to develop novel statistical models and computational techniques to meet the challenge of deriving meaning from the increasing amounts of data. Our main focus is on modeling the evolutionary relationships based on the observed molecular data. We operate within a Bayesian statistical framework, which allows a probabilistic quantification of the uncertainties related to a particular solution. As the basis of our modeling approach we utilize a partition model, which is used to describe the structure of data by appropriately dividing the data items into clusters of related items. Generalizations and modifications of the partition model are developed and applied to various problems. Large-scale data sets also pose a computational challenge. The models used to describe the data must be realistic enough to capture the essential features of the current modeling task but, at the same time, simple enough to make it possible to carry out the inference in practice. The partition model fulfills these two requirements. The problem-specific features can be taken into account by modifying the prior probability distributions of the model parameters.
The computational efficiency stems from the ability to integrate out the parameters of the partition model analytically, which enables the use of efficient stochastic search algorithms.
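The idea of integrating out model parameters analytically can be sketched for the simplest case: categorical observations in each cluster with a symmetric Dirichlet prior on the category probabilities. The marginal likelihood then has a closed form in gamma functions, so a stochastic search only has to compare partition scores. The choice of prior, the single-site data, and the function names are assumptions for illustration, not details taken from the thesis.

```python
from collections import Counter
from math import exp, lgamma

def log_marginal(items, categories, alpha=1.0):
    """Log marginal likelihood of one cluster with category probabilities integrated out.

    Assumes a symmetric Dirichlet(alpha) prior over the given categories.
    """
    k = len(categories)
    counts = Counter(items)
    out = lgamma(k * alpha) - lgamma(k * alpha + len(items))
    for c in categories:
        out += lgamma(alpha + counts[c]) - lgamma(alpha)
    return out

def partition_score(partition, categories, alpha=1.0):
    """Sum of cluster log marginal likelihoods for a candidate partition."""
    return sum(log_marginal(cluster, categories, alpha) for cluster in partition)

# Hypothetical single-site data: four items showing A and four showing C.
split = partition_score([list("AAAA"), list("CCCC")], "ACGT")
merged = partition_score([list("AAAACCCC")], "ACGT")
print(f"split: {split:.3f}, merged: {merged:.3f}, ratio: {exp(split - merged):.1f}")
```

Because each score is available in closed form, a search algorithm can propose moving items between clusters and evaluate the change without sampling any parameters.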