9 resultados para Mixture modelling
em CentAUR: Central Archive University of Reading - UK
Resumo:
Background: Microarray based comparative genomic hybridisation (CGH) experiments have been used to study numerous biological problems including understanding genome plasticity in pathogenic bacteria. Typically such experiments produce large data sets that are difficult for biologists to handle. Although there are some programmes available for interpretation of bacterial transcriptomics data and CGH microarray data for looking at genetic stability in oncogenes, there are none specifically to understand the mosaic nature of bacterial genomes. Consequently a bottle neck still persists in accurate processing and mathematical analysis of these data. To address this shortfall we have produced a simple and robust CGH microarray data analysis process that may be automated in the future to understand bacterial genomic diversity. Results: The process involves five steps: cleaning, normalisation, estimating gene presence and absence or divergence, validation, and analysis of data from test against three reference strains simultaneously. Each stage of the process is described and we have compared a number of methods available for characterising bacterial genomic diversity, for calculating the cut-off between gene presence and absence or divergence, and shown that a simple dynamic approach using a kernel density estimator performed better than both established, as well as a more sophisticated mixture modelling technique. We have also shown that current methods commonly used for CGH microarray analysis in tumour and cancer cell lines are not appropriate for analysing our data. Conclusion: After carrying out the analysis and validation for three sequenced Escherichia coli strains, CGH microarray data from 19 E. coli O157 pathogenic test strains were used to demonstrate the benefits of applying this simple and robust process to CGH microarray studies using bacterial genomes.
Resumo:
Background: Psychotic phenomena appear to form a continuum with normal experience and beliefs, and may build on common emotional interpersonal concerns. Aims: We tested predictions that paranoid ideation is exponentially distributed and hierarchically arranged in the general population, and that persecutory ideas build on more common cognitions of mistrust, interpersonal sensitivity and ideas of reference. Method: Items were chosen from the Structured Clinical Interview for DSM-IV Axis II Disorders (SCID-II) questionnaire and the Psychosis Screening Questionnaire in the second British National Survey of Psychiatric Morbidity (n = 8580), to test a putative hierarchy of paranoid development using confirmatory factor analysis, latent class analysis and factor mixture modelling analysis. Results: Different types of paranoid ideation ranged in frequency from less than 2% to nearly 30%. Total scores on these items followed an almost perfect exponential distribution (r = 0.99). Our four a priori first-order factors were corroborated (interpersonal sensitivity; mistrust; ideas of reference; ideas of persecution). These mapped onto four classes of individual respondents: a rare, severe, persecutory class with high endorsement of all item factors, including persecutory ideation; a quasi-normal class with infrequent endorsement of interpersonal sensitivity, mistrust and ideas of reference, and no ideas of persecution; and two intermediate classes, characterised respectively by relatively high endorsement of items relating to mistrust and to ideas of reference. Conclusions: The paranoia continuum has implications for the aetiology, mechanisms and treatment of psychotic disorders, while confirming the lack of a clear distinction from normal experiences and processes.
Resumo:
While the standard models of concentration addition and independent action predict overall toxicity of multicomponent mixtures reasonably, interactions may limit the predictive capability when a few compounds dominate a mixture. This study was conducted to test if statistically significant systematic deviations from concentration addition (i.e. synergism/antagonism, dose ratio- or dose level-dependency) occur when two taxonomically unrelated species, the earthworm Eisenia fetida and the nematode Caenorhabditis elegans were exposed to a full range of mixtures of the similar acting neonicotinoid pesticides imidacloprid and thiacloprid. The effect of the mixtures on C. elegans was described significantly better (p<0.01) by a dose level-dependent deviation from the concentration addition model than by the reference model alone, while the reference model description of the effects on E. fetida could not be significantly improved. These results highlight that deviations from concentration addition are possible even with similar acting compounds, but that the nature of such deviations are species dependent. For improving ecological risk assessment of simple mixtures, this implies that the concentration addition model may need to be used in a probabilistic context, rather than in its traditional deterministic manner. Crown Copyright (C) 2008 Published by Elsevier Inc. All rights reserved.
Resumo:
This paper introduces a new fast, effective and practical model structure construction algorithm for a mixture of experts network system utilising only process data. The algorithm is based on a novel forward constrained regression procedure. Given a full set of the experts as potential model bases, the structure construction algorithm, formed on the forward constrained regression procedure, selects the most significant model base one by one so as to minimise the overall system approximation error at each iteration, while the gate parameters in the mixture of experts network system are accordingly adjusted so as to satisfy the convex constraints required in the derivation of the forward constrained regression procedure. The procedure continues until a proper system model is constructed that utilises some or all of the experts. A pruning algorithm of the consequent mixture of experts network system is also derived to generate an overall parsimonious construction algorithm. Numerical examples are provided to demonstrate the effectiveness of the new algorithms. The mixture of experts network framework can be applied to a wide variety of applications ranging from multiple model controller synthesis to multi-sensor data fusion.
Resumo:
The contribution investigates the problem of estimating the size of a population, also known as the missing cases problem. Suppose a registration system is targeting to identify all cases having a certain characteristic such as a specific disease (cancer, heart disease, ...), disease related condition (HIV, heroin use, ...) or a specific behavior (driving a car without license). Every case in such a registration system has a certain notification history in that it might have been identified several times (at least once) which can be understood as a particular capture-recapture situation. Typically, cases are left out which have never been listed at any occasion, and it is this frequency one wants to estimate. In this paper modelling is concentrating on the counting distribution, e.g. the distribution of the variable that counts how often a given case has been identified by the registration system. Besides very simple models like the binomial or Poisson distribution, finite (nonparametric) mixtures of these are considered providing rather flexible modelling tools. Estimation is done using maximum likelihood by means of the EM algorithm. A case study on heroin users in Bangkok in the year 2001 is completing the contribution.
Resumo:
The rate at which a given site in a gene sequence alignment evolves over time may vary. This phenomenon-known as heterotachy-can bias or distort phylogenetic trees inferred from models of sequence evolution that assume rates of evolution are constant. Here, we describe a phylogenetic mixture model designed to accommodate heterotachy. The method sums the likelihood of the data at each site over more than one set of branch lengths on the same tree topology. A branch-length set that is best for one site may differ from the branch-length set that is best for some other site, thereby allowing different sites to have different rates of change throughout the tree. Because rate variation may not be present in all branches, we use a reversible-jump Markov chain Monte Carlo algorithm to identify those branches in which reliable amounts of heterotachy occur. We implement the method in combination with our 'pattern-heterogeneity' mixture model, applying it to simulated data and five published datasets. We find that complex evolutionary signals of heterotachy are routinely present over and above variation in the rate or pattern of evolution across sites, that the reversible-jump method requires far fewer parameters than conventional mixture models to describe it, and serves to identify the regions of the tree in which heterotachy is most pronounced. The reversible-jump procedure also removes the need for a posteriori tests of 'significance' such as the Akaike or Bayesian information criterion tests, or Bayes factors. Heterotachy has important consequences for the correct reconstruction of phylogenies as well as for tests of hypotheses that rely on accurate branch-length information. These include molecular clocks, analyses of tempo and mode of evolution, comparative studies and ancestral state reconstruction. The model is available from the authors' website, and can be used for the analysis of both nucleotide and morphological data.
Resumo:
A connection between a fuzzy neural network model with the mixture of experts network (MEN) modelling approach is established. Based on this linkage, two new neuro-fuzzy MEN construction algorithms are proposed to overcome the curse of dimensionality that is inherent in the majority of associative memory networks and/or other rule based systems. The first construction algorithm employs a function selection manager module in an MEN system. The second construction algorithm is based on a new parallel learning algorithm in which each model rule is trained independently, for which the parameter convergence property of the new learning method is established. As with the first approach, an expert selection criterion is utilised in this algorithm. These two construction methods are equivalent in their effectiveness in overcoming the curse of dimensionality by reducing the dimensionality of the regression vector, but the latter has the additional computational advantage of parallel processing. The proposed algorithms are analysed for effectiveness followed by numerical examples to illustrate their efficacy for some difficult data based modelling problems.