919 resultados para Model-based Categorical Sequence Clustering
Resumo:
Thesis (Master's)--University of Washington, 2016-06
Resumo:
Thesis (Ph.D.)--University of Washington, 2016-06
Resumo:
In a previous paper, Hoornaert et al. (Powder Technol. 96 (1998); 116-128) presented data from granulation experiments performed in a 50 L Lodige high shear mixer. In this study that same data was simulated with a population balance model. Based on an analysis of the experimental data, the granulation process was divided into three separate stages: nucleation, induction, and coalescence growth. These three stages were then simulated separately, with promising results. it is possible to derive a kernel that fit both the induction and the coalescence growth stage. Modeling the nucleation stage proved to be more challenging due to the complex mechanism of nucleus formation. From this work some recommendations are made for the improvement of this type of model.
Resumo:
Genetic diversity and population structure were investigated across the core range of Tasmanian devils (Sarcophilus laniarius; Dasyuridae), a wide-ranging marsupial carnivore restricted to the island of Tasmania. Heterozygosity (0.386-0.467) and allelic diversity (2.7-3.3) were low in all subpopulations and allelic size ranges were small and almost continuous, consistent with a founder effect. Island effects and repeated periods of low population density may also have contributed to the low variation. Within continuous habitat, gene flow appears extensive up to 50 km (high assignment rates to source or close neighbour populations; nonsignificant values of pairwise F-ST), in agreement with movement data. At larger scales (150-250 km), gene flow is reduced (significant pairwise F-ST) but there is no evidence for isolation by distance. The most substantial genetic structuring was observed for comparisons spanning unsuitable habitat, implying limited dispersal of devils between the well-connected, eastern populations and a smaller northwestern population. The genetic distinctiveness of the northwestern population was reflected in all analyses: unique alleles; multivariate analyses of gene frequency (multidimensional scaling, minimum spanning tree, nearest neighbour); high self-assignment (95%); two distinct populations for Tasmania were detected in isolation by distance and in Bayesian model-based clustering analyses. Marsupial carnivores appear to have stronger population subdivisions than their placental counterparts.
Resumo:
The SOX family of transcription factors are found throughout the animal kingdom and are important in a variety of developmental contexts. Genome analysis has identified 20 Sox genes in human and mouse, which can be subdivided into 8 groups, based on sequence comparison and intron-exon structure. Most of the SOX groups identified in mammals are represented by a single SOX sequence in invertebrate model organisms, suggesting a duplication and divergence mechanism has operated during vertebrate evolution. We have now analysed the Sox gene complement in the pufferfish, Fugu rubripes, in order to shed further light on the diversity and origins of the Sox gene family. Major differences were found between the Sox family in Fugu and those in humans and mice. In particular, Fugu does not have orthologues of Sry, Sox,15 and Sox30, which appear to be specific to mammals, while Sox19, found in Fugu and zebrafish but absent in mammals, seems to be specific to fishes. Six mammalian Sox genes are represented by two copies each in Fugu, indicating a large-scale gene duplication in the fish lineage. These findings point to recent Sox gene loss, duplication and divergence occurring during the evolution of tetrapod and teleost lineages, and provide further evidence for large-scale segmental or a whole-genome duplication occurring early in the radiation of teleosts. (C) 2004 Elsevier B.V. All rights reserved.
Resumo:
We examine the event statistics obtained from two differing simplified models for earthquake faults. The first model is a reproduction of the Block-Slider model of Carlson et al. (1991), a model often employed in seismicity studies. The second model is an elastodynamic fault model based upon the Lattice Solid Model (LSM) of Mora and Place (1994). We performed simulations in which the fault length was varied in each model and generated synthetic catalogs of event sizes and times. From these catalogs, we constructed interval event size distributions and inter-event time distributions. The larger, localised events in the Block-Slider model displayed the same scaling behaviour as events in the LSM however the distribution of inter-event times was markedly different. The analysis of both event size and inter-event time statistics is an effective method for comparative studies of differing simplified models for earthquake faults.
Resumo:
Background Reliable information on causes of death is a fundamental component of health development strategies, yet globally only about one-third of countries have access to such information. For countries currently without adequate mortality reporting systems there are useful models other than resource-intensive population-wide medical certification. Sample-based mortality surveillance is one such approach. This paper provides methods for addressing appropriate sample size considerations in relation to mortality surveillance, with particular reference to situations in which prior information on mortality is lacking. Methods The feasibility of model-based approaches for predicting the expected mortality structure and cause composition is demonstrated for populations in which only limited empirical data is available. An algorithm approach is then provided to derive the minimum person-years of observation needed to generate robust estimates for the rarest cause of interest in three hypothetical populations, each representing different levels of health development. Results Modelled life expectancies at birth and cause of death structures were within expected ranges based on published estimates for countries at comparable levels of health development. Total person-years of observation required in each population could be more than halved by limiting the set of age, sex, and cause groups regarded as 'of interest'. Discussion The methods proposed are consistent with the philosophy of establishing priorities across broad clusters of causes for which the public health response implications are similar. The examples provided illustrate the options available when considering the design of mortality surveillance for population health monitoring purposes.
Resumo:
The ‘leading coordinate’ approach to computing an approximate reaction pathway, with subsequent determination of the true minimum energy profile, is applied to a two-proton chain transfer model based on the chromophore and its surrounding moieties within the green fluorescent protein (GFP). Using an ab initio quantum chemical method, a number of different relaxed energy profiles are found for several plausible guesses at leading coordinates. The results obtained for different trial leading coordinates are rationalized through the calculation of a two-dimensional relaxed potential energy surface (PES) for the system. Analysis of the 2-D relaxed PES reveals that two of the trial pathways are entirely spurious, while two others contain useful information and can be used to furnish starting points for successful saddle-point searches. Implications for selection of trial leading coordinates in this class of proton chain transfer reactions are discussed, and a simple diagnostic function is proposed for revealing whether or not a relaxed pathway based on a trial leading coordinate is likely to furnish useful information.
Resumo:
Many developing south-east Asian governments are not capturing full rent from domestic forest logging operations. Such rent losses are commonly related to institutional failures, where informal institutions tend to dominate the control of forestry activity in spite of weakly enforced regulations. Our model is an attempt to add a new dimension to thinking about deforestation. We present a simple conceptual model, based on individual decisions rather than social or forest planning, which includes the human dynamics of participation in informal activity and the relatively slower ecological dynamics of changes in forest resources. We demonstrate how incumbent informal logging operations can be persistent, and that any spending aimed at replacing the informal institutions can only be successful if it pushes institutional settings past some threshold. (C) 2006 Elsevier B.V. All rights reserved.
Resumo:
This paper considers a model-based approach to the clustering of tissue samples of a very large number of genes from microarray experiments. It is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. Frequently in practice, there are also clinical data available on those cases on which the tissue samples have been obtained. Here we investigate how to use the clinical data in conjunction with the microarray gene expression data to cluster the tissue samples. We propose two mixture model-based approaches in which the number of components in the mixture model corresponds to the number of clusters to be imposed on the tissue samples. One approach specifies the components of the mixture model to be the conditional distributions of the microarray data given the clinical data with the mixing proportions also conditioned on the latter data. Another takes the components of the mixture model to represent the joint distributions of the clinical and microarray data. The approaches are demonstrated on some breast cancer data, as studied recently in van't Veer et al. (2002).
Resumo:
Ecological regions are increasingly used as a spatial unit for planning and environmental management. It is important to define these regions in a scientifically defensible way to justify any decisions made on the basis that they are representative of broad environmental assets. The paper describes a methodology and tool to identify cohesive bioregions. The methodology applies an elicitation process to obtain geographical descriptions for bioregions, each of these is transformed into a Normal density estimate on environmental variables within that region. This prior information is balanced with data classification of environmental datasets using a Bayesian statistical modelling approach to objectively map ecological regions. The method is called model-based clustering as it fits a Normal mixture model to the clusters associated with regions, and it addresses issues of uncertainty in environmental datasets due to overlapping clusters.
Resumo:
Model transformations are an integral part of model-driven development. Incremental updates are a key execution scenario for transformations in model-based systems, and are especially important for the evolution of such systems. This paper presents a strategy for the incremental maintenance of declarative, rule-based transformation executions. The strategy involves recording dependencies of the transformation execution on information from source models and from the transformation definition. Changes to the source models or the transformation itself can then be directly mapped to their effects on transformation execution, allowing changes to target models to be computed efficiently. This particular approach has many benefits. It supports changes to both source models and transformation definitions, it can be applied to incomplete transformation executions, and a priori knowledge of volatility can be used to further increase the efficiency of change propagation.
Resumo:
Finite mixture models are being increasingly used to model the distributions of a wide variety of random phenomena. While normal mixture models are often used to cluster data sets of continuous multivariate data, a more robust clustering can be obtained by considering the t mixture model-based approach. Mixtures of factor analyzers enable model-based density estimation to be undertaken for high-dimensional data where the number of observations n is very large relative to their dimension p. As the approach using the multivariate normal family of distributions is sensitive to outliers, it is more robust to adopt the multivariate t family for the component error and factor distributions. The computational aspects associated with robustness and high dimensionality in these approaches to cluster analysis are discussed and illustrated.
Resumo:
Knowledge management (KM) is an emerging discipline (Ives, Torrey & Gordon, 1997) and characterised by four processes: generation, codification, transfer, and application (Alavi & Leidner, 2001). Completing the loop, knowledge transfer is regarded as a precursor to knowledge creation (Nonaka & Takeuchi, 1995) and thus forms an essential part of the knowledge management process. The understanding of how knowledge is transferred is very important for explaining the evolution and change in institutions, organisations, technology, and economy. However, knowledge transfer is often found to be laborious, time consuming, complicated, and difficult to understand (Huber, 2001; Szulanski, 2000). It has received negligible systematic attention (Huber, 2001; Szulanski, 2000), thus we know little about it (Huber, 2001). However, some literature, such as Davenport and Prusak (1998) and Shariq (1999), has attempted to address knowledge transfer within an organisation, but studies on inter-organisational knowledge transfer are still much neglected. An emergent view is that it may be beneficial for organisations if more research can be done to help them understand and, thus, to improve their inter-organisational knowledge transfer process. Therefore, this article aims to provide an overview of the inter-organisational knowledge transfer and its related literature and present a proposed inter-organisational knowledge transfer process model based on theoretical and empirical studies.
Resumo:
What does endogenous growth theory tell about regional economies? Empirics of R&D worker-based productivity growth, Regional Studies. Endogenous growth theory emerged in the 1990s as ‘new growth theory’ accounting for technical progress in the growth process. This paper examines the role of research and development (R&D) workers underlying the Romer model (1990) and its subsequent modifications, and compares it with a model based on the accumulation of human capital engaged in R&D. Cross-section estimates of the models against productivity growth of European regions in the 1990s suggest that each R&D worker has a unique set of knowledge while his/her contributions are enhanced by knowledge sharing within a region as well as spillovers from other regions in proximity.