935 results for computational modeling


Relevance:

20.00% 20.00%

Publisher:

Abstract:

Matrix decompositions, where a given matrix is represented as a product of two other matrices, are regularly used in data mining. Most matrix decompositions have their roots in linear algebra, but the needs of data mining are not always those of linear algebra. In data mining one needs to have results that are interpretable -- and what is considered interpretable in data mining can be very different from what is considered interpretable in linear algebra. --- The purpose of this thesis is to study matrix decompositions that directly address the issue of interpretability. An example is a decomposition of binary matrices where the factor matrices are assumed to be binary and the matrix multiplication is Boolean. The restriction to binary factor matrices increases interpretability -- factor matrices are of the same type as the original matrix -- and allows the use of Boolean matrix multiplication, which is often more intuitive than normal matrix multiplication with binary matrices. Several other decomposition methods are also described, and the computational complexity of computing them is studied together with the hardness of approximating the related optimization problems. Based on these studies, algorithms for constructing the decompositions are proposed. Constructing the decompositions turns out to be computationally hard, and the proposed algorithms are mostly based on various heuristics. Nevertheless, the algorithms are shown to be capable of finding good results in empirical experiments conducted with both synthetic and real-world data.
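The difference between Boolean and normal matrix multiplication of binary matrices can be illustrated with a minimal sketch. The function names (`boolean_matmul`, `normal_matmul`, `reconstruction_error`) and the example matrices are illustrative assumptions, not taken from the thesis, and the thesis's own algorithms are not reproduced here.

```python
def boolean_matmul(b, c):
    """Boolean product: entry (i, j) is 1 if some k has b[i][k] = c[k][j] = 1."""
    inner = len(c)
    if any(len(row) != inner for row in b):
        raise ValueError("inner dimensions must match")
    cols = len(c[0]) if c else 0
    return [
        [int(any(b[i][k] and c[k][j] for k in range(inner))) for j in range(cols)]
        for i in range(len(b))
    ]


def normal_matmul(b, c):
    """Ordinary product over the integers, for comparison."""
    inner = len(c)
    cols = len(c[0]) if c else 0
    return [
        [sum(b[i][k] * c[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(len(b))
    ]


def reconstruction_error(a, b, c):
    """Number of cells where the Boolean product of b and c differs from a."""
    product = boolean_matmul(b, c)
    return sum(
        1
        for row_a, row_p in zip(a, product)
        for x, y in zip(row_a, row_p)
        if x != y
    )


# Hypothetical binary factors: 3 rows x 2 factors, and 2 factors x 3 columns.
B = [[1, 0], [1, 1], [0, 1]]
C = [[1, 1, 0], [0, 1, 1]]
print(boolean_matmul(B, C))  # stays binary
print(normal_matmul(B, C))   # the middle entry counts both factors
```

With the normal product, the cell covered by both factors becomes 2, so the result is no longer a binary matrix; the Boolean product keeps it at 1, which is why the factors can be read as overlapping sets of rows and columns.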

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This thesis presents methods for locating and analyzing cis-regulatory DNA elements involved in the regulation of gene expression in multicellular organisms. The regulation of gene expression is carried out by the combined effort of several transcription factor proteins collectively binding the DNA on the cis-regulatory elements. Only sparse knowledge of the 'genetic code' of these elements exists today. An automatic tool for discovery of putative cis-regulatory elements could help their experimental analysis, which would result in a more detailed view of the cis-regulatory element structure and function. We have developed a computational model for the evolutionary conservation of cis-regulatory elements. The elements are modeled as evolutionarily conserved clusters of sequence-specific transcription factor binding sites. We give an efficient dynamic programming algorithm that locates the putative cis-regulatory elements and scores them according to the conservation model. A notable proportion of the high-scoring DNA sequences show transcriptional enhancer activity in transgenic mouse embryos. The conservation model includes four parameters whose optimal values are estimated with simulated annealing. With good parameter values the model discriminates well between the DNA sequences with evolutionarily conserved cis-regulatory elements and the DNA sequences that have evolved neutrally. In further inquiry, the set of highest-scoring putative cis-regulatory elements was found to be sensitive to small variations in the parameter values. The statistical significance of the putative cis-regulatory elements is estimated with the Two Component Extreme Value Distribution. The p-values grade the conservation of the cis-regulatory elements above the neutral expectation. The parameter values for the distribution are estimated by simulating the neutral DNA evolution.
The conservation of the transcription factor binding sites can be used in the upstream analysis of regulatory interactions. This approach may provide mechanistic insight into the transcription level data from, e.g., microarray experiments. Here we give a method to predict shared transcriptional regulators for a set of co-expressed genes. The EEL (Enhancer Element Locator) software implements the method for locating putative cis-regulatory elements. The software facilitates both interactive use and distributed batch processing. We have used it to analyze the non-coding regions around all human genes with respect to the orthologous regions in various other species including mouse. The data from these genome-wide analyses is stored in a relational database, which is used in the publicly available web services for upstream analysis and visualization of the putative cis-regulatory elements in the human genome.
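The abstract states that four model parameters are estimated with simulated annealing but gives no details. The following is a generic sketch of simulated annealing over a small parameter vector, not the EEL implementation: the cost function, the geometric cooling schedule, the Gaussian proposal and all names and default values are assumptions made for illustration.

```python
import math
import random


def simulated_annealing(cost, start, step=0.1, t_start=1.0, t_end=1e-3,
                        iterations=5000, seed=0):
    """Minimise cost(params) by random perturbation with a cooling temperature."""
    rng = random.Random(seed)
    current = list(start)
    current_cost = cost(current)
    best, best_cost = list(current), current_cost
    for i in range(iterations):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        temperature = t_start * (t_end / t_start) ** (i / max(1, iterations - 1))
        # Perturb one randomly chosen parameter with Gaussian noise.
        candidate = list(current)
        idx = rng.randrange(len(candidate))
        candidate[idx] += rng.gauss(0.0, step)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
    return best, best_cost


# Hypothetical stand-in for the model's fitting criterion: squared distance
# to an arbitrary four-parameter target.
TARGET = [1.0, -2.0, 0.5, 3.0]


def toy_cost(params):
    return sum((p - t) ** 2 for p, t in zip(params, TARGET))


best_params, best_cost = simulated_annealing(toy_cost, [0.0, 0.0, 0.0, 0.0])
print(best_params, best_cost)
```

In a real application the toy cost would be replaced by a measure of how poorly the model separates conserved from neutrally evolved sequences. The reported sensitivity of the top-scoring elements to small parameter changes suggests inspecting several annealing runs rather than trusting a single optimum.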

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Major infrastructure and construction (MIC) projects are those with significant traffic or environmental impact, of strategic and regional significance and high sensitivity. The decision-making process for schemes of this type is becoming ever more complicated, especially with the increasing number of stakeholders involved and their growing tendency to defend their own varied interests. Failing to address and meet the concerns and expectations of stakeholders may result in project failures. Avoiding this requires a systematic participatory approach to facilitate decision-making. Though numerous decision models have been established in previous studies (e.g. ELECTRE methods, the analytic hierarchy process and analytic network process), their applicability in the decision process during stakeholder participation in contemporary MIC projects is still uncertain. To resolve this, the decision rule approach is employed for modeling multi-stakeholder multi-objective project decisions. Through this, the result is obtained naturally according to the “rules” accepted by any stakeholder involved. In this sense, consensus is more likely to be achieved since the process is more convincing and the result is more readily accepted by all concerned. Appropriate “rules”, comprehensive enough to address multiple objectives while straightforward enough to be understood by multiple stakeholders, are set for resolving conflict and facilitating consensus during the project decision process. The West Kowloon Cultural District (WKCD) project is used as a demonstration case and a focus group meeting is conducted in order to confirm the validity of the model established. The results indicate that the model is objective, reliable and practical enough to cope with real world problems. Finally, a suggested future research agenda is provided.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This thesis presents a highly sensitive genome-wide search method for recessive mutations. The method is suitable for distantly related samples that are divided into phenotype positives and negatives. High-throughput genotype arrays are used to identify and compare homozygous regions between the cohorts. The method is demonstrated by comparing colorectal cancer patients against unaffected references. The objective is to find homozygous regions and alleles that are more common in cancer patients. We have designed and implemented software tools to automate the data analysis from genotypes to lists of candidate genes and to their properties. The programs have been designed around a pipeline architecture that allows their integration with other programs such as biological databases and copy number analysis tools. The integration of the tools is crucial as the genome-wide analysis of the cohort differences produces many candidate regions not related to the studied phenotype. CohortComparator is a genotype comparison tool that detects homozygous regions and compares their loci and allele constitutions between two sets of samples. The data is visualised in chromosome-specific graphs illustrating the homozygous regions and alleles of each sample. The genomic regions that may harbour recessive mutations are emphasised with different colours and a scoring scheme is given for these regions. The detection of homozygous regions, cohort comparisons and result annotations are all subject to assumptions, many of which have been parameterized in our programs. The effect of these parameters and the suitable scope of the methods have been evaluated. Samples with different resolutions can be balanced with the genotype estimates of their haplotypes and they can be used within the same study.
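The core idea of detecting homozygous regions and comparing them between two cohorts can be sketched as follows. This is not CohortComparator's algorithm or scoring scheme: the two-letter genotype encoding ("AA", "AB", "BB"), the minimum run length and the per-marker coverage difference are all simplifying assumptions chosen for illustration.

```python
def is_homozygous(call):
    """True for a two-allele call with identical alleles, e.g. "AA" or "BB".

    Assumes missing calls are encoded as None or with non-letter symbols.
    """
    return (isinstance(call, str) and len(call) == 2
            and call[0] == call[1] and call[0].isalpha())


def homozygous_runs(calls, min_length=3):
    """Return half-open (start, end) marker index ranges of homozygous runs."""
    runs = []
    start = None
    for i, call in enumerate(calls):
        if is_homozygous(call):
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_length:
                runs.append((start, i))
            start = None
    # Close a run that extends to the last marker.
    if start is not None and len(calls) - start >= min_length:
        runs.append((start, len(calls)))
    return runs


def coverage(cohort, n_markers, min_length=3):
    """Fraction of samples whose homozygous runs cover each marker."""
    counts = [0] * n_markers
    for calls in cohort:
        for start, end in homozygous_runs(calls, min_length):
            for i in range(start, end):
                counts[i] += 1
    return [c / len(cohort) for c in counts]


def coverage_difference(cases, controls, n_markers, min_length=3):
    """Per-marker excess of homozygous coverage in cases over controls."""
    case_cov = coverage(cases, n_markers, min_length)
    control_cov = coverage(controls, n_markers, min_length)
    return [a - b for a, b in zip(case_cov, control_cov)]


sample = ["AA", "BB", "AA", "AB", "AA", "AA", "BB", "BB", "AB"]
print(homozygous_runs(sample))
```

Markers with a large positive difference are candidates for harbouring recessive variants. The `min_length` threshold is one example of the kind of assumption the abstract describes as parameterized.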

Relevance:

20.00% 20.00%

Publisher:

Abstract:

One of the most tangled fields of research is the field of defining and modeling affective concepts, i.e. concepts regarding emotions and feelings. The subject can be approached from many disciplines. The main problem is the lack of generally approved definitions. However, e.g. linguists have recently started to check the consistency of their theories with the help of computer simulations. Definitions of affective concepts are needed for performing similar simulations in behavioral sciences. In this thesis, preliminary computational definitions of affects for a simple utility-maximizing agent are given. The definitions have been produced by synthesizing ideas from theories from several fields of research. The class of affects is defined as a superclass of emotions and feelings. Affect is defined as a process, in which a change in an agent's expected utility causes a bodily change. If the process is currently under the attention of the agent (i.e. the agent is conscious of it), the process is a feeling. If it is not, but can in principle be taken into attention (i.e. it is preconscious), the process is an emotion. Thus, affects do not presuppose consciousness, but emotions and feelings do. Affects directed at unexpected materialized (i.e. past) events are delight and fright. Delight is the consequence of an unexpected positive event and fright is the consequence of an unexpected negative event. Affects directed at expected materialized (i.e. past) events are happiness (expected positive event materialized), disappointment (expected positive event did not materialize), sadness (expected negative event materialized) and relief (expected negative event did not materialize). Affects directed at expected unrealized (i.e. future) events are fear and hope. Some other affects can be defined as directed towards originators of the events.
The affect classification has also been implemented as a computer program, the purpose of which is to ensure the coherence of the definitions and also to illustrate the capabilities of the model. The exact content of bodily changes associated with specific affects is not considered relevant from the point of view of the logical structure of affective phenomena. The utility function also need not be defined, since the target of examination is only its dynamics.
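The classification described above maps naturally onto a small decision function. The sketch below is not the thesis's program: the function names and the boolean encoding are assumptions, and the assignment of hope to positive and fear to negative future events is inferred from ordinary usage rather than stated in the abstract.

```python
from typing import Optional


def classify_affect(expected: bool, positive: bool,
                    materialized: Optional[bool]) -> str:
    """Name the affect for an event.

    materialized is True or False for past events and None for an event
    that is still in the future.
    """
    if materialized is None:
        if not expected:
            raise ValueError("no affect is defined for an unexpected future event")
        # Assumed mapping: positive expectation is hope, negative is fear.
        return "hope" if positive else "fear"
    if not expected:
        if not materialized:
            raise ValueError("no affect is defined for an unexpected non-event")
        return "delight" if positive else "fright"
    if materialized:
        return "happiness" if positive else "sadness"
    return "disappointment" if positive else "relief"


def awareness_level(attended: bool, attendable: bool) -> str:
    """Distinguish feeling, emotion and plain affect by the agent's attention."""
    if attended:
        return "feeling"   # the agent is conscious of the process
    if attendable:
        return "emotion"   # preconscious: could be taken into attention
    return "affect"        # not available to consciousness


print(classify_affect(expected=True, positive=True, materialized=False))
print(awareness_level(attended=False, attendable=True))
```

Encoding the definitions this way makes gaps visible: the two combinations that raise `ValueError` are cases the abstract does not assign an affect to.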

Relevance:

20.00% 20.00%

Publisher:

Abstract:

"This chapter discusses laminar and turbulent natural convection in rectangular cavities. Natural convection in rectangular two-dimensional cavities has become a standard problem in numerical heat transfer because of its relevance in understanding a number of problems in engineering. Current research identified a number of difficulties with regard to the numerical methods and the turbulence modeling for this class of flows. Obtaining numerical predictions at high Rayleigh numbers proved computationally expensive such that results beyond Ra ∼ 10^14 are rarely reported. The chapter discusses a study in which it was found that turbulent computations in square cavities can't be extended beyond Ra ∼ O(10^12) despite having developed a code that proved very efficient for the high Ra laminar regime. As the Rayleigh number increased, thin boundary layers began to form next to the vertical walls, and the central region became progressively more stagnant and highly stratified. Results obtained for the high Ra laminar regime were in good agreement with existing studies. Turbulence computations, although of a preliminary nature, indicated that a second moment closure model was capable of predicting the experimentally observed flow features."--Publisher Summary
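For orientation, the Rayleigh number of a differentially heated cavity is conventionally defined as Ra = g β ΔT L³ / (ν α). The short sketch below evaluates it; the function name and the air property values are rough illustrative assumptions, not data from the chapter.

```python
def rayleigh_number(g, beta, delta_t, length, nu, alpha):
    """Ra = g * beta * delta_t * length**3 / (nu * alpha).

    g: gravitational acceleration [m/s^2]
    beta: thermal expansion coefficient [1/K]
    delta_t: temperature difference between the walls [K]
    length: characteristic cavity size [m]
    nu: kinematic viscosity [m^2/s]
    alpha: thermal diffusivity [m^2/s]
    """
    return g * beta * delta_t * length ** 3 / (nu * alpha)


# Assumed approximate properties of air near room temperature.
ra = rayleigh_number(g=9.81, beta=1.0 / 300.0, delta_t=10.0,
                     length=1.0, nu=1.5e-5, alpha=2.1e-5)
print(f"Ra = {ra:.3e}")
```

Because Ra grows with the cube of the cavity size, doubling the size raises it by a factor of 8.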

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Hydrologic impacts of climate change are usually assessed by downscaling the General Circulation Model (GCM) output of large-scale climate variables to local-scale hydrologic variables. Such an assessment is characterized by uncertainty resulting from the ensembles of projections generated with multiple GCMs, which is known as intermodel or GCM uncertainty. Ensemble averaging with the assignment of weights to GCMs based on model evaluation is one of the methods to address such uncertainty and is used in the present study for regional-scale impact assessment. GCM outputs of large-scale climate variables are downscaled to subdivisional-scale monsoon rainfall. Weights are assigned to the GCMs on the basis of model performance and model convergence, which are evaluated with the Cumulative Distribution Functions (CDFs) generated from the downscaled GCM output (for both 20th Century [20C3M] and future scenarios) and observed data. The ensemble averaging approach, with the assignment of weights to GCMs, is characterized by the uncertainty caused by partial ignorance, which stems from nonavailability of the outputs of some of the GCMs for a few scenarios (in the Intergovernmental Panel on Climate Change [IPCC] data distribution center for Assessment Report 4 [AR4]). This uncertainty is modeled with imprecise probability, i.e., the probability being represented as an interval gray number. Furthermore, the CDF generated with one GCM is entirely different from that with another and therefore the use of multiple GCMs results in a band of CDFs. Representing this band of CDFs with a single valued weighted mean CDF may be misleading. Such a band of CDFs can only be represented with an envelope that contains all the CDFs generated with a number of GCMs. Imprecise CDF represents such an envelope, which not only contains the CDFs generated with all the available GCMs but also to an extent accounts for the uncertainty resulting from the missing GCM output.
This concept of imprecise probability is also validated in the present study. The imprecise CDFs of monsoon rainfall are derived for three 30-year time slices, 2020s, 2050s and 2080s, with A1B, A2 and B1 scenarios. The model is demonstrated with the prediction of monsoon rainfall in Orissa meteorological subdivision, which shows a possible decreasing trend in the future.
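The contrast between a single weighted mean CDF and an envelope over a band of CDFs can be sketched with empirical CDFs. This is a simplified illustration only: it does not implement the study's weighting by performance and convergence or its interval gray number treatment of missing GCM output, and the function names, sample values and equal weights are assumptions.

```python
from bisect import bisect_right


def cdf_on_grid(sample, grid):
    """Empirical CDF of a sample evaluated at each grid point."""
    ordered = sorted(sample)
    return [bisect_right(ordered, x) / len(ordered) for x in grid]


def cdf_envelope(samples_by_model, grid):
    """Lower and upper bounds that contain every model's CDF on the grid."""
    cdfs = [cdf_on_grid(sample, grid) for sample in samples_by_model.values()]
    lower = [min(values) for values in zip(*cdfs)]
    upper = [max(values) for values in zip(*cdfs)]
    return lower, upper


def weighted_mean_cdf(samples_by_model, weights, grid):
    """Single-valued CDF: weighted average of the model CDFs."""
    total = sum(weights[name] for name in samples_by_model)
    mean = [0.0] * len(grid)
    for name, sample in samples_by_model.items():
        share = weights[name] / total
        for i, value in enumerate(cdf_on_grid(sample, grid)):
            mean[i] += share * value
    return mean


# Hypothetical downscaled rainfall samples from two models (arbitrary units).
models = {"gcm_a": [1, 2, 3, 4], "gcm_b": [2, 3, 4, 5]}
grid = [0, 2, 5]
print(cdf_envelope(models, grid))
print(weighted_mean_cdf(models, {"gcm_a": 0.5, "gcm_b": 0.5}, grid))
```

At the middle grid point the two models disagree (0.5 versus 0.25). The weighted mean reports a single value between them, whereas the envelope keeps the whole interval, which is the information the abstract argues should not be discarded.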

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Many species inhabit fragmented landscapes, resulting either from anthropogenic or from natural processes. The ecological and evolutionary dynamics of spatially structured populations are affected by a complex interplay between endogenous and exogenous factors. The metapopulation approach, simplifying the landscape to a discrete set of patches of breeding habitat surrounded by unsuitable matrix, has become a widely applied paradigm for the study of species inhabiting highly fragmented landscapes. In this thesis, I focus on the construction of biologically realistic models and their parameterization with empirical data, with the general objective of understanding how the interactions between individuals and their spatially structured environment affect ecological and evolutionary processes in fragmented landscapes. I study two hierarchically structured model systems, which are the Glanville fritillary butterfly in the Åland Islands, and a system of two interacting aphid species in the Tvärminne archipelago, both located in South-Western Finland. The interesting and challenging feature of both study systems is that the population dynamics occur over multiple spatial scales that are linked by various processes. My main emphasis is on the development of mathematical and statistical methodologies. For the Glanville fritillary case study, I first build a Bayesian framework for the estimation of death rates and capture probabilities from mark-recapture data, with the novelty of accounting for variation among individuals in capture probabilities and survival. I then characterize the dispersal phase of the butterflies by deriving a mathematical approximation of a diffusion-based movement model applied to a network of patches. I use the movement model as a building block to construct an individual-based evolutionary model for the Glanville fritillary butterfly metapopulation.
I parameterize the evolutionary model using a pattern-oriented approach, and use it to study how the landscape structure affects the evolution of dispersal. For the aphid case study, I develop a Bayesian model of hierarchical multi-scale metapopulation dynamics, where the observed extinction and colonization rates are decomposed into intrinsic rates operating specifically at each spatial scale. In summary, I show how analytical approaches, hierarchical Bayesian methods and individual-based simulations can be used individually or in combination to tackle complex problems from many different viewpoints. In particular, hierarchical Bayesian methods provide a useful tool for decomposing ecological complexity into more tractable components.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Using the framework of a new relaxation system, which converts a nonlinear viscous conservation law into a system of linear convection-diffusion equations with nonlinear source terms, a finite variable difference method is developed for nonlinear hyperbolic-parabolic equations. The basic idea is to formulate a finite volume method with an optimum spatial difference, using the Locally Exact Numerical Scheme (LENS), leading to a Finite Variable Difference Method as introduced by Sakai [Katsuhiro Sakai, A new finite variable difference method with application to locally exact numerical scheme, Journal of Computational Physics, 124 (1996) pp. 301-308], for the linear convection-diffusion equations obtained by using a relaxation system. Source terms are treated with the well-balanced scheme of Jin [Shi Jin, A steady-state capturing method for hyperbolic systems with geometrical source terms, Mathematical Modelling and Numerical Analysis, 35 (4) (2001) pp. 631-645]. Benchmark test problems for scalar and vector conservation laws in one and two dimensions are solved using this new algorithm and the results demonstrate the efficiency of the scheme in capturing the flow features accurately.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Two oxazolidine-2-thiones, thio-analogs of linezolid, were synthesized and their antibacterial properties evaluated. Unlike oxazolidinones, the thio-analogs did not inhibit the growth of Gram-positive bacteria. A molecular modeling study has been carried out to aid understanding of this unexpected finding.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Platelet endothelial cell adhesion molecule 1 (PECAM-1) has many functions, including its roles in leukocyte extravasation as part of the inflammatory response and in the maintenance of vascular integrity through its contribution to endothelial cell-cell adhesion. PECAM-1 has been shown to mediate cell-cell adhesion through homophilic binding events that involve interactions between domain 1 of PECAM-1 molecules on adjacent cells. However, various heterophilic ligands of PECAM-1 have also been proposed. The possible interaction of PECAM-1 with glycosaminoglycans (GAGs) is the focus of this study. The three-dimensional structures of the extracellular immunoglobulin (Ig) domains of PECAM-1 were constructed using homology modeling and threading methods. Potential heparin/heparan sulfate-binding sites were predicted on the basis of their amino acid consensus sequences and a comparison with known structures of sulfate-binding proteins. Heparin and other GAG fragments have been docked to investigate the structural determinants of their protein-binding specificity and selectivity. The modeling has predicted two regions in PECAM-1 that appear to bind heparin oligosaccharides. A high-affinity binding site was located in Ig domains 2 and 3, and evidence for a low-affinity site in Ig domains 5 and 6 was obtained. These GAG-binding regions were distinct from regions involved in PECAM-1 homophilic interactions.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Mammalian heparanase is an endo-β-glucuronidase associated with cell invasion in cancer metastasis, angiogenesis and inflammation. Heparanase cleaves heparan sulfate proteoglycans in the extracellular matrix and basement membrane, releasing heparin/heparan sulfate oligosaccharides of appreciable size. This in turn causes the release of growth factors, which accelerate tumor growth and metastasis. Heparanase has two glycosaminoglycan-binding domains; however, no three-dimensional structural information is available for human heparanase that can provide insights into how the two domains interact to degrade heparin fragments. We have constructed a new homology model of heparanase that takes into account the most recent structural and bioinformatics data available. Heparin analogs and glycosaminoglycan mimetics were computationally docked into the active site with energetically stable ring conformations and their interaction energies were compared. The resulting docked structures were used to propose a model for substrate and conformer selectivity based on the dimensions of the active site. The docking of substrates and inhibitors indicates the existence of a large binding site extending at least two saccharide units beyond the cleavage site (toward the nonreducing end) and at least three saccharides toward the reducing end (toward heparin-binding site 2). The docking of substrates suggests that heparanase recognizes the N-sulfated and O-sulfated glucosamines at subsite +1 and glucuronic acid at the cleavage site, whereas in the absence of 6-O-sulfation in glucosamine, glucuronic acid is docked at subsite +2. These findings will help us to focus on the rational design of heparanase-inhibiting molecules for anticancer drug development by targeting the two heparin/heparan sulfate recognition domains.