927 results for Analysis of Algorithms and Problem Complexity


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Motivation: High-throughput sequencing technologies enable the genome-wide analysis of the impact of genetic variation on molecular phenotypes at unprecedented resolution. However, although powerful, these technologies can also introduce unexpected artifacts. Results: We investigated the impact of library amplification bias on the identification of allele-specific (AS) molecular events from high-throughput sequencing data derived from chromatin immunoprecipitation assays (ChIP-seq). Putative AS DNA binding activity for RNA polymerase II was determined using ChIP-seq data derived from lymphoblastoid cell lines of two parent-daughter trios. We found that, at high sequencing depth, many significant AS binding sites suffered from an amplification bias, as evidenced by a larger number of clonal reads representing one of the two alleles. To alleviate this bias, we devised an amplification bias detection strategy, which filters out sites with low read complexity and sites featuring a significant excess of clonal reads. This method will be useful for AS analyses involving ChIP-seq and other functional sequencing assays.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In wireless communications, the transmitted signals may be affected by noise. The receiver must decode the received message, a task that can be modelled mathematically as a search for the closest lattice point to a given vector. This problem is known to be NP-hard in general, but for communications applications there exist algorithms that, for a certain range of system parameters, offer polynomial expected complexity. The purpose of the thesis is to study the sphere decoding algorithm introduced in the article "On Maximum-Likelihood Detection and the Search for the Closest Lattice Point", published by M.O. Damen, H. El Gamal and G. Caire in 2003. We concentrate especially on its computational complexity when used in space–time coding. Computer simulations are used to study how different system parameters affect the computational complexity of the algorithm. The aim is to find ways to reduce the complexity of the algorithm. The main contribution of the thesis is the construction of two new modifications to the sphere decoding algorithm, which are shown to perform faster than the original algorithm within a range of system parameters.
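
The closest-lattice-point search described above can be illustrated with a generic depth-first sphere decoder. This is only a minimal sketch under stated assumptions: it is not the algorithm of Damen, El Gamal and Caire, nor either of the thesis's modifications. The function names, the finite symbol alphabet, and the use of Gram-Schmidt QR are all illustrative choices.

```python
import math

def qr_gram_schmidt(H):
    """Return (Q_cols, R) with H = Q R; H is a list of rows with full column rank."""
    m, n = len(H), len(H[0])
    cols = [[H[i][j] for i in range(m)] for j in range(n)]
    Q_cols, R = [], [[0.0] * n for _ in range(n)]
    for j, v in enumerate(cols):
        w = v[:]
        for k, q in enumerate(Q_cols):
            # Projection of the current residual onto an earlier orthonormal column.
            R[k][j] = sum(qi * wi for qi, wi in zip(q, w))
            w = [wi - R[k][j] * qi for wi, qi in zip(w, q)]
        R[j][j] = math.sqrt(sum(wi * wi for wi in w))
        Q_cols.append([wi / R[j][j] for wi in w])
    return Q_cols, R

def sphere_decode(H, y, alphabet):
    """Find x with entries in `alphabet` minimising ||y - H x||^2 (hypothetical helper)."""
    Q_cols, R = qr_gram_schmidt(H)
    n = len(R)
    # Rotate the received vector so that the problem becomes upper triangular.
    z = [sum(qi * yi for qi, yi in zip(q, y)) for q in Q_cols]
    best = {"dist": math.inf, "x": None}
    x = [0] * n

    def search(level, partial):
        if level < 0:
            # A full candidate was reached; its distance becomes the new radius.
            best["dist"], best["x"] = partial, x[:]
            return
        interference = sum(R[level][j] * x[j] for j in range(level + 1, n))
        center = z[level] - interference
        # Try the symbols closest to the unconstrained solution first.
        for s in sorted(alphabet, key=lambda s: abs(center - R[level][level] * s)):
            d = partial + (center - R[level][level] * s) ** 2
            if d >= best["dist"]:
                break  # the remaining symbols at this level are at least as far
            x[level] = s
            search(level - 1, d)

    search(n - 1, 0.0)
    return best["x"], best["dist"]
```

The pruning step is where the complexity savings come from: any partial candidate whose accumulated distance already exceeds the best full candidate found so far is discarded without visiting its subtree.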

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Maintenance of thermal homeostasis in rats fed a high-fat diet (HFD) is associated with changes in their thermal balance. The thermodynamic relationship between heat dissipation and energy storage is altered by the ingestion of a high-energy diet. Recordings of core temperature, in humans and rodents, permit identification of some characteristics of the time series, such as autoreference and stationarity, that lend themselves to stochastic analysis. To identify this change, we used, for the first time, a stochastic autoregressive model, whose concepts match those of the physiological systems involved, and applied it to male HFD rats compared with age-matched male controls fed standard food (n=7 per group). By analyzing a recorded temperature time series, we were able to identify when thermal homeostasis would be affected by a new diet. The autoregressive time series model (AR model) was used to predict the occurrence of thermal homeostasis, and this model proved to be very effective in distinguishing such a physiological disorder. Thus, we infer from the results of our study that the maximum entropy distribution, as a means for stochastic characterization of temperature time series recordings, may be established as an important and early tool to aid in the diagnosis and prevention of metabolic diseases, owing to its ability to detect small variations in the thermal profile.
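
To make the idea of an autoregressive model concrete, the sketch below fits a first-order model x[t] = c + phi * x[t-1] to a series by ordinary least squares. It is an illustration only: the abstract does not state the model order or the estimation method used in the study, and the function names and the sample values are hypothetical.

```python
def fit_ar1(series):
    """Least-squares estimate of (c, phi) in x[t] = c + phi * x[t-1] + noise."""
    x_prev, x_next = series[:-1], series[1:]
    n = len(x_prev)
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    cov = sum((a - mean_prev) * (b - mean_next) for a, b in zip(x_prev, x_next))
    var = sum((a - mean_prev) ** 2 for a in x_prev)
    phi = cov / var
    c = mean_next - phi * mean_prev
    return c, phi

def predict_next(series, c, phi):
    """One-step-ahead forecast from the last observed value."""
    return c + phi * series[-1]

# Hypothetical noise-free series generated by x[t] = 2 + 0.5 * x[t-1].
demo = [10.0]
for _ in range(20):
    demo.append(2.0 + 0.5 * demo[-1])
c_hat, phi_hat = fit_ar1(demo)
```

A fitted coefficient that drifts between recording periods is one simple way such a model can flag a change in the dynamics of a temperature series.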

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The quantitative component of this study examined the effect of computer-assisted instruction (CAI) on science problem-solving performance, as well as the significance of logical reasoning ability to this relationship. I had the dual role of researcher and teacher, as I conducted the study with 84 grade seven students to whom I simultaneously taught science on a rotary basis. A two-treatment research design using this sample of convenience allowed for a comparison between the problem-solving performance of a CAI treatment group (n = 46) and that of a laboratory-based control group (n = 38). Science problem-solving performance was measured by a pretest and posttest that I developed for this study. The validity of these tests was addressed through critical discussions with faculty members and colleagues, as well as through feedback gained in a pilot study. High reliability was revealed between the pretest and the posttest; that is, students who tended to score high on the pretest also tended to score high on the posttest. Interrater reliability was found to be high for 30 randomly selected test responses, which were scored independently by two raters (i.e., myself and my faculty advisor). Results indicated that the form of computer-assisted instruction (CAI) used in this study did not significantly improve students' problem-solving performance. Logical reasoning ability was measured by an abbreviated version of the Group Assessment of Logical Thinking (GALT). Logical reasoning ability was found to be correlated with problem-solving performance, in that students with high logical reasoning ability tended to do better on the problem-solving tests and vice versa.
However, no significant difference was observed in problem-solving improvement, in the laboratory-based instruction group versus the CAI group, for students varying in level of logical reasoning ability. Insignificant trends were noted in results obtained from students of high logical reasoning ability, but these require further study. It was acknowledged that conclusions drawn from the quantitative component of this study were limited, as further modifications of the tests were recommended, as well as the use of a larger sample size. The purpose of the qualitative component of the study was to provide a detailed description of my thesis research process as a Brock University Master of Education student. My research journal notes served as the database for open coding analysis. This analysis revealed six main themes which best described my research experience: research interests, practical considerations, research design, research analysis, development of the problem-solving tests, and scoring scheme development. These important areas of my thesis research experience were recounted in the form of a personal narrative. It was noted that the research process was a form of problem solving in itself, as I made use of several problem-solving strategies to achieve desired thesis outcomes.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Computational biology is the research area that contributes to the analysis of biological data through the development of algorithms that address significant research problems. The data from molecular biology includes DNA, RNA, protein and gene expression data. Gene expression data provides the expression level of genes under different conditions. Gene expression is the process of transcribing the DNA sequence of a gene into mRNA sequences, which in turn are later translated into proteins. The number of copies of mRNA produced is called the expression level of a gene. Gene expression data is organized in the form of a matrix. Rows in the matrix represent genes and columns in the matrix represent experimental conditions. Experimental conditions can be different tissue types or time points. Entries in the gene expression matrix are real values. Through the analysis of gene expression data it is possible to determine the behavioral patterns of genes, such as similarity of their behavior, nature of their interaction, their respective contribution to the same pathways and so on. Similar expression patterns are exhibited by the genes participating in the same biological process. These patterns have immense relevance and application in bioinformatics and clinical research. These patterns are used in the medical domain to aid in more accurate diagnosis, prognosis, treatment planning, drug discovery and protein network analysis. To identify various patterns from gene expression data, data mining techniques are essential. Clustering is an important data mining technique for the analysis of gene expression data. To overcome the problems associated with clustering, biclustering is introduced. Biclustering refers to simultaneous clustering of both rows and columns of a data matrix.
Clustering is a global model, whereas biclustering is a local model. Discovering local expression patterns is essential for identifying many genetic pathways that are not apparent otherwise. It is therefore necessary to move beyond the clustering paradigm towards developing approaches which are capable of discovering local patterns in gene expression data. A bicluster is a submatrix of the gene expression data matrix. The rows and columns in the submatrix need not be contiguous as in the gene expression data matrix. Biclusters are not disjoint. Computation of biclusters is costly because one has to consider all the combinations of columns and rows in order to find all the biclusters. The search space for the biclustering problem is of size 2^(m+n), where m and n are the number of genes and conditions respectively. Usually m+n is more than 3000. The biclustering problem is NP-hard. Biclustering is a powerful analytical tool for the biologist. The research reported in this thesis addresses the problem of biclustering. Ten algorithms are developed for the identification of coherent biclusters from gene expression data. All these algorithms make use of a measure called mean squared residue to search for biclusters. The objective here is to identify the biclusters of maximum size with a mean squared residue lower than a given threshold.
All these algorithms begin the search from tightly coregulated submatrices called the seeds. These seeds are generated by the K-Means clustering algorithm. The algorithms developed can be classified as constraint-based, greedy and metaheuristic. Constraint-based algorithms use one or more of the various constraints, namely the MSR threshold and the MSR difference threshold. The greedy approach makes a locally optimal choice at each stage with the objective of finding the global optimum. In the metaheuristic approaches, Particle Swarm Optimization (PSO) and variants of the Greedy Randomized Adaptive Search Procedure (GRASP) are used for the identification of biclusters. These algorithms are implemented on the Yeast and Lymphoma datasets. Biologically relevant and statistically significant biclusters are identified by all these algorithms and are validated using the Gene Ontology database. All these algorithms are compared with some other biclustering algorithms. Algorithms developed in this work overcome some of the problems associated with the already existing algorithms. With the help of some of the algorithms developed in this work, biclusters with very high row variance, higher than that obtained by any other algorithm using mean squared residue, are identified from both the Yeast and Lymphoma datasets. Such biclusters, which make a significant change in the expression level, are highly relevant biologically.
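
The mean squared residue (MSR) that these algorithms use as a coherence score can be sketched as follows, using the standard definition: for a submatrix with rows I and columns J, each entry's residue is the entry minus its row mean, minus its column mean, plus the overall mean, and the MSR is the average of the squared residues. The function name and the toy matrix are illustrative, not taken from the thesis.

```python
def mean_squared_residue(matrix, rows, cols):
    """MSR of the submatrix of `matrix` selected by index lists `rows` and `cols`."""
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(rows), len(cols)
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)

# Hypothetical expression matrix: rows 0-2 differ only by a constant shift,
# so together they form a perfectly coherent (additive) bicluster.
expression = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0],
    [9.0, 1.0, 4.0],
]
coherent = mean_squared_residue(expression, [0, 1, 2], [0, 1, 2])
noisy = mean_squared_residue(expression, [0, 1, 2, 3], [0, 1, 2])
```

A low MSR alone does not make a bicluster interesting, since a constant submatrix also scores zero. That is why the abstract also stresses high row variance.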

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The identification of chemical mechanisms that can exhibit oscillatory phenomena in reaction networks is currently of intense interest. In particular, the parametric question of the existence of Hopf bifurcations has gained increasing popularity due to its relation to the oscillatory behavior around the fixed points. However, the detection of oscillations in high-dimensional systems and systems with constraints by the available symbolic methods has proven to be difficult. The development of new efficient methods is therefore required to tackle the complexity caused by the high dimensionality and non-linearity of these systems. In this thesis, we mainly present efficient algorithmic methods to detect Hopf bifurcation fixed points in (bio)chemical reaction networks with symbolic rate constants, thereby yielding information about the oscillatory behavior of the networks. The methods use the representations of the systems in convex coordinates that arise from stoichiometric network analysis. One of the methods, called HoCoQ, reduces the problem of determining the existence of Hopf bifurcation fixed points to a first-order formula over the ordered field of the reals that can then be solved using computational-logic packages. The second method, called HoCaT, uses ideas from tropical geometry to formulate a more efficient method that is incomplete in theory but worked very well for the attempted high-dimensional models involving more than 20 chemical species. The instability of reaction networks may lead to oscillatory behaviour. Therefore, we investigate some criteria for their stability using convex coordinates and quantifier elimination techniques.
We also study Muldowney's extension of the classical Bendixson-Dulac criterion for excluding periodic orbits to higher dimensions for polynomial vector fields, and we discuss the use of simple conservation constraints and of parametric constraints for describing simple convex polytopes on which periodic orbits can be excluded by Muldowney's criteria. All developed algorithms have been integrated into a common software framework called PoCaB (platform to explore biochemical reaction networks by algebraic methods), allowing for automated computation workflows from the problem descriptions. PoCaB also contains a database for the algebraic entities computed from the models of chemical reaction networks.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper describes a trainable system capable of tracking faces and facial features like eyes and nostrils and estimating basic mouth features such as degrees of openness and smile in real time. In developing this system, we have addressed the twin issues of image representation and algorithms for learning. We have used the invariance properties of image representations based on Haar wavelets to robustly capture various facial features. Similarly, unlike previous approaches, this system is entirely trained using examples and does not rely on a priori (hand-crafted) models of facial features based on optical flow or facial musculature. The system works in several stages that begin with face detection, followed by localization of facial features and estimation of mouth parameters. Each of these stages is formulated as a problem in supervised learning from examples. We apply the new and robust technique of support vector machines (SVM) for classification in the stages of skin segmentation, face detection and eye detection. Estimation of mouth parameters is modeled as a regression from a sparse subset of coefficients (basis functions) of an overcomplete dictionary of Haar wavelets.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The Stochastic Diffusion Search (SDS) was developed as a solution to the best-fit search problem. Thus, as a special case it is capable of solving the transform invariant pattern recognition problem. SDS is efficient and, although inherently probabilistic, produces very reliable solutions in widely ranging search conditions. However, to date a systematic formal investigation of its properties has not been carried out. This thesis addresses this problem. The thesis reports results pertaining to the global convergence of SDS as well as characterising its time complexity. However, the main emphasis of the work is on the resource allocation aspect of the Stochastic Diffusion Search operations. The thesis introduces a novel model of the algorithm, generalising an Ehrenfest Urn Model from statistical physics. This approach makes it possible to obtain a thorough characterisation of the response of the algorithm in terms of the parameters describing the search conditions in the case of a unique best-fit pattern in the search space. This model is further generalised in order to account for different search conditions: two solutions in the search space, and search for a unique solution in a noisy search space. An approximate solution in the case of two alternative solutions is also proposed and compared with predictions of the extended Ehrenfest Urn model. The analysis performed enabled a quantitative characterisation of the Stochastic Diffusion Search in terms of exploration and exploitation of the search space. It appeared that SDS is biased towards the latter mode of operation. This novel perspective on the Stochastic Diffusion Search led to an investigation of extensions of the standard SDS, which would strike a different balance between these two modes of search space processing.
Thus, two novel algorithms were derived from the standard Stochastic Diffusion Search, ‘context-free’ and ‘context-sensitive’ SDS, and their properties were analysed with respect to resource allocation. It appeared that they shared some of the desired features of their predecessor but also possessed some properties not present in the classic SDS. The theory developed in the thesis was illustrated throughout with carefully chosen simulations of a best-fit search for a string pattern, a simple but representative domain, enabling careful control of search conditions.
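
The string-search domain mentioned above lends itself to a short illustration of the standard SDS loop: each agent holds a candidate position, tests one randomly chosen character of the pattern against the text (test phase), and, if the test fails, either copies the hypothesis of a randomly polled active agent or picks a fresh random position (diffusion phase). This is a minimal sketch of the standard algorithm as commonly described, not of the thesis's context-free or context-sensitive variants. The function name, agent count and iteration count are assumptions.

```python
import random
from collections import Counter

def sds_string_search(text, pattern, n_agents=50, n_iterations=200, seed=0):
    """Return the text position backed by the largest cluster of agents."""
    rng = random.Random(seed)
    n_positions = len(text) - len(pattern) + 1
    hypotheses = [rng.randrange(n_positions) for _ in range(n_agents)]
    active = [False] * n_agents
    for _ in range(n_iterations):
        # Test phase: each agent checks a single random character of the pattern.
        for a in range(n_agents):
            offset = rng.randrange(len(pattern))
            active[a] = text[hypotheses[a] + offset] == pattern[offset]
        # Diffusion phase: inactive agents poll one random agent.
        for a in range(n_agents):
            if not active[a]:
                other = rng.randrange(n_agents)
                if active[other]:
                    hypotheses[a] = hypotheses[other]  # exploit a promising position
                else:
                    hypotheses[a] = rng.randrange(n_positions)  # explore anew
    return Counter(hypotheses).most_common(1)[0][0]

position = sds_string_search("the quick brown fox jumps over the lazy dog", "fox")
```

Agents at an exact match never fail the test, so that cluster only grows. This is the exploitation bias the abstract refers to, and the partial tests are what keep the cost per iteration low.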

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Proteomics approaches have made important contributions to the characterisation of platelet regulatory mechanisms. A common problem encountered with these approaches, however, is the masking of low-abundance (e.g. signalling) proteins in complex mixtures by highly abundant proteins. In this study, subcellular fractionation of washed human platelets, either inactivated or stimulated with the glycoprotein (GP) VI collagen receptor agonist collagen-related peptide, reduced the complexity of the platelet proteome. The majority of proteins identified by tandem mass spectrometry are involved in signalling. The effect of GPVI stimulation on levels of specific proteins in subcellular compartments was compared and analysed using in silico quantification, and protein associations were predicted using STRING (the search tool for recurring instances of neighbouring genes/proteins). Interestingly, we observed that some proteins that were previously unidentified in platelets, including teneurin-1 and Van Gogh-like protein 1, translocated to the membrane upon GPVI stimulation. Newly identified proteins may be involved in GPVI signalling nodes of importance for haemostasis and thrombosis.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We develop and analyze a class of efficient Galerkin approximation methods for uncertainty quantification of nonlinear operator equations. The algorithms are based on sparse Galerkin discretizations of tensorized linearizations at nominal parameters. Specifically, we consider abstract, nonlinear, parametric operator equations $J(\alpha, u) = 0$ for random input $\alpha(\omega)$ with almost sure realizations in a neighborhood of a nominal input parameter $\alpha_0$. Under some structural assumptions on the parameter dependence, we prove existence and uniqueness of a random solution, $u(\omega) = S(\alpha(\omega))$. We derive a multilinear, tensorized operator equation for the deterministic computation of $k$-th order statistical moments of the random solution's fluctuations $u(\omega) - S(\alpha_0)$. We introduce and analyse sparse tensor Galerkin discretization schemes for the efficient, deterministic computation of the $k$-th statistical moment equation. We prove a shift theorem for the $k$-point correlation equation in anisotropic smoothness scales and deduce that sparse tensor Galerkin discretizations of this equation converge in accuracy vs. complexity which equals, up to logarithmic terms, that of the Galerkin discretization of a single instance of the mean field problem. We illustrate the abstract theory for nonstationary diffusion problems in random domains.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Human brain imaging techniques, such as Magnetic Resonance Imaging (MRI) or Diffusion Tensor Imaging (DTI), have been established as scientific and diagnostic tools and their adoption is growing. Statistical methods, machine learning and data mining algorithms have successfully been adopted to extract predictive and descriptive models from neuroimage data. However, the knowledge discovery process typically also requires the adoption of pre-processing, post-processing and visualisation techniques in complex data workflows. Currently, a main problem for the integrated preprocessing and mining of MRI data is the lack of comprehensive platforms able to avoid the manual invocation of preprocessing and mining tools, which leads to an error-prone and inefficient process. In this work we present K-Surfer, a novel plug-in of the Konstanz Information Miner (KNIME) workbench that automates the preprocessing of brain images and leverages the mining capabilities of KNIME in an integrated way. K-Surfer supports the importing, filtering, merging and pre-processing of neuroimage data from FreeSurfer, a tool for human brain MRI feature extraction and interpretation. K-Surfer automates the steps for importing FreeSurfer data, reducing time costs, eliminating human errors and enabling the design of complex analytics workflows for neuroimage data by leveraging the rich functionalities available in the KNIME workbench.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this work, genetic algorithm concepts, together with a rotamer library for protein side chains and an implicit solvation potential, are used to optimize the tertiary structure of peptides. We start from the known PDB structure of the backbone, which is kept fixed, while the side chains are allowed to adopt the conformations present in the rotamer library. A backbone-independent rotamer library and an implicit solvation potential were used. The structure of Mastoparan-X was predicted using several force fields of growing complexity: we started with a field in which the only interaction present was Lennard-Jones, then added the Coulomb term, and finally accounted for solvation effects through a term proportional to the solvent-accessible area. Good results were obtained using the potential with the solvation term and the rotamer library. Hence, the algorithm presented here (called YODA) can be a good tool for the prediction problem. (c) 2007 Elsevier B.V. All rights reserved.