7 results for Branch and bound algorithms

in Helda - Digital Repository of University of Helsinki


Relevance: 100.00%

Abstract:

This thesis, which consists of an introduction and four peer-reviewed original publications, studies the problems of haplotype inference (haplotyping) and local alignment significance. The problems studied here belong to the broad area of bioinformatics and computational biology. The presented solutions are computationally fast and accurate, which makes them practical in high-throughput sequence data analysis.

Haplotype inference is a computational problem where the goal is to estimate haplotypes from a sample of genotypes as accurately as possible. This problem is important because the direct measurement of haplotypes is difficult, whereas genotypes are easier to quantify. Haplotypes are the key players when studying, for example, the genetic causes of diseases. In this thesis, three methods are presented for the haplotype inference problem: HaploParser, HIT, and BACH. HaploParser is based on a combinatorial mosaic model and hierarchical parsing that together mimic recombinations and point mutations in a biologically plausible way. In this mosaic model, the current population is assumed to have evolved from a small founder population; thus, the haplotypes of the current population are recombinations of the (implicit) founder haplotypes with some point mutations. HIT (Haplotype Inference Technique) uses a hidden Markov model for haplotypes, and efficient algorithms are presented to learn this model from genotype data. The model structure of HIT is analogous to the mosaic model of HaploParser with founder haplotypes; therefore, it can be seen as a probabilistic model of recombinations and point mutations. BACH (Bayesian Context-based Haplotyping) utilizes a context tree weighting algorithm to efficiently sum over all variable-length Markov chains to evaluate the posterior probability of a haplotype configuration. Algorithms are presented that find haplotype configurations with high posterior probability. BACH is the most accurate method presented in this thesis and has performance comparable to the best available software for haplotype inference.

Local alignment significance is a computational problem where one is interested in whether the local similarities in two sequences are due to the sequences being related or just due to chance. Similarity of sequences is measured by their best local alignment score, and from that a p-value is computed: the probability of picking two sequences from the null model that have an equally good or better best local alignment score. Local alignment significance is used routinely, for example, in homology searches. In this thesis, a general framework is sketched that allows one to compute a tight upper bound for the p-value of a local pairwise alignment score. Unlike previous methods, the presented framework is not affected by so-called edge effects and can handle gaps (deletions and insertions) without troublesome sampling and curve fitting.
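
To make the p-value definition above concrete, here is a minimal sketch, not the thesis's upper-bound framework: the best local alignment score is computed with the classical Smith-Waterman recurrence, and the p-value is estimated naively by sampling sequence pairs from an i.i.d. null model. All function names and parameter values are illustrative assumptions.

```python
import random

def sw_score(a, b, match=1, mismatch=-1, gap=-2):
    # Smith-Waterman: best local alignment score with a linear gap penalty.
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0] * (len(b) + 1)
        for j, cb in enumerate(b, 1):
            s = match if ca == cb else mismatch
            cur[j] = max(0, prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best

def empirical_p(score, m, n, alphabet="ACGT", samples=1000, seed=0):
    # P(best score >= observed) for sequence pairs drawn i.i.d. from the
    # null model; the +1 terms give a conservative estimate.
    rng = random.Random(seed)
    draw = lambda k: "".join(rng.choices(alphabet, k=k))
    hits = sum(sw_score(draw(m), draw(n)) >= score for _ in range(samples))
    return (hits + 1) / (samples + 1)
```

Such direct sampling is exactly the troublesome step that the framework sketched in the thesis is designed to replace with a computable tight upper bound.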

Relevance: 100.00%

Abstract:

Enzymes offer many advantages in industrial processes, such as high specificity, mild treatment conditions and low energy requirements. Therefore, the industry has exploited them in many sectors, including food processing. Enzymes can modify food properties by acting on small molecules or on polymers such as carbohydrates or proteins. Crosslinking enzymes such as tyrosinases and sulfhydryl oxidases catalyse the formation of novel covalent bonds between specific residues in proteins and/or peptides, thus forming or modifying the protein network of food.

In this study, novel secreted fungal proteins with sequence features typical of tyrosinases and sulfhydryl oxidases were identified through a genome mining study. Representatives of both of these enzyme families were selected for heterologous production in the filamentous fungus Trichoderma reesei and for biochemical characterisation.

Firstly, a novel family of putative tyrosinases carrying a shorter sequence than the previously characterised tyrosinases was discovered. These proteins lacked the entire linker and C-terminal domain that possibly play a role in cofactor incorporation, folding or protein activity. One of these proteins, AoCO4 from Aspergillus oryzae, was produced in T. reesei at a level of about 1.5 g/l. The enzyme AoCO4 was correctly folded and bound the copper cofactors with a type-3 copper centre. However, the enzyme had only a low level of activity with the phenolic substrates tested; the highest activity was obtained with 4-tert-butylcatechol. Since tyrosine was not a substrate for AoCO4, the enzyme was classified as a catechol oxidase.

Secondly, the genome analysis for secreted proteins with sequence features typical of flavin-dependent sulfhydryl oxidases pinpointed two previously uncharacterised proteins, AoSOX1 and AoSOX2, from A. oryzae. These two novel sulfhydryl oxidases were produced in T. reesei at levels of 70 and 180 mg/l, respectively, in shake flask cultivations. AoSOX1 and AoSOX2 were FAD-dependent enzymes with a dimeric tertiary structure; both showed activity on small sulfhydryl compounds such as glutathione and dithiothreitol, and both were drastically inhibited by zinc sulphate. AoSOX2 showed good stability to thermal and chemical denaturation, being superior to AoSOX1 in this respect.

Thirdly, the suitability of AoSOX1 as a possible baking improver was evaluated. The effect of AoSOX1, alone and in combination with the widely used improver ascorbic acid, was tested on yeasted wheat dough, both fresh and frozen, and on fresh water-flour dough. In all cases, AoSOX1 had no effect on the fermentation properties of fresh yeasted dough. AoSOX1 negatively affected the fermentation properties of frozen doughs and accelerated the damaging effects of frozen storage, i.e. it gave a softer dough with poorer gas retention than the control. In combination with ascorbic acid, AoSOX1 gave harder doughs. In accordance, rheological studies on yeast-free dough showed that the presence of AoSOX1 alone resulted in a weaker and more extensible dough, whereas a dough with the opposite properties was obtained if ascorbic acid was also used. Doughs containing ascorbic acid and increasing amounts of AoSOX1 were harder in a dose-dependent manner. Sulfhydryl oxidase AoSOX1 thus had an enhancing effect on the dough-hardening mechanism of ascorbic acid. This was ascribed mainly to the production of hydrogen peroxide in the SOX reaction, which is able to convert ascorbic acid to the actual improver, dehydroascorbic acid. In addition, AoSOX1 could possibly oxidise the free glutathione in the dough and thus prevent the loss of dough strength caused by the spontaneous reduction of the disulfide bonds constituting the dough protein network. Sulfhydryl oxidase AoSOX1 is therefore able to enhance the action of ascorbic acid in wheat dough and could potentially be applied in wheat dough baking.

Relevance: 100.00%

Abstract:

Natural products constitute an important source of new drugs. The bioavailability of a drug depends on its absorption, distribution, metabolism and elimination. To achieve good bioavailability, the drug must be soluble in water, stable in the gastrointestinal tract and palatable. Binding proteins may improve the solubility of drug compounds by masking unwanted properties such as bad taste, bitterness or toxicity, and by transporting or protecting these compounds during processing and storage.

The focus of this thesis was to study the interactions, including ligand binding and the effects of pH and temperature, of bovine and reindeer β-lactoglobulin (βLG) with compounds such as retinoids and phenolic compounds, as well as with compounds from plant extracts, and to investigate the transport properties of the βLG-ligand complex. To examine the binding interactions of different ligands with βLG, new methods were developed. The fluorescence binding method for the evaluation of ligand binding to βLG was miniaturized from a quartz cell to a 96-well plate. A method of ultrafiltration sampling combined with high-performance liquid chromatography was developed to assess the binding of compounds from extracts.

The interactions of phenolic compounds or retinoids with βLG were investigated using the 96-well plate method. The majority of the flavones, flavonols, flavanones and isoflavones, and all of the retinoids included, were shown to bind to bovine and reindeer βLG. Phenolic compounds, contrary to retinol, were not released at acidic pH. These results suggest that βLG may have more binding sites, probably also on its surface. Extracts from Camellia sinensis (L.) O. Kunze (black tea), Urtica dioica L. (nettle) and Piper nigrum (black pepper) were used to evaluate whether βLG could bind compounds from plant extracts. Piperine from P. nigrum was found to bind tightly, and rutin from U. dioica weakly, to βLG. No components from C. sinensis bound to βLG in our experiment.

The uptake and membrane permeation of bovine and reindeer βLG, free and bound with retinol, palmitic acid and cholesterol, were investigated using Caco-2 cell monolayers. Both bovine and reindeer βLG were able to cross the Caco-2 cell membrane. Free and βLG-bound retinol and palmitic acid were transported equally, whereas cholesterol could not cross the Caco-2 cell monolayer either free or bound to βLG. Our results showed that βLG can bind different natural product compounds but cannot enhance the transport of retinol, palmitic acid or cholesterol through Caco-2 cells. Despite this, βLG, as a water-soluble binding protein, may improve the solubility of natural compounds, possibly protecting them from early degradation and transporting some of them through the stomach. Furthermore, it may decrease their bad or bitter taste during oral administration of drugs or in food preparations. βLG can also enhance or decrease the health benefits of herbal teas and food preparations by binding compounds from extracts.

Relevance: 100.00%

Abstract:

An important challenge in the forest industry is to get the appropriate raw material out of the forests and into the wood processing industry. Growth and stem reconstruction simulators are therefore increasingly integrated into industrial conversion simulators to link the properties of wooden products to the three-dimensional structure of stems and their growing conditions. Static simulators predict the wood properties from stem dimensions at the end of a growth simulation period, whereas in dynamic approaches the structural components, e.g. branches, are incremented along with the growth processes. The dynamic approach can be applied to stem reconstruction by predicting the three-dimensional stem structure from external tree variables (e.g. age and height) as the result of growth to the current state; a toy sketch of this incremental style follows the abstract.

In this study, a dynamic growth simulator, PipeQual, and a stem reconstruction simulator, RetroSTEM, are adapted to Norway spruce (Picea abies [L.] Karst.) to predict the three-dimensional structure of stems (taper, branchiness, wood basic density) over time, such that both simulators can be integrated into a sawing simulator. The parameterisation of the PipeQual and RetroSTEM simulators for Norway spruce relied on a theoretically based description of tree structure that develops in the growth process and follows certain conservative structural regularities while allowing for plasticity in crown development. The crown expressed both regularity and plasticity in its development: the vertical foliage density peaked regularly at about 5 m from the stem apex, varying below that with tree age and dominance position (Study I). Conservative stem structure was characterized in terms of (1) the pipe ratios between foliage mass and branch and stem cross-sectional areas at the crown base, (2) the allometric relationship between foliage mass and crown length, (3) mean branch length relative to crown length and (4) form coefficients in branches and stem (Study II). The pipe ratio between branch and stem cross-sectional area at the crown base, and mean branch length relative to crown length, may differ in trees before and after canopy closure, but the variation should be further analysed in stands of different ages and densities with varying site fertilities and climates.

The predictions of the PipeQual and RetroSTEM simulators were evaluated by comparing simulated values to measured ones (Studies III, IV). Both simulators predicted stem taper and branch diameter at the individual tree level with a small bias. RetroSTEM predictions of wood density were accurate. To achieve even more accurate predictions of stem diameters and branchiness along the stem, both simulators should be further improved by revising the following aspects: the relationship between foliage and stem sapwood area in the upper stem, the source of error in branch sizes, the crown base development, and the height growth models in RetroSTEM.

In Study V, the RetroSTEM simulator was integrated into the InnoSIM sawing simulator, and according to pilot simulations this turned out to be an efficient tool for readily producing stand-scale information about stem sizes and structure when approximating the available assortments of wood products.
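
The sketch referenced above: a minimal toy illustration of the static/dynamic distinction, in which structural components are incremented year by year rather than predicted once from the final dimensions. The class, increments and units are purely illustrative assumptions and do not reflect PipeQual's or RetroSTEM's actual models.

```python
from dataclasses import dataclass, field

@dataclass
class Stem:
    height_m: float = 1.0
    whorl_branch_lengths_m: list = field(default_factory=list)

def grow(stem: Stem, years: int,
         height_inc: float = 0.4, branch_inc: float = 0.1) -> Stem:
    # Dynamic approach: each simulated year extends the stem, lengthens
    # every existing branch whorl, and adds a new whorl near the apex,
    # so the 3-D structure accumulates along with the growth process.
    for _ in range(years):
        stem.height_m += height_inc
        stem.whorl_branch_lengths_m = [
            length + branch_inc for length in stem.whorl_branch_lengths_m]
        stem.whorl_branch_lengths_m.append(0.05)  # new topmost whorl
    return stem

# A static simulator would instead map the final height directly to
# predicted branch sizes, without tracking the intermediate states.
```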

Relevance: 100.00%

Abstract:

This thesis integrates real-time feedback control into an optical tweezers instrument. The goal is to reduce the variance in the trapped bead's position, effectively increasing the trap stiffness of the optical tweezers. Trap steering is done with acousto-optic deflectors, and the control algorithms are implemented on a field-programmable gate array card. When the position clamp feedback control is on, the effective trap stiffness increases 12.1-fold compared to the stiffness without control. This allows improved spatial control over trapped particles without increasing the trapping laser power.
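
A minimal toy model of the position clamp idea, with all dynamics and numbers invented for illustration: an overdamped bead in a harmonic trap is simulated in one dimension, and the feedback steers the trap centre against the measured deviation, which raises the effective stiffness and lowers the position variance.

```python
import random

def clamp_variance(gain: float, k: float = 0.2, noise: float = 1.0,
                   steps: int = 50_000, seed: int = 0) -> float:
    # Euler steps of an overdamped bead: x <- x - k*(x - c) + thermal kick.
    # The position clamp sets the trap centre c = -gain * x each cycle,
    # so the effective stiffness becomes k * (1 + gain).
    rng = random.Random(seed)
    x, acc = 0.0, 0.0
    for _ in range(steps):
        c = -gain * x                    # feedback: steer against deviation
        x += -k * (x - c) + noise * rng.gauss(0.0, 1.0)
        acc += x * x
    return acc / steps

# Comparing clamp_variance(0.0) (control off) with clamp_variance(5.0)
# (control on) shows the variance reduction, i.e. the higher effective
# trap stiffness that the feedback provides.
```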

Relevance: 100.00%

Abstract:

The paradigm of computational vision hypothesizes that any visual function, such as the recognition of your grandparent, can be replicated by computational processing of the visual input. What are the computations that the brain performs? What should or could they be? Working on the latter question, this dissertation takes the statistical approach, where the suitable computations are learned from natural visual data itself. In particular, we empirically study the computational processing that emerges from the statistical properties of the visual world and from the constraints and objectives specified for the learning process.

This thesis consists of an introduction and seven peer-reviewed publications. The purpose of the introduction is to illustrate the area of study to a reader who is not familiar with computational vision research. In the scope of the introduction, we briefly overview the primary challenges of visual processing and recall some current opinions on visual processing in the early visual systems of animals. Next, we describe the methodology used in our research and discuss the presented results. We have included in this discussion some additional remarks, speculations and conclusions that were not featured in the original publications.

We present the following results in the publications of this thesis. First, we empirically demonstrate that luminance and contrast are strongly dependent in natural images, contradicting previous theories suggesting that luminance and contrast are processed separately in natural systems due to their independence in the visual data. Second, we show that simple-cell-like receptive fields of the primary visual cortex can be learned in the nonlinear contrast domain by maximization of independence. Further, we provide first-time reports of the emergence of conjunctive (corner-detecting) and subtractive (opponent-orientation) processing due to nonlinear projection pursuit with simple objective functions related to sparseness and response energy optimization. Then, we show that attempting to extract independent components of nonlinear histogram statistics of a biologically plausible representation leads to projection directions that appear to differentiate between visual contexts. Such processing might be applicable to priming, i.e. the selection and tuning of later visual processing. We continue by showing that a different kind of thresholded low-frequency priming can be learned and used to make object detection faster with little loss in accuracy. Finally, we show that in a computational object detection setting, nonlinearly gain-controlled visual features of medium complexity can be acquired sequentially as images are encountered and discarded. We present two online algorithms to perform this feature selection, and propose the idea that for artificial systems, some processing mechanisms could be selected from the environment without optimizing the mechanisms themselves.

In summary, this thesis explores learning visual processing on several levels. The learning can be understood as an interplay of input data, model structures, learning objectives, and estimation algorithms. The presented work adds to the growing body of evidence that statistical methods can be used to acquire intuitively meaningful visual processing mechanisms. The work also presents some predictions and ideas regarding biological visual processing.
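
The "maximization of independence" step can be sketched with standard tools: whitened image patches are fed to ICA, and the learned unmixing rows act as localised, oriented receptive fields. This is a minimal sketch; the publications work on natural-image patches (and in a nonlinear contrast domain), whereas the synthetic edge patches below are a stand-in assumed only to keep the example self-contained.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)

def edge_patch(size=12):
    # Stand-in data: a random oriented step edge (the actual research
    # uses patches sampled from natural images).
    theta = rng.uniform(0.0, np.pi)
    offset = rng.uniform(-size / 4, size / 4)
    y, x = np.mgrid[:size, :size] - (size - 1) / 2
    return (np.cos(theta) * x + np.sin(theta) * y > offset).astype(float)

patches = np.array([edge_patch().ravel() for _ in range(5000)])
patches -= patches.mean(axis=0)    # centre each pixel dimension

# Maximising independence over the (internally whitened) patches;
# each row of components_ is one learned 12x12 filter.
ica = FastICA(n_components=36, max_iter=500, random_state=0)
ica.fit(patches)
receptive_fields = ica.components_.reshape(36, 12, 12)
```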

Relevance: 100.00%

Abstract:

Reorganizing a dataset so that its hidden structure can be observed is useful in any data analysis task. For example, detecting a regularity in a dataset helps us to interpret the data, compress the data, and explain the processes behind the data. We study datasets that come in the form of binary matrices (tables with 0s and 1s). Our goal is to develop automatic methods that bring out certain patterns by permuting the rows and columns. We concentrate on the following patterns in binary matrices: consecutive-ones (C1P), simultaneous consecutive-ones (SC1P), nestedness, k-nestedness, and bandedness. These patterns reflect specific types of interplay and variation between the rows and columns, such as continuity and hierarchies. Furthermore, their combinatorial properties are interlinked, which helps us to develop the theory of binary matrices and efficient algorithms. Indeed, we can detect all these patterns in a binary matrix efficiently, that is, in time polynomial in the size of the matrix.

Since real-world datasets often contain noise and errors, we rarely witness perfect patterns. Therefore we also need to assess how far an input matrix is from a pattern: we count the number of flips (from 0 to 1 or vice versa) needed to bring out the perfect pattern in the matrix. Unfortunately, for most patterns it is an NP-complete problem to find the minimum distance to a matrix that has the perfect pattern, which means that the existence of a polynomial-time algorithm is unlikely. To find patterns in datasets with noise, we need methods that are noise-tolerant and that work in practical time on large datasets. The theory of binary matrices gives rise to robust heuristics that perform well on synthetic data and discover easily interpretable structures in real-world datasets: dialectal variation in spoken Finnish, a division of European locations by the hierarchies found in mammal occurrences, and co-occurring groups in network data.

In addition to determining the distance from a dataset to a pattern, we need to determine whether the pattern is significant or a mere occurrence of random chance. To this end, we use significance testing: we deem a dataset significant if it appears exceptional when compared to datasets generated from a certain null hypothesis. After detecting a significant pattern in a dataset, it is up to domain experts to interpret the results in terms of the application.
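
Of the patterns listed above, nestedness is the easiest to illustrate: a binary matrix is fully nested when the supports of any two rows are comparable by set inclusion, so detecting the perfect pattern is a simple polynomial-time check. A minimal sketch with a helper name of our own choosing; as the abstract notes, the hard part in practice is the NP-complete flip-distance version, not this perfect-pattern test.

```python
def is_nested(matrix):
    # Sort row supports by size; since set inclusion is transitive,
    # checking containment between adjacent rows suffices.
    supports = sorted((frozenset(j for j, v in enumerate(row) if v)
                       for row in matrix), key=len, reverse=True)
    return all(a >= b for a, b in zip(supports, supports[1:]))

assert is_nested([[1, 1, 1],
                  [1, 0, 1],
                  [0, 0, 1]])        # supports form a chain of subsets
assert not is_nested([[1, 1, 0],
                      [0, 1, 1]])    # incomparable supports
```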