949 results for "Museum conservation methods".


Relevance: 20.00%

Abstract:

Genetics, the science of heredity and variation in living organisms, plays a central role in medicine, in breeding crops and livestock, and in studying fundamental topics of the biological sciences such as evolution and cell functioning. The field of genetics is currently developing rapidly because of recent advances in technologies for obtaining molecular data from living organisms. To extract the most information from such data, the analyses need to be carried out using statistical models tailored to the particular genetic processes involved. In this thesis we formulate and analyze Bayesian models for genetic marker data of contemporary individuals. The major focus is on modeling the unobserved recent ancestry of the sampled individuals (say, for tens of generations or so), which is carried out using explicit probabilistic reconstructions of the pedigree structures accompanied by the gene flows at the marker loci. Over such a recent history, recombination is the major genetic force shaping the genomes of the individuals, and it is included in the model by assuming that the recombination fractions between adjacent markers are known. The posterior distribution of the unobserved history of the individuals, conditional on the observed marker data, is studied using a Markov chain Monte Carlo (MCMC) algorithm. The example analyses consider estimation of the population structure, the relatedness structure (both at the level of whole genomes and at each marker separately), and haplotype configurations. For situations where the pedigree structure is partially known, an algorithm for creating an initial state for the MCMC algorithm is given. Furthermore, the thesis extends the model of recent genetic history to situations where a quantitative phenotype has also been measured on the contemporary individuals. In that case the goal is to identify positions on the genome that affect the observed phenotypic values. This task is carried out within the Bayesian framework, where the number and the relative effects of the quantitative trait loci are treated as random variables whose posterior distribution is studied conditional on the observed genetic and phenotypic data. In addition, the thesis extends a widely used haplotyping method, the PHASE algorithm, to settings where genetic material from several individuals has been pooled together and the allele frequencies of each pool are determined in a single genotyping.
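The posterior simulation scheme described above can be illustrated with a generic Metropolis-Hastings skeleton. The sketch below is emphatically not the thesis's pedigree sampler: the state, proposal and target are simplified stand-ins (a toy allele-frequency posterior under a flat prior), intended only to show how MCMC explores an unobserved quantity conditionally on observed data.

```python
import math
import random

def metropolis_hastings(log_post, propose, init_state, n_iter=10000):
    """Generic Metropolis-Hastings sampler.

    log_post: state -> log of the unnormalized posterior density.
    propose:  state -> (candidate, log proposal-density ratio).
    """
    state, current = init_state, log_post(init_state)
    samples = []
    for _ in range(n_iter):
        candidate, log_q_ratio = propose(state)
        cand = log_post(candidate)
        # Accept with probability min(1, posterior ratio * proposal ratio).
        if random.random() < math.exp(min(0.0, cand - current + log_q_ratio)):
            state, current = candidate, cand
        samples.append(state)
    return samples

def reflect(p):
    # Reflect a random-walk step off 0 and 1 so the proposal stays symmetric.
    p = p % 2.0
    return 2.0 - p if p > 1.0 else p

# Toy data: 7 carrier chromosomes among 10 sampled; flat prior on frequency.
k, n = 7, 10
log_post = lambda p: k * math.log(p) + (n - k) * math.log(1.0 - p)
propose = lambda p: (reflect(p + random.gauss(0.0, 0.1)), 0.0)
draws = metropolis_hastings(log_post, propose, 0.5, n_iter=5000)
print(sum(draws[1000:]) / len(draws[1000:]))  # posterior mean, about 0.67
```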

Relevance: 20.00%

Abstract:

Microarrays are high-throughput biological assays that allow thousands of genes to be screened for their expression. The main idea behind microarrays is to compute, for each gene, a unique signal that is directly proportional to the quantity of mRNA that was hybridized on the chip. The large number of steps involved, and the errors associated with each step, make the resulting expression signal noisy. As a result, microarray data need to be carefully pre-processed before their analysis can be assumed to lead to reliable and biologically relevant conclusions. This thesis focuses on developing methods for improving the gene signal, and on utilizing this improved signal in higher-level analysis. To achieve this, first, approaches are described for designing microarray experiments using various optimality criteria, considering both biological and technical replicates. A carefully designed experiment yields a signal with low noise, as the effect of unwanted variation is minimized and the precision of the estimates of the parameters of interest is maximized. Second, a system is developed for improving the gene signal by using three scans at varying scanner sensitivities. A novel Bayesian latent intensity model is then applied to these three sets of expression values, corresponding to the three scans, to estimate a suitably calibrated true signal for each gene. Third, a novel image segmentation approach that separates the fluorescent signal from undesired noise is developed using an additional dye, SYBR Green RNA II. This technique identifies signal arising only from the hybridized DNA, so that signal corresponding to dust, scratches, spilled dye, and other noise is avoided. Fourth, an integrated statistical model is developed in which signal correction, systematic array effects, dye effects, and differential expression are modelled jointly, as opposed to a sequential application of several separate methods of analysis. The methods described here have been tested only on cDNA microarrays, but with some modifications they can also be applied to other high-throughput technologies. Keywords: high-throughput technology, microarray, cDNA, multiple scans, Bayesian hierarchical models, image analysis, experimental design, MCMC, WinBUGS.
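The multiple-scan idea can be made concrete with a small sketch. The code below is a crude moment-based stand-in for the Bayesian latent intensity model: it assumes each scan reports the latent spot intensity scaled by a known gain and censored at a 16-bit saturation ceiling; both the known gains and the ceiling value are assumptions made here for illustration (the thesis estimates the calibration within the model itself).

```python
import numpy as np

SATURATION = 65535  # assumed 16-bit scanner ceiling

def combine_scans(scans, gains):
    """Estimate latent spot intensities from scans at several sensitivities.

    scans: (n_scans, n_spots) raw intensities, possibly saturated.
    gains: relative scanner sensitivities, assumed known here.
    """
    scans = np.asarray(scans, dtype=float)
    usable = scans < SATURATION                              # drop saturated readings
    norm = scans / np.asarray(gains, dtype=float)[:, None]   # put scans on a common scale
    norm[~usable] = np.nan
    return np.nanmean(norm, axis=0)                          # average the usable scans

scans = [[1000.0, 52000.0, 65535.0],  # high sensitivity: bright spot saturates
         [260.0, 13000.0, 60000.0]]   # low sensitivity: all spots in range
print(combine_scans(scans, gains=[4.0, 1.0]))  # -> [  255. 13000. 60000.]
```

The high-sensitivity scan resolves dim spots that the low-sensitivity scan measures poorly, while the low-sensitivity scan recovers the bright spots that saturate; a latent-intensity model combines both regimes into one calibrated signal.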

Relevance: 20.00%

Abstract:

Elucidating the mechanisms responsible for the patterns of species abundance, diversity, and distribution within and across ecological systems is a fundamental research focus in ecology. Species abundance patterns are shaped in a convoluted way by the interplay between inter- and intra-specific interactions, environmental forcing, demographic stochasticity, and dispersal. Comprehensive models, and suitable inferential and computational tools for teasing these different factors apart, are quite limited, even though such tools are critically needed to guide the implementation of management and conservation strategies whose efficacy rests on a realistic evaluation of the underlying mechanisms. This is all the more true given prevailing concerns over the progress of climate change and its potential impacts on ecosystems. This thesis utilized the flexible hierarchical Bayesian modelling framework, in combination with the computer-intensive methods known as Markov chain Monte Carlo (MCMC), to develop methodologies for identifying and evaluating the factors that control the structure and dynamics of ecological communities. These methodologies were used to analyze data from a range of taxa: macro-moths (Lepidoptera), fish, crustaceans, birds, and rodents. Environmental stochasticity emerged as the most important driver of community dynamics, followed by density-dependent regulation; the influence of inter-specific interactions on community-level variances was broadly minor. This thesis contributes to the understanding of the mechanisms underlying the structure and dynamics of ecological communities by showing directly that environmental fluctuations, rather than inter-specific competition, dominate the dynamics of several systems. This finding emphasizes the need to better understand how species are affected by the environment, and to acknowledge differences among species in their responses to environmental heterogeneity, if we are to effectively model and predict their dynamics (e.g. for management and conservation purposes). The thesis also proposes a model-based approach to integrating the niche and neutral perspectives on community structure and dynamics, making it possible to evaluate the relative importance of each category of factors in light of field data.
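As a toy illustration of the dynamics being partitioned, the sketch below simulates a multispecies Gompertz model on the log scale, in which a community-wide environmental shock (correlated across species) acts alongside density-dependent regulation and species-specific demographic noise. All parameter values are arbitrary illustrations, not estimates from the thesis.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_community(n_species=5, n_years=200, a=1.0, b=-0.3,
                       sd_env=0.4, sd_dem=0.1, rho=0.6):
    """Gompertz dynamics for log-abundances x:
    x[t+1] = x[t] + a + b * x[t] + env[t] + dem[t],
    where b < 0 gives density dependence, env[t] is an environmental
    shock with correlation rho across species, and dem[t] is
    species-specific demographic noise.
    """
    cov = sd_env**2 * (rho * np.ones((n_species, n_species))
                       + (1.0 - rho) * np.eye(n_species))
    x = np.zeros((n_years, n_species))
    for t in range(n_years - 1):
        env = rng.multivariate_normal(np.zeros(n_species), cov)
        dem = rng.normal(0.0, sd_dem, n_species)
        x[t + 1] = x[t] + a + b * x[t] + env + dem
    return x

x = simulate_community()
print(x[-1].round(2))  # log-abundances fluctuate around a / -b ~ 3.33
```

Fitting such a model hierarchically, and comparing the estimated environmental, density-dependent and interaction components, is the kind of variance partitioning the thesis performs on real data.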

Relevance: 20.00%

Abstract:

Post-release survival of line-caught pearl perch (Glaucosoma scapulare) was assessed in field experiments where fish were angled using methods similar to those used by commercial, recreational and charter fishers. One hundred and eighty-three individuals were caught during four experiments, of which >91% survived up to three days post-capture. Hook location was found to be the best predictor of survival, with the survival of throat- or stomach-hooked pearl perch significantly (P < 0.05) lower than that of fish hooked in either the mouth or lip. Post-release survival was similar for legal-size (≥35 cm) and sub-legal (<35 cm) pearl perch, while individuals showing no signs of barotrauma were more likely to survive in the short term. Examination of swim bladders in the laboratory, combined with observations in the field, revealed that swim bladders rupture during ascent from depth, allowing swim bladder gases to escape into the gut cavity. As angled fish approach the surface, the alimentary tract ruptures near the anus, allowing these gases to escape the gut cavity. As a result, very few pearl perch exhibit barotrauma symptoms, and no barotrauma mitigation strategies were recommended. The results of this study show that pearl perch are relatively resilient to catch-and-release, suggesting that post-release mortality would not contribute significantly to total fishing mortality. We recommend the use of circle hooks, fished actively on tight lines, combined with minimal handling, to maximise the post-release survival of pearl perch.
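The hook-location result can be illustrated with a simple contingency-table test; the counts below are hypothetical (the study reports only the direction of the effect and P < 0.05), and the chi-squared test is a generic stand-in rather than the analysis actually used.

```python
from scipy.stats import chi2_contingency

# Hypothetical counts: rows = hook location, cols = [survived, died]
# over three days post-capture.
table = [
    [90, 4],   # mouth- or lip-hooked
    [60, 12],  # throat- or stomach-hooked
]
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, P = {p:.4f}")  # P < 0.05: survival depends on hook location
```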

Relevance: 20.00%

Abstract:

Development of no-tillage (NT) farming has revolutionized agricultural systems by allowing growers to manage greater areas of land with reduced energy, labour and machinery inputs, while controlling erosion, improving soil health and reducing greenhouse gas emissions. However, NT farming systems have resulted in a build-up of herbicide-resistant weeds, an increased incidence of soil- and stubble-borne diseases, and an enrichment of nutrients and carbon near the soil surface. Consequently, there is increased interest in the use of occasional tillage (termed strategic tillage, ST) to address such emerging constraints in otherwise-NT farming systems. Decisions around ST use will depend upon the specific issues present in the individual field or farm, and on the profitability and effectiveness of the available management options. This paper explores some of the issues with the implementation of ST in NT farming systems. Contrasting soil properties, the timing of the tillage and the prevailing climate exert a strong influence on the success of ST. Decisions around the timing of tillage are complex and depend on the interactions between soil water content and the purpose for which the ST is intended. The soil needs to be at the right water content before any tillage is executed, while the objective of the ST will influence the frequency and the type of tillage implement used. The use of ST in long-term NT systems will depend on factors associated with system costs and profitability, soil health and environmental impacts. For many farmers, maintaining farm profitability is a priority, so economic considerations are likely to be the primary factor dictating adoption. However, impacts on soil health and the environment, especially the risk of erosion and the loss of soil carbon, will also influence a grower's choice to adopt ST, as will the impact on soil moisture reserves in rainfed cropping systems.

Relevance: 20.00%

Abstract:

Nitrous oxide (N2O) is a potent greenhouse gas and the predominant ozone-depleting substance in the atmosphere. Agricultural use of nitrogenous fertiliser is the major source of human-induced N2O emissions. A field experiment was conducted at Bundaberg from October 2012 to September 2014 to examine the impact of a legume (soybean) rotation crop as an alternative nitrogen (N) source on N2O emissions during the fallow period, and to investigate low-emission soybean residue management practices. An automatic monitoring system and manual gas sampling chambers were used to measure greenhouse gas emissions from the soil. Soybean cropping during the fallow period reduced N2O emissions compared to bare fallow. Based on the N content of the soybean crop residues, the fertiliser N application rate was reduced by about 120 kg N/ha for the subsequent sugarcane crop. Consequently, N2O emissions during the sugarcane cropping season were significantly lower from the soybean-cropped soil than from the conventionally fertilised (145 kg N/ha) soil following bare fallow. However, tillage that incorporated the soybean crop residues into the soil promoted N2O emissions in the first two months. Spraying a nitrification inhibitor (DMPP) onto the soybean crop residues before tillage effectively prevented these N2O emission spikes. Compared to conventional tillage, practising no-till, with or without a nitrogen catch crop grown between soybean harvest and cane planting, also reduced N2O emissions substantially. These results demonstrate that soybean rotation during the fallow period, followed by N-conservation management practices, could offer a promising N2O mitigation strategy in sugarcane farming. Further investigation is required to provide guidance on N and water management following a soybean fallow so as to maintain sugar productivity.

Relevance: 20.00%

Abstract:

Metabolism is the cellular subsystem responsible for generating energy from nutrients and producing building blocks for larger macromolecules. Computational and statistical modeling of metabolism is vital to many disciplines, including bioengineering, the study of diseases, drug target identification, and understanding the evolution of metabolism. In this thesis, we propose efficient computational methods for metabolic modeling. The techniques presented are targeted particularly at the analysis of large metabolic models encompassing the whole metabolism of one or several organisms. We concentrate on three major themes of metabolic modeling: metabolic pathway analysis, metabolic reconstruction, and the study of the evolution of metabolism. In the first part of this thesis, we study metabolic pathway analysis. We propose a novel modeling framework, called gapless modeling, for studying biochemically viable metabolic networks and pathways. In addition, we investigate the use of atom-level information on metabolism to improve the quality of pathway analyses. We describe efficient algorithms for discovering both gapless and atom-level metabolic pathways, and conduct experiments with large-scale metabolic networks. The gapless approach offers a compromise, in terms of complexity and feasibility, between the earlier graph-theoretic and stoichiometric approaches to metabolic modeling. Gapless pathway analysis shows that microbial metabolic networks are not as robust to random damage as previous studies have suggested. Furthermore, the amino acid biosynthesis pathways of the fungal species Trichoderma reesei discovered from atom-level data are shown to correspond closely to those of Saccharomyces cerevisiae. In the second part, we propose computational methods for metabolic reconstruction within the gapless modeling framework. We study the task of reconstructing a metabolic network that does not suffer from connectivity problems; such problems often limit the usability of reconstructed models and typically require a significant amount of manual post-processing. We formulate gapless metabolic reconstruction as an optimization problem and propose an efficient divide-and-conquer strategy for solving real-world instances. We also describe computational techniques for resolving ambiguities in metabolite naming. These techniques have been implemented in ReMatch, a web-based software tool for reconstructing models for 13C metabolic flux analysis. In the third part, we extend our scope from single to multiple metabolic networks and propose an algorithm for inferring gapless metabolic networks of ancestral species from phylogenetic data. Experimenting with 16 fungal species, we show that the method generates results that are easily interpretable and that provide hypotheses about the evolution of metabolism.
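The connectivity notion behind gapless modeling can be illustrated with a minimal forward-expansion sketch: starting from a set of seed metabolites, a reaction can fire once all of its substrates are producible, and its products then become producible in turn; a reaction that can never fire marks a gap. This is a generic reachability computation offered for intuition only, not the thesis's exact gapless formulation.

```python
def reachable_reactions(reactions, seeds):
    """Return the reactions that can ever fire from the seed metabolites.

    reactions: dict name -> (substrates, products), as sets of metabolites.
    """
    producible = set(seeds)
    fired = set()
    changed = True
    while changed:
        changed = False
        for name, (subs, prods) in reactions.items():
            if name not in fired and subs <= producible:
                fired.add(name)
                producible |= prods
                changed = True
    return fired

# Toy network: r3 requires metabolite E, which nothing produces -> a gap.
rxns = {
    "r1": ({"A"}, {"B"}),
    "r2": ({"B", "C"}, {"D"}),
    "r3": ({"E"}, {"F"}),
}
print(reachable_reactions(rxns, seeds={"A", "C"}))  # {'r1', 'r2'}
```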

Relevance: 20.00%

Abstract:

Large-scale chromosome rearrangements such as copy number variants (CNVs) and inversions account for a considerable proportion of the genetic variation between human individuals, and in a number of cases they have been closely linked with inheritable diseases. Single-nucleotide polymorphisms (SNPs) constitute another large part of the genetic variance between individuals; they are typically abundant, and measuring them is straightforward and cheap. This thesis presents computational means of using SNPs to detect the presence of inversions and of deletions, a particular variety of CNV. Technically, the inversion-detection algorithm detects the suppressed recombination rate between inverted and non-inverted haplotype populations, whereas the deletion-detection algorithm uses the EM algorithm to estimate the haplotype frequencies of a window with and without a deletion haplotype. As a contribution to population biology, a coalescent simulator for simulating inversion polymorphisms has also been developed. Coalescent simulation is a backward-in-time method of modelling population ancestry; the simulator also models multiple crossovers, using the Counting model as its chiasma interference model. Finally, this thesis includes an experimental section. The aforementioned methods were tested on synthetic data to evaluate their power and specificity, and were then applied to the HapMap Phase II and Phase III data sets, yielding a number of candidate previously unknown inversions and deletions, and also correctly detecting known rearrangements of both kinds.
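The deletion-frequency estimation can be illustrated with the classic null-allele EM, a much-simplified single-SNP analogue of the windowed model described above. A deletion carrier (A/D or B/D) is indistinguishable from the corresponding homozygote, and only D/D individuals show up as null genotypes; the E-step imputes the expected number of hidden deletion heterozygotes and the M-step re-counts alleles. The genotype counts are invented for illustration.

```python
def em_null_allele(n_a_only, n_b_only, n_ab, n_null, n_iter=200):
    """EM estimates of allele frequencies (pa, pb, pd) at a SNP
    segregating with a deletion allele D, under Hardy-Weinberg."""
    n = n_a_only + n_b_only + n_ab + n_null
    pa, pb, pd = 0.4, 0.4, 0.2                       # initial guess
    for _ in range(n_iter):
        # E-step: expected hidden A/D (resp. B/D) heterozygotes among
        # the individuals showing only an A (resp. only a B) signal.
        e_ad = n_a_only * (2 * pa * pd) / (pa**2 + 2 * pa * pd)
        e_bd = n_b_only * (2 * pb * pd) / (pb**2 + 2 * pb * pd)
        # M-step: re-estimate frequencies from expected allele counts
        # over the 2n sampled chromosomes.
        pa = (2 * (n_a_only - e_ad) + e_ad + n_ab) / (2 * n)
        pb = (2 * (n_b_only - e_bd) + e_bd + n_ab) / (2 * n)
        pd = (e_ad + e_bd + 2 * n_null) / (2 * n)
    return pa, pb, pd

print(em_null_allele(n_a_only=420, n_b_only=380, n_ab=150, n_null=50))
```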

Relevance: 20.00%

Abstract:

In this thesis, we develop theory and methods for computational data analysis. The problems in data analysis are approached from three perspectives: statistical learning theory, the Bayesian framework, and the information-theoretic minimum description length (MDL) principle. The contributions in statistical learning theory address the possibility of generalization to unseen cases, and regression analysis with partially observed data, with an application to mobile device positioning. In the second part of the thesis, we discuss so-called Bayesian network classifiers and show that they are closely related to logistic regression models. In the final part, we apply the MDL principle to tracing the history of old manuscripts and to noise reduction in digital signals.
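The link between Bayesian network classifiers and logistic regression rests, in its simplest (naive Bayes, binary-class) case, on a textbook observation, sketched here:

```latex
\log \frac{P(Y=1 \mid x)}{P(Y=0 \mid x)}
  = \log \frac{P(Y=1)}{P(Y=0)}
    + \sum_{j=1}^{d} \log \frac{P(x_j \mid Y=1)}{P(x_j \mid Y=0)}
```

For discrete features this log-odds is linear in indicator variables of the x_j, so the classifier has the same functional form as a logistic regression model; the two differ in whether the parameters are fitted by maximizing the joint (generative) or the conditional likelihood. The thesis treats the general case; this is only the simplest instance of the correspondence.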

Relevance: 20.00%

Abstract:

In this thesis we present and evaluate two pattern-matching-based methods for answer extraction in textual question answering systems. A textual question answering system is a system that seeks answers to natural language questions from unstructured text. Textual question answering is an important research problem: as the amount of natural language text in digital form grows, the need for novel methods for pinpointing important knowledge in vast textual databases becomes ever more urgent. We concentrate on developing methods for the automatic creation of answer extraction patterns, and a new type of extraction pattern is also developed. The pattern matching based approach is interesting because of its language and application independence. The answer extraction methods are developed within the framework of our own question answering system, using publicly available English datasets as training and evaluation data. The techniques developed are based on the well-known methods of sequence alignment and hierarchical clustering, with a similarity metric based on edit distance. The main conclusion of the research is that answer extraction patterns consisting of the most important words of the question, together with the following information extracted from the answer context: plain words, part-of-speech tags, punctuation marks and capitalization patterns, can be used in the answer extraction module of a question answering system. Patterns of this type, generated by the two new methods, yield average results when compared with those produced by other systems using the same dataset. However, most of the answer extraction methods in the question answering systems tested on the same dataset are both hand-crafted and based on a system-specific, fine-grained question classification. The new methods developed in this thesis require no manual creation of answer extraction patterns; as a source of knowledge, they require a dataset of sample questions and answers, together with a set of text documents that contain answers to most of the questions. The question classification used in the training data is a standard one, already provided in the publicly available data.
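A crude sketch of alignment-based pattern induction: two answer contexts are aligned token by token, stretches shared by both are kept, and differing stretches collapse into a wildcard slot. Python's difflib serves here as a stand-in aligner, and the <ANSWER>/<ANY> markers are invented for illustration; the thesis's methods (edit-distance similarity plus hierarchical clustering over many contexts) are considerably more elaborate.

```python
from difflib import SequenceMatcher

def align_pattern(context_a, context_b, wildcard="<ANY>"):
    """Generalize two answer contexts into a single extraction pattern."""
    a, b = context_a.split(), context_b.split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    pattern, prev_end = [], 0
    for start_a, _, size in matcher.get_matching_blocks():
        if start_a > prev_end:            # unmatched stretch -> wildcard slot
            pattern.append(wildcard)
        pattern.extend(a[start_a:start_a + size])
        prev_end = start_a + size
    return " ".join(pattern)

print(align_pattern("was born in <ANSWER> in 1879",
                    "was born in <ANSWER> on Tuesday"))
# -> was born in <ANSWER> <ANY>
```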

Relevance: 20.00%

Abstract:

The Minimum Description Length (MDL) principle is a general, well-founded theoretical formalization of statistical modeling. The most important notion in MDL is the stochastic complexity, which can be interpreted as the shortest description length of a given sample of data relative to a model class. The exact definition of the stochastic complexity has gone through several evolutionary steps; the latest instantiation is based on the so-called Normalized Maximum Likelihood (NML) distribution, which has been shown to possess several important theoretical properties. However, applications of this modern version of MDL have been quite rare because of computational complexity problems: for discrete data, the definition of NML involves an exponential sum, and in the case of continuous data, a multi-dimensional integral that is usually infeasible to evaluate or even to approximate accurately. In this doctoral dissertation, we present mathematical techniques for computing NML efficiently for certain model families involving discrete data. We also show how these techniques can be used to apply MDL in two practical applications: histogram density estimation and clustering of multi-dimensional data.
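For concreteness, the standard (Shtarkov) NML definition can be written as follows, where \hat{\theta}(x^n) denotes the maximum likelihood parameters for the data x^n within the model class:

```latex
P_{\mathrm{NML}}(x^n) \;=\;
  \frac{P\bigl(x^n \mid \hat{\theta}(x^n)\bigr)}
       {\sum_{y^n} P\bigl(y^n \mid \hat{\theta}(y^n)\bigr)},
\qquad
\mathrm{SC}(x^n) \;=\; -\log P_{\mathrm{NML}}(x^n)
```

The denominator, the parametric complexity, is the exponential sum over all possible data sets of length n referred to above (for continuous data it becomes a multi-dimensional integral), and the stochastic complexity SC is the negative logarithm of the NML distribution.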

Relevance: 20.00%

Abstract:

Matrix decompositions, in which a given matrix is represented as a product of two other matrices, are regularly used in data mining. Most matrix decompositions have their roots in linear algebra, but the needs of data mining are not always those of linear algebra: in data mining one needs results that are interpretable, and what is considered interpretable in data mining can be very different from what is considered interpretable in linear algebra. The purpose of this thesis is to study matrix decompositions that directly address the issue of interpretability. An example is a decomposition of binary matrices where the factor matrices are assumed to be binary and the matrix multiplication is Boolean. The restriction to binary factor matrices increases interpretability, since the factor matrices are of the same type as the original matrix, and allows the use of Boolean matrix multiplication, which is often more intuitive than ordinary matrix multiplication on binary matrices. Several other decomposition methods are also described, and the computational complexity of computing them is studied, together with the hardness of approximating the related optimization problems. Based on these studies, algorithms for constructing the decompositions are proposed. Constructing the decompositions turns out to be computationally hard, and the proposed algorithms are mostly based on various heuristics. Nevertheless, the algorithms are shown to be capable of finding good results in empirical experiments conducted on both synthetic and real-world data.
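A minimal sketch of the Boolean product referred to above, under OR/AND arithmetic (so 1 + 1 = 1):

```python
import numpy as np

def boolean_product(B, C):
    """Boolean matrix product: (B o C)[i, j] = OR_k (B[i, k] AND C[k, j])."""
    B, C = np.asarray(B, dtype=bool), np.asarray(C, dtype=bool)
    return (B[:, :, None] & C[None, :, :]).any(axis=1).astype(int)

B = [[1, 0],
     [1, 1]]
C = [[1, 1, 0],
     [0, 1, 1]]
print(boolean_product(B, C))
# [[1 1 0]
#  [1 1 1]]   (ordinary multiplication would give a 2 in the middle of row 2)
```

Because overlapping factors do not add up, the rows of B can be read as selections of "tiles" (rows of C) whose union reconstructs the data, which is the interpretability advantage the thesis exploits.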