8 resultados para Software-based techniques
em Helda - Digital Repository of University of Helsinki
Resumo:
In this thesis, two separate single nucleotide polymorphism (SNP) genotyping techniques were set up at the Finnish Genome Center, pooled genotyping was evaluated as a screening method for large-scale association studies, and finally, the former approaches were used to identify genetic factors predisposing to two distinct complex diseases by utilizing large epidemiological cohorts and also taking environmental factors into account. The first genotyping platform was based on traditional but improved restriction-fragment-length-polymorphism (RFLP) utilizing 384-microtiter well plates, multiplexing, small reaction volumes (5 µl), and automated genotype calling. We participated in the development of the second genotyping method, based on single nucleotide primer extension (SNuPeTM by Amersham Biosciences), by carrying out the alpha- and beta tests for the chemistry and the allele-calling software. Both techniques proved to be accurate, reliable, and suitable for projects with thousands of samples and tens of markers. Pooled genotyping (genotyping of pooled instead of individual DNA samples) was evaluated with Sequenom s MassArray MALDI-TOF, in addition to SNuPeTM and PCR-RFLP techniques. We used MassArray mainly as a point of comparison, because it is known to be well suited for pooled genotyping. All three methods were shown to be accurate, the standard deviations between measurements being 0.017 for the MassArray, 0.022 for the PCR-RFLP, and 0.026 for the SNuPeTM. The largest source of error in the process of pooled genotyping was shown to be the volumetric error, i.e., the preparation of pools. We also demonstrated that it would have been possible to narrow down the genetic locus underlying congenital chloride diarrhea (CLD), an autosomal recessive disorder, by using the pooling technique instead of genotyping individual samples. Although the approach seems to be well suited for traditional case-control studies, it is difficult to apply if any kind of stratification based on environmental factors is needed. Therefore we chose to continue with individual genotyping in the following association studies. Samples in the two separate large epidemiological cohorts were genotyped with the PCR-RFLP and SNuPeTM techniques. The first of these association studies concerned various pregnancy complications among 100,000 consecutive pregnancies in Finland, of which we genotyped 2292 patients and controls, in addition to a population sample of 644 blood donors, with 7 polymorphisms in the potentially thrombotic genes. In this thesis, the analysis of a sub-study of pregnancy-related venous thromboses was included. We showed that the impact of factor V Leiden polymorphism on pregnancy-related venous thrombosis, but not the other tested polymorphisms, was fairly large (odds ratio 11.6; 95% CI 3.6-33.6), and increased multiplicatively when combined with other risk factors such as obesity or advanced age. Owing to our study design, we were also able to estimate the risks at the population level. The second epidemiological cohort was the Helsinki Birth Cohort of men and women who were born during 1924-1933 in Helsinki. The aim was to identify genetic factors that might modify the well known link between small birth size and adult metabolic diseases, such as type 2 diabetes and impaired glucose tolerance. Among ~500 individuals with detailed birth measurements and current metabolic profile, we found that an insertion/deletion polymorphism of the angiotensin converting enzyme (ACE) gene was associated with the duration of gestation, and weight and length at birth. Interestingly, the ACE insertion allele was also associated with higher indices of insulin secretion (p=0.0004) in adult life, but only among individuals who were born small (those among the lowest third of birth weight). Likewise, low birth weight was associated with higher indices of insulin secretion (p=0.003), but only among carriers of the ACE insertion allele. The association with birth measurements was also found with a common haplotype of the glucocorticoid receptor (GR) gene. Furthermore, the association between short length at birth and adult impaired glucose tolerance was confined to carriers of this haplotype (p=0.007). These associations exemplify the interaction between environmental factors and genotype, which, possibly due to altered gene expression, predisposes to complex metabolic diseases. Indeed, we showed that the common GR gene haplotype associated with reduced mRNA expression in thymus of three individuals (p=0.0002).
Resumo:
Megasphaera cerevisiae, Pectinatus cerevisiiphilus, Pectinatus frisingensis, Selenomonas lacticifex, Zymophilus paucivorans and Zymophilus raffinosivorans are strictly anaerobic Gram-stain-negative bacteria that are able to spoil beer by producing off-flavours and turbidity. They have only been isolated from the beer production chain. The species are phylogenetically affiliated to the Sporomusa sub-branch in the class "Clostridia". Routine cultivation methods for detection of strictly anaerobic bacteria in breweries are time-consuming and do not allow species identification. The main aim of this study was to utilise DNA-based techniques in order to improve detection and identification of the Sporomusa sub-branch beer-spoilage bacteria and to increase understanding of their biodiversity, evolution and natural sources. Practical PCR-based assays were developed for monitoring of M. cerevisiae, Pectinatus species and the group of Sporomusa sub-branch beer spoilers throughout the beer production process. The developed assays reliably differentiated the target bacteria from other brewery-related microbes. The contaminant detection in process samples (10 1,000 cfu/ml) could be accomplished in 2 8 h. Low levels of viable cells in finished beer (≤10 cfu/100 ml) were usually detected after 1 3 d culture enrichment. Time saving compared to cultivation methods was up to 6 d. Based on a polyphasic approach, this study revealed the existence of three new anaerobic spoilage species in the beer production chain, i.e. Megasphaera paucivorans, Megasphaera sueciensis and Pectinatus haikarae. The description of these species enabled establishment of phenotypic and DNA-based methods for their detection and identification. The 16S rRNA gene based phylogenetic analysis of the Sporomusa sub-branch showed that the genus Selenomonas originates from several ancestors and will require reclassification. Moreover, Z. paucivorans and Z. raffinosivorans were found to be in fact members of the genus Propionispira. This relationship implies that they were carried to breweries along with plant material. The brewery-related Megasphaera species formed a distinct sub-group that did not include any sequences from other sources, suggesting that M. cerevisiae, M. paucivorans and M. sueciensis may be uniquely adapted to the brewery ecosystem. M. cerevisiae was also shown to exhibit remarkable resistance against many brewery-related stress conditions. This may partly explain why it is a brewery contaminant. This study showed that DNA-based techniques provide useful tools for obtaining more rapid and specific information about the presence and identity of the strictly anaerobic spoilage bacteria in the beer production chain than is possible using cultivation methods. This should ensure financial benefits to the industry and better product quality to customers. In addition, DNA-based analyses provided new insight into the biodiversity as well as natural sources and relations of the Sporomusa sub-branch bacteria. The data can be exploited for taxonomic classification of these bacteria and for surveillance and control of contaminations.
Resumo:
Free and open source software development is an alternative to traditional software engineering as an approach to the development of complex software systems. It is a way of developing software based on geographically distributed teams of volunteers without apparent central plan or traditional mechanisms of coordination. The purpose of this thesis is to summarize the current knowledge about free and open source software development and explore the ways on which further understanding on it could be gained. The results of research on the field as well as the research methods are introduced and discussed. Also adapting software process metrics to the context of free and open source software development is illustrated and the possibilities to utilize them as tools to validate other research are discussed.
Resumo:
To obtain data on phytoplankton dynamics with improved spatial and temporal resolution, and at reduced cost, traditional phytoplankton monitoring methods have been supplemented with optical approaches. In this thesis, I have explored various fluorescence-based techniques for detection of phytoplankton abundance, taxonomy and physiology in the Baltic Sea. In algal cultures used in this thesis, the availability of nitrogen and light conditions caused changes in pigmentation, and consequently in light absorption and fluorescence properties of cells. In the Baltic Sea, physical environmental factors (e.g. mixing depth, irradiance and temperature) and related seasonal succession in the phytoplankton community explained a large part of the seasonal variability in the magnitude and shape of Chlorophyll a (Chla)-specific absorption. The variability in Chla-specific fluorescence was related to the abundance of cyanobacteria, the size structure of the phytoplankton community, and absorption characteristics of phytoplankton. Cyanobacteria show very low Chla-specific fluorescence. In the presence of eukaryotic species, Chla fluorescence describes poorly cyanobacteria. During cyanobacterial bloom in the Baltic Sea, phycocyanin fluorescence explained large part of the variability in Chla concentrations. Thus, both Chla and phycocyanin fluorescence were required to predict Chla concentration. Phycobilins are major light harvesting pigments for cyanobacteria. In the open Baltic Sea, small picoplanktonic cyanobacteria were the main source of phycoerythrin fluorescence and absorption signal. Large filamentous cyanobacteria, forming harmful blooms, were the main source of the phycocyanin fluorescence signal and typically their biomass and phycocyanin fluorescence were linearly related. Using phycocyanin fluorescence, dynamics of cyanobacterial blooms can be detected at high spatial and seasonal resolution not possible with other methods. Various taxonomic phytoplankton pigment groups can be separated by spectral fluorescence. I compared multivariate calibration methods for the retrieval of phytoplankton biomass in different taxonomic groups. Partial least squares regression method gave the closest predictions for all taxonomic groups, and the accuracy was adequate for phytoplankton bloom detection. Variable fluorescence has been proposed as a tool to study the physiological state of phytoplankton. My results from the Baltic Sea emphasize that variable fluorescence alone cannot be used to detect nutrient limitation of phytoplankton. However, when combined with experiments with active nutrient manipulation, and other nutrient limitation indices, variable fluorescence provided valuable information on the physiological responses of the phytoplankton community. This thesis found a severe limitation of a commercial fast repetition rate fluorometer, which couldn t detect the variable fluorescence of phycoerythrin-lacking cyanobacteria. For these species, the Photosystem II absorption of blue light is very low, and fluorometer excitation light did not saturate Photosystem II during a measurement. This thesis encourages the use of various in vivo fluorescence methods for the detection of bulk phytoplankton biomass, biomass of cyanobacteria, chemotaxonomy of phytoplankton community, and phytoplankton physiology. Fluorescence methods can support traditional phytoplankton monitoring by providing continuous measurements of phytoplankton, and thereby strengthen the understanding of the links between biological, chemical and physical processes in aquatic ecosystems.
Resumo:
Atmospheric aerosol particles have a strong impact on the global climate. A deep understanding of the physical and chemical processes affecting the atmospheric aerosol climate system is crucial in order to describe those processes properly in global climate models. Besides the climatic effects, aerosol particles can deteriorate e.g. visibility and human health. Nucleation is a fundamental step in atmospheric new particle formation. However, details of the atmospheric nucleation mechanisms have remained unresolved. The main reason for that has been the non-existence of instruments capable of measuring neutral newly formed particles in the size range below 3 nm in diameter. This thesis aims to extend the detectable particle size range towards close-to-molecular sizes (~1nm) of freshly nucleated clusters, and by direct measurement obtain the concentrations of sub-3 nm particles in atmospheric environment and in well defined laboratory conditions. In the work presented in this thesis, new methods and instruments for the sub-3 nm particle detection were developed and tested. The selected approach comprises four different condensation based techniques and one electrical detection scheme. All of them are capable to detect particles with diameters well below 3 nm, some even down to ~1 nm. The developed techniques and instruments were deployed in the field measurements as well as in laboratory nucleation experiments. Ambient air studies showed that in a boreal forest environment a persistent population of 1-2 nm particles or clusters exists. The observation was done using 4 different instruments showing a consistent capability for the direct measurement of the atmospheric nucleation. The results from the laboratory experiments showed that sulphuric acid is a key species in the atmospheric nucleation. The mismatch between the earlier laboratory data and ambient observations on the dependency of nucleation rate on sulphuric acid concentration was explained. The reason was shown to be associated in the inefficient growth of the nucleated clusters and in the insufficient detection efficiency of particle counters used in the previous experiments. Even though the exact molecular steps of nucleation still remain an open question, the instrumental techniques developed in this work as well as their application in laboratory and ambient studies opened a new view into atmospheric nucleation and prepared the way for investigating the nucleation processes with more suitable tools.
Resumo:
In this thesis we deal with the concept of risk. The objective is to bring together and conclude on some normative information regarding quantitative portfolio management and risk assessment. The first essay concentrates on return dependency. We propose an algorithm for classifying markets into rising and falling. Given the algorithm, we derive a statistic: the Trend Switch Probability, for detection of long-term return dependency in the first moment. The empirical results suggest that the Trend Switch Probability is robust over various volatility specifications. The serial dependency in bear and bull markets behaves however differently. It is strongly positive in rising market whereas in bear markets it is closer to a random walk. Realized volatility, a technique for estimating volatility from high frequency data, is investigated in essays two and three. In the second essay we find, when measuring realized variance on a set of German stocks, that the second moment dependency structure is highly unstable and changes randomly. Results also suggest that volatility is non-stationary from time to time. In the third essay we examine the impact from market microstructure on the error between estimated realized volatility and the volatility of the underlying process. With simulation-based techniques we show that autocorrelation in returns leads to biased variance estimates and that lower sampling frequency and non-constant volatility increases the error variation between the estimated variance and the variance of the underlying process. From these essays we can conclude that volatility is not easily estimated, even from high frequency data. It is neither very well behaved in terms of stability nor dependency over time. Based on these observations, we would recommend the use of simple, transparent methods that are likely to be more robust over differing volatility regimes than models with a complex parameter universe. In analyzing long-term return dependency in the first moment we find that the Trend Switch Probability is a robust estimator. This is an interesting area for further research, with important implications for active asset allocation.
Resumo:
This thesis which consists of an introduction and four peer-reviewed original publications studies the problems of haplotype inference (haplotyping) and local alignment significance. The problems studied here belong to the broad area of bioinformatics and computational biology. The presented solutions are computationally fast and accurate, which makes them practical in high-throughput sequence data analysis. Haplotype inference is a computational problem where the goal is to estimate haplotypes from a sample of genotypes as accurately as possible. This problem is important as the direct measurement of haplotypes is difficult, whereas the genotypes are easier to quantify. Haplotypes are the key-players when studying for example the genetic causes of diseases. In this thesis, three methods are presented for the haplotype inference problem referred to as HaploParser, HIT, and BACH. HaploParser is based on a combinatorial mosaic model and hierarchical parsing that together mimic recombinations and point-mutations in a biologically plausible way. In this mosaic model, the current population is assumed to be evolved from a small founder population. Thus, the haplotypes of the current population are recombinations of the (implicit) founder haplotypes with some point--mutations. HIT (Haplotype Inference Technique) uses a hidden Markov model for haplotypes and efficient algorithms are presented to learn this model from genotype data. The model structure of HIT is analogous to the mosaic model of HaploParser with founder haplotypes. Therefore, it can be seen as a probabilistic model of recombinations and point-mutations. BACH (Bayesian Context-based Haplotyping) utilizes a context tree weighting algorithm to efficiently sum over all variable-length Markov chains to evaluate the posterior probability of a haplotype configuration. Algorithms are presented that find haplotype configurations with high posterior probability. BACH is the most accurate method presented in this thesis and has comparable performance to the best available software for haplotype inference. Local alignment significance is a computational problem where one is interested in whether the local similarities in two sequences are due to the fact that the sequences are related or just by chance. Similarity of sequences is measured by their best local alignment score and from that, a p-value is computed. This p-value is the probability of picking two sequences from the null model that have as good or better best local alignment score. Local alignment significance is used routinely for example in homology searches. In this thesis, a general framework is sketched that allows one to compute a tight upper bound for the p-value of a local pairwise alignment score. Unlike the previous methods, the presented framework is not affeced by so-called edge-effects and can handle gaps (deletions and insertions) without troublesome sampling and curve fitting.
Resumo:
The history of software development in a somewhat systematical way has been performed for half a century. Despite this time period, serious failures in software development projects still occur. The pertinent mission of software project management is to continuously achieve more and more successful projects. The application of agile software methods and more recently the integration of Lean practices contribute to this trend of continuous improvement in the software industry. One such area warranting proper empirical evidence is the operational efficiency of projects. In the field of software development, Kanban as a process management method has gained momentum recently, mostly due to its linkages to Lean thinking. However, only a few empirical studies investigate the impacts of Kanban on projects in that particular area. The aim of this doctoral thesis is to improve the understanding of how Kanban impacts on software projects. The research is carried out in the area of Lean thinking, which contains a variety of concepts including Kanban. This article-type thesis conducts a set of case studies expanded with the research strategy of quasi-controlled experiment. The data-gathering techniques of interviews, questionnaires, and different types of observations are used to study the case projects, and thereby to understand the impacts of Kanban on software development projects. The research papers of the thesis are refereed, international journal and conference publications. The results highlight new findings regarding the application of Kanban in the software context. The key findings of the thesis suggest that Kanban is applicable to software development. Despite its several benefits reported in this thesis, the empirical evidence implies that Kanban is not all-encompassing but requires additional practices to keep development projects performing appropriately. Implications for research are given, as well. In addition to these findings, the thesis contributes in the area of plan-driven software development by suggesting implications both for research and practitioners. As a conclusion, Kanban can benefit software development projects but additional practices would increase its potential for the projects.