112 results for Interview Methods
Abstract:
Metabolism is the cellular subsystem responsible for generation of energy from nutrients and production of building blocks for larger macromolecules. Computational and statistical modeling of metabolism is vital to many disciplines including bioengineering, the study of diseases, drug target identification, and understanding the evolution of metabolism. In this thesis, we propose efficient computational methods for metabolic modeling. The techniques presented are targeted particularly at the analysis of large metabolic models encompassing the whole metabolism of one or several organisms. We concentrate on three major themes of metabolic modeling: metabolic pathway analysis, metabolic reconstruction and the study of evolution of metabolism. In the first part of this thesis, we study metabolic pathway analysis. We propose a novel modeling framework called gapless modeling to study biochemically viable metabolic networks and pathways. In addition, we investigate the utilization of atom-level information on metabolism to improve the quality of pathway analyses. We describe efficient algorithms for discovering both gapless and atom-level metabolic pathways, and conduct experiments with large-scale metabolic networks. The presented gapless approach offers a compromise in terms of complexity and feasibility between the previous graph-theoretic and stoichiometric approaches to metabolic modeling. Gapless pathway analysis shows that microbial metabolic networks are not as robust to random damage as suggested by previous studies. Furthermore the amino acid biosynthesis pathways of the fungal species Trichoderma reesei discovered from atom-level data are shown to closely correspond to those of Saccharomyces cerevisiae. In the second part, we propose computational methods for metabolic reconstruction in the gapless modeling framework. We study the task of reconstructing a metabolic network that does not suffer from connectivity problems. 
Such problems often limit the usability of reconstructed models, and typically require a significant amount of manual postprocessing. We formulate gapless metabolic reconstruction as an optimization problem and propose an efficient divide-and-conquer strategy to solve it with real-world instances. We also describe computational techniques for solving problems stemming from ambiguities in metabolite naming. These techniques have been implemented in ReMatch, a web-based software tool intended for reconstruction of models for 13C metabolic flux analysis. In the third part, we extend our scope from single to multiple metabolic networks and propose an algorithm for inferring gapless metabolic networks of ancestral species from phylogenetic data. Experimenting with 16 fungal species, we show that the method is able to generate results that are easily interpretable and that provide hypotheses about the evolution of metabolism.
Abstract:
Large-scale chromosome rearrangements such as copy number variants (CNVs) and inversions encompass a considerable proportion of the genetic variation between human individuals. In a number of cases, they have been closely linked with various inheritable diseases. Single-nucleotide polymorphisms (SNPs) are another large part of the genetic variance between individuals. They are also typically abundant, and measuring them is straightforward and cheap. This thesis presents computational means of using SNPs to detect the presence of inversions and deletions, a particular variety of CNVs. Technically, the inversion-detection algorithm detects the suppressed recombination rate between inverted and non-inverted haplotype populations, whereas the deletion-detection algorithm uses the EM algorithm to estimate the haplotype frequencies of a window with and without a deletion haplotype. As a contribution to population biology, a coalescent simulator for simulating inversion polymorphisms has been developed. Coalescent simulation is a backward-in-time method of modelling population ancestry. The simulator also models multiple crossovers by using the Counting model as the chiasma interference model. Finally, this thesis includes an experimental section. The aforementioned methods were tested on synthetic data to evaluate their power and specificity. They were also applied to the HapMap Phase II and Phase III data sets, yielding a number of candidates for previously unknown inversions and deletions and also correctly detecting known rearrangements of both kinds.
Abstract:
In this thesis, we develop theory and methods for computational data analysis. The problems in data analysis are approached from three perspectives: statistical learning theory, the Bayesian framework, and the information-theoretic minimum description length (MDL) principle. Contributions in statistical learning theory address the possibility of generalization to unseen cases, and regression analysis with partially observed data with an application to mobile device positioning. In the second part of the thesis, we discuss so-called Bayesian network classifiers, and show that they are closely related to logistic regression models. In the final part, we apply the MDL principle to tracing the history of old manuscripts, and to noise reduction in digital signals.
Abstract:
In this thesis we present and evaluate two pattern matching based methods for answer extraction in textual question answering systems. A textual question answering system is a system that seeks answers to natural language questions from unstructured text. Textual question answering systems are an important research problem because, as the amount of natural language text in digital format keeps growing, the need for novel methods for pinpointing important knowledge in vast textual databases becomes more and more urgent. We concentrate on developing methods for the automatic creation of answer extraction patterns. A new type of extraction pattern is also developed. The pattern matching based approach chosen is interesting because of its language and application independence. The answer extraction methods are developed in the framework of our own question answering system. Publicly available datasets in English are used as training and evaluation data for the methods. The techniques developed are based on the well-known methods of sequence alignment and hierarchical clustering. The similarity metric used is based on edit distance. The main conclusion of the research is that answer extraction patterns can be used in the answer extraction module of a question answering system when they consist of the most important words of the question and of the following information extracted from the answer context: plain words, part-of-speech tags, punctuation marks and capitalization patterns. This type of pattern and the two new methods for generating answer extraction patterns provide average results when compared to those produced by other systems using the same dataset. However, most answer extraction methods in the question answering systems tested with the same dataset are both hand-crafted and based on a system-specific and fine-grained question classification. The new methods developed in this thesis require no manual creation of answer extraction patterns. 
As a source of knowledge, they require a dataset of sample questions and answers, as well as a set of text documents that contain answers to most of the questions. The question classification used in the training data is a standard one and provided already in the publicly available data.
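The abstract above names edit distance as the basis of its similarity metric but does not say which variant is used. As a minimal sketch, assuming the plain Levenshtein distance with unit costs, the following Python function works on any two sequences, including lists of words or part-of-speech tags. The function name and the example inputs are illustrative and do not come from the thesis.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or token lists).

    Unit cost for insertion, deletion and substitution; this cost model is
    an assumption, since the abstract does not specify one.
    """
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


# Hypothetical token sequences of the kind a pattern might contain.
question_a = ["who", "wrote", "hamlet"]
question_b = ["who", "wrote", "the", "play", "hamlet"]
distance = edit_distance(question_a, question_b)
```

Working on token lists rather than characters is one plausible choice here, because the patterns described above are built from words, tags and punctuation marks.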
Abstract:
The Minimum Description Length (MDL) principle is a general, well-founded theoretical formalization of statistical modeling. The most important notion of MDL is the stochastic complexity, which can be interpreted as the shortest description length of a given sample of data relative to a model class. The exact definition of the stochastic complexity has gone through several evolutionary steps. The latest instantiation is based on the so-called Normalized Maximum Likelihood (NML) distribution, which has been shown to possess several important theoretical properties. However, applications of this modern version of MDL have been quite rare because of computational complexity problems: for discrete data, the definition of NML involves an exponential sum, and in the case of continuous data, a multi-dimensional integral that is usually infeasible to evaluate or even approximate accurately. In this doctoral dissertation, we present mathematical techniques for computing NML efficiently for some model families involving discrete data. We also show how these techniques can be used to apply MDL in two practical applications: histogram density estimation and clustering of multi-dimensional data.
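To make the "exponential sum" concrete, here is a minimal sketch for the simplest discrete case, the Bernoulli model class. It is not one of the dissertation's algorithms; the function names are illustrative. The NML normalizer sums the maximized likelihood over all 2^n binary sequences of length n, and because that likelihood depends only on the number of ones, the same value can be obtained from a sum with n + 1 terms.

```python
import math
from itertools import product


def max_likelihood(k, n):
    """Maximized Bernoulli likelihood of a length-n sequence with k ones."""
    theta = k / n  # maximum likelihood estimate
    return theta ** k * (1 - theta) ** (n - k)  # Python evaluates 0.0 ** 0 as 1.0


def normalizer_brute_force(n):
    """NML normalizer as the literal sum over all 2**n binary sequences."""
    return sum(max_likelihood(sum(seq), n) for seq in product((0, 1), repeat=n))


def normalizer_by_counts(n):
    """Same normalizer, grouping sequences by their number of ones."""
    return sum(math.comb(n, k) * max_likelihood(k, n) for k in range(n + 1))


def stochastic_complexity(seq):
    """Code length in bits of a non-empty binary sequence under Bernoulli NML."""
    n, k = len(seq), sum(seq)
    return -math.log2(max_likelihood(k, n)) + math.log2(normalizer_by_counts(n))


bits = stochastic_complexity([1, 0, 1, 1, 0, 1, 1, 1])
```

The brute-force version becomes impractical quickly as n grows, which illustrates why efficient computation of NML for richer model families is a research question in its own right.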
Abstract:
Matrix decompositions, where a given matrix is represented as a product of two other matrices, are regularly used in data mining. Most matrix decompositions have their roots in linear algebra, but the needs of data mining are not always those of linear algebra. In data mining one needs to have results that are interpretable -- and what is considered interpretable in data mining can be very different from what is considered interpretable in linear algebra. The purpose of this thesis is to study matrix decompositions that directly address the issue of interpretability. An example is a decomposition of binary matrices where the factor matrices are assumed to be binary and the matrix multiplication is Boolean. The restriction to binary factor matrices increases interpretability -- factor matrices are of the same type as the original matrix -- and allows the use of Boolean matrix multiplication, which is often more intuitive than normal matrix multiplication with binary matrices. Several other decomposition methods are also described, and the computational complexity of computing them is studied together with the hardness of approximating the related optimization problems. Based on these studies, algorithms for constructing the decompositions are proposed. Constructing the decompositions turns out to be computationally hard, and the proposed algorithms are mostly based on various heuristics. Nevertheless, the algorithms are shown to be capable of finding good results in empirical experiments conducted with both synthetic and real-world data.
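The difference between Boolean and normal matrix multiplication on binary matrices can be shown with a small sketch. The code below is only an illustration of the two products, not one of the thesis's decomposition algorithms, and the matrices B and C are made-up examples.

```python
def boolean_product(b, c):
    """Boolean product: entry (i, j) is 1 if b[i][k] and c[k][j] are both 1 for some k."""
    inner, cols = len(c), len(c[0])
    return [
        [int(any(row[k] and c[k][j] for k in range(inner))) for j in range(cols)]
        for row in b
    ]


def normal_product(b, c):
    """Ordinary matrix product over the integers, for comparison."""
    inner, cols = len(c), len(c[0])
    return [
        [sum(row[k] * c[k][j] for k in range(inner)) for j in range(cols)]
        for row in b
    ]


# Hypothetical binary factor matrices: 3 rows x 2 factors, 2 factors x 3 columns.
B = [[1, 1],
     [0, 1],
     [1, 0]]
C = [[1, 1, 0],
     [0, 1, 1]]

boolean_result = boolean_product(B, C)
normal_result = normal_product(B, C)
```

In this example the first row of B uses both factors, and both factors cover the middle column, so the normal product contains a 2 there while the Boolean product stays binary. That is the sense in which the Boolean product keeps the result of the same type as the original binary matrix.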
Abstract:
This thesis presents methods for locating and analyzing cis-regulatory DNA elements involved with the regulation of gene expression in multicellular organisms. The regulation of gene expression is carried out by the combined effort of several transcription factor proteins collectively binding the DNA on the cis-regulatory elements. Only sparse knowledge of the 'genetic code' of these elements exists today. An automatic tool for discovery of putative cis-regulatory elements could help their experimental analysis, which would result in a more detailed view of the cis-regulatory element structure and function. We have developed a computational model for the evolutionary conservation of cis-regulatory elements. The elements are modeled as evolutionarily conserved clusters of sequence-specific transcription factor binding sites. We give an efficient dynamic programming algorithm that locates the putative cis-regulatory elements and scores them according to the conservation model. A notable proportion of the high-scoring DNA sequences show transcriptional enhancer activity in transgenic mouse embryos. The conservation model includes four parameters whose optimal values are estimated with simulated annealing. With good parameter values the model discriminates well between the DNA sequences with evolutionarily conserved cis-regulatory elements and the DNA sequences that have evolved neutrally. In further inquiry, the set of highest scoring putative cis-regulatory elements were found to be sensitive to small variations in the parameter values. The statistical significance of the putative cis-regulatory elements is estimated with the Two Component Extreme Value Distribution. The p-values grade the conservation of the cis-regulatory elements above the neutral expectation. The parameter values for the distribution are estimated by simulating the neutral DNA evolution. 
The conservation of the transcription factor binding sites can be used in the upstream analysis of regulatory interactions. This approach may provide mechanistic insight into the transcription level data from, e.g., microarray experiments. Here we give a method to predict shared transcriptional regulators for a set of co-expressed genes. The EEL (Enhancer Element Locator) software implements the method for locating putative cis-regulatory elements. The software facilitates both interactive use and distributed batch processing. We have used it to analyze the non-coding regions around all human genes with respect to the orthologous regions in various other species including mouse. The data from these genome-wide analyses are stored in a relational database, which is used by the publicly available web services for upstream analysis and visualization of the putative cis-regulatory elements in the human genome.
Abstract:
Ubiquitous computing is about making computers and computerized artefacts a pervasive part of our everyday lives, bringing more and more activities into the realm of information. The computationalization and informationalization of everyday activities increases not only our reach, efficiency and capabilities but also the amount and kinds of data gathered about us and our activities. In this thesis, I explore how information systems can be constructed so that they handle this personal data in a reasonable manner. The thesis provides two kinds of results: on one hand, tools and methods for both the construction and the evaluation of ubiquitous and mobile systems; on the other hand, an evaluation of the privacy aspects of a ubiquitous social awareness system. The work emphasises real-world experiments as the most important way to study privacy. Additionally, the state of current information systems as regards data protection is studied. The tools and methods in this thesis consist of three distinct contributions. An algorithm for positioning in cellular networks is proposed that does not require the location information to be revealed beyond the user's terminal. A prototyping platform for the creation of context-aware ubiquitous applications called ContextPhone is described and released as open source. Finally, a set of methodological findings for the use of smartphones in social scientific field research is reported. A central contribution of this thesis is the set of pragmatic tools that allow other researchers to carry out experiments. The evaluation of the ubiquitous social awareness application ContextContacts covers both the usage of the system in general and an analysis of privacy implications. The usage of the system is analyzed in the light of how users make inferences about others based on real-time contextual cues mediated by the system, drawing on several long-term field studies. 
The analysis of privacy implications draws together the social psychological theory of self-presentation and research in privacy for ubiquitous computing, deriving a set of design guidelines for such systems. The main findings from these studies can be summarized as follows: The fact that ubiquitous computing systems gather more data about users can be used not only to study the use of such systems in an effort to create better systems but also, more generally, to study previously unstudied phenomena, such as the dynamic change of social networks. Systems that let people create new ways of presenting themselves to others can be fun for the users, but self-presentation requires several thoughtful design decisions that allow the manipulation of the image mediated by the system. Finally, the growing amount of computational resources available to the users can be used to allow them to use the data themselves, rather than just being passive subjects of data gathering.
Abstract:
Free and open source software development is an alternative to traditional software engineering as an approach to the development of complex software systems. It is a way of developing software based on geographically distributed teams of volunteers without an apparent central plan or traditional mechanisms of coordination. The purpose of this thesis is to summarize the current knowledge about free and open source software development and to explore the ways in which further understanding of it could be gained. The results of research in the field as well as the research methods are introduced and discussed. The adaptation of software process metrics to the context of free and open source software development is also illustrated, and the possibilities of using them as tools to validate other research are discussed.
Abstract:
This study explores the EMU stand taken by the major Finnish political parties from 1994 to 1999. The starting point is the empirical evidence showing that party responses to European integration are shaped by a mix of national and cross-national factors, with national factors having more explanatory value. The study is the first to produce evidence that classified party documents such as protocols, manifestos and authoritative policy summaries may describe the EMU policy emphasis. In fact, as the literature review demonstrates, it has been unclear so far what kind of stand the three major Finnish political parties took during 1994–1999. Consequently, this study makes a substantive contribution to understanding the factors that shaped EMU party policies, and eventually, the national EMU policy during the 1990s. The research questions addressed are the following: What are the main factors that shaped partisan standpoints on EMU during 1994–1999? To what extent did the policy debate and themes change in the political parties? How far were the policies of the Social Democratic Party, the Centre Party and the National Coalition Party shaped by factors unique to their own national contexts? Furthermore, to what extent were they determined by cross-national influences from abroad, and especially from countries with which Finland has a special relationship, such as Sweden? The theoretical background of the study is in the area of party politics and approaches to EU policies, and party change, developed mainly by Kevin Featherstone, Peter Mair and Richard Katz. At the same time, it puts forward generic hypotheses that help to explain party standpoints on EMU. It incorporates a large quantity of classified new material based on primary research through content analysis and interviews. Quantitative and qualitative methods are used sequentially in order to overcome possible limitations. Established content-analysis techniques improve the reliability of the data. 
The coding frame is based on the salience theory of party competition. Interviews with eight party leaders and one independent expert civil servant provided additional insights and improved the validity of the data. Public-opinion surveys and media coverage are also used to complete the research path. Four major conclusions are drawn from the research findings. First, the quantitative and the interview data reveal the importance of the internal influences within the parties that most noticeably shaped their EMU policies during the 1990s. In contrast, international events play a minor role. The most striking feature turned out to be the strong emphasis by all of the parties on economic goals. However, it is important to note that the factors manifest differences between economic, democratic and international issues across the three major parties. Secondly, it seems that the parties have transformed into centralised and professional organisations in terms of their EMU policy-making. The weight and direction of party EMU strategy rest with the leadership and a few administrative elites. This could imply changes in their institutional environment. Eventually, parties may appear generally less differentiated and more standardised in their policy-making. Thirdly, the case of the Social Democratic Party shows that traditional organisational links continue to exist between the left and the trade unions in terms of their EMU policy-making. Hence, it could be that the parties have not yet moved beyond their conventional affiliate organisations. Fourthly, parties tend to neglect citizen opinion and demands with regard to EMU, which could imply conflict between the changes in their strategic environment. They seem to give more attention to the demands of political competition (party-party relationships) than to public attitudes (party-voter relationships), which would imply that they have had to learn to be more flexible and responsive. 
Finally, three suggestions for institutional reform are offered, which could contribute to the emergence of legitimised policy-making: measures to bring more party members and voter groups into the policy-making process; measures to adopt new technologies in order to open up the policy-formation process in the early phase; and measures to involve all interest groups in the policy-making process.
Abstract:
During the last 10-15 years interest in mouse behavioural analysis has evolved considerably. The driving force is development in molecular biological techniques that allow manipulation of the mouse genome by changing the expression of genes. Therefore, with some limitations it is possible to study how genes participate in regulation of physiological functions and to create models explaining genetic contribution to various pathological conditions. The first aim of our study was to establish a framework for behavioural phenotyping of genetically modified mice. We established a comprehensive battery of tests for the initial screening of mutant mice. These included tests for exploratory and locomotor activity, emotional behaviour, sensory functions, and cognitive performance. Our interest was in the behavioural patterns of common background strains used for genetic manipulations in mice. Additionally, we studied the behavioural effect of sex differences, test history, and individual housing. Our findings highlight the importance of careful consideration of genetic background for analysis of mutant mice. It was evident that some backgrounds may mask or modify the behavioural phenotype of mutants and thereby lead to false positive or negative findings. Moreover, there is no universal strain that is equally suitable for all tests, and using different backgrounds allows one to address possible phenotype modifying factors. We discovered that previous experience affected performance in several tasks. The most sensitive traits were exploratory and emotional behaviour, as well as motor and nociceptive functions. Therefore, it may be essential to repeat some of the tests in naïve animals to confirm the phenotype. Social isolation for a long time period had strong effects on exploratory behaviour, but also on learning and memory. 
All experiments revealed significant interactions between strain and environmental factors (test history or housing condition) indicating genotype-dependent effects of environmental manipulations. Several mutant line analyses utilize this information. For example, we studied mice overexpressing as well as those lacking extracellular matrix protein heparin-binding growth-associated molecule (HB-GAM), and mice lacking N-syndecan (a receptor for HB-GAM). All mutant mice appeared to be fertile and healthy, without any apparent neurological or sensory defects. The lack of HB-GAM and N-syndecan, however, significantly reduced the learning capacity of the mice. On the other hand, overexpression of HB-GAM resulted in facilitated learning. Moreover, HB-GAM knockout mice displayed higher anxiety-like behaviour, whereas anxiety was reduced in HB-GAM overexpressing mice. Changes in hippocampal plasticity accompanied the behavioural phenotypes. We conclude that HB-GAM and N-syndecan are involved in the modulation of synaptic plasticity in hippocampus and play a role in regulation of anxiety- and learning-related behaviour.
Abstract:
Children and young people as environmental citizens: the environmental education perspective on participation. This doctoral thesis examines the participation of children and young people in developing their own environment at school, as a part of environmental education. The aim of the research is to assess and consider children and young people's environmentally responsible participation and its effectiveness in relation to the participants' own learning and the end results of the participation. The research combines the perspectives of environmental education and citizenship education through the concept of environmental citizenship. Environmental education, which enhances environmental citizenship, offers children and young people the possibility to be active citizens and learn about citizenship in their own lives by taking action themselves. The research is made up of two parts which complement each other. The first part consists of an action research study carried out in the Joensuu Lyseo Upper Secondary School, where an environmental education course with a traffic-related theme was planned, developed and evaluated. The second part is made up of an interview survey carried out in Helsinki. In the survey, actors from schools and various city offices who were involved in development projects of school environments were interviewed. According to the research results, all-round cooperation and more open relations with those outside of the school environment are important ways to support environmental citizenship in schools. Thus, environmentally responsible participation offers a chance to learn the competence that an environmental citizen needs: the knowledge, skills and willingness to act that have not been successfully taught through traditional school education. The research introduces a model of environmentally responsible participation as a learning process, in which learning is studied through the development of competence, self-empowerment and social empowerment. 
The model makes the context of environmental education visible and puts emphasis on reflection in the learning process. A central factor in children and young people's self-empowerment is the sense of being heard and taken into consideration. At the moment children and young people's rights to participate are strong, due to legislation, school curricula, and several national and international agreements. Despite this, involving them in developing their own immediate surroundings has not become a part of schools' and planning organisations' daily life and established methods. Reasons for this situation can be found in the lack of regard and resources for these matters, in the complex nature of planning and a long time frame, and the problems of ownership and of reaching each other. Central to overcoming these obstacles are a gradual change in conduct and mentalities and the strengthening of teachers' and officials' competence. Children and young people need different ways and methods of varying levels of involvement, structures and arenas which enable participation and in which environmental citizenship can be realized. Key words: environmental citizenship, environmental education, citizenship education, children and young people's participation, social learning, self-empowerment, social empowerment, school, community planning