165 results for Thematic Text Analysis
Abstract:
Bacteria play an important role in many ecological systems. The molecular characterization of bacteria using either cultivation-dependent or cultivation-independent methods reveals the large scale of bacterial diversity in natural communities, and the vastness of subpopulations within a species or genus. Understanding how bacterial diversity varies across different environments and also within populations should provide insights into many important questions of bacterial evolution and population dynamics. This thesis presents novel statistical methods for analyzing bacterial diversity using widely employed molecular fingerprinting techniques. The first objective of this thesis was to develop Bayesian clustering models to identify bacterial population structures. Bacterial isolates were identified using multilocus sequence typing (MLST), and Bayesian clustering models were used to explore the evolutionary relationships among isolates. Our method involves the inference of genetic population structures via an unsupervised clustering framework where the dependence between loci is represented using graphical models. The population dynamics that generate such a population stratification were investigated using a stochastic model, in which homologous recombination between subpopulations can be quantified within a gene flow network. The second part of the thesis focuses on cluster analysis of community compositional data produced by two different cultivation-independent analyses: terminal restriction fragment length polymorphism (T-RFLP) analysis, and fatty acid methyl ester (FAME) analysis. The cluster analysis aims to group bacterial communities that are similar in composition, which is an important step for understanding the overall influences of environmental and ecological perturbations on bacterial diversity.
A common feature of T-RFLP and FAME data is zero-inflation, which indicates that the observation of a zero value is much more frequent than would be expected, for example, from a Poisson distribution in the discrete case, or a Gaussian distribution in the continuous case. We provided two strategies for modeling zero-inflation in the clustering framework, which were validated on both synthetic and empirical complex data sets. We show in the thesis that our model, which takes into account dependencies between loci in MLST data, can produce better clustering results than methods that assume independent loci. Furthermore, computer algorithms that are efficient in analyzing large-scale data were adopted to meet the increasing computational need. Our method for detecting homologous recombination in subpopulations may provide a theoretical criterion for defining bacterial species. The clustering of bacterial community data, including T-RFLP and FAME, provides an initial effort toward discovering the evolutionary dynamics that structure and maintain bacterial diversity in the natural environment.
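To make the notion of zero-inflation concrete, the following minimal sketch compares a plain Poisson distribution with the textbook zero-inflated Poisson mixture. This is an illustration only: the abstract does not say which two modeling strategies the thesis uses, and the function names and the parameter values (`lam`, `pi_zero`) are assumptions chosen for the example.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of observing count k under Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k: int, pi_zero: float, lam: float) -> float:
    """Zero-inflated Poisson: with probability pi_zero the observation is a
    structural zero; otherwise it is drawn from Poisson(lam)."""
    base = (1.0 - pi_zero) * poisson_pmf(k, lam)
    return pi_zero + base if k == 0 else base


if __name__ == "__main__":
    lam, pi_zero = 3.0, 0.4  # hypothetical values, not taken from the thesis
    print("P(0) under Poisson:      ", poisson_pmf(0, lam))
    print("P(0) under zero-inflated:", zip_pmf(0, pi_zero, lam))
```

The mixture assigns far more mass to zero than the Poisson component alone, which is the pattern the abstract describes for T-RFLP and FAME data.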
Abstract:
Metabolism is the cellular subsystem responsible for generation of energy from nutrients and production of building blocks for larger macromolecules. Computational and statistical modeling of metabolism is vital to many disciplines including bioengineering, the study of diseases, drug target identification, and understanding the evolution of metabolism. In this thesis, we propose efficient computational methods for metabolic modeling. The techniques presented are targeted particularly at the analysis of large metabolic models encompassing the whole metabolism of one or several organisms. We concentrate on three major themes of metabolic modeling: metabolic pathway analysis, metabolic reconstruction and the study of evolution of metabolism. In the first part of this thesis, we study metabolic pathway analysis. We propose a novel modeling framework called gapless modeling to study biochemically viable metabolic networks and pathways. In addition, we investigate the utilization of atom-level information on metabolism to improve the quality of pathway analyses. We describe efficient algorithms for discovering both gapless and atom-level metabolic pathways, and conduct experiments with large-scale metabolic networks. The presented gapless approach offers a compromise in terms of complexity and feasibility between the previous graph-theoretic and stoichiometric approaches to metabolic modeling. Gapless pathway analysis shows that microbial metabolic networks are not as robust to random damage as suggested by previous studies. Furthermore the amino acid biosynthesis pathways of the fungal species Trichoderma reesei discovered from atom-level data are shown to closely correspond to those of Saccharomyces cerevisiae. In the second part, we propose computational methods for metabolic reconstruction in the gapless modeling framework. We study the task of reconstructing a metabolic network that does not suffer from connectivity problems. 
Such problems often limit the usability of reconstructed models, and typically require a significant amount of manual postprocessing. We formulate gapless metabolic reconstruction as an optimization problem and propose an efficient divide-and-conquer strategy to solve it with real-world instances. We also describe computational techniques for solving problems stemming from ambiguities in metabolite naming. These techniques have been implemented in ReMatch, a web-based software tool intended for reconstruction of models for 13C metabolic flux analysis. In the third part, we extend our scope from single to multiple metabolic networks and propose an algorithm for inferring gapless metabolic networks of ancestral species from phylogenetic data. Experimenting with 16 fungal species, we show that the method is able to generate results that are easily interpretable and that provide hypotheses about the evolution of metabolism.
Abstract:
Telecommunications network management is based on huge amounts of data that are continuously collected from elements and devices all around the network. The data is monitored and analysed to provide information for decision making in all operation functions. Knowledge discovery and data mining methods can support fast-paced decision making in network operations. In this thesis, I analyse decision making on different levels of network operations. I identify the requirements that decision making sets for knowledge discovery and data mining tools and methods, and I study the resources that are available to them. I then propose two methods for augmenting and applying frequent sets to support everyday decision making. The proposed methods are Comprehensive Log Compression for log data summarisation and Queryable Log Compression for semantic compression of log data. Finally, I suggest a model for a continuous knowledge discovery process and outline how it can be implemented and integrated into the existing network operations infrastructure.
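As a rough illustration of what "frequent sets" means for log data, the sketch below counts combinations of field=value pairs that recur across log entries. It is a generic brute-force frequent-set count and is not the Comprehensive Log Compression or Queryable Log Compression method, whose details the abstract does not give. The function name, the entry format (a dict per log line), and the sample data are all assumptions.

```python
from collections import Counter
from itertools import combinations


def frequent_sets(entries, min_support, max_size=2):
    """Return every set of (field, value) pairs, up to max_size pairs,
    that occurs in at least min_support log entries."""
    counts = Counter()
    for entry in entries:
        items = sorted(entry.items())  # canonical order so equal sets match
        for size in range(1, max_size + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    return {combo: c for combo, c in counts.items() if c >= min_support}


# Hypothetical firewall-style log entries, one dict per line.
SAMPLE_LOG = [
    {"source": "fw1", "action": "deny", "port": "445"},
    {"source": "fw1", "action": "deny", "port": "139"},
    {"source": "fw1", "action": "deny", "port": "445"},
    {"source": "fw2", "action": "allow", "port": "80"},
]

if __name__ == "__main__":
    for combo, count in frequent_sets(SAMPLE_LOG, min_support=3).items():
        print(count, dict(combo))
```

A recurring set such as `source=fw1, action=deny` could stand in for the many lines it covers in a summary, which is the intuition behind using frequent sets for log summarisation.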
Abstract:
In this thesis, we develop theory and methods for computational data analysis. The problems in data analysis are approached from three perspectives: statistical learning theory, the Bayesian framework, and the information-theoretic minimum description length (MDL) principle. Contributions in statistical learning theory address the possibility of generalization to unseen cases, and regression analysis with partially observed data, with an application to mobile device positioning. In the second part of the thesis, we discuss so-called Bayesian network classifiers and show that they are closely related to logistic regression models. In the final part, we apply the MDL principle to tracing the history of old manuscripts and to noise reduction in digital signals.
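The link between Bayesian network classifiers and logistic regression can be seen in the simplest special case, a naive Bayes classifier with a binary class $y$ and binary features $x_1, \dots, x_n$. The derivation below is a standard textbook illustration and is not taken from the thesis; the symbols $\theta_{i,c}$, $w_0$ and $w_i$ are notation introduced here.

```latex
% Naive Bayes: P(y \mid x) \propto P(y) \prod_i P(x_i \mid y),
% with \theta_{i,c} = P(x_i = 1 \mid y = c).
\begin{align*}
\log \frac{P(y=1 \mid x)}{P(y=0 \mid x)}
  &= \log \frac{P(y=1)}{P(y=0)}
     + \sum_{i=1}^{n} \log \frac{P(x_i \mid y=1)}{P(x_i \mid y=0)} \\
  &= \log \frac{P(y=1)}{P(y=0)}
     + \sum_{i=1}^{n} \left[ x_i \log \frac{\theta_{i,1}}{\theta_{i,0}}
     + (1 - x_i) \log \frac{1-\theta_{i,1}}{1-\theta_{i,0}} \right] \\
  &= w_0 + \sum_{i=1}^{n} w_i x_i ,
\end{align*}
\begin{align*}
w_i &= \log \frac{\theta_{i,1}\,(1-\theta_{i,0})}{\theta_{i,0}\,(1-\theta_{i,1})}, &
w_0 &= \log \frac{P(y=1)}{P(y=0)}
      + \sum_{i=1}^{n} \log \frac{1-\theta_{i,1}}{1-\theta_{i,0}} .
\end{align*}
```

The log-odds are linear in the features, which is exactly the functional form of a logistic regression model; the two differ in how the weights are estimated.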
Abstract:
This thesis studies human gene expression space using high-throughput gene expression data from DNA microarrays. In molecular biology, high-throughput techniques allow numerical measurements of the expression of tens of thousands of genes simultaneously. In a single study, this data is traditionally obtained from a limited number of sample types with a small number of replicates. For organism-wide analysis, this data has been largely unavailable and the global structure of the human transcriptome has remained unknown. This thesis introduces a human transcriptome map of different biological entities and an analysis of its general structure. The map is constructed from gene expression data from the two largest public microarray data repositories, GEO and ArrayExpress. The creation of this map contributed to the development of ArrayExpress by identifying and retrofitting previously unusable and missing data and by improving access to its data. It also contributed to the creation of several new tools for microarray data manipulation and to the establishment of data exchange between GEO and ArrayExpress. The data integration for the global map required the creation of a new large ontology of human cell types, disease states, organism parts and cell lines. The ontology was used in a new text mining and decision tree based method for automatic conversion of human-readable free-text microarray data annotations into a categorised format. Data comparability and the minimisation of the systematic measurement errors that are characteristic of each laboratory in this large cross-laboratory integrated dataset were ensured by computing a range of microarray data quality metrics and excluding incomparable data. The structure of a global map of human gene expression was then explored by principal component analysis and hierarchical clustering, using heuristics and help from another purpose-built sample ontology.
A preface and motivation to the construction and analysis of a global map of human gene expression is given by an analysis of two microarray datasets of human malignant melanoma. The analysis of these sets incorporates an indirect comparison of statistical methods for finding differentially expressed genes and points to the need to study gene expression on a global level.
Abstract:
The metabolism of an organism consists of a network of biochemical reactions that transform small molecules, or metabolites, into others in order to produce energy and building blocks for essential macromolecules. The goal of metabolic flux analysis is to uncover the rates, or the fluxes, of those biochemical reactions. In a steady state, the sum of the fluxes that produce an internal metabolite is equal to the sum of the fluxes that consume the same molecule. Thus the steady state imposes linear balance constraints on the fluxes. In general, the balance constraints imposed by the steady state are not sufficient to uncover all the fluxes of a metabolic network. The fluxes through cycles and alternative pathways between the same source and target metabolites remain unknown. More information about the fluxes can be obtained from isotopic labelling experiments, where a cell population is fed with labelled nutrients, such as glucose that contains 13C atoms. Labels are then transferred by biochemical reactions to other metabolites. The relative abundances of different labelling patterns in internal metabolites depend on the fluxes of the pathways producing them. Thus, the relative abundances of different labelling patterns contain information about the fluxes that cannot be uncovered from the balance constraints derived from the steady state. The field of research that estimates the fluxes using the measured constraints on the relative abundances of different labelling patterns induced by 13C labelled nutrients is called 13C metabolic flux analysis. There are two approaches to 13C metabolic flux analysis. In the optimization approach, a non-linear optimization task is constructed in which candidate fluxes are iteratively generated until they fit the measured abundances of different labelling patterns.
In the direct approach, linear balance constraints given by the steady state are augmented with linear constraints derived from the abundances of different labelling patterns of metabolites. Thus, mathematically involved non-linear optimization methods that can get stuck in local optima can be avoided. On the other hand, the direct approach may require more measurement data than the optimization approach to obtain the same flux information. Furthermore, the optimization framework can easily be applied regardless of the labelling measurement technology and with all network topologies. In this thesis we present a formal computational framework for direct 13C metabolic flux analysis. The aim of our study is to construct as many linear constraints on the fluxes as possible from the 13C labelling measurements, using only computational methods that avoid non-linear techniques and are independent of the type of measurement data, the labelling of external nutrients and the topology of the metabolic network. The presented framework is the first representative of the direct approach for 13C metabolic flux analysis that is free from restrictive assumptions about these parameters. In our framework, measurement data is first propagated from the measured metabolites to other metabolites. The propagation is facilitated by the flow analysis of metabolite fragments in the network. Then new linear constraints on the fluxes are derived from the propagated data by applying the techniques of linear algebra. Based on the results of the fragment flow analysis, we also present an experiment planning method that selects sets of metabolites whose relative abundances of different labelling patterns are most useful for 13C metabolic flux analysis. Furthermore, we give computational tools to process raw 13C labelling data produced by tandem mass spectrometry into a form suitable for 13C metabolic flux analysis.
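The idea of augmenting the steady-state balance with a linear labelling constraint can be shown on a toy junction. Assume a single metabolite pool fed by two pathways with fluxes `v_a` and `v_b`, each delivering a known fraction of labelled molecules, and a measured labelled fraction in the pool. The sketch below is a hypothetical two-flux example, not the fragment flow analysis of the thesis; the function name and all numeric values are assumptions.

```python
from fractions import Fraction


def split_fluxes(total_out, label_a, label_b, label_mix):
    """Solve for the two fluxes feeding one metabolite pool at steady state.

    Balance constraint:    v_a + v_b = total_out
    Labelling constraint:  v_a * label_a + v_b * label_b = total_out * label_mix
    Both are linear in (v_a, v_b), so no non-linear optimization is needed.
    """
    if label_a == label_b:
        # The two pathways deliver identical labelling, so the measurement
        # adds no information and the split stays undetermined.
        raise ValueError("pathways are indistinguishable by labelling")
    v_a = total_out * (label_mix - label_b) / (label_a - label_b)
    v_b = total_out - v_a
    return v_a, v_b


if __name__ == "__main__":
    # Hypothetical numbers: pathway A delivers 90 % labelled molecules,
    # pathway B 10 %, and 30 % of the pool is measured as labelled.
    v_a, v_b = split_fluxes(Fraction(10), Fraction(9, 10),
                            Fraction(1, 10), Fraction(3, 10))
    print("v_a =", v_a, " v_b =", v_b)
```

The balance constraint alone leaves the split between the two pathways open; the labelling measurement supplies the second linear equation that fixes it, and when both pathways carry the same labelling the function refuses to answer, mirroring the case where the direct approach needs more measurement data.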
Abstract:
Wireless access is expected to play a crucial role in the future of the Internet. The demands of the wireless environment are not always compatible with the assumptions that were made in the era of wired links. At the same time, new services that take advantage of the advances in many areas of technology are invented. These services include delivery of mass media like television and radio, Internet phone calls, and video conferencing. The network must be able to deliver these services with acceptable performance and quality to the end user. This thesis presents an experimental study to measure the performance of bulk data TCP transfers, streaming audio flows, and HTTP transfers which compete for the limited bandwidth of a GPRS/UMTS-like wireless link. The wireless link characteristics are modeled with a wireless network emulator. We analyze how different competing workload types behave with regular TCP and how active queue management, Differentiated Services (DiffServ), and a combination of TCP enhancements affect the performance and the quality of service. We test on four link types, including an error-free link and links with different Automatic Repeat reQuest (ARQ) persistency. The analysis consists of comparing the resulting performance in different configurations based on defined metrics. We observed that DiffServ and Random Early Detection (RED) with Explicit Congestion Notification (ECN) are useful, and in some conditions necessary, for quality of service and fairness, because a long queuing delay and congestion-related packet losses cause problems without DiffServ and RED. However, we observed situations where there is still room for significant improvements if the link level is aware of the quality of service. Only a very error-prone link diminishes the benefits to nil. The combination of TCP enhancements improves performance. These include an initial window of four, Control Block Interdependence (CBI) and Forward RTO recovery (F-RTO).
The initial window of four helps a later starting TCP flow to start faster but generates congestion under some conditions. CBI prevents slow-start overshoot and balances slow start in the presence of error drops, and F-RTO reduces unnecessary retransmissions successfully.
Abstract:
We investigate methods for recommending multimedia items suitable for an online multimedia sharing community and introduce a novel algorithm called UserRank for ranking multimedia items based on link analysis. We also take the initiative of applying EigenRumor, from the domain of the blogosphere, to multimedia. Furthermore, we present a strategy for making personalized recommendations that combines UserRank with collaborative filtering. We evaluate our method with an informal user study and show that the results obtained are promising.
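For readers unfamiliar with ranking by link analysis, the sketch below runs a plain PageRank-style power iteration over a small graph of users linking to multimedia items. It is a generic illustration of the technique family only: the abstract does not describe how UserRank or EigenRumor work, and this code does not implement either. The function name, graph, damping factor and iteration count are assumptions.

```python
def link_rank(links, damping=0.85, iterations=50):
    """PageRank-style scores for a directed graph given as
    {node: [nodes it links to]}. Scores sum to 1."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives a uniform base share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Pass this node's score equally to the nodes it links to.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its score evenly over all nodes.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical community: users link to the clips they bookmarked.
SAMPLE_LINKS = {
    "alice": ["clip1", "clip2"],
    "bob": ["clip1"],
    "carol": ["clip1", "clip3"],
}

if __name__ == "__main__":
    scores = link_rank(SAMPLE_LINKS)
    for node in sorted(scores, key=scores.get, reverse=True):
        print(f"{node:6s} {scores[node]:.3f}")
```

In this toy graph the clip bookmarked by all three users ends up with the highest score, which is the basic behaviour any link-analysis ranking relies on.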
Abstract:
The aim of the present study is to analyze Confucian understandings of the Christian doctrine of salvation in order to find the basic problems in the Confucian-Christian dialogue. I will approach the task via a systematic theological analysis of four issues in order to limit the thesis to an appropriate size. They are analyzed in three chapters as follows: 1. The Confucian concept concerning the existence of God. Here I discuss mainly the issue of assimilation of the Christian concept of God to the concepts of Sovereign on High (Shangdi) and Heaven (Tian) in Confucianism. 2. The Confucian understanding of the object of salvation and its status in Christianity. 3. The Confucian understanding of the means of salvation in Christianity. Before beginning this analysis it is necessary to clarify the vast variety of controversies, arguments, ideas, opinions and comments expressed in the name of Confucianism; thus, clear distinctions among different schools of Confucianism are given in chapter 2. In the last chapter I will discuss the results of my research in this study by pointing out the basic problems that will appear in the analysis. The results of the present study provide conclusions in three related areas: the tacit differences in the ways of thinking between Confucians and Christians, the basic problems of the Confucian-Christian dialogue, and the affirmative elements in the dialogue. In addition to a summary, a bibliography and an index, there are also eight appendices, where I have introduced important background information for readers to understand the present study.
Abstract:
The aim of this research is to present, interpret and analyze the phenomenon of pilgrimage in a contemporary, suburban Greek nunnery, and to elucidate the different functions that the present-day convent has for its pilgrims. The scope of the study is limited to a case nunnery, the convent of the Dormition of the Virgin, which is situated in Northern Greece. The main corpus of data utilized for this work consists of 25 interviews and field diary material, which was collected in the convent mainly during the academic year 2002-2003 and summer 2005 by means of participant observation and unstructured thematic interviewing. It must be noted that most Greek nunneries are not really communities of hermits but institutions that operate in complex interaction with the surrounding society. Thus, the main interest in this study is in the interaction between pilgrims and nuns. Pilgrimage is seen here as a significant and concrete form of interaction, which in fact makes the contemporary nunneries dynamic scenes of religious, social and sometimes even political life. The focus of the analysis is on the pilgrims’ experiences, reflected upon on the levels of the individual, the Church institution, and society in general. This study shows that pilgrimage in a suburban nunnery, such as the convent of the Dormition, can be seen as part of everyday religiosity. Many pilgrims visit the convent regularly and the visitation is a lifestyle the pilgrims have chosen and wish to maintain. Pilgrimage to a contemporary Greek nunnery should not be ennobled, but seen as part of a popular religious sentiment. The visits offer pilgrims various tools for reflecting on their personal life situations and on questions of identity. For them the full round of liturgical worship is a very good reason for going to the convent, and many see it as a way of maintaining their faith and of feeling close to God. 
Despite cultural developments such as secularization and globalization, pilgrims are quite loyal to the convent they visit. It represents the positive values of ‘Greekness’ and therefore they also trust the nuns’ approach to various matters, both personal and political. The coalition of Orthodoxy and nationalism is also visible in their attitudes towards the convent, which they see as a guardian of Hellenism and as nurturing Greek values both now and in the future.
Abstract:
Modern Christian theology has struggled with the schism between the Bible and theology, and between biblical studies and systematic theology. Brevard Springs Childs is one of the biblical scholars who attempt to dismiss this “iron curtain” separating the two disciplines. The present thesis aims at analyzing Childs’ concept of theological exegesis in the canonical context. In the present study I employ the method of systematic analysis. The thesis consists of seven chapters. The first chapter is the introduction. The second chapter attempts to identify the most important elements that influence Childs’ methodology of biblical theology by sketching his academic development during his career. The third chapter deals with the crucial question of why and how the concept of the canon is so important for Childs’ methodology of biblical theology. In chapter four I analyze why and how Childs is dissatisfied with historical-critical scholarship, and I point out the differences and similarities between his canonical approach and historical criticism. The fifth chapter discusses Childs’ central concepts of theological exegesis by investigating whether a Christocentric approach is an appropriate way of creating a unified biblical theology. In the sixth chapter I present a critical evaluation of and methodological reflection on Childs’ theological exegesis in the canonical context. The final chapter sums up the key points of Childs’ methodology of biblical theology. The basic results of this thesis are as follows: First, the fundamental elements of Childs’ theological thinking are rooted in the Reformed theological tradition and in modern theological neo-orthodoxy and its most prominent theologian, Karl Barth. The American Biblical Theological Movement and the controversy between Protestant liberalism and conservatism in the modern American context cultivate his theological sensitivity and position.
Second, Childs attempts to dismiss negative influences of the historical-critical method by establishing canon-based theological exegesis leading into confessional biblical theology. Childs employs terminology such as canonical intentionality, the wholeness of the canon, the canon as the most appropriate context for doing a biblical theology, and the continuity of the two Testaments, in order to put into effect his canonical program. Childs demonstrates forcefully the inadequacies of the historical-critical method in creating biblical theology in biblical hermeneutics, doctrinal theology, and pastoral practice. His canonical approach endeavors to establish and create post-critical Christian biblical theology, and works within the traditional framework of faith seeking understanding. Third, Childs’ biblical theology has a double task, descriptive and constructive: the former connects biblical theology with exegesis, the latter with dogmatic theology. He attempts to use a comprehensive model, which combines a thematic investigation of the essential theological contents of the Bible with a systematic analysis of the contents of the Christian faith. Childs also attempts to unite Old Testament theology and New Testament theology into one unified biblical theology. Fourth, some problematic points of Childs’ thinking need to be mentioned. For instance, his emphasis on the final form of the text of the biblical canon is highly controversial, yet Childs firmly believes in it and even regards it as the cornerstone of his biblical theology. The relationship between the canon and the doctrine of biblical inspiration is weak. He does not clearly define whether Scripture is God’s word or whether it only “witnesses” to it. Childs’ concepts of “the word of God” and “divine revelation” remain unclear, and their ontological status is ambiguous. Childs’ theological exegesis in the canonical context is a new attempt in the modern history of Christian theology.
It expresses his sincere effort to create a path for doing biblical theology. Certainly, it was just a modest beginning of a long process.