3 resultados para Automatic Gridding of microarray images

em ArchiMeD - Elektronische Publikationen der Universität Mainz - Alemanha


Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis concerns artificially intelligent natural language processing systems that are capable of learning the properties of lexical items (properties like verbal valency or inflectional class membership) autonomously while they are fulfilling their tasks for which they have been deployed in the first place. Many of these tasks require a deep analysis of language input, which can be characterized as a mapping of utterances in a given input C to a set S of linguistically motivated structures with the help of linguistic information encoded in a grammar G and a lexicon L: G + L + C → S (1) The idea that underlies intelligent lexical acquisition systems is to modify this schematic formula in such a way that the system is able to exploit the information encoded in S to create a new, improved version of the lexicon: G + L + S → L' (2) Moreover, the thesis claims that a system can only be considered intelligent if it does not just make maximum usage of the learning opportunities in C, but if it is also able to revise falsely acquired lexical knowledge. So, one of the central elements in this work is the formulation of a couple of criteria for intelligent lexical acquisition systems subsumed under one paradigm: the Learn-Alpha design rule. The thesis describes the design and quality of a prototype for such a system, whose acquisition components have been developed from scratch and built on top of one of the state-of-the-art Head-driven Phrase Structure Grammar (HPSG) processing systems. The quality of this prototype is investigated in a series of experiments, in which the system is fed with extracts of a large English corpus. While the idea of using machine-readable language input to automatically acquire lexical knowledge is not new, we are not aware of a system that fulfills Learn-Alpha and is able to deal with large corpora. To instance four major challenges of constructing such a system, it should be mentioned that a) the high number of possible structural descriptions caused by highly underspeci ed lexical entries demands for a parser with a very effective ambiguity management system, b) the automatic construction of concise lexical entries out of a bulk of observed lexical facts requires a special technique of data alignment, c) the reliability of these entries depends on the system's decision on whether it has seen 'enough' input and d) general properties of language might render some lexical features indeterminable if the system tries to acquire them with a too high precision. The cornerstone of this dissertation is the motivation and development of a general theory of automatic lexical acquisition that is applicable to every language and independent of any particular theory of grammar or lexicon. This work is divided into five chapters. The introductory chapter first contrasts three different and mutually incompatible approaches to (artificial) lexical acquisition: cue-based queries, head-lexicalized probabilistic context free grammars and learning by unification. Then the postulation of the Learn-Alpha design rule is presented. The second chapter outlines the theory that underlies Learn-Alpha and exposes all the related notions and concepts required for a proper understanding of artificial lexical acquisition. Chapter 3 develops the prototyped acquisition method, called ANALYZE-LEARN-REDUCE, a framework which implements Learn-Alpha. The fourth chapter presents the design and results of a bootstrapping experiment conducted on this prototype: lexeme detection, learning of verbal valency, categorization into nominal count/mass classes, selection of prepositions and sentential complements, among others. The thesis concludes with a review of the conclusions and motivation for further improvements as well as proposals for future research on the automatic induction of lexical features.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Ziel dieser Dissertation ist die experimentelle Charakterisierung und quantitative Beschreibung der Hybridisierung von komplementären Nukleinsäuresträngen mit oberflächengebundenen Fängermolekülen für die Entwicklung von integrierten Biosensoren. Im Gegensatz zu lösungsbasierten Verfahren ist mit Microarray Substraten die Untersuchung vieler Nukleinsäurekombinationen parallel möglich. Als biologisch relevantes Evaluierungssystem wurde das in Eukaryoten universell exprimierte Actin Gen aus unterschiedlichen Pflanzenspezies verwendet. Dieses Testsystem ermöglicht es, nahe verwandte Pflanzenarten auf Grund von geringen Unterschieden in der Gen-Sequenz (SNPs) zu charakterisieren. Aufbauend auf dieses gut studierte Modell eines House-Keeping Genes wurde ein umfassendes Microarray System, bestehend aus kurzen und langen Oligonukleotiden (mit eingebauten LNA-Molekülen), cDNAs sowie DNA und RNA Targets realisiert. Damit konnte ein für online Messung optimiertes Testsystem mit hohen Signalstärken entwickelt werden. Basierend auf den Ergebnissen wurde der gesamte Signalpfad von Nukleinsärekonzentration bis zum digitalen Wert modelliert. Die aus der Entwicklung und den Experimenten gewonnen Erkenntnisse über die Kinetik und Thermodynamik von Hybridisierung sind in drei Publikationen zusammengefasst die das Rückgrat dieser Dissertation bilden. Die erste Publikation beschreibt die Verbesserung der Reproduzierbarkeit und Spezifizität von Microarray Ergebnissen durch online Messung von Kinetik und Thermodynamik gegenüber endpunktbasierten Messungen mit Standard Microarrays. Für die Auswertung der riesigen Datenmengen wurden zwei Algorithmen entwickelt, eine reaktionskinetische Modellierung der Isothermen und ein auf der Fermi-Dirac Statistik beruhende Beschreibung des Schmelzüberganges. Diese Algorithmen werden in der zweiten Publikation beschrieben. Durch die Realisierung von gleichen Sequenzen in den chemisch unterschiedlichen Nukleinsäuren (DNA, RNA und LNA) ist es möglich, definierte Unterschiede in der Konformation des Riboserings und der C5-Methylgruppe der Pyrimidine zu untersuchen. Die kompetitive Wechselwirkung dieser unterschiedlichen Nukleinsäuren gleicher Sequenz und die Auswirkungen auf Kinetik und Thermodynamik ist das Thema der dritten Publikation. Neben der molekularbiologischen und technologischen Entwicklung im Bereich der Sensorik von Hybridisierungsreaktionen oberflächengebundener Nukleinsäuremolekülen, der automatisierten Auswertung und Modellierung der anfallenden Datenmengen und der damit verbundenen besseren quantitativen Beschreibung von Kinetik und Thermodynamik dieser Reaktionen tragen die Ergebnisse zum besseren Verständnis der physikalisch-chemischen Struktur des elementarsten biologischen Moleküls und seiner nach wie vor nicht vollständig verstandenen Spezifizität bei.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Data sets describing the state of the earth's atmosphere are of great importance in the atmospheric sciences. Over the last decades, the quality and sheer amount of the available data increased significantly, resulting in a rising demand for new tools capable of handling and analysing these large, multidimensional sets of atmospheric data. The interdisciplinary work presented in this thesis covers the development and the application of practical software tools and efficient algorithms from the field of computer science, aiming at the goal of enabling atmospheric scientists to analyse and to gain new insights from these large data sets. For this purpose, our tools combine novel techniques with well-established methods from different areas such as scientific visualization and data segmentation. In this thesis, three practical tools are presented. Two of these tools are software systems (Insight and IWAL) for different types of processing and interactive visualization of data, the third tool is an efficient algorithm for data segmentation implemented as part of Insight.Insight is a toolkit for the interactive, three-dimensional visualization and processing of large sets of atmospheric data, originally developed as a testing environment for the novel segmentation algorithm. It provides a dynamic system for combining at runtime data from different sources, a variety of different data processing algorithms, and several visualization techniques. Its modular architecture and flexible scripting support led to additional applications of the software, from which two examples are presented: the usage of Insight as a WMS (web map service) server, and the automatic production of a sequence of images for the visualization of cyclone simulations. The core application of Insight is the provision of the novel segmentation algorithm for the efficient detection and tracking of 3D features in large sets of atmospheric data, as well as for the precise localization of the occurring genesis, lysis, merging and splitting events. Data segmentation usually leads to a significant reduction of the size of the considered data. This enables a practical visualization of the data, statistical analyses of the features and their events, and the manual or automatic detection of interesting situations for subsequent detailed investigation. The concepts of the novel algorithm, its technical realization, and several extensions for avoiding under- and over-segmentation are discussed. As example applications, this thesis covers the setup and the results of the segmentation of upper-tropospheric jet streams and cyclones as full 3D objects. Finally, IWAL is presented, which is a web application for providing an easy interactive access to meteorological data visualizations, primarily aimed at students. As a web application, the needs to retrieve all input data sets and to install and handle complex visualization tools on a local machine are avoided. The main challenge in the provision of customizable visualizations to large numbers of simultaneous users was to find an acceptable trade-off between the available visualization options and the performance of the application. Besides the implementational details, benchmarks and the results of a user survey are presented.