4 resultados para programming language processing

em ArchiMeD - Elektronische Publikationen der Universität Mainz - Alemanha


Relevância:

90.00% 90.00%

Publicador:

Resumo:

Moderne ESI-LC-MS/MS-Techniken erlauben in Verbindung mit Bottom-up-Ansätzen eine qualitative und quantitative Charakterisierung mehrerer tausend Proteine in einem einzigen Experiment. Für die labelfreie Proteinquantifizierung eignen sich besonders datenunabhängige Akquisitionsmethoden wie MSE und die IMS-Varianten HDMSE und UDMSE. Durch ihre hohe Komplexität stellen die so erfassten Daten besondere Anforderungen an die Analysesoftware. Eine quantitative Analyse der MSE/HDMSE/UDMSE-Daten blieb bislang wenigen kommerziellen Lösungen vorbehalten. rn| In der vorliegenden Arbeit wurden eine Strategie und eine Reihe neuer Methoden zur messungsübergreifenden, quantitativen Analyse labelfreier MSE/HDMSE/UDMSE-Daten entwickelt und als Software ISOQuant implementiert. Für die ersten Schritte der Datenanalyse (Featuredetektion, Peptid- und Proteinidentifikation) wird die kommerzielle Software PLGS verwendet. Anschließend werden die unabhängigen PLGS-Ergebnisse aller Messungen eines Experiments in einer relationalen Datenbank zusammengeführt und mit Hilfe der dedizierten Algorithmen (Retentionszeitalignment, Feature-Clustering, multidimensionale Normalisierung der Intensitäten, mehrstufige Datenfilterung, Proteininferenz, Umverteilung der Intensitäten geteilter Peptide, Proteinquantifizierung) überarbeitet. Durch diese Nachbearbeitung wird die Reproduzierbarkeit der qualitativen und quantitativen Ergebnisse signifikant gesteigert.rn| Um die Performance der quantitativen Datenanalyse zu evaluieren und mit anderen Lösungen zu vergleichen, wurde ein Satz von exakt definierten Hybridproteom-Proben entwickelt. Die Proben wurden mit den Methoden MSE und UDMSE erfasst, mit Progenesis QIP, synapter und ISOQuant analysiert und verglichen. Im Gegensatz zu synapter und Progenesis QIP konnte ISOQuant sowohl eine hohe Reproduzierbarkeit der Proteinidentifikation als auch eine hohe Präzision und Richtigkeit der Proteinquantifizierung erreichen.rn| Schlussfolgernd ermöglichen die vorgestellten Algorithmen und der Analyseworkflow zuverlässige und reproduzierbare quantitative Datenanalysen. Mit der Software ISOQuant wurde ein einfaches und effizientes Werkzeug für routinemäßige Hochdurchsatzanalysen labelfreier MSE/HDMSE/UDMSE-Daten entwickelt. Mit den Hybridproteom-Proben und den Bewertungsmetriken wurde ein umfassendes System zur Evaluierung quantitativer Akquisitions- und Datenanalysesysteme vorgestellt.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The increasing precision of current and future experiments in high-energy physics requires a likewise increase in the accuracy of the calculation of theoretical predictions, in order to find evidence for possible deviations of the generally accepted Standard Model of elementary particles and interactions. Calculating the experimentally measurable cross sections of scattering and decay processes to a higher accuracy directly translates into including higher order radiative corrections in the calculation. The large number of particles and interactions in the full Standard Model results in an exponentially growing number of Feynman diagrams contributing to any given process in higher orders. Additionally, the appearance of multiple independent mass scales makes even the calculation of single diagrams non-trivial. For over two decades now, the only way to cope with these issues has been to rely on the assistance of computers. The aim of the xloops project is to provide the necessary tools to automate the calculation procedures as far as possible, including the generation of the contributing diagrams and the evaluation of the resulting Feynman integrals. The latter is based on the techniques developed in Mainz for solving one- and two-loop diagrams in a general and systematic way using parallel/orthogonal space methods. These techniques involve a considerable amount of symbolic computations. During the development of xloops it was found that conventional computer algebra systems were not a suitable implementation environment. For this reason, a new system called GiNaC has been created, which allows the development of large-scale symbolic applications in an object-oriented fashion within the C++ programming language. This system, which is now also in use for other projects besides xloops, is the main focus of this thesis. The implementation of GiNaC as a C++ library sets it apart from other algebraic systems. Our results prove that a highly efficient symbolic manipulator can be designed in an object-oriented way, and that having a very fine granularity of objects is also feasible. The xloops-related parts of this work consist of a new implementation, based on GiNaC, of functions for calculating one-loop Feynman integrals that already existed in the original xloops program, as well as the addition of supplementary modules belonging to the interface between the library of integral functions and the diagram generator.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This thesis concerns artificially intelligent natural language processing systems that are capable of learning the properties of lexical items (properties like verbal valency or inflectional class membership) autonomously while they are fulfilling their tasks for which they have been deployed in the first place. Many of these tasks require a deep analysis of language input, which can be characterized as a mapping of utterances in a given input C to a set S of linguistically motivated structures with the help of linguistic information encoded in a grammar G and a lexicon L: G + L + C → S (1) The idea that underlies intelligent lexical acquisition systems is to modify this schematic formula in such a way that the system is able to exploit the information encoded in S to create a new, improved version of the lexicon: G + L + S → L' (2) Moreover, the thesis claims that a system can only be considered intelligent if it does not just make maximum usage of the learning opportunities in C, but if it is also able to revise falsely acquired lexical knowledge. So, one of the central elements in this work is the formulation of a couple of criteria for intelligent lexical acquisition systems subsumed under one paradigm: the Learn-Alpha design rule. The thesis describes the design and quality of a prototype for such a system, whose acquisition components have been developed from scratch and built on top of one of the state-of-the-art Head-driven Phrase Structure Grammar (HPSG) processing systems. The quality of this prototype is investigated in a series of experiments, in which the system is fed with extracts of a large English corpus. While the idea of using machine-readable language input to automatically acquire lexical knowledge is not new, we are not aware of a system that fulfills Learn-Alpha and is able to deal with large corpora. To instance four major challenges of constructing such a system, it should be mentioned that a) the high number of possible structural descriptions caused by highly underspeci ed lexical entries demands for a parser with a very effective ambiguity management system, b) the automatic construction of concise lexical entries out of a bulk of observed lexical facts requires a special technique of data alignment, c) the reliability of these entries depends on the system's decision on whether it has seen 'enough' input and d) general properties of language might render some lexical features indeterminable if the system tries to acquire them with a too high precision. The cornerstone of this dissertation is the motivation and development of a general theory of automatic lexical acquisition that is applicable to every language and independent of any particular theory of grammar or lexicon. This work is divided into five chapters. The introductory chapter first contrasts three different and mutually incompatible approaches to (artificial) lexical acquisition: cue-based queries, head-lexicalized probabilistic context free grammars and learning by unification. Then the postulation of the Learn-Alpha design rule is presented. The second chapter outlines the theory that underlies Learn-Alpha and exposes all the related notions and concepts required for a proper understanding of artificial lexical acquisition. Chapter 3 develops the prototyped acquisition method, called ANALYZE-LEARN-REDUCE, a framework which implements Learn-Alpha. The fourth chapter presents the design and results of a bootstrapping experiment conducted on this prototype: lexeme detection, learning of verbal valency, categorization into nominal count/mass classes, selection of prepositions and sentential complements, among others. The thesis concludes with a review of the conclusions and motivation for further improvements as well as proposals for future research on the automatic induction of lexical features.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Satellite image classification involves designing and developing efficient image classifiers. With satellite image data and image analysis methods multiplying rapidly, selecting the right mix of data sources and data analysis approaches has become critical to the generation of quality land-use maps. In this study, a new postprocessing information fusion algorithm for the extraction and representation of land-use information based on high-resolution satellite imagery is presented. This approach can produce land-use maps with sharp interregional boundaries and homogeneous regions. The proposed approach is conducted in five steps. First, a GIS layer - ATKIS data - was used to generate two coarse homogeneous regions, i.e. urban and rural areas. Second, a thematic (class) map was generated by use of a hybrid spectral classifier combining Gaussian Maximum Likelihood algorithm (GML) and ISODATA classifier. Third, a probabilistic relaxation algorithm was performed on the thematic map, resulting in a smoothed thematic map. Fourth, edge detection and edge thinning techniques were used to generate a contour map with pixel-width interclass boundaries. Fifth, the contour map was superimposed on the thematic map by use of a region-growing algorithm with the contour map and the smoothed thematic map as two constraints. For the operation of the proposed method, a software package is developed using programming language C. This software package comprises the GML algorithm, a probabilistic relaxation algorithm, TBL edge detector, an edge thresholding algorithm, a fast parallel thinning algorithm, and a region-growing information fusion algorithm. The county of Landau of the State Rheinland-Pfalz, Germany was selected as a test site. The high-resolution IRS-1C imagery was used as the principal input data.