6 resultados para Linear time invariants
em Helda - Digital Repository of University of Helsinki
Resumo:
This dissertation is a theoretical study of finite-state based grammars used in natural language processing. The study is concerned with certain varieties of finite-state intersection grammars (FSIG) whose parsers define regular relations between surface strings and annotated surface strings. The study focuses on the following three aspects of FSIGs: (i) Computational complexity of grammars under limiting parameters In the study, the computational complexity in practical natural language processing is approached through performance-motivated parameters on structural complexity. Each parameter splits some grammars in the Chomsky hierarchy into an infinite set of subset approximations. When the approximations are regular, they seem to fall into the logarithmic-time hierarchyand the dot-depth hierarchy of star-free regular languages. This theoretical result is important and possibly relevant to grammar induction. (ii) Linguistically applicable structural representations Related to the linguistically applicable representations of syntactic entities, the study contains new bracketing schemes that cope with dependency links, left- and right branching, crossing dependencies and spurious ambiguity. New grammar representations that resemble the Chomsky-Schützenberger representation of context-free languages are presented in the study, and they include, in particular, representations for mildly context-sensitive non-projective dependency grammars whose performance-motivated approximations are linear time parseable. (iii) Compilation and simplification of linguistic constraints Efficient compilation methods for certain regular operations such as generalized restriction are presented. These include an elegant algorithm that has already been adopted as the approach in a proprietary finite-state tool. In addition to the compilation methods, an approach to on-the-fly simplifications of finite-state representations for parse forests is sketched. These findings are tightly coupled with each other under the theme of locality. I argue that the findings help us to develop better, linguistically oriented formalisms for finite-state parsing and to develop more efficient parsers for natural language processing. Avainsanat: syntactic parsing, finite-state automata, dependency grammar, first-order logic, linguistic performance, star-free regular approximations, mildly context-sensitive grammars
Resumo:
This thesis studies quantile residuals and uses different methodologies to develop test statistics that are applicable in evaluating linear and nonlinear time series models based on continuous distributions. Models based on mixtures of distributions are of special interest because it turns out that for those models traditional residuals, often referred to as Pearson's residuals, are not appropriate. As such models have become more and more popular in practice, especially with financial time series data there is a need for reliable diagnostic tools that can be used to evaluate them. The aim of the thesis is to show how such diagnostic tools can be obtained and used in model evaluation. The quantile residuals considered here are defined in such a way that, when the model is correctly specified and its parameters are consistently estimated, they are approximately independent with standard normal distribution. All the tests derived in the thesis are pure significance type tests and are theoretically sound in that they properly take the uncertainty caused by parameter estimation into account. -- In Chapter 2 a general framework based on the likelihood function and smooth functions of univariate quantile residuals is derived that can be used to obtain misspecification tests for various purposes. Three easy-to-use tests aimed at detecting non-normality, autocorrelation, and conditional heteroscedasticity in quantile residuals are formulated. It also turns out that these tests can be interpreted as Lagrange Multiplier or score tests so that they are asymptotically optimal against local alternatives. Chapter 3 extends the concept of quantile residuals to multivariate models. The framework of Chapter 2 is generalized and tests aimed at detecting non-normality, serial correlation, and conditional heteroscedasticity in multivariate quantile residuals are derived based on it. Score test interpretations are obtained for the serial correlation and conditional heteroscedasticity tests and in a rather restricted special case for the normality test. In Chapter 4 the tests are constructed using the empirical distribution function of quantile residuals. So-called Khmaladze s martingale transformation is applied in order to eliminate the uncertainty caused by parameter estimation. Various test statistics are considered so that critical bounds for histogram type plots as well as Quantile-Quantile and Probability-Probability type plots of quantile residuals are obtained. Chapters 2, 3, and 4 contain simulations and empirical examples which illustrate the finite sample size and power properties of the derived tests and also how the tests and related graphical tools based on residuals are applied in practice.
Resumo:
Human sport doping control analysis is a complex and challenging task for anti-doping laboratories. The List of Prohibited Substances and Methods, updated annually by World Anti-Doping Agency (WADA), consists of hundreds of chemically and pharmacologically different low and high molecular weight compounds. This poses a considerable challenge for laboratories to analyze for them all in a limited amount of time from a limited sample aliquot. The continuous expansion of the Prohibited List obliges laboratories to keep their analytical methods updated and to research new available methodologies. In this thesis, an accurate mass-based analysis employing liquid chromatography - time-of-flight mass spectrometry (LC-TOFMS) was developed and validated to improve the power of doping control analysis. New analytical methods were developed utilizing the high mass accuracy and high information content obtained by TOFMS to generate comprehensive and generic screening procedures. The suitability of LC-TOFMS for comprehensive screening was demonstrated for the first time in the field with mass accuracies better than 1 mDa. Further attention was given to generic sample preparation, an essential part of screening analysis, to rationalize the whole work flow and minimize the need for several separate sample preparation methods. Utilizing both positive and negative ionization allowed the detection of almost 200 prohibited substances. Automatic data processing produced a Microsoft Excel based report highlighting the entries fulfilling the criteria of the reverse data base search (retention time (RT), mass accuracy, isotope match). The quantitative performance of LC-TOFMS was demonstrated with morphine, codeine and their intact glucuronide conjugates. After a straightforward sample preparation the compounds were analyzed directly without the need for hydrolysis, solvent transfer, evaporation or reconstitution. The hydrophilic interaction technique (HILIC) provided good chromatographic separation, which was critical for the morphine glucuronide isomers. A wide linear range (50-5000 ng/ml) with good precision (RSD<10%) and accuracy (±10%) was obtained, showing comparable or better performance to other methods used. In-source collision-induced dissociation (ISCID) allowed confirmation analysis with three diagnostic ions with a median mass accuracy of 1.08 mDa and repeatable ion ratios fulfilling WADA s identification criteria. The suitability of LC-TOFMS for screening of high molecular weight doping agents was demonstrated with plasma volume expanders (PVE), namely dextran and hydroxyethylstarch (HES). Specificity of the assay was improved, since interfering matrix compounds were removed by size exclusion chromatography (SEC). ISCID produced three characteristic ions with an excellent mean mass accuracy of 0.82 mDa at physiological concentration levels. In summary, by combining TOFMS with a proper sample preparation and chromatographic separation, the technique can be utilized extensively in doping control laboratories for comprehensive screening of chemically different low and high molecular weight compounds, for quantification of threshold substances and even for confirmation. LC-TOFMS rationalized the work flow in doping control laboratories by simplifying the screening scheme, expediting reporting and minimizing the analysis costs. Therefore LC-TOFMS can be exploited widely in doping control, and the need for several separate analysis techniques is reduced.
Resumo:
In a max-min LP, the objective is to maximise ω subject to Ax ≤ 1, Cx ≥ ω1, and x ≥ 0 for nonnegative matrices A and C. We present a local algorithm (constant-time distributed algorithm) for approximating max-min LPs. The approximation ratio of our algorithm is the best possible for any local algorithm; there is a matching unconditional lower bound.
Resumo:
In a max-min LP, the objective is to maximise ω subject to Ax ≤ 1, Cx ≥ ω1, and x ≥ 0. In a min-max LP, the objective is to minimise ρ subject to Ax ≤ ρ1, Cx ≥ 1, and x ≥ 0. The matrices A and C are nonnegative and sparse: each row ai of A has at most ΔI positive elements, and each row ck of C has at most ΔK positive elements. We study the approximability of max-min LPs and min-max LPs in a distributed setting; in particular, we focus on local algorithms (constant-time distributed algorithms). We show that for any ΔI ≥ 2, ΔK ≥ 2, and ε > 0 there exists a local algorithm that achieves the approximation ratio ΔI (1 − 1/ΔK) + ε. We also show that this result is the best possible: no local algorithm can achieve the approximation ratio ΔI (1 − 1/ΔK) for any ΔI ≥ 2 and ΔK ≥ 2.