212 results for "Statistical mean"
Abstract:
This paper introduces a rule-based classification of single-word and compound verbs into a statistical machine translation approach. By substituting verb forms with the lemma of their head verb, the data-sparseness problem caused by highly inflected languages can be successfully addressed. In addition, information from seen verb forms can be used to generate translations for unseen verb forms. Translation results are reported for an English-to-Spanish task, showing a significant performance improvement.
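The substitution step can be sketched as a simple preprocessing pass. The lemma table, sentence, and verb positions below are illustrative inventions, not the paper's rule-based classifier:

```python
# Toy sketch of lemma substitution for verb forms, as described above.
# The lemma table and example sentence are invented for illustration.
LEMMAS = {
    "was": "be", "were": "be", "is": "be",
    "bought": "buy", "buys": "buy", "buying": "buy",
}

def lemmatize_verbs(tokens, verb_positions):
    """Replace each token flagged as (part of) a verb with the lemma
    of its head verb, leaving all other tokens untouched."""
    return [LEMMAS.get(tok, tok) if i in verb_positions else tok
            for i, tok in enumerate(tokens)]

sentence = "she was buying apples".split()
print(lemmatize_verbs(sentence, verb_positions={1, 2}))
# ['she', 'be', 'buy', 'apples']
```

Collapsing inflected forms onto one lemma is what reduces the sparseness: every inflection of "buy" now contributes statistics to a single translation-table entry.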
Abstract:
In this paper we describe MARIE, an N-gram-based statistical machine translation decoder. It is implemented using a beam-search strategy with distortion (reordering) capabilities. The underlying translation model is based on an N-gram approach, extended to introduce reordering at the phrase level. The search-graph structure is designed to allow very accurate hypothesis comparisons, which permits a high level of pruning and improves decoder efficiency. We report several techniques for efficiently pruning the search space. The combinatorial explosion of the search space derived from the search-graph structure is reduced by limiting the number of reorderings a given translation is allowed to perform, as well as the maximum distance a word (or phrase) may be reordered. Finally, we report translation accuracy results on three different translation tasks.
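The pruning idea can be sketched generically; the routine below combines histogram pruning (keep the best k hypotheses) with threshold pruning (drop anything far below the current best). It assumes scores are log-probabilities and is not the MARIE implementation:

```python
import heapq

def beam_prune(hypotheses, beam_size, threshold):
    """Generic beam pruning: keep at most `beam_size` hypotheses, and
    drop any whose log-score falls more than `threshold` below the best.
    Hypotheses are (score, payload) pairs. Illustrative sketch only."""
    best = max(score for score, _ in hypotheses)
    survivors = [(s, h) for s, h in hypotheses if s >= best - threshold]
    return heapq.nlargest(beam_size, survivors)

hyps = [(-1.0, "a"), (-1.2, "b"), (-5.0, "c"), (-1.1, "d")]
print(beam_prune(hyps, beam_size=2, threshold=2.0))
# [(-1.0, 'a'), (-1.1, 'd')]
```

Accurate hypothesis comparison matters here because the tighter the equivalence between hypotheses, the more aggressively a decoder can prune without search errors.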
Abstract:
In this paper a method to incorporate linguistic information regarding single-word and compound verbs is proposed, as a first step towards an SMT model based on linguistically classified phrases. By substituting these verb structures with the base form of the head verb, we achieve better statistical word alignment performance, and are able to better estimate the translation model and generalize to unseen verb forms during translation. Preliminary experiments are performed for the English-Spanish language pair, and future research lines are detailed. © 2005 Association for Computational Linguistics.
Abstract:
In order to minimize the number of iterations to a turbine design, reasonable choices of the key parameters must be made at the earliest possible opportunity. The choice of blade loading is of particular concern in the low pressure (LP) turbine of civil aero engines, where the use of high-lift blades is widespread. This paper presents an analytical mean-line design study for a repeating-stage, axial-flow LP turbine. The problem of how to measure blade loading is first addressed. The analysis demonstrates that the Zweifel coefficient [1] is not a reasonable gauge of blade loading because it inherently depends on the flow angles. A more appropriate coefficient based on blade circulation is proposed. Without a large set of turbine test data it is not possible to directly evaluate the accuracy of a particular loss correlation. The analysis therefore focuses on the efficiency trends with respect to flow coefficient, stage loading, lift coefficient, and Reynolds number. Of the various loss correlations examined, those based on Ainley and Mathieson ([2], [3], [4]) do not produce realistic trends. The profile loss model of Coull and Hodson [5] and the secondary loss models of Craig and Cox [6] and Traupel [7] gave the most reasonable results. The analysis suggests that designs with the highest flow turning are the least sensitive to increases in blade loading. The increase in Reynolds number lapse with loading is also captured, achieving reasonable agreement with experiments. Copyright © 2011 by ASME.
Abstract:
The numerical solution of problems in unbounded physical space requires a truncation of the computational domain to a reasonable size. As a result, the conditions on the artificial boundaries are generally unknown. Assumptions like constant pressure or velocities are only valid in the far field and lead to spurious reflections if applied on the boundaries of the truncated domain. A number of attempts have been made over the past decades to design conditions that prevent such reflections. One approach is based on characteristics. The standard analysis assumes a spatially uniform mean flow field but this is often impractical. In the present paper we show how to extend the formulation to the more general case of a non-uniform mean velocity field. A number of test cases are provided and our results compare favourably with other boundary conditions. In principle the present approach can be extended to include non-uniformities in all variables.
Abstract:
In order to minimize the number of iterations to a turbine design, reasonable choices of the key parameters must be made at the preliminary design stage. The choice of blade loading is of particular concern in the low pressure (LP) turbine of civil aero engines, where the use of high-lift blades is widespread. This paper considers how blade loading should be measured, compares the performance of various loss correlations, and explores the impact of blade lift on performance and lapse rates. To these ends, an analytical design study is presented for a repeating-stage, axial-flow LP turbine. It is demonstrated that the long-established Zweifel lift coefficient (Zweifel, 1945, "The Spacing of Turbomachine Blading, Especially with Large Angular Deflection," Brown Boveri Rev., 32(1), pp. 436-444) is flawed because it does not account for the blade camber. As a result, the Zweifel coefficient is only meaningful for a fixed set of flow angles and cannot be used as an absolute measure of blade loading. A lift coefficient based on circulation is instead proposed that accounts for the blade curvature and is independent of the flow angles. Various existing profile and secondary loss correlations are examined for their suitability to preliminary design. A largely qualitative comparison demonstrates that the loss correlations based on Ainley and Mathieson (Ainley and Mathieson, 1957, "A Method of Performance Estimation for Axial-Flow Turbines," ARC Reports and Memoranda No. 2974; Dunham and Came, 1970, "Improvements to the Ainley-Mathieson Method of Turbine Performance Prediction," Trans. ASME: J. Eng. Gas Turbines Power, July, pp. 252-256; Kacker and Okapuu, 1982, "A Mean Line Performance Method for Axial Flow Turbine Efficiency," J. Eng. Power, 104, pp. 111-119) are not realistic, while the profile loss model of Coull and Hodson (Coull and Hodson, 2011, "Predicting the Profile Loss of High-Lift Low Pressure Turbines," J. Turbomach., 134(2), pp. 021002) and the secondary loss model of Traupel (Traupel, W., 1977, Thermische Turbomaschinen, Springer-Verlag, Berlin) are arguably the most reasonable. A quantitative comparison with multistage rig data indicates that, together, these methods over-predict lapse rates by around 30%, highlighting the need for improved loss models and a better understanding of the multistage environment. By examining the influence of blade lift across the Smith efficiency chart, the analysis demonstrates that designs with higher flow turning will tend to be less sensitive to increases in blade loading. © 2013 American Society of Mechanical Engineers.
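For reference, the classical Zweifel coefficient can be written down in a few lines. The sign convention below is one common choice (conventions vary between authors), and the numerical inputs are invented for illustration; the point is that only the pitch-to-chord ratio and the flow angles appear, so blade camber cannot influence the value:

```python
import math

def zweifel(pitch_to_axial_chord, alpha_in_deg, alpha_out_deg):
    """Classical Zweifel lift coefficient, one common sign convention:
    Zw = 2 (s/Cx) cos^2(a2) (tan a1 + tan a2).
    Depends only on pitch and flow angles -- no camber term, which is
    the limitation the abstract discusses."""
    a1 = math.radians(alpha_in_deg)
    a2 = math.radians(alpha_out_deg)
    return (2.0 * pitch_to_axial_chord
            * math.cos(a2) ** 2
            * (math.tan(a1) + math.tan(a2)))

print(zweifel(0.8, 30.0, 60.0))  # ≈ 0.924 for these illustrative angles
```

Two blades with identical angles and pitch but very different camber lines return the same Zw, which is why a circulation-based coefficient is proposed instead.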
Abstract:
Amplitude demodulation is an ill-posed problem, so it is natural to treat it from a Bayesian viewpoint, inferring the most likely carrier and envelope under probabilistic constraints. One such treatment is Probabilistic Amplitude Demodulation (PAD), which, whilst computationally more intensive than traditional approaches, offers several advantages. Here we provide methods for estimating the uncertainty in the PAD-derived envelopes and carriers, and for learning free parameters such as the time scale of the envelope. We show how the probabilistic approach can naturally handle noisy and missing data. Finally, we indicate how to extend the model to signals that contain multiple modulators and carriers.
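For contrast with PAD, the traditional non-probabilistic baseline is demodulation via the analytic signal (Hilbert transform). A minimal FFT-based sketch, with a synthetic test signal invented for illustration:

```python
import numpy as np

def hilbert_envelope(x):
    """Classical amplitude demodulation via the analytic signal:
    zero the negative frequencies, double the positive ones, and take
    the magnitude. Minimal FFT-based sketch; assumes len(x) is even."""
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = h[n // 2] = 1.0
    h[1:n // 2] = 2.0
    analytic = np.fft.ifft(X * h)
    return np.abs(analytic)

# Synthetic narrowband example: slow modulator on a fast carrier.
t = np.linspace(0.0, 1.0, 1024, endpoint=False)
envelope = 1.0 + 0.5 * np.sin(2 * np.pi * 2 * t)
carrier = np.sin(2 * np.pi * 100 * t)
est = hilbert_envelope(envelope * carrier)
print(np.max(np.abs(est - envelope)))  # tiny: signal is exactly narrowband
```

Unlike this one-shot transform, PAD treats the decomposition as inference, which is what makes uncertainty estimates and missing-data handling possible.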
Abstract:
An existing hybrid finite element (FE)/statistical energy analysis (SEA) approach to the analysis of the mid- and high-frequency vibrations of a complex built-up system is extended here to a wider class of uncertainty modeling. In the original approach, the constituent parts of the system are considered to be either deterministic, and modeled using FE, or highly random, and modeled using SEA. A non-parametric model of randomness is employed in the SEA components, based on diffuse wave theory and the Gaussian Orthogonal Ensemble (GOE), and this enables the mean and variance of second-order quantities such as vibrational energy and response cross-spectra to be predicted. In the present work the assumption that the FE components are deterministic is relaxed by the introduction of a parametric model of uncertainty in these components. The parametric uncertainty may be modeled either probabilistically, or by using a non-probabilistic approach such as interval analysis, and it is shown how these descriptions can be combined with the non-parametric uncertainty in the SEA subsystems to yield an overall assessment of the performance of the system. The method is illustrated by application to an example built-up plate system which has random properties, and benchmark comparisons are made with full Monte Carlo simulations. © 2012 Elsevier Ltd. All rights reserved.
Abstract:
Statistical dialog systems (SDSs) are motivated by the need for a data-driven framework that reduces the cost of laboriously handcrafting complex dialog managers and that provides robustness against the errors created by speech recognizers operating in noisy environments. By including an explicit Bayesian model of uncertainty and by optimizing the policy via a reward-driven process, partially observable Markov decision processes (POMDPs) provide such a framework. However, exact model representation and optimization is computationally intractable. Hence, the practical application of POMDP-based systems requires efficient algorithms and carefully constructed approximations. This review article provides an overview of the current state of the art in the development of POMDP-based spoken dialog systems. © 1963-2012 IEEE.
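The "explicit Bayesian model of uncertainty" in a POMDP is the belief state, updated after each system action and user observation. A minimal sketch of the exact discrete update (the state names, matrices, and numbers are invented; real systems need the approximations the article surveys):

```python
import numpy as np

def belief_update(b, a, o, T, O):
    """One step of exact Bayesian belief tracking in a discrete POMDP:
    b'(s') ∝ O[a][s', o] * Σ_s T[a][s, s'] * b(s).
    T[a] is the |S|x|S| transition matrix for action a; O[a] is the
    |S|x|O| observation matrix. Illustrative sketch only."""
    pred = b @ T[a]                # predict next-state distribution
    unnorm = pred * O[a][:, o]     # reweight by observation likelihood
    return unnorm / unnorm.sum()

# Toy 2-state dialog belief (e.g. user-wants-A vs user-wants-B).
T = {0: np.array([[0.9, 0.1], [0.2, 0.8]])}
O = {0: np.array([[0.7, 0.3], [0.4, 0.6]])}
b0 = np.array([0.5, 0.5])
b1 = belief_update(b0, a=0, o=0, T=T, O=O)
print(b1)  # mass shifts toward the state that explains the observation
```

The intractability the abstract mentions arises because the policy must map this continuous belief simplex to actions, which is why practical systems rely on compressed state representations and approximate optimization.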
Abstract:
We report an empirical study of n-gram posterior probability confidence measures for statistical machine translation (SMT). We first describe an efficient and practical algorithm for rapidly computing n-gram posterior probabilities from large translation word lattices. These probabilities are shown to be a good predictor of whether or not the n-gram is found in human reference translations, motivating their use as a confidence measure for SMT. Comprehensive n-gram precision and word coverage measurements are presented for a variety of different language pairs, domains and conditions. We analyze the effect on reference precision of using single or multiple references, and compare the precision of posteriors computed from k-best lists to those computed over the full evidence space of the lattice. We also demonstrate improved confidence by combining multiple lattices in a multi-source translation framework. © 2012 The Author(s).
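For the k-best-list case the abstract compares against, the n-gram posterior is just the total normalized probability mass of the hypotheses containing each n-gram. A sketch with an invented two-hypothesis list (lattice computation, as in the paper, requires forward-backward over the graph):

```python
from collections import defaultdict
import math

def ngram_posteriors(kbest, n=2):
    """n-gram posteriors from a weighted k-best list:
    p(g) = sum of normalized hypothesis weights of hypotheses that
    contain n-gram g (each hypothesis counted once per n-gram type).
    Scores are log-probabilities. Illustrative sketch only."""
    logZ = math.log(sum(math.exp(s) for s, _ in kbest))
    post = defaultdict(float)
    for score, hyp in kbest:
        w = math.exp(score - logZ)
        toks = hyp.split()
        for g in {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}:
            post[g] += w
    return dict(post)

kbest = [(math.log(0.6), "the cat sat"), (math.log(0.4), "the cat ran")]
print(ngram_posteriors(kbest))
```

Here "the cat" gets posterior 1.0 (both hypotheses agree) while "cat sat" gets 0.6, which is exactly the confidence intuition the paper validates against reference translations.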
Abstract:
This paper introduces a new metric and mean on the set of positive semidefinite matrices of fixed rank. The proposed metric is derived from a well-chosen Riemannian quotient geometry that generalizes the reductive geometry of the positive cone and its associated natural metric. The resulting Riemannian space has strong geometrical properties: it is geodesically complete, and the metric is invariant with respect to all transformations that preserve angles (orthogonal transformations, scalings, and pseudoinversion). A meaningful approximation of the associated Riemannian distance is proposed that can be computed efficiently via a simple SVD-based algorithm. The induced mean preserves the rank, possesses the most desirable characteristics of a geometric mean, and is easy to compute. © 2009 Society for Industrial and Applied Mathematics.
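For orientation, the full-rank case this paper generalizes has a classical geometric mean on the positive-definite cone, A # B = A^{1/2}(A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}. A sketch for that classical case (not the paper's fixed-rank construction):

```python
import numpy as np

def sqrtm_spd(A):
    """Matrix square root of a symmetric positive-definite matrix,
    via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.sqrt(w)) @ V.T

def geometric_mean(A, B):
    """Affine-invariant (Riemannian) geometric mean of two SPD
    matrices: A # B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}.
    Classical full-rank cone only; the paper extends the idea to
    fixed-rank positive-semidefinite matrices."""
    Ah = sqrtm_spd(A)
    Aih = np.linalg.inv(Ah)
    return Ah @ sqrtm_spd(Aih @ B @ Aih) @ Ah

# For commuting (here diagonal) matrices this reduces to the
# entrywise geometric mean:
M = geometric_mean(np.diag([1.0, 4.0]), np.diag([9.0, 16.0]))
print(M)  # diag(3, 8)
```

The difficulty the paper addresses is that this formula breaks down for rank-deficient matrices (A^{-1/2} does not exist), hence the need for a quotient geometry whose mean preserves the rank.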
Abstract:
Particle tracking techniques are often used to assess the local mechanical properties of cells and biological fluids. The extracted trajectories are used to compute the mean-squared displacement, which characterizes the dynamics of the probe particles. Limited spatial resolution and statistical uncertainty are the main factors that degrade the accuracy of the mean-squared displacement estimation. We precisely quantified the effect of localization errors on the determination of the mean-squared displacement by separating these errors into two contributions. A "static error" arises in the position measurements of immobilized particles. A "dynamic error" comes from the particle motion during the finite exposure time required for visualization. We calculated the propagation of these errors to the mean-squared displacement. We examined the impact of our error analysis on theoretical model fluids used in biorheology. These theoretical predictions were verified for purely viscous fluids using simulations and a multiple-particle tracking technique performed with video microscopy. We showed that the static contribution can be confidently corrected in dynamics studies by using static experiments performed at a similar noise-to-signal ratio. This groundwork allowed us to achieve higher resolution in the mean-squared displacement, and thus to increase the accuracy of microrheology studies.
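The static-error correction can be sketched in a few lines: for uncorrelated localization noise of variance eps^2 per coordinate, the measured 1-D MSD is inflated by a constant 2*eps^2, which is estimated from immobilized particles and subtracted. The synthetic data below are invented; the dynamic (exposure-time) error needs a separate, model-dependent correction:

```python
import numpy as np

def msd(traj, max_lag):
    """Time-averaged mean-squared displacement of a 1-D trajectory."""
    return np.array([np.mean((traj[lag:] - traj[:-lag]) ** 2)
                     for lag in range(1, max_lag + 1)])

def correct_static_error(msd_measured, eps):
    """Subtract the static localization-error offset: for uncorrelated
    position noise of variance eps**2, the measured 1-D MSD carries a
    constant bias of 2*eps**2. eps is estimated from immobilized
    particles at a similar noise-to-signal ratio, as described above."""
    return msd_measured - 2.0 * eps ** 2

# Sanity check: an immobilized particle viewed through localization
# noise should have zero true MSD. (Synthetic data, fixed seed.)
rng = np.random.default_rng(0)
eps = 0.1
traj = eps * rng.standard_normal(100_000)
raw = msd(traj, max_lag=3)
corrected = correct_static_error(raw, eps)
print(raw[0], corrected[0])  # raw ≈ 2*eps**2 = 0.02, corrected ≈ 0
```

In a real experiment eps is not known a priori, which is why the abstract stresses matching the noise-to-signal ratio of the static calibration to the dynamic measurement.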