939 resultados para Statistical Language Model


Relevância:

30.00% 30.00%

Publicador:

Resumo:

A research has been carried out in two-lanehighways in the Madrid Region to propose an alternativemodel for the speed-flowrelationship using regular loop data. The model is different in shape and, in some cases, slopes with respect to the contents of Highway Capacity Manual (HCM). A model is proposed for a mountainous area road, something for which the HCM does not provide explicitly a solution. The problem of a mountain road with high flows to access a popular recreational area is discussed, and some solutions are proposed. Up to 7 one-way sections of two-lanehighways have been selected, aiming at covering a significant number of different characteristics, to verify the proposed method the different classes of highways on which the Manual classifies them. In order to enunciate the model and to verify the basic variables of these types of roads a high number of data have been used. The counts were collected in the same way that the Madrid Region Highway Agency performs their counts. A total of 1.471 hours have been collected, in periods of 5 minutes. The models have been verified by means of specific statistical test (R2, T-Student, Durbin-Watson, ANOVA, etc.) and with the diagnostics of the contrast of assumptions (normality, linearity, homoscedasticity and independence). The model proposed for this type of highways with base conditions, can explain the different behaviors as traffic volumes increase, and follows a polynomial multiple regression model of order 3, S shaped. As secondary results of this research, the levels of service and the capacities of this road have been measured with the 2000 HCM methodology, and the results discussed. © 2011 Published by Elsevier Ltd.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A Digital Elevation Model (DEM) provides the information basis used for many geographic applications such as topographic and geomorphologic studies, landscape through GIS (Geographic Information Systems) among others. The DEM capacity to represent Earth?s surface depends on the surface roughness and the resolution used. Each DEM pixel depends on the scale used characterized by two variables: resolution and extension of the area studied. DEMs can vary in resolution and accuracy by the production method, although there are statistical characteristics that keep constant or very similar in a wide range of scales. Based on this property, several techniques have been applied to characterize DEM through multiscale analysis directly related to fractal geometry: multifractal spectrum and the structure function. The comparison of the results by both methods is discussed. The study area is represented by a 1024 x 1024 data matrix obtained from a DEM with a resolution of 10 x 10 m each point, which correspond with a region known as ?Monte de El Pardo? a property of Spanish National Heritage (Patrimonio Nacional Español) of 15820 Ha located to a short distance from the center of Madrid. Manzanares River goes through this area from North to South. In the southern area a reservoir is found with a capacity of 43 hm3, with an altitude of 603.3 m till 632 m when it is at the highest capacity. In the middle of the reservoir the minimum altitude of this area is achieved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This article describes a knowledge-based method for generating multimedia descriptions that summarize the behavior of dynamic systems. We designed this method for users who monitor the behavior of a dynamic system with the help of sensor networks and make decisions according to prefixed management goals. Our method generates presentations using different modes such as text in natural language, 2D graphics and 3D animations. The method uses a qualitative representation of the dynamic system based on hierarchies of components and causal influences. The method includes an abstraction generator that uses the system representation to find and aggregate relevant data at an appropriate level of abstraction. In addition, the method includes a hierarchical planner to generate a presentation using a model with dis- course patterns. Our method provides an efficient and flexible solution to generate concise and adapted multimedia presentations that summarize thousands of time series. It is general to be adapted to differ- ent dynamic systems with acceptable knowledge acquisition effort by reusing and adapting intuitive rep- resentations. We validated our method and evaluated its practical utility by developing several models for an application that worked in continuous real time operation for more than 1 year, summarizing sen- sor data of a national hydrologic information system in Spain.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A new formalism, called Hiord, for defining type-free higherorder logic programming languages with predicate abstraction is introduced. A model theory, based on partial combinatory algebras, is presented, with respect to which the formalism is shown sound. A programming language built on a subset of Hiord, and its implementation are discussed. A new proposal for defining modules in this framework is considered, along with several examples.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Andorra-I is the first implementation of a language based on the Andorra Principie, which states that determinate goals can (and shonld) be run before other goals, and even in a parallel fashion. This principie has materialized in a framework called the Basic Andorra model, which allows or-parallelism as well as (dependent) and-parallelism for determinate goals. In this report we show that it is possible to further extend this model in order to allow general independent and-parallelism for nondeterminate goals, withont greatly modifying the underlying implementation machinery. A simple an easy way to realize such an extensión is to make each (nondeterminate) independent goal determinate, by using a special "bagof" constract. We also show that this can be achieved antomatically by compile-time translation from original Prolog programs. A transformation that fulfüls this objective and which can be easily antomated is presented in this report.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

There have been several previous proposals for the integration of Object Oriented Programming features into Logic Programming, resulting in much support theory and several language proposals. However, none of these proposals seem to have made it into the mainstream. Perhaps one of the reasons for these is that the resulting languages depart too much from the standard logic programming languages to entice the average Prolog programmer. Another reason may be that most of what can be done with object-oriented programming can already be done in Prolog through the meta- and higher-order programming facilities that the language includes, albeit sometimes in a more cumbersome way. In light of this, in this paper we propose an alternative solution which is driven by two main objectives. The first one is to include only those characteristics of object-oriented programming which are cumbersome to implement in standard Prolog systems. The second one is to do this in such a way that there is minimum impact on the syntax and complexity of the language, i.e., to introduce the minimum number of new constructs, declarations, and concepts to be learned. Finally, we would like the implementation to be as straightforward as possible, ideally based on simple source to source expansions.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The term "Logic Programming" refers to a variety of computer languages and execution models which are based on the traditional concept of Symbolic Logic. The expressive power of these languages offers promise to be of great assistance in facing the programming challenges of present and future symbolic processing applications in Artificial Intelligence, Knowledge-based systems, and many other areas of computing. The sequential execution speed of logic programs has been greatly improved since the advent of the first interpreters. However, higher inference speeds are still required in order to meet the demands of applications such as those contemplated for next generation computer systems. The execution of logic programs in parallel is currently considered a promising strategy for attaining such inference speeds. Logic Programming in turn appears as a suitable programming paradigm for parallel architectures because of the many opportunities for parallel execution present in the implementation of logic programs. This dissertation presents an efficient parallel execution model for logic programs. The model is described from the source language level down to an "Abstract Machine" level suitable for direct implementation on existing parallel systems or for the design of special purpose parallel architectures. Few assumptions are made at the source language level and therefore the techniques developed and the general Abstract Machine design are applicable to a variety of logic (and also functional) languages. These techniques offer efficient solutions to several areas of parallel Logic Programming implementation previously considered problematic or a source of considerable overhead, such as the detection and handling of variable binding conflicts in AND-Parallelism, the specification of control and management of the execution tree, the treatment of distributed backtracking, and goal scheduling and memory management issues, etc. A parallel Abstract Machine design is offered, specifying data areas, operation, and a suitable instruction set. This design is based on extending to a parallel environment the techniques introduced by the Warren Abstract Machine, which have already made very fast and space efficient sequential systems a reality. Therefore, the model herein presented is capable of retaining sequential execution speed similar to that of high performance sequential systems, while extracting additional gains in speed by efficiently implementing parallel execution. These claims are supported by simulations of the Abstract Machine on sample programs.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Detecting user affect automatically during real-time conversation is the main challenge towards our greater aim of infusing social intelligence into a natural-language mixed-initiative High-Fidelity (Hi-Fi) audio control spoken dialog agent. In recent years, studies on affect detection from voice have moved on to using realistic, non-acted data, which is subtler. However, it is more challenging to perceive subtler emotions and this is demonstrated in tasks such as labelling and machine prediction. This paper attempts to address part of this challenge by considering the role of user satisfaction ratings and also conversational/dialog features in discriminating contentment and frustration, two types of emotions that are known to be prevalent within spoken human-computer interaction. However, given the laboratory constraints, users might be positively biased when rating the system, indirectly making the reliability of the satisfaction data questionable. Machine learning experiments were conducted on two datasets, users and annotators, which were then compared in order to assess the reliability of these datasets. Our results indicated that standard classifiers were significantly more successful in discriminating the abovementioned emotions and their intensities (reflected by user satisfaction ratings) from annotator data than from user data. These results corroborated that: first, satisfaction data could be used directly as an alternative target variable to model affect, and that they could be predicted exclusively by dialog features. Second, these were only true when trying to predict the abovementioned emotions using annotator?s data, suggesting that user bias does exist in a laboratory-led evaluation.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper describes the design, development and field evaluation of a machine translation system from Spanish to Spanish Sign Language (LSE: Lengua de Signos Española). The developed system focuses on helping Deaf people when they want to renew their Driver’s License. The system is made up of a speech recognizer (for decoding the spoken utterance into a word sequence), a natural language translator (for converting a word sequence into a sequence of signs belonging to the sign language), and a 3D avatar animation module (for playing back the signs). For the natural language translator, three technological approaches have been implemented and evaluated: an example-based strategy, a rule-based translation method and a statistical translator. For the final version, the implemented language translator combines all the alternatives into a hierarchical structure. This paper includes a detailed description of the field evaluation. This evaluation was carried out in the Local Traffic Office in Toledo involving real government employees and Deaf people. The evaluation includes objective measurements from the system and subjective information from questionnaires. The paper details the main problems found and a discussion on how to solve them (some of them specific for LSE).

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Probabilistic modeling is the de�ning characteristic of estimation of distribution algorithms (EDAs) which determines their behavior and performance in optimization. Regularization is a well-known statistical technique used for obtaining an improved model by reducing the generalization error of estimation, especially in high-dimensional problems. `1-regularization is a type of this technique with the appealing variable selection property which results in sparse model estimations. In this thesis, we study the use of regularization techniques for model learning in EDAs. Several methods for regularized model estimation in continuous domains based on a Gaussian distribution assumption are presented, and analyzed from di�erent aspects when used for optimization in a high-dimensional setting, where the population size of EDA has a logarithmic scale with respect to the number of variables. The optimization results obtained for a number of continuous problems with an increasing number of variables show that the proposed EDA based on regularized model estimation performs a more robust optimization, and is able to achieve signi�cantly better results for larger dimensions than other Gaussian-based EDAs. We also propose a method for learning a marginally factorized Gaussian Markov random �eld model using regularization techniques and a clustering algorithm. The experimental results show notable optimization performance on continuous additively decomposable problems when using this model estimation method. Our study also covers multi-objective optimization and we propose joint probabilistic modeling of variables and objectives in EDAs based on Bayesian networks, speci�cally models inspired from multi-dimensional Bayesian network classi�ers. It is shown that with this approach to modeling, two new types of relationships are encoded in the estimated models in addition to the variable relationships captured in other EDAs: objectivevariable and objective-objective relationships. An extensive experimental study shows the e�ectiveness of this approach for multi- and many-objective optimization. With the proposed joint variable-objective modeling, in addition to the Pareto set approximation, the algorithm is also able to obtain an estimation of the multi-objective problem structure. Finally, the study of multi-objective optimization based on joint probabilistic modeling is extended to noisy domains, where the noise in objective values is represented by intervals. A new version of the Pareto dominance relation for ordering the solutions in these problems, namely �-degree Pareto dominance, is introduced and its properties are analyzed. We show that the ranking methods based on this dominance relation can result in competitive performance of EDAs with respect to the quality of the approximated Pareto sets. This dominance relation is then used together with a method for joint probabilistic modeling based on `1-regularization for multi-objective feature subset selection in classi�cation, where six di�erent measures of accuracy are considered as objectives with interval values. The individual assessment of the proposed joint probabilistic modeling and solution ranking methods on datasets with small-medium dimensionality, when using two di�erent Bayesian classi�ers, shows that comparable or better Pareto sets of feature subsets are approximated in comparison to standard methods.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper proposes an architecture, based on statistical machine translation, for developing the text normalization module of a text to speech conversion system. The main target is to generate a language independent text normalization module, based on data and flexible enough to deal with all situa-tions presented in this task. The proposed architecture is composed by three main modules: a tokenizer module for splitting the text input into a token graph (tokenization), a phrase-based translation module (token translation) and a post-processing module for removing some tokens. This paper presents initial exper-iments for numbers and abbreviations. The very good results obtained validate the proposed architecture.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Monitoring of neuro-evolutive development from birth until the age of six is a decisive factor in a child's quality of life. Early detection of development disorders in early childhood can facilitate necessary diagnosis and/or treatment. Primary-care pediatricians play a key role in early detection of development alterations as they can undertake the preventive and therapeutic actions necessary in the interest of a child's optimal development. The focus of this research paper is the construction of a Knowledge Base for smart screening aimed to assist pediatricians in processes of early referral in language disorders. The proposed model provides health professionals with a decision-making tool that supports referral processes. In this way, essential diagnostic and/or therapeutic actions are triggered for a comprehensive individual development. The resulting system was developed on the basis of an analysis and verification of 21 cases of children with language disorders.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Algebraic topology (homology) is used to analyze the state of spiral defect chaos in both laboratory experiments and numerical simulations of Rayleigh-Bénard convection. The analysis reveals topological asymmetries that arise when non-Boussinesq effects are present. The asymmetries are found in different flow fields in the simulations and are robust to substantial alterations to flow visualization conditions in the experiment. However, the asymmetries are not observable using conventional statistical measures. These results suggest homology may provide a new and general approach for connecting spatiotemporal observations of chaotic or turbulent patterns to theoretical models.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

La predicción de energía eólica ha desempeñado en la última década un papel fundamental en el aprovechamiento de este recurso renovable, ya que permite reducir el impacto que tiene la naturaleza fluctuante del viento en la actividad de diversos agentes implicados en su integración, tales como el operador del sistema o los agentes del mercado eléctrico. Los altos niveles de penetración eólica alcanzados recientemente por algunos países han puesto de manifiesto la necesidad de mejorar las predicciones durante eventos en los que se experimenta una variación importante de la potencia generada por un parque o un conjunto de ellos en un tiempo relativamente corto (del orden de unas pocas horas). Estos eventos, conocidos como rampas, no tienen una única causa, ya que pueden estar motivados por procesos meteorológicos que se dan en muy diferentes escalas espacio-temporales, desde el paso de grandes frentes en la macroescala a procesos convectivos locales como tormentas. Además, el propio proceso de conversión del viento en energía eléctrica juega un papel relevante en la ocurrencia de rampas debido, entre otros factores, a la relación no lineal que impone la curva de potencia del aerogenerador, la desalineación de la máquina con respecto al viento y la interacción aerodinámica entre aerogeneradores. En este trabajo se aborda la aplicación de modelos estadísticos a la predicción de rampas a muy corto plazo. Además, se investiga la relación de este tipo de eventos con procesos atmosféricos en la macroescala. Los modelos se emplean para generar predicciones de punto a partir del modelado estocástico de una serie temporal de potencia generada por un parque eólico. Los horizontes de predicción considerados van de una a seis horas. Como primer paso, se ha elaborado una metodología para caracterizar rampas en series temporales. La denominada función-rampa está basada en la transformada wavelet y proporciona un índice en cada paso temporal. Este índice caracteriza la intensidad de rampa en base a los gradientes de potencia experimentados en un rango determinado de escalas temporales. Se han implementado tres tipos de modelos predictivos de cara a evaluar el papel que juega la complejidad de un modelo en su desempeño: modelos lineales autorregresivos (AR), modelos de coeficientes variables (VCMs) y modelos basado en redes neuronales (ANNs). Los modelos se han entrenado en base a la minimización del error cuadrático medio y la configuración de cada uno de ellos se ha determinado mediante validación cruzada. De cara a analizar la contribución del estado macroescalar de la atmósfera en la predicción de rampas, se ha propuesto una metodología que permite extraer, a partir de las salidas de modelos meteorológicos, información relevante para explicar la ocurrencia de estos eventos. La metodología se basa en el análisis de componentes principales (PCA) para la síntesis de la datos de la atmósfera y en el uso de la información mutua (MI) para estimar la dependencia no lineal entre dos señales. Esta metodología se ha aplicado a datos de reanálisis generados con un modelo de circulación general (GCM) de cara a generar variables exógenas que posteriormente se han introducido en los modelos predictivos. Los casos de estudio considerados corresponden a dos parques eólicos ubicados en España. Los resultados muestran que el modelado de la serie de potencias permitió una mejora notable con respecto al modelo predictivo de referencia (la persistencia) y que al añadir información de la macroescala se obtuvieron mejoras adicionales del mismo orden. Estas mejoras resultaron mayores para el caso de rampas de bajada. Los resultados también indican distintos grados de conexión entre la macroescala y la ocurrencia de rampas en los dos parques considerados. Abstract One of the main drawbacks of wind energy is that it exhibits intermittent generation greatly depending on environmental conditions. Wind power forecasting has proven to be an effective tool for facilitating wind power integration from both the technical and the economical perspective. Indeed, system operators and energy traders benefit from the use of forecasting techniques, because the reduction of the inherent uncertainty of wind power allows them the adoption of optimal decisions. Wind power integration imposes new challenges as higher wind penetration levels are attained. Wind power ramp forecasting is an example of such a recent topic of interest. The term ramp makes reference to a large and rapid variation (1-4 hours) observed in the wind power output of a wind farm or portfolio. Ramp events can be motivated by a broad number of meteorological processes that occur at different time/spatial scales, from the passage of large-scale frontal systems to local processes such as thunderstorms and thermally-driven flows. Ramp events may also be conditioned by features related to the wind-to-power conversion process, such as yaw misalignment, the wind turbine shut-down and the aerodynamic interaction between wind turbines of a wind farm (wake effect). This work is devoted to wind power ramp forecasting, with special focus on the connection between the global scale and ramp events observed at the wind farm level. The framework of this study is the point-forecasting approach. Time series based models were implemented for very short-term prediction, this being characterised by prediction horizons up to six hours ahead. As a first step, a methodology to characterise ramps within a wind power time series was proposed. The so-called ramp function is based on the wavelet transform and it provides a continuous index related to the ramp intensity at each time step. The underlying idea is that ramps are characterised by high power output gradients evaluated under different time scales. A number of state-of-the-art time series based models were considered, namely linear autoregressive (AR) models, varying-coefficient models (VCMs) and artificial neural networks (ANNs). This allowed us to gain insights into how the complexity of the model contributes to the accuracy of the wind power time series modelling. The models were trained in base of a mean squared error criterion and the final set-up of each model was determined through cross-validation techniques. In order to investigate the contribution of the global scale into wind power ramp forecasting, a methodological proposal to identify features in atmospheric raw data that are relevant for explaining wind power ramp events was presented. The proposed methodology is based on two techniques: principal component analysis (PCA) for atmospheric data compression and mutual information (MI) for assessing non-linear dependence between variables. The methodology was applied to reanalysis data generated with a general circulation model (GCM). This allowed for the elaboration of explanatory variables meaningful for ramp forecasting that were utilized as exogenous variables by the forecasting models. The study covered two wind farms located in Spain. All the models outperformed the reference model (the persistence) during both ramp and non-ramp situations. Adding atmospheric information had a noticeable impact on the forecasting performance, specially during ramp-down events. Results also suggested different levels of connection between the ramp occurrence at the wind farm level and the global scale.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The characteristics of CC and CLP systems are in principle very dierent However a recent trend towards convergence in the implementation techniques for these systems can be observed While CLP and Prolog systems have been incorporating capabilities to deal with userdened suspension and coroutining CC compilers have been trying to coalesce negrained tasks into coarsergrained sequential threads This convergence of techniques opens up the possibility of having a general purpose kernel language and abstract machine to serve as a compilation target for a variety of userlevel languages We propose a transformation technique directed towards such an objective In particular we report on techniques to support the Andorra computational model essentially emulating the AndorraI system via program transformation into a sequential language with delay primitives The system is automatic comprising an optional program analyzer and a basic transformer to the kernel language It turns out that a simple parallel CLP or Prolog system with dynamic scheduling is sucient as a kernel language for this purpose The preliminary results are quite encouraging performance of the resulting system is comparable to the current AndorraI implementation.