32 results for statistical application

at Universidad Politécnica de Madrid


Relevance: 60.00%

Abstract:

ABSTRACT
This doctoral thesis consists of an empirical study of the linguistic competence of the Technical University of Madrid (UPM) industrial engineering students in the use of nominal groups (NG) in English for Academic and Professional Purposes (EAPP). In order to confirm that NG are the EAPP linguistic feature that presents the greatest difficulty to Spanish engineering students, a statistical analysis was carried out on the data obtained from the application of a general English test and the EAPP linguistic-feature tests (developed for this purpose). Consequently, this linguistic feature needs to be specifically taught in order to be used correctly by Spanish engineering students. The study begins by presenting the general language characteristics of EAPP, among which the frequent presence of NG in scientific and technical writing stands out. It verifies the hypothesis that the understanding and use of NG in English is the most difficult linguistic feature for Spanish engineering students. It explains the features of English NG by analyzing the words they are composed of, depicting the word classes, regularities and exceptions present in technical and scientific English. It also explains the behavior of the different grammatical categories that act as pre-modifiers of the noun and focuses on real examples taken from authentic publications and on quantitative data, to reach objective conclusions about the use and degree of difficulty of NG for the students. The research methodology consists of gathering data from the 5th-year industrial engineering students' tests and analyzing them by means of ANOVA. The data have been treated in relation to the students' Common European Framework of Reference for Languages (CEFRL) levels, which range from A2 to C1, although the majority lie between B1 and B2.
The conclusions are based on the results, which allow us to obtain relevant information about the understanding and use of NG (simple and complex) by the focus group, with a 95% confidence level. From these data, a methodological approach to NG teaching has been tested to help students acquire this linguistic feature. The general structure of this thesis is divided into six chapters. The first is an introduction containing the reasons that have motivated this piece of research, the hypotheses, objectives and methodology employed. The second deals with the distinctive linguistic features of EST, underlining the concepts of linguistic and communicative competence. Chapter three focuses on the grammatical aspects of NG. Chapter four contains the empirical study and the statistical analysis of the data. The results allow us to reach objective conclusions about the degree of difficulty of the EAPP linguistic features studied, focusing on simple and complex NG. Chapter five discusses a methodological approach to the teaching of NG in an EST context, comparing students' test results before and after the NG teaching application. Finally, chapter six discusses the findings obtained throughout the study, presenting the conclusions, limitations and recommendations for future research in this area.
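As a hedged illustration of the statistical method described above (all scores, group sizes and level labels below are invented, not the thesis data), a one-way ANOVA compares mean NG-test scores across CEFRL levels and yields an F statistic that can be checked against the 95% critical value:

```python
# Hedged sketch of the method, not the thesis code: one-way ANOVA on NG-test
# scores grouped by CEFRL level. All scores and group labels are invented.

def one_way_anova(groups):
    """Return the F statistic of a one-way ANOVA over a list of samples."""
    k = len(groups)                          # number of groups
    n = sum(len(g) for g in groups)          # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

scores_by_level = {            # invented NG-test scores per CEFRL level
    "B1": [4.1, 5.0, 4.6, 3.9, 4.4],
    "B2": [5.8, 6.1, 5.5, 6.4, 5.9],
    "C1": [7.2, 6.9, 7.5, 7.0, 7.3],
}
f_stat = one_way_anova(list(scores_by_level.values()))
print(round(f_stat, 2))
```

With these toy numbers the F statistic far exceeds the 5% critical value F(2, 12) ≈ 3.89, so the between-level differences would be significant at the 95% confidence level.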

Relevance: 40.00%

Abstract:

Application of Monte Carlo simulation and Analysis of Variance (ANOVA) techniques to the comparison of dynamic stochastic models for traffic accidents.

Relevance: 40.00%

Abstract:

Characterization of mechanical dissipation processes based on the microstructure of soft tissues. We present a continuous damage model with regularized softening (smeared crack model) for fiber-reinforced soft tissues. The material parameters of the continuous model derive from the mesoscopic scale, at which the continuum is considered a collagenous fibril-reinforced composite. We study the continuum-level response as a function of the nanoscale properties of the collagen and the adhesion forces between the tropocollagen molecules.

Relevance: 30.00%

Abstract:

The statistical distributions of different software properties have been thoroughly studied in the past, including software size, complexity and the number of defects. In the case of object-oriented systems, these distributions have been found to obey a power law, a common statistical distribution also found in many other fields. However, we have found that for some statistical properties the behavior does not entirely follow a power law, but rather a mixture of a lognormal and a power-law distribution. Our study is based on the Qualitas Corpus, a large compendium of diverse Java-based software projects. We have measured the Chidamber and Kemerer metrics suite for every file of every Java project in the corpus. Our results show that the range of high values for the different metrics follows a power-law distribution, whereas the rest of the range follows a lognormal distribution. This is a pattern typical of so-called double Pareto distributions, also found in empirical studies of other software properties.
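A minimal sketch of how such a tail analysis might look (synthetic data, not the Qualitas Corpus; the tail size k is an assumption): the body of the metric is drawn from a lognormal, the upper range from a power law, and the tail exponent is recovered with the Hill estimator over the largest k observations.

```python
# Illustrative sketch, not the paper's code: synthetic lognormal body plus a
# power-law (Pareto) tail, with the tail exponent estimated via Hill.
import math
import random

random.seed(0)
body = [random.lognormvariate(1.0, 0.5) for _ in range(900)]     # lognormal body
tail = [random.paretovariate(2.5) * 20 for _ in range(100)]      # exponent 2.5
sample = sorted(body + tail, reverse=True)

def hill_estimator(values_desc, k):
    """Hill estimate of the power-law exponent from the k largest values."""
    logs = [math.log(values_desc[i] / values_desc[k]) for i in range(k)]
    return k / sum(logs)

alpha_hat = hill_estimator(sample, 80)   # k = 80 is an assumed tail cut-off
print(round(alpha_hat, 2))
```

The estimate should land near the true tail exponent of 2.5, while the lognormal body, analyzed the same way, would not show a stable exponent.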

Relevance: 30.00%

Abstract:

An image processing observational technique for the stereoscopic reconstruction of the waveform of oceanic sea states is developed. The technique incorporates the enforcement of any given statistical wave law modeling the quasi-Gaussianity of oceanic waves observed in nature. The problem is posed in a variational optimization framework, where the desired waveform is obtained as the minimizer of a cost functional that combines image observations, smoothness priors and a weak statistical constraint. The minimizer is obtained by combining gradient descent and multigrid methods on the necessary optimality equations of the cost functional. Robust photometric error criteria and a spatial intensity compensation model are also developed to improve the performance of the presented image matching strategy. The weak statistical constraint is thoroughly evaluated in combination with the other elements presented to reconstruct and enforce constraints on experimental stereo data, demonstrating the improvement in the estimation of the observed ocean surface.
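A toy one-dimensional analogue of the variational scheme (assumed data; the actual method works on stereo image observations and uses multigrid acceleration): gradient descent on a cost that combines a data-fidelity term with a smoothness prior.

```python
# Toy 1-D analogue of the variational reconstruction (assumed data): minimize
# J(u) = sum_i (u_i - d_i)^2 + lam * sum_i (u_{i+1} - u_i)^2 by gradient descent.
import math
import random

random.seed(1)
n = 100
truth = [math.sin(2 * math.pi * i / n) for i in range(n)]   # the "waveform"
data = [t + random.gauss(0, 0.3) for t in truth]            # noisy observations

lam, step = 2.0, 0.05
u = data[:]                                  # start from the observations
for _ in range(500):
    grad = []
    for i in range(n):
        g = 2 * (u[i] - data[i])                 # data-fidelity term
        if i > 0:
            g += 2 * lam * (u[i] - u[i - 1])     # smoothness prior
        if i < n - 1:
            g += 2 * lam * (u[i] - u[i + 1])
        grad.append(g)
    u = [ui - step * gi for ui, gi in zip(u, grad)]

rmse_noisy = math.sqrt(sum((d - t) ** 2 for d, t in zip(data, truth)) / n)
rmse_smooth = math.sqrt(sum((ui - t) ** 2 for ui, t in zip(u, truth)) / n)
print(round(rmse_noisy, 3), round(rmse_smooth, 3))
```

The smoothness prior pulls the estimate below the noise floor of the raw observations, which is the basic mechanism the cost functional exploits.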

Relevance: 30.00%

Abstract:

ABSTRACT
This thesis addresses on-road vehicle detection and tracking with a monocular vision system. This problem has attracted the attention of the automotive industry and the research community, as it is the first step towards driver assistance and collision avoidance systems and, ultimately, autonomous driving. Although much effort has been devoted to it in recent years, no satisfactory solution has yet been devised, and thus it remains an active research issue. The main challenges for vision-based vehicle detection and tracking are the high variability among vehicles, the dynamically changing background due to camera motion and the real-time processing requirement. In this thesis, a unified approach using statistical methods is presented for vehicle detection and tracking that tackles these issues. The approach is divided into three primary tasks, i.e., vehicle hypothesis generation, hypothesis verification, and vehicle tracking, which are performed sequentially. Nevertheless, the exchange of information between processing blocks is fostered so that the maximum degree of adaptation to changes in the environment can be achieved and the computational cost is alleviated. Two complementary strategies are proposed to address the first task, i.e., hypothesis generation, based respectively on appearance and geometry analysis. To this end, the use of a rectified domain in which the perspective is removed from the original image is especially interesting, as it allows for fast image scanning and coarse hypothesis generation. The final vehicle candidates are produced using a collaborative framework between the original and the rectified domains. A supervised classification strategy is adopted for the verification of the hypothesized vehicle locations. In particular, state-of-the-art methods for feature extraction are evaluated and new descriptors are proposed by exploiting the knowledge on vehicle appearance.
Due to the lack of appropriate public databases, a new database is generated and the classification performance of the descriptors is extensively tested on it. Finally, a methodology for the fusion of the different classifiers is presented and the best combinations are discussed. The core of the proposed approach is a Bayesian tracking framework using particle filters. Contributions are made on its three key elements: the inference algorithm, the dynamic model and the observation model. In particular, the use of a Markov chain Monte Carlo method is proposed for sampling, which circumvents the exponential complexity increase of traditional particle filters, thus making joint multiple-vehicle tracking affordable. On the other hand, the aforementioned rectified domain allows for the definition of a constant-velocity dynamic model, since it preserves the smooth motion of vehicles on highways. Finally, a multiple-cue observation model is proposed that not only accounts for vehicle appearance but also integrates the available information from the analysis in the previous blocks. The proposed approach runs in near real-time on a general-purpose PC and delivers outstanding results compared to traditional methods.
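The constant-velocity dynamic model that the rectified domain makes possible can be sketched as follows (state values and time step are invented; the actual tracker embeds this in an MCMC-based particle filter over multiple vehicles):

```python
# Sketch of a constant-velocity dynamic model in the rectified (bird's-eye)
# domain; the state and time step are invented for illustration.
def predict(state, dt=1.0):
    """Propagate an (x, y, vx, vy) state one step under constant velocity."""
    x, y, vx, vy = state
    return (x + vx * dt, y + vy * dt, vx, vy)

state = (10.0, 5.0, 1.5, 0.0)    # assumed position (m) and velocity (m/step)
print(predict(state))
```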

Relevance: 30.00%

Abstract:

This work presents a systematic method for the generation and treatment of alarm graphs, whose final objective is to find the root cause of the massive alarm floods produced in dispatching centers. Although much work on this matter has already been carried out, the problem of alarm management in industry remains unsolved. In this paper, a simple statistical analysis of the historical database is conducted. The results obtained by the alarm acquisition systems are used to generate a directed graph from which the most significant alarms are extracted, after analyzing the cases in which a large number of alarms are produced.
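The graph-generation idea can be sketched as follows (alarm names and the log are invented): consecutive alarm pairs in the historical database become weighted directed edges, and the node with the largest outgoing weight is flagged as a root-cause candidate.

```python
# Hedged sketch; alarm names and the log are invented. Consecutive alarms become
# weighted directed edges; the heaviest source node is a root-cause candidate.
from collections import Counter

history = ["PUMP_TRIP", "LOW_FLOW", "HIGH_TEMP", "PUMP_TRIP", "LOW_FLOW",
           "PUMP_TRIP", "LOW_FLOW", "LOW_PRESSURE", "PUMP_TRIP", "HIGH_TEMP"]

edges = Counter(zip(history, history[1:]))    # directed edge -> frequency
out_weight = Counter()
for (src, _dst), w in edges.items():
    out_weight[src] += w                      # total weight of outgoing edges

root_candidate = out_weight.most_common(1)[0][0]
print(root_candidate)
```

A real implementation would also window the log around flood episodes before counting, as the abstract describes.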

Relevance: 30.00%

Abstract:

This paper presents some of the results of a method to determine the main reliability functions of concentrator solar cells. High-concentration GaAs single-junction solar cells have been tested in an Accelerated Life Test. The method can be directly applied to multi-junction solar cells. The main conclusions of this test show that these solar cells are robust devices with a very low probability of failure caused by degradation during their operating life (more than 30 years). The probability-of-operation function (i.e. the reliability function R(t)) is evaluated for two nominal operating conditions of these cells, namely simulated concentration ratios of 700 and 1050 suns. Preliminary determination of the Mean Time To Failure indicates a value much higher than the intended operating lifetime of the concentrator cells.
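For a Weibull life model, commonly used in accelerated life testing, the reliability function and the Mean Time To Failure follow directly from the shape and scale parameters; the values below are assumed for illustration, not the parameters fitted in the paper.

```python
# Illustration of the reliability functions involved; beta and eta below are
# assumed values, not the paper's fitted parameters.
import math

beta, eta_years = 2.0, 120.0      # assumed Weibull shape and scale (years)

def reliability(t):
    """Weibull reliability R(t) = exp(-(t/eta)^beta)."""
    return math.exp(-((t / eta_years) ** beta))

mttf = eta_years * math.gamma(1 + 1 / beta)   # Mean Time To Failure
print(round(reliability(30), 3), round(mttf, 1))
```

With these assumed parameters, R(30 years) stays close to 1 and the MTTF is well above the 30-year target, mirroring the qualitative conclusion of the paper.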

Relevance: 30.00%

Abstract:

Ontologies and taxonomies are widely used to organize concepts, providing the basis for activities such as indexing and serving as background knowledge for NLP tasks. As such, translation of these resources would prove useful for adapting these systems to new languages. However, we show that the nature of these resources is significantly different from the "free-text" paradigm used to train most statistical machine translation systems. In particular, we see significant differences in the linguistic nature of these resources, and such resources carry rich additional semantics. We demonstrate that, as a result of these linguistic differences, standard SMT methods, in particular evaluation metrics, can produce poor performance. We then turn to the task of leveraging these semantics for translation, which we approach in three ways: by adapting the translation system to the domain of the resource; by examining whether semantics can help to predict the syntactic structure used in translation; and by evaluating whether we can use existing translated taxonomies to disambiguate translations. We present some early results from these experiments, which shed light on the degree of success we may have with each approach.

Relevance: 30.00%

Abstract:

This work explores the automatic recognition of physical activity intensity patterns from multi-axial accelerometry and heart rate signals. Data collection was carried out in free-living conditions and in three controlled gymnasium circuits, for a total of 179.80 h of data divided into: sedentary situations (65.5%), light-to-moderate activity (17.6%) and vigorous exercise (16.9%). The proposed machine learning algorithms comprise the following steps: time-domain feature definition, standardization and PCA projection, unsupervised clustering (by k-means and GMM) and an HMM to account for long-term temporal trends. Performance was evaluated by 30 runs of a 10-fold cross-validation. Both the k-means and the GMM-based approaches yielded high overall accuracy (86.97% and 85.03%, respectively) and, given the imbalance of the dataset, respectable F-measures (up to 77.88%) for non-sedentary cases. Classification errors tended to be concentrated around transients, which limits their practical impact. Hence, we consider our proposal suitable for 24 h monitoring of physical activity in ambulatory scenarios and a first step towards intensity-specific energy expenditure estimators.
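The unsupervised clustering step can be illustrated with a minimal k-means sketch (one-dimensional, invented heart-rate-like values; the paper works on multi-dimensional PCA-projected features):

```python
# Minimal k-means (Lloyd's algorithm) sketch of the unsupervised clustering
# step; the 1-D "heart-rate" values and initial centroids are invented.
features = [52, 55, 58, 60, 95, 100, 105, 160, 165, 170]   # bpm, invented
centroids = [52.0, 95.0, 160.0]                            # naive initialization

for _ in range(10):                                        # Lloyd iterations
    clusters = [[] for _ in centroids]
    for x in features:
        nearest = min(range(len(centroids)), key=lambda j: abs(x - centroids[j]))
        clusters[nearest].append(x)
    centroids = [sum(c) / len(c) for c in clusters]        # update step

print(centroids)
```

The three resulting centroids play the role of the sedentary, light-to-moderate and vigorous intensity levels before the HMM smooths them over time.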

Relevance: 30.00%

Abstract:

Wind power time series usually show complex dynamics, mainly due to non-linearities related to the wind physics and the power transformation process in wind farms. This article provides an approach to the incorporation of observed local variables (wind speed and direction) to model some of these effects by means of statistical models. To this end, a benchmark between two different families of varying-coefficient models (regime-switching and conditional parametric models) is carried out. The case of the offshore wind farm of Horns Rev in Denmark has been considered. The analysis is focused on one-step-ahead forecasting at a time series resolution of 10 min. It has been found that the local wind direction contributes to model some features of the prevailing winds, such as the impact of the wind direction on the wind variability, whereas the non-linearities related to the power transformation process can be introduced by considering the local wind speed. In both cases, conditional parametric models showed a better performance than that achieved by the regime-switching strategy. The results attained reinforce the idea that each explanatory variable allows the modelling of different underlying effects in the dynamics of wind power time series.
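A deliberately simplified sketch of the regime-switching idea (coefficients, regime boundaries and data are all invented): the observed wind direction selects which persistence coefficient drives the one-step-ahead power forecast.

```python
# Toy regime-switching forecaster; the coefficients and the split of wind
# directions into two regimes are invented for illustration.
def forecast(power_prev, direction_deg, coeffs):
    """One-step-ahead forecast: the direction regime selects the coefficient."""
    regime = "westerly" if 180 <= direction_deg < 360 else "easterly"
    return coeffs[regime] * power_prev

coeffs = {"westerly": 0.98, "easterly": 0.90}   # assumed persistence per regime
print(forecast(50.0, 270, coeffs), forecast(50.0, 90, coeffs))
```

A conditional parametric model would instead let the coefficient vary smoothly with the direction, rather than jump between discrete regimes, which is the comparison the article carries out.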

Relevance: 30.00%

Abstract:

Pragmatism is the leading motivation of regularization. We can understand regularization as a modification of the maximum-likelihood estimator so that a reasonable answer can be given in an unstable or ill-posed situation. To mention some typical examples, this happens when fitting parametric or non-parametric models with more parameters than data, or when estimating large covariance matrices. Regularization is also commonly used to improve the bias-variance tradeoff of an estimation. The definition of regularization is therefore quite general and, although the introduction of a penalty is probably the most popular type, it is just one of multiple forms of regularization. In this dissertation, we focus on the applications of regularization for obtaining sparse or parsimonious representations, where only a subset of the inputs is used. A particular form of regularization, L1-regularization, plays a key role in reaching sparsity. Most of the contributions presented here revolve around L1-regularization, although other forms of regularization are explored (also pursuing sparsity in some sense). In addition to presenting a compact review of L1-regularization and its applications in statistics and machine learning, we devise methodology for regression, supervised classification and structure induction of graphical models. Within the regression paradigm, we focus on kernel smoothing, proposing techniques for kernel design that are suitable for high-dimensional settings and sparse regression functions. We also present an application of regularized regression techniques for modeling the response of biological neurons. The supervised classification advances deal, on the one hand, with the application of regularization for obtaining a naïve Bayes classifier and, on the other hand, with a novel algorithm for brain-computer interface design that uses group regularization in an efficient manner.
Finally, we present a heuristic for inducing structures of Gaussian Bayesian networks using L1-regularization as a filter.
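The mechanism by which L1-regularization produces sparsity can be shown with its proximal operator, soft-thresholding (the example coefficients and threshold below are invented): small coefficients are set exactly to zero.

```python
# The proximal operator of the L1 penalty (soft-thresholding); the coefficient
# vector and the threshold lam are invented example values.
def soft_threshold(z, lam):
    """Shrink z toward zero by lam, clipping to exactly zero in [-lam, lam]."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

coefs = [2.5, -0.3, 0.8, -1.7, 0.05]
sparse = [soft_threshold(c, 0.5) for c in coefs]
print(sparse)
```

This exact-zeroing behavior, absent from L2 penalties, is what makes L1-regularization select a subset of the inputs.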

Relevance: 30.00%

Abstract:

The full text of this article is available in the PDF provided.

Relevance: 30.00%

Abstract:

The geometrical factors defining an adhesive joint are of great importance, as the design strongly conditions the performance of the bond. One of the most relevant geometrical factors is the thickness of the adhesive, as it decisively influences the mechanical properties of the bond and has a clear economic impact on manufacturing processes, especially in long runs. Traditional mechanical joints (riveting, welding, etc.) are characterised by predictable performance and are very reliable in service conditions. Thus, structural adhesive joints will only be selected for industrial applications with demanding mechanical requirements and adverse environmental conditions if suitable reliability (equal to or higher than that of mechanical joints) is guaranteed. For this purpose, the objective of this paper is to analyse the influence of the adhesive thickness on the mechanical behaviour of the joint and, by means of a statistical analysis based on the Weibull distribution, to propose the optimum adhesive thickness combining the best mechanical performance with high reliability. This procedure, which can be applied without great difficulty to other joints and adhesives, provides a general framework for the more reliable use of adhesive bonds and, therefore, for their better and wider use in industrial manufacturing processes.
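A sketch of the Weibull analysis step (the strength values below are invented, not the paper's measurements): median-rank regression gives first estimates of the shape and scale parameters from a small sample of joint strengths.

```python
# Hedged sketch of a two-parameter Weibull fit by median-rank regression;
# the strength sample is invented.
import math

strengths = sorted([18.2, 21.5, 22.1, 23.8, 24.9, 26.0, 27.4, 29.1])  # MPa
n = len(strengths)
xs, ys = [], []
for i, s in enumerate(strengths, start=1):
    f = (i - 0.3) / (n + 0.4)               # Bernard's median-rank estimate
    xs.append(math.log(s))
    ys.append(math.log(-math.log(1.0 - f)))

# Least-squares slope is the Weibull shape (beta); the intercept gives eta.
mx, my = sum(xs) / n, sum(ys) / n
beta = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        / sum((x - mx) ** 2 for x in xs))
eta = math.exp(mx - my / beta)
print(round(beta, 2), round(eta, 2))
```

A higher shape parameter means less scatter in strength, i.e. higher reliability, which is the criterion combined with mean performance when choosing the optimum thickness.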

Relevance: 30.00%

Abstract:

This article describes a sentence-selection strategy for tuning a statistical machine translation system, based on the Moses decoder, that translates from Spanish into English. We propose two ways of selecting, from the development corpus, the sentences most similar to those we want to translate (the source-language test sentences). With this selection we can obtain better model weights to use later in the translation process and, therefore, improve the results. Specifically, with the selection method based on the similarity measure proposed in this article, we improve the BLEU score from 27.17% with the full development corpus to 27.27% when selecting the tuning sentences. These results approach those of the ORACLE experiment, in which the test sentences themselves are used to tune the weights; in that case the BLEU obtained is 27.51%.
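A toy version of the selection idea (sentences invented; the article's similarity measure is more elaborate): rank development-set sentences by word-overlap similarity to a test sentence and keep the closest ones for tuning.

```python
# Toy sentence selection by word overlap; dev and test sentences are invented
# and the similarity measure stands in for the one proposed in the article.
def jaccard(a, b):
    """Word-overlap (Jaccard) similarity between two sentences."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb)

dev = ["el coche rojo es rápido",
       "la casa es grande",
       "el coche azul es lento"]
test_sentence = "el coche verde es rápido"

ranked = sorted(dev, key=lambda s: jaccard(s, test_sentence), reverse=True)
print(ranked[0])
```

Tuning the decoder weights on the top-ranked subset approximates tuning on the test distribution itself, which is why the result approaches the ORACLE score.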