906 results for Autoregressive-Moving Average model
Abstract:
Predicting failures in a distributed system from previous events using logistic regression is a standard approach in the literature. This technique is not reliable, though, in two situations: in the prediction of rare events, which do not occur in sufficient proportion for the algorithm to capture, and in environments with too many variables, where logistic regression tends to overfit and manually selecting a subset of variables to build the model is error-prone. In this paper, we solve an industrial research case that presented this situation with a combination of elastic net logistic regression, a method that allows us to select useful variables automatically, a cross-validation process on top of it, and the application of a rare-events prediction technique to reduce computation time. This process provides two layers of cross-validation that automatically obtain the optimal model complexity and the optimal model parameter values, while ensuring that even rare events will be correctly predicted with a small number of training instances. We tested this method against real industrial data, obtaining a total of 60 out of 80 possible models with a 90% average model accuracy.
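For reference, elastic net logistic regression is commonly written as a penalized log-likelihood problem. The formulation below is the standard textbook one and not necessarily the exact parametrization used by the authors; the symbols \lambda (overall penalty strength) and \alpha (mix between the L1 and L2 penalties) are assumptions introduced here for illustration.

```latex
\hat{\beta}_0, \hat{\beta}
  = \arg\min_{\beta_0,\,\beta}\;
    -\frac{1}{n}\sum_{i=1}^{n}\Big[\, y_i \log p_i + (1-y_i)\log(1-p_i) \Big]
    + \lambda\Big( \alpha\,\lVert\beta\rVert_1
                 + \tfrac{1-\alpha}{2}\,\lVert\beta\rVert_2^2 \Big),
\qquad
p_i = \frac{1}{1+e^{-(\beta_0 + x_i^{\top}\beta)}}
```

The L1 term can drive some coefficients exactly to zero, which is what performs the automatic variable selection; \lambda and \alpha are the kind of values a cross-validation layer would typically tune.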
Abstract:
The purpose of this thesis was to investigate the offensive performance of elite handball teams when handball is considered as a complex non-linear dynamical system. A time-dependent dynamic approach was adopted to assess teams’ performance during the game. The overall sample comprised the 240 games played in the 2011-2012 season of the Spanish men’s professional handball league (Liga ASOBAL). In the subsequent analyses, only close games (final goal difference ≤ 5; n = 142) were considered. Match status, game location, quality of opposition, and game period were incorporated into the analysis as situational variables. Three studies composed the core of the thesis. In the first study, we analyzed the coordination between the time series representing the scoring processes of the two opposing teams throughout the game. Autocorrelation, cross-correlation, double moving average, and the Hilbert transform were used for the analysis. The scoring processes of the teams presented high consistency across all the games as well as strong in-phase modes of coordination in all game contexts. The only differences were found in relation to game period. The coordination in the scoring processes of the teams was significantly lower in the 1st and 2nd periods (0–10 min and 10–20 min), with coordination clearly increasing as the game progressed. This suggests that the first 20 minutes are the ones in which games are broken open. In the second study, we analyzed the temporal effects (immediate, short-term, and medium-term) of team timeouts on teams’ scoring performance. Multiple linear regression models were used for the analysis. The results showed increments of 0.59, 1.40 and 1.85 goals for the periods comprising the first, third and fifth ball possessions after the timeout, for the teams that requested it.
Conversely, significant negative effects on goals scored were found for the opposing teams, with decrements of 0.59, 1.43 and 2.04 goals for the same periods, respectively. The influence of situational variables on scoring performance was registered only in certain game periods. Finally, in the third study, we analyzed the temporal effects of player exclusions on teams’ scoring performance, both for the teams that suffer the exclusion (numerical inferiority) and for their opponents (numerical superiority). Multiple linear regression models were used for the analysis. The results showed significant negative effects on the number of goals scored by the teams with one player fewer, with decrements of 0.25, 0.40, 0.61, 0.62, and 0.57 goals for the periods comprising the one, two, three, four and five minutes before and after the exclusion. For the opposing teams, the results showed positive effects, with increments of the same magnitude in the same game periods. This trend was not affected by match status, game location, quality of opposition, or game period. The scoring increments were smaller than might be expected from a 2-minute numerical superiority. Psychological theories such as choking under pressure, in situations where good performance is expected, could help explain this finding. The final chapters of the thesis enumerate the main conclusions and highlight the main practical applications that arise from the three studies. Lastly, limitations and future research directions are described.
Abstract:
We argue that given even an infinitely long data sequence, it is impossible (with any test statistic) to distinguish perfectly between linear and nonlinear processes (including slightly noisy chaotic processes). Our approach is to consider the set of moving-average (linear) processes and study its closure under a suitable metric. We give the precise characterization of this closure, which is unexpectedly large, containing nonergodic processes, which are Poisson sums of independent and identically distributed copies of a stationary process. Proofs of these results will appear elsewhere.
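For readers unfamiliar with the term, one common definition of a moving-average (linear) process is given below. The abstract does not state which variant the authors use, so the two-sided sum, the square-summability condition and the symbols \psi_j and \varepsilon_t are assumptions made here for illustration.

```latex
X_t = \sum_{j=-\infty}^{\infty} \psi_j\,\varepsilon_{t-j},
\qquad \sum_{j=-\infty}^{\infty} \psi_j^{2} < \infty,
\qquad (\varepsilon_t)_{t\in\mathbb{Z}} \ \text{i.i.d. with } \mathbb{E}[\varepsilon_t]=0,\ \mathbb{E}[\varepsilon_t^{2}]<\infty
```

The closure discussed in the abstract is the set of processes that can be approximated arbitrarily well, in the chosen metric, by processes of this form.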
Abstract:
The analysis of heart rate variability (HRV) uses time series containing the intervals between successive heartbeats in order to assess autonomic regulation of the cardiovascular system. These series are obtained from analysis of the electrocardiogram (ECG) signal, which can be affected by different types of artifacts, leading to incorrect interpretations of the HRV signals. The classic approach to dealing with these artifacts involves correction methods, some of them based on interpolation, substitution or statistical techniques. However, few studies show the accuracy and performance of these correction methods on real HRV signals. This study aims to determine the performance of some linear and nonlinear correction methods on HRV signals with induced artifacts by quantifying their linear and nonlinear HRV parameters. As part of the methodology, ECG signals of rats measured by telemetry were used to generate real, error-free HRV signals. Missing points (beats) were simulated in these series in different quantities in order to emulate a real experimental situation as accurately as possible. To compare recovery efficiency, deletion (DEL), linear interpolation (LI), cubic spline interpolation (CI), moving average window (MAW) and nonlinear predictive interpolation (NPI) were used as correction methods for the series with induced artifacts. The accuracy of each correction method was assessed by measuring the mean value of the series (AVNN), the standard deviation (SDNN), the root mean square of the differences between successive heartbeats (RMSSD), Lomb's periodogram (LSP), detrended fluctuation analysis (DFA), multiscale entropy (MSE) and symbolic dynamics (SD) on each HRV signal with and without artifacts.
The results show that, at low levels of missing points, the performance of all correction techniques is very similar, with very close values for each HRV parameter. However, at higher levels of loss, only the NPI method yields HRV parameters with low error values and few significant differences compared with the values calculated for the same signals without missing points.
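The evaluation idea described above (corrupt a clean series, correct it, then compare time-domain parameters) can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names are hypothetical, missing beats are represented as `None` entries (a simplification), and only the DEL and LI corrections and the AVNN, SDNN and RMSSD parameters are shown.

```python
import math

def avnn(nn):
    """Mean of the beat-to-beat intervals."""
    return sum(nn) / len(nn)

def sdnn(nn):
    """Sample standard deviation of the beat-to-beat intervals."""
    m = avnn(nn)
    return math.sqrt(sum((x - m) ** 2 for x in nn) / (len(nn) - 1))

def rmssd(nn):
    """Root mean square of the differences between successive intervals."""
    diffs = [b - a for a, b in zip(nn, nn[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

def delete_missing(series):
    """DEL correction: drop the missing points."""
    return [x for x in series if x is not None]

def linear_interpolate(series):
    """LI correction: fill each run of None values on a straight line
    between its valid neighbours. Assumes the first and last samples
    are valid."""
    filled = list(series)
    i = 0
    while i < len(filled):
        if filled[i] is None:
            start, end = i - 1, i
            while filled[end] is None:
                end += 1
            step = (filled[end] - filled[start]) / (end - start)
            for k in range(start + 1, end):
                filled[k] = filled[start] + step * (k - start)
            i = end
        else:
            i += 1
    return filled

# Made-up intervals in milliseconds, with two simulated missing points.
clean = [200.0, 210.0, 205.0, 195.0, 200.0, 208.0]
corrupted = [200.0, None, None, 195.0, 200.0, 208.0]

for name, corrected in (("DEL", delete_missing(corrupted)),
                        ("LI", linear_interpolate(corrupted))):
    error = abs(rmssd(corrected) - rmssd(clean))
    print(name, "RMSSD error:", round(error, 3))
```

Comparing each parameter on the corrected series against its value on the clean series gives the error that the study reports for each correction method.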
Abstract:
Objective: To investigate gender-specific relationships between self-reported sexual abuse, antisocial behaviour and substance use in a large community sample of adolescents. Method: A cross-sectional study of students aged, on average, 13 (n = 2596), 14 (n = 2475) and 15 years (n = 2290), from 27 schools in South Australia with a questionnaire including sexual abuse, frequency and severity of substance use, depressive symptomatology (CES-D), family functioning (McMaster Family Assessment Device), and antisocial behaviour (an adapted 22-item Self-Report Delinquency Scale). Logistic regression analyses using HLM V5.05 with a population-average model were conducted. Results: In the model considered, reported sexual abuse is significantly independently associated with antisocial behaviour, controlling for confounding factors of depressive symptomatology and family dysfunction, with increased risks of three- to eightfold for sexually abused boys, and two- to threefold for sexually abused girls, compared to nonabused. Increased risks of extreme substance use in sexually abused girls (age 13) and boys (ages 13-15) are more than fourfold, compared to nonabused. Age differences were not statistically significant. Conclusion: Childhood sexual abuse is a risk factor for the development of antisocial behaviour and substance use in young adolescents. Clinicians should be aware of gender differences.
Abstract:
2000 Mathematics Subject Classification: 60G70, 60F12, 60G10.
Abstract:
Physiological signals, which are controlled by the autonomic nervous system (ANS), could be used to detect the affective state of computer users and therefore find applications in medicine and engineering. The Pupil Diameter (PD) seems to provide a strong indication of the affective state, as found by previous research, but it has not been investigated fully yet. In this study, new approaches based on monitoring and processing the PD signal for off-line and on-line affective assessment ("relaxation" vs. "stress") are proposed. Wavelet denoising and Kalman filtering methods are first used to remove abrupt changes in the raw PD signal. Then three features (PDmean, PDmax and PDWalsh) are extracted from the preprocessed PD signal for the affective state classification. In order to select more relevant and reliable physiological data for further analysis, two types of data selection methods are applied, which are based on the paired t-test and subject self-evaluation, respectively. In addition, five different kinds of classifiers are implemented on the selected data, which achieve average accuracies up to 86.43% and 87.20%, respectively. Finally, the receiver operating characteristic (ROC) curve is utilized to investigate the discriminating potential of each individual feature by evaluation of the area under the ROC curve, which reaches values above 0.90. For the on-line affective assessment, a hard threshold is implemented first in order to remove the eye blinks from the PD signal, and then a moving average window is utilized to obtain the representative value PDr for every one-second time interval of PD. There are three main steps in the on-line affective assessment algorithm: preparation, feature-based decision voting and affective determination.
The final results show that the accuracies are 72.30% and 73.55% for the data subsets, which were respectively chosen using two types of data selection methods (paired t-test and subject self-evaluation). In order to further analyze the efficiency of affective recognition through the PD signal, the Galvanic Skin Response (GSR) was also monitored and processed. The highest affective assessment classification rate obtained from GSR processing is only 63.57% (based on the off-line processing algorithm). The overall results confirm that the PD signal should be considered as one of the most powerful physiological signals to involve in future automated real-time affective recognition systems, especially for detecting the "relaxation" vs. "stress" states.
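The on-line preprocessing described in this abstract (a hard threshold to discard blinks, followed by averaging to get one representative value per second) might look roughly like the sketch below. The details are assumptions: blinks are taken to appear as samples below a minimum plausible diameter, the window is read as non-overlapping one-second blocks, and the names `remove_blinks`, `representative_values`, `min_valid` and `rate_hz` are hypothetical.

```python
def remove_blinks(samples, min_valid):
    """Hard threshold: samples below min_valid are treated as blinks
    and replaced by None."""
    return [s if s >= min_valid else None for s in samples]

def representative_values(samples, rate_hz):
    """Average the valid samples inside each consecutive one-second
    window; a window containing only blinks yields None."""
    out = []
    for start in range(0, len(samples), rate_hz):
        window = [s for s in samples[start:start + rate_hz] if s is not None]
        out.append(sum(window) / len(window) if window else None)
    return out

# Made-up pupil diameters (mm) sampled at 4 Hz; zeros stand for blinks.
pd_signal = [4.0, 4.2, 0.0, 4.4, 5.0, 0.0, 0.0, 5.2]
cleaned = remove_blinks(pd_signal, min_valid=1.0)
pd_r = representative_values(cleaned, rate_hz=4)
print(pd_r)
```

Dropping blink samples before averaging keeps a blink from pulling the per-second value toward zero, which would otherwise look like a sudden pupil constriction.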
Abstract:
This dissertation introduces the design of a multimodal, adaptive real-time assistive system as an alternative human-computer interface that can be used by individuals with severe motor disabilities. The proposed design is based on the integration of a remote eye-gaze tracking system, voice recognition software, and a virtual keyboard. The methodology relies on a user profile that customizes eye-gaze tracking using neural networks. The user profiling feature facilitates the notion of universal access to computing resources for a wide range of applications such as web browsing, email, word processing and editing. The study is significant in terms of the integration of key algorithms to yield an adaptable and multimodal interface. The contributions of this dissertation stem from the following accomplishments: (a) establishment of the data transport mechanism between the eye-gaze system and the host computer, yielding a low failure rate of 0.9%; (b) accurate translation of eye data into cursor movement through a series of steps that conclude with calibrated cursor coordinates using an improved conversion function, resulting in an average reduction of 70% in the disparity between the point of gaze and the actual position of the mouse cursor, compared with initial findings; (c) use of both a moving average and a trained neural network to minimize the jitter of the mouse cursor, which yields an average jitter reduction of 35%; (d) introduction of a new mathematical methodology to measure the degree of jitter of the mouse trajectory; (e) embedding of an onscreen keyboard to facilitate text entry, and a graphical interface that is used to generate user profiles for system adaptability.
The adaptability of the interface is achieved through the establishment of user profiles, which may contain the jitter and voice characteristics of a particular user as well as a customized list of the most commonly used words, ordered according to the user's preferences: in alphabetical or statistical order. This allows the system to successfully provide the capability of interacting with a computer. Each time any of the sub-systems is retrained, the accuracy of the interface response improves further.
Abstract:
In 2009, South American military spending reached a total of $51.8 billion, a fifty percent increase over 2000 expenditures. The five-year moving average of arms transfers to South America was 150 percent higher from 2005 to 2009 than for 2000 to 2004.[1] These figures and others have led some observers to conclude that Latin America is engaged in an arms race. Other reasons, however, account for Latin America’s large military expenditure. Among them: Several countries have undertaken prolonged modernization efforts, recently made possible by six years of consistent regional growth.[2] A generational shift is at hand. Armed forces are beginning to shed the stigma of association with past dictatorial regimes.[3] Countries are pursuing specific individual strategies, rather than reacting to purchases made by neighbors. For example, Brazil wants to attain greater control of its Amazon rainforests and offshore territories, Colombia’s spending is a response to internal threats, and Chile is continuing a modernization process begun in the 1990s.[4] Concerns remain, however: Venezuela continues to demonstrate poor democratic governance and a lack of transparency; neighbor-state relations between Colombia and Venezuela, Peru and Chile, and Bolivia and Paraguay must all continue to be monitored; and Brazil’s military purchases, although legitimate, will likely result in a large accumulation of equipment.[5] These concerns can best be addressed by strengthening transparent procurement mechanisms and garnering greater participation in them.[6] The United States can do its part by supporting Latin American efforts to embrace the transparency process.

[1] Bromley, Mark, “An Arms Race in Our Hemisphere? Discussing the Trends and Implications of Military Expenditures in South America,” Brookings Institution Conference, Washington, D.C., June 3rd, 2010, Transcript Pgs. 12, 13, and 16
[2] Robledo, Marcos, “The Rearmament Debate: A Chilean Perspective,” Power Point presentation, slide 18, 2010 Western Hemisphere Security Colloquium, Miami, Florida, May 25th-26th, 2010
[3] Yopo, Boris, “¿Carrera Armamentista en la Región?” La Tercera, November 2nd, 2009, http://www.latercera.com/contenido/895_197084_9.shtml, accessed October 8th, 2010
[4] Walser, Ray, “An Arms Race in Our Hemisphere? Discussing the Trends and Implications of Military Expenditures in South America,” Brookings Institution Conference, Washington, D.C., June 3rd, 2010, Transcript Pgs. 49, 50, 53 and 54
[5] Ibid., Guevara, Iñigo, Pg. 22
[6] Ibid., Bromley, Mark, Pgs. 18 and 19
Abstract:
The support vector machine classifier is used for many problems in various areas of knowledge. The basic idea of this classifier is to find the hyperplane that maximizes the distance between the groups, in order to improve the generalization of the classifier. In this work, we treated some binary classification problems on data obtained by electroencephalography (EEG) and electromyography (EMG) using the Support Vector Machine with some complementary techniques: Principal Component Analysis to identify the active regions of the brain, the periodogram method, obtained by Fourier analysis, to help discriminate between groups, and the Simple Moving Average to eliminate some of the noise in the data. Two functions were developed in the R software to carry out the training and classification tasks. In addition, two weighting systems and a summary measure were proposed to help decide the classification of groups. Applying these techniques, the weights and the summary measure in the classifier gave quite satisfactory results: the best results were an average rate of 95.31% for the visual stimuli data, 100% correct classification for the epilepsy data, and rates of 91.22% and 96.89% for the object motion data for two subjects.
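The two techniques named in this abstract have compact standard definitions, given below for orientation. These are the textbook forms (hard-margin SVM and a trailing simple moving average of window length k); whether the authors used a soft margin, a kernel, or a centred window is not stated, so the exact variants are assumptions.

```latex
% Maximum-margin hyperplane for labels y_i \in \{-1, +1\}:
\min_{w,\,b}\ \tfrac{1}{2}\lVert w\rVert^{2}
\quad\text{subject to}\quad
y_i\,(w^{\top}x_i + b) \ge 1,\qquad i = 1,\dots,n

% Simple moving average with window length k:
\bar{x}_t = \frac{1}{k}\sum_{j=0}^{k-1} x_{t-j}
```

Minimizing \lVert w\rVert is equivalent to maximizing the margin 2/\lVert w\rVert between the two groups, which is the distance the abstract refers to.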
Abstract:
Paleoenvironmental proxy data for ocean properties, eolian sediment input, and continental rainfall based on high-resolution analyses of sediment cores from the southwestern Black Sea and the northernmost Gulf of Aqaba were used to infer hydroclimatic changes in northern Anatolia and the northern Red Sea region during the last ~7500 years. Pronounced and coherent multicentennial variations in these records reveal patterns that strongly resemble modern temperature and rainfall anomalies related to the Arctic Oscillation/North Atlantic Oscillation (AO/NAO). These patterns suggest a prominent role of AO/NAO-like atmospheric variability during the Holocene beyond interannual to interdecadal timescales, most likely originating from solar output changes.
Abstract:
This paper presents the determination of a mean solar radiation year and of a typical meteorological year for the region of Funchal on Madeira Island, Portugal. The data set includes hourly mean and extreme values for air temperature, relative humidity and wind speed, and hourly mean values for global and diffuse solar radiation, for the period 2004-2014, with a maximum data coverage of 99.7%. The mean solar radiation year was determined by first averaging all values for each hour/day pair and then applying a five-day centred moving average to the hourly values. The typical meteorological year was determined using Finkelstein-Schafer statistics, which make it possible to obtain a complete year of real measurements through the selection and combination of typical months, preserving the long-term averages while still allowing the analysis of short-term events. The typical meteorological year was validated by comparing its monthly averages with the long-term monthly averages. The values obtained were very close, so the typical meteorological year can accurately represent the long-term data series. The typical meteorological year can be used in the simulation of renewable energy systems, namely solar energy systems, and for predicting the energy performance of buildings.
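The two-step construction of the mean solar radiation year can be sketched as follows. This is one possible reading of the abstract, not the authors' procedure: the moving average is taken over the same hour on neighbouring days, the window wraps around the end of the year, and the record layout and the function names `mean_year` and `smooth_over_days` are hypothetical.

```python
from collections import defaultdict

def mean_year(records):
    """Step 1: average over all years for each (day, hour) pair.
    records is an iterable of (year, day_of_year, hour, value)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for _year, day, hour, value in records:
        sums[(day, hour)] += value
        counts[(day, hour)] += 1
    return {slot: sums[slot] / counts[slot] for slot in sums}

def smooth_over_days(mean_by_slot, n_days, window=5):
    """Step 2: centred moving average across days for each hour.
    Days are numbered 1..n_days and wrap around the year end
    (an assumption); missing slots are skipped."""
    half = window // 2
    smoothed = {}
    for day, hour in mean_by_slot:
        neighbours = []
        for offset in range(-half, half + 1):
            d = (day - 1 + offset) % n_days + 1
            if (d, hour) in mean_by_slot:
                neighbours.append(mean_by_slot[(d, hour)])
        smoothed[(day, hour)] = sum(neighbours) / len(neighbours)
    return smoothed

# Made-up noon values for a toy "year" of 5 days observed in 2 years.
records = [(year, day, 12, day * 10 + bias)
           for year, bias in ((2004, 0), (2005, 2))
           for day in range(1, 6)]
means = mean_year(records)
smooth = smooth_over_days(means, n_days=5, window=5)
print(means[(3, 12)], smooth[(1, 12)])
```

Averaging across years removes year-to-year differences, while the centred window removes day-to-day noise that survives because only about a decade of data contributes to each slot.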
Abstract:
Before basic precipitation data are used in any agroclimatic study to assess productivity, it is important to check the data series for homogeneity. For this purpose, data from 105 locations over northeast Brazil for the period 1912-1981 were used. The preliminary study indicates nonhomogeneity in the time series during the 1940s at a few locations. The amplitude of variation of the time series, taken as a 10-year moving average, differs considerably between regions. It appears that this amplitude is related to some extent to the time of onset of effective rains. The fluctuations also show great regional diversity. Some of the data in the low latitudes indicate the presence of four cycles, namely 52, 26, 13 and 6.5 years. The 52-year cycle is also evident in the onset of the southwest monsoon over a low-latitude zone (Kerala coast) in India. In South Africa the prominent cycles are 60, 30, 15 and 10 years; a similar situation appears to be present in the higher latitudes of northeast Brazil.
Abstract:
Organizations and their environments are complex systems. Such systems are difficult to understand and predict. Nevertheless, prediction is a fundamental task for business management and for decision-making, which always involves risk. Classical prediction methods (among them linear regression, the Autoregressive Moving Average and exponential smoothing) impose assumptions such as linearity and stability in order to be mathematically and computationally tractable. The limitations of such methods, however, have been demonstrated by various means. In recent decades, new prediction methods have emerged that aim to embrace the complexity of organizational systems and their environments rather than avoid it. Among them, the most promising are bio-inspired prediction methods (e.g. neural networks, genetic/evolutionary algorithms and artificial immune systems). This article aims to give an overview of the current and potential applications of bio-inspired prediction methods in management.
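To make the linearity assumption concrete, the standard forms of two of the classical methods named above are shown below. These are textbook definitions added for illustration, not formulas from the article; the symbols (c, \phi_i, \theta_j, \varepsilon_t, and the smoothing constant \alpha) follow common usage.

```latex
% ARMA(p, q): the current value is a linear combination of past
% values and of past and present random shocks.
X_t = c + \sum_{i=1}^{p}\phi_i X_{t-i} + \varepsilon_t + \sum_{j=1}^{q}\theta_j \varepsilon_{t-j}

% Simple exponential smoothing with 0 < \alpha \le 1:
s_t = \alpha\,x_t + (1-\alpha)\,s_{t-1}
```

In both cases the forecast is a fixed linear function of past observations, which is what makes the methods tractable and also what limits them on strongly nonlinear systems.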