10 results for Data series
in Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland
Abstract:
To enable a mathematically and physically sound execution of the fatigue test and a correct interpretation of its results, statistical evaluation methods are used to assist in the analysis of fatigue testing data. The main objective of this work is to develop step-by-step instructions for the statistical analysis of laboratory fatigue data. The scope of the project is to provide practical cases that answer the questions raised in the treatment of test data, applying the methods and formulae of document IIW-XIII-2138-06 (Best Practice Guide on the Statistical Analysis of Fatigue Data). Generally, the questions in the data sheets involve several aspects: estimation of the necessary sample size, verification of the statistical equivalence of collated sets of data, and determination of characteristic curves in different cases. The series of comprehensive examples given in this thesis demonstrates the various statistical methods and supports the development of a sound procedure for creating reliable calculation rules for fatigue analysis.
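As a minimal illustration of the characteristic-curve step, the Python sketch below fits a mean S-N line in log-log coordinates by least squares and shifts it downward by k standard deviations of the residuals. The function name and the fixed default k = 2.0 are placeholders; the IIW guide tabulates one-sided tolerance factors that depend on sample size and survival probability.

```python
import numpy as np

def characteristic_sn_curve(stress_ranges, cycles, k=2.0):
    """Fit log N = b0 + m * log S by least squares and shift the mean
    curve down by k residual standard deviations to obtain a
    characteristic (lower-bound) curve, in the spirit of IIW-style
    analyses. k = 2.0 is a placeholder, not a tabulated factor."""
    x = np.log10(stress_ranges)
    y = np.log10(cycles)
    # Least-squares fit of the mean S-N line in log-log space
    m, b0 = np.polyfit(x, y, 1)
    residuals = y - (m * x + b0)
    s = residuals.std(ddof=2)      # scatter about the fitted line
    return m, b0, b0 - k * s       # slope, mean and characteristic intercepts
```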
Abstract:
A recently developed calculation method to determine stoichiometric dissociation constants of weak acids from potentiometric titration data is described. Titration data from three different weak acids in aqueous salt solutions at 25 °C were used as examples of the use of the method. The salt alone determined the ionic strength of the solutions considered in this study, and salt molalities up to 0.5 mol kg⁻¹ were used.
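As a rough sketch of the arithmetic behind a stoichiometric dissociation constant (not the specific calculation method the abstract describes), the snippet below assumes a weak acid HA titrated with strong base, neglects hydroxide and dilution effects, and takes the free hydrogen-ion molality as already known from the measured cell potential; the function name and arguments are hypothetical.

```python
import numpy as np

def stoichiometric_ka(ca, cb_added, h):
    """Estimate a stoichiometric dissociation constant K_a for a weak
    acid HA from titration points.

    ca       -- total (analytical) acid molality at each point
    cb_added -- molality of strong base added at each point
    h        -- free hydrogen-ion molality, e.g. from the measured
                cell potential via the Nernst equation

    Mass and charge balance (dilute solution, [OH-] neglected) give
    [A-] = cb_added + h and [HA] = ca - [A-], so
    K_a = h * [A-] / [HA] at each titration point."""
    a = np.asarray(cb_added) + np.asarray(h)
    ha = np.asarray(ca) - a
    return np.asarray(h) * a / ha   # one K_a estimate per titration point
```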
Abstract:
Raw measurement data does not always immediately convey useful information, but applying statistical analysis tools to the data can improve the situation. Data analysis can offer benefits such as acquiring meaningful insight from the dataset, basing critical decisions on the findings, and ruling out human bias through proper statistical treatment. In this thesis we analyze data from an industrial mineral processing plant with the aim of studying the possibility of forecasting the quality of the final product, given by one variable, with a model based on the other variables. For the study, mathematical tools such as Qlucore Omics Explorer (QOE) and Sparse Bayesian (SB) regression are used. Linear regression is then used to build a model based on the subset of variables that carry the most significant weights in the SB model. The results obtained from QOE show that the variable representing the desired final product does not correlate with the other variables. For SB and linear regression, the results show that both models built on 1-day averaged data seriously underestimate the variance of the true data, whereas the two models built on 1-month averaged data are reliable and able to explain a larger proportion of the variability in the available data, making them suitable for prediction purposes. However, it is concluded that no single model fits the whole available dataset well; for future work it is therefore proposed to build piecewise nonlinear regression models if the same dataset is used, or for the plant to provide another dataset collected in a more systematic fashion than the present data.
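Qlucore Omics Explorer is a commercial GUI tool and cannot be reproduced here, but the Sparse Bayesian step followed by a linear refit can be sketched with scikit-learn's ARDRegression as a stand-in. The data below are synthetic placeholders, and the selection threshold is an arbitrary illustrative choice.

```python
import numpy as np
from sklearn.linear_model import ARDRegression, LinearRegression

# Placeholder process variables X and final-product quality y
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = X[:, 2] - 0.5 * X[:, 7] + rng.normal(scale=0.1, size=200)

# Sparse Bayesian regression: the ARD prior drives irrelevant weights to ~0
sb = ARDRegression().fit(X, y)
keep = np.abs(sb.coef_) > 0.1 * np.abs(sb.coef_).max()  # illustrative cutoff

# Refit an ordinary linear model on the selected subset of variables
lr = LinearRegression().fit(X[:, keep], y)
print("selected variables:", np.flatnonzero(keep))
```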
Abstract:
Identification of the order of an Autoregressive Moving Average (ARMA) model by the usual graphical method is subjective. Hence, there is a need for a technique that identifies the order without employing graphical investigation of the series autocorrelations. To avoid subjectivity, this thesis focuses on determining the order of the ARMA model using Reversible Jump Markov Chain Monte Carlo (RJMCMC). RJMCMC selects the model from a set of candidate models suggested by better fit, standard errors, and the frequency of accepted proposals. Together with a deep analysis of the classical Box-Jenkins modeling methodology, the integration with MCMC algorithms is pursued through parameter estimation and model fitting of ARMA models. This helps to verify how well the MCMC algorithms can treat ARMA models, by comparing the results with the graphical method. The MCMC approach produced better results than the classical time series approach.
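A full RJMCMC sampler that jumps between (p, q) dimensions is too long to sketch here; as an illustrative stand-in that poses the same order-selection question without graphical inspection of autocorrelations, the snippet below scores candidate ARMA orders by AIC with statsmodels. This is not the thesis' RJMCMC procedure.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.arima_process import arma_generate_sample

# Simulate an ARMA(2,1) series to play the role of the observed data
ar, ma = [1, -0.6, 0.3], [1, 0.4]   # lag-polynomial coefficients
y = arma_generate_sample(ar, ma, nsample=500)

# Score candidate (p, q) orders by AIC instead of eyeballing ACF/PACF
best = min(
    ((p, q) for p in range(4) for q in range(4)),
    key=lambda pq: ARIMA(y, order=(pq[0], 0, pq[1])).fit().aic,
)
print("selected (p, q):", best)
```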
Abstract:
In the power market, electricity prices play an important role at the economic level. The behavior of a price series may change over time in terms of its mean value or its volatility, which is usually known as a structural break, or it may change for a period of time before reverting to its original behavior or switching to yet another style of behavior; the latter is typically termed a regime shift or regime switch. Our task in this thesis is to develop an electricity price time series model that captures fat-tailed distributions able to explain this behavior, and to analyze it for better understanding. For the NordPool data used, the obtained Markov Regime-Switching model operates on two regimes: regular and non-regular. Three criteria have been considered: a price difference criterion, a capacity/flow difference criterion, and a spikes-in-Finland criterion. The suitability of GARCH modeling for simulating multi-regime behavior is also studied.
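A two-regime Markov switching model of the kind described can be sketched with statsmodels; the price series below is a synthetic placeholder, and the thesis' three NordPool-specific criteria (price difference, capacity/flow difference, spikes in Finland) are not modeled here.

```python
import numpy as np
import statsmodels.api as sm

# Synthetic stand-in for a spot price series with a volatile episode
rng = np.random.default_rng(1)
prices = np.concatenate([
    rng.normal(30, 2, 300),    # calm "regular" regime
    rng.normal(60, 15, 100),   # volatile "non-regular" regime
    rng.normal(30, 2, 300),
])

# Two-regime Markov switching model with regime-specific mean and variance
mod = sm.tsa.MarkovRegression(prices, k_regimes=2, switching_variance=True)
res = mod.fit()
print(res.summary())
print(res.smoothed_marginal_probabilities[:, 1])  # P(non-regular regime)
```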
Abstract:
Due to its non-storability, electricity must be produced at the same time that it is consumed; as a result, prices are determined on an hourly basis, and analysis becomes more challenging. Moreover, the seasonal fluctuations in demand and supply lead to a seasonal behavior of electricity spot prices. The purpose of this thesis is to seek out and remove all causal effects from electricity spot prices, leaving pure prices for modeling purposes. To achieve this we use Qlucore Omics Explorer (QOE) for the visualization and exploration of the data set, and the time series decomposition method to estimate and extract the deterministic components from the series. To obtain the target series we use regression based on the background variables (water reservoir and temperature). The result is three price series (Swedish, Norwegian, and System prices) with no apparent pattern.
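The deseasonalize-then-regress pipeline can be sketched as follows, using statsmodels for both steps; the daily frequency, the annual period, and the two background series are synthetic placeholders for the water reservoir and temperature data the thesis uses.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.tsa.seasonal import seasonal_decompose

# Synthetic stand-ins for daily spot prices and background variables
idx = pd.date_range("2020-01-01", periods=730, freq="D")
rng = np.random.default_rng(2)
t = np.arange(730)
temp = 10 * np.sin(2 * np.pi * t / 365) + rng.normal(0, 1, 730)
reservoir = 50 + 20 * np.cos(2 * np.pi * t / 365)
price = 40 - 0.8 * temp - 0.2 * reservoir + rng.normal(0, 2, 730)

# 1) Strip the deterministic seasonal component by decomposition
decomp = seasonal_decompose(pd.Series(price, index=idx), period=365)
deseason = price - decomp.seasonal.to_numpy()

# 2) Regress out the background variables; residuals are the "pure" prices
X = sm.add_constant(np.column_stack([temp, reservoir]))
pure = sm.OLS(deseason, X).fit().resid
```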
Abstract:
Chaotic behaviour is one of the hardest problems that can arise in nonlinear dynamical systems with severe nonlinearities. It makes the system's responses unpredictable and causes them to behave like noise, and in some applications it should be avoided. One approach to detecting chaotic behaviour is finding the Lyapunov exponent by examining the dynamical equation of the system, which requires a model of the system. The goal of this study is the diagnosis of chaotic behaviour by exploring only the data (signal), without using any dynamical model of the system. In this work two methods are tested on time series data collected from the sensors of an AMB (Active Magnetic Bearing) system. The first method finds the largest Lyapunov exponent by the Rosenstein method. The second method is the 0-1 test for identifying chaotic behaviour. These two methods are used to detect whether the data is chaotic. The Rosenstein method requires the minimum embedding dimension, which is found with the Cao method. The Cao method does not give just the minimum embedding dimension: it also gives the order of the nonlinear dynamical equation of the system and shows how far the system's signals are corrupted with noise. At the end of this research, a runs test is introduced to show that the data is not excessively noisy.
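Of the two detection methods, the 0-1 test is compact enough to sketch. The version below follows the Gottwald-Melbourne construction for a single frequency c; in practice K is computed for many random c in (0, pi) and the median is taken.

```python
import numpy as np

def zero_one_test(phi, c=1.7, ncut_frac=0.1):
    """Gottwald-Melbourne 0-1 test for chaos (single frequency c).

    Returns K close to 0 for regular dynamics and close to 1 for
    chaotic dynamics."""
    phi = np.asarray(phi, float)
    n = len(phi)
    j = np.arange(1, n + 1)
    p = np.cumsum(phi * np.cos(j * c))   # translation variables
    q = np.cumsum(phi * np.sin(j * c))
    ncut = int(ncut_frac * n)            # M(n) is only valid for n << N
    M = np.array([np.mean((p[k:] - p[:-k]) ** 2 +
                          (q[k:] - q[:-k]) ** 2) for k in range(1, ncut)])
    # K: correlation of M(n) with n; ~1 means linear growth, i.e. chaos
    return np.corrcoef(np.arange(1, ncut), M)[0, 1]
```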
Abstract:
Financial analysts play a major role in financial markets, especially by conveying information through earnings forecasts. Typically, analysts disagree to some degree in their earnings forecasts, and it is precisely this disagreement among analysts that this thesis studies. When a firm reports losses, disagreement about the firm's future tends to increase. Intuitively, this is easy to interpret as increased uncertainty. This is also what one finds when studying analyst reports: analysts appear to become more uncertain when firms start making losses, and that is exactly when the disagreement among analysts also increases. The mathematical-theoretical models describing analysts' decision processes, however, have the opposite implication: increased disagreement among analysts can only arise if the analysts become more certain at the individual level, with asymmetric information as the driving force. This thesis resolves the contradiction between increased certainty and increased uncertainty as the driver of dispersion in analyst forecasts. When the amount of public information that becomes available through earnings reports is taken into account, the models of analysts' decision processes cannot produce the levels of forecast dispersion observed in the data. The conclusion is therefore that the underlying theoretical models of forecast dispersion are partly deficient, and that dispersion in forecasts more likely follows from increased uncertainty among analysts, consistent with what analysts actually state in their reports. The results are important because an understanding of uncertainty around, for example, earnings reporting contributes to a general understanding of the earnings reporting environment, which in turn is of great importance for price formation in financial markets. Furthermore, increased forecast dispersion is typically used as an indication of increased information asymmetry in accounting research, a phenomenon that this thesis thereby calls into question.
Abstract:
Identification of low-dimensional structures and of the main sources of variation in multivariate data are fundamental tasks in data analysis. Many methods aimed at these tasks involve the solution of an optimization problem, so the objective of this thesis is to develop computationally efficient and theoretically justified methods for solving such problems. Most of the thesis is based on a statistical model in which ridges of the density estimated from the data are considered the relevant features. Finding ridges, which are generalized maxima, necessitates the development of advanced optimization methods. An efficient and convergent trust region Newton method for projecting a point onto a ridge of the underlying density is developed for this purpose. The method is utilized in a differential equation-based approach for tracing ridges and computing projection coordinates along them. The density estimation is done nonparametrically using Gaussian kernels, which allows the application of ridge-based methods with only mild assumptions on the underlying structure of the data. The statistical model and the ridge finding methods are adapted to two different applications. The first is the extraction of curvilinear structures from noisy data mixed with background clutter. The second is a novel nonlinear generalization of principal component analysis (PCA) and its extension to time series data. The methods have a wide range of potential applications where most of the earlier approaches are inadequate; examples include the identification of faults from seismic data and of filaments from cosmological data. The applicability of the nonlinear PCA to climate analysis and to the reconstruction of periodic patterns from noisy time series data is also demonstrated. Other contributions of the thesis include the development of an efficient semidefinite optimization method for embedding graphs into Euclidean space. The method produces structure-preserving embeddings that maximize interpoint distances. It is primarily developed for dimensionality reduction, but it also has potential applications in graph theory and various areas of physics, chemistry and engineering. The asymptotic behaviour of ridges and maxima of Gaussian kernel densities is also investigated as the kernel bandwidth approaches infinity. The results are applied to the nonlinear PCA and to finding significant maxima of such densities, which is a typical problem in visual object tracking.
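The ridge-projection idea can be sketched with a subspace-constrained mean-shift step on a Gaussian kernel density estimate, a simpler relative of the trust region Newton projection developed in the thesis (not the thesis' method itself); the data, the bandwidth h, and the one-dimensional-ridge assumption are all illustrative.

```python
import numpy as np

def scms_step(x, data, h):
    """One subspace-constrained mean-shift step that moves x toward a
    1-D ridge of a Gaussian kernel density estimate built on `data`."""
    d = data - x                                   # (n, dim) offsets
    w = np.exp(-0.5 * np.sum(d**2, axis=1) / h**2) # Gaussian kernel weights
    # Hessian of the density at x, up to a common positive scale
    H = (w[:, None, None] * (d[:, :, None] * d[:, None, :])).sum(0) / h**4 \
        - np.eye(x.size) * w.sum() / h**2
    vals, vecs = np.linalg.eigh(H)                 # eigenvalues ascending
    V = vecs[:, :-1]                               # span of smallest eigenvectors
    shift = (w[:, None] * data).sum(0) / w.sum() - x  # plain mean-shift vector
    return x + V @ (V.T @ shift)                   # move only within that span

# iterating scms_step until the shift is small projects x onto the ridge
```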