21 results for quasi-least squares
in Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland
Abstract:
The aim of this thesis is to examine the weak-form efficiency of the stock markets of Russia, Slovakia, the Czech Republic, Romania, Bulgaria, Hungary and Poland. The thesis is a quantitative study, and the daily index closing values were collected from the Datastream database. The data run from each exchange's first trading day to the end of August 2006. To strengthen the analysis, the data were examined over the full sample and over two sub-periods. Stock market efficiency was tested with four statistical methods, including the autocorrelation test and the non-parametric runs test. A further aim is to determine whether a day-of-the-week anomaly occurs in these markets. The presence of the day-of-the-week anomaly is examined using ordinary least squares (OLS). A day-of-the-week anomaly is found in all of the above stock markets except the Czech market. Significant positive or negative autocorrelation is found in all of the stock markets, and the Ljung-Box test also indicates inefficiency in all markets over the full period. Based on the runs test, the random walk is rejected for all stock markets except Slovakia, at least when considering the full sample or the first sub-period. Nor are the data normally distributed for any index or time period. These findings indicate that the markets in question are not weak-form efficient.
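The non-parametric runs test named in the abstract above can be sketched as follows. This is a minimal illustration of the standard Wald-Wolfowitz statistic on the signs of a return series, not the thesis's own code; the function name `runs_test` and the choice to drop zero returns are assumptions made for this example.

```python
import math

def runs_test(returns):
    """Wald-Wolfowitz runs test on the signs of a return series.

    Returns (number_of_runs, z_statistic). Zero returns are dropped,
    which is one common convention and an assumption of this sketch.
    """
    signs = [r > 0 for r in returns if r != 0]
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need both positive and negative returns")
    # A new run starts wherever the sign changes.
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    n = n_pos + n_neg
    # Mean and variance of the run count under the randomness hypothesis.
    expected = 2.0 * n_pos * n_neg / n + 1.0
    variance = (2.0 * n_pos * n_neg * (2.0 * n_pos * n_neg - n)) / (n ** 2 * (n - 1.0))
    z = (runs - expected) / math.sqrt(variance)
    return runs, z
```

A strongly negative z (too few runs) points to positive serial dependence, a strongly positive z (too many runs) to negative dependence; either is evidence against a random walk.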
Abstract:
The purpose of this study is to examine whether calendar anomalies occur in the Russian stock market. The study focuses on the Halloween, month-of-the-year, turn-of-the-month, day-of-the-week and holiday anomalies. The RTS (Russian Trading System) index is used as the research data. The sample period begins on 1 September 1995 and ends on 31 December 2005. The total number of observations is 2584. The research method is ordinary least squares (OLS). The results show that Halloween, turn-of-the-month and day-of-the-week anomalies occur in the Russian stock market. By contrast, according to the results, month-of-the-year and holiday anomalies do not occur in the Russian stock market. The results further show that most of the anomalies are more significant today than in the early years of the Russian stock market. Based on these results, it can be concluded that the Russian stock market is not yet efficient.
Abstract:
The aim of this thesis is to examine the efficiency of the Chinese stock markets and the validity of the random walk hypothesis. A further aim is to determine whether a day-of-the-week anomaly occurs in the Chinese stock markets. The data consist of daily logarithmic returns of the Shanghai Stock Exchange A-share, B-share and composite indices and the Shenzhen composite index for the period 21.2.1992-30.12.2005, and of daily logarithmic returns of the Shenzhen Stock Exchange A-share and B-share indices for the period 5.10.1992-30.12.2005. Four statistical methods are used, namely the autocorrelation test, the non-parametric runs test, the variance ratio test and the Augmented Dickey-Fuller unit root test. The presence of the day-of-the-week anomaly is examined using ordinary least squares (OLS). The tests are carried out on the full sample and on three separate sub-periods. The empirical results of this thesis support earlier studies on the inefficiency of the Chinese stock markets. Apart from the results of the unit root tests, the random walk hypothesis was rejected for both Chinese stock markets on the basis of the autocorrelation, runs and variance ratio tests. The results show that, on both exchanges, the behaviour of the B-share indices has been considerably more contrary to the random walk hypothesis than that of the A-share indices. With the exception of the B-share markets, the efficiency of both Chinese stock markets also appeared to improve after the market boom of 2001. The results also show that a day-of-the-week anomaly is present on the Shanghai Stock Exchange, but not on the Shenzhen Stock Exchange, over the full sample period.
Abstract:
For many years, numerous studies of stock markets have reported temporal regularities in stock prices that cannot be explained by market-specific fundamentals. These so-called calendar anomalies typically occur at temporal turning points, such as the turn of the year, month or week. Various interruptions in trading routines, such as public holidays, have also been found to cause anomalies. The aim of the study was to investigate whether the calendar anomalies observed in stock markets occur in the Nordic electricity market. The anomalies studied were the day-of-the-week, month-of-the-year, turn-of-the-month and holiday anomalies. In addition, the behaviour of returns near option expiration dates was examined. Instead of individual products, the analyses were carried out on annual products constructed from seasonal and quarterly products. Testing used the ordinary least squares method, taking into account the effects of heteroskedasticity, autocorrelation and multicollinearity. In addition to calendar variables alone, the tests were carried out with regression models in which the spot price, the emission allowance price and/or rainfall forecasts were used as additional explanatory variables. The sample period covered the years 1998-2006.
Abstract:
Direct torque control (DTC) has become an accepted vector control method alongside current vector control. DTC was first applied to asynchronous machines and has later also been applied to synchronous machines. This thesis analyses the application of DTC to permanent magnet synchronous machines (PMSM). In order to take full advantage of DTC, the PMSM has to be properly dimensioned. Therefore the effect of the motor parameters is analysed taking the control principle into account. Based on the analysis, a parameter selection procedure is presented. The analysis and the selection procedure utilize nonlinear optimization methods. The key element of a direct torque controlled drive is the estimation of the stator flux linkage. Different estimation methods - a combination of current and voltage models and improved integration methods - are analysed. The effect of an incorrectly measured rotor angle in the current model is analysed, and an error detection and compensation method is presented. The dynamic performance of an earlier presented sensorless flux estimation method is enhanced by improving the dynamic performance of the low-pass filter used and by adapting the correction of the flux linkage to torque changes. A method for estimating the initial angle of the rotor is presented. The method is based on measuring the inductance of the machine in several directions and fitting the measurements to a model. The model is nonlinear with respect to the rotor angle, and therefore a nonlinear least squares optimization method is needed in the procedure. A commonly used current vector control scheme is minimum current control. In DTC the stator flux linkage reference is usually kept constant. Achieving the minimum current requires control of the reference. An on-line method that minimizes the current by controlling the stator flux linkage reference is presented.
Also, the control of the reference above the base speed is considered. A new estimation flux linkage is introduced for the estimation of the parameters of the machine model. In order to utilize the flux linkage estimates in off-line parameter estimation, the integration methods are improved. An adaptive correction is used in the same way as in the estimation of the controller stator flux linkage. The presented parameter estimation methods are then used in a self-commissioning scheme. The proposed methods are tested with a laboratory drive, which consists of commercial inverter hardware with modified software and several prototype PMSMs.
Abstract:
Recent advances in machine learning methods increasingly enable the automatic construction of various types of computer-assisted methods that have been difficult or laborious for human experts to program. The tasks for which tools of this kind are needed arise in many areas, here especially in the fields of bioinformatics and natural language processing. Machine learning methods may not work satisfactorily if they are not appropriately tailored to the task in question. However, their learning performance can often be improved by taking advantage of deeper insight into the application domain or the learning problem at hand. This thesis considers the development of kernel-based learning algorithms that incorporate this kind of prior knowledge of the task in an advantageous way. Moreover, computationally efficient algorithms for training the learning machines for specific tasks are presented. In the context of kernel-based learning methods, prior knowledge is often incorporated by designing appropriate kernel functions. Another well-known way is to develop cost functions that fit the task under consideration. For disambiguation tasks in natural language, we develop kernel functions that take account of positional information and the mutual similarities of words. It is shown that the use of this information significantly improves the disambiguation performance of the learning machine. Further, we design a new cost function that is better suited to the task of information retrieval and to more general ranking problems than the cost functions designed for regression and classification. We also consider other applications of the kernel-based learning algorithms, such as text categorization and pattern recognition in differential display. We develop computationally efficient algorithms for training the considered learning machines with the proposed kernel functions.
We also design a fast cross-validation algorithm for learning algorithms of the regularized least-squares type. Further, we propose an efficient version of the regularized least-squares algorithm that can be used together with the new cost function for preference learning and ranking tasks. In summary, we demonstrate that the incorporation of prior knowledge is possible and beneficial, and that novel advanced kernels and cost functions can be used efficiently in algorithms.
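To illustrate what a fast cross-validation shortcut for regularized least squares can look like, here is a hedged sketch of the well-known closed-form leave-one-out identity for kernel ridge regression. It is not claimed to be the algorithm developed in the thesis; the function names and the use of a precomputed kernel matrix `K` are assumptions made for this example. With G = (K + λI)⁻¹ and a = Gy, the leave-one-out residual of example i is a_i / G_ii.

```python
import numpy as np

def rls_loo_residuals(K, y, lam):
    """Leave-one-out residuals of kernel regularized least squares.

    Uses the closed form r_i = a_i / G_ii with G = (K + lam*I)^-1 and
    a = G y, so the model is never retrained.
    """
    n = K.shape[0]
    G = np.linalg.inv(K + lam * np.eye(n))
    a = G @ y
    return a / np.diag(G)

def rls_loo_residuals_naive(K, y, lam):
    """Reference implementation that retrains n times (for checking)."""
    n = K.shape[0]
    out = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        # Train on all examples except i, then predict example i.
        a = np.linalg.solve(K[np.ix_(keep, keep)] + lam * np.eye(n - 1), y[keep])
        out[i] = y[i] - K[i, keep] @ a
    return out
```

The shortcut needs one matrix inversion instead of n separate training runs, which is the kind of saving that makes repeated regularization parameter selection practical.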
Abstract:
The learning of preference relations has recently received significant attention in the machine learning community. It is closely related to classification and regression analysis and can be reduced to these tasks. However, preference learning involves predicting an ordering of the data points rather than a single numerical value, as in regression, or a class label, as in classification. Therefore, studying preference relations within a separate framework not only facilitates a better theoretical understanding of the problem but also motivates the development of efficient algorithms for the task. Preference learning has many applications in domains such as information retrieval, bioinformatics and natural language processing. For example, algorithms that learn to rank are frequently used in search engines for ordering documents retrieved by a query. Preference learning methods have also been applied to collaborative filtering problems for predicting individual customer choices from vast amounts of user-generated feedback. In this thesis we propose several algorithms for learning preference relations. These algorithms stem from the well-founded and robust class of regularized least-squares methods and have many attractive computational properties. In order to improve the performance of our methods, we introduce several non-linear kernel functions. Thus, the contribution of this thesis is twofold: kernel functions for structured data, which are used to take advantage of various non-vectorial data representations, and preference learning algorithms suitable for different tasks, namely efficient learning of preference relations, learning with large amounts of training data, and semi-supervised preference learning. The proposed kernel-based algorithms and kernels are applied to the parse ranking task in natural language processing, document ranking in information retrieval, and remote homology detection in the bioinformatics domain.
Training kernel-based ranking algorithms can be infeasible when the training set is large. This problem is addressed by proposing a preference learning algorithm whose computational complexity scales linearly with the number of training data points. We also introduce a sparse approximation of the algorithm that can be efficiently trained with large amounts of data. For situations where a small amount of labeled data but a large amount of unlabeled data is available, we propose a co-regularized preference learning algorithm. To conclude, the methods presented in this thesis address not only the problem of efficient training of the algorithms but also fast regularization parameter selection, multiple output prediction, and cross-validation. Furthermore, the proposed algorithms lead to notably better performance in many of the preference learning tasks considered.
Abstract:
Recent years have produced great advances in instrumentation technology. The amount of available data has been increasing due to the simplicity, speed and accuracy of current spectroscopic instruments. Most of these data are, however, meaningless without proper analysis. This has been one of the reasons for the ever-growing success of multivariate handling of such data. Industrial data are commonly not designed data; in other words, there is no exact experimental design, but rather the data have been collected as a routine procedure during an industrial process. This places certain demands on the multivariate modeling, as the selection of samples and variables can have an enormous effect. Common approaches in the modeling of industrial data are PCA (principal component analysis) and PLS (projection to latent structures or partial least squares), but there are also other methods that should be considered. The more advanced methods include multi-block modeling and nonlinear modeling. In this thesis it is shown that the results of data analysis vary according to the modeling approach used, thus making the selection of the modeling approach dependent on the purpose of the model. If the model is intended to provide accurate predictions, the approach should be different than in the case where the purpose of modeling is mostly to obtain information about the variables and the process. For industrial applicability it is essential that the methods are robust and sufficiently simple to apply. In this way the methods and the results can be compared and an approach selected that is suitable for the intended purpose. Differences in data analysis methods are compared with data from different fields of industry in this thesis. In the first two papers, the multi-block method is considered for data originating from the oil and fertilizer industries. The results are compared to those from PLS and priority PLS.
The third paper considers applicability of multivariate models to process control for a reactive crystallization process. In the fourth paper, nonlinear modeling is examined with a data set from the oil industry. The response has a nonlinear relation to the descriptor matrix, and the results are compared between linear modeling, polynomial PLS and nonlinear modeling using nonlinear score vectors.
Abstract:
The purpose of the thesis is to analyze whether the returns of general stock market indices of Estonia, Latvia and Lithuania follow the random walk hypothesis (RWH), and in addition, whether they are consistent with the weak-form efficiency criterion. Also the existence of the day-of-the-week anomaly is examined in the same regional markets. The data consists of daily closing quotes of the OMX Tallinn, Riga and Vilnius total return indices for the sample period from January 3, 2000 to August 28, 2009. Moreover, the full sample period is also divided into two sub-periods. The RWH is tested by applying three quantitative methods (i.e. the Augmented Dickey-Fuller unit root test, serial correlation test and non-parametric runs test). Ordinary Least Squares (OLS) regression with dummy variables is employed to detect the day-of-the-week anomalies. The random walk hypothesis (RWH) is rejected in the Estonian and Lithuanian stock markets. The Latvian stock market exhibits more efficient behaviour, although some evidence of inefficiency is also found, mostly during the first sub-period from 2000 to 2004. Day-of-the-week anomalies are detected on every stock market examined, though no longer during the later sub-period.
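The OLS regression with dummy variables mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the thesis's specification: it assumes a no-intercept model with one dummy per trading day (Monday to Friday coded 0 to 4) and classical homoskedastic standard errors, and the function name `day_of_week_ols` is hypothetical.

```python
import numpy as np

def day_of_week_ols(returns, weekdays):
    """Regress returns on five weekday dummies (no intercept).

    weekdays holds integers 0..4 (Monday..Friday). With a full set of
    dummies and no intercept, each coefficient equals the mean return
    of that weekday. Returns (coefficients, t_statistics).
    """
    returns = np.asarray(returns, dtype=float)
    weekdays = np.asarray(weekdays, dtype=int)
    # Build the dummy design matrix: one column per weekday.
    X = np.zeros((returns.size, 5))
    X[np.arange(returns.size), weekdays] = 1.0
    beta, _, _, _ = np.linalg.lstsq(X, returns, rcond=None)
    resid = returns - X @ beta
    dof = returns.size - 5
    sigma2 = resid @ resid / dof
    # Classical (homoskedastic) standard errors; an assumption of this sketch.
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return beta, beta / se
```

A coefficient whose t-statistic is large in absolute value suggests that the mean return on that weekday differs from zero; testing whether the weekday means differ from each other would need an additional joint test, which is left out here.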
Abstract:
Cooling crystallization is one of the most important purification and separation techniques in the chemical and pharmaceutical industry. The product of the cooling crystallization process is always a suspension that contains both the mother liquor and the product crystals, and therefore the first process step following crystallization is usually solid-liquid separation. The properties of the produced crystals, such as their size and shape, can be affected by modifying the conditions during the crystallization process. The filtration characteristics of solid/liquid suspensions, on the other hand, are strongly influenced by the particle properties, as well as the properties of the liquid phase. It is thus obvious that the effect of the changes made to the crystallization parameters can also be seen in the course of the filtration process. Although the relationship between crystallization and filtration is widely recognized, the number of publications where these unit operations have been considered in the same context seems to be surprisingly small. This thesis explores the influence of different crystallization parameters in an unseeded batch cooling crystallization process on the external appearance of the product crystals and on the pressure filtration characteristics of the obtained product suspensions. Crystallization experiments are performed by crystallizing sulphathiazole (C9H9N3O2S2), which is a well-known antibiotic agent, from different mixtures of water and n-propanol in an unseeded batch crystallizer. The crystallization parameters studied are the composition of the solvent, the cooling rate in experiments carried out at a constant cooling rate throughout the whole batch, the cooling profile, and the mixing intensity during the batch.
The obtained crystals are characterized by using an automated image analyzer and the crystals are separated from the solvent through constant pressure batch filtration experiments. Separation characteristics of the suspensions are described by means of average specific cake resistance and average filter cake porosity, and the compressibilities of the cakes are also determined. The results show that fairly large differences can be observed between the size and shape of the crystals, and it is also shown experimentally that the changes in the crystal size and shape have a direct impact on the pressure filtration characteristics of the crystal suspensions. The experimental results are utilized to create a procedure that can be used for estimating the filtration characteristics of solid-liquid suspensions according to the particle size and shape data obtained by image analysis. Multilinear partial least squares regression (N-PLS) models are created between the filtration parameters and the particle size and shape data, and the results presented in this thesis show that relatively obvious correlations can be detected with the obtained models.
Abstract:
This dissertation is based on 5 articles which deal with reaction mechanisms of the following selected industrially important organic reactions:
1. dehydrocyclization of n-butylbenzene to produce naphthalene
2. dehydrocyclization of 1-(p-tolyl)-2-methylbutane (MB) to produce 2,6-dimethylnaphthalene
3. esterification of neopentyl glycol (NPG) with different carboxylic acids to produce monoesters
4. skeletal isomerization of 1-pentene to produce 2-methyl-1-butene and 2-methyl-2-butene
The results of initial- and integral-rate experiments of n-butylbenzene dehydrocyclization over a self-made chromia/alumina catalyst were applied when investigating reaction 2. Reaction 2 was performed using commercial chromia/alumina of different acidity, platinum on silica and vanadium/calcium/alumina as catalysts. On all catalysts used for the dehydrocyclization, the major reactions were fragmentation of MB and 1-(p-tolyl)-2-methylbutenes (MBes), dehydrogenation of MB, double bond transfer, hydrogenation and 1,6-cyclization of MBes. Minor reactions were 1,5-cyclization of MBes and methyl group fragmentation of 1,6-cyclization products. Esterification reactions of NPG were performed using three different carboxylic acids: propionic, isobutyric and 2-ethylhexanoic acid. Commercial heterogeneous gellular (Dowex 50WX2) and macroreticular (Amberlyst 15) type resins and homogeneous para-toluene sulfonic acid were used as catalysts. First, NPG reacted with the carboxylic acid to form the corresponding monoester and water. The monoester was then esterified with the carboxylic acid to form the corresponding diester. In the disproportionation reaction, two monoester molecules formed NPG and the corresponding diester. All three of these reactions can attain equilibrium. During esterification, water was removed from the reactor in order to prevent the backward reaction. Skeletal isomerization experiments of 1-pentene were performed over an HZSM-22 catalyst.
Isomerization reactions of three different kinds were detected: double bond, cis-trans and skeletal isomerization. Minor side reactions were dimerization and fragmentation. Monomolecular and bimolecular reaction mechanisms for skeletal isomerization explained the experimental results almost equally well. Pseudohomogeneous kinetic parameters of reactions 1 and 2 were estimated by ordinary least squares fitting. For reactions 3 and 4, kinetic parameters were estimated by the least squares method, and the possible cross-correlation and identifiability of the parameters were also determined using the Markov chain Monte Carlo (MCMC) method. Finally, using the MCMC method, the estimation of model parameters and predictions was performed according to the Bayesian paradigm. According to the fitting results, the suggested reaction mechanisms explained the experimental results rather well. When the possible cross-correlation and identifiability of the parameters (reactions 3 and 4) were determined using the MCMC method, the parameters were well identified, and no pathological cross-correlation could be seen between any parameter pair.
Abstract:
The focus of this dissertation is the motivational influences on transfer in higher education and professional training contexts. To estimate these motivational influences, the dissertation includes seven individual studies that are structured in two parts. Part I, Dimensions, aims at identifying the dimensionality of motivation to transfer and its structural relations with training-related antecedents and outcomes. Part II, Boundary Conditions, aims at testing the predictive validity of motivation theories used in contemporary training research under different study conditions. Data in this dissertation was gathered from multi-item questionnaires, which were analyzed differently in Part I and Part II. Studies in Part I employed exploratory and confirmatory factor analysis, structural equation modeling, partial least squares (PLS) path modeling, and mediation analysis. Studies in Part II used artifact distribution meta-analysis, (nested) subgroup analysis, and weighted least squares (WLS) multiple regression. Results demonstrate that motivation to transfer can be conceptualized as a three-dimensional construct, including autonomous motivation to transfer, controlled motivation to transfer, and intention to transfer, given a theoretical framework informed by expectancy theory, self-determination theory, and the theory of planned behavior. Results also demonstrate that a range of boundary conditions moderates motivational influences on transfer. 
To test the predictive validity of expectancy theory, social cognitive theory, and the theory of goal orientations under different study settings, a total of 17 boundary conditions were meta-analyzed, including age; assessment criterion; assessment source; attendance policy; collaboration among trainees; computer support; instruction; instrument used to measure motivation; level of education; publication type; social training context; SS/SMC bias; study setting; survey modality; type of knowledge being trained; use of a control group; and work context. Together, the findings cumulated in this thesis support the basic premise that motivation is centrally important for transfer, but that motivational influences need to be understood from a more differentiated perspective than commonly found in the literature, in order to account for several dimensions and boundary conditions. The results of this dissertation across the seven individual studies are reflected in terms of their implications for theory development and their significance for training evaluation and the design of training environments. Limitations and directions to take in future research are discussed.
Abstract:
Machine learning provides tools for automated construction of predictive models in data intensive areas of engineering and science. The family of regularized kernel methods has in recent years become one of the mainstream approaches to machine learning, due to a number of advantages the methods share. The approach provides theoretically well-founded solutions to the problems of under- and overfitting, allows learning from structured data, and has been empirically demonstrated to yield high predictive performance on a wide range of application domains. Historically, the problems of classification and regression have gained the majority of attention in the field. In this thesis we focus on another type of learning problem, that of learning to rank. In learning to rank, the aim is to learn, from a set of past observations, a ranking function that can order new objects according to how well they match some underlying criterion of goodness. As an important special case of the setting, we can recover the bipartite ranking problem, corresponding to maximizing the area under the ROC curve (AUC) in binary classification. Ranking applications appear in a large variety of settings; examples encountered in this thesis include document retrieval in web search, recommender systems, information extraction and automated parsing of natural language. We consider the pairwise approach to learning to rank, where ranking models are learned by minimizing the expected probability of ranking any two randomly drawn test examples incorrectly. The development of computationally efficient kernel methods based on this approach has in the past proven to be challenging. Moreover, it is not clear which techniques for estimating the predictive performance of learned models are the most reliable in the ranking setting, and how the techniques can be implemented efficiently. The contributions of this thesis are as follows.
First, we develop RankRLS, a computationally efficient kernel method for learning to rank that is based on minimizing a regularized pairwise least-squares loss. In addition to training methods, we introduce a variety of algorithms for tasks such as model selection, multi-output learning, and cross-validation, based on computational shortcuts from matrix algebra. Second, we improve the fastest known training method for the linear version of the RankSVM algorithm, which is one of the most well established methods for learning to rank. Third, we study the combination of the empirical kernel map and reduced set approximation, which allows the large-scale training of kernel machines using linear solvers, and propose computationally efficient solutions to cross-validation when using the approach. Next, we explore the problem of reliable cross-validation when using AUC as a performance criterion, through an extensive simulation study. We demonstrate that the proposed leave-pair-out cross-validation approach leads to more reliable performance estimation than commonly used alternative approaches. Finally, we present a case study on applying machine learning to information extraction from biomedical literature, which combines several of the approaches considered in the thesis. The thesis is divided into two parts. Part I provides the background for the research work and summarizes the most central results; Part II consists of the five original research articles that are the main contribution of this thesis.
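The idea of minimizing a regularized pairwise least-squares loss can be illustrated with a small linear sketch. This is not the RankRLS implementation from the thesis; it assumes a linear model, a loss summed over all pairs of training examples, and the hypothetical function name `pairwise_rls_linear`. The pair sum Σ_{i<j} (d_i − d_j)² equals dᵀLd with L = nI − 11ᵀ, which yields a closed-form solution.

```python
import numpy as np

def pairwise_rls_linear(X, y, lam):
    """Linear ranker minimizing a regularized pairwise squared loss.

    Objective: sum_{i<j} ((y_i - y_j) - (w.x_i - w.x_j))^2 + lam*||w||^2.
    The pair sum equals d^T L d with d = y - Xw and L = n*I - 1 1^T,
    so setting the gradient to zero gives
    (X^T L X + lam*I) w = X^T L y.
    """
    n, p = X.shape
    L = n * np.eye(n) - np.ones((n, n))
    return np.linalg.solve(X.T @ L @ X + lam * np.eye(p), X.T @ L @ y)
```

Because the loss depends only on differences between scores, adding a constant to every label leaves the learned weights unchanged, which matches the goal of predicting an ordering rather than absolute values. Forming the n-by-n matrix L explicitly is done here only for readability.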