24 results for Least squares method
Abstract:
Learning of preference relations has recently received significant attention in the machine learning community. It is closely related to classification and regression analysis and can be reduced to these tasks. However, preference learning involves predicting an ordering of the data points rather than a single numerical value, as in regression, or a class label, as in classification. Therefore, studying preference relations within a separate framework not only facilitates a better theoretical understanding of the problem but also motivates the development of efficient algorithms for the task. Preference learning has many applications in domains such as information retrieval, bioinformatics, and natural language processing. For example, algorithms that learn to rank are frequently used in search engines for ordering the documents retrieved by a query. Preference learning methods have also been applied to collaborative filtering problems for predicting individual customer choices from vast amounts of user-generated feedback. In this thesis we propose several algorithms for learning preference relations. These algorithms stem from the well-founded and robust class of regularized least-squares methods and have many attractive computational properties. In order to improve the performance of our methods, we introduce several non-linear kernel functions. Thus, the contribution of this thesis is twofold: kernel functions for structured data, which are used to take advantage of various non-vectorial data representations, and preference learning algorithms suitable for different tasks, namely efficient learning of preference relations, learning with large amounts of training data, and semi-supervised preference learning. The proposed kernel-based algorithms and kernels are applied to the parse ranking task in natural language processing, document ranking in information retrieval, and remote homology detection in the bioinformatics domain.
Training of kernel-based ranking algorithms can be infeasible when the training set is large. This problem is addressed by proposing a preference learning algorithm whose computational complexity scales linearly with the number of training data points. We also introduce a sparse approximation of the algorithm that can be trained efficiently with large amounts of data. For situations in which only a small amount of labeled data but a large amount of unlabeled data is available, we propose a co-regularized preference learning algorithm. To conclude, the methods presented in this thesis address not only the efficient training of the algorithms but also fast regularization parameter selection, multiple output prediction, and cross-validation. Furthermore, the proposed algorithms lead to notably better performance in many of the preference learning tasks considered.
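As a rough illustration of how a regularized least-squares ranker can be trained in time linear in the number of training points, the sketch below fits a linear model to all pairwise score differences. It is not the algorithm proposed in the thesis, which the abstract does not specify; the function name `fit_pairwise_rls`, the linear model, and the all-pairs squared loss are assumptions made for this example.

```python
import numpy as np

def fit_pairwise_rls(X, y, lam=1.0):
    """Minimize sum_{i<j} ((y_i - y_j) - (x_i - x_j) @ w)^2 + lam * ||w||^2.

    Hypothetical helper for illustration only (not taken from the thesis).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, d = X.shape
    # The all-pairs loss equals r^T (n*I - 1 1^T) r with r = y - X w, so the
    # normal equations need only column sums, never the n-by-n pair matrix.
    s = X.sum(axis=0)
    A = n * (X.T @ X) - np.outer(s, s) + lam * np.eye(d)
    b = n * (X.T @ y) - s * y.sum()
    return np.linalg.solve(A, b)

# Synthetic example: scores are a linear function of the features.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 3.0  # a constant shift does not affect pairwise differences
w = fit_pairwise_rls(X, y, lam=1e-8)
ranking = np.argsort(-(X @ w))  # indices from highest to lowest predicted score
```

Building the system costs O(n d^2) for n points and d features, which is the sense in which this toy version scales linearly with the training set size.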
Abstract:
The purpose of the thesis is to analyze whether the returns of general stock market indices of Estonia, Latvia and Lithuania follow the random walk hypothesis (RWH), and in addition, whether they are consistent with the weak-form efficiency criterion. Also the existence of the day-of-the-week anomaly is examined in the same regional markets. The data consists of daily closing quotes of the OMX Tallinn, Riga and Vilnius total return indices for the sample period from January 3, 2000 to August 28, 2009. Moreover, the full sample period is also divided into two sub-periods. The RWH is tested by applying three quantitative methods (i.e. the Augmented Dickey-Fuller unit root test, serial correlation test and non-parametric runs test). Ordinary Least Squares (OLS) regression with dummy variables is employed to detect the day-of-the-week anomalies. The random walk hypothesis (RWH) is rejected in the Estonian and Lithuanian stock markets. The Latvian stock market exhibits more efficient behaviour, although some evidence of inefficiency is also found, mostly during the first sub-period from 2000 to 2004. Day-of-the-week anomalies are detected on every stock market examined, though no longer during the later sub-period.
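A minimal sketch of a day-of-the-week regression with dummy variables follows. The abstract does not give the exact specification, so the choices here are assumptions: weekdays are coded 0 (Monday) to 4 (Friday), all five dummies are included without an intercept, and the function name `day_of_week_ols` is hypothetical. Under this specification each coefficient is simply the mean return for that weekday; significance testing is omitted.

```python
import numpy as np

def day_of_week_ols(returns, weekdays):
    """Regress daily returns on five weekday dummies (no intercept)."""
    returns = np.asarray(returns, dtype=float)
    weekdays = np.asarray(weekdays, dtype=int)
    # One column per weekday; exactly one entry per row is set to 1.
    D = np.zeros((returns.size, 5))
    D[np.arange(returns.size), weekdays] = 1.0
    beta, *_ = np.linalg.lstsq(D, returns, rcond=None)
    return beta

# Synthetic daily returns, for illustration only.
rng = np.random.default_rng(0)
weekdays = np.arange(250) % 5
returns = rng.normal(loc=0.0, scale=0.01, size=250)
beta = day_of_week_ols(returns, weekdays)
```

A day-of-the-week anomaly would show up as coefficients that differ significantly from one another, which a full analysis would check with the usual t- or F-tests.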
Abstract:
Cooling crystallization is one of the most important purification and separation techniques in the chemical and pharmaceutical industry. The product of the cooling crystallization process is always a suspension that contains both the mother liquor and the product crystals, and therefore the first process step following crystallization is usually solid-liquid separation. The properties of the produced crystals, such as their size and shape, can be affected by modifying the conditions during the crystallization process. The filtration characteristics of solid-liquid suspensions, on the other hand, are strongly influenced by the particle properties, as well as the properties of the liquid phase. The effect of changes made to the crystallization parameters can therefore also be seen in the course of the filtration process. Although the relationship between crystallization and filtration is widely recognized, surprisingly few publications have considered these unit operations in the same context. This thesis explores the influence of different crystallization parameters in an unseeded batch cooling crystallization process on the external appearance of the product crystals and on the pressure filtration characteristics of the obtained product suspensions. Crystallization experiments are performed by crystallizing sulphathiazole (C9H9N3O2S2), a well-known antibiotic agent, from different mixtures of water and n-propanol in an unseeded batch crystallizer. The crystallization parameters studied are the composition of the solvent, the cooling rate in experiments carried out with a constant cooling rate throughout the whole batch, the cooling profile, and the mixing intensity during the batch.
The obtained crystals are characterized by using an automated image analyzer and the crystals are separated from the solvent through constant pressure batch filtration experiments. Separation characteristics of the suspensions are described by means of average specific cake resistance and average filter cake porosity, and the compressibilities of the cakes are also determined. The results show that fairly large differences can be observed between the size and shape of the crystals, and it is also shown experimentally that the changes in the crystal size and shape have a direct impact on the pressure filtration characteristics of the crystal suspensions. The experimental results are utilized to create a procedure that can be used for estimating the filtration characteristics of solid-liquid suspensions according to the particle size and shape data obtained by image analysis. Multilinear partial least squares regression (N-PLS) models are created between the filtration parameters and the particle size and shape data, and the results presented in this thesis show that relatively obvious correlations can be detected with the obtained models.
Abstract:
The focus of this dissertation is the motivational influences on transfer in higher education and professional training contexts. To estimate these motivational influences, the dissertation includes seven individual studies that are structured in two parts. Part I, Dimensions, aims at identifying the dimensionality of motivation to transfer and its structural relations with training-related antecedents and outcomes. Part II, Boundary Conditions, aims at testing the predictive validity of motivation theories used in contemporary training research under different study conditions. Data in this dissertation was gathered from multi-item questionnaires, which were analyzed differently in Part I and Part II. Studies in Part I employed exploratory and confirmatory factor analysis, structural equation modeling, partial least squares (PLS) path modeling, and mediation analysis. Studies in Part II used artifact distribution meta-analysis, (nested) subgroup analysis, and weighted least squares (WLS) multiple regression. Results demonstrate that motivation to transfer can be conceptualized as a three-dimensional construct, including autonomous motivation to transfer, controlled motivation to transfer, and intention to transfer, given a theoretical framework informed by expectancy theory, self-determination theory, and the theory of planned behavior. Results also demonstrate that a range of boundary conditions moderates motivational influences on transfer. 
To test the predictive validity of expectancy theory, social cognitive theory, and the theory of goal orientations under different study settings, a total of 17 boundary conditions were meta-analyzed, including age; assessment criterion; assessment source; attendance policy; collaboration among trainees; computer support; instruction; instrument used to measure motivation; level of education; publication type; social training context; SS/SMC bias; study setting; survey modality; type of knowledge being trained; use of a control group; and work context. Together, the findings cumulated in this thesis support the basic premise that motivation is centrally important for transfer, but that motivational influences need to be understood from a more differentiated perspective than commonly found in the literature, in order to account for several dimensions and boundary conditions. The results of this dissertation across the seven individual studies are reflected in terms of their implications for theory development and their significance for training evaluation and the design of training environments. Limitations and directions to take in future research are discussed.
Abstract:
Singular Value Decomposition (SVD), Principal Component Analysis (PCA) and Multiple Linear Regression (MLR) are some of the mathematical preliminaries that are discussed prior to explaining PLS and PCR models. Both PLS and PCR are applied to real spectral data, and their differences and similarities are discussed in this thesis. The challenge lies in establishing the optimum number of components to be included in either of the models, but this has been overcome by using various diagnostic tools suggested in this thesis. Correspondence analysis (CA) and PLS were applied to ecological data. The idea of CA was to correlate the macrophyte species and lakes. The differences between the PLS model for ecological data and the PLS model for spectral data are noted and explained in this thesis.
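To make the link between SVD, PCA and PCR concrete, the sketch below fits a principal component regression by projecting centered data onto the leading right singular vectors and regressing the response on the resulting scores. It is an illustrative assumption about how PCR is commonly implemented, not code from the thesis; the function name `fit_pcr` and the synthetic data are hypothetical.

```python
import numpy as np

def fit_pcr(X, y, n_components):
    """Principal component regression via the SVD of the centered data."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    # Rows of Vt are the principal directions; s holds the singular values.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:n_components].T
    T = Xc @ V_k  # scores; T.T @ T is diagonal with entries s**2
    gamma = (T.T @ (y - y_mean)) / s[:n_components] ** 2
    coef = V_k @ gamma  # coefficients expressed in the original variables
    intercept = y_mean - x_mean @ coef
    return coef, intercept

# Synthetic data, for illustration only.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 4))
y = X @ np.array([0.5, -1.0, 2.0, 0.0]) + 2.0 + 0.1 * rng.normal(size=40)
coef_full, intercept_full = fit_pcr(X, y, n_components=4)
coef_two, intercept_two = fit_pcr(X, y, n_components=2)
```

With all components retained, PCR coincides with ordinary multiple linear regression; dropping components trades some bias for lower variance, which is why the number of components has to be chosen with diagnostics such as cross-validation.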
Abstract:
Experiential marketing is increasingly seen as a new magical key to consumers’ hearts. Brands are turning brick-and-mortar stores into state of the art retail spaces where memorable experiences and strong brand relationships are hoped to be born. Around the globe, several brands have opened up a special format of stores – the experience store. Although many speculations on the positive effects of experiences have been presented, few studies have provided empirical, quantified evidence for the link between store experiences and brand success. In consequence, research was needed to find out whether experience stores truly are so special. The purpose of this thesis was to investigate whether store experiences are capable of building brands and influencing store performance. For this purpose, empirical research was conducted in the Samsung Experience Store Helsinki. As main constructs of the study, store experience, brand equity, store performance, and product class involvement were measured, along with relevant background variables. Data was collected with an electronic survey from actual customers of the store, resulting in a sample of 131 respondents. Partial least squares structural equations modeling (PLS) was used for the analysis of the research model. Also, regression analysis was conducted to account for mediation and moderation effects. The results showed that store experiences do positively influence first, store performance, and second, separate dimensions of brand equity (that is, brand awareness, brand personality, and brand loyalty). Also, the effect of store experiences on store performance was found to be mediated by brand equity. Interestingly, customers’ product class involvement was detected to moderate the effect of store experience on store performance. That is, those who were highly involved with electronics had greater store experiences, and also displayed a stronger linkage between store experience and store performance. 
The results encourage marketers to continue their efforts to create great experiences for their customers. Experience stores can, and should, be seen as both powerful brand-building tools and profitable sales channels. Creating exceptional experiences can be an important function of physical stores in the face of severe online competition.
Abstract:
This work investigates theoretical properties of symmetric and anti-symmetric kernels. The first chapters give an overview of the theory of kernels used in supervised machine learning. The central focus is on the regularized least squares algorithm, which is motivated as a problem of function reconstruction through an abstract inverse problem. A brief review of reproducing kernel Hilbert spaces shows how kernels define an implicit hypothesis space with multiple equivalent characterizations and how this space may be modified by incorporating prior knowledge. Mathematical results for the abstract inverse problem, in particular spectral properties, the pseudoinverse and regularization, are recalled and then specialized to kernels. Symmetric and anti-symmetric kernels are applied in relation learning problems that incorporate prior knowledge that the relation is symmetric or anti-symmetric, respectively. Theoretical properties of these kernels are proved in a draft on which this thesis is based and which is comprehensively referenced here. These proofs show that these kernels can be guaranteed to learn only symmetric or anti-symmetric relations, and that they can learn any relation relative to the original kernel, modified to learn only the symmetric or anti-symmetric part. Further results prove spectral properties of these kernels, the central result being a simple inequality for the trace of the estimator, also called the effective dimension. This quantity is used in learning bounds to guarantee smaller variance.
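One common way to build such kernels, shown here only as an assumed illustration since the abstract does not state the construction used in the thesis, is to start from a product kernel on pairs, K((a, b), (c, d)) = k(a, c) k(b, d), and add or subtract the same kernel with one pair swapped. The Gaussian base kernel and all function names below are hypothetical choices for the example.

```python
import numpy as np

def gaussian(x, z, gamma=0.5):
    """Base kernel on single objects (an arbitrary choice for this sketch)."""
    diff = np.asarray(x, dtype=float) - np.asarray(z, dtype=float)
    return float(np.exp(-gamma * (diff @ diff)))

def symmetric_kernel(p, q, base=gaussian):
    """Pair kernel that is unchanged when the elements of a pair are swapped."""
    (a, b), (c, d) = p, q
    return 0.5 * (base(a, c) * base(b, d) + base(a, d) * base(b, c))

def antisymmetric_kernel(p, q, base=gaussian):
    """Pair kernel that changes sign when the elements of a pair are swapped."""
    (a, b), (c, d) = p, q
    return 0.5 * (base(a, c) * base(b, d) - base(a, d) * base(b, c))

# Four random objects, for illustration only.
rng = np.random.default_rng(2)
a, b, c, d = rng.normal(size=(4, 3))
k_sym = symmetric_kernel((a, b), (c, d))
k_sym_swapped = symmetric_kernel((b, a), (c, d))
k_anti = antisymmetric_kernel((a, b), (c, d))
k_anti_swapped = antisymmetric_kernel((b, a), (c, d))
```

Because any function learned with a kernel method is a linear combination of kernel evaluations, a predictor built on `antisymmetric_kernel` satisfies f(a, b) = -f(b, a) by construction, and one built on `symmetric_kernel` satisfies f(a, b) = f(b, a).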
Abstract:
The aim of this study was to contribute to current knowledge-based theory by addressing a research gap: the empirical determination of the simultaneous but distinguishable effects of intellectual capital (IC) assets and knowledge management (KM) practices on organisational performance (OP). The analysis built on past research and on theorised interactions between latent constructs. The IC and KM constructs were specified using survey-based items measured from a sample of Finnish companies, and the dependent OP construct was determined using information available from financial databases. Two measures that are widely used and commonly recommended in the management science literature, the return on total assets (ROA) and the return on equity (ROE), were calculated for OP. The relationship between IC and KM and their impact on OP could thus be investigated against the stated hypotheses using objectively derived performance indicators. Using financial OP measures also strengthened the dynamic features of the data needed for analysing simultaneous and causal dependences between the modelled constructs, which were specified using structural path models. The parameters of the structural path models were estimated using a partial least squares-based regression estimator. The results showed that the path dependencies between IC and OP and between KM and OP were always insignificant when analysed separately from any other interactions or indirect effects arising from simultaneous modelling, regardless of whether ROA or ROE was used as the OP measure. The dependency between the KM and IC constructs appeared to be very strong and was always significant when modelled simultaneously with other possible interactions between the constructs, with either ROA or ROE defining OP.
This study, however, did not find statistically unambiguous evidence for the hypothesised causal mediation effects, which suggest, for instance, that the effects of KM practices on OP are mediated by the IC assets. Because some indication of fluctuation in the causal effects was observed, it was concluded that further studies are needed to verify the fundamental and likely hidden causal effects between the constructs of interest. It was therefore also recommended that complementary modelling and data processing be carried out to elucidate whether mediation effects occur between IC, KM and OP; verifying this requires further investigation of the measured items and can be built on the findings of this study.
Abstract:
World energy production currently relies heavily on the combustion of solid fuels such as coal, peat, biomass, and municipal solid waste, while the share of renewable fuels is anticipated to increase in the future to mitigate climate change. In Finland, peat and wood are widely used for energy production. In any case, the combustion of solid fuels generates several types of thermal conversion residues, such as bottom ash, fly ash, and boiler slag. The predominant residue type is determined by the incineration technology applied, while its composition depends primarily on the composition of the fuels combusted. Extensive research has been conducted on the technical suitability of ash for multiple recycling methods. Most attention has been paid to the recycling of coal combustion residues, as coal is the primary solid fuel consumed globally. The recycling methods for coal residues include utilization in the cement industry, in concrete manufacturing, and in mine backfilling, to name a few. Biomass combustion residues have also been studied to some extent, with forest fertilization, road construction, and road stabilization being the predominant utilization options. Lastly, residues from municipal solid waste incineration have attracted more attention recently, following the growing number of waste incineration plants globally. The recycling methods for waste incineration residues are the most limited, owing to their hazardous nature and varying composition, and include, among others, landfill construction, road construction, and mine backfilling. In this study, the environmental and economic aspects of multiple recycling options for thermal conversion residues generated within a case-study area were examined. The case-study area was South-East Finland. The environmental analysis was performed using an internationally recognized methodology, life cycle assessment. The economic assessment was conducted using a widely used methodology, cost-benefit analysis.
Finally, the results of the analyses were combined to enable easier comparison of the recycling methods. The recycling methods included the use of ash in forest fertilization, road construction, road stabilization, and landfill construction. Ash landfilling was set as the baseline scenario. Quantitative data on the amounts of ash generated and its composition were obtained from companies, their environmental reports, technical reports and other previously published literature. Overall, the amount of ash in the case-study area was 101 700 t. However, only the data on 58 400 t of fly ash and 35 100 t of bottom ash and boiler slag were included in the study, because data on the leaching of heavy metals were lacking in some cases. The recycling methods were modelled according to previously published scientific studies. Overall, the results of the study indicated that ash utilization for the fertilization and neutralization of 17 600 ha of forest was the most economically beneficial method, increasing the net present value by 58% compared to ash landfilling. Regarding the environmental impact, the use of ash in the construction of 11 km of roads was the most attractive method, with an environmental impact 13% lower than that of ash landfilling. The least preferred method was the use of ash for landfill construction, since it enabled only an 11% increase in net present value while inducing an additional 1% of negative impact on the environment. Therefore, the following recycling route was proposed in the study. Where possible and legally acceptable, fly and bottom ash should be recycled for forest fertilization, which has the strictest requirements of all the studied methods. If the quality of fly ash is not suitable for forest fertilization, it should be utilized first in paved road construction and second in road stabilization. Bottom ash not suitable for forest fertilization, as well as boiler slag, should be used in landfill construction.
Landfilling should only be practiced when recycling by any of the methods is not possible because of legal requirements or insufficient demand on the market. The current demand for ash and possible future changes were assessed in the study. Currently, the area of forest fertilized in the case-study area is only 451 ha, whereas about 17 600 ha of forest could be fertilized with the ash generated in the region. Provided that the average forest fertilizing values in Finland are higher and the area treated with fellings is about 40 000 ha, the amount of ash utilized in forest fertilization could be increased. Regarding road construction, no new projects launched by the Center of Economic Development, Transport and the Environment in the case-study area were identified. A potential application can be found in the construction of private roads. However, no centralized data about such projects is available. The use of ash in the stabilization of forest roads is not expected to increase in the future, given the current downward trend in the length of forest roads built. Finally, the use of ash in landfill construction is not a promising option because of the decreasing number of landfills in operation in Finland.