11 results for Structured methods
in Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland
Abstract:
Learning of preference relations has recently received significant attention in the machine learning community. It is closely related to classification and regression analysis and can be reduced to these tasks. However, preference learning involves predicting an ordering of the data points rather than a single numerical value, as in regression, or a class label, as in classification. Therefore, studying preference relations within a separate framework not only facilitates a better theoretical understanding of the problem but also motivates the development of efficient algorithms for the task. Preference learning has many applications in domains such as information retrieval, bioinformatics, and natural language processing. For example, algorithms that learn to rank are frequently used in search engines for ordering the documents retrieved by a query. Preference learning methods have also been applied to collaborative filtering problems for predicting individual customer choices from vast amounts of user-generated feedback. In this thesis we propose several algorithms for learning preference relations. These algorithms stem from the well-founded and robust class of regularized least-squares methods and have many attractive computational properties. In order to improve the performance of our methods, we introduce several non-linear kernel functions. Thus, the contribution of this thesis is twofold: kernel functions for structured data, which are used to take advantage of various non-vectorial data representations, and preference learning algorithms that are suitable for different tasks, namely efficient learning of preference relations, learning with large amounts of training data, and semi-supervised preference learning. The proposed kernel-based algorithms and kernels are applied to the parse ranking task in natural language processing, document ranking in information retrieval, and remote homology detection in the bioinformatics domain.
Training of kernel-based ranking algorithms can be infeasible when the training set is large. This problem is addressed by proposing a preference learning algorithm whose computational complexity scales linearly with the number of training data points. We also introduce a sparse approximation of the algorithm that can be efficiently trained with large amounts of data. For situations where a small amount of labeled data but a large amount of unlabeled data is available, we propose a co-regularized preference learning algorithm. To conclude, the methods presented in this thesis address not only the problem of efficient training of the algorithms but also fast regularization parameter selection, multiple output prediction, and cross-validation. Furthermore, the proposed algorithms lead to notably better performance in many of the preference learning tasks considered.
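One way to see how a pairwise least-squares loss can be evaluated in time linear in the number of data points is a standard algebraic identity for sums over all pairs. The sketch below is only an illustration of that identity; whether the thesis relies on exactly this shortcut is an assumption, and the function names are hypothetical.

```python
import numpy as np

def pairwise_sq_loss_naive(y, f):
    """Sum over all pairs i < j of ((y_i - y_j) - (f_i - f_j))^2, in O(n^2)."""
    r = np.asarray(y, dtype=float) - np.asarray(f, dtype=float)
    n = len(r)
    return sum((r[i] - r[j]) ** 2 for i in range(n) for j in range(i + 1, n))

def pairwise_sq_loss_linear(y, f):
    """Same quantity in O(n), using
    sum_{i<j} (r_i - r_j)^2 = n * sum(r_i^2) - (sum r_i)^2,
    where r = y - f is the vector of residuals."""
    r = np.asarray(y, dtype=float) - np.asarray(f, dtype=float)
    n = len(r)
    return n * np.sum(r ** 2) - np.sum(r) ** 2

# Hypothetical labels and predicted scores, for illustration only.
y = [3.0, 1.0, 2.0]
f = [2.5, 0.0, 2.0]
naive = pairwise_sq_loss_naive(y, f)
fast = pairwise_sq_loss_linear(y, f)
```

Both functions return the same value; only the second avoids enumerating the pairs explicitly.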
Abstract:
Investment decision-making on far-reaching innovation ideas is one of the key challenges that practitioners and academics face in the field of innovation management. However, management practices and theories rely strongly on evaluation systems that do not fit well with this setting. These systems and practices normally cannot capture the value of future opportunities under high uncertainty because they ignore the firm's potential for growth and flexibility. Real options theory and options-based methods have been offered as a solution to facilitate decision-making on highly uncertain investment objects. Much of the uncertainty inherent in these investment objects is attributable to unknown future events. In this setting, real options theory and methods have faced some challenges. First, the theory and its applications have largely been limited to market-priced real assets. Second, the options perspective has not proved as useful as anticipated because the tools it offers are perceived to be too complicated for managerial use. Third, there are challenges related to the type of uncertainty that existing real options methods can handle: they are primarily limited to parametric uncertainty. Nevertheless, the theory is considered promising in the context of far-reaching and strategically important innovation ideas. The objective of this dissertation is to clarify the potential of options-based methodology in the identification of innovation opportunities. The constructive research approach gives new insights into the development potential of real options theory under non-parametric and close-to-radical uncertainty. The distinction between real options and strategic options is presented as an explanans for the discovered limitations of the theory. The findings offer managers a new means of assessing future innovation ideas based on the frameworks constructed during the course of the study.
Abstract:
Machine learning provides tools for the automated construction of predictive models in data-intensive areas of engineering and science. The family of regularized kernel methods has in recent years become one of the mainstream approaches to machine learning, owing to a number of advantages the methods share. The approach provides theoretically well-founded solutions to the problems of under- and overfitting, allows learning from structured data, and has been empirically demonstrated to yield high predictive performance in a wide range of application domains. Historically, the problems of classification and regression have received the majority of attention in the field. In this thesis we focus on another type of learning problem, that of learning to rank. In learning to rank, the aim is to learn, from a set of past observations, a ranking function that can order new objects according to how well they match some underlying criterion of goodness. As an important special case of the setting, we can recover the bipartite ranking problem, which corresponds to maximizing the area under the ROC curve (AUC) in binary classification. Ranking applications appear in a large variety of settings; examples encountered in this thesis include document retrieval in web search, recommender systems, information extraction, and automated parsing of natural language. We consider the pairwise approach to learning to rank, where ranking models are learned by minimizing the expected probability of ranking any two randomly drawn test examples incorrectly. The development of computationally efficient kernel methods based on this approach has in the past proven to be challenging. Moreover, it is not clear which techniques for estimating the predictive performance of learned models are the most reliable in the ranking setting, or how these techniques can be implemented efficiently. The contributions of this thesis are as follows.
First, we develop RankRLS, a computationally efficient kernel method for learning to rank that is based on minimizing a regularized pairwise least-squares loss. In addition to training methods, we introduce a variety of algorithms for tasks such as model selection, multi-output learning, and cross-validation, based on computational shortcuts from matrix algebra. Second, we improve the fastest known training method for the linear version of the RankSVM algorithm, which is one of the most well-established methods for learning to rank. Third, we study the combination of the empirical kernel map and reduced set approximation, which allows large-scale training of kernel machines using linear solvers, and propose computationally efficient solutions to cross-validation when using the approach. Next, we explore the problem of reliable cross-validation when using AUC as a performance criterion, through an extensive simulation study. We demonstrate that the proposed leave-pair-out cross-validation approach leads to more reliable performance estimation than commonly used alternative approaches. Finally, we present a case study on applying machine learning to information extraction from biomedical literature, which combines several of the approaches considered in the thesis. The thesis is divided into two parts. Part I provides the background for the research work and summarizes the most central results; Part II consists of the five original research articles that are the main contribution of this thesis.
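The idea behind leave-pair-out cross-validation for AUC can be sketched as follows: each positive-negative pair is held out in turn, a model is trained on the remaining examples, and the estimate is the fraction of held-out pairs that the model orders correctly. The code below is a naive illustration that retrains a plain linear ridge regressor for every pair; it does not use the computational shortcuts the abstract refers to, and the model choice, function names, and toy data are all assumptions.

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Closed-form linear ridge regression: solve (X^T X + lam*I) w = X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def leave_pair_out_auc(X, y, lam=1.0):
    """Naive leave-pair-out AUC estimate for labels in {+1, -1}.

    Every (positive, negative) pair is removed from the training set,
    the model is retrained, and the pair counts as correct when the
    positive example receives the higher score (ties count as 0.5).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    pos = np.flatnonzero(y == 1)
    neg = np.flatnonzero(y == -1)
    score = 0.0
    for i in pos:
        for j in neg:
            keep = np.ones(len(y), dtype=bool)
            keep[[i, j]] = False  # hold out this pair
            w = ridge_fit(X[keep], y[keep], lam)
            diff = X[i] @ w - X[j] @ w
            score += 1.0 if diff > 0 else (0.5 if diff == 0 else 0.0)
    return score / (len(pos) * len(neg))

# Hypothetical, cleanly separable one-dimensional data.
X_toy = [[2.0], [3.0], [4.0], [-2.0], [-3.0], [-4.0]]
y_toy = [1, 1, 1, -1, -1, -1]
auc_toy = leave_pair_out_auc(X_toy, y_toy)
```

Retraining once per pair costs one model fit for every positive-negative combination, which is why efficient implementations matter in practice.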
Abstract:
Phenomena in the cyber domain, especially threats to security and privacy, have become an increasingly heated topic, addressed by writers and scholars at an increasing pace both nationally and internationally. However, little public research has been done on the subject of cyber intelligence. The main research question of the thesis was: To what extent is the applicability of cyber intelligence acquisition methods circumstantial? The study was conducted in a sequential manner, starting with defining the concept of intelligence in the cyber domain and identifying its key attributes, followed by identifying the range of intelligence methods in the cyber domain, the criteria influencing their applicability, and the types of operatives utilizing cyber intelligence. The methods and criteria were refined into a hierarchical model. The existing conceptions of cyber intelligence were mapped through an extensive literature study of a wide variety of sources. The established understanding was further developed through 15 semi-structured interviews with experts of different backgrounds, whose wide range of viewpoints substantially enhanced the perspective on the subject. Four of the interviewed experts participated in a relatively extensive survey based on the constructed hierarchical model of cyber intelligence, which was formulated into an AHP hierarchy and executed in the Expert Choice Comparion online application. It was concluded that intelligence in the cyber domain is an endorsing, cross-cutting intelligence discipline that adds value to all aspects of conventional intelligence, that it bears a substantial number of characteristic traits, both advantageous and disadvantageous, and that the applicability of cyber intelligence methods is partly circumstantially limited.
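For readers unfamiliar with AHP (the analytic hierarchy process), the core calculation at each node of such a hierarchy is to turn a matrix of pairwise comparisons into priority weights. The sketch below uses the common principal-eigenvector method together with Saaty's consistency ratio. It is not the Expert Choice Comparion implementation, and the comparison matrix, function name, and criteria count are hypothetical.

```python
import numpy as np

# Saaty's random consistency index for small matrix sizes.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12}

def ahp_priorities(A):
    """Return (priority weights, consistency ratio) for a reciprocal
    pairwise comparison matrix A, where A[i][j] states how strongly
    item i is preferred over item j."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    eigvals, eigvecs = np.linalg.eig(A)
    k = np.argmax(eigvals.real)            # principal eigenvalue
    w = np.abs(eigvecs[:, k].real)         # its eigenvector, sign removed
    w = w / w.sum()                        # normalise weights to sum to 1
    ci = (eigvals[k].real - n) / (n - 1)   # consistency index
    return w, ci / RANDOM_INDEX[n]         # consistency ratio

# Hypothetical, perfectly consistent judgements over three criteria
# whose true weights are 0.6, 0.3, and 0.1.
true_w = np.array([0.6, 0.3, 0.1])
A_example = true_w[:, None] / true_w[None, :]
weights, cr = ahp_priorities(A_example)
```

A consistency ratio below roughly 0.1 is conventionally taken to mean that the judgements are acceptably consistent.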
Abstract:
Abstract: Comparison of thinning methods in a Scots pine stand on drained peatland. A simulation study
Abstract:
Summary