24 resultados para Semi-supervised clustering
em Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland
Resumo:
Learning of preference relations has recently received significant attention in machine learning community. It is closely related to the classification and regression analysis and can be reduced to these tasks. However, preference learning involves prediction of ordering of the data points rather than prediction of a single numerical value as in case of regression or a class label as in case of classification. Therefore, studying preference relations within a separate framework facilitates not only better theoretical understanding of the problem, but also motivates development of the efficient algorithms for the task. Preference learning has many applications in domains such as information retrieval, bioinformatics, natural language processing, etc. For example, algorithms that learn to rank are frequently used in search engines for ordering documents retrieved by the query. Preference learning methods have been also applied to collaborative filtering problems for predicting individual customer choices from the vast amount of user generated feedback. In this thesis we propose several algorithms for learning preference relations. These algorithms stem from well founded and robust class of regularized least-squares methods and have many attractive computational properties. In order to improve the performance of our methods, we introduce several non-linear kernel functions. Thus, contribution of this thesis is twofold: kernel functions for structured data that are used to take advantage of various non-vectorial data representations and the preference learning algorithms that are suitable for different tasks, namely efficient learning of preference relations, learning with large amount of training data, and semi-supervised preference learning. Proposed kernel-based algorithms and kernels are applied to the parse ranking task in natural language processing, document ranking in information retrieval, and remote homology detection in bioinformatics domain. Training of kernel-based ranking algorithms can be infeasible when the size of the training set is large. This problem is addressed by proposing a preference learning algorithm whose computation complexity scales linearly with the number of training data points. We also introduce sparse approximation of the algorithm that can be efficiently trained with large amount of data. For situations when small amount of labeled data but a large amount of unlabeled data is available, we propose a co-regularized preference learning algorithm. To conclude, the methods presented in this thesis address not only the problem of the efficient training of the algorithms but also fast regularization parameter selection, multiple output prediction, and cross-validation. Furthermore, proposed algorithms lead to notably better performance in many preference learning tasks considered.
Resumo:
This thesis is about detection of local image features. The research topic belongs to the wider area of object detection, which is a machine vision and pattern recognition problem where an object must be detected (located) in an image. State-of-the-art object detection methods often divide the problem into separate interest point detection and local image description steps, but in this thesis a different technique is used, leading to higher quality image features which enable more precise localization. Instead of using interest point detection the landmark positions are marked manually. Therefore, the quality of the image features is not limited by the interest point detection phase and the learning of image features is simplified. The approach combines both interest point detection and local description into one phase for detection. Computational efficiency of the descriptor is therefore important, leaving out many of the commonly used descriptors as unsuitably heavy. Multiresolution Gabor features has been the main descriptor in this thesis and improving their efficiency is a significant part. Actual image features are formed from descriptors by using a classifierwhich can then recognize similar looking patches in new images. The main classifier is based on Gaussian mixture models. Classifiers are used in one-class classifier configuration where there are only positive training samples without explicit background class. The local image feature detection method has been tested with two freely available face detection databases and a proprietary license plate database. The localization performance was very good in these experiments. Other applications applying the same under-lying techniques are also presented, including object categorization and fault detection.
Resumo:
The purpose of this study was to analyse the nursing student-patient relationship and factors associated with this relationship from the point of view of both students and patients, and to identify factors that predict the type of relationship. The ultimate goal is to improve supervised clinical practicum with a view to supporting students in their reciprocal collaborative relationships with patients, increase their preparedness to meet patients’ health needs, and thus to enhance the quality of patient care. The study was divided into two phases. In the first phase (1999-2005), a literature review concerning the student-patient relationship was conducted (n=104 articles) and semi-structured interviews carried out with nursing students (n=30) and internal medicine patients (n=30). Data analysis was by means of qualitative content analysis and Student-Patient Relationship Scales, which were specially developed for this research. In the second phase (2005-2007), the data were collected by SPR scales among nursing students (n=290) and internal medicine patients (n=242). The data were analysed statistically by SPSS 12.0 software. The results revealed three types of student-patient relationship: a mechanistic relationship focusing on the student’s learning needs; an authoritative relationship focusing on what the student assumes is in the patient’s best interest; and a facilitative relationship focusing on the common good of both student and patient. Students viewed their relationship with patients more often as facilitative and authoritative than mechanistic, while in patients’ assessments the authoritative relationship occurred most frequently and the facilitative relationship least frequently. Furthermore, students’ and patients’ views on their relationships differed significantly. A number of background factors, contextual factors and consequences of the relationship were found to be associated with the type of relationship. In the student data, factors that predicted the type of relationship were age, current year of study and support received in the relationship with patient. The higher the student’s age, the more likely the relationship with the patient was facilitative. Fourth year studies and the support of a person other than a supervisor were significantly associated with an authoritative relationship. Among patients, several factors were found to predict the type of nursing student-patient relationships. Significant factors associated with a facilitative relationship were university-level education, several previous hospitalizations, admission to hospital for a medical problem, experience of caring for an ill family member and patient’s positive perception of atmosphere during collaboration and of student’s personal and professional growth. In patients, positive perceptions of student’s personal and professional attributes and patient’s improved health and a greater commitment to self-care, on the other hand, were significantly associated with an authoritative relationship, whereas positive perceptions of one’s own attributes as a patient were significantly associated with a mechanistic relationship. It is recommended that further research on the student-patient relationship and related factors should focus on questions of content, methodology and education.
Resumo:
The main objective of this study is to assess the potential of the information technology industry in the Saint Petersburg area to become one of the new key industries in the Russian economy. To achieve this objective, the study analyzes especially the international competitiveness of the industry and the conditions for clustering. Russia is currently heavily dependent on its natural resources, which are the main source of its recent economic growth. In order to achieve good long-term economic performance, Russia needs diversification in its well-performing industries in addition to the ones operating in the field of natural resources. The Russian government has acknowledged this and started special initiatives to promote such other industries as information technology and nanotechnology. An interesting industry that is basically less than 20 years old and fast growing in Russia, is information technology. Information technology activities and markets are mainly concentrated in Russia’s two biggest cities, Moscow and Saint Petersburg, and areas around them. The information technology industry in the Saint Petersburg area, although smaller than Moscow, is especially dynamic and is gaining increasing foreign company presence. However, the industry is not yet internationally competitive as it lacks substantial and sustainable competitive advantages. The industry is also merely a potential global information technology cluster, as it lacks the competitive edge and a wide supplier and manufacturing base and other related parts of the whole information technology value system. Alone, the industry will not become a key industry in Russia, but it will, on the other hand, have an important supporting role for the development of other industries. The information technology market in the Saint Petersburg area is already large and if more tightly integrated to Moscow, they will together form a huge and still growing market sufficient for most companies operating in Russia currently and in the future. Therefore, the potential of information technology inside Russia is immense.
Resumo:
The main objective of this study is to assess the potential of the information technology industry in the Saint Petersburg area to become one of the new key industries in the Russian economy. To achieve this objective, the study analyzes especially the international competitiveness of the industry and the conditions for clustering. Russia is currently heavily dependent on its natural resources, which are the main source of its recent economic growth. In order to achieve good long-term economic performance, Russia needs diversification in its well-performing industries in addition to the ones operating in the field of natural resources. The Russian government has acknowledged this and started special initiatives to promote such other industries as information technology and nanotechnology. An interesting industry that is basically less than 20 years old and fast growing in Russia, is information technology. Information technology activities and markets are mainly concentrated in Russia’s two biggest cities, Moscow and Saint Petersburg, and areas around them. The information technology industry in the Saint Petersburg area, although smaller than Moscow, is especially dynamic and is gaining increasing foreign company presence. However, the industry is not yet internationally competitive as it lacks substantial and sustainable competitive advantages. The industry is also merely a potential global information technology cluster, as it lacks the competitive edge and a wide supplier and manufacturing base and other related parts of the whole information technology value system. Alone, the industry will not become a key industry in Russia, but it will, on the other hand, have an important supporting role for the development of other industries. The information technology market in the Saint Petersburg area is already large and if more tightly integrated to Moscow, they will together form a huge and still growing market sufficient for most companies operating in Russia currently and in the future. Therefore, the potential of information technology inside Russia is immense.
Resumo:
In this thesis we study the field of opinion mining by giving a comprehensive review of the available research that has been done in this topic. Also using this available knowledge we present a case study of a multilevel opinion mining system for a student organization's sales management system. We describe the field of opinion mining by discussing its historical roots, its motivations and applications as well as the different scientific approaches that have been used to solve this challenging problem of mining opinions. To deal with this huge subfield of natural language processing, we first give an abstraction of the problem of opinion mining and describe the theoretical frameworks that are available for dealing with appraisal language. Then we discuss the relation between opinion mining and computational linguistics which is a crucial pre-processing step for the accuracy of the subsequent steps of opinion mining. The second part of our thesis deals with the semantics of opinions where we describe the different ways used to collect lists of opinion words as well as the methods and techniques available for extracting knowledge from opinions present in unstructured textual data. In the part about collecting lists of opinion words we describe manual, semi manual and automatic ways to do so and give a review of the available lists that are used as gold standards in opinion mining research. For the methods and techniques of opinion mining we divide the task into three levels that are the document, sentence and feature level. The techniques that are presented in the document and sentence level are divided into supervised and unsupervised approaches that are used to determine the subjectivity and polarity of texts and sentences at these levels of analysis. At the feature level we give a description of the techniques available for finding the opinion targets, the polarity of the opinions about these opinion targets and the opinion holders. Also at the feature level we discuss the various ways to summarize and visualize the results of this level of analysis. In the third part of our thesis we present a case study of a sales management system that uses free form text and that can benefit from an opinion mining system. Using the knowledge gathered in the review of this field we provide a theoretical multi level opinion mining system (MLOM) that can perform most of the tasks needed from an opinion mining system. Based on the previous research we give some hints that many of the laborious market research tasks that are done by the sales force, which uses this sales management system, can improve their insight about their partners and by that increase the quality of their sales services and their overall results.
Resumo:
Arkit: A4 [B2].