29 results for Supervised and Unsupervised Classification
in Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland
Abstract:
An important task in environmental monitoring is to assess the current state of the environment and the changes humans have caused to it, and to analyze and identify the relationships between them. Environmental change can be managed by collecting and analyzing data. This Master's thesis studies changes observed in aquatic vegetation using remotely sensed measurement data and image analysis methods. Aerial images of Saimaa, Finland's largest lake, taken in 1996 and 1999 were used for the monitoring. The first stage of the image analysis is geometric correction, whose purpose is to register the images and bring them into the same coordinate system. The second stage is to match corresponding local regions and to detect changes in the vegetation. Several approaches were used to identify vegetation, including supervised and unsupervised recognition methods. The study used real, noisy measurement data, and the experiments based on it gave good results regarding the success of the study.
Abstract:
Fluent health information flow is critical for clinical decision-making. However, a considerable part of this information is free-form text, and the inability to utilize it creates risks to patient safety and cost-effective hospital administration. Methods for automated processing of clinical text are emerging. The aim of this doctoral dissertation is to study machine learning and clinical text in order to support health information flow. First, by analyzing the content of authentic patient records, the aim is to specify clinical needs in order to guide the development of machine learning applications. The contributions are a model of the ideal information flow, a model of the problems and challenges in reality, and a road map for the technology development. Second, by developing applications for practical cases, the aim is to concretize ways to support health information flow. Altogether five machine learning applications for three practical cases are described: the first two applications are binary classification and regression related to the practical case of topic labeling and relevance ranking. The third and fourth applications are supervised and unsupervised multi-class classification for the practical case of topic segmentation and labeling. These four applications are tested with Finnish intensive care patient records. The fifth application is multi-label classification for the practical task of diagnosis coding. It is tested with English radiology reports. The performance of all these applications is promising. Third, the aim is to study how the quality of machine learning applications can be reliably evaluated. The associations between performance evaluation measures and methods are addressed, and a new hold-out method is introduced. This method contributes not only to processing time but also to the evaluation diversity and quality. The main conclusion is that developing machine learning applications for text requires interdisciplinary, international collaboration.
Practical cases are very different, and hence the development must begin from genuine user needs and domain expertise. The technological expertise must cover linguistics, machine learning, and information systems. Finally, the methods must be evaluated both statistically and through authentic user feedback.
Abstract:
The purpose of the thesis is to classify suppliers and to enhance strategic purchasing in the case company. Supplier classification is conducted to fulfill the requirements of the company quality manual and international quality standards. To gain more benefit, a strategic purchasing tool, Kraljic's purchasing portfolio, and the analytical hierarchy process are used as the basis for supplier classification. The purchasing portfolio gives quick and easy visual insight into product group management from the viewpoint of purchasing. On the basis of the purchasing portfolio, alternative purchasing and supplier strategies can be formed that enhance the strategic orientation of purchasing. Thus the purchasing portfolio pushes the company toward proactive and strategic purchasing. As a result, a survey method that exploits the analytical hierarchy process is developed for implementing the purchasing portfolio in the company. Experts from the company set the categorization criteria and also participate in the survey to place product groups on the portfolio. Alternative purchasing strategies are formed. Suppliers are classified according to the importance and characteristics of the product groups they supply.
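The weighting step of an analytical hierarchy process can be sketched as follows. This is a minimal illustration, not the thesis's survey method: it uses the row geometric-mean approximation of the priority weights, and the criteria names and pairwise judgments are hypothetical values chosen for the example.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method.

    pairwise[i][j] states how much more important criterion i is than
    criterion j; the matrix is expected to be reciprocal.
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical criteria and expert judgments (assumptions, not from the thesis).
criteria = ["supply risk", "profit impact", "purchase volume"]
pairwise = [
    [1.0, 2.0, 4.0],
    [1 / 2, 1.0, 2.0],
    [1 / 4, 1 / 2, 1.0],
]

weights = ahp_weights(pairwise)
for name, weight in zip(criteria, weights):
    print(f"{name}: {weight:.3f}")
```

The resulting weights sum to one and could then be used to score product groups before placing them on the portfolio.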
Abstract:
The overwhelming amount and unprecedented speed of publication in the biomedical domain make it difficult for life science researchers to acquire and maintain a broad view of the field and gather all information that would be relevant for their research. As a response to this problem, the BioNLP (Biomedical Natural Language Processing) community of researchers has emerged and strives to assist life science researchers by developing modern natural language processing (NLP), information extraction (IE) and information retrieval (IR) methods that can be applied at large scale, to scan the whole publicly available biomedical literature and extract and aggregate the information found within, while automatically normalizing the variability of natural language statements. Among different tasks, biomedical event extraction has recently received much attention within the BioNLP community. Biomedical event extraction constitutes the identification of biological processes and interactions described in biomedical literature, and their representation as a set of recursive event structures. The 2009–2013 series of BioNLP Shared Tasks on Event Extraction have given rise to a number of event extraction systems, several of which have been applied at a large scale (the full set of PubMed abstracts and PubMed Central Open Access full text articles), leading to the creation of massive biomedical event databases, each containing millions of events. Since top-ranking event extraction systems are based on a machine-learning approach and are trained on the narrow-domain, carefully selected Shared Task training data, their performance drops when they are faced with the topically highly varied PubMed and PubMed Central documents. Specifically, false-positive predictions by these systems lead to the generation of incorrect biomolecular events, which are spotted by the end users.
This thesis proposes a novel post-processing approach, utilizing a combination of supervised and unsupervised learning techniques, that can automatically identify and filter out a considerable proportion of incorrect events from large-scale event databases, thus increasing the general credibility of those databases. The second part of this thesis is dedicated to a system we developed for hypothesis generation from large-scale event databases, which is able to discover novel biomolecular interactions among genes/gene-products. We cast the hypothesis generation problem as supervised network topology prediction, i.e., predicting new edges in the network, as well as types and directions for these edges, utilizing a set of features that can be extracted from large biomedical event networks. Routine machine learning evaluation results, as well as manual evaluation results, suggest that the problem is indeed learnable. This work won the Best Paper Award at the 5th International Symposium on Languages in Biology and Medicine (LBM 2013).
Abstract:
In this thesis we study the field of opinion mining by giving a comprehensive review of the available research on this topic. Using this knowledge, we also present a case study of a multilevel opinion mining system for a student organization's sales management system. We describe the field of opinion mining by discussing its historical roots, its motivations and applications, as well as the different scientific approaches that have been used to solve the challenging problem of mining opinions. To deal with this huge subfield of natural language processing, we first give an abstraction of the problem of opinion mining and describe the theoretical frameworks that are available for dealing with appraisal language. Then we discuss the relation between opinion mining and computational linguistics, which is a crucial pre-processing step for the accuracy of the subsequent steps of opinion mining. The second part of our thesis deals with the semantics of opinions, where we describe the different ways used to collect lists of opinion words as well as the methods and techniques available for extracting knowledge from opinions present in unstructured textual data. In the part about collecting lists of opinion words we describe manual, semi-manual and automatic ways to do so and give a review of the available lists that are used as gold standards in opinion mining research. For the methods and techniques of opinion mining we divide the task into three levels: the document, sentence and feature levels. The techniques presented at the document and sentence levels are divided into supervised and unsupervised approaches that are used to determine the subjectivity and polarity of texts and sentences at these levels of analysis. At the feature level we describe the techniques available for finding the opinion targets, the polarity of the opinions about these opinion targets, and the opinion holders.
Also at the feature level we discuss the various ways to summarize and visualize the results of this level of analysis. In the third part of our thesis we present a case study of a sales management system that uses free-form text and that can benefit from an opinion mining system. Using the knowledge gathered in the review of this field, we propose a theoretical multilevel opinion mining system (MLOM) that can perform most of the tasks needed from an opinion mining system. Based on the previous research, we suggest that such a system could ease many of the laborious market research tasks carried out by the sales force that uses this sales management system, improving its insight into its partners and thereby the quality of its sales services and its overall results.
Abstract:
Tropical forests are sources of many ecosystem services, but these forests are vanishing rapidly. The situation is severe in Sub-Saharan Africa and especially in Tanzania. The causes of change are multidimensional and strongly interdependent, and only understanding them comprehensively helps to change the ongoing unsustainable trends of forest decline. Ongoing forest changes, their spatiality and their connection to humans and the environment can be studied with the methods of Land Change Science. The knowledge produced with these methods helps to make arguments about the actors, actions and causes that are behind the forest decline. In this study of Unguja Island in Zanzibar the focus is on the current forest cover and its changes between 1996 and 2009. The cover and changes are measured with commonly used remote sensing methods: automated land cover classification and post-classification comparison of medium-resolution satellite images. Kernel Density Estimation is used to determine the clusters of change, sub-area analysis provides information about the differences between regions, and distance and regression analyses connect changes to environmental factors. These analyses not only explain the changes that have occurred, but also allow building quantitative and spatial future scenarios. A similar study has not been made for Unguja before, and it therefore provides new information that is beneficial for the whole society. The results show that 572 km² of Unguja is still forested, but 0.82–1.19% of these forests are disappearing annually. Besides deforestation, vertical degradation and spatial changes are also significant problems. Deforestation is most severe in the communal indigenous forests, but agroforests are also decreasing. Spatially, deforestation is concentrated in the areas close to the coastline, population and Zanzibar Town. Biophysical factors, on the other hand, do not seem to influence the ongoing deforestation process.
If the current trend continues, approximately 485 km² of forest should remain in 2025. Solutions to these deforestation problems should be sought in sustainable land use management, surveying and protection of the forests in risk areas, and spatially targeted self-sustainable tree planting schemes.
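The 2025 figure can be checked with a short calculation. The sketch below assumes that the 572 km² refers to 2009 (the end of the study period) and that the annual loss compounds year on year; neither assumption is stated in the abstract, and the function name is illustrative only.

```python
def project_forest_area(area_km2, annual_loss_rate, years):
    """Project forest area under a constant, compounding annual loss rate."""
    return area_km2 * (1.0 - annual_loss_rate) ** years


AREA_2009_KM2 = 572.0   # assumed base year: 2009
YEARS = 2025 - 2009     # 16 years

# Bounds from the reported annual loss range of 0.82-1.19 %.
pessimistic = project_forest_area(AREA_2009_KM2, 0.0119, YEARS)
optimistic = project_forest_area(AREA_2009_KM2, 0.0082, YEARS)

print(f"Projected 2025 forest area: {pessimistic:.0f} to {optimistic:.0f} km2")
```

Under these assumptions the reported estimate of about 485 km² falls between the two bounds.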
Abstract:
Rare-earth based upconverting nanoparticles (UCNPs) have attracted much attention due to their unique luminescent properties. The ability to convert multiple photons of lower energy to ones with higher energy through an upconversion (UC) process offers a wide range of applications for UCNPs. The emission intensities and wavelengths of UCNPs are important performance characteristics, which determine the appropriate applications. However, insufficient intensities still limit the use of UCNPs; in particular, the efficient emission of blue and ultraviolet (UV) light via upconversion remains challenging, as these events require three or more near-infrared (NIR) photons. The aim of the study was to enhance the blue and UV upconversion emission intensities of Tm³⁺-doped NaYF₄ nanoparticles and to demonstrate their utility in in vitro diagnostics. As the distance between the sensitizer and the activator significantly affects the energy transfer efficiency, different strategies were explored to change the local symmetry around the doped lanthanides. One important strategy is the intentional co-doping of active (participating in energy transfer) or passive (not participating in energy transfer) impurities into the host matrix. The roles of doped passive impurities (K⁺ and Sc³⁺) in enhancing the blue and UV upconversions, as well as in influencing the intense UV upconversion emission through excess sensitization (active impurity), were studied. Additionally, the effects of both active and passive impurity doping on the morphological and optical performance of UCNPs were investigated. The applicability of UV-emitting UCNPs as an internal light source for glucose sensing in a dry chemistry test strip was demonstrated. The measurements were in agreement with the traditional method based on reflectance measurements using an external UV light source.
The use of UCNPs in the glucose test strip offers an alternative detection method with advantages such as control signals for minimizing errors and high penetration of the NIR excitation through the blood sample, which gives more freedom for designing the optical setup. In bioimaging, the excitation of the UCNPs in the transparent IR region of the tissue permits measurements, which are free of background fluorescence and have a high signal-to-background ratio. In addition, the narrow emission bandwidth of the UCNPs enables multiplexed detections. An array-in-well immunoassay was developed using two different UC emission colours. The differentiation between different viral infections and the classification of antibody responses were achieved based on both the position and colour of the signal. The study demonstrates the potential of spectral and spatial multiplexing in the imaging based array-in-well assays.
Abstract:
Due to the large number of characteristics, there is a need to extract the most relevant characteristics from the input data, so that the amount of information lost in this way is minimal and the classification performed on the projected data set remains relevant with respect to the original data. To achieve this feature extraction, various statistical techniques, including principal component analysis (PCA), may be used. This thesis describes an extension of PCA that allows the extraction of a finite number of relevant features from high-dimensional fuzzy data and noisy data. PCA finds linear combinations of the original measurement variables that describe the significant variation in the data. The two proposed methods were compared using postoperative patient data. Experimental results demonstrate that the two proposed methods can be applied to complex data. Fuzzy PCA was used in the classification problem. The classification was carried out with the similarity classifier algorithm, in which the weights of the total similarity measures are optimized with the differential evolution algorithm. This thesis presents a comparison of the classification results based on the data obtained from the fuzzy PCA.
Abstract:
In many applications it is very important to reduce the effect of the light source in order to perceive the true color of an object. This is needed, for example, in virtual museums, telemedicine, e-commerce and online money. This thesis develops a technique for removing highlights from spectral images. The work includes a review of general color image understanding, on the basis of which various highlight removal techniques were analyzed. A new highlight removal method was developed, based on the dichromatic reflection model, which describes spectral data in terms of the object's own color and the color of the illuminating light. The proposed highlight removal method makes use of several existing methods, such as principal component analysis and a data classification method. The attempt to develop a fast algorithm that also performs the task well was successful. The experiments were carried out according to the proposed method, and the working algorithm produced the desired results. The work also includes suggestions for improving the presented algorithm.
Abstract:
The purpose of this work is to determine how operational knowledge is utilized in process design. The aim is to find ways to improve the management of operational knowledge during the design process and to determine whether this affects the quality of process design. The quality of process design is evaluated with seven criteria: investment costs, operating costs, safety, environmental impact, operability, innovativeness and schedule. The design process is divided into three phases: preliminary design, basic design and detailed design. Process design, the investment project, the quality criteria of process design, the phases of the design process and the classification of operational knowledge are examined in general terms. The work investigated the utilization of operational knowledge at Kemira. First, general statements about the utilization of operational knowledge were formulated on the basis of interviews with experts from various fields outside Kemira. Kemira's process designers then evaluated the statements. On the basis of these evaluations, conclusions were drawn about the utilization of operational knowledge in process design in general. Next, people who had been involved in two different types of case projects were interviewed, and the general statements were adapted to suit these projects. The people involved in the projects evaluated the statements, and the projects were compared with each other on the basis of these evaluations. Finally, conclusions based on the evaluations of all the statements are presented. The conclusion is that operational knowledge can be utilized in all design phases, but the greatest benefit is obtained in basic and detailed design. Operational knowledge can influence some quality criteria of process design, such as operability and safety, more than others. Kemira is recommended to develop its current knowledge management methods so that the availability and transfer of operational knowledge would improve. Pr
Abstract:
The aim of the study was to identify the forms of business relationship used in the procurement of MRO products and the issues to be considered when moving toward cooperation with a supplier. The study was carried out as a qualitative case study in which the data collection was based on interviews, internal documentation and participant observation. The analysis was based on a theoretical product classification and on an examination of the classification groups in practice. The key result of the study is the observation that the use of cooperative relationships is increasing in the procurement of standard products. This is due to the aim of procuring these products with as few resources as possible, so that purchasing attention can be focused on more critical products. The most common business relationships used for MRO products are competitive tendering and annual and framework agreements. Maintenance agreements and partnerships are possible when the strategic importance of the products becomes significant and a high level of trust prevails between the parties.
Abstract:
According to many academic studies, the development of marketing capabilities can enhance organizational performance. Similarly, downstream marketing capabilities play an important role in accomplishing organizational goals. The downstream marketing capabilities identified in this research are Marketing Communication, Selling, Marketing Implementation, and Market Information Management. These four capabilities are summarized as the following abilities. First, the ability to manage customers' opinions regarding the value offered by the organization. Second, the ability of the organization to obtain orders from new and established customers. Third, the ability to align and translate the marketing strategy into an operating action plan along with the deployment of the organizational resources. Fourth, the continuous process of gathering and managing information about the markets. Moreover, the literature review of this research sheds light on the elements that compose the downstream marketing capabilities. Specifically, this research examined the downstream processes and the information required to control these processes, based on the American Productivity and Quality Center's Process Classification Framework. Furthermore, the literature review examined some of the technological tools that are used in marketing processes, as well as some managerial implications regarding the management of downstream marketing employees. Along with the investigation of downstream marketing capabilities, the literature review investigated the utilization and benefits of the Component Business Model and the Process Classification Framework, as they are defined by the organizations that developed them. Besides this initial study, the research presents how the examined organization is using the two frameworks together by cross-referencing them.
Finally, the research presents the optimal deployment of the collected downstream capability elements in the current organizational structure. The optimal deployment is grounded in the information collected from the literature review and from internal documentation provided by the examined organization. By comparing the optimal deployment with the current condition of the organization, the research identifies some points for improvement, as well as some of the projects that are currently in progress inside the organization and that will eventually provide solutions to these shortcomings.
Abstract:
This Master's thesis studies the problem of modeling the depth of anaesthesia. It applies nonlinear dynamics theory to the analysis of EEG signals, followed by classification of the obtained values. The main stages of the study are the following: data preprocessing; calculation of optimal embedding parameters for phase space reconstruction; obtaining reconstructed phase portraits of each EEG signal; formation of the feature set characterising the obtained phase portraits; and classification of four different anaesthesia levels based on the previously estimated features. Classification was performed with linear and quadratic discriminant analysis, the k-nearest-neighbours method and online clustering. In addition, this work provides an overview of existing approaches to anaesthesia depth monitoring, a description of the basic concepts of nonlinear dynamics theory used in the thesis, and a comparative analysis of several different classification methods.
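The phase space reconstruction stage can be illustrated with time-delay embedding, a common way to reconstruct a phase portrait from a single signal. The abstract does not say which reconstruction method the thesis uses, so this is a sketch under that assumption; `delay_embed`, the embedding dimension `dim`, the delay `tau` and the sample values are all hypothetical.

```python
def delay_embed(signal, dim, tau):
    """Build delay vectors (x[i], x[i + tau], ..., x[i + (dim - 1) * tau])."""
    n_vectors = len(signal) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("signal is too short for the chosen dim and tau")
    return [
        tuple(signal[i + k * tau] for k in range(dim))
        for i in range(n_vectors)
    ]


# Hypothetical samples standing in for one preprocessed EEG channel.
samples = [0.0, 0.8, 1.0, 0.4, -0.5, -1.0, -0.7, 0.1]
portrait = delay_embed(samples, dim=3, tau=2)
for point in portrait:
    print(point)
```

Features computed from such reconstructed points would then be passed to the classifiers listed in the abstract.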
Abstract:
The Finnish Nuclear Energy Act requires that nuclear waste generated in the use of nuclear energy be handled, stored and finally disposed of in Finland. The spent nuclear fuel generated at the nuclear power plants of Fortum and TVO will be encapsulated and disposed of in the encapsulation and final disposal facility to be built at Olkiluoto. The aim of this work is to form an overall picture of radiation protection at the encapsulation and final disposal facility on the basis of earlier studies and plans. At the encapsulation plant, spent nuclear fuel is sealed in copper canisters, which are disposed of underground in the final disposal facility. The work first describes the disposal method and the operation of the encapsulation and final disposal facility. It then covers the legislation and regulatory guides that govern radiation protection at nuclear facilities. Next, the radiation sources present at the encapsulation and final disposal facility are discussed. The work also discusses the controlled area planned for the facility and its division into zones according to radiation conditions. The result of the work is an overall picture of radiation protection at the encapsulation and final disposal facility. In addition to this overall picture, preliminary plans were drawn up for organizing radiation protection during operation. Proposals were also made for more precise boundaries of the controlled area at the final disposal facility, and problems related to the radiation protection of the facilities were identified and solutions to them presented. One problem was that the difference in the nature of the controlled areas of the encapsulation plant and the final disposal facility had not been taken into account in the plans. It was also found that the controlled-area zoning currently used at nuclear facilities does not meet the needs of the encapsulation and final disposal facilities. The proposed solutions were a shoe-change boundary to be established between the facilities and the introduction of a new, higher radiation zone.
Abstract:
Customer satisfaction should be the main focus for all parts of the business. The supply chain behind the business usually plays a key role in pursuing this focus, especially in the repair service business. When focusing on the materials needed to repair equipment under service contracts, the time aspect of quality is critical. Do late deliveries from suppliers affect the service performance of repairs when the distribution center of a centralized purchasing unit acts as a buffer between suppliers and the repair service business? And if so, how should the improvement efforts be prioritized? These are the two main questions of this thesis. Correlation and linear regression were tested between the service levels of the suppliers and the distribution center. The percentage of on-time deliveries was compared to the outbound delivery service level. A statistically significant correlation was found between the success of inbound and outbound operations. The other main question of the thesis, improvement prioritization, was answered by creating a supplier classification based on material availability and, in addition, by developing a decision process for analyzing the most critical suppliers. This was built on previous supplier and material classification methods.
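The correlation and regression test between inbound and outbound service levels can be sketched as below. The monthly percentages are made-up illustrative values, not data from the thesis, and the helper names `pearson_r` and `fit_line` are hypothetical; a significance test would still be needed on top of this.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    return slope, mean_y - slope * mean_x


# Hypothetical monthly figures in percent (assumptions for illustration).
supplier_on_time = [82.0, 88.0, 91.0, 75.0, 95.0, 85.0]
outbound_service_level = [90.0, 93.0, 96.0, 86.0, 97.0, 92.0]

r = pearson_r(supplier_on_time, outbound_service_level)
slope, intercept = fit_line(supplier_on_time, outbound_service_level)
print(f"r = {r:.3f}, slope = {slope:.3f}, intercept = {intercept:.2f}")
```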