27 results for score test information matrix artificial regression
in Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland
Abstract:
The aim of this work was to study how gas bubbles are distributed in a pulp suspension when process conditions are changed. The bubble size distribution is used to map how gas bubbles break up and whether there is a threshold beyond which additional power no longer breaks the bubbles in the pulp suspension into smaller ones. The distributions may help in developing gas removal. The work examined whether camera technology can be used to determine bubble sizes in pulp stock. Opaque pulp is a challenging environment for imaging, and no comparable method had previously been used in the literature. Bubble diameters were calculated from the recorded images and then examined statistically. The statistical examination revealed differences between the measurement points. Based on the bubble diameters, the process variables affecting bubble size were modelled with linear regression analysis. The modelling yielded the independent variables affecting the responses and their mathematical model equations. The results showed that bubble size distributions differ between different parts of the mixing tank. In the mixing tank, the relative share of large bubbles increases as gas content and consistency rise. The main modelling result is that consistency and gas volume increase bubble size, whereas increasing the rotational speed reduces it. Visual information makes it easier to understand how bubbles behave.
Abstract:
OBJECTIVES: The purpose of this thesis is to examine the liquidity levels of different industries between 2007 and 2013. It also reviews the literature on cash management and liquidity, the various ratios that describe liquidity, and the factors that affect liquidity. In addition, it studies the information and communication sector in more detail. DATA: The data were collected from the Orbis database. Industry-level averages were either calculated with the formulas presented in Section 2 or retrieved directly from the database. The scatter plots were made with Excel, and the correlation matrix and regression analyses with SAS EG. RESULTS: This study presents industry-level averages of the liquidity ratio, solvency ratio and gearing, as well as of many other ratios that describe or affect liquidity. The study shows that, on average, liquidity and the ability to meet payments have remained fairly stable, but changes within individual industries are strong. In the IC sector, liquidity is affected by contribution margin, number of employees, turnover, balance sheet total and payment period.
Abstract:
Objective: To examine differences in the degree of self-esteem and family support among adolescents involved in different aggression roles in Ostrobothnia, Finland, and to examine the relation between aggression role, family support and self-esteem. Method: A sample of 3512 adolescents in grades 7 and 9 in schools in Ostrobothnia was considered for this study. The sample consisted of 1741 boys and 1771 girls with a mean age of 14.3 years (SD 1.10 years). Aggression was measured with the Mini Direct Indirect Aggression inventory (Mini-DIA) by Österman and Björkqvist (2008), self-esteem with the Rosenberg Self-Esteem Scale (RSES) by Rosenberg (1965), and family support with the family support part of the Multidimensional Scale of Perceived Social Support (PSSS) by Zimet, Dahlem, Zimet and Farley (1988). Chi-square tests, multivariate analysis and regression analyses were carried out. Results: The boys reported higher self-esteem and higher family support than the girls. Adolescents who were involved in aggression as victims or perpetrators reported lower self-esteem and family support than adolescents who were not involved in aggression. The regression analyses showed that family support and aggression role had significant effects on self-esteem in both boys and girls. There was also an interaction effect between family support and aggression role for girls: for example, the difference in self-esteem between perpetrator-victims and the control group was larger for girls with low family support than for girls with high family support.
Abstract:
Machine learning provides tools for automated construction of predictive models in data-intensive areas of engineering and science. The family of regularized kernel methods has in recent years become one of the mainstream approaches to machine learning, due to a number of advantages the methods share. The approach provides theoretically well-founded solutions to the problems of under- and overfitting, allows learning from structured data, and has been empirically demonstrated to yield high predictive performance on a wide range of application domains. Historically, the problems of classification and regression have gained the majority of attention in the field. In this thesis we focus on another type of learning problem, that of learning to rank. In learning to rank, the aim is to learn, from a set of past observations, a ranking function that can order new objects according to how well they match some underlying criterion of goodness. As an important special case of the setting, we can recover the bipartite ranking problem, corresponding to maximizing the area under the ROC curve (AUC) in binary classification. Ranking applications appear in a large variety of settings; examples encountered in this thesis include document retrieval in web search, recommender systems, information extraction and automated parsing of natural language. We consider the pairwise approach to learning to rank, where ranking models are learned by minimizing the expected probability of ranking any two randomly drawn test examples incorrectly. The development of computationally efficient kernel methods based on this approach has in the past proven to be challenging. Moreover, it is not clear which techniques for estimating the predictive performance of learned models are the most reliable in the ranking setting, and how the techniques can be implemented efficiently. The contributions of this thesis are as follows.
First, we develop RankRLS, a computationally efficient kernel method for learning to rank that is based on minimizing a regularized pairwise least-squares loss. In addition to training methods, we introduce a variety of algorithms for tasks such as model selection, multi-output learning, and cross-validation, based on computational shortcuts from matrix algebra. Second, we improve the fastest known training method for the linear version of the RankSVM algorithm, which is one of the best-established methods for learning to rank. Third, we study the combination of the empirical kernel map and reduced set approximation, which allows the large-scale training of kernel machines using linear solvers, and propose computationally efficient solutions to cross-validation when using the approach. Next, we explore the problem of reliable cross-validation when using AUC as a performance criterion, through an extensive simulation study. We demonstrate that the proposed leave-pair-out cross-validation approach leads to more reliable performance estimation than commonly used alternative approaches. Finally, we present a case study on applying machine learning to information extraction from biomedical literature, which combines several of the approaches considered in the thesis. The thesis is divided into two parts. Part I provides the background for the research work and summarizes the most central results; Part II consists of the five original research articles that are the main contribution of this thesis.
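The bipartite ranking criterion mentioned in the abstract above can be illustrated with a small sketch. It computes AUC as the fraction of positive-negative pairs that a scoring function orders correctly, which is the standard pairwise reading of AUC. The function name `pairwise_auc` and the convention of counting ties as half a correct pair are assumptions made for this illustration; this is not the RankRLS algorithm or the leave-pair-out procedure from the thesis.

```python
from itertools import product


def pairwise_auc(scores, labels):
    """Fraction of (positive, negative) pairs in which the positive example
    gets the higher score. Ties count as half a correct pair (an assumed,
    commonly used convention)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    correct = 0.0
    for p, n in product(pos, neg):
        if p > n:
            correct += 1.0
        elif p == n:
            correct += 0.5
    return correct / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical scores for four examples; labels 1 = positive, 0 = negative.
    print(pairwise_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

Minimizing the number of misordered pairs is the same as maximizing this quantity, which is why the pairwise approach recovers AUC maximization in the binary case.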
Abstract:
The along-scan radiometric gradient causes severe interpretation problems in Landsat images of tropical forests. It creates a decreasing trend in pixel values with the column number of the image. In practical applications it has been corrected by assuming the trend to be linear within structurally similar forests. This has improved the relation between floristic and remote sensing information, but only in some cases. I use 3 Landsat images and 105 floristic inventories to test the assumption of linearity, and to examine how the gradient and linear corrections affect the relation between floristic and Landsat data. Results suggest that the gradient is linear in the infrared bands. Also, the relation between floristic and Landsat data could be conditioned by the distribution of the sampling sites and the direction in which images are mosaicked. Additionally, there seems to be a conjunction between the radiometric gradient and a natural east-west vegetation gradient common in Western Amazonia. This conjunction might have artificially enhanced correlations between field and remotely sensed information in previous studies. Linear corrections may remove such artificial enhancement, but they also remove true and relevant spectral information about floristic patterns, because they cannot separate the radiometric gradient from a natural one.
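A linear correction of the kind the abstract above describes could look roughly like the following sketch. It fits a straight line to the column means of a single band and subtracts the fitted trend relative to the image centre. The function names, the use of column means, and the choice of the centre column as reference are all assumptions made for illustration; the abstract does not specify how the correction was implemented.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def remove_column_trend(image):
    """Subtract a fitted linear trend in pixel value versus column number.

    `image` is a list of rows (lists of pixel values) for one band.
    The trend is removed relative to the centre column, so the overall
    mean of the image is preserved (an assumed design choice).
    """
    n_cols = len(image[0])
    cols = list(range(n_cols))
    col_means = [sum(row[c] for row in image) / len(image) for c in cols]
    slope, _ = fit_line(cols, col_means)
    centre = (n_cols - 1) / 2
    return [[row[c] - slope * (c - centre) for c in cols] for row in image]


if __name__ == "__main__":
    # Hypothetical 2 x 4 band whose values fall by 2 per column.
    band = [[10, 8, 6, 4], [12, 10, 8, 6]]
    print(remove_column_trend(band))
```

A correction like this removes any linear trend across columns, whatever its cause, which is the limitation the abstract points out: it cannot tell a sensor gradient from a real east-west vegetation gradient.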
Abstract:
In this MA thesis, test anxiety related to English exams among Finnish upper secondary school students was studied. In addition, the ways students try to cope with test anxiety were investigated. The purpose of the study was to investigate gender differences in test anxiety, the effects of test anxiety on academic performance, and relationships between test anxiety, academic performance and coping strategies. Test anxiety and coping strategies were analysed as scores of questionnaire responses. Coping strategies comprised three categories: task-orientation and preparation, seeking social support, and avoidance. Academic performance was analysed as teacher ratings of general performance in English exams. In total, 67 subjects were studied. The subjects were Finnish general upper secondary school students. The data were collected by using online questionnaires. The data were mainly quantitative, but qualitative elements were also included. The quantitative data were analysed by using statistical methods. The results showed that females experienced statistically significantly more test anxiety than males. In addition, a statistically significant correlation was found between test anxiety levels and academic performance ratings of the subjects: the higher the test anxiety score, the lower the academic performance rating. A meaningful correlation was found between test anxiety and seeking social support as a coping strategy: a higher test anxiety score was related to using social support as a coping strategy. However, no relationships were found between academic performance and the three coping strategies when quantitative and qualitative data were analysed. Therefore, different coping strategies per se did not seem to be related to academic performance; instead, it was assumed that the effectiveness of coping strategies depends on individual differences.
In order to obtain more generalisable results and to gain more understanding of test anxiety and coping with it, a larger number of subjects from different areas of Finland and of different ages could be examined in future studies. Moreover, cross-national and cross-cultural studies could provide valuable information. As a practical recommendation for educational purposes, the results of this study indicated that a more individualised approach is needed.
Abstract:
The fatigue life estimates of three different welded joints were analysed using multivariate regression analysis. The regression is based on an extensive S-N database collected from the literature. The joints examined are the flat plate joint, the cruciform joint and the longitudinal attachment on a plate. The variables are stress range, thickness of the loaded plate and loading mode. The thickness effect was re-examined for all three joints. This re-examination confirmed the existence of the thickness effect before proceeding to multivariate regression. Linear fatigue life equations were derived for the three welded joints, taking into account the thickness of the loaded plate and the loading mode. The fatigue life equations were compared and discussed in light of test results collected from the literature. Four case studies were made from among the collected fatigue tests, and different fatigue life estimation methods were used to estimate fatigue life. The results were examined and discussed in light of the actual tests. The case studies considered 2 mm and 6 mm symmetric longitudinal attachments on a plate, a 12.7 mm asymmetric longitudinal attachment, a 38 mm symmetric longitudinal attachment under torsional loading and a 25 mm/38 mm load-carrying cruciform joint under torsional loading. The modelling was made as close to the tested joint as possible. The fatigue life estimation methods include the hot-spot method, in which the hot-spot stress was calculated using two linear extrapolations and nonlinear extrapolation as well as through-thickness integration. Notch stress and fracture mechanics methods were used in calculating the cruciform joint.
Abstract:
Superheater corrosion causes vast annual losses for power companies. With a reliable corrosion prediction method, plants can be designed accordingly, and knowledge of fuel selection and determination of process conditions may be utilized to minimize superheater corrosion. Growing interest in using recycled fuels creates additional demands for the prediction of corrosion potential. Models depending on corrosion theories will fail if relations between the inputs and the output are poorly known. A prediction model based on fuzzy logic and an artificial neural network is able to improve its performance as the amount of data increases. The corrosion rate of a superheater material can most reliably be detected with a test done in a test combustor or in a commercial boiler. The steel samples can be located in a special, temperature-controlled probe, and exposed to the corrosive environment for a desired time. These tests give information about the average corrosion potential in that environment. Samples may also be cut from superheaters during shutdowns. The analysis of samples taken from probes or superheaters after exposure to a corrosive environment is a demanding task: if the corrosive contaminants can be reliably analyzed, the corrosion chemistry can be determined, and an estimate of the material lifetime can be given. In cases where the reason for corrosion is not clear, the determination of the corrosion chemistry and the lifetime estimation is more demanding. In order to provide a laboratory tool for the analysis and prediction, a new approach was chosen. During this study, the following tools were generated:
· A model for the prediction of superheater fireside corrosion, based on fuzzy logic and an artificial neural network, built upon a corrosion database developed of fuel and bed material analyses and measured corrosion data. The developed model predicts superheater corrosion with high accuracy at the early stages of a project.
· An adaptive corrosion analysis tool based on image analysis, constructed as an expert system. This system utilizes implementation of user-defined algorithms, which allows the development of an artificially intelligent system for the task. According to the results of the analyses, several new rules were developed for the determination of the degree and type of corrosion.
By combining these two tools, a user-friendly expert system for the prediction and analysis of superheater fireside corrosion was developed. This tool may also be used for the minimization of corrosion risks in the design of fluidized bed boilers.
Abstract:
The objective of this thesis is to find out how information and communication technology affects the global consumption of printing and writing papers. Another objective is to find out whether these effects differ between paper grades. The empirical analysis is conducted by linear regression analysis using three sets of country-level panel data from 1990-2006. The newsprint data set contains 95 countries, the uncoated woodfree paper data set 61 countries and the coated mechanical paper data set 42 countries. The material is based on paper consumption data from RISI's Industry Statistics Database and on information and communication technology data from the GMID database. Results indicate that the number of Internet users has a statistically significant negative effect on the consumption of newsprint and of coated mechanical paper, and that the number of mobile telephone users has a positive effect on the consumption of these papers. Results also indicate that information and communication technologies have only a small effect, or no significant effect at all, on the consumption of uncoated woodfree paper, but these results are somewhat more uncertain.
Abstract:
This master's thesis creates a framework for the preliminary design of a product data management system. It has three dimensions: value creation, functionality and software. The framework helps to identify the value-creation components that can be influenced by the product data management functionalities offered by particular software classes. The system-design perspective of the framework is applied to the company cases studied, based on the relationships between the dimensions, which are modelled in the form of a calculation matrix. The importance ratings that the value-creation and functionality components received in an interview study conducted at the target company are entered into the matrix. The output of the matrix is the suitability of a given software product for that company's case. Suitability is a set of indicators that are analysed in the results-processing stage. The suitability results assist the target company in choosing its approach to product data management and describe the preliminarily designed product data management system. Building the framework requires a thorough approach to defining the relevant value-creation and functionality components and the software classes. The definition work is based on the methods and component groupings developed in detail in the thesis. Analysing each area makes it possible to build the framework and the calculation matrix on consistent definitions. A characteristic feature of the framework is its adaptability. In its current form it suits electronics and high-tech companies. The framework can also be used in other industries by adapting the value-creation components to the interests of each industry. Likewise, the software to be analysed can be chosen case by case. However, the calculation matrix must first be updated with the capabilities of the chosen software, after which the framework can produce suitability results for the company case in question based on
Abstract:
The purpose of this work is to compile information on all test facilities worldwide that have been used to study the blowdown phase of a large LOCA. A further purpose is to provide a basis for deciding whether it is necessary to build a new test facility for validating calculations made with fluid-structure interaction codes. Before building the actual test facility, it would also be appropriate to build a smaller pilot facility on which the measurement methods to be used could be tested. Suitable measurement data are needed to validate coupled calculations with new CFD codes and structural analysis codes. These codes can be used, for example, to assess the structural integrity of reactor internals during the blowdown phase of a large LOCA. The report focuses on existing test facilities worldwide, the design bases for a new test facility and general matters related to the topic. The report does not replace existing validation matrices, but it can be used as an aid when looking for a large-LOCA blowdown test facility suitable for validation purposes.
Abstract:
The objective of the thesis was to explore the nature and characteristics of customer-related internal communication in a global industrial matrix organization during a specific customer relationship, and how it could be improved. The theoretical part of the study reviews the concepts of intra-organizational information and knowledge sharing. It also reviews the influence of internal communication on customer relationships, the problems involved, and the suggestions made in the literature for improving internal communication. The empirical part of the study was conducted with Content Analysis and Social Network Analysis as research methods. The data were collected through interviews and a questionnaire. Internal communication was observed first generally within the organization from the point of view of a certain business, and secondly during a specific customer relationship at the personal level and at the departmental level. The results of the study describe the nature and characteristics of internal communication in the organization and give 13 suggestions for improving it. Although the study was done in one specific organization, it also offers insights for other organizations and for managers seeking to improve their internal communication.
Abstract:
Recent years have produced great advances in instrumentation technology. The amount of available data has been increasing due to the simplicity, speed and accuracy of current spectroscopic instruments. Most of these data are, however, meaningless without proper analysis. This has been one of the reasons for the ever-growing success of multivariate handling of such data. Industrial data are commonly not designed data; in other words, there is no exact experimental design, but rather the data have been collected as a routine procedure during an industrial process. This makes certain demands on the multivariate modeling, as the selection of samples and variables can have an enormous effect. Common approaches in the modeling of industrial data are PCA (principal component analysis) and PLS (projection to latent structures or partial least squares), but there are also other methods that should be considered. The more advanced methods include multi-block modeling and nonlinear modeling. In this thesis it is shown that the results of data analysis vary according to the modeling approach used, thus making the selection of the modeling approach dependent on the purpose of the model. If the model is intended to provide accurate predictions, the approach should be different from the case where the purpose of modeling is mostly to obtain information about the variables and the process. For industrial applicability it is essential that the methods are robust and sufficiently simple to apply. In this way the methods and the results can be compared and an approach selected that is suitable for the intended purpose. Differences in data analysis methods are compared with data from different fields of industry in this thesis. In the first two papers, the multi-block method is considered for data originating from the oil and fertilizer industries. The results are compared to those from PLS and priority PLS.
The third paper considers applicability of multivariate models to process control for a reactive crystallization process. In the fourth paper, nonlinear modeling is examined with a data set from the oil industry. The response has a nonlinear relation to the descriptor matrix, and the results are compared between linear modeling, polynomial PLS and nonlinear modeling using nonlinear score vectors.
Abstract:
Fluent health information flow is critical for clinical decision-making. However, a considerable part of this information is free-form text, and inabilities to utilize it create risks to patient safety and cost-effective hospital administration. Methods for automated processing of clinical text are emerging. The aim in this doctoral dissertation is to study machine learning and clinical text in order to support health information flow. First, by analyzing the content of authentic patient records, the aim is to specify clinical needs in order to guide the development of machine learning applications. The contributions are a model of the ideal information flow, a model of the problems and challenges in reality, and a road map for the technology development. Second, by developing applications for practical cases, the aim is to concretize ways to support health information flow. Altogether five machine learning applications for three practical cases are described: The first two applications are binary classification and regression related to the practical case of topic labeling and relevance ranking. The third and fourth applications are supervised and unsupervised multi-class classification for the practical case of topic segmentation and labeling. These four applications are tested with Finnish intensive care patient records. The fifth application is multi-label classification for the practical task of diagnosis coding. It is tested with English radiology reports. The performance of all these applications is promising. Third, the aim is to study how the quality of machine learning applications can be reliably evaluated. The associations between performance evaluation measures and methods are addressed, and a new hold-out method is introduced. This method contributes not only to processing time but also to the evaluation diversity and quality. The main conclusion is that developing machine learning applications for text requires interdisciplinary, international collaboration.
Practical cases are very different, and hence the development must begin from genuine user needs and domain expertise. The technological expertise must cover linguistics, machine learning, and information systems. Finally, the methods must be evaluated both statistically and through authentic user feedback.
Abstract:
The flow of information within modern information society has increased rapidly over the last decade. The major part of this information flow relies on the individual's abilities to handle text or speech input. For the majority of us this presents no problems, but there are some individuals who would benefit from other means of conveying information, e.g. signed information flow. During the last decades, new results from various disciplines have all pointed towards a common background and processing for sign and speech, and this was one of the key issues that I wanted to investigate further in this thesis. The basis of this thesis is firmly within speech research, and that is why I wanted to design, for signers, test batteries analogous to widely used speech perception tests, to find out whether the results for signers would be the same as in speakers' perception tests. One of the key findings within biology, and more precisely in its effects on speech and communication research, is the mirror neuron system. That finding has enabled us to form new theories about the evolution of communication, and it all seems to converge on the hypothesis that all communication has a common core within humans. In this thesis speech and sign are discussed as equal and analogous counterparts of communication, and all research methods used in speech are modified for sign. Both speech and sign are thus investigated using similar test batteries. Furthermore, both production and perception of speech and sign are studied separately. An additional framework for studying production is given by gesture research using cry sounds. Results of cry sound research are then compared to results from children acquiring sign language. These results show that individuality manifests itself from very early on in human development. Articulation in adults, both in speech and sign, is studied from two perspectives: normal production and re-learning production when the apparatus has been changed.
Normal production is studied both in speech and sign, and the effects of changed articulation are studied with regard to speech. Both these studies are done by using carrier sentences. Furthermore, sign production is studied by giving the informants the possibility for spontaneous speech. The production data from the signing informants are also used as the basis for input in the sign synthesis stimuli used in the sign perception test battery. Speech and sign perception were studied using the informants' answers to forced-choice questions in identification and discrimination tasks. These answers were then compared across language modalities. Three different informant groups participated in the sign perception tests: native signers, sign language interpreters and Finnish adults with no knowledge of any signed language. This gave a chance to investigate which of the characteristics found in the results were due to the language per se and which were due to the changes in modality itself. As the analogous test batteries yielded similar results over different informant groups, some common threads of results could be observed. Starting from very early on in acquiring speech and sign, the results were highly individual. However, the results were the same within one individual when the same test was repeated. This individuality of results followed the same patterns across different language modalities and, on some occasions, across language groups. As both modalities yield similar answers to analogous study questions, this has led us to provide methods for basic input for sign language applications, i.e. signing avatars. This has also given us answers to questions on the precision of the animation and intelligibility for the users: what are the parameters that govern the intelligibility of synthesised speech or sign, and how precise must the animation or synthetic speech be in order for it to be intelligible?
The results also give additional support to the well-known fact that intelligibility is not the same as naturalness. In some cases, as shown within the sign perception test battery design, naturalness decreases intelligibility. This also has to be taken into consideration when designing applications. All in all, results from each of the test batteries, be they for signers or speakers, yield strikingly similar patterns, which would indicate yet further support for the common core of all human communication. Thus, we can modify and deepen the phonetic framework models for human communication based on the knowledge obtained from the results of the test batteries within this thesis.