20 results for statistical data analysis
in Helda - Digital Repository of University of Helsinki
Abstract:
In this thesis, we develop theory and methods for computational data analysis. The problems in data analysis are approached from three perspectives: statistical learning theory, the Bayesian framework, and the information-theoretic minimum description length (MDL) principle. Contributions in statistical learning theory address the possibility of generalization to unseen cases, and regression analysis with partially observed data, with an application to mobile device positioning. In the second part of the thesis, we discuss so-called Bayesian network classifiers, and show that they are closely related to logistic regression models. In the final part, we apply the MDL principle to tracing the history of old manuscripts, and to noise reduction in digital signals.
Abstract:
Accelerator mass spectrometry (AMS) is an ultrasensitive technique for measuring the concentration of a single isotope. The electric and magnetic fields of an electrostatic accelerator system are used to filter out other isotopes from the ion beam. The high ion velocity means that molecules can be destroyed and removed from the measurement background. As a result, concentrations down to one atom in 10^16 atoms are measurable. This thesis describes the construction of the new AMS system in the Accelerator Laboratory of the University of Helsinki. The system is described in detail along with the relevant ion optics. System performance and some of the 14C measurements done with the system are described. In the second part of the thesis, a novel statistical model for the analysis of AMS data is presented. Bayesian methods are used in order to make the best use of the available information. In the new model, instrumental drift is modelled with a continuous first-order autoregressive process. This enables rigorous normalization to standards measured at different times. The Poisson statistical nature of a 14C measurement is also taken into account properly, so that uncertainty estimates are much more stable. It is shown that, overall, the new model improves both the accuracy and the precision of AMS measurements. In particular, the results can be improved for samples with very low 14C concentrations or measured only a few times.
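The combination of autoregressive instrumental drift and Poisson counting statistics described above can be illustrated with a minimal simulation. This is not the thesis's model code; the parameter names (phi, sigma, base_rate) and values are assumptions chosen only to show the structure of such a model.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_ams_counts(n, phi=0.9, sigma=0.05, base_rate=50.0):
    """Simulate 14C counts whose detection efficiency drifts as an AR(1) process."""
    drift = np.empty(n)
    drift[0] = 0.0
    for t in range(1, n):
        # first-order autoregressive drift of the instrument response
        drift[t] = phi * drift[t - 1] + rng.normal(0.0, sigma)
    rate = base_rate * np.exp(drift)   # drifting expected count rate
    return rng.poisson(rate)           # Poisson counting statistics

counts = simulate_ams_counts(200)
```

Normalizing a sample to standards measured at other times then amounts to inferring the latent drift trajectory jointly with the sample concentrations, which is what makes the Bayesian treatment attractive.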
Abstract:
The aim of this thesis is to develop a fully automatic lameness detection system that operates in a milking robot. The instrumentation, measurement software, algorithms for data analysis and a neural network model for lameness detection were developed. Automatic milking has become a common practice in dairy husbandry, and in the year 2006 about 4000 farms worldwide used over 6000 milking robots. There is a worldwide movement with the objective of fully automating every process from feeding to milking. The increase in automation is a consequence of increasing farm sizes, the demand for more efficient production and the growth of labour costs. As the level of automation increases, the time that the cattle keeper uses for monitoring animals often decreases. This has created a need for systems for automatically monitoring the health of farm animals. The popularity of milking robots also offers a new and unique possibility to monitor animals in a single confined space up to four times daily. Lameness is a crucial welfare issue in the modern dairy industry. Limb disorders cause serious welfare, health and economic problems, especially in loose housing of cattle. Lameness causes losses in milk production and leads to early culling of animals. These costs could be reduced with early identification and treatment. At present, only a few methods for automatically detecting lameness have been developed, and the most common methods used for lameness detection and assessment are various visual locomotion scoring systems. The problem with locomotion scoring is that it needs experience to be conducted properly, it is labour-intensive as an on-farm method and the results are subjective. To detect lameness, a four-balance system for measuring the leg load distribution of dairy cows during milking was developed and set up at the University of Helsinki research farm Suitia.
The leg weights of 73 cows were successfully recorded during almost 10,000 robotic milkings over a period of 5 months. The cows were locomotion scored weekly, and the lame cows were inspected clinically for hoof lesions. Unsuccessful measurements, caused by cows standing outside the balances, were removed from the data with a special algorithm, and the mean leg loads and the number of kicks during milking were calculated. In order to develop an expert system to automatically detect lameness cases, a model was needed. A probabilistic neural network (PNN) classifier model was chosen for the task. The data were divided into two parts, and 5,074 measurements from 37 cows were used to train the model. The model was evaluated for its ability to detect lameness in the validation dataset, which had 4,868 measurements from 36 cows. The model was able to classify 96% of the measurements correctly as sound or lame cows, and 100% of the lameness cases in the validation data were identified. The proportion of measurements causing false alarms was 1.1%. The developed model has the potential to be used for on-farm decision support and can be used in a real-time lameness monitoring system.
Abstract:
This work belongs to the field of computational high-energy physics (HEP). The key methods used in this thesis work to meet the challenges raised by the Large Hadron Collider (LHC) era experiments are object-orientation with software engineering, Monte Carlo simulation, the computer technology of clusters, and artificial neural networks. The first aspect discussed is the development of hadronic cascade models, used for the accurate simulation of medium-energy hadron-nucleus reactions up to 10 GeV. These models are typically needed in hadronic calorimeter studies and in the estimation of radiation backgrounds. Various applications outside HEP include the medical field (such as hadron treatment simulations), space science (satellite shielding), and nuclear physics (spallation studies). Validation results are presented for several significant improvements released in the Geant4 simulation tool, and the significance of the new models for computing in the Large Hadron Collider era is estimated. In particular, we estimate the ability of the Bertini cascade to simulate the Compact Muon Solenoid (CMS) hadron calorimeter (HCAL). LHC test beam activity has a tightly coupled simulation-to-data-analysis cycle. Typically, a Geant4 computer experiment is used to understand test beam measurements. Thus another aspect of this thesis is a description of studies related to developing new CMS H2 test beam data analysis tools and performing data analysis on the basis of CMS Monte Carlo events. These events have been simulated in detail using Geant4 physics models, a full CMS detector description, and event reconstruction. Using the ROOT data analysis framework, we have developed an offline ANN-based approach to tag b-jets associated with heavy neutral Higgs particles, and we show that this kind of NN methodology can be successfully used to separate the Higgs signal from the background in the CMS experiment.
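The signal-versus-background separation task described above can be illustrated in miniature with a single-neuron network (logistic regression) trained by gradient descent on synthetic jet features. This is a stand-in sketch, not the thesis's ROOT/ANN implementation; the two features, class means, and learning rate are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)

# toy jet features: "signal" (b-jets) and "background" populations in 2 variables
signal = rng.normal(loc=[1.5, 1.0], scale=0.7, size=(200, 2))
background = rng.normal(loc=[0.0, 0.0], scale=0.7, size=(200, 2))
X = np.vstack([signal, background])
y = np.concatenate([np.ones(200), np.zeros(200)])

# a single sigmoid neuron trained by batch gradient descent on the log-loss
w, b = np.zeros(2), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))           # network output in (0, 1)
    grad_w = X.T @ (p - y) / len(y)
    grad_b = (p - y).mean()
    w -= 0.5 * grad_w
    b -= 0.5 * grad_b

p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
accuracy = ((p > 0.5) == y).mean()                   # fraction tagged correctly
```

A real tagger uses many more discriminating variables and hidden layers, but the training loop has the same shape: forward pass, loss gradient, weight update.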
Abstract:
Aims: Develop and validate tools to estimate residual noise covariance in Planck frequency maps. Quantify signal error effects and compare different techniques to produce low-resolution maps. Methods: We derive analytical estimates of the covariance of the residual noise contained in low-resolution maps produced using a number of map-making approaches. We test these analytical predictions using Monte Carlo simulations, along with their impact on angular power spectrum estimation. We use simulations to quantify the level of signal error incurred in the different resolution downgrading schemes considered in this work. Results: We find excellent agreement between the optimal residual noise covariance matrices and the Monte Carlo noise maps. For destriping map-makers, the extent of agreement is dictated by the knee frequency of the correlated noise component and the chosen baseline offset length. Signal striping is shown to be insignificant when properly dealt with. In map resolution downgrading, we find that a carefully selected window function is required to reduce aliasing to the sub-percent level at multipoles ell > 2Nside, where Nside is the HEALPix resolution parameter. We show that sufficient characterization of the residual noise is unavoidable if one is to draw reliable constraints on large-scale anisotropy. Conclusions: We have described how to compute low-resolution maps with a controlled sky signal level and a reliable estimate of the residual noise covariance. We have also presented a method to smooth the residual noise covariance matrices to describe the noise correlations in smoothed, bandwidth-limited maps.
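The comparison between an analytical noise covariance matrix and Monte Carlo noise maps can be sketched as follows: draw many correlated noise realizations, form their sample covariance, and check agreement with the matrix that generated them. The map size, realization count, and toy covariance below are assumptions for illustration, far smaller than real Planck low-resolution maps.

```python
import numpy as np

rng = np.random.default_rng(1)

n_pix, n_mc = 48, 5000                      # toy map size and number of MC realizations
true_cov = 0.2 * np.eye(n_pix) + 0.05       # correlated residual noise (toy model)
noise_maps = rng.multivariate_normal(np.zeros(n_pix), true_cov, size=n_mc)

# sample covariance of the Monte Carlo noise maps
mc_cov = np.cov(noise_maps, rowvar=False)

# worst-case relative disagreement with the analytical matrix
rel_err = np.abs(mc_cov - true_cov).max() / true_cov.max()
```

The residual 1/sqrt(n_mc) scatter in mc_cov is why a large Monte Carlo set is needed before one can meaningfully validate the analytical estimate.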
Abstract:
Metabolomics is a rapidly growing research field that studies the response of biological systems to environmental factors, disease states and genetic modifications. It aims at measuring the complete set of endogenous metabolites, i.e. the metabolome, in a biological sample such as plasma or cells. Because metabolites are the intermediates and end products of biochemical reactions, metabolite compositions and metabolite levels in biological samples can provide a wealth of information on ongoing processes in a living system. Due to the complexity of the metabolome, metabolomic analysis poses a challenge to analytical chemistry. Adequate sample preparation is critical to accurate and reproducible analysis, and the analytical techniques must have high resolution and sensitivity to allow detection of as many metabolites as possible. Furthermore, as the information contained in the metabolome is immense, the data sets collected from metabolomic studies are very large. In order to extract the relevant information from such large data sets, efficient data processing and multivariate data analysis methods are needed. In the research presented in this thesis, metabolomics was used to study mechanisms of polymeric gene delivery to retinal pigment epithelial (RPE) cells. The aim of the study was to detect differences in metabolomic fingerprints between transfected cells and non-transfected controls, and thereafter to identify the metabolites responsible for the discrimination. The plasmid pCMV-β was introduced into RPE cells using the vector polyethyleneimine (PEI). The samples were analyzed using high-performance liquid chromatography (HPLC) and ultra-performance liquid chromatography (UPLC) coupled to a triple quadrupole (QqQ) mass spectrometer (MS). The software MZmine was used for raw data processing, and principal component analysis (PCA) was used in the statistical data analysis.
The results revealed differences in metabolomic fingerprints between transfected cells and non-transfected controls. However, reliable fingerprinting data could not be obtained because of low analysis repeatability. Therefore, no attempts were made to identify the metabolites responsible for discrimination between sample groups. The repeatability and accuracy of the analyses can be improved by protocol optimization; in this study, however, optimization of the analytical methods was hindered by the very small number of samples available for analysis. In conclusion, this study demonstrates that obtaining reliable fingerprinting data is technically demanding, and the protocols need to be thoroughly optimized before the goal of gaining information on mechanisms of gene delivery can be reached.
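The PCA step used for fingerprint discrimination can be sketched with an SVD on a mean-centered intensity matrix. The data below are synthetic, not the study's LC-MS measurements; the group sizes, feature count, and the "transfected" intensity shift are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def pca_scores(X, n_components=2):
    """Project samples onto the leading principal components via SVD."""
    Xc = X - X.mean(axis=0)             # mean-center each metabolite feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T     # PCA scores (samples x components)

# toy intensity matrix: 6 "transfected" and 6 "control" samples, 40 features,
# with a systematic shift in the first feature block for the transfected group
X = rng.normal(size=(12, 40))
X[:6, :10] += 3.0
scores = pca_scores(X)
```

Separation of the two groups along the first component is what a fingerprinting study hopes to see; metabolite identification then works backward from the loadings (the rows of Vt) to the features that drive it.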
Abstract:
The core aim of machine learning is to make a computer program learn from experience. Learning from data is usually defined as the task of learning regularities or patterns in data in order to extract useful information, or to learn the underlying concept. An important sub-field of machine learning is called multi-view learning, where the task is to learn from multiple data sets or views describing the same underlying concept. A typical example of such a scenario would be to study a biological concept using several biological measurements like gene expression, protein expression and metabolic profiles, or to classify web pages based on their content and the contents of their hyperlinks. In this thesis, novel problem formulations and methods for multi-view learning are presented. The contributions include a linear data fusion approach for exploratory data analysis, a new measure to evaluate different kinds of representations for textual data, and an extension of multi-view learning to novel scenarios where the correspondence of samples in the different views or data sets is not known in advance. In order to infer the one-to-one correspondence of samples between two views, a novel concept of multi-view matching is proposed. The matching algorithm is completely data-driven and is demonstrated in several applications, such as matching of metabolites between humans and mice, and matching of sentences between documents in two languages.
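Once a cross-view distance is defined, inferring a one-to-one correspondence of samples can be cast as an assignment problem. The sketch below uses the Hungarian algorithm from scipy on synthetic data; it assumes the two views live in a comparable feature space after preprocessing, which is a simplification of the data-driven matching proposed in the thesis.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(3)

# two "views" of the same 8 samples: view_b is a permuted, noisy copy of view_a
view_a = rng.normal(size=(8, 5))
perm = rng.permutation(8)
view_b = view_a[perm] + rng.normal(scale=0.05, size=(8, 5))

# cost matrix of pairwise squared distances between samples in the two views
cost = ((view_a[:, None, :] - view_b[None, :, :]) ** 2).sum(axis=2)

# Hungarian algorithm: minimum-cost one-to-one matching
rows, cols = linear_sum_assignment(cost)
recovered = cols   # recovered[i] is the index in view_b matched to sample i of view_a
```

In the real metabolite- and sentence-matching applications the views have different feature spaces, so the cost must come from a learned cross-view similarity rather than a raw Euclidean distance.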
Abstract:
The purpose of the present study was to investigate the possibilities and interconnections that exist concerning the relationship between the University of Applied Sciences and the Learning by Developing action model (LbD), on the one hand, and education for sustainable development and high-quality learning as a part of professional competence development on the other. The research and learning environment was the Coping at Home research project and its Caring TV project, which provided the context of the Physiotherapy for Elderly People professional study unit. The researcher was a teacher and an evaluator of her own students' learning. The aims of the study were to monitor and evaluate learning at the individual and group level using tools of high-quality learning - improved concept maps - related to understanding the project's core concept of successful ageing. Conceptions were evaluated through aspects of sustainable development and a conceptual basis of physiotherapy. As educational research this was a multi-method case study design experiment. The three research questions were as follows. 1. What kind of individual conceptions and conceptual structures do students build concerning the concept of successful ageing? How many and what kind of concepts and propositions do they have a) before the study unit, b) after the study unit, c) after the social-knowledge building? 2. What kind of social-knowledge building exists? a) What kind of social learning process exists? b) What kind of socially created concepts, propositions and conceptual structures do the students possess after the project? c) What kind of meaning does the social-knowledge building have at an individual level? 3. How do physiotherapy competences develop according to the results of the first and second research questions? The subjects were 22 female, third-year Bachelor of Physiotherapy students at Laurea University of Applied Sciences in Finland.
Individual learning was evaluated in 12 of the 22 students. The data were collected as a part of the learning exercises of the Physiotherapy for Elderly People study unit, with improved concept maps at both the individual and group levels. The students were divided into two social-knowledge building groups: the first group had 15 members and the second 7 members. Each group created a group-level concept map on the theme of successful ageing. These face-to-face interactions were recorded with CMapTools and videotaped. The data consist of both the individually produced concept maps and the group-produced concept maps of the two groups, along with the videotaped material of these processes. The data analysis was carried out at the intersection of various research traditions. Individually produced data were analysed based on content analysis. Group-produced data were analysed based on content analysis and dialogue analysis. The data were also analysed by simple statistical analysis. In the individually produced improved concept maps the students' conceptions were comprehensive, and the first concept maps were found to have many concepts unrelated to each other. The conceptual structures were between spoke structures and chain structures. Only a few professional concepts were evident. In the second individual improved concept maps the conceptions were more professional than earlier, particularly from the functional point of view. The conceptual structures mostly resembled spoke structures. After the second individual concept mapping, social mapping interventions were made in the two groups. After this, multidisciplinary concrete links were established between all concepts in almost all individual concept maps, and the interconnectedness of the concepts in different subject areas was thus understood. The conceptual structures were mainly net structures.
The concepts in these individual concept maps were also found to be more professional and concrete than in the previous concept maps of these subjects. In addition, the wider context dependency of the concepts was recognized in many individual concept maps. This implies a conceptual framework for specialists. The social-knowledge building was similar to a social learning process. Both socio-cultural processes and cognitive processes were found to develop students' conceptual awareness and the ability to engage in intentional learning. In the knowledge-building process two aspects were found: knowledge creation and pedagogical action. The discussion during the concept-mapping process was similar to a shared thinking process. In visualising the process with CMapTools, students easily complemented each other's thoughts and words, as if mutually telepathic. Synthesizing, supporting, asking and answering, peer teaching and counselling, tutoring, evaluating and arguing took place, and students were very active, self-directed and creative. It took hundreds of conversations before a common understanding could be found. The use of concept mapping in particular was very effective. The concepts in these group-produced concept maps were found to be professional, and values of sustainable development were observed. The results show the importance of developing the contents and objectives of the European Qualification Framework as well as education for sustainable development, especially in terms of the need for knowledge creation, global responsibility and systemic, holistic and critical thinking in order to develop clinical practice. Keywords: education for sustainable development, learning, knowledge building, improved concept map, conceptual structure, competence, successful ageing
Abstract:
This academic work begins with a compact presentation of the general background to the study, which also includes an autobiographical account of the interest behind this research. The presentation provides readers who know little of the topic of this research with an overview of the structure of the educational system and of the value given to education in Nigeria. It further concentrates on the dynamic interplay of the effect of academic and professional qualification and teachers' job effectiveness in secondary schools in Nigeria in particular, and in Africa in general. The aim of this study is to produce a systematic analysis and a rich theoretical and empirical description of teachers' teaching competencies. The theoretical part comprises a comprehensive literature review that focuses on research conducted in the areas of academic and professional qualification and teachers' job effectiveness, teaching competencies, and the role of teacher education, with particular emphasis on school effectiveness and improvement. This research benefits greatly from the functionalist conception of education, which is built upon two emphases: the application of the scientific method to the objective social world, and the use of an analogy between the individual 'organism' and 'society'. To this end, it offers us an opportunity to define terms systematically and to view problems as always being interrelated with other components of society. The empirical part involves describing and interpreting what educational objectives can be achieved with the help of teachers' teaching competencies in close connection to educational planning and teacher training and development, and achieving them without waste. The data used in this study were collected between 2002 and 2003 from teachers, principals, and supervisors of education from the Ministry of Education and the Post Primary Schools Board in the Rivers State of Nigeria (N=300).
The data were collected from interviews, documents, observation, and questionnaires and were analyzed using both qualitative and quantitative methods to strengthen the validity of the findings. The data collected were analyzed to answer the specific research questions and hypotheses posited in this study. The data analysis involved the use of multiple statistical procedures: Percentages Mean Point Value, T-test of Significance, One-Way Analysis of Variance (ANOVA), and Cross Tabulation. The results obtained from the data analysis show that teachers require professional knowledge and professional teaching skills, as well as a broad base of general knowledge (e.g., morality, service, cultural capital, institutional survey). Above all, in order to carry out instructional processes effectively, teachers should be both academically and professionally trained. This study revealed that teachers are not, however, expected to have an extraordinary memory, but are rather looked upon as persons capable of thinking in the right direction. This study may provide a solution to the problem of teacher education and school effectiveness in Nigeria. For this reason, I offer this treatise to anyone seriously committed to improving schools in developing countries in general, and in Nigeria in particular, in order to improve the lives of all its citizens. In particular, I write this to encourage educational planners, education policy makers, curriculum developers, principals, teachers, and students of education interested in empirical information and methods to conceptualize the issue this study has raised and to provide them with useful suggestions to help them improve secondary schooling in Nigeria. Multiple audiences, though, exist for any text. For this reason, I trust that the academic community will find this piece of work a useful addition to the existing literature on school effectiveness and school improvement.
Through integrating concepts from a number of disciplines, I aim to describe as holistic a representation as space allows of the components of school effectiveness and quality improvement. I propose a new perspective on teachers' professional competencies, one that not only takes into consideration the unique characteristics of the variables used in this study, but also acknowledges their environmental and cultural derivation. In addition, researchers should focus their attention on the ways in which both professional and non-professional teachers construct and apply their methodological competencies, such as their grouping procedures and behaviors, to the schooling of students. Keywords: Professional Training, Academic Training, Professionally Qualified, Academically Qualified, Professional Qualification, Academic Qualification, Job Effectiveness, Job Efficiency, Educational Planning, Teacher Training and Development, Nigeria.
Abstract:
It has been known for decades that particles can cause adverse health effects as they are deposited within the respiratory system. Atmospheric aerosol particles influence climate by scattering solar radiation, but aerosol particles also act as the nuclei around which cloud droplets form. The principal objectives of this thesis were to investigate the chemical composition and the sources of fine particles in different environments (traffic, urban background, remote) as well as during some specific air pollution situations. Quantifying the climate and health effects of atmospheric aerosols is not possible without detailed information on the aerosol chemical composition. Aerosol measurements were carried out at nine sites in six countries (Finland, Germany, the Czech Republic, the Netherlands, Greece and Italy). Several different instruments were used in order to measure both the particulate matter (PM) mass and its chemical composition. In the off-line measurements the samples were collected first on a substrate or filter, and the gravimetric and chemical analyses were conducted in the laboratory. In the on-line measurements the sampling and analysis were either a combined procedure or performed successively within the same instrument. Results from the impactor samples were analyzed by statistical methods. This thesis also comprises work in which a method for determining the carbonaceous matter size distribution with a multistage impactor was developed. It was found that the chemistry of PM usually has strong spatial, temporal and size-dependent variability. At the Finnish sites most of the fine PM consisted of organic matter. However, in Greece sulfate dominated the fine PM, and in Italy nitrate made the largest contribution to the fine PM. Regarding the size-dependent chemical composition, organic components were likely to be enriched in smaller particles than inorganic ions. Data analysis showed that organic carbon (OC) had four major sources in Helsinki.
Secondary production was the major source in Helsinki during spring, summer and fall, whereas in winter biomass combustion dominated OC. A significant impact of biomass combustion on OC concentrations was also observed in the measurements performed in Central Europe. In this thesis aerosol samples were collected mainly by the conventional filter and impactor methods, which suffer from long integration times. However, with filter and impactor measurements chemical mass closure was achieved accurately, and simple filter sampling was found to be useful for explaining the sources of PM on a seasonal basis. The online instruments gave additional information related to the temporal variations of the sources and the atmospheric mixing conditions.
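Chemical mass closure, mentioned above, is the simple arithmetic check that the analysed species account for the gravimetrically weighed PM mass. The component list and all concentration values below are invented for illustration only, not data from this thesis.

```python
# toy fine-PM composition (µg/m^3): analysed components of one filter sample
components = {
    "organic matter": 4.2,
    "sulfate": 2.1,
    "nitrate": 1.3,
    "ammonium": 0.9,
    "elemental carbon": 0.6,
    "sea salt": 0.3,
}
gravimetric_pm = 9.8   # weighed PM mass on the same filter (µg/m^3)

# chemical mass closure: fraction of the weighed mass the analysed species explain
closure = sum(components.values()) / gravimetric_pm
print(f"mass closure: {closure:.0%}")
```

A closure well below 100% points to unanalysed species (e.g. mineral dust or particle-bound water), while a value above 100% signals sampling or analytical artifacts.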
Abstract:
The aim of the study was to find out how the consumption of the population in Finland became a target of social interest and of the production of statistical data in the early 20th century, and what efforts have been made to influence consumption with social policy measures at different times. Questions concerning consumption are examined through the practices employed in the compilation of statistics on it. The interpretation framework in the study is Michel Foucault's perspective of modern liberal government. This mode of government is typified by the pursuit of efficiency and a search for equilibrium between economic government and a government of the processes of life. It shows aspirations towards both integration and individualisation. The government is based on freedom practices. It also implies knowledge-based ways of conceptualising reality. Statistical data are of specific significance in this context. The connection between the government of consumption and the compilation of statistics on it is studied through the theoretical, socio-political and statistical conceptualisation of consumption. The research material consisted of Finnish and international documentation on the compilation of statistics on consumption, publications of social programmes, and reports of studies on consumption. The analysis of the material focused especially on the problematisations related to consumption found in these documents and on changes in them over history. There have been both clearly observable changes and historical stratification and diversity in the rationalities and practices of consumption government during the 20th century. Consumption has been influenced by pluralistic government, based at different times and in varying ways on the logics of solidarity and markets. The difference between these is that in the former risks are prepared for collectively, while in the latter risks are individualised.
Despite the differences, the characteristic that is common to these logics is a certain kind of contractuality. They are both permeated by the household logic, which differs from them in that it is based on the normative and ethical demands imposed on an individual. There has been a clear interactive connection between statistical data and consumption government. Statistical practices have followed changes in the way consumption has been conceptualised in society. This has been reflected in the statistical phenomena of interest, concepts, classifications and indicators. New ways of compiling statistics have in their turn shaped perceptions of reality. Statistical data have also facilitated a variety of rational calculations with which the consequences of the population's consumption habits have been evaluated at the levels of the economy at large and of individuals.
Abstract:
This thesis explores migration and the attractiveness of urban living in the Greater Helsinki region. The aim of the thesis is to explore the attractiveness of the city of Helsinki in terms of regional migration and to identify what characterizes migration to Helsinki. The study focuses in particular on housing, which is a key factor influencing migration decisions in the region. Other central themes in the study are housing policy and regional competition among municipalities. This study focuses solely on households moving within Finnish borders, excluding international migration. Migration is examined by comparing in- and out-migration in Helsinki, as well as by studying migration to the city's inner and outer areas. The primary research material in the study is questionnaire data collected by the National Consumer Research Centre. In this thesis the data are used for studying migrants aged 25-45. The main research method is analyzing the data statistically using the SPSS software. Methods include frequency analysis, cross tabulation, factor analysis and descriptive analysis. Additionally, statistical data are used to complement the questionnaire data. The research results indicate that Helsinki's in- and out-migration differ both in terms of the type of households that migrate and in the reasons why they migrate. Furthermore, differences can also be detected between migration to the inner and outer parts of Helsinki. According to the research results, a household's current phase of life is crucial in determining where and why they move within the Greater Helsinki region. A household's set of values, on the other hand, seems to have a lesser impact on migration within the region, even though households moving to Helsinki seem to value a somewhat more urban lifestyle than the ones moving out of the city. The research also shows a direct correlation between the values of migrants and their current phase of life.
Decisions to migrate are heavily influenced by wider societal issues. In the Greater Helsinki region the labor and housing markets appear to have a great influence on the direction of migration streams. According to the results, households move to and from Helsinki for different reasons. The primary reasons for moving to Helsinki are related to the city's diverse labor market and to the working careers of households. Issues related to urban living and an urban lifestyle seem to be relevant, although not the main reason why people move to Helsinki. The research material indicates that Helsinki's urban environment is both a pull and a push factor affecting the decisions of migrants. The city attracts those seeking urban living, but does not appeal to households seeking more space and wishing to live closer to nature. According to the research, Helsinki with its densely built urban environment mainly attracts singles and childless couples, whereas the city region's other municipalities are more attractive for families with children. Housing policy is one of the main factors determining where people move within the Helsinki region. As for the city of Helsinki, improving the city's attractiveness seems to be closely linked to how well the city manages to execute its future housing policies and how well alternative living preferences can be taken into account in planning.
Resumo:
In this study we explore the concurrent, combined use of three research methods, statistical corpus analysis and two psycholinguistic experiments (a forced-choice and an acceptability rating task), using verbal synonymy in Finnish as a case in point. In addition to supporting conclusions from earlier studies concerning the relationships between corpus-based and experimental data (e.g., Featherston 2005), we show that each method adds to our understanding of the studied phenomenon in a way that could not be achieved through any single method by itself. Most importantly, whereas relative rareness in a corpus is associated with dispreference in selection, such infrequency does not always entail substantially lower acceptability. Furthermore, we show that forced-choice and acceptability rating tasks pertain to distinct linguistic processes, with category-wise incommensurable scales of measurement, and should therefore be merged with caution, if at all.
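The central methodological point, that corpus infrequency need not entail low acceptability, can be illustrated by rank-correlating corpus frequencies with mean ratings. The verb labels and every number below are invented for illustration; the study's actual materials and statistics are not reproduced here:

```python
# Hypothetical data for four near-synonymous Finnish verbs: relative
# corpus frequency vs. mean acceptability rating on a 1-5 scale.
# Note the rarest verb is still rated as highly acceptable.
corpus_freq = {"ajatella": 0.62, "miettia": 0.25, "pohtia": 0.11, "harkita": 0.02}
acceptability = {"ajatella": 4.7, "miettia": 4.5, "pohtia": 4.8, "harkita": 4.6}

def ranks(values):
    """Rank positions (1 = smallest); assumes no ties for this sketch."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

verbs = list(corpus_freq)
rf = ranks([corpus_freq[v] for v in verbs])
ra = ranks([acceptability[v] for v in verbs])

# Spearman's rho from rank differences: rho = 1 - 6*sum(d^2) / (n*(n^2-1)).
n = len(verbs)
rho = 1 - 6 * sum((a - b) ** 2 for a, b in zip(rf, ra)) / (n * (n ** 2 - 1))
print(round(rho, 2))  # prints 0.0: frequency rank does not predict acceptability here
```

With these toy numbers the frequency and acceptability rankings are unrelated (rho = 0), mirroring the finding that rareness in a corpus does not categorically translate into lower acceptability judgments.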
Resumo:
FTIR spectroscopy (Fourier transform infrared spectroscopy) is a rapid analysis method. In Fourier instruments, the use of an interferometer makes it possible to measure the entire infrared frequency range in a few seconds. An FTIR spectrometer equipped with an ATR accessory requires hardly any sample preparation, which also makes the method easy to use. The ATR accessory also enables the analysis of many different kinds of samples; an infrared spectrum can be measured even from samples to which traditional sample preparation methods cannot be applied. Information obtained by FTIR spectroscopy is often combined with multivariate statistical analyses. Cluster analysis groups the information derived from the spectra on the basis of similarity; in hierarchical cluster analysis, the similarity between objects is determined by calculating the distance between them. Principal component analysis reduces the dimensionality of the data and creates new, uncorrelated principal components, which should retain as much of the variation in the original data as possible. The application possibilities of FTIR spectroscopy and multivariate methods have been studied extensively: in the food industry its suitability for quality control, for example, has been investigated, and the method has also been used to identify the chemical composition of volatile oils and to detect chemotypes of oil plants. This study evaluated the use of the method for classifying extract samples of suoputki (Peucedanum palustre). FTIR spectra of extract samples from different parts of the plant were compared with FTIR spectra measured from selected pure compounds. The characteristic absorption bands of the pure compounds were identified from their FTIR spectra, and the wavenumber ranges of the intense bands in the furanocoumarin spectra were selected for the multivariate analyses. Multivariate analyses were also performed on the fingerprint region of the IR spectrum, the wavenumber range 1785-725 cm⁻¹.
The aim was to classify the extract samples by collection site and coumarin content. Grouping by collection site was observed; in the analyses based on the selected band wavenumber ranges this was mainly explained by coumarin concentrations, with the extract samples largely grouping and separating according to their total coumarin content. Grouping by collection site was also observed in the analyses of the 1785-725 cm⁻¹ range, but there it was not explained by coumarin concentrations; these groupings were possibly influenced by similar concentrations of other compounds in the samples. Other wavenumber ranges were also used in the analyses, but the results hardly differed from those above. Multivariate analyses of second-derivative spectra over the fingerprint region did not noticeably change the results either. In further research the method could be developed, for example, by examining narrower, carefully selected wavenumber ranges of the second-derivative spectra in the multivariate analyses.
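The two multivariate methods the abstract describes, hierarchical cluster analysis on inter-spectrum distances and principal component analysis, can be sketched on synthetic data. The spectra below are simulated stand-ins for fingerprint-region measurements (two groups differing in one absorption band, mimicking extracts of high vs. low coumarin content), not the study's actual data:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)

# Synthetic "spectra" over the fingerprint region 1785-725 cm^-1:
# a single Gaussian band near 1715 cm^-1 (carbonyl region) whose
# intensity separates two groups of five samples each.
wavenumbers = np.linspace(1785, 725, 200)
band = np.exp(-((wavenumbers - 1715) ** 2) / (2 * 15 ** 2))
high = 1.0 * band + 0.02 * rng.standard_normal((5, 200))
low = 0.2 * band + 0.02 * rng.standard_normal((5, 200))
spectra = np.vstack([high, low])

# Hierarchical cluster analysis: pairwise Euclidean distances between
# spectra, average linkage, dendrogram cut into two clusters.
labels = fcluster(linkage(pdist(spectra), method="average"),
                  t=2, criterion="maxclust")

# PCA via SVD of the mean-centred data: uncorrelated components that
# retain as much of the original variance as possible.
centred = spectra - spectra.mean(axis=0)
U, s, Vt = np.linalg.svd(centred, full_matrices=False)
scores = U * s                      # sample scores on the components
explained = s ** 2 / np.sum(s ** 2)  # variance ratio per component

print(labels, float(explained[0]))
```

With a clear between-group band difference, the first principal component captures most of the variance and the cluster cut recovers the two groups; in the thesis setting the analogous question is whether such groupings track collection site or total coumarin content.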