23 results for VLE data sets
in Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland
Abstract:
In this work we study the classification of forest types using mathematics-based image analysis on satellite data. We are interested in improving the classification of forest segments when a combination of information from two or more different satellites is used. The experimental part is based on real satellite data originating from Canada. This thesis gives a summary of the mathematical basics of the image analysis and supervised learning methods that are used in the classification algorithm. Three data sets and four feature sets were investigated in this thesis. The considered feature sets were 1) histograms (quantiles), 2) variance, 3) skewness and 4) kurtosis. Good overall performance was achieved when a combination of the ASTERBAND and RADARSAT2 data sets was used.
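The four feature sets named in this abstract can be illustrated with a minimal sketch. It assumes each forest segment is available as an array of pixel values from one band; the function name `segment_features` and the chosen quantile levels are hypothetical and not taken from the thesis.

```python
import numpy as np

def segment_features(pixels, quantile_levels=(0.1, 0.25, 0.5, 0.75, 0.9)):
    """Compute quantile, variance, skewness and kurtosis features for one segment."""
    x = np.asarray(pixels, dtype=float).ravel()
    centered = x - x.mean()
    variance = np.mean(centered ** 2)          # population variance
    std = np.sqrt(variance)
    if std == 0.0:
        # A constant segment has no shape information.
        skewness = 0.0
        kurtosis = 0.0
    else:
        skewness = np.mean(centered ** 3) / std ** 3
        kurtosis = np.mean(centered ** 4) / std ** 4 - 3.0  # excess kurtosis
    return {
        "quantiles": np.quantile(x, quantile_levels),
        "variance": variance,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }

features = segment_features([1, 2, 3, 4, 5])
```

Features computed this way per band and per satellite could be concatenated into one vector per segment before being passed to a supervised classifier; how the thesis actually combines them is not stated in the abstract.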
Abstract:
Poster at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Integration of marketing research data in new product development. Case study: Food industry company
Abstract:
The aim of this master’s thesis is to provide a real-life example of how marketing research data is used by different functions in the NPD process. In order to achieve this goal, a case study was carried out in a company, examining the gathering, analysis, distribution and synthesis of marketing research data in NPD. The main research question was formulated as follows: How is marketing research data integrated and used by different company functions in the NPD process? The theory part of the master’s thesis focused on the role of the marketing function in NPD, the use of marketing research, particularly in the food industry, and issues related to the marketing/R&D interface during the NPD process. The empirical part of the master’s thesis was based on qualitative explanatory case study research. Individual in-depth interviews with company representatives, company documents and online research were used for data collection and analyzed through the triangulation method. The empirical findings indicate that the most important marketing data sources at the concept generation stage of NPD are global trend monitoring, retail audits and consumer insights. These data sets are crucial for establishing the potential of the product on the market and defining the desired features of the new product to be developed. The findings also provide an example of successful cross-functional communication during the NPD process with formal and informal communication patterns. General managerial recommendations are given on integrating strategy, process, continuous improvement, and motivated cross-functional product development teams in NPD.
Abstract:
The recent rapid development of biotechnological approaches has enabled the production of large whole genome level biological data sets. In order to handle these data sets, reliable and efficient automated tools and methods for data processing and result interpretation are required. Bioinformatics, as the field of studying and processing biological data, tries to answer this need by combining methods and approaches across computer science, statistics, mathematics and engineering to study and process biological data. The need is also increasing for tools that can be used by the biological researchers themselves who may not have a strong statistical or computational background, which requires creating tools and pipelines with intuitive user interfaces, robust analysis workflows and strong emphasis on result reporting and visualization. Within this thesis, several data analysis tools and methods have been developed for analyzing high-throughput biological data sets. These approaches, covering several aspects of high-throughput data analysis, are specifically aimed for gene expression and genotyping data although in principle they are suitable for analyzing other data types as well. Coherent handling of the data across the various data analysis steps is highly important in order to ensure robust and reliable results. Thus, robust data analysis workflows are also described, putting the developed tools and methods into a wider context. The choice of the correct analysis method may also depend on the properties of the specific data set and therefore guidelines for choosing an optimal method are given. The data analysis tools, methods and workflows developed within this thesis have been applied to several research studies, of which two representative examples are included in the thesis. The first study focuses on spermatogenesis in murine testis and the second one examines cell lineage specification in mouse embryonic stem cells.
Abstract:
Our surrounding landscape is in a constantly dynamic state, but recently the rate of changes and their effects on the environment have considerably increased. In terms of the impact on nature, this development has not been entirely positive, but has rather caused a decline in valuable species, habitats, and general biodiversity. Regardless of recognizing the problem and its high importance, plans and actions of how to stop the detrimental development are largely lacking. This partly originates from a lack of genuine will, but is also due to difficulties in detecting many valuable landscape components and their consequent neglect. To support knowledge extraction, various digital environmental data sources may be of substantial help, but only if all the relevant background factors are known and the data is processed in a suitable way. This dissertation concentrates on detecting ecologically valuable landscape components by using geospatial data sources, and applies this knowledge to support spatial planning and management activities. In other words, the focus is on observing regionally valuable species, habitats, and biotopes with GIS and remote sensing data, using suitable methods for their analysis. Primary emphasis is given to the hemiboreal vegetation zone and the drastic decline in its semi-natural grasslands, which were created by a long trajectory of traditional grazing and management activities. However, the applied perspective is largely methodological, and allows for the application of the obtained results in various contexts. Models based on statistical dependencies and correlations of multiple variables, which are able to extract desired properties from a large mass of initial data, are emphasized in the dissertation. In addition, the papers included combine several data sets from different sources and dates together, with the aim of detecting a wider range of environmental characteristics, as well as pointing out their temporal dynamics. 
The results of the dissertation emphasise the multidimensionality and dynamics of landscapes, which need to be understood in order to be able to recognise their ecologically valuable components. This requires not only knowledge about the emergence of these components and an understanding of the data used, but also a focus on minute details that can indicate the existence of fragmented and partly overlapping landscape targets. In addition, this highlights the fact that most of the existing classifications are too generalised as such to provide all the required details, but they can be utilized at various steps along a longer processing chain. The dissertation also emphasises the importance of landscape history, which both creates and preserves ecological values, and which provides an essential standpoint for understanding the present landscape characteristics. The obtained results are significant both for preserving semi-natural grasslands and for general methodological development, supporting a science-based framework for evaluating ecological values and guiding spatial planning.
Abstract:
Mass spectrometry (MS)-based proteomics has seen significant technical advances during the past two decades and mass spectrometry has become a central tool in many biosciences. Despite the popularity of MS-based methods, the handling of the systematic non-biological variation in the data remains a common problem. This biasing variation can result from several sources ranging from sample handling to differences caused by the instrumentation. Normalization is the procedure which aims to account for this biasing variation and make samples comparable. Many normalization methods commonly used in proteomics have been adapted from the DNA-microarray world. Studies comparing normalization methods with proteomics data sets using some variability measures exist. However, a more thorough comparison looking at the quantitative and qualitative differences in the performance of the different normalization methods, and at their ability to preserve the true differential expression signal of proteins, is lacking. In this thesis, several popular and widely used normalization methods (Linear regression normalization, Local regression normalization, Variance stabilizing normalization, Quantile normalization, Median central tendency normalization and also variants of some of the aforementioned methods), representing different normalization strategies, are compared and evaluated with a benchmark spike-in proteomics data set. The normalization methods are evaluated in several ways. Their performance is evaluated qualitatively and quantitatively on a global scale and in pairwise comparisons of sample groups. In addition, it is investigated whether performing the normalization globally on the whole data or pairwise for the comparison pairs examined affects the performance of the normalization method in normalizing the data and preserving the true differential expression signal.
In this thesis, both major and minor differences in the performance of the different normalization methods were found. Also, the way in which the normalization was performed (global normalization of the whole data or pairwise normalization of the comparison pair) affected the performance of some of the methods in pairwise comparisons. Differences among variants of the same methods were also observed.
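Two of the normalization strategies listed in this abstract can be sketched in a few lines. The sketch assumes a matrix of log-transformed intensities with proteins in rows and samples in columns; the function names are hypothetical, and the thesis may use different variants (for example, other tie handling or a different reference distribution).

```python
import numpy as np

def median_normalize(log_intensities):
    """Shift each sample (column) so that all samples share the same median."""
    x = np.asarray(log_intensities, dtype=float)
    sample_medians = np.median(x, axis=0)
    # Re-centre on the mean of the medians to keep the overall intensity level.
    return x - sample_medians + sample_medians.mean()

def quantile_normalize(log_intensities):
    """Force every sample (column) to share one reference distribution.

    Assumption: no ties and no missing values within a column.
    """
    x = np.asarray(log_intensities, dtype=float)
    order = np.argsort(x, axis=0)
    ranks = np.argsort(order, axis=0)            # rank of each value in its column
    reference = np.sort(x, axis=0).mean(axis=1)  # mean of the k-th smallest values
    return reference[ranks]

data = np.array([[5.0, 4.0, 3.0],
                 [2.0, 1.0, 4.0],
                 [3.0, 3.0, 6.0],
                 [4.0, 2.0, 8.0]])
normalized = quantile_normalize(data)
```

Applying either function to the whole matrix corresponds to the global mode described above, while applying it only to the columns of the two sample groups being compared corresponds to the pairwise mode.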
Abstract:
The aim of this thesis was to examine the factors that affect the forecasting accuracy of innovation diffusion models. A logistic model was used to forecast the diffusion of mobile phone subscriptions in three European countries: Finland, France and Greece. The theoretical part focused on forecasting the diffusion of innovations with diffusion models. Particular emphasis was placed on the predictive ability of the models and their usability in different situations. The empirical part focused on forecasting with a logistic diffusion model calibrated on time series aggregated in different ways. The resulting forecasts were examined to determine the effects of the data aggregation levels. The research design was empirical and involved studying the forecasting accuracy of the logistic diffusion model while varying the aggregation level of the sample data. The data fed into the diffusion model can be collected monthly and per operator without affecting forecasting accuracy. The data must include the inflection point of the diffusion curve, that is, the long-term peak demand point.
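A logistic diffusion curve and one simple way to calibrate it can be sketched as follows. This is a hedged illustration, not the calibration procedure of the thesis: the function names, the grid search over the saturation level and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def logistic(t, saturation, growth, midpoint):
    """Cumulative adopters at time t; the inflection point lies at t = midpoint."""
    return saturation / (1.0 + np.exp(-growth * (t - midpoint)))

def fit_logistic(t, y, saturation_grid):
    """Fit by trying each candidate saturation level and linearising the curve.

    For a fixed saturation M, log(y / (M - y)) is linear in t, so growth and
    midpoint follow from an ordinary least-squares line. Requires y > 0.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    best = None
    for m in saturation_grid:
        if m <= y.max():
            continue  # saturation must lie above every observation
        z = np.log(y / (m - y))
        growth, intercept = np.polyfit(t, z, 1)
        midpoint = -intercept / growth
        sse = np.sum((logistic(t, m, growth, midpoint) - y) ** 2)
        if best is None or sse < best[0]:
            best = (sse, m, growth, midpoint)
    if best is None:
        raise ValueError("no candidate saturation exceeds the observed maximum")
    return best[1], best[2], best[3]

# Synthetic, noise-free series with inflection at t = 10 (hypothetical numbers).
t = np.arange(0, 20)
y = logistic(t, 100.0, 0.5, 10.0)
saturation, growth, midpoint = fit_logistic(t, y, np.arange(90.0, 111.0, 1.0))
```

With noisy data that stops before the midpoint, many saturation levels fit the early growth almost equally well, which is consistent with the abstract's requirement that the series include the inflection point.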
Abstract:
Coastal birds are an integral part of coastal ecosystems, which nowadays are subject to severe environmental pressures. Effective measures for the management and conservation of seabirds and their habitats call for insight into their population processes and the factors affecting their distribution and abundance. Central to national and international management and conservation measures is the availability of accurate data and information on bird populations, as well as on environmental trends and on measures taken to solve environmental problems. In this thesis I address different aspects of the occurrence, abundance, population trends and breeding success of waterbirds breeding on the Finnish coast of the Baltic Sea, and discuss the implications of the results for seabird monitoring, management and conservation. In addition, I assess the position and prospects of coastal bird monitoring data in the processing and dissemination of biodiversity data and information in accordance with the Convention on Biological Diversity (CBD) and other national and international commitments. I show that important factors for seabird habitat selection are island area and elevation, water depth, shore openness, and the composition of island cover habitats. Habitat preferences are species-specific, with certain similarities within species groups. The occurrence of the colonial Arctic Tern (Sterna paradisaea) is partly affected by different habitat characteristics than its abundance. Using long-term bird monitoring data, I show that eutrophication and winter severity have reduced the populations of several Finnish seabird species. A major demographic factor through which environmental changes influence bird populations is breeding success. Breeding success can function as a more rapid indicator of sublethal environmental impacts than population trends, particularly for long-lived and slow-breeding species, and should therefore be included in coastal bird monitoring schemes.
Among my target species, local breeding success can be shown to affect the populations of the Mallard (Anas platyrhynchos), the Eider (Somateria mollissima) and the Goosander (Mergus merganser) after a time lag corresponding to their species-specific recruitment age. For some of the target species, the number of individuals in late summer can be used as an easier and more cost-effective indicator of breeding success than brood counts. My results highlight that the interpretation and application of habitat and population studies require solid background knowledge of the ecology of the target species. In addition, the special characteristics of coastal birds, their habitats, and coastal bird monitoring data have to be considered in the assessment of their distribution and population trends. According to the results, the relationships between the occurrence, abundance and population trends of coastal birds and environmental factors can be quantitatively assessed using multivariate modelling and model selection. Spatial data sets widely available in Finland can be utilised in the calculation of several variables that are relevant to the habitat selection of Finnish coastal species. Concerning some habitat characteristics field work is still required, due to a lack of remotely sensed data or the low resolution of readily available data in relation to the fine scale of the habitat patches in the archipelago. While long-term data sets exist for water quality and weather, the lack of data concerning for instance the food resources of birds hampers more detailed studies of environmental effects on bird populations. Intensive studies of coastal bird species in different archipelago areas should be encouraged. The provision and free delivery of high-quality coastal data concerning bird populations and their habitats would greatly increase the capability of ecological modelling, as well as the management and conservation of coastal environments and communities. 
International initiatives that promote open spatial data infrastructures and sharing are therefore highly regarded. To function effectively, international information networks, such as the biodiversity Clearing House Mechanism (CHM) under the CBD, need to be rooted at regional and local levels. Attention should also be paid to the processing of data for higher levels of the information hierarchy, so that data are synthesized and developed into high-quality knowledge applicable to management and conservation.
Abstract:
The uncertainty of any analytical determination depends on analysis and sampling. Uncertainty arising from sampling is usually not controlled and methods for its evaluation are still little known. Pierre Gy’s sampling theory is currently the most complete theory about sampling which also takes the design of the sampling equipment into account. Guides dealing with the practical issues of sampling also exist, published by international organizations such as EURACHEM, IUPAC (International Union of Pure and Applied Chemistry) and ISO (International Organization for Standardization). In this work Gy’s sampling theory was applied to several cases, including the analysis of chromite concentration estimated on SEM (Scanning Electron Microscope) images and estimation of the total uncertainty of a drug dissolution procedure. The results clearly show that Gy’s sampling theory can be utilized in both of the above-mentioned cases and that the uncertainties achieved are reliable. Variographic experiments introduced in Gy’s sampling theory are beneficially applied in analyzing the uncertainty of auto-correlated data sets such as industrial process data and environmental discharges. The periodic behaviour of these kinds of processes can be observed by variographic analysis as well as with fast Fourier transformation and auto-correlation functions. With variographic analysis, the uncertainties are estimated as a function of the sampling interval. This is advantageous when environmental data or process data are analyzed as it can be easily estimated how the sampling interval is affecting the overall uncertainty. If the sampling frequency is too high, unnecessary resources will be used. On the other hand, if a frequency is too low, the uncertainty of the determination may be unacceptably high. Variographic methods can also be utilized to estimate the uncertainty of spectral data produced by modern instruments.
Since spectral data are multivariate, methods such as Principal Component Analysis (PCA) are needed when the data are analyzed. Optimization of a sampling plan increases the reliability of the analytical process, which might in the end have beneficial effects on the economics of chemical analysis.
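The idea of estimating variability as a function of the sampling interval can be illustrated with an empirical variogram. The sketch below is a simplification that assumes equally spaced increments of equal mass, so the heterogeneity of each increment reduces to its relative deviation from the mean; the function name and the synthetic series are hypothetical.

```python
import numpy as np

def variogram(values, max_lag):
    """Empirical variogram of relative deviations for lags 1..max_lag.

    V(j) = sum((h[i + j] - h[i])**2) / (2 * (N - j)), where h holds the
    relative deviations of the measurements from their mean.
    Requires max_lag < len(values).
    """
    a = np.asarray(values, dtype=float)
    h = (a - a.mean()) / a.mean()
    n = h.size
    lags = np.arange(1, max_lag + 1)
    v = np.array([np.sum((h[j:] - h[:-j]) ** 2) / (2.0 * (n - j)) for j in lags])
    return lags, v

# Synthetic process data with a period of four sampling intervals.
series = 10.0 + np.tile([1.0, 0.0, -1.0, 0.0], 10)
lags, v = variogram(series, max_lag=8)
```

For this periodic series the variogram drops to zero at lags 4 and 8 and peaks at lag 2, which is the kind of periodic behaviour the abstract says variographic analysis can reveal.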
Abstract:
Most metazoans rely on aerobic energy production, which is dependent on adequate oxygen supply. In the case of reduced oxygen supply (hypoxia), the most profound changes in gene expression are mediated by transcription factors named hypoxia-inducible factors (HIF alpha). These proteins are post-translationally regulated by prolyl-4-hydroxylase (PHD) enzymes that are direct “sensors” of cellular oxygen levels. This thesis examines the molecular evolution of metazoan HIF systems. In early metazoans the HIF system emerged from pre-existing PHD oxygen sensors and early bHLH-PAS transcription factors. In invertebrates our analysis revealed an unexpected diversity of PHD genes and HIF alpha sequence characteristics. An early branching vertebrate, the epaulette shark (Hemiscyllium ocellatum), was chosen for sequencing and hypoxia preconditioning studies of HIF alpha and PHD genes. As no quantitative PCR reference genes were available, this thesis includes the first study of reference genes in cartilaginous fish species. Applying multiple statistical analyses we also discovered that commonly used reference gene software may perform poorly with some data sets. Novel reference genes allowed accurate measurements of the mRNA levels of the studied target genes. Cartilaginous fishes have three genomic duplicates of both HIF alpha and PHD genes like mammals and teleost fishes. Combining functional divergence and selection analyses it was possible to describe how sequence changes in both HIF alpha and PHD duplicates may have contributed to the differential oxygen sensitivity of HIF alphas. Additionally, novel teleost HIF-1 alpha sequences were produced and used to reveal the molecular evolution of HIF-1 alpha in this lineage rich with hypoxia tolerant species.
Abstract:
Although social capital and health have been extensively studied during the last decade, there are still open issues in current empirical research. These concern for instance the measurement of the concept in different contexts, as well as the association between different types of social capital and different dimensions of health. The present thesis addressed these questions. The general aim was to promote the understanding of social capital and health by investigating the oldest old and the two major language groups in Finland, Swedish- and Finnish-speakers. Another aim was to contribute to the discussion on methodological issues in social capital and health research. The present thesis investigated two empirical data sets, Umeå 85+ and Health 2000. The Umeå 85+ study was a cross-sectional study of 163 individuals aged 85, 90, and 95 or older, living in the municipality of Umeå, Sweden, in the year of 2000. The Health 2000 survey was a national study of 8,028 persons aged 30 or above carried out in Finland in 2000-2001. Different indicators of structural (e.g. social contacts) and cognitive (e.g. trust) social capital, as well as health indicators were used as variables in the analyses. The Umeå 85+ data set was analyzed with factor analysis, as well as univariate and multivariate analysis of variance. The Health 2000 data was analyzed with logistic regression techniques. The results showed that the Swedish-speakers in the Finnish data set Health 2000 had consistently higher prevalence of social capital compared to the Finnish-speakers even after controlling for central sociodemographic variables. The results further showed that even if the language group differences in health were small, the Swedish-speakers experienced in general better self-reported health compared with the Finnish-speakers. Common sociodemographic variables could not explain these observed differences in health.
The results imply that social capital is often, but not always, associated with health. This was clearly seen in the Umeå 85+ data set where only one health indicator (depressive symptoms) was associated with structural social capital among the oldest old. The results based on the analysis of the Health 2000 survey demonstrated that the cognitive component of social capital was associated with self-rated health and psychological health rather than with participation in social activities and social contacts. In addition, social capital statistically reduced the health advantage especially for Swedish-speaking men, indicating that high prevalence of social capital may promote health. Finally, the present thesis also discussed the methodological challenges faced when analyzing social capital and health. It was suggested that certain components of social capital such as bonding and bridging social capital may be more relevant than structural and cognitive components when investigating social capital among the two language groups in Finland. The results concerning the oldest old indicated that the structural aspects of social capital probably reflect current living conditions, whereas cognitive social capital reflects attitudes and traits often acquired decades earlier. This is interpreted as an indication of the fact that structural and cognitive social capital are closely related yet empirically two distinctive concepts. Taken together, some components of social capital may be more relevant to study than others depending on which population group and age group is under study. The results also implied that the choice of cut-off point of dichotomization of self-rated health has an impact on the estimated effects of the explanatory variables. When the whole age interval, 35-64 years, was analyzed with logistic regression techniques the choice of cut-off point did not matter for the estimated effects of marital status and educational level.
The results changed, however, when the age interval was divided into three shorter intervals. If self-rated health is explored using wide age intervals that do not account for age-dependent covariates there is a risk of drawing misleading conclusions. In conclusion, the results presented in the thesis suggest that the uneven distribution of social capital observed between the two language groups in Finland is of importance when trying to further understand health inequalities that exist between Swedish- and Finnish-speakers in Finland. Although social capital seemed to be relevant to the understanding of health among the oldest old, the meaning of social capital is probably different compared to a less vulnerable age group. This should be noticed in future empirical research. In the present thesis, it was shown that the relationship between social capital and health is complex and multidimensional. Different aspects of social capital seem to be important for different aspects of health. This reduces the possibility to generalize the results and to recommend general policy implementations in this area. An increased methodological awareness regarding social capital as well as health is called for in order to further understand the complex association between them. However, based on the present data and findings social capital is associated with health. To understand individual health one must also consider social aspects of the individuals’ environment such as social capital.
Abstract:
The purpose of this study is to examine how well risk parity works in terms of risk, return and diversification relative to the more traditional minimum variance, 1/N and 60/40 portfolios. Risk parity portfolios consisted of five risk sources: three common asset classes and two alternative beta investment strategies. The three common asset classes were equities, bonds and commodities, and the alternative beta investment strategies were carry trade and trend following. Risk parity portfolios were constructed using five different risk measures, of which four were tail risk measures. The risk measures were standard deviation, Value-at-Risk, Expected Shortfall, modified Value-at-Risk and modified Expected Shortfall. We also studied how sensitive risk parity is to the choice of risk measure. The hypothesis is that risk parity portfolios provide better return with the same amount of risk and are better diversified than the benchmark portfolios. We used two data sets, monthly and weekly data. The monthly data was from the years 1989-2011 and the weekly data was from the years 2000-2011. Empirical studies showed that risk parity portfolios provide better diversification since the diversification is made at the risk level. Risk based portfolios provided superior return compared to the asset based portfolios. Using tail risk measures in risk parity portfolios does not necessarily provide a better hedge against tail events than standard deviation.
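For the standard deviation case, risk parity means choosing weights so that every risk source contributes equally to portfolio variance. The sketch below uses a cyclical coordinate descent scheme, which is one known way to solve this problem and not necessarily the method of the study; the function names and the covariance matrix are hypothetical, and the tail risk measures mentioned above are not covered.

```python
import numpy as np

def risk_parity_weights(cov, n_sweeps=500, tol=1e-12):
    """Long-only weights with equal contributions to portfolio variance.

    Each coordinate update solves cov[i, i] * x**2 + a * x - b = 0 for x > 0,
    the first-order condition of 0.5 * x' cov x - b * sum(log(x)).
    Assumes cov is symmetric positive definite.
    """
    cov = np.asarray(cov, dtype=float)
    n = cov.shape[0]
    x = 1.0 / np.sqrt(np.diag(cov))  # start from inverse-volatility weights
    b = 1.0 / n
    for _ in range(n_sweeps):
        x_old = x.copy()
        for i in range(n):
            a = cov[i] @ x - cov[i, i] * x[i]  # covariance with the other assets
            x[i] = (-a + np.sqrt(a * a + 4.0 * cov[i, i] * b)) / (2.0 * cov[i, i])
        if np.max(np.abs(x - x_old)) < tol:
            break
    return x / x.sum()

def risk_contributions(weights, cov):
    """Share of portfolio variance attributable to each asset (sums to 1)."""
    w = np.asarray(weights, dtype=float)
    cov = np.asarray(cov, dtype=float)
    return w * (cov @ w) / (w @ cov @ w)

# Hypothetical annualised covariance matrix for three risk sources.
cov = np.array([[0.0400, 0.0060, 0.0000],
                [0.0060, 0.0100, 0.0020],
                [0.0000, 0.0020, 0.0225]])
weights = risk_parity_weights(cov)
shares = risk_contributions(weights, cov)
```

By contrast, a 1/N portfolio fixes equal capital weights, so its risk contributions are generally unequal; this is the sense in which risk parity diversifies at the risk level.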
Abstract:
This master's thesis continued the investigation of the potential for improving the performance of the steam turbines at the Loviisa power plant. The aim was to develop on-line measurements of the plant's steam turbine performance during operation. The study examined the steady-state TEMPO program (The Thermal Performance Monitoring And Optimisation system) developed by the Norwegian IFE, together with its user instructions and operating principles. The computational theory of data reconciliation, on which TEMPO is based, was presented at length. The real expansion process of the turbine was examined, because understanding it plays an important part in monitoring turbine performance. Possible turbine faults and the processes by which they arise were also presented. The functionality of the program that analyses the output files reconciled by TEMPO was assessed by detecting deliberately introduced deviations in real measurement files. Graphs produced by the analysis program were compared with graphs of the real process in operation, and the extent to which deviations can be detected from the graphs was examined. Development suggestions for the TEMPO program were identified as the study progressed. With these changes the program can model the turbine process of the Loviisa power plant more accurately and the results become more useful.
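Data reconciliation in its simplest textbook form adjusts noisy measurements by weighted least squares so that they satisfy known balance equations. The sketch below illustrates that general idea for linear constraints only; it does not describe TEMPO's internals, and the function name, flow values and variances are hypothetical.

```python
import numpy as np

def reconcile(measured, variances, constraints):
    """Adjust measurements so that constraints @ x = 0 holds exactly.

    Minimises sum((x - y)**2 / variance) subject to the linear constraints,
    which gives x = y - S A' (A S A')^-1 A y with S = diag(variances).
    """
    y = np.asarray(measured, dtype=float)
    s = np.diag(np.asarray(variances, dtype=float))
    a = np.atleast_2d(np.asarray(constraints, dtype=float))
    residual = a @ y  # how badly the raw measurements violate the balances
    correction = s @ a.T @ np.linalg.solve(a @ s @ a.T, residual)
    return y - correction

# Hypothetical mass balance: inlet flow = outlet flow 1 + outlet flow 2.
measured = [100.0, 60.0, 45.0]   # kg/s, raw readings that do not balance
variances = [4.0, 1.0, 1.0]      # the inlet meter is assumed least accurate
balance = [[1.0, -1.0, -1.0]]
reconciled = reconcile(measured, variances, balance)
```

The least trusted measurement receives the largest correction, and a reading that needs an unusually large correction relative to its variance is a candidate for a sensor fault or a process deviation.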