13 results for EXPLORATORY DATA ANALYSIS

em Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland


Relevance:

100.00%

Publisher:

Abstract:

Nowadays the variety of fuels used in power boilers is widening, and new boiler constructions and operating models have to be developed. This research and development is done in small pilot plants, where a faster analysis of the boiler mass and heat balance is needed so that the right decisions can be identified and made already during the test run. The barrier to determining the boiler balance during test runs is the long process of chemically analysing the collected input and output matter samples. The present work concentrates on finding a way to determine the boiler balance without chemical analyses and on optimising the test rig to achieve the best possible accuracy for the heat and mass balance of the boiler. The purpose of this work was to create an automatic boiler balance calculation method for the 4 MW CFB/BFB pilot boiler of Kvaerner Pulping Oy, located in Messukylä, Tampere. The calculation was created in the data management computer of the pilot plant's automation system. It is implemented in the Microsoft Excel environment, which provides a good basis and functions for handling large databases and calculations without any delicate programming. The automation system of the pilot plant was reconstructed and updated by Metso Automation Oy during 2001, and the new system, MetsoDNA, has good data management properties, which is necessary for large calculations such as the boiler balance calculation. Two possible methods for calculating the boiler balance during a test run were found. Either the fuel flow is determined and used to calculate the boiler's mass balance, or the unburned carbon loss is estimated and the mass balance of the boiler is calculated on the basis of the boiler's heat balance. Both methods have their own weaknesses, so they were implemented in parallel in the calculation and the choice of method was left to the user. The user also needs to define the fuels used and some solid mass flows that are not measured automatically by the automation system.
Sensitivity analysis showed that the most essential values for accurate boiler balance determination are the flue gas oxygen content, the boiler's measured heat output and the lower heating value of the fuel. The theoretical part of this work concentrates on the error management of these measurements and analyses, and on measurement accuracy and boiler balance calculation in theory. The empirical part concentrates on the creation of the balance calculation for the boiler in question and on describing the work environment.
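The heat-balance route described in the abstract can be sketched as a small calculation. This is a hedged illustration, not the thesis's Excel implementation: the function name, units and the loss fractions are assumptions for the example.

```python
def fuel_flow_from_heat_balance(heat_output_kw, lhv_kj_per_kg,
                                unburned_carbon_loss=0.02,
                                other_losses=0.05):
    """Estimate the fuel mass flow (kg/s) from the boiler's heat balance.

    heat_output_kw        measured boiler heat output [kW]
    lhv_kj_per_kg         lower heating value of the fuel [kJ/kg]
    unburned_carbon_loss  user-estimated fraction of heat input lost as
                          unburned carbon (assumed value for illustration)
    other_losses          other heat losses as a fraction of heat input
    """
    efficiency = 1.0 - unburned_carbon_loss - other_losses
    heat_input_kw = heat_output_kw / efficiency   # heat balance
    return heat_input_kw / lhv_kj_per_kg          # mass balance: kg/s
```

For the 4 MW pilot boiler, a measured output of 4000 kW and an LHV of 18 MJ/kg would give a fuel flow of roughly 0.24 kg/s under these assumed losses; in line with the sensitivity analysis, the loss terms and the LHV dominate the achievable accuracy.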

Relevance:

100.00%

Publisher:

Abstract:

Identification of low-dimensional structures and main sources of variation from multivariate data are fundamental tasks in data analysis. Many methods aimed at these tasks involve solution of an optimization problem. Thus, the objective of this thesis is to develop computationally efficient and theoretically justified methods for solving such problems. Most of the thesis is based on a statistical model in which ridges of the density estimated from the data are considered as relevant features. Finding ridges, which are generalized maxima, necessitates development of advanced optimization methods. An efficient and convergent trust region Newton method for projecting a point onto a ridge of the underlying density is developed for this purpose. The method is utilized in a differential equation-based approach for tracing ridges and computing projection coordinates along them. The density estimation is done nonparametrically using Gaussian kernels. This allows application of ridge-based methods with only mild assumptions on the underlying structure of the data. The statistical model and the ridge finding methods are adapted to two different applications. The first is the extraction of curvilinear structures from noisy data mixed with background clutter. The second is a novel nonlinear generalization of principal component analysis (PCA) and its extension to time series data. The methods have a wide range of potential applications where most earlier approaches are inadequate. Examples include identification of faults from seismic data and identification of filaments from cosmological data. Applicability of the nonlinear PCA to climate analysis and reconstruction of periodic patterns from noisy time series data is also demonstrated. Other contributions of the thesis include the development of an efficient semidefinite optimization method for embedding graphs into Euclidean space.
The method produces structure-preserving embeddings that maximize interpoint distances. It is primarily developed for dimensionality reduction, but has also potential applications in graph theory and various areas of physics, chemistry and engineering. Asymptotic behaviour of ridges and maxima of Gaussian kernel densities is also investigated when the kernel bandwidth approaches infinity. The results are applied to the nonlinear PCA and to finding significant maxima of such densities, which is a typical problem in visual object tracking.
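A minimal sketch of two of the ingredients named in this abstract: the gradient and Hessian of a Gaussian kernel density, plus a damped Newton ascent step toward a density mode. This only illustrates the building blocks under assumed names and parameters; it is not the convergent trust-region ridge-projection method of the thesis, and plain Newton like this can fail when started outside the concave neighbourhood of a mode, which is exactly what motivates a trust-region approach.

```python
import numpy as np

def kde_grad_hess(x, data, h):
    """Gradient and Hessian of an unnormalized Gaussian KDE at point x."""
    diffs = data - x                                     # (n, d) differences
    w = np.exp(-0.5 * np.sum(diffs**2, axis=1) / h**2)   # kernel weights
    grad = diffs.T @ w / h**2
    # Hessian: sum_i w_i * (d_i d_i^T / h^4 - I / h^2)
    hess = (np.einsum('ni,nj,n->ij', diffs, diffs, w) / h**4
            - w.sum() * np.eye(len(x)) / h**2)
    return grad, hess

def newton_ascent_step(x, data, h, damping=1e-6):
    """One damped Newton step toward a local maximum of the density."""
    g, H = kde_grad_hess(x, data, h)
    # Near a mode, -H is positive definite; the damping guards edge cases.
    return x + np.linalg.solve(-H + damping * np.eye(len(x)), g)
```

Iterating the step from a point inside the concavity region converges to the nearest mode; projecting onto a ridge additionally restricts the step to the subspace spanned by selected Hessian eigenvectors.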

Relevance:

100.00%

Publisher:

Abstract:

The recent rapid development of biotechnological approaches has enabled the production of large whole genome level biological data sets. In order to handle these data sets, reliable and efficient automated tools and methods for data processing and result interpretation are required. Bioinformatics, as the field of studying and processing biological data, tries to answer this need by combining methods and approaches across computer science, statistics, mathematics and engineering to study and process biological data. The need is also increasing for tools that can be used by the biological researchers themselves who may not have a strong statistical or computational background, which requires creating tools and pipelines with intuitive user interfaces, robust analysis workflows and a strong emphasis on result reporting and visualization. Within this thesis, several data analysis tools and methods have been developed for analyzing high-throughput biological data sets. These approaches, covering several aspects of high-throughput data analysis, are specifically aimed at gene expression and genotyping data, although in principle they are suitable for analyzing other data types as well. Coherent handling of the data across the various data analysis steps is highly important in order to ensure robust and reliable results. Thus, robust data analysis workflows are also described, putting the developed tools and methods into a wider context. The choice of the correct analysis method may also depend on the properties of the specific data set, and therefore guidelines for choosing an optimal method are given. The data analysis tools, methods and workflows developed within this thesis have been applied to several research studies, of which two representative examples are included in the thesis. The first study focuses on spermatogenesis in murine testis and the second one examines cell lineage specification in mouse embryonic stem cells.

Relevance:

100.00%

Publisher:

Abstract:

This research concerns the Urban Living Idea Contest conducted by Creator Space™ of BASF SE during its 150th anniversary in 2015. The main objectives of the thesis are to provide a comprehensive analysis of the Urban Living Idea Contest (ULIC) and to propose a number of improvement suggestions for future years. More than 4,000 data points were collected and analyzed to investigate the functionality of different elements of the contest. Furthermore, a set of improvement suggestions was proposed to BASF SE. The novelty of this thesis lies in the data collection and the original analysis of the contest, which identified its critical elements as well as the areas that could be improved. The author of this research was a member of the organizing team and was involved in the decision-making process from the beginning until the end of the ULIC.

Relevance:

90.00%

Publisher:

Abstract:

Recent years have produced great advances in instrumentation technology. The amount of available data has been increasing due to the simplicity, speed and accuracy of current spectroscopic instruments. Most of these data are, however, meaningless without proper analysis. This has been one of the reasons for the ever-growing success of multivariate handling of such data. Industrial data is commonly not designed data; in other words, there is no exact experimental design, but rather the data have been collected as a routine procedure during an industrial process. This places certain demands on the multivariate modeling, as the selection of samples and variables can have an enormous effect. Common approaches in the modeling of industrial data are PCA (principal component analysis) and PLS (projection to latent structures, or partial least squares), but there are also other methods that should be considered. The more advanced methods include multi-block modeling and nonlinear modeling. In this thesis it is shown that the results of data analysis vary according to the modeling approach used, making the selection of the modeling approach dependent on the purpose of the model. If the model is intended to provide accurate predictions, the approach should differ from the case where the purpose of modeling is mainly to obtain information about the variables and the process. For industrial applicability it is essential that the methods are robust and sufficiently simple to apply. In this way the methods and the results can be compared and an approach selected that is suitable for the intended purpose. In this thesis, differences between data analysis methods are compared using data from different fields of industry. In the first two papers, the multi-block method is considered for data originating from the oil and fertilizer industries. The results are compared to those from PLS and priority PLS.
The third paper considers applicability of multivariate models to process control for a reactive crystallization process. In the fourth paper, nonlinear modeling is examined with a data set from the oil industry. The response has a nonlinear relation to the descriptor matrix, and the results are compared between linear modeling, polynomial PLS and nonlinear modeling using nonlinear score vectors.
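As an illustration of the baseline method against which the advanced approaches are compared, single-response PLS can be written as a few NIPALS iterations. This is a hedged sketch of textbook PLS1, not the multi-block or priority-PLS code used in the papers; the function name and interface are assumptions.

```python
import numpy as np

def pls1(X, y, n_comp):
    """Textbook NIPALS PLS1; returns regression coefficients for centered data."""
    Xr = X - X.mean(axis=0)
    yr = y - y.mean()
    W, P, q = [], [], []
    for _ in range(n_comp):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)            # weight vector
        t = Xr @ w                        # latent scores
        p = Xr.T @ t / (t @ t)            # X loadings
        c = yr @ t / (t @ t)              # y loading
        Xr = Xr - np.outer(t, p)          # deflate X
        yr = yr - t * c                   # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    # Coefficients in the original (centered) X space: B = W (P^T W)^{-1} q
    return W @ np.linalg.solve(P.T @ W, q)

# Prediction: y_hat = (X_new - X.mean(0)) @ beta + y.mean()
```

With n_comp equal to the rank of X this reduces to ordinary least squares; keeping fewer components is what makes PLS robust on the collinear, undesigned data typical of industrial processes.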

Relevance:

90.00%

Publisher:

Abstract:

Raw measurement data does not always immediately convey useful information, but applying mathematical and statistical analysis tools to it can improve the situation. Data analysis can offer benefits such as acquiring meaningful insight from the dataset, basing critical decisions on the findings, and ruling out human bias through proper statistical treatment. In this thesis we analyze data from an industrial mineral processing plant with the aim of studying the possibility of forecasting the quality of the final product, given by one variable, with a model based on the other variables. For the study, mathematical tools such as Qlucore Omics Explorer (QOE) and sparse Bayesian regression (SB) are used. Linear regression is then used to build a model based on a subset of variables that have the most significant weights in the SB model. The results obtained from QOE show that the variable representing the desired final product does not correlate with the other variables. For SB and linear regression, the results show that both models built on 1-day averaged data seriously underestimate the variance of the true data, whereas the two models built on 1-month averaged data are reliable and able to explain a larger proportion of the variability in the available data, making them suitable for prediction purposes. However, it is concluded that no single model fits the whole available dataset well. It is therefore proposed as future work to build piecewise nonlinear regression models if the same dataset is used, or for the plant to provide another dataset, collected in a more systematic fashion than the present data, for further analysis.
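The variance comparison mentioned above can be illustrated with a small least-squares sketch: for OLS with an intercept, the ratio of the fitted values' variance to the response variance equals the explained-variance fraction, so a model that misses much of the signal always shows a ratio well below one. The function name and interface are assumptions for the example, not the thesis's SB workflow.

```python
import numpy as np

def fit_and_variance_ratio(X, y):
    """OLS fit; returns coefficients and var(y_hat) / var(y)."""
    X1 = np.column_stack([np.ones(len(y)), X])     # add intercept column
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)  # least-squares solution
    y_hat = X1 @ coef
    return coef, y_hat.var() / y.var()             # equals R^2 with intercept
```

A ratio far below one, as reported for the models built on 1-day averaged data, signals that the model reproduces only a fraction of the true variability; the 1-month averaged models correspond to a ratio much closer to one.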

Relevance:

90.00%

Publisher:

Abstract:

This study focuses on observing how Finnish companies execute their new product launch processes. The main objective was to find out how entry timing moderates the relationship between launch tactics (namely product innovativeness, price and emotional advertising) and new product performance (namely sales volume and customer profitability). The empirical analysis was based on data collected at Lappeenranta University of Technology. The sample consisted of Finnish companies representing different industries and innovation activities. Altogether 272 usable responses were received, representing a response rate of 37.67%. The measures were first assessed using exploratory factor analysis (EFA) in PASW Statistics 18 and then further verified with confirmatory factor analysis (CFA) in LISREL 8.80. To test the hypotheses on the moderating effects of entry timing, hierarchical regression analysis was used in PASW Statistics 18. The results of the study revealed that the effect of product innovativeness on new product sales volume is dependent on entry timing. This implies that companies should carefully consider the best time for entering the market when launching highly innovative new products. The results also show a positive relationship between emotional advertising and new product sales volume. In addition, partial support was found for a positive relationship between pricing and new product customer profitability.
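The moderation test used here can be sketched as a two-step hierarchical regression in which an interaction term is entered on the second step; a clear increase in explained variance indicates moderation. This is a generic illustration, not the study's PASW/LISREL analysis; the function names and mean-centering choice are assumptions.

```python
import numpy as np

def moderation_r2(x, m, y):
    """R^2 of a main-effects model vs. a model with the x*m interaction added."""
    def r_squared(cols):
        X = np.column_stack([np.ones(len(y))] + cols)
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        return 1.0 - resid.var() / y.var()
    xc, mc = x - x.mean(), m - m.mean()   # mean-center before forming product
    return r_squared([xc, mc]), r_squared([xc, mc, xc * mc])
```

If the second R² clearly exceeds the first, the moderator (entry timing, in the study) changes the strength of the tactic-performance relationship rather than merely shifting its level.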

Relevance:

90.00%

Publisher:

Abstract:

Workshop at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014

Relevance:

90.00%

Publisher:

Abstract:

The purpose of this exploratory research is to identify the potential value drivers regarding a new service offering. More specifically, the aim is to build an understanding of customer expectations and the perceived value of energy efficiency solutions in the buildings sector. This knowledge is then used in defining potential value drivers. The research is conducted from the customer's perspective in a business-to-business context. The theoretical part of the master's thesis discusses the antecedents of customer expectations and customer value, and indicates how to determine value drivers, develop value propositions and conduct value assessment. The empirical part is based on a qualitative research method. The research was conducted as a single-case study, and the primary data was collected through semi-structured interviews with potential customers. The results revealed that customer expectations are connected to the ability to define value drivers. In addition, the research revealed generic themes relating to the offering and the customer-supplier relationship, which help in the process of identifying potential value drivers. The results were discussed in terms of product-, service-, price- and relationship-related value drivers for the new service. Based on the data analysis, the dominant value drivers are elaborated in terms of identified customer benefits and customer sacrifices (costs). Finally, some implications of the value proposition and value assessment to support value delivery are given.

Relevance:

90.00%

Publisher:

Abstract:

Studying the testis is complex, because the tissue has a very heterogeneous cell composition and its structure changes dynamically during development. In the reproductive field, the cell composition is traditionally studied by morphometric methods such as immunohistochemistry and immunofluorescence. These techniques provide accurate quantitative information about cell composition, cell-cell associations and localization of the cells of interest. However, the sample preparation, processing, staining and data analysis are laborious and may take several working days. Flow cytometry protocols coupled with DNA stains have played an important role in providing quantitative information on testicular cell populations in ex vivo and in vitro studies. Nevertheless, the addition of specific cell markers such as intracellular antibodies would allow more specific identification of cells of crucial interest during spermatogenesis. For this study, adult Sprague-Dawley rats were used for optimization of the flow cytometry protocol. Specific steps within the protocol were optimized to obtain a single-cell suspension representative of the cell composition of the starting material. Fixation and permeabilization procedures were optimized to be compatible with DNA stains and fluorescent intracellular antibodies. Optimization was achieved by quantitative analysis of specific parameters such as the recovery of meiotic cells, the amount of debris and comparison of the proportions of the various cell populations with already published data. As a result, a new and fast flow cytometry method coupled with DNA staining and intracellular antigen detection was developed. This new technique is suitable for analysis of population behaviour and of specific cells during postnatal testis development and spermatogenesis in rodents. This rapid protocol recapitulated the known vimentin and γH2AX protein expression patterns during rodent testis ontogenesis.
Moreover, the assay was applicable for phenotype characterization of SCRbKO and E2F1KO mouse models.

Relevance:

90.00%

Publisher:

Abstract:

The purpose of this two-phased study is to examine the interest of nursing students in choosing a career in older people nursing. First, the scoping phase explores the different premises for choosing older people nursing as a career. Second, the evaluation phase investigates the outcomes of the developed educational intervention involving older people as promoters of choosing a career in older people nursing, factors related to these outcomes, and experiences with the educational intervention. The ultimate goal is to encourage more nursing students to choose older people nursing as their career. The scoping phase applies an exploratory design and centres on a descriptive, cross-sectional survey, documentary research and a scoping literature review. The information sources for this phase include 183 nursing students, 101 newspaper articles and 66 research articles. The evaluation phase applies a quasi-experimental design and a pre-post-test design with a non-equivalent comparison group and a post-intervention survey. The information sources for this phase include 87 nursing students and 43 older people. In both phases, statistical and narrative methods are applied in the data analysis. Nursing students regarded the idea of a career in older people nursing neutrally. The most consistent factors related to the nursing students' career plans in older people nursing were found to be nursing work experience and various educational preparations in the field. Nursing students in the intervention group (n=40) were more interested in older people nursing and had more positive attitudes towards older people than did students in the comparison group (n=36). However, in both groups, the interest that students had at the baseline was associated with the interest at the one-month follow-up. There were no significant differences between the groups in terms of the students' knowledge levels about ageing.
The nursing students and older people alike highly appreciated participating in the educational intervention. It seems possible to positively impact nursing students and their choices to pursue careers in older people nursing, at least in the short-term. The involvement of older people as promoters of this career choice provides one encouraging alternative for impacting students’ career choices, but additional research is needed.

Relevance:

90.00%

Publisher:

Abstract:

This is a Master's thesis research that mainly aims to identify the sustainability issues in the sourcing process and the core competencies in the sourcing process through triple bottom line adaptation. The focus of this thesis is on the apparel industry's sourcing process. The purpose is to examine the reality of the global apparel industry's sourcing process and how buyers and suppliers cooperate to incorporate sustainability into it. Another goal of this research is to provide recommendations for a sustainable sourcing process for companies and to show how stakeholders can benefit from sustainable sourcing. The literature review presents the research gaps identified in earlier research along with the key concepts, academic purposes and key definitions. The theoretical framework chapter focuses on global sourcing strategies, firms' competencies and sustainable strategies. From the theoretical framework, the author presents the essential theory that establishes the link between the research questions and the proposed hypotheses. The main results and findings are presented in the empirical findings and data analysis chapters. This study is exploratory research following a deductive method; primary data has been used to evaluate the current situation of the apparel industry, which assists in building the recommendation model. Primary data was collected through online questionnaires, and secondary data was used to cover the literature and theoretical parts. The outcome of this paper displays the importance of sustainable sourcing from an academic point of view and also from a business perspective. Finally, this paper has followed the research objectives and generated some new directions for further studies.

Relevance:

90.00%

Publisher:

Abstract:

In this thesis the process of building software for transport accessibility analysis is described. The goal was to create software that is easy to distribute and simple to use for users without a particular background in geographical data analysis. It was shown that existing tools do not suit this task due to their complex interfaces or significant rendering times. The goal was accomplished by applying modern approaches to building web applications, such as maps based on vector tiles, the FLUX architecture design pattern and module bundling. It was discovered that vector tiles have considerable advantages over image-based tiles, such as faster rendering and real-time styling.