20 results for Data Interpretation, Statistical
in Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland
Abstract:
Identifying low-dimensional structures and the main sources of variation in multivariate data are fundamental tasks in data analysis. Many methods aimed at these tasks involve solving an optimization problem. Thus, the objective of this thesis is to develop computationally efficient and theoretically justified methods for solving such problems. Most of the thesis is based on a statistical model in which ridges of the density estimated from the data are considered the relevant features. Finding ridges, which are generalized maxima, necessitates the development of advanced optimization methods. An efficient and convergent trust region Newton method for projecting a point onto a ridge of the underlying density is developed for this purpose. The method is utilized in a differential equation-based approach for tracing ridges and computing projection coordinates along them. Density estimation is done nonparametrically using Gaussian kernels, which allows the application of ridge-based methods with only mild assumptions on the underlying structure of the data. The statistical model and the ridge-finding methods are adapted to two different applications. The first is the extraction of curvilinear structures from noisy data mixed with background clutter. The second is a novel nonlinear generalization of principal component analysis (PCA) and its extension to time series data. The methods have a wide range of potential applications where most earlier approaches are inadequate. Examples include the identification of faults from seismic data and of filaments from cosmological data. The applicability of the nonlinear PCA to climate analysis and to the reconstruction of periodic patterns from noisy time series data is also demonstrated. Other contributions of the thesis include the development of an efficient semidefinite optimization method for embedding graphs into Euclidean space. The method produces structure-preserving embeddings that maximize interpoint distances. It is primarily developed for dimensionality reduction, but also has potential applications in graph theory and various areas of physics, chemistry and engineering. The asymptotic behaviour of ridges and maxima of Gaussian kernel densities is also investigated when the kernel bandwidth approaches infinity. The results are applied to the nonlinear PCA and to finding significant maxima of such densities, which is a typical problem in visual object tracking.
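The thesis's trust region Newton projection method is not reproduced here. As a rough illustration of the same task, below is a minimal sketch of the related subspace-constrained mean shift idea (Ozertem and Erdogmus), which also projects a point onto a ridge of a Gaussian kernel density estimate; the data, bandwidth h, and function names are assumptions for the example.

```python
import numpy as np

def kde_terms(x, data, h):
    """Kernel weights, gradient and Hessian of a Gaussian KDE at x
    (normalisation constants dropped; they cancel in the projection)."""
    d = x - data                                        # (n, dim) differences
    k = np.exp(-0.5 * np.sum(d ** 2, axis=1) / h ** 2)  # kernel weights
    g = -(k[:, None] * d).sum(axis=0) / h ** 2          # density gradient
    H = (k[:, None, None] * d[:, :, None] * d[:, None, :]).sum(axis=0) / h ** 4 \
        - k.sum() * np.eye(x.size) / h ** 2             # density Hessian
    return k, g, H

def project_to_ridge(x, data, h, ridge_dim=1, tol=1e-8, max_iter=500):
    """Iteratively move x onto a ridge_dim-dimensional ridge of the KDE."""
    for _ in range(max_iter):
        k, g, H = kde_terms(x, data, h)
        _, evecs = np.linalg.eigh(H)               # eigenvalues ascending
        V = evecs[:, : x.size - ridge_dim]         # directions across the ridge
        m = (k[:, None] * data).sum(axis=0) / k.sum() - x  # mean-shift vector
        step = V @ (V.T @ m)                       # constrain step to span of V
        x = x + step
        if np.linalg.norm(step) < tol:
            break
    return x

# Noisy points along a circle; project an off-curve point onto the ridge
rng = np.random.default_rng(0)
t = rng.uniform(0, 2 * np.pi, 400)
pts = np.c_[np.cos(t), np.sin(t)] + 0.1 * rng.normal(size=(400, 2))
print(project_to_ridge(np.array([1.3, 0.2]), pts, h=0.3))
```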
Abstract:
The importance of after-sales service, or service in general, can be seen and experienced by customers every day with industrial as well as non-industrial services and products. This dissertation, drawing on theory and experience, focuses on practical engineering implications, specifically the management of customer issues in the after-sales phase in the mobile phone arena. The main objective of this doctoral dissertation is to investigate customer after-sales issue management, specifically regarding mobile phones. The case studies focus on issue resolution time and on corrective actions. This dissertation consists of a main body, four peer-reviewed journal articles, and one manuscript currently under review by a peer-reviewed journal. The main body examines the elements of customer satisfaction, loyalty, and retention with respect to corrective actions addressing customer issues and issue resolution time, through literature and empirical studies. The five independent works are case studies supporting the thesis research questions. This study examines four questions: 1) What are the factors affecting corrective actions for customers? 2) How can customer issue resolution time be controlled? 3) What are the factors affecting processes in the service chain? and 4) How can communication be measured in a service chain? Both quantitative and qualitative analysis methods are used. The main body of the thesis reviews the literature on the elements that bridge the five case studies. The case study articles and surveys lean toward the methodology of critical positivism and then apply an interpretive approach in interpreting the results. The case study articles employ various statistical methods to analyze and interpret the empirical and survey data. The statistical methods were used to create a model that is useful for significantly optimizing issue resolution time. Moreover, it was found that samples provided by the customer for verifying issues improve neither the perceived quality of corrective actions nor the perceived quality of issue resolution time. The term service in this work is limited to the technical services provided by product manufacturers and authorized after-sales service vendors. On the basis of this research, it has been observed that corrective actions and issue resolution time are associated with customer satisfaction and hence, according to induction theory, with customer loyalty and retention. This thesis utilizes knowledge of marketing and customer relationships to contribute to the existing body of knowledge concerning information and communication technology for after-sales service recovery of mobile terminals. The models established in the thesis contribute to the existing knowledge of the after-sales process of dealing with customer issues in the field of mobile phones. The findings suggest that process managers could focus more on the communication and training provided to staff as new technology evolves rapidly. The study also suggests that managers formulate strategies for keeping customers regularly informed of the status of issues that have been escalated for corrective action. The findings also lay the foundation for the comprehensive objective of controlling the entire product development process, starting with conceptualization. This implies that robust design should be applied to new products so that problems affecting customer service quality are not repeated.
The objective will be achieved when the entire service chain from product development to the final user can be modeled and this model can be used to support the organization at all levels.
Abstract:
Virtual environments and real-time simulators (VERS) are becoming increasingly important tools in the research and development (R&D) process of non-road mobile machinery (NRMM). Virtual prototyping techniques enable faster and more cost-efficient development of machines compared to the use of real-life prototypes. High energy efficiency has become an important topic in the world of NRMM because of environmental and economic demands. The objective of this thesis is to develop VERS-based methods for the research and development of NRMM. A process using VERS for assessing the effects of human operators on the life-cycle efficiency of NRMM was developed. Human-in-the-loop simulations were run using an underground mining loader to study the developed process. The simulations were run in the virtual environment of the Laboratory of Intelligent Machines of Lappeenranta University of Technology. A physically adequate real-time simulation model of NRMM was shown to be reliable and cost-effective in testing hardware components by means of hardware-in-the-loop (HIL) simulations. A control interface connecting an integrated electro-hydraulic energy converter (IEHEC) with a virtual simulation model of a log crane was developed. The IEHEC consists of a hydraulic pump-motor and an integrated electrical permanent magnet synchronous motor-generator. The results show that state-of-the-art real-time NRMM simulators are capable of resolving factors related to the energy consumption and productivity of NRMM. A significant variation between the test drivers was found. The results show that VERS can be used for assessing human effects on the life-cycle efficiency of NRMM. HIL simulation responses, compared to those achieved with a conventional simulation method, demonstrate the advantages and drawbacks of various possible interfaces between the simulator and the hardware part of the system under study. Novel ideas for arranging the interface were successfully tested and compared with the more traditional one. The proposed process for assessing the effects of operators on life-cycle efficiency will be applied to a wider group of operators in the future. The driving styles of the operators can then be analysed statistically from a sufficiently large result data set, and such analysis can identify the most life-cycle-efficient driving style for a specific environment and machinery. The proposed control interface for HIL simulation needs further study: the robustness and adaptability of the interface in different situations must be verified. Future work will also include studying the suitability of the IEHEC for different working machines using the proposed HIL simulation method.
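As an illustration of the general HIL pattern the abstract describes, here is a minimal sketch of a fixed-timestep simulation loop exchanging signals with a stubbed hardware interface; all class, function and signal names are hypothetical, and a real HIL rig would use a hard real-time I/O layer rather than time.sleep.

```python
import time

DT = 0.001  # 1 ms fixed step, a typical real-time simulation rate

class HardwareInterface:
    """Stub standing in for the real I/O card; method names are hypothetical."""
    def read_sensor(self):
        return 0.0          # e.g. measured hydraulic pressure
    def write_actuator(self, u):
        pass                # e.g. valve control voltage

def simulate_model(state, sensor_value, dt):
    """Placeholder real-time machine model; one integration step."""
    # ... integrate multibody/hydraulic dynamics here ...
    return state, 0.0       # next state, actuator command

def hil_loop(hw, state, steps=10_000):
    next_deadline = time.perf_counter()
    for _ in range(steps):
        y = hw.read_sensor()                     # hardware -> simulator
        state, u = simulate_model(state, y, DT)  # advance the virtual model
        hw.write_actuator(u)                     # simulator -> hardware
        next_deadline += DT                      # keep pace with wall clock
        sleep = next_deadline - time.perf_counter()
        if sleep > 0:
            time.sleep(sleep)                    # soft real-time pacing only

hil_loop(HardwareInterface(), state=None, steps=100)
```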
Abstract:
Finnish legislation requires a safe and secure learning environment. However, comprehensive, risk-based safety and security management (SSM) and management commitment to the implementation and development of SSM are not mentioned in the legislation. Multiple institutions, operators and researchers have studied and developed safety and security in educational institutions over the past decade. Typically, the approach has been fragmented and has not emphasized the importance of comprehensive SSM. The development needs of safety and security operations in universities have been studied. However, in universities of applied sciences (UASs) and in elementary schools (ESs), the performance level, strengths and weaknesses of comprehensive SSM have not been studied. The objective of this study was to develop the comprehensive, risk-based SSM of educational institutions by developing the new Asteri consultative auditing process and studying its effects on auditees. Furthermore, the performance level of comprehensive SSM in UASs and ESs was studied using Asteri and the TUTOR model developed by the Keski-Uusimaa Department for Rescue Services. In addition, strengths, development needs and differences were identified. In total, 76 educational institutions were audited between the years 2011 and 2014. The study is based on logical empiricism, and an observational applied research design was used. Auditing, observation and an electronic survey were used for data collection. Statistical analysis was used to analyze the collected information. In addition, thematic analysis was used to analyze the development areas of the organizations mentioned by the respondents in the survey. As one of its main contributions, this research presents the new Asteri consultative auditing process. Organizations with low performance levels on the audited subject benefit the most from the Asteri consultative auditing process. Asteri may be usable in many different types of audits, not only in SSM audits. As a new result, this study provides knowledge on attitudes related to auditing. According to the research findings, auditing may generate negative attitudes, and the auditor should take them into account when planning and preparing for audits. Negative attitudes can be compensated for by bringing added value, objectivity and positivity to the audit and, thus, improving the positive effects of auditing on knowledge and skills. Moreover, as the results of this study show, auditing safety and security issues does not increase feelings of insecurity, but rather increases feelings of safety and security when the new Asteri consultative auditing process is used with the TUTOR model. The results showed that SSM in the audited UASs was statistically significantly more advanced than in the audited ESs. However, there is still room for improvement in both the ESs and the UASs, as their approach to SSM was fragmented. It can be assumed that the majority of Finnish UASs and ESs likely do not meet the basic level of comprehensive, risk-based SSM.
Abstract:
This thesis examines marketing automation: building a framework for adopting marketing automation and for exploiting it in managing the marketing and sales pipeline. The work was carried out as a case study, using semi-structured interviews as primary data and data from sales information systems as secondary data. A literature review of marketing automation reveals that the topic has hardly been studied academically. In particular, there are clear gaps in theory concerning how marketing automation should be initiated and how the campaigns it requires should be built. The case study revealed clear problem areas in the current marketing and sales pipeline, as well as points where marketing automation can help. Most of the problem areas lie between marketing and sales. To function, marketing automation requires a clear company-level definition of a lead and of how leads are handled. To ensure effectiveness, it also needs continuous feedback on leads and sales. Another area that needs change for better performance is the planning of marketing campaigns, together with sales and with the customer journey first. The future goal should be the personalization of messages and the profiling of customers. Future research would be very helpful to companies, especially if it addressed adoption or personalization.
Abstract:
To enable a mathematically and physically sound execution of fatigue tests and a correct interpretation of their results, statistical evaluation methods are used to assist in the analysis of fatigue testing data. The main objective of this work is to develop step-by-step instructions for the statistical analysis of laboratory fatigue data. The scope of this project is to provide practical cases answering the questions that arise in the treatment of test data when applying the methods and formulae of the document IIW-XIII-2138-06 (Best Practice Guide on the Statistical Analysis of Fatigue Data). Generally, the questions in the data sheets involve the following aspects: estimation of the necessary sample size, verification of the statistical equivalence of the collated sets of data, and determination of characteristic curves in different cases. The series of comprehensive examples given in this thesis demonstrates the various statistical methods and develops a sound procedure for creating reliable calculation rules for fatigue analysis.
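As one illustration of determining a characteristic curve, here is a minimal sketch of fitting a Basquin-type S-N line by linear regression in log-log space and shifting it down by k standard deviations of the residuals; the coverage factor k and the test data are assumptions, and IIW-XIII-2138-06 prescribes its own sample-size-dependent factors.

```python
import numpy as np

def characteristic_sn_curve(stress_range, cycles, k=2.0):
    """Fit log10(N) = a + m*log10(S) and shift the intercept down by
    k standard deviations of the residuals to get a lower-bound curve.

    k is an assumed coverage factor; the IIW guide prescribes values
    depending on sample size and survival probability.
    """
    x = np.log10(stress_range)
    y = np.log10(cycles)
    m, a = np.polyfit(x, y, 1)          # slope (negative) and intercept
    resid = y - (a + m * x)
    s = resid.std(ddof=2)               # residual std, two fitted parameters
    return m, a, a - k * s              # slope, mean and characteristic intercepts

# Hypothetical test data: stress ranges (MPa) and cycles to failure
S = np.array([200, 180, 160, 140, 120, 100])
N = np.array([2.1e5, 3.9e5, 6.0e5, 1.1e6, 2.2e6, 4.5e6])
m, a_mean, a_char = characteristic_sn_curve(S, N)
print(f"slope {m:.2f}, mean intercept {a_mean:.2f}, characteristic {a_char:.2f}")
```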
Abstract:
Longitudinal surveys are increasingly used to collect event history data on person-specific processes such as transitions between labour market states. Survey-based event history data pose a number of challenges for statistical analysis. These challenges include survey errors due to sampling, non-response, attrition and measurement. This study deals with non-response, attrition and measurement errors in event history data and the bias they cause in event history analysis. The study also discusses some choices faced by a researcher using longitudinal survey data for event history analysis and demonstrates their effects. These choices include whether a design-based or a model-based approach is taken, which subset of data to use and, if a design-based approach is taken, which weights to use. The study takes advantage of the possibility of using combined longitudinal survey-register data. The Finnish subset of the European Community Household Panel (FI ECHP) survey for waves 1–5 was linked at person level with longitudinal register data. Unemployment spells were used as the study variables of interest. Lastly, a simulation study was conducted in order to assess the statistical properties of the Inverse Probability of Censoring Weighting (IPCW) method in a survey data context. The study shows how combined longitudinal survey-register data can be used to analyse and compare the non-response and attrition processes, test the type of the missingness mechanism and estimate the size of the bias due to non-response and attrition. In the empirical analysis, initial non-response turned out to be a more important source of bias than attrition. Reported unemployment spells were subject to seam effects, omissions and, to a lesser extent, overreporting. The use of proxy interviews tended to cause spell omissions. An often-ignored phenomenon, classification error in reported spell outcomes, was also found in the data. Neither the Missing At Random (MAR) assumption about the non-response and attrition mechanisms nor the classical assumptions about measurement errors turned out to be valid. Measurement errors in both spell durations and spell outcomes were found to cause bias in estimates from event history models. Low measurement accuracy affected the estimates of the baseline hazard most. The design-based estimates based on data from respondents to all waves of interest, weighted by the last-wave weights, displayed the largest bias. Using all the available data, including the spells of attriters up to the time of attrition, helped to reduce attrition bias. Lastly, the simulation study showed that the IPCW correction to design weights reduces the bias due to dependent censoring in design-based Kaplan-Meier and Cox proportional hazards model estimators. The study discusses the implications of the results for survey organisations collecting event history data, researchers using surveys for event history analysis, and researchers who develop methods to correct for non-sampling biases in event history data.
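A minimal sketch of the IPCW idea in this context: a Kaplan-Meier estimator with per-subject inverse-probability-of-censoring weights. The weights below are random stand-ins for weights that would in practice be fitted from a censoring model on observed covariates; the spell data are synthetic.

```python
import numpy as np

def weighted_kaplan_meier(time, event, w):
    """Kaplan-Meier survival estimate with per-subject weights w.

    With w = 1 / P(remaining uncensored | covariates), this is the IPCW
    correction for censoring that depends on observed covariates.
    """
    order = np.argsort(time)
    time, event, w = time[order], event[order], w[order]
    surv, s = [], 1.0
    for t in np.unique(time[event == 1]):
        at_risk = w[time >= t].sum()                 # weighted risk set
        deaths = w[(time == t) & (event == 1)].sum() # weighted events at t
        s *= 1.0 - deaths / at_risk
        surv.append((t, s))
    return np.array(surv)

# Hypothetical spell data: durations, event indicator, stand-in IPC weights
rng = np.random.default_rng(1)
dur = rng.exponential(12.0, 200)
ev = rng.integers(0, 2, 200)                 # 1 = spell ended, 0 = censored
ipcw = 1.0 / rng.uniform(0.5, 1.0, 200)      # stand-in for fitted weights
print(weighted_kaplan_meier(dur, ev, ipcw)[:5])
```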
Abstract:
The recent rapid development of biotechnological approaches has enabled the production of large whole-genome-level biological data sets. In order to handle these data sets, reliable and efficient automated tools and methods for data processing and result interpretation are required. Bioinformatics, as the field of studying and processing biological data, tries to answer this need by combining methods and approaches across computer science, statistics, mathematics and engineering to study and process biological data. The need is also increasing for tools that can be used by the biological researchers themselves, who may not have a strong statistical or computational background, which requires creating tools and pipelines with intuitive user interfaces, robust analysis workflows and a strong emphasis on result reporting and visualization. Within this thesis, several data analysis tools and methods have been developed for analyzing high-throughput biological data sets. These approaches, covering several aspects of high-throughput data analysis, are specifically aimed at gene expression and genotyping data, although in principle they are suitable for analyzing other data types as well. Coherent handling of the data across the various data analysis steps is highly important in order to ensure robust and reliable results. Thus, robust data analysis workflows are also described, putting the developed tools and methods into a wider context. The choice of the correct analysis method may also depend on the properties of the specific data set, and therefore guidelines for choosing an optimal method are given. The data analysis tools, methods and workflows developed within this thesis have been applied to several research studies, of which two representative examples are included in the thesis. The first study focuses on spermatogenesis in murine testis and the second one examines cell lineage specification in mouse embryonic stem cells.
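The thesis's own tools are not reproduced here; as a minimal sketch of one routine step in high-throughput gene expression analysis, below are per-gene Welch t-tests with Benjamini-Hochberg false discovery rate control, run on hypothetical expression matrices.

```python
import numpy as np
from scipy import stats

def differential_expression(expr_a, expr_b, alpha=0.05):
    """Per-gene Welch t-test between two groups (rows = genes, cols = samples),
    with Benjamini-Hochberg control of the false discovery rate."""
    _, p = stats.ttest_ind(expr_a, expr_b, axis=1, equal_var=False)
    m = len(p)
    order = np.argsort(p)
    ranked = p[order] * m / np.arange(1, m + 1)       # BH adjustment
    adj = np.minimum.accumulate(ranked[::-1])[::-1]   # enforce monotonicity
    p_adj = np.empty(m)
    p_adj[order] = np.minimum(adj, 1.0)
    return p, p_adj, p_adj < alpha

# Hypothetical data: 1000 genes, 6 samples per condition
rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, (1000, 6))
b = rng.normal(0.0, 1.0, (1000, 6))
b[:50] += 2.0                                         # 50 truly changed genes
_, p_adj, hits = differential_expression(a, b)
print("significant genes:", hits.sum())
```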
Abstract:
The main objective of this study was to perform a statistical analysis of ecological types from optical satellite data using Tipping's sparse Bayesian algorithm. This thesis uses the Relevance Vector Machine (RVM) algorithm for ecological classification between forestland and wetland. This binary classification technique was further used to classify several other tree species, producing a hierarchical classification of the subclasses of a given target class. We also attempted to use an airborne image of the same forest area: combining it with image analysis and various image processing operations, we tried to extract good features and later used them to classify forestland and wetland.
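Tipping's classification RVM uses a Bernoulli likelihood with a Laplace approximation; as a simplified sketch, here are the regression-form sparse Bayesian learning updates applied to +/-1 labels with an RBF kernel. The data and kernel width are assumptions, and this is not the thesis's implementation.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def rvm_train(X, t, gamma=0.5, n_iter=100, alpha_cap=1e9):
    """Sparse Bayesian learning (regression form) on +/-1 labels."""
    Phi = rbf_kernel(X, X, gamma)        # one basis function per sample
    n = len(t)
    alpha = np.ones(n)                   # per-weight prior precisions
    beta = 1.0                           # noise precision
    for _ in range(n_iter):
        keep = alpha < alpha_cap         # prune effectively-infinite alphas
        P = Phi[:, keep]
        Sigma = np.linalg.inv(beta * P.T @ P + np.diag(alpha[keep]))
        mu = beta * Sigma @ P.T @ t      # posterior mean of kept weights
        g = 1.0 - alpha[keep] * np.diag(Sigma)    # "well-determinedness"
        alpha[keep] = g / (mu ** 2 + 1e-12)       # type-II ML re-estimation
        beta = (n - g.sum()) / (np.sum((t - P @ mu) ** 2) + 1e-12)
    return keep, mu

def rvm_predict(X_new, X_train, keep, mu, gamma=0.5):
    return np.sign(rbf_kernel(X_new, X_train[keep], gamma) @ mu)

# Hypothetical two-class data standing in for forestland vs wetland pixels
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 2))
y = np.sign(X[:, 0] + X[:, 1] + 0.1)
keep, mu = rvm_train(X, y)
print("relevance vectors:", keep.sum(),
      "training accuracy:", (rvm_predict(X, X, keep, mu) == y).mean())
```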
Abstract:
The aim of this thesis is to determine how well financial statement analysis, as a tool, is suited to assessing the success of ICT companies. At the same time, financial statement analysis is used to assess the financial situation of the ICT cluster in South-East Finland. The theoretical part of the thesis examines how the special characteristics of the new economy affect the interpretation of financial statement information and of the results of financial statement analysis. The empirical material consists of the financial statements of the South-East Finland ICT cluster, from which selected key figures are calculated. These key figures are used to assess the success of the companies and to identify the factors affecting success from a financial perspective. The study is qualitative, and its research approach is descriptive and explanatory. According to the results, financial statement analysis is not a sufficient tool for assessing the success of ICT companies. It can be used to gather background information on companies, which must then be supplemented by other means, such as company interviews. The financial statement analysis shows that the financial situation of the ICT companies in South-East Finland is on average good, but the future is impossible to predict. Since this is not a statistical study and the set of target companies is not sufficient, the results cannot be generalized to all companies in the industry.
Abstract:
In the very volatile high-technology industry, it is of utmost importance to forecast customer demand accurately. However, statistical forecasting of sales, especially in the heavily competitive electronics product business, has always been a challenging task due to very high variation in demand and very short product life cycles. The purpose of this thesis is to validate whether statistical methods can be applied to forecasting sales of short-life-cycle electronics products and to provide a feasible framework for implementing statistical forecasting in the environment of the case company. Two different approaches have been developed, one for short- and medium-term and one for long-term forecasting horizons. Both are based on decomposition models but differ in the interpretation of the model residuals. For long-term horizons, the residuals are assumed to represent white noise, whereas for short- and medium-term forecasting horizons the residuals are modeled using statistical forecasting methods. Both approaches are implemented in Matlab. The modeling results have shown that different markets exhibit different demand patterns, and therefore different analytical approaches are appropriate for modeling demand in these markets. Moreover, the outcomes of the modeling imply that statistical forecasting cannot be handled separately from judgmental forecasting but should be perceived only as a basis for judgmental forecasting activities. Based on the modeling results, recommendations for the further deployment of statistical methods in the sales forecasting of the case company are developed.
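A minimal sketch of the general pattern described above, in Python rather than the thesis's Matlab: decompose a synthetic sales series and, for the short/medium-term case, model the residual explicitly with an ARIMA model. The series, the ARIMA order and the statsmodels tooling are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.seasonal import seasonal_decompose

# Hypothetical monthly sales: trend + yearly seasonality + noise
rng = np.random.default_rng(0)
idx = pd.date_range("2010-01-01", periods=48, freq="MS")
sales = pd.Series(100 + 2.0 * np.arange(48)
                  + 10 * np.sin(2 * np.pi * np.arange(48) / 12)
                  + rng.normal(0, 3, 48), index=idx)

# Decompose the series into trend, seasonal and residual components
dec = seasonal_decompose(sales, model="additive", period=12)
resid = dec.resid.dropna().to_numpy()

# Long-term horizon: treat resid as white noise (forecast it as zero).
# Short/medium-term horizon: model the residual explicitly, e.g. ARIMA(1,0,1).
resid_model = ARIMA(resid, order=(1, 0, 1)).fit()
resid_fc = resid_model.forecast(6)

# A full forecast would recombine extrapolated trend and seasonal
# components with resid_fc; that recombination is omitted here.
print(resid_fc)
```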
Abstract:
This thesis focuses on statistical analysis methods and proposes the use of Bayesian inference to extract the information contained in experimental data by estimating the parameters of an Ebola model. The model is a system of differential equations expressing the behavior and dynamics of Ebola. Two data sets (onset and death data) were both used to estimate the parameters, which had not been done in previous research (Chowell, 2004). To be able to use both data sets, a new version of the model was built. The model parameters were estimated and then used to calculate the basic reproduction number and to study the disease-free equilibrium. The parameter estimates were useful for determining how well the model fits the data and how good the estimates were in terms of the information they provided about the possible relationships between variables. The solution showed that the Ebola model fits the observed onset data at 98.95% and the observed death data at 93.6%. Since Bayesian inference cannot be performed analytically, the Markov chain Monte Carlo approach was used to generate samples from the posterior distribution over the parameters. The samples were used to check the accuracy of the model and other characteristics of the target posteriors.
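The thesis's Ebola model and data are not reproduced here; below is a minimal sketch of the overall approach using a toy SIR model, a Gaussian likelihood and random-walk Metropolis sampling. All parameter values, priors and data are illustrative assumptions.

```python
import numpy as np
from scipy.integrate import odeint

def sir(y, t, beta, gamma):
    """Toy SIR dynamics; the thesis's Ebola model is a richer ODE system."""
    s, i, r = y
    return [-beta * s * i, beta * s * i - gamma * i, gamma * i]

Y0 = [0.99, 0.01, 0.0]   # assumed initial fractions (S, I, R)

def log_post(theta, t_obs, cases, sigma=0.01):
    """Gaussian log-likelihood with a flat prior on positive parameters."""
    beta, gamma = theta
    if beta <= 0 or gamma <= 0:
        return -np.inf
    i_model = odeint(sir, Y0, t_obs, args=(beta, gamma))[:, 1]
    return -0.5 * np.sum((cases - i_model) ** 2) / sigma ** 2

def metropolis(logp, theta0, n=5000, step=0.02, seed=0):
    """Random-walk Metropolis sampling of the posterior over parameters."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, float)
    lp = logp(theta)
    chain = []
    for _ in range(n):
        prop = theta + rng.normal(0, step, theta.size)
        lp_prop = logp(prop)
        if np.log(rng.uniform()) < lp_prop - lp:   # accept/reject
            theta, lp = prop, lp_prop
        chain.append(theta.copy())
    return np.array(chain)

# Synthetic "onset" data generated from known parameters plus noise
t_obs = np.linspace(0, 30, 31)
true_i = odeint(sir, Y0, t_obs, args=(0.5, 0.2))[:, 1]
cases = true_i + np.random.default_rng(1).normal(0, 0.01, true_i.size)

chain = metropolis(lambda th: log_post(th, t_obs, cases), [0.4, 0.3])
# For SIR, the basic reproduction number R0 = beta/gamma per posterior sample
print("posterior means (beta, gamma):", chain[1000:].mean(axis=0))
```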
Abstract:
The identifiability of the parameters of a heat exchanger model without phase change was studied in this Master's thesis using synthetically generated data. A fast, two-step Markov chain Monte Carlo (MCMC) method was tested with a couple of case studies and a heat exchanger model. The two-step MCMC method worked well and decreased the computation time compared to the traditional MCMC method. The effect of the measurement accuracy of certain control variables on the identifiability of the parameters was also studied. The accuracy used did not seem to have a notable effect on the identifiability of the parameters. The use of the posterior distribution of the parameters across different heat exchanger geometries was also studied, as it would be computationally most efficient to use the same posterior distribution for different geometries in the optimisation of heat exchanger networks. According to the results, this is possible when the frontal surface areas are the same across geometries. In the other cases the same posterior distribution can also be used for optimisation, but it will give a wider predictive distribution as a result. For condensing-surface heat exchangers, the numerical stability of the simulation model was studied, and as a result a stable algorithm was developed.
Abstract:
Markku Laitinen's keynote presentation at the QQML conference in Limerick, Ireland, on 23 April 2012.