26 results for Data envelopment analysis (DEA).


Relevance: 40.00%

Abstract:

This work is devoted to the problem of reconstructing the basis weight structure of a paper web with black-box techniques. The data analyzed comes from a real paper machine and was collected by an off-line scanner. The principal mathematical tool used in this work is Autoregressive Moving Average (ARMA) modelling. When coupled with the Discrete Fourier Transform (DFT), it gives a very flexible and interesting tool for analyzing properties of the paper web. Both ARMA and the DFT are used independently to represent the given signal in a simplified version of our algorithm, but the final goal is to combine the two. The Ljung-Box Q-statistic lack-of-fit test, combined with the Root Mean Squared Error coefficient, gives a tool to separate significant signals from noise.
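
As a rough illustration of the workflow described above, the sketch below fits an ARMA model to a synthetic basis weight signal, inspects its spectrum with the DFT, and applies the Ljung-Box lack-of-fit test together with the RMSE of the residuals. The signal, model order and test lag are hypothetical placeholders, not the settings used in the thesis.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.stats.diagnostic import acorr_ljungbox

# Hypothetical basis weight profile from an off-line scanner (replace with real data).
rng = np.random.default_rng(0)
n = 1024
t = np.arange(n)
signal = 0.5 * np.sin(2 * np.pi * t / 64) + rng.normal(scale=0.2, size=n)

# DFT view of the signal: dominant spatial frequencies of the basis weight variation.
spectrum = np.abs(np.fft.rfft(signal - signal.mean()))
freqs = np.fft.rfftfreq(n, d=1.0)
print("Dominant frequency bin:", freqs[np.argmax(spectrum)])

# ARMA(2,1) model fitted as ARIMA with no differencing; the order is illustrative.
model = ARIMA(signal, order=(2, 0, 1)).fit()
residuals = model.resid

# RMSE of the model residuals.
rmse = np.sqrt(np.mean(residuals ** 2))

# Ljung-Box lack-of-fit test on the residuals: small p-values indicate
# remaining autocorrelation, i.e. structure not captured by the model.
lb = acorr_ljungbox(residuals, lags=[10])
print(f"RMSE = {rmse:.4f}, Ljung-Box p-value (lag 10) = {float(lb['lb_pvalue'].iloc[0]):.4f}")
```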

Relevance: 40.00%

Abstract:

Nowadays the variety of fuels used in power boilers is widening, and new boiler constructions and operating models have to be developed. This research and development is done in small pilot plants, where a faster analysis of the boiler mass and heat balance is needed so that the right decisions can be made already during the test run. The barrier to determining the boiler balance during test runs is the long process of chemically analysing the collected input and output matter samples. The present work concentrates on finding a way to determine the boiler balance without chemical analyses and on optimising the test rig to get the best possible accuracy for the heat and mass balance of the boiler. The purpose of this work was to create an automatic boiler balance calculation method for the 4 MW CFB/BFB pilot boiler of Kvaerner Pulping Oy, located in Messukylä, Tampere. The calculation was created in the data management computer of the pilot plant's automation system. The calculation is made in the Microsoft Excel environment, which provides a good basis and the functions needed for handling large databases and calculations without any delicate programming. The automation system of the pilot plant was reconstructed and updated by Metso Automation Oy in 2001, and the new MetsoDNA system has good data management properties, which are necessary for large calculations such as the boiler balance calculation. Two possible methods for calculating the boiler balance during a test run were found: either the fuel flow is determined and used to calculate the boiler's mass balance, or the unburned carbon loss is estimated and the mass balance of the boiler is calculated on the basis of the boiler's heat balance. Both methods have their own weaknesses, so they were implemented in parallel in the calculation, and the choice of method was left to the user. The user also needs to define the fuels used and some solid mass flows that are not measured automatically by the automation system. A sensitivity analysis showed that the most essential values for accurate boiler balance determination are the flue gas oxygen content, the boiler's measured heat output, and the lower heating value of the fuel. The theoretical part of this work concentrates on the error management of these measurements and analyses, and on measurement accuracy and boiler balance calculation in theory. The empirical part concentrates on the creation of the balance calculation for the boiler in question and on describing the working environment.
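
As a rough, hedged illustration of the heat-balance route described above (not the Excel calculation built in the thesis), the sketch below back-calculates the fuel mass flow from the measured heat output, the lower heating value of the fuel and an assumed boiler efficiency, and then checks the sensitivity of the result to each input. All numbers and the efficiency figure are made up.

```python
# Simplified illustration of the heat-balance-based approach: the fuel flow is
# back-calculated from the measured heat output, the lower heating value (LHV)
# of the fuel and an assumed overall boiler efficiency.

def fuel_mass_flow(heat_output_kw, lhv_mj_per_kg, efficiency):
    """Fuel mass flow [kg/s] needed to produce the measured heat output."""
    return heat_output_kw / 1000.0 / (lhv_mj_per_kg * efficiency)

base = dict(heat_output_kw=4000.0, lhv_mj_per_kg=19.5, efficiency=0.88)
m_fuel = fuel_mass_flow(**base)
print(f"Estimated fuel flow: {m_fuel:.4f} kg/s")

# One-at-a-time sensitivity check: perturb each input by +1 % and report the
# relative change in the calculated fuel flow.
for name in base:
    perturbed = dict(base, **{name: base[name] * 1.01})
    change = (fuel_mass_flow(**perturbed) - m_fuel) / m_fuel
    print(f"+1 % in {name:16s} -> {change * 100:+.2f} % in fuel flow")
```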

Relevance: 40.00%

Abstract:

The aim of this work was to collect dependability data on the flue gas lines of two Finnish pulp mills, from their commissioning up to the present day. Dependability data consists of reliability data and maintenance data. The collected data makes it possible to describe the dependability of the plant accurately with the following indicators: the number of unplanned failures and their repair times, equipment downtime, failure probability, and corrective maintenance costs relative to the total corrective maintenance costs of the flue gas line. The method used to collect the dependability data is presented. The method used to identify the critical equipment of the flue gas line is a combination of a questionnaire survey and a modified failure mode, effects and criticality analysis. The criteria for selecting equipment for the final criticality analysis were decided on the basis of the dependability data and the questionnaire survey. The purpose of the criticality analysis is to find the equipment in the flue gas line whose unexpected failure has the most severe consequences for the reliability, production, safety, emissions and costs of the flue gas line. With this information, the limited maintenance resources can be allocated correctly. As a result of the criticality analysis, the three most critical pieces of equipment in the flue gas line, common to both pulp mills, are the flue gas fans, the drag conveyors and the chain conveyors. The dependability data shows that equipment reliability is mill-specific, but in principle the same main trends can be seen in the figures presenting the probability of unplanned failures. The costs, presented as the ratio of a device's unplanned maintenance costs to the total costs of the flue gas line, follow the reliability curve, calculated as the ratio of the device's downtime to its operating hours, very closely. Dependability data collection combined with criticality analysis makes it possible to target and schedule preventive maintenance correctly over the equipment's lifetime so that the reliability and cost-efficiency requirements are met.
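
A minimal sketch of how the dependability indicators listed above could be computed from a failure event log; the record layout, operating hours and cost figures are hypothetical and are not taken from the mills studied.

```python
# Hypothetical failure log for one device: (repair time in hours, corrective cost in EUR).
events = [(6.0, 4200.0), (2.5, 800.0), (12.0, 9100.0), (1.0, 300.0)]
operating_hours = 8000.0                 # device operating time in the review period
line_corrective_cost_total = 55000.0     # corrective maintenance cost of the whole flue gas line

n_failures = len(events)
downtime = sum(r for r, _ in events)
mttr = downtime / n_failures                          # mean time to repair
mtbf = (operating_hours - downtime) / n_failures      # mean time between failures
availability = (operating_hours - downtime) / operating_hours
cost_share = sum(c for _, c in events) / line_corrective_cost_total

print(f"failures={n_failures}, MTTR={mttr:.1f} h, MTBF={mtbf:.1f} h")
print(f"availability={availability:.4f}, corrective cost share={cost_share:.1%}")
```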

Relevance: 40.00%

Abstract:

In this work we study the classification of forest types using mathematical image analysis of satellite data. We are interested in improving the classification of forest segments when a combination of information from two or more different satellites is used. The experimental part is based on real satellite data originating from Canada. This thesis gives a summary of the mathematical basics of image analysis and supervised learning, the methods that are used in the classification algorithm. Three data sets and four feature sets were investigated in this thesis. The considered feature sets were 1) histograms (quantiles), 2) variance, 3) skewness, and 4) kurtosis. Good overall performance was achieved when a combination of the ASTERBAND and RADARSAT2 data sets was used.
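
A small sketch of the feature extraction and supervised classification step described above, assuming the pixel values of each forest segment are available as an array; the random forest classifier and the simulated segments are illustrative stand-ins, not the algorithm or data used in the thesis.

```python
import numpy as np
from scipy.stats import skew, kurtosis
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def segment_features(pixels):
    """Feature vector for one forest segment: quantiles, variance, skewness, kurtosis."""
    q = np.quantile(pixels, [0.1, 0.25, 0.5, 0.75, 0.9])
    return np.concatenate([q, [np.var(pixels), skew(pixels), kurtosis(pixels)]])

# Hypothetical data: per-segment pixel intensities and forest type labels.
rng = np.random.default_rng(1)
segments = [rng.normal(loc=rng.uniform(50, 120), scale=rng.uniform(5, 20), size=400)
            for _ in range(200)]
labels = rng.integers(0, 3, size=200)    # three made-up forest types

X = np.array([segment_features(s) for s in segments])
scores = cross_val_score(RandomForestClassifier(n_estimators=200, random_state=0),
                         X, labels, cv=5)
print("Cross-validated accuracy:", scores.mean())
```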

Relevance: 40.00%

Abstract:

Longitudinal surveys are increasingly used to collect event history data on person-specific processes such as transitions between labour market states. Survey-based event history data pose a number of challenges for statistical analysis. These challenges include survey errors due to sampling, non-response, attrition and measurement. This study deals with non-response, attrition and measurement errors in event history data and the bias caused by them in event history analysis. The study also discusses some choices faced by a researcher using longitudinal survey data for event history analysis and demonstrates their effects. These choices include whether a design-based or a model-based approach is taken, which subset of data to use and, if a design-based approach is taken, which weights to use. The study takes advantage of the possibility to use combined longitudinal survey-register data. The Finnish subset of the European Community Household Panel (FI ECHP) survey for waves 1–5 was linked at the person level with longitudinal register data. Unemployment spells were used as the study variables of interest. Lastly, a simulation study was conducted in order to assess the statistical properties of the Inverse Probability of Censoring Weighting (IPCW) method in a survey data context. The study shows how combined longitudinal survey-register data can be used to analyse and compare the non-response and attrition processes, test the missingness mechanism type and estimate the size of the bias due to non-response and attrition. In our empirical analysis, initial non-response turned out to be a more important source of bias than attrition. Reported unemployment spells were subject to seam effects, omissions and, to a lesser extent, overreporting. The use of proxy interviews tended to cause spell omissions. An often-ignored phenomenon, classification error in reported spell outcomes, was also found in the data. Neither the Missing At Random (MAR) assumption about non-response and attrition mechanisms, nor the classical assumptions about measurement errors, turned out to be valid. Measurement errors in both spell durations and spell outcomes were found to cause bias in estimates from event history models. Low measurement accuracy affected the estimates of the baseline hazard most. The design-based estimates based on data from respondents to all waves of interest and weighted by the last-wave weights displayed the largest bias. Using all the available data, including the spells of attriters until the time of attrition, helped to reduce attrition bias. Lastly, the simulation study showed that the IPCW correction to design weights reduces bias due to dependent censoring in design-based Kaplan-Meier and Cox proportional hazards model estimators. The study discusses the implications of the results for survey organisations collecting event history data, researchers using surveys for event history analysis, and researchers who develop methods to correct for non-sampling biases in event history data.
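
A minimal sketch of the IPCW idea in a survival setting: the censoring distribution is estimated with a Kaplan-Meier fit in which censoring is treated as the event, and observed spells are re-weighted by the inverse probability of remaining uncensored at their event time. The simulated data and the independent-censoring setup are illustrative only and do not reproduce the simulation design of the thesis.

```python
import numpy as np
from lifelines import KaplanMeierFitter

rng = np.random.default_rng(2)
n = 2000
event_time = rng.exponential(1.0, size=n)     # true spell lengths
censor_time = rng.exponential(1.5, size=n)    # censoring times
duration = np.minimum(event_time, censor_time)
observed = (event_time <= censor_time).astype(int)

# Estimate the censoring survival function G(t) with a Kaplan-Meier fit in which
# censoring is treated as the "event" of interest.
km_censor = KaplanMeierFitter()
km_censor.fit(duration, event_observed=1 - observed)
G = km_censor.survival_function_at_times(duration).values

# IPCW weights: observed events are up-weighted by the inverse probability of still
# being uncensored at their event time; censored spells get weight 0 here.
# (Left limits of G are ignored for simplicity; G is clipped to avoid huge weights.)
weights = np.where(observed == 1, 1.0 / np.clip(G, 0.05, None), 0.0)

# Example use: IPCW estimate of the survival probability S(t0) versus the naive share.
t0 = 1.0
naive = np.mean(duration > t0)
ipcw = 1.0 - np.mean(weights * (duration <= t0))
print(f"True S(1.0) = {np.exp(-1):.3f}, naive = {naive:.3f}, IPCW = {ipcw:.3f}")
```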

Relevance: 40.00%

Abstract:

Prediction means estimating the future value of an observable quantity. Characteristic of the Bayesian paradigm is that uncertainty about unknown quantities is expressed in the form of probabilities. A Bayesian predictive model is thus a probability distribution over the possible values that an observable, but not yet observed, quantity can take. The articles included in the thesis develop methods that are applied, among other things, to the analysis of chromatographic data in criminal investigations. With the exception of the first article, all the methods are based on Bayesian predictive modelling. The articles mainly consider three types of problems related to chromatographic data: quantification, pairwise matching and clustering. The first article develops a non-parametric model for the measurement error of chromatographic analyses of blood alcohol content. The second article develops a predictive inference method for comparing two samples. In the third article, the method is applied to the comparison of oil samples in order to identify the polluting source of an oil spill. The fourth article derives a predictive model for clustering data of mixed discrete and continuous type, which is applied, among other things, to the classification of amphetamine samples with respect to production batches.
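
A generic, textbook-level illustration of Bayesian predictive modelling for a continuous measurement: under a conjugate Normal-Inverse-Gamma prior, the posterior predictive distribution of a new observation is a Student-t density. This is not one of the models developed in the articles; the prior values and measurements are made up.

```python
import numpy as np
from scipy import stats

def posterior_predictive(data, mu0=0.0, kappa0=1.0, alpha0=1.0, beta0=1.0):
    """Student-t posterior predictive for one new observation under a
    Normal-Inverse-Gamma conjugate prior (standard conjugate updates)."""
    data = np.asarray(data, dtype=float)
    n, xbar = data.size, data.mean()
    kappa_n = kappa0 + n
    mu_n = (kappa0 * mu0 + n * xbar) / kappa_n
    alpha_n = alpha0 + n / 2.0
    beta_n = (beta0 + 0.5 * np.sum((data - xbar) ** 2)
              + kappa0 * n * (xbar - mu0) ** 2 / (2.0 * kappa_n))
    scale = np.sqrt(beta_n * (kappa_n + 1.0) / (alpha_n * kappa_n))
    return stats.t(df=2.0 * alpha_n, loc=mu_n, scale=scale)

# Hypothetical replicate chromatographic measurements of one quantity.
sample = [0.52, 0.49, 0.55, 0.51, 0.50]
pred = posterior_predictive(sample)
print("Predictive mean:", pred.mean())
print("95 % predictive interval:", pred.interval(0.95))
```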

Relevance: 40.00%

Abstract:

Communications play a key role in modern smart grids. New functionalities that make the grids ‘smart’ require the communication network to function properly. Data transmission between intelligent electric devices (IEDs) in the rectifier and the customer-end inverters (CEIs) used for power conversion is also required in the smart grid concept of the low-voltage direct current (LVDC) distribution network. Smart grid applications, such as smart metering, demand side management (DSM), and grid protection applied with communications, are all installed in the LVDC system. Thus, besides a remote connection to the databases of the grid operators, a local communication network in the LVDC network is needed. One solution applied to implement the communication medium in power distribution grids is power line communication (PLC). There are power cables in the distribution grids, and hence, they may be applied as a communication channel for the distribution-level data. This doctoral thesis proposes an IP-based high-frequency (HF) band PLC data transmission concept for the LVDC network. A general method to implement the Ethernet-based PLC concept between the public distribution rectifier and the customer-end inverters in the LVDC grid is introduced. Low-voltage cables are studied as the communication channel in the frequency band of 100 kHz–30 MHz. The communication channel characteristics and the noise in the channel are described. All individual components in the channel are presented in detail, and a channel model, comprising models for each channel component, is developed and verified by measurements. The channel noise is also studied by measurements. Theoretical signal-to-noise ratio (SNR) and channel capacity analyses and practical data transmission tests are carried out to evaluate the applicability of the PLC concept against the requirements set by the smart grid applications in the LVDC system. The main results concerning the applicability of the PLC concept and its limitations are presented, and suggestions for future research are proposed.
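
A minimal sketch of the kind of theoretical channel capacity evaluation mentioned above: the Shannon capacity is obtained by integrating log2(1 + SNR(f)) over the 100 kHz–30 MHz band. The SNR profile below is synthetic, not a measured LVDC cable channel.

```python
import numpy as np

# Frequency grid for the HF band considered in the thesis (100 kHz ... 30 MHz).
f = np.linspace(100e3, 30e6, 2000)
df = f[1] - f[0]

# Synthetic SNR profile [dB]: attenuation grows with frequency (illustrative only).
snr_db = 35.0 - 25.0 * (f / 30e6)
snr = 10.0 ** (snr_db / 10.0)

# Shannon capacity of the band: C = integral of log2(1 + SNR(f)) df.
capacity_bps = np.sum(np.log2(1.0 + snr)) * df
print(f"Theoretical channel capacity: {capacity_bps / 1e6:.1f} Mbit/s")
```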

Relevance: 40.00%

Abstract:

Identification of low-dimensional structures and main sources of variation from multivariate data are fundamental tasks in data analysis. Many methods aimed at these tasks involve the solution of an optimization problem. Thus, the objective of this thesis is to develop computationally efficient and theoretically justified methods for solving such problems. Most of the thesis is based on a statistical model in which ridges of the density estimated from the data are considered as relevant features. Finding ridges, which are generalized maxima, necessitates the development of advanced optimization methods. An efficient and convergent trust region Newton method for projecting a point onto a ridge of the underlying density is developed for this purpose. The method is utilized in a differential equation-based approach for tracing ridges and computing projection coordinates along them. The density estimation is done nonparametrically by using Gaussian kernels. This allows the application of ridge-based methods with only mild assumptions on the underlying structure of the data. The statistical model and the ridge finding methods are adapted to two different applications. The first one is the extraction of curvilinear structures from noisy data mixed with background clutter. The second one is a novel nonlinear generalization of principal component analysis (PCA) and its extension to time series data. The methods have a wide range of potential applications where most of the earlier approaches are inadequate. Examples include identification of faults from seismic data and identification of filaments from cosmological data. The applicability of the nonlinear PCA to climate analysis and to the reconstruction of periodic patterns from noisy time series data is also demonstrated. Other contributions of the thesis include the development of an efficient semidefinite optimization method for embedding graphs into the Euclidean space. The method produces structure-preserving embeddings that maximize interpoint distances. It is primarily developed for dimensionality reduction, but it also has potential applications in graph theory and various areas of physics, chemistry and engineering. The asymptotic behaviour of ridges and maxima of Gaussian kernel densities is also investigated when the kernel bandwidth approaches infinity. The results are applied to the nonlinear PCA and to finding significant maxima of such densities, which is a typical problem in visual object tracking.
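
As a rough illustration of projecting points onto a ridge of a Gaussian kernel density estimate, the sketch below uses the well-known subspace-constrained mean-shift iteration rather than the trust-region Newton method developed in the thesis; the data and bandwidth are made up.

```python
import numpy as np

def scms_step(x, data, h, ridge_dim=1):
    """One subspace-constrained mean-shift step towards a density ridge."""
    diff = data - x                                    # (n, d)
    w = np.exp(-0.5 * np.sum(diff ** 2, axis=1) / h ** 2)
    mean_shift = (w[:, None] * diff).sum(0) / w.sum()  # m(x) - x
    # Hessian of the Gaussian kernel density estimate at x (up to a positive factor).
    H = (w[:, None, None] * (diff[:, :, None] * diff[:, None, :] / h ** 2
                             - np.eye(x.size))).sum(0)
    eigval, eigvec = np.linalg.eigh(H)
    V = eigvec[:, :x.size - ridge_dim]                 # directions of smallest eigenvalues
    return x + V @ (V.T @ mean_shift)                  # move only orthogonally to the ridge

# Noisy points around a circle: the density ridge follows the circle.
rng = np.random.default_rng(3)
angles = rng.uniform(0, 2 * np.pi, 500)
data = np.c_[np.cos(angles), np.sin(angles)] + 0.1 * rng.normal(size=(500, 2))

x = np.array([1.3, 0.2])
for _ in range(50):
    x = scms_step(x, data, h=0.3)
print("Projected point:", x, "radius:", np.linalg.norm(x))
```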

Relevance: 40.00%

Abstract:

This thesis introduces heat demand forecasting models generated by using data mining algorithms. The forecast spans one full day, and this forecast can be used in regulating the heat consumption of buildings. For training the data mining models, two years of heat consumption data from a case building and weather measurement data from the Finnish Meteorological Institute are used. The thesis utilizes Microsoft SQL Server Analysis Services data mining tools to generate the data mining models and the CRISP-DM process framework to implement the research. Results show that the built models can predict heat demand at best with mean average percentage errors of 3.8% for the 24-hour profile and 5.9% for the full day. A deployment model for integrating the generated data mining models into an existing building energy management system is also discussed.
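
A minimal sketch of a comparable day-ahead heat demand forecasting setup, assuming hourly consumption and outdoor temperature series; the gradient boosting model, the features and the simulated data are generic stand-ins for the SQL Server Analysis Services models used in the thesis.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

# Two years of hypothetical hourly data: outdoor temperature and heat demand.
idx = pd.date_range("2012-01-01", periods=2 * 365 * 24, freq="H")
rng = np.random.default_rng(4)
hour = idx.hour.to_numpy()
day = idx.dayofyear.to_numpy()
temp = 5 - 15 * np.cos(2 * np.pi * day / 365) + rng.normal(0, 3, len(idx))
demand = 120 - 3 * temp + 10 * np.sin(2 * np.pi * hour / 24) + rng.normal(0, 5, len(idx))

X = pd.DataFrame({"temp": temp, "hour": hour, "weekday": idx.weekday.to_numpy()})
split = len(idx) - 24                                  # hold out the last day
model = GradientBoostingRegressor().fit(X[:split], demand[:split])
forecast = model.predict(X[split:])

# Mean absolute percentage error of the one-day-ahead forecast.
mape = np.mean(np.abs((demand[split:] - forecast) / demand[split:])) * 100
print(f"MAPE for the held-out day: {mape:.1f} %")
```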

Relevance: 40.00%

Abstract:

The recent rapid development of biotechnological approaches has enabled the production of large whole-genome level biological data sets. In order to handle these data sets, reliable and efficient automated tools and methods for data processing and result interpretation are required. Bioinformatics, as the field of studying and processing biological data, tries to answer this need by combining methods and approaches across computer science, statistics, mathematics and engineering to study and process biological data. The need is also increasing for tools that can be used by the biological researchers themselves, who may not have a strong statistical or computational background, which requires creating tools and pipelines with intuitive user interfaces, robust analysis workflows and a strong emphasis on result reporting and visualization. Within this thesis, several data analysis tools and methods have been developed for analyzing high-throughput biological data sets. These approaches, covering several aspects of high-throughput data analysis, are specifically aimed at gene expression and genotyping data, although in principle they are suitable for analyzing other data types as well. Coherent handling of the data across the various data analysis steps is highly important in order to ensure robust and reliable results. Thus, robust data analysis workflows are also described, putting the developed tools and methods into a wider context. The choice of the correct analysis method may also depend on the properties of the specific data set, and therefore guidelines for choosing an optimal method are given. The data analysis tools, methods and workflows developed within this thesis have been applied to several research studies, of which two representative examples are included in the thesis. The first study focuses on spermatogenesis in murine testis and the second one examines cell lineage specification in mouse embryonic stem cells.
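
As a small illustration of one typical step in a gene expression analysis workflow, the sketch below runs a per-gene two-group comparison with multiple testing correction; the simulated expression matrix and the SciPy/statsmodels calls are generic stand-ins, not the tools developed in the thesis.

```python
import numpy as np
from scipy import stats
from statsmodels.stats.multitest import multipletests

# Simulated log-expression matrix: 1000 genes x 12 samples (6 per condition);
# the first 50 genes are given a true difference between the groups.
rng = np.random.default_rng(5)
expr = rng.normal(size=(1000, 12))
expr[:50, 6:] += 1.5
groups = np.array([0] * 6 + [1] * 6)

# Per-gene two-sample t-test between the conditions.
t_stat, p_vals = stats.ttest_ind(expr[:, groups == 0], expr[:, groups == 1], axis=1)

# Benjamini-Hochberg correction to control the false discovery rate.
rejected, p_adj, _, _ = multipletests(p_vals, alpha=0.05, method="fdr_bh")
print("Genes called differentially expressed:", rejected.sum())
```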

Relevance: 40.00%

Abstract:

This research concerns the Urban Living Idea Contest conducted by Creator Space™ of BASF SE during its 150th anniversary in 2015. The main objectives of the thesis are to provide a comprehensive analysis of the Urban Living Idea Contest (ULIC) and to propose a number of improvement suggestions for future years. More than 4,000 data points were collected and analyzed to investigate the functionality of different elements of the contest. Furthermore, a set of improvement suggestions was proposed to BASF SE. The novelty of this thesis lies in the data collection and the original analysis of the contest, which identified its critical elements as well as the areas that could be improved. The author of this research was a member of the organizing team and was involved in the decision-making process from the beginning until the end of the ULIC.