15 resultados para statistical equivalence
em Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland
Resumo:
To enable a mathematically and physically sound execution of the fatigue test and a correct interpretation of its results, statistical evaluation methods are used to assist in the analysis of fatigue testing data. The main objective of this work is to develop step-by-stepinstructions for statistical analysis of the laboratory fatigue data. The scopeof this project is to provide practical cases about answering the several questions raised in the treatment of test data with application of the methods and formulae in the document IIW-XIII-2138-06 (Best Practice Guide on the Statistical Analysis of Fatigue Data). Generally, the questions in the data sheets involve some aspects: estimation of necessary sample size, verification of the statistical equivalence of the collated sets of data, and determination of characteristic curves in different cases. The series of comprehensive examples which are given in this thesis serve as a demonstration of the various statistical methods to develop a sound procedure to create reliable calculation rules for the fatigue analysis.
Resumo:
The general objective of this study was to conduct astatistical analysis on the variation of the weld profiles and their influence on the fatigue strength of the joint. Weld quality with respect to its fatigue strength is of importance which is the main concept behind this thesis. The intention of this study was to establish the influence of weld geometric parameters on the weld quality and fatigue strength. The effect of local geometrical variations of non-load carrying cruciform fillet welded joint under tensile loading wasstudied in this thesis work. Linear Elastic Fracture Mechanics was used to calculate fatigue strength of the cruciform fillet welded joints in as-welded condition and under cyclic tensile loading, for a range of weld geometries. With extreme value statistical analysis and LEFM, an attempt was made to relate the variation of the cruciform weld profiles such as weld angle and weld toe radius to respective FAT classes.
Resumo:
Tämä diplomityö liittyy Spektrikuvien tutkimiseen tilastollisen kuvamallin näkökulmasta. Diplomityön ensimmäisessä osassa tarkastellaan tilastollisten parametrien jakaumien vaikutusta väreihin ja korostumiin erilaisissa valaistusolosuhteissa. Havaittiin, että tilastollisten parametrien väliset suhteet eivät riipu valaistusolosuhteista, mutta riippuvat kuvan häiriöttömyydestä. Ilmeni myös, että korkea huipukkuus saattaa aiheutua värikylläisyydestä. Lisäksi työssä kehitettiin tilastolliseen spektrimalliin perustuvaa tekstuurinyhdistämisalgoritmia. Sillä saavutettiin hyviä tuloksia, kun tilastollisten parametrien väliset riippuvuussuhteet olivat voimassa. Työn toisessa osassa erilaisia spektrikuvia tutkittiin käyttäen itsenäistä komponenttien analyysia (ICA). Seuraavia itsenäiseen komponenttien analyysiin tarkoitettuja algoritmia tarkasteltiin: JADE, kiinteän pisteen ICA ja momenttikeskeinen ICA. Tutkimuksissa painotettiin erottelun laatua. Paras erottelu saavutettiin JADE- algoritmilla, joskin erot muiden algoritmien välillä eivät olleet merkittäviä. Algoritmi jakoi kuvan kahteen itsenäiseen, joko korostuneeseen ja korostumattomaan tai kromaattiseen ja akromaattiseen, komponenttiin. Lopuksi pohditaan huipukkuuden suhdetta kuvan ominaisuuksiin, kuten korostuneisuuteen ja värikylläisyyteen. Työn viimeisessä osassa ehdotetaan mahdollisia jatkotutkimuskohteita.
Resumo:
In a very volatile industry of high technology it is of utmost importance to accurately forecast customers’ demand. However, statistical forecasting of sales, especially in heavily competitive electronics product business, has always been a challenging task due to very high variation in demand and very short product life cycles of products. The purpose of this thesis is to validate if statistical methods can be applied to forecasting sales of short life cycle electronics products and provide a feasible framework for implementing statistical forecasting in the environment of the case company. Two different approaches have been developed for forecasting on short and medium term and long term horizons. Both models are based on decomposition models, but differ in interpretation of the model residuals. For long term horizons residuals are assumed to represent white noise, whereas for short and medium term forecasting horizon residuals are modeled using statistical forecasting methods. Implementation of both approaches is performed in Matlab. Modeling results have shown that different markets exhibit different demand patterns and therefore different analytical approaches are appropriate for modeling demand in these markets. Moreover, the outcomes of modeling imply that statistical forecasting can not be handled separately from judgmental forecasting, but should be perceived only as a basis for judgmental forecasting activities. Based on modeling results recommendations for further deployment of statistical methods in sales forecasting of the case company are developed.
Resumo:
Construction of multiple sequence alignments is a fundamental task in Bioinformatics. Multiple sequence alignments are used as a prerequisite in many Bioinformatics methods, and subsequently the quality of such methods can be critically dependent on the quality of the alignment. However, automatic construction of a multiple sequence alignment for a set of remotely related sequences does not always provide biologically relevant alignments.Therefore, there is a need for an objective approach for evaluating the quality of automatically aligned sequences. The profile hidden Markov model is a powerful approach in comparative genomics. In the profile hidden Markov model, the symbol probabilities are estimated at each conserved alignment position. This can increase the dimension of parameter space and cause an overfitting problem. These two research problems are both related to conservation. We have developed statistical measures for quantifying the conservation of multiple sequence alignments. Two types of methods are considered, those identifying conserved residues in an alignment position, and those calculating positional conservation scores. The positional conservation score was exploited in a statistical prediction model for assessing the quality of multiple sequence alignments. The residue conservation score was used as part of the emission probability estimation method proposed for profile hidden Markov models. The results of the predicted alignment quality score highly correlated with the correct alignment quality scores, indicating that our method is reliable for assessing the quality of any multiple sequence alignment. The comparison of the emission probability estimation method with the maximum likelihood method showed that the number of estimated parameters in the model was dramatically decreased, while the same level of accuracy was maintained. To conclude, we have shown that conservation can be successfully used in the statistical model for alignment quality assessment and in the estimation of emission probabilities in the profile hidden Markov models.
Resumo:
In this thesis the X-ray tomography is discussed from the Bayesian statistical viewpoint. The unknown parameters are assumed random variables and as opposite to traditional methods the solution is obtained as a large sample of the distribution of all possible solutions. As an introduction to tomography an inversion formula for Radon transform is presented on a plane. The vastly used filtered backprojection algorithm is derived. The traditional regularization methods are presented sufficiently to ground the Bayesian approach. The measurements are foton counts at the detector pixels. Thus the assumption of a Poisson distributed measurement error is justified. Often the error is assumed Gaussian, altough the electronic noise caused by the measurement device can change the error structure. The assumption of Gaussian measurement error is discussed. In the thesis the use of different prior distributions in X-ray tomography is discussed. Especially in severely ill-posed problems the use of a suitable prior is the main part of the whole solution process. In the empirical part the presented prior distributions are tested using simulated measurements. The effect of different prior distributions produce are shown in the empirical part of the thesis. The use of prior is shown obligatory in case of severely ill-posed problem.
Resumo:
This thesis was focussed on statistical analysis methods and proposes the use of Bayesian inference to extract information contained in experimental data by estimating Ebola model parameters. The model is a system of differential equations expressing the behavior and dynamics of Ebola. Two sets of data (onset and death data) were both used to estimate parameters, which has not been done by previous researchers in (Chowell, 2004). To be able to use both data, a new version of the model has been built. Model parameters have been estimated and then used to calculate the basic reproduction number and to study the disease-free equilibrium. Estimates of the parameters were useful to determine how well the model fits the data and how good estimates were, in terms of the information they provided about the possible relationship between variables. The solution showed that Ebola model fits the observed onset data at 98.95% and the observed death data at 93.6%. Since Bayesian inference can not be performed analytically, the Markov chain Monte Carlo approach has been used to generate samples from the posterior distribution over parameters. Samples have been used to check the accuracy of the model and other characteristics of the target posteriors.
Resumo:
The optimal design of a heat exchanger system is based on given model parameters together with given standard ranges for machine design variables. The goals set for minimizing the Life Cycle Cost (LCC) function which represents the price of the saved energy, for maximizing the momentary heat recovery output with given constraints satisfied and taking into account the uncertainty in the models were successfully done. Nondominated Sorting Genetic Algorithm II (NSGA-II) for the design optimization of a system is presented and implemented inMatlab environment. Markov ChainMonte Carlo (MCMC) methods are also used to take into account the uncertainty in themodels. Results show that the price of saved energy can be optimized. A wet heat exchanger is found to be more efficient and beneficial than a dry heat exchanger even though its construction is expensive (160 EUR/m2) compared to the construction of a dry heat exchanger (50 EUR/m2). It has been found that the longer lifetime weights higher CAPEX and lower OPEX and vice versa, and the effect of the uncertainty in the models has been identified in a simplified case of minimizing the area of a dry heat exchanger.
Resumo:
The identifiability of the parameters of a heat exchanger model without phase change was studied in this Master’s thesis using synthetically made data. A fast, two-step Markov chain Monte Carlo method (MCMC) was tested with a couple of case studies and a heat exchanger model. The two-step MCMC-method worked well and decreased the computation time compared to the traditional MCMC-method. The effect of measurement accuracy of certain control variables to the identifiability of parameters was also studied. The accuracy used did not seem to have a remarkable effect to the identifiability of parameters. The use of the posterior distribution of parameters in different heat exchanger geometries was studied. It would be computationally most efficient to use the same posterior distribution among different geometries in the optimisation of heat exchanger networks. According to the results, this was possible in the case when the frontal surface areas were the same among different geometries. In the other cases the same posterior distribution can be used for optimisation too, but that will give a wider predictive distribution as a result. For condensing surface heat exchangers the numerical stability of the simulation model was studied. As a result, a stable algorithm was developed.
Resumo:
In the present dissertation, multilingual thesauri were approached as cultural products and the focus was twofold: On the empirical level the focus was placed on the translatability of certain British-English social science indexing terms into the Finnish language and culture at a concept, a term and an indexing term level. On the theoretical level the focus was placed on the aim of translation and on the concept of equivalence. In accordance with modern communicative and dynamic translation theories the interest was on the human dimension. The study is qualitative. In this study, equivalence was understood in a similar way to how dynamic, functional equivalence is commonly understood in translation studies. Translating was seen as a decision-making process, where a translator often has different kinds of possibilities to choose in order to fulfil the function of the translation. Accordingly, and as a starting point for the construction of the empirical part, the function of the source text was considered to be the same or similar to the function of the target text, that is, a functional thesaurus both in source and target context. Further, the study approached the challenges of multilingual thesaurus construction from the perspectives of semantics and pragmatics. In semantic analysis the focus was on what the words conventionally mean and in pragmatics on the ‘invisible’ meaning - or how we recognise what is meant even when it is not actually said (or written). Languages and ideas expressed by languages are created mainly in accordance with expressional needs of the surrounding culture and thesauri were considered to reflect several subcultures and consequently the discourses which represent them. The research material consisted of different kinds of potential discourses: dictionaries, database records, and thesauri, Finnish versus British social science researches, Finnish versus British indexers, simulated indexing tasks with five articles and Finnish versus British thesaurus constructors. In practice, the professional background of the two last mentioned groups was rather similar. It became even more clear that all the material types had their own characteristics, although naturally not entirely separate from each other. It is further noteworthy that the different types and origins of research material were not used to represent true comparison pairs, and that the aim of triangulation of methods and material was to gain a holistic view. The general research questions were: 1. Can differences be found between Finnish and British discourses regarding family roles as thesaurus terms, and if so, what kinds of differences and which are the implications for multilingual thesaurus construction? 2. What is the pragmatic indexing term equivalence? The first question studied how the same topic (family roles) was represented in different contexts and by different users, and further focused on how the possible differences were handled in multilingual thesaurus construction. The second question was based on findings of the previous one, and answered to the final question as to what kinds of factors should be considered when defining translation equivalence in multilingual thesaurus construction. The study used multiple cases and several data collection and analysis methods aiming at theoretical replication and complementarity. The empirical material and analysis consisted of focused interviews (with Finnish and British social scientists, thesaurus constructors and indexers), simulated indexing tasks with Finnish and British indexers, semantic component analysis of dictionary definitions and translations, coword analysis and datasets retrieved in databases, and discourse analysis of thesauri. As a terminological starting point a topic and case family roles was selected. The results were clear: 1) It was possible to identify different discourses. There also existed subdiscourses. For example within the group of social scientists the orientation to qualitative versus quantitative research had an impact on the way they reacted to the studied words and discourses, and indexers placed more emphasis on the information seekers whereas thesaurus constructors approached the construction problems from a more material based solution. The differences between the different specialist groups i.e. the social scientists, the indexers and the thesaurus constructors were often greater than between the different geo-cultural groups i.e. Finnish versus British. The differences occurred as a result of different translation aims, diverging expectations for multilingual thesauri and variety of practices. For multilingual thesaurus construction this means severe challenges. The clearly ambiguous concept of multilingual thesaurus as well as different construction and translation strategies should be considered more precisely in order to shed light on focus and equivalence types, which are clearly not self-evident. The research also revealed the close connection between the aims of multilingual thesauri and the pragmatic indexing term equivalence. 2) The pragmatic indexing term equivalence is very much context-depended. Although thesaurus term equivalence is defined and standardised in the field of library and information science (LIS), it is not understood in one established way and the current LIS tools are inadequate to provide enough analytical tools for both constructing and studying different kinds of multilingual thesauri as well as their indexing term equivalence. The tools provided in translation science were more practical and theoretical, and especially the division of different meanings of a word provided a useful tool in analysing the pragmatic equivalence, which often differs from the ideal model represented in thesaurus construction literature. The study thus showed that the variety of different discourses should be acknowledged, there is a need for operationalisation of new types of multilingual thesauri, and the factors influencing pragmatic indexing term equivalence should be discussed more precisely than is traditionally done.
Resumo:
Longitudinal surveys are increasingly used to collect event history data on person-specific processes such as transitions between labour market states. Surveybased event history data pose a number of challenges for statistical analysis. These challenges include survey errors due to sampling, non-response, attrition and measurement. This study deals with non-response, attrition and measurement errors in event history data and the bias caused by them in event history analysis. The study also discusses some choices faced by a researcher using longitudinal survey data for event history analysis and demonstrates their effects. These choices include, whether a design-based or a model-based approach is taken, which subset of data to use and, if a design-based approach is taken, which weights to use. The study takes advantage of the possibility to use combined longitudinal survey register data. The Finnish subset of European Community Household Panel (FI ECHP) survey for waves 1–5 were linked at person-level with longitudinal register data. Unemployment spells were used as study variables of interest. Lastly, a simulation study was conducted in order to assess the statistical properties of the Inverse Probability of Censoring Weighting (IPCW) method in a survey data context. The study shows how combined longitudinal survey register data can be used to analyse and compare the non-response and attrition processes, test the missingness mechanism type and estimate the size of bias due to non-response and attrition. In our empirical analysis, initial non-response turned out to be a more important source of bias than attrition. Reported unemployment spells were subject to seam effects, omissions, and, to a lesser extent, overreporting. The use of proxy interviews tended to cause spell omissions. An often-ignored phenomenon classification error in reported spell outcomes, was also found in the data. Neither the Missing At Random (MAR) assumption about non-response and attrition mechanisms, nor the classical assumptions about measurement errors, turned out to be valid. Both measurement errors in spell durations and spell outcomes were found to cause bias in estimates from event history models. Low measurement accuracy affected the estimates of baseline hazard most. The design-based estimates based on data from respondents to all waves of interest and weighted by the last wave weights displayed the largest bias. Using all the available data, including the spells by attriters until the time of attrition, helped to reduce attrition bias. Lastly, the simulation study showed that the IPCW correction to design weights reduces bias due to dependent censoring in design-based Kaplan-Meier and Cox proportional hazard model estimators. The study discusses implications of the results for survey organisations collecting event history data, researchers using surveys for event history analysis, and researchers who develop methods to correct for non-sampling biases in event history data.
Resumo:
This master thesis work introduces the fuzzy tolerance/equivalence relation and its application in cluster analysis. The work presents about the construction of fuzzy equivalence relations using increasing generators. Here, we investigate and research on the role of increasing generators for the creation of intersection, union and complement operators. The objective is to develop different varieties of fuzzy tolerance/equivalence relations using different varieties of increasing generators. At last, we perform a comparative study with these developed varieties of fuzzy tolerance/equivalence relations in their application to a clustering method.
Resumo:
In this research, the effectiveness of Naive Bayes and Gaussian Mixture Models classifiers on segmenting exudates in retinal images is studied and the results are evaluated with metrics commonly used in medical imaging. Also, a color variation analysis of retinal images is carried out to find how effectively can retinal images be segmented using only the color information of the pixels.
Resumo:
The strongest wish of the customer concerning chemical pulp features is consistent, uniform quality. Variation may be controlled and reduced by using statistical methods. However, studies addressing the application and benefits of statistical methods in forest product sector are scarce. Thus, the customer wish is the root cause of the motivation behind this dissertation. The research problem addressed by this dissertation is that companies in the chemical forest product sector require new knowledge for improving their utilization of statistical methods. To gain this new knowledge, the research problem is studied from five complementary viewpoints – challenges and success factors, organizational learning, problem solving, economic benefit, and statistical methods as management tools. The five research questions generated on the basis of these viewpoints are answered in four research papers, which are case studies based on empirical data collection. This research as a whole complements the literature dealing with the use of statistical methods in the forest products industry. Practical examples of the application of statistical process control, case-based reasoning, the cross-industry standard process for data mining, and performance measurement methods in the context of chemical forest products manufacturing are brought to the public knowledge of the scientific community. The benefit of the application of these methods is estimated or demonstrated. The purpose of this dissertation is to find pragmatic ideas for companies in the chemical forest product sector in order for them to improve their utilization of statistical methods. The main practical implications of this doctoral dissertation can be summarized in four points: 1. It is beneficial to reduce variation in chemical forest product manufacturing processes 2. Statistical tools can be used to reduce this variation 3. Problem-solving in chemical forest product manufacturing processes can be intensified through the use of statistical methods 4. There are certain success factors and challenges that need to be addressed when implementing statistical methods