909 results for exploratory data analysis


Relevance:

100.00%

Publisher:

Abstract:

The variety of fuels used in power boilers is widening, and new boiler designs and operating models have to be developed. This research and development is done in small pilot plants, where a faster analysis of the boiler mass and heat balance is needed so that the right decisions can be made already during the test run. The obstacle to determining the boiler balance during test runs is the long process of chemical analysis of the collected input and output matter samples. The present work concentrates on finding a way to determine the boiler balance without chemical analyses and on optimising the test rig to obtain the best possible accuracy for the heat and mass balance of the boiler. The purpose of this work was to create an automatic boiler balance calculation method for the 4 MW CFB/BFB pilot boiler of Kvaerner Pulping Oy located in Messukylä, Tampere. The calculation was implemented on the data management computer of the pilot plant's automation system. It is made in a Microsoft Excel environment, which provides a good basis and functions for handling large databases and calculations without any delicate programming. The automation system of the pilot plant was rebuilt and updated by Metso Automation Oy during 2001, and the new MetsoDNA system has good data management properties, which is necessary for large calculations such as the boiler balance calculation. Two possible methods for calculating the boiler balance during a test run were found: either the fuel flow is determined and used to calculate the boiler's mass balance, or the unburned carbon loss is estimated and the mass balance of the boiler is calculated on the basis of the boiler's heat balance. Both methods have their own weaknesses, so they were implemented in parallel in the calculation and the choice of method was left to the user. The user also needs to define the fuels used and some solid mass flows that are not measured automatically by the automation system. A sensitivity analysis showed that the most essential values for accurate boiler balance determination are the flue gas oxygen content, the boiler's measured heat output and the lower heating value of the fuel. The theoretical part of this work concentrates on the error management of these measurements and analyses, and on measurement accuracy and boiler balance calculation in theory. The empirical part concentrates on the creation of the balance calculation for the boiler in question and on describing the working environment.
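
To make the two balance routes concrete, the sketch below shows, in Python rather than the Excel/MetsoDNA environment described above, how a fuel flow could be backed out of the measured heat output and how the flue gas oxygen content enters as an excess-air check; the efficiency value, the fuel data and the function names are illustrative assumptions, not the thesis's actual calculation.

```python
# Minimal sketch of the two parallel balance approaches described above.
# All coefficients and variable names are illustrative assumptions, not the
# actual MetsoDNA/Excel implementation.

def fuel_flow_from_heat_balance(heat_output_kw, lhv_mj_per_kg, efficiency=0.88):
    """Estimate fuel mass flow [kg/s] from the measured heat output,
    the fuel's lower heating value and an assumed boiler efficiency."""
    return heat_output_kw / 1000.0 / (lhv_mj_per_kg * efficiency)

def excess_air_from_flue_gas_o2(o2_dry_pct):
    """Rough excess-air ratio from dry flue gas O2 content (volume-%)."""
    return 21.0 / (21.0 - o2_dry_pct)

if __name__ == "__main__":
    q = 4000.0           # measured heat output, kW (4 MW pilot boiler)
    lhv = 19.5           # lower heating value of fuel, MJ/kg (assumed)
    o2 = 3.5             # flue gas O2, vol-% dry (assumed)

    m_fuel = fuel_flow_from_heat_balance(q, lhv)
    lam = excess_air_from_flue_gas_o2(o2)
    print(f"estimated fuel flow: {m_fuel:.3f} kg/s, excess air ratio: {lam:.2f}")
```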

Relevance:

100.00%

Publisher:

Abstract:

The objective of this paper is to examine whether informal labor markets affect the flows of Foreign Direct Investment (FDI), and whether this effect is similar in developed and developing countries. To this end, different public data sources, such as the World Bank (WB) and the United Nations Conference on Trade and Development (UNCTAD), are used, and panel econometric models are estimated for a sample of 65 countries over a 14-year period (1996-2009). In addition, this paper uses a dynamic model as an extension of the analysis to establish whether such an effect exists and what its indicators and significance may be.
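
As an illustration of the kind of panel specification described above, the following sketch fits a country- and year-fixed-effects regression of FDI inflows on an informality measure; the file, variable names and control set are hypothetical, and the authors' actual estimators (including the dynamic extension) may differ.

```python
# Illustrative fixed-effects panel regression; the data frame and the
# variables fdi_inflows, informality, gdp_growth, trade_openness are
# hypothetical stand-ins, not the authors' dataset or model.
import pandas as pd
import statsmodels.formula.api as smf

# one row per country-year, 65 countries x 1996-2009
df = pd.read_csv("panel_fdi.csv")  # hypothetical file

model = smf.ols(
    "fdi_inflows ~ informality + gdp_growth + trade_openness"
    " + C(country) + C(year)",          # country and year fixed effects
    data=df,
).fit(cov_type="cluster", cov_kwds={"groups": df["country"]})
print(model.summary())
```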

Relevance:

100.00%

Publisher:

Abstract:

OBJECTIVE: The objective was to determine the risk of stroke associated with subclinical hypothyroidism. DATA SOURCES AND STUDY SELECTION: Published prospective cohort studies were identified through a systematic search through November 2013 without restrictions in several databases. Unpublished studies were identified through the Thyroid Studies Collaboration. We collected individual participant data on thyroid function and stroke outcome. Euthyroidism was defined as TSH levels of 0.45-4.49 mIU/L, and subclinical hypothyroidism as TSH levels of 4.5-19.9 mIU/L with normal T4 levels. DATA EXTRACTION AND SYNTHESIS: We collected individual participant data on 47 573 adults (3451 with subclinical hypothyroidism) from 17 cohorts followed up from 1972-2014 (489 192 person-years). Age- and sex-adjusted pooled hazard ratios (HRs) for participants with subclinical hypothyroidism compared to euthyroidism were 1.05 (95% confidence interval [CI], 0.91-1.21) for stroke events (combined fatal and nonfatal stroke) and 1.07 (95% CI, 0.80-1.42) for fatal stroke. Stratified by age, the HR for stroke events was 3.32 (95% CI, 1.25-8.80) for individuals aged 18-49 years. There was an increased risk of fatal stroke in the age groups 18-49 and 50-64 years, with HRs of 4.22 (95% CI, 1.08-16.55) and 2.86 (95% CI, 1.31-6.26), respectively (p for trend = 0.04). We found no increased risk for those 65-79 years old (HR, 1.00; 95% CI, 0.86-1.18) or ≥ 80 years old (HR, 1.31; 95% CI, 0.79-2.18). There was a pattern of increased risk of fatal stroke with higher TSH concentrations. CONCLUSIONS: Although no overall effect of subclinical hypothyroidism on stroke could be demonstrated, an increased risk was observed in subjects younger than 65 years and in those with higher TSH concentrations.
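
The pooled hazard ratios above come from combining cohort-level estimates; a minimal sketch of the generic inverse-variance approach on the log-HR scale is given below, with made-up study inputs. It assumes a fixed-effect combination and is not the collaboration's actual individual-participant-data analysis.

```python
# Fixed-effect inverse-variance pooling of study-level hazard ratios;
# the input numbers are invented for illustration, not the study's data.
import numpy as np

def pool_hazard_ratios(hrs, ci_lows, ci_highs):
    """Pool HRs on the log scale, weighting by inverse variance."""
    log_hr = np.log(hrs)
    # approximate SE from the 95% CI width on the log scale
    se = (np.log(ci_highs) - np.log(ci_lows)) / (2 * 1.96)
    w = 1.0 / se**2
    pooled = np.sum(w * log_hr) / np.sum(w)
    pooled_se = np.sqrt(1.0 / np.sum(w))
    lo, hi = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
    return np.exp(pooled), np.exp(lo), np.exp(hi)

# three hypothetical cohort-level HRs with their 95% CIs
print(pool_hazard_ratios(np.array([1.10, 0.95, 1.20]),
                         np.array([0.85, 0.70, 0.90]),
                         np.array([1.42, 1.29, 1.60])))
```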

Relevance:

100.00%

Publisher:

Abstract:

Objective: To evaluate the association between Hashimoto's thyroiditis (HT) and papillary thyroid carcinoma (PTC). Materials and Methods: The patients were evaluated by ultrasonography-guided fine needle aspiration cytology. Typical cytopathological aspects and/or classical histopathological findings were taken into consideration in the diagnosis of HT, and only histopathological results were considered in the diagnosis of PTC. Results: Among 1,049 patients with multi- or uninodular goiter (903 women and 146 men), 173 (16.5%) had cytopathological features of thyroiditis. Thirty-three (67.4%) of the 49 operated patients had PTC, 9 (27.3%) of them with histopathological features of HT. Five (31.3%) of the 16 patients with non-malignant disease also had HT. In the groups with HT, PTC, and PTC+HT, the female prevalence rate was 100%, 91.6%, and 77.8%, respectively, and the mean age was 41.5, 43.3, and 48.5 years, respectively. No association was observed between the two diseases in the present study, where HT occurred in 31.3% of the benign cases and in 27.3% of the malignant cases (p = 0.8). Conclusion: In spite of the absence of an association between HT and PTC, the possibility of malignancy in HT should always be considered because of the coexistence of the two diseases already reported in the literature.
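
A quick way to reproduce the kind of association test behind the reported p = 0.8 is a test on the 2x2 table implied by the abstract's counts (HT in 9 of 33 malignant and 5 of 16 benign cases); the sketch below uses Fisher's exact test, which is an assumption, as the abstract does not name the test used.

```python
# Association between HT and malignancy from the counts given above;
# the choice of Fisher's exact test is an illustrative assumption.
from scipy.stats import fisher_exact

#                 HT   no HT
table = [[ 9,  33 - 9],   # malignant (PTC)
         [ 5,  16 - 5]]   # benign

odds_ratio, p_value = fisher_exact(table)
print(f"odds ratio = {odds_ratio:.2f}, p = {p_value:.2f}")
```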

Relevance:

100.00%

Publisher:

Abstract:

We empirically investigate the determinants of EMU sovereign bond yield spreads with respect to the German bund. Using panel data techniques, we examine the role of a wide set of potential drivers. To our knowledge, this paper presents one of the most exhaustive compilations of the variables used in the literature to study the behaviour of sovereign yield spreads and, in particular, to gauge the effect on these spreads of changes in market sentiment and risk aversion. We use a sample of both central and peripheral countries from January 1999 to December 2012 and assess whether there were significant changes after the outbreak of the euro area debt crisis. Our results suggest that the rise in sovereign risk in central countries can only be partially explained by the evolution of local macroeconomic variables in those countries.
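
As a small illustration of the dependent variable used in this literature, the sketch below builds 10-year yield spreads against the German bund and reshapes them into a country-month panel; the file and column names are hypothetical placeholders, not the authors' dataset.

```python
# Constructing sovereign yield spreads versus the German bund from monthly
# yields; "emu_10y_yields.csv" and its country columns are hypothetical.
import pandas as pd

yields = pd.read_csv("emu_10y_yields.csv", parse_dates=["date"],
                     index_col="date")           # one column per country, % p.a.
spreads = yields.drop(columns="DE").sub(yields["DE"], axis=0) * 100  # basis points

# reshape to a long country-month panel for the regressions described above
panel = spreads.stack().rename("spread_bp").reset_index()
panel.columns = ["date", "country", "spread_bp"]
print(panel.head())
```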

Relevance:

100.00%

Publisher:

Abstract:

First application of compositional data analysis techniques to Australian election data

Relevance:

100.00%

Publisher:

Abstract:

In any discipline, where uncertainty and variability are present, it is important to have principles which are accepted as inviolate and which should therefore drive statistical modelling, statistical analysis of data and any inferences from such an analysis. Despite the fact that two such principles have existed over the last two decades and from these a sensible, meaningful methodology has been developed for the statistical analysis of compositional data, the application of inappropriate and/or meaningless methods persists in many areas of application. This paper identifies at least ten common fallacies and confusions in compositional data analysis with illustrative examples and provides readers with necessary, and hopefully sufficient, arguments to persuade the culprits why and how they should amend their ways.
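
One of the standard remedies advocated by this methodology is to analyse compositions in log-ratio coordinates rather than as raw, unit-sum-constrained proportions; the sketch below implements the centred log-ratio (clr) transform on made-up three-part compositions as a minimal illustration, not a reproduction of the paper's examples.

```python
# Centred log-ratio (clr) transform of compositional data; the three-part
# compositions below are invented illustrative values.
import numpy as np

def clr(x):
    """Centred log-ratio transform applied row-wise to compositions."""
    x = np.asarray(x, dtype=float)
    g = np.exp(np.mean(np.log(x), axis=1, keepdims=True))  # geometric mean
    return np.log(x / g)

comps = np.array([[0.60, 0.30, 0.10],
                  [0.50, 0.35, 0.15],
                  [0.45, 0.40, 0.15]])
z = clr(comps)
print(z)              # log-ratio coordinates, free of the unit-sum constraint
print(z.sum(axis=1))  # check: each row of clr coordinates sums to zero
```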

Relevance:

100.00%

Publisher:

Abstract:

Identification of low-dimensional structures and main sources of variation from multivariate data are fundamental tasks in data analysis. Many methods aimed at these tasks involve solution of an optimization problem. Thus, the objective of this thesis is to develop computationally efficient and theoretically justified methods for solving such problems. Most of the thesis is based on a statistical model in which ridges of the density estimated from the data are considered as relevant features. Finding ridges, which are generalized maxima, necessitates development of advanced optimization methods. An efficient and convergent trust region Newton method for projecting a point onto a ridge of the underlying density is developed for this purpose. The method is utilized in a differential equation-based approach for tracing ridges and computing projection coordinates along them. The density estimation is done nonparametrically by using Gaussian kernels. This allows application of ridge-based methods with only mild assumptions on the underlying structure of the data. The statistical model and the ridge finding methods are adapted to two different applications. The first one is extraction of curvilinear structures from noisy data mixed with background clutter. The second one is a novel nonlinear generalization of principal component analysis (PCA) and its extension to time series data. The methods have a wide range of potential applications where most of the earlier approaches are inadequate. Examples include identification of faults from seismic data and identification of filaments from cosmological data. Applicability of the nonlinear PCA to climate analysis and reconstruction of periodic patterns from noisy time series data are also demonstrated. Other contributions of the thesis include development of an efficient semidefinite optimization method for embedding graphs into the Euclidean space. The method produces structure-preserving embeddings that maximize interpoint distances. It is primarily developed for dimensionality reduction, but also has potential applications in graph theory and various areas of physics, chemistry and engineering. The asymptotic behaviour of ridges and maxima of Gaussian kernel densities is also investigated as the kernel bandwidth approaches infinity. The results are applied to the nonlinear PCA and to finding significant maxima of such densities, which is a typical problem in visual object tracking.
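
To give a feel for what "projecting a point onto a ridge" of a kernel density involves, the sketch below computes the Gaussian-kernel weights and density Hessian at a point and iterates a subspace-constrained mean-shift step; this is a simpler stand-in for the trust-region Newton method the thesis develops, and all data and parameters are invented for illustration.

```python
# Subspace-constrained mean-shift sketch for ridge projection of a Gaussian
# kernel density estimate; illustrative only, not the thesis's algorithm.
import numpy as np

def gaussian_kde_parts(x, data, h):
    """Kernel weights and density Hessian (up to a constant factor) at x."""
    d = x - data                                    # (n, dim) differences
    w = np.exp(-0.5 * np.sum(d**2, axis=1) / h**2)  # unnormalised kernel weights
    hess = (np.einsum("i,ij,ik->jk", w, d, d) / h**4
            - w.sum() * np.eye(x.size) / h**2)
    return w, hess

def project_to_ridge(x, data, h, steps=200):
    """Move x with the mean-shift vector restricted to the lowest-curvature
    eigendirections of the Hessian, approaching a one-dimensional ridge."""
    for _ in range(steps):
        w, hess = gaussian_kde_parts(x, data, h)
        _, vecs = np.linalg.eigh(hess)              # eigenvalues ascending
        v = vecs[:, : x.size - 1]                   # drop the ridge direction
        shift = (w[:, None] * data).sum(0) / w.sum() - x  # mean-shift vector
        x = x + v @ (v.T @ shift)
    return x

rng = np.random.default_rng(0)
t = rng.uniform(-2, 2, 300)
pts = np.c_[t, np.sin(t)] + 0.1 * rng.normal(size=(300, 2))  # noisy curve
print(project_to_ridge(np.array([0.5, 1.0]), pts, h=0.3))
```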

Relevance:

100.00%

Publisher:

Abstract:

GLUT4 protein expression in white adipose tissue (WAT) and skeletal muscle (SM) was investigated in 2-month-old, 12-month-old spontaneously obese, or 12-month-old calorie-restricted lean Wistar rats, considering different parameters of analysis, such as tissue and body weight and total protein yield of the tissue. In WAT, a ~70% decrease was observed in plasma membrane and microsomal GLUT4 protein, expressed per µg of protein or per g of tissue, in both 12-month-old obese and 12-month-old lean rats compared to 2-month-old rats. However, when plasma membrane and microsomal GLUT4 tissue contents were expressed per g of body weight, they were the same. In SM, GLUT4 protein content, expressed per µg of protein, was similar in 2-month-old and 12-month-old obese rats, whereas it was reduced in 12-month-old obese rats when expressed per g of tissue or per g of body weight, which may play an important role in insulin resistance. Weight loss did not change the SM GLUT4 content. These results show that altered insulin sensitivity is accompanied by modulation of GLUT4 protein expression. However, the true role of WAT and SM GLUT4 contents in whole-body or tissue insulin sensitivity should be determined by considering not only GLUT4 protein expression, but also the strong morphostructural changes in these tissues, which require different types of data analysis.
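
The abstract's point that conclusions depend on the normalisation denominator can be illustrated with a toy table in which the same GLUT4 signal is expressed per unit of protein, per gram of tissue and per gram of body weight; the numbers below are invented and are not the study's measurements.

```python
# Toy illustration of how the normalisation denominator changes the apparent
# GLUT4 "content"; all values are invented for illustration.
import pandas as pd

df = pd.DataFrame({
    "group":       ["2-month", "12-month obese"],
    "glut4_au":    [100.0, 95.0],    # arbitrary densitometric units per sample
    "protein_ug":  [50.0, 52.0],     # protein loaded per sample
    "tissue_g":    [1.0, 3.5],       # depot mass sampled
    "body_wt_g":   [250.0, 600.0],
})

df["per_ug_protein"] = df["glut4_au"] / df["protein_ug"]
df["per_g_tissue"]   = df["glut4_au"] / df["tissue_g"]
df["per_g_body_wt"]  = df["glut4_au"] / df["body_wt_g"]
print(df)   # group differences can shrink or flip depending on the denominator
```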

Relevance:

100.00%

Publisher:

Abstract:

This study sought to evaluate the acceptance of "dulce de leche" made with coffee and whey. The results were analyzed through response surface methodology, ANOVA, mean comparison tests, histograms, and a preference map correlating the global impression data with the results of the physical, physicochemical and sensory analyses. The response surface methodology, by itself, was not enough to find the best formulation. From the ANOVA, mean comparison tests, and preference map, it was observed that the consumers' favorite "dulce de leche" formulations were formulation 1 (10% whey and 1% coffee) and formulation 2 (30% whey and 1% coffee), followed by formulation 9 (20% whey and 1.25% coffee). The acceptance of samples 1 and 2 was driven by their higher acceptability in terms of flavor and by their higher pH, L*, and b* values. Samples 1 and 2 also presented a higher purchase approval score and higher percentages of responses in the 'ideal' category for sweetness and coffee flavor. It was found that consumers preferred the samples with low concentrations of coffee regardless of the concentration of whey, thus enabling the use of whey and coffee in the manufacture of dulce de leche and the development of a new product.
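
A minimal sketch of the response-surface step is shown below: a second-order model of global impression as a function of whey and coffee levels, fitted by ordinary least squares; the data file and column names are hypothetical stand-ins for the sensory data set.

```python
# Second-order response-surface model of global impression versus whey and
# coffee levels; "dulce_sensory.csv" and its column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("dulce_sensory.csv")   # columns: whey_pct, coffee_pct, global_impression

rsm = smf.ols(
    "global_impression ~ whey_pct + coffee_pct"
    " + I(whey_pct**2) + I(coffee_pct**2) + whey_pct:coffee_pct",
    data=df,
).fit()
print(rsm.summary())
```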

Relevance:

100.00%

Publisher:

Abstract:

The recent rapid development of biotechnological approaches has enabled the production of large whole genome level biological data sets. In order to handle these data sets, reliable and efficient automated tools and methods for data processing and result interpretation are required. Bioinformatics, as the field of studying and processing biological data, tries to answer this need by combining methods and approaches across computer science, statistics, mathematics and engineering. The need is also increasing for tools that can be used by biological researchers themselves, who may not have a strong statistical or computational background, which requires creating tools and pipelines with intuitive user interfaces, robust analysis workflows and a strong emphasis on result reporting and visualization. Within this thesis, several data analysis tools and methods have been developed for analyzing high-throughput biological data sets. These approaches, covering several aspects of high-throughput data analysis, are specifically aimed at gene expression and genotyping data, although in principle they are suitable for analyzing other data types as well. Coherent handling of the data across the various data analysis steps is highly important in order to ensure robust and reliable results. Thus, robust data analysis workflows are also described, putting the developed tools and methods into a wider context. The choice of the correct analysis method may also depend on the properties of the specific data set, and therefore guidelines for choosing an optimal method are given. The data analysis tools, methods and workflows developed within this thesis have been applied to several research studies, of which two representative examples are included in the thesis. The first study focuses on spermatogenesis in murine testis and the second one examines cell lineage specification in mouse embryonic stem cells.

Relevance:

100.00%

Publisher:

Abstract:

This research concerns the Urban Living Idea Contest conducted by Creator Space™ of BASF SE during its 150th anniversary in 2015. The main objectives of the thesis are to provide a comprehensive analysis of the Urban Living Idea Contest (ULIC) and to propose a number of improvement suggestions for future years. More than 4,000 data points were collected and analyzed to investigate the functionality of the different elements of the contest. Furthermore, a set of improvement suggestions was proposed to BASF SE. The novelty of this thesis lies in the data collection and the original analysis of the contest, which identified its critical elements as well as the areas that could be improved. The author of this research was a member of the organizing team and was involved in the decision-making process from the beginning until the end of the ULIC.

Relevance:

100.00%

Publisher:

Abstract:

In this paper, we discuss Conceptual Knowledge Discovery in Databases (CKDD) in its connection with Data Analysis. Our approach is based on Formal Concept Analysis, a mathematical theory which has been developed and proven useful during the last 20 years. Formal Concept Analysis has led to a theory of conceptual information systems which has been applied by using the management system TOSCANA in a wide range of domains. In this paper, we use such an application in database marketing to demonstrate how methods and procedures of CKDD can be applied in Data Analysis. In particular, we show the interplay and integration of data mining and data analysis techniques based on Formal Concept Analysis. The main concern of this paper is to explain how the transition from data to knowledge can be supported by a TOSCANA system. To clarify the transition steps we discuss their correspondence to the five levels of knowledge representation established by R. Brachman and to the steps of empirically grounded theory building proposed by A. Strauss and J. Corbin.
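
For readers unfamiliar with Formal Concept Analysis, the sketch below builds a tiny object-attribute context and enumerates its formal concepts by brute force through the derivation operators; the context is invented for illustration, and TOSCANA-style conceptual scaling is not shown.

```python
# Brute-force enumeration of the formal concepts of a small binary context;
# the objects, attributes and incidence relation are made-up examples.
from itertools import combinations

objects = ["o1", "o2", "o3", "o4"]
attributes = ["a", "b", "c"]
incidence = {                      # which attributes each object has
    "o1": {"a", "b"},
    "o2": {"a", "c"},
    "o3": {"a", "b", "c"},
    "o4": {"b"},
}

def extent(attrs):
    """Objects having all attributes in `attrs` (derivation of an intent)."""
    return {o for o in objects if attrs <= incidence[o]}

def intent(objs):
    """Attributes shared by all objects in `objs` (derivation of an extent)."""
    shared = set(attributes)
    for o in objs:
        shared &= incidence[o]
    return shared

# every formal concept arises as (B', B'') for some attribute set B
concepts = set()
for r in range(len(attributes) + 1):
    for combo in combinations(attributes, r):
        ext = extent(set(combo))
        concepts.add((frozenset(ext), frozenset(intent(ext))))

for ext, itt in sorted(concepts, key=lambda c: len(c[0])):
    print(sorted(ext), sorted(itt))
```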