985 results for "Data handling"
Abstract:
This paper uses an exogenous increase in income for a specific sub-group in Taiwan to explore the extent to which higher income leads to higher levels of health and wellbeing. In 1995, the Taiwanese government implemented the Senior Farmer Welfare Benefit Interim Regulation (SFWBIR), a pure cash injection of approximately US$110 (£70) per month in 1996, for senior farmers. A difference-in-differences (DiD) approach is applied to survey data from the Taiwanese Health and Living Status of the Elderly study in 1989 and 1996 to evaluate the short-term effect of the SFWBIR on self-assessed health, depression, and life satisfaction. Senior manufacturing workers serve as the comparison group for the senior farmers in the natural experiment because their demographic backgrounds are similar. This paper provides evidence that the increase in income from the SFWBIR significantly improved the mental health of senior farmers by reducing their depression score (CES-D) by 1.718; however, it had no significant short-term impact on self-assessed health or life satisfaction.
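The DiD logic described above reduces to a difference of group-level changes; a minimal sketch follows, where all scores are synthetic stand-ins, not the paper's actual survey data:

```python
# Minimal difference-in-differences (DiD) sketch on synthetic data;
# group labels and outcome values are illustrative, not the paper's.
def did_estimate(pre_treat, post_treat, pre_ctrl, post_ctrl):
    """DiD = (treated post - treated pre) - (control post - control pre)."""
    mean = lambda xs: sum(xs) / len(xs)
    return (mean(post_treat) - mean(pre_treat)) - (mean(post_ctrl) - mean(pre_ctrl))

# Hypothetical depression-scale scores (lower is better)
farmers_1989 = [12.0, 14.0, 13.0]   # treated group, pre-benefit
farmers_1996 = [10.0, 12.0, 11.0]   # treated group, post-benefit
workers_1989 = [12.0, 13.0, 14.0]   # comparison group, pre
workers_1996 = [12.0, 13.0, 14.0]   # comparison group, post

effect = did_estimate(farmers_1989, farmers_1996, workers_1989, workers_1996)
print(effect)  # -2.0: the benefit is associated with a 2-point drop
```

The comparison group nets out common time trends, which is why demographically similar manufacturing workers are a sensible control.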
Abstract:
The use of Geographic Information Systems has revolutionized the handling and visualization of geo-referenced data and has underlined the critical role of spatial analysis. The usual tool for this purpose is geostatistics, which is widely used in Earth science. Geostatistics is based on several hypotheses that are not always verified in practice. Artificial Neural Networks (ANNs), on the other hand, can a priori be used without special assumptions and are known to be flexible. This paper discusses the application of ANNs to the interpolation of a geo-referenced variable.
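As a toy illustration of ANN-based interpolation: a one-hidden-layer network fitted to synthetic samples by plain gradient descent (the paper's setting is 2-D geo-referenced data; the 1-D data, network size, and learning rate below are all illustrative assumptions):

```python
import math, random

# Toy one-hidden-layer tanh network fitted to sampled values of an
# unknown surface along one axis; all values are illustrative.
random.seed(1)
H = 6                                      # hidden units
w1 = [random.uniform(-1, 1) for _ in range(H)]
b1 = [random.uniform(-1, 1) for _ in range(H)]
w2 = [random.uniform(-1, 1) for _ in range(H)]
b2 = 0.0

def predict(x):
    hidden = [math.tanh(w1[j] * x + b1[j]) for j in range(H)]
    return sum(w2[j] * hidden[j] for j in range(H)) + b2

def mse(pts):
    return sum((predict(x) - z) ** 2 for x, z in pts) / len(pts)

samples = [(0.0, 0.0), (0.25, 0.7), (0.5, 1.0), (0.75, 0.7), (1.0, 0.0)]

lr, loss_start = 0.1, mse(samples)
for _ in range(2000):                      # plain stochastic gradient descent
    for x, z in samples:
        hidden = [math.tanh(w1[j] * x + b1[j]) for j in range(H)]
        err = predict(x) - z
        for j in range(H):
            grad_h = err * w2[j] * (1 - hidden[j] ** 2)
            w2[j] -= lr * err * hidden[j]
            w1[j] -= lr * grad_h * x
            b1[j] -= lr * grad_h
        b2 -= lr * err

print(mse(samples) < loss_start)  # True: the network fits the samples
```

Once trained, `predict` interpolates between the sampled locations, with no stationarity or variogram assumptions of the kind geostatistics requires.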
Abstract:
Despite the central role of quantitative PCR (qPCR) in the quantification of mRNA transcripts, most analyses of qPCR data are still delegated to the software that comes with the qPCR apparatus. This is especially true for the handling of the fluorescence baseline. This article shows that baseline estimation errors are directly reflected in the observed PCR efficiency values and are thus propagated exponentially in the estimated starting concentrations as well as 'fold-difference' results. Because of the unknown origin and kinetics of the baseline fluorescence, the fluorescence values monitored in the initial cycles of the PCR reaction cannot be used to estimate a useful baseline value. An algorithm that estimates the baseline by reconstructing the log-linear phase downward from the early plateau phase of the PCR reaction was developed and shown to lead to very reproducible PCR efficiency values. PCR efficiency values were determined per sample by fitting a regression line to a subset of data points in the log-linear phase. The variability, as well as the bias, in qPCR results was significantly reduced when the mean of these PCR efficiencies per amplicon was used in the calculation of an estimate of the starting concentration per sample.
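The per-sample efficiency estimation described, fitting a regression line to the log-linear phase, can be sketched as follows on a synthetic, already baseline-corrected curve (the window and efficiency value are illustrative):

```python
import math

def pcr_efficiency(fluorescence, window):
    """Fit a least-squares line to log10(fluorescence) over the
    log-linear window; the PCR efficiency is 10**slope
    (2.0 would mean perfect doubling each cycle)."""
    xs = list(window)
    ys = [math.log10(fluorescence[c]) for c in xs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
            / sum((x - mx) ** 2 for x in xs)
    return 10 ** slope

# Synthetic, baseline-corrected amplification curve: F_c = F0 * E**c
F0, E = 1e-6, 1.9
curve = [F0 * E ** c for c in range(30)]
print(round(pcr_efficiency(curve, range(12, 20)), 3))  # 1.9
```

Because the starting-concentration estimate scales like E raised to the quantification cycle, even a small error in the baseline (and hence in E) propagates exponentially, which is the article's central point.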
Abstract:
Imaging mass spectrometry (IMS) represents an innovative tool in the cancer research pipeline, which is increasingly being used in clinical and pharmaceutical applications. The unique properties of the technique, especially the amount of data generated, make the handling of data from multiple IMS acquisitions challenging. This work presents a histology-driven IMS approach aiming to identify discriminant lipid signatures from the simultaneous mining of IMS data sets from multiple samples. The feasibility of the developed workflow is evaluated on a set of three human colorectal cancer liver metastasis (CRCLM) tissue sections. Lipid IMS on tissue sections was performed using MALDI-TOF/TOF MS in both negative and positive ionization modes after 1,5-diaminonaphthalene matrix deposition by sublimation. The positive and negative acquisition results were combined during data mining to simplify the process and interrogate a larger lipidome in a single analysis. To reduce the complexity of the IMS data sets, a sub data set was generated by randomly selecting a fixed number of spectra from a histologically defined region of interest, resulting in a 10-fold data reduction. Principal component analysis confirmed that the molecular selectivity of the regions of interest is maintained after data reduction. Partial least-squares and heat map analyses demonstrated a selective signature of the CRCLM, revealing lipids that are significantly up- and down-regulated in the tumor region. This comprehensive approach is thus of interest for defining disease signatures directly from IMS data sets by the use of combinatory data mining, opening novel routes of investigation for addressing the demands of the clinical setting.
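The random data-reduction step can be sketched as below; the spectrum identifiers and counts are hypothetical placeholders for real IMS pixel spectra:

```python
import random

def reduce_roi(roi_spectra, n_keep, seed=0):
    """Reproducible random selection of n_keep spectra from one
    histology-defined region of interest (ROI)."""
    rng = random.Random(seed)          # fixed seed -> repeatable subset
    return rng.sample(roi_spectra, n_keep)

roi = [f"spectrum_{i}" for i in range(1000)]   # e.g. tumour-ROI pixels
subset = reduce_roi(roi, 100)                  # 10-fold data reduction
print(len(subset))  # 100
```

Seeding the generator makes the reduction repeatable, so downstream PCA and PLS results can be reproduced on the same subset.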
Abstract:
Volumes of data used in science and industry are growing rapidly. When researchers face the challenge of analyzing them, the format is often the first obstacle. The lack of standardized ways of exploring different data layouts requires solving the problem from scratch each time. The ability to access data in a rich, uniform manner, e.g. using Structured Query Language (SQL), would offer expressiveness and user-friendliness. Comma-separated values (CSV) is one of the most common data storage formats. Despite its simplicity, handling it becomes non-trivial as file size grows. Importing CSVs into existing databases is time-consuming and troublesome, or even impossible if their horizontal dimension reaches thousands of columns. Most databases are optimized for handling a large number of rows rather than columns; therefore, performance for datasets with non-typical layouts is often unacceptable. Other challenges include schema creation, updates and repeated data imports. To address the above-mentioned problems, I present a system for accessing very large CSV-based datasets by means of SQL. It is characterized by: a "no copy" approach - data stay mostly in the CSV files; "zero configuration" - no need to specify a database schema; written in C++ with boost [1], SQLite [2] and Qt [3], it requires no installation and has a very small size; query rewriting, dynamic creation of indices for appropriate columns and static data retrieval directly from the CSV files ensure efficient plan execution; effortless support for millions of columns; per-value typing makes using mixed text/number data easy; a very simple network protocol provides an efficient interface for MATLAB and reduces implementation time for other languages. The software is available as freeware along with educational videos on its website [4]. It does not need any prerequisites to run, as all of the libraries are included in the distribution package.
I test it against existing database solutions using a battery of benchmarks and discuss the results.
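For illustration only: the system described queries CSV files in place, whereas the stdlib sketch below copies a tiny CSV into an in-memory SQLite table. It shows the SQL-over-CSV idea and the use of an index on a queried column; the table and column names are invented:

```python
import csv, io, sqlite3

# A tiny CSV as it might arrive from an instrument or export
raw = "id,score\n1,10\n2,30\n3,20\n"
rows = list(csv.reader(io.StringIO(raw)))
header, data = rows[0], rows[1:]

con = sqlite3.connect(":memory:")
con.execute(f"CREATE TABLE t ({', '.join(header)})")   # schema from header
con.executemany("INSERT INTO t VALUES (?, ?)", data)
con.execute("CREATE INDEX idx_score ON t(score)")      # index a hot column

# Values were loaded as text, so cast per value when sorting numerically
top = con.execute("SELECT id FROM t ORDER BY CAST(score AS INT) DESC").fetchall()
print(top)  # [('2',), ('3',), ('1',)]
```

The `CAST` in the query mirrors the per-value typing mentioned in the abstract: the data stay as text until a query needs them as numbers.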
Abstract:
Nowadays, the variety of fuels used in power boilers is widening, and new boiler constructions and operating models have to be developed. This research and development is done in small pilot plants, where a faster analysis of the boiler mass and heat balance is needed in order to make the right decisions already during the test run. The barrier to determining the boiler balance during test runs is the long process of chemical analysis of the collected input and output matter samples. The present work concentrates on finding a way to determine the boiler balance without chemical analyses and on optimising the test rig to achieve the best possible accuracy for the heat and mass balance of the boiler. The purpose of this work was to create an automatic boiler balance calculation method for the 4 MW CFB/BFB pilot boiler of Kvaerner Pulping Oy, located in Messukylä in Tampere. The calculation was created in the data management computer of the pilot plant's automation system. The calculation is made in a Microsoft Excel environment, which provides a good base and functions for handling large databases and calculations without any delicate programming. The automation system in the pilot plant was reconstructed and updated by Metso Automation Oy during 2001, and the new system, MetsoDNA, has good data management properties, which is necessary for large calculations such as the boiler balance calculation. Two possible methods for calculating the boiler balance during a test run were found. Either the fuel flow is determined and used to calculate the boiler's mass balance, or the unburned carbon loss is estimated and the mass balance of the boiler is calculated on the basis of the boiler's heat balance. Both methods have their own weaknesses, so they were implemented in parallel in the calculation and the choice of method was left to the user. The user also needs to define the fuels used and some solid mass flows that are not measured automatically by the automation system.
Sensitivity analysis showed that the most essential values for accurate boiler balance determination are the flue gas oxygen content, the boiler's measured heat output and the lower heating value of the fuel. The theoretical part of this work concentrates on the error management of these measurements and analyses, and on measurement accuracy and boiler balance calculation in theory. The empirical part of this work concentrates on the creation of the balance calculation for the boiler in question and on describing the work environment.
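The heat-balance route, estimating fuel flow from the measured heat output and the fuel's lower heating value, reduces to simple arithmetic; the figures below are hypothetical, not the thesis's measurements, and the fixed efficiency is an assumption:

```python
def fuel_flow_from_heat_balance(heat_output_kw, lhv_kj_per_kg, efficiency):
    """Estimate fuel mass flow (kg/s) from the boiler's measured heat
    output, the fuel's lower heating value (LHV) and an assumed
    boiler efficiency: m_fuel = Q / (LHV * eta)."""
    return heat_output_kw / (lhv_kj_per_kg * efficiency)

# Hypothetical 4 MW pilot-boiler figures (not from the thesis)
m_fuel = fuel_flow_from_heat_balance(4000.0, 20000.0, 0.88)
print(round(m_fuel, 4))  # 0.2273 (kg/s)
```

This also makes the sensitivity result plausible: errors in the measured heat output or in the LHV feed directly and proportionally into the estimated fuel flow and hence into the whole mass balance.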
Abstract:
Recent years have seen great advances in instrumentation technology. The amount of available data has been increasing due to the simplicity, speed and accuracy of current spectroscopic instruments. Most of these data are, however, meaningless without proper analysis. This has been one of the reasons for the ever-growing success of multivariate handling of such data. Industrial data are commonly not designed data; in other words, there is no exact experimental design, but rather the data have been collected as a routine procedure during an industrial process. This places certain demands on the multivariate modeling, as the selection of samples and variables can have an enormous effect. Common approaches in the modeling of industrial data are PCA (principal component analysis) and PLS (projection to latent structures, or partial least squares), but there are also other methods that should be considered. The more advanced methods include multi-block modeling and nonlinear modeling. In this thesis it is shown that the results of data analysis vary according to the modeling approach used, making the selection of the modeling approach dependent on the purpose of the model. If the model is intended to provide accurate predictions, the approach should differ from the case where the purpose of modeling is mostly to obtain information about the variables and the process. For industrial applicability it is essential that the methods are robust and sufficiently simple to apply. In this way the methods and the results can be compared and an approach selected that is suitable for the intended purpose. In this thesis, differences between data analysis methods are compared using data from different fields of industry. In the first two papers, the multi-block method is considered for data originating from the oil and fertilizer industries. The results are compared to those from PLS and priority PLS.
The third paper considers the applicability of multivariate models to process control for a reactive crystallization process. In the fourth paper, nonlinear modeling is examined with a data set from the oil industry. The response has a nonlinear relation to the descriptor matrix, and the results are compared between linear modeling, polynomial PLS and nonlinear modeling using nonlinear score vectors.
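As a minimal sketch of the PCA building block used in such modeling, the first principal component of two correlated process variables can be found by power iteration on the covariance matrix (the data below are illustrative, not from the thesis):

```python
import math

def first_pc(data, iters=50):
    """First principal component of centered 2-D data via power
    iteration on the 2x2 covariance matrix."""
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    centered = [(x - mx, y - my) for x, y in data]
    cxx = sum(x * x for x, _ in centered) / n
    cyy = sum(y * y for _, y in centered) / n
    cxy = sum(x * y for x, y in centered) / n
    v = (1.0, 0.0)
    for _ in range(iters):                 # repeatedly apply the matrix
        v = (cxx * v[0] + cxy * v[1], cxy * v[0] + cyy * v[1])
        norm = math.hypot(*v)
        v = (v[0] / norm, v[1] / norm)
    return v

# Two process variables that move together -> PC1 along the diagonal
pc = first_pc([(1, 1), (2, 2), (-1, -1), (-2, -2)])
print(round(pc[0], 4), round(pc[1], 4))  # 0.7071 0.7071
```

PLS differs in that the latent directions are chosen to maximize covariance with the response rather than the variance of the descriptors alone.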
Abstract:
Brazil is among the world's largest swine producers. However, its competitiveness has been vulnerable due to a lack of cooperation between the supply chain players. As a consequence, financial losses are evaluated taking into account only an individual node, and most of the time these damages are absorbed by swine breeders. Live weight losses occur between the farm and the slaughterhouse, and their main cause is pre-slaughter handling, especially during animal transportation. In this research, we analyzed the pre-slaughter handling of a swine farm in Brasilândia, MS, Brazil. The analyzed data were provided by five slaughterhouses (clients of the farm) from the studied region, covering live weight losses, carcass bruising, animal injuries, and death rate. The results indicated that total financial losses represent 160 thousand dollars per year when the supply chain as a whole is taken into account.
Abstract:
Prerequisites and effects of proactive and preventive psycho-social student welfare activities in Finnish preschool and elementary school were of interest in the present thesis. So far, Finnish student welfare work has mainly focused on interventions and individuals, and the numerous possibilities to enhance the well-being of all students as a part of everyday school work have not been fully exploited. Consequently, three goals were set in this thesis: (1) to present concrete examples of proactive and preventive psycho-social student welfare activities in Finnish basic education; (2) to investigate measurable positive effects of proactive and preventive activities; and (3) to investigate the implementation of proactive and preventive activities in ecological contexts. Two prominent phenomena in the preschool and elementary school years, the transition to formal schooling and school bullying, were chosen as examples of critical situations that are appropriate targets for proactive and preventive psycho-social student welfare activities. Until lately, the procedures concerning both school transitions and school bullying have been rather problem-focused and reactive in nature. Theoretically, we lean on the bioecological model of development by Bronfenbrenner and Morris, with concentric micro-, meso-, exo- and macrosystems. Data were drawn from two large-scale research projects, the longitudinal First Steps Study: Interactive Learning in the Child–Parent–Teacher Triangle, and the Evaluation Study of the National Antibullying Program KiVa. In Study I, we found that the academic skills of children from preschool–elementary school pairs that implemented several supportive activities during the preschool year developed more quickly from preschool to Grade 1 compared with the skills of children from pairs that used fewer practices.
In Study II, we focused on possible effects of proactive and preventive actions on teachers and found that participation in the KiVa antibullying program influenced teachers' self-evaluated competence to tackle bullying. In Studies III and IV, we investigated factors that affect the implementation rate of these proactive and preventive actions. In Study III, we found that the principal's commitment and support for antibullying work has a clear-cut positive effect on implementation adherence of the student lessons of the KiVa antibullying program. The more teachers experience support for and commitment to antibullying work from their principal, the more they report having covered KiVa student lessons and topics. In Study IV, we wanted to find out why some schools implement several useful and inexpensive transition practices, whereas other schools use only a few of them. We were interested in broadening the scope and looking at local-level (exosystem) qualities, and, in fact, the local-level activities and guidelines, along with teacher-reported importance of the transition practices, were the only factors significantly associated with the implementation rate of transition practices between elementary schools and partner preschools. Teacher- and school-level factors available in this study turned out to be mostly not significant. To summarize, the results confirm that school-based promotion and prevention activities may have beneficial effects not only on students but also on teachers. Second, various top-down processes, such as engagement at the level of elementary school principals or local administration, may enhance the implementation of these beneficial activities. The main message is that, when aiming to support the lives of children, the primary focus should be on adults. In the future, the promotion of psychosocial well-being and the intrinsic value of inter- and intrapersonal skills need to be strengthened in the Finnish educational system.
Future research efforts in student welfare and school psychology, as well as focused training for psychologists in educational contexts, should be encouraged in the departments of psychology and education in Finnish universities. Moreover, a specific research centre for school health and well-being should be established.
Abstract:
Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
Dopaminergic neurotransmission is involved in the regulation of sleep. In particular, the nigrostriatal pathway is an important center of sleep regulation. We hypothesized that dopaminergic neurons located in the substantia nigra pars compacta (SNpc) could be activated by gentle handling, a method to obtain sleep deprivation (SD). Adult male C57BL/6J mice (N = 5/group) were distributed into non-SD (NSD) or SD groups. SD animals were subjected to SD once for 1 or 3 h by gentle handling. Two experiments were performed. The first determined the activation of SNpc neurons after SD, and the second examined the same parameters after pharmacologically induced dopaminergic depletion using intraperitoneal reserpine (2 mg/kg). After 1 or 3 h, SD and NSD mice were subjected to motor evaluation using the open field test. Immediately after the behavioral test, the mice were perfused intracardially to fix the brain for immunohistochemical analysis of c-Fos protein expression within the SNpc. The open field test indicated that SD for 1 or 3 h did not modify motor behavior. However, c-Fos protein expression was increased after 1 h of SD compared with the NSD and 3-h SD groups. These immunohistochemistry data indicate that these periods of SD are not able to produce dopaminergic supersensitivity. Nevertheless, the increased expression of c-Fos within the SNpc suggests that dopaminergic nigral activation was triggered by SD earlier than motor responsiveness. Dopamine-depleted mice (experiment 2) exhibited a similar increase of c-Fos expression compared to control animals, indicating that dopamine neurons are still activated in the 1-h SD group despite the exhaustion of dopamine. This finding suggests that this range (2-5-fold) of neuronal activation may serve as a marker of SD.
Abstract:
This research investigates what benefits employees expect the organization of data governance to bring to an organization, and how it supports the implementation of automated marketing capabilities. The quality and usability of data are crucial for organizations to meet various business needs. Organizations have ever more data and technology available that can be utilized, for example, in automated marketing. Data governance addresses the organization of decision rights and accountabilities for the management of an organization's data assets. Automated marketing means sending the right message, to the right person, at the right time, automatically. The research is a single case study conducted in a Finnish ICT company. The case company was starting to organize data governance and implement automated marketing capabilities at the time of the research. The empirical material consists of interviews with employees of the case company. Content analysis is used to interpret the interviews in order to answer the research questions. The theoretical framework of the research is derived from the morphology of data governance. The findings indicate that the employees expect the organization of data governance to, among other things, improve customer experience, improve sales, provide the ability to identify an individual customer's life situation, ensure that data are handled according to regulations, and improve operational efficiency. The organization of data governance is expected to solve problems in customer data quality that currently hinder the implementation of automated marketing capabilities.
Abstract:
The recent rapid development of biotechnological approaches has enabled the production of large whole-genome-level biological data sets. In order to handle these data sets, reliable and efficient automated tools and methods for data processing and result interpretation are required. Bioinformatics, as the field of studying and processing biological data, tries to answer this need by combining methods and approaches across computer science, statistics, mathematics and engineering. The need is also increasing for tools that can be used by the biological researchers themselves, who may not have a strong statistical or computational background, which requires creating tools and pipelines with intuitive user interfaces, robust analysis workflows and a strong emphasis on result reporting and visualization. Within this thesis, several data analysis tools and methods have been developed for analyzing high-throughput biological data sets. These approaches, covering several aspects of high-throughput data analysis, are specifically aimed at gene expression and genotyping data, although in principle they are suitable for analyzing other data types as well. Coherent handling of the data across the various data analysis steps is highly important in order to ensure robust and reliable results. Thus, robust data analysis workflows are also described, putting the developed tools and methods into a wider context. The choice of the correct analysis method may also depend on the properties of the specific data set, and therefore guidelines for choosing an optimal method are given. The data analysis tools, methods and workflows developed within this thesis have been applied to several research studies, of which two representative examples are included in the thesis. The first study focuses on spermatogenesis in murine testis and the second one examines cell lineage specification in mouse embryonic stem cells.
Abstract:
Fluid handling systems account for a significant share of the global consumption of electrical energy. They also suffer from problems which reduce their energy efficiency and increase life-cycle costs. Detecting or predicting these problems in time can make fluid handling systems more environmentally and economically sustainable to operate. In this Master's Thesis, significant problems in fluid systems were studied and the possibilities of developing variable-speed-drive-based detection methods for them were discussed. A literature review was conducted to find significant problems occurring in fluid handling systems containing pumps, fans and compressors. To find case examples for evaluating the feasibility of variable-speed-drive-based methods, queries were sent to industrial companies. As a result, the possibility of detecting heat exchanger fouling with a variable-speed drive was analysed with data from three industrial cases. It was found that a mass flow rate estimate, which can be generated with a variable-speed drive, can be used together with temperature measurements to monitor a heat exchanger's thermal performance. Secondly, it was found that the fouling-related increase in the pressure drop of a heat exchanger can be monitored with a variable-speed drive. Lastly, for systems where the flow device is speed-controlled based on a pressure measurement, it was concluded that an increasing rotational speed can be interpreted as progressing fouling in the heat exchanger.
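The thermal-performance monitoring idea can be sketched as follows: compute the heat duty from the drive's mass flow estimate and the temperature measurements, then the overall heat transfer coefficient times area as UA = Q / LMTD; a downward UA trend signals fouling. All readings below are hypothetical, not from the thesis's industrial cases:

```python
import math

def ua_estimate(m_dot, cp, t_cold_in, t_cold_out, t_hot_in, t_hot_out):
    """Heat duty from the cold side (m_dot in kg/s from the VSD flow
    estimate, cp in kJ/kg.K), then UA = Q / LMTD in kW/K for a
    counter-flow heat exchanger."""
    q = m_dot * cp * (t_cold_out - t_cold_in)           # duty, kW
    dt1 = t_hot_in - t_cold_out                         # terminal diff 1
    dt2 = t_hot_out - t_cold_in                         # terminal diff 2
    lmtd = (dt1 - dt2) / math.log(dt1 / dt2) if dt1 != dt2 else dt1
    return q / lmtd

# Hypothetical readings, same flows, degraded heat transfer over time
clean  = ua_estimate(10.0, 4.19, 20.0, 40.0, 80.0, 55.0)
fouled = ua_estimate(10.0, 4.19, 20.0, 35.0, 80.0, 60.0)
print(clean > fouled)  # True: fouling reduces the apparent UA
```

The pressure-drop route mentioned in the abstract is complementary: it catches fouling that restricts flow even when the thermal signal is ambiguous.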
Abstract:
Posiva Oy’s final disposal facility’s encapsulation plant will start to operate in the 2020s. Once operation starts, the facility is designed to run for more than a hundred years. The encapsulation plant will be the first of its kind in the world, forming part of the solution to the global issue of the final disposal of nuclear waste. In the encapsulation plant’s fuel handling cell, the spent nuclear fuel will be processed to be deposited into the Finnish bedrock, into ONKALO. In the fuel handling cell the environment is highly radioactive, forming a permit-required enclosed space. Remote observation is needed in order to monitor the fuel handling process. The purpose of this thesis is to map (Part I) and compare (Part II) remote observation methods for observing the process in Posiva Oy’s fuel handling cell, and to provide a possible theoretical solution for this case. A secondary purpose of this thesis is to provide resources for other remote observation cases, as well as to inform about possible future technology to enable readiness in the design of the encapsulation plant. The approach was to theoretically analyze the mapped remote observation methods. Firstly, the methods were filtered by three environmental challenges: the high levels of radiation, the permit-required confined space and the hundred-year timespan. Secondly, the most promising methods were selected by the experts designing the facility. Thirdly, a customized feasibility analysis was created and performed on the selected methods to rank them with scores. The results are the mapped methods and the feasibility analysis scores. The three highest-scoring methods were a radiation-tolerant camera, a fiberscope and an audio feed. A combination of these three methods was given as a possible theoretical solution for this case. As this case is the first in the world, remote observation methods for it had not been thoroughly researched.
The findings in this thesis will act as initial data for the design of the fuel handling cell's remote observation systems and can potentially affect the overall design of the facility by providing unique and case-specific information. In addition, this thesis could provide resources for other remote observation cases.
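A weighted-scoring sketch of such a feasibility analysis follows; the criteria, weights and scores are invented for illustration (the thesis used its own customized analysis and criteria):

```python
# Hypothetical weighted feasibility scoring of remote observation
# methods; every number here is an illustrative assumption.
criteria_weights = {"radiation_tolerance": 0.4, "lifespan": 0.35, "enclosed_space": 0.25}

methods = {
    "radiation-tolerant camera": {"radiation_tolerance": 9, "lifespan": 7, "enclosed_space": 8},
    "fiberscope":                {"radiation_tolerance": 8, "lifespan": 8, "enclosed_space": 7},
    "audio feed":                {"radiation_tolerance": 7, "lifespan": 8, "enclosed_space": 7},
    "standard CCTV":             {"radiation_tolerance": 2, "lifespan": 4, "enclosed_space": 8},
}

def score(method):
    """Weighted sum of per-criterion scores for one method."""
    return sum(criteria_weights[c] * v for c, v in methods[method].items())

ranked = sorted(methods, key=score, reverse=True)
print(ranked[0])  # radiation-tolerant camera
```

Separating weights from scores makes it easy to re-rank the methods when the facility's designers revise how much each environmental challenge should count.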