979 results for Methods : Statistical
Abstract:
This Ph.D. thesis investigates chemical and sensory analytical parameters linked to the quality and purity of different categories of oils obtained from olives: extra virgin olive oils (EVOOs), both those sold in the large retail trade (supermarkets and discount stores) and those collected directly at Italian mills, and lower-quality oils (refined, lampante and “repaso”). Alongside traditional, well-known analytical procedures such as gas chromatography and high-performance liquid chromatography, I set up innovative, fast and environmentally friendly methods. For example, I developed analytical approaches based on Fourier transform mid-infrared spectroscopy (FT-MIR) and time domain reflectometry (TDR), coupled with robust chemometric processing of the results. I also investigated other freshness and quality markers that are not among the official parameters in Italian and European regulations. This full chemical and sensory analytical plan yielded information about the quality of EVOOs, mostly within the Italian market, where the range of quality proved very wide in terms of sensory attributes, price classes and chemical parameters. In collaboration with other Italian and foreign research groups, I carried out several applied studies, focusing especially on the shelf life of oils obtained from olives and on the effects of thermal stress on product quality. I also studied innovative technological treatments, such as clarification with inert gases as an alternative to traditional filtration.
Moreover, during a three-and-a-half-month research stay at the University of Applied Sciences in Zurich, I carried out a study on the application of statistical methods to the processing of sensory results obtained from the official Swiss Panel and from consumer tests.
Abstract:
In this work, new tools for atmospheric pollutant sampling and analysis were applied to deepen a source apportionment study. The project focused on atmospheric emission sources in a suburban area influenced by a municipal solid waste incinerator (MSWI), a medium-sized coastal tourist town and a motorway. Two main research lines were followed. In the first, the potential of PM samplers coupled with a wind select sensor was assessed. Results showed that they may be a valid support in source apportionment studies, although meteorological and territorial conditions could strongly affect the results. New markers were also investigated, with a particular focus on biomass burning. OC proved to be a good indicator of biomass combustion, as did all the organic compounds determined. Among metals, lead and aluminium were well related to biomass combustion. Surprisingly, PM was not enriched in potassium during the bonfire event. The second research line consisted of the application of Positive Matrix Factorization (PMF), a new statistical tool for data analysis. The technique was applied to datasets with different time resolutions. PMF applied to atmospheric deposition fluxes identified six main sources affecting the area; the incinerator’s relative contribution seemed to be negligible. PMF analysis was then applied to PM2.5 collected with samplers coupled with a wind select sensor. The larger number of environmental indicators determined yielded more detailed results on the sources affecting the area. Vehicular traffic proved to be the source of greatest concern for the study area. In this case too, the incinerator’s relative contribution seemed to be negligible. Finally, the application of PMF analysis to hourly aerosol data demonstrated that the higher the temporal resolution of the data, the closer the source profiles were to the real ones.
Abstract:
The thesis is concerned with local trigonometric regression methods. The aim was to develop a method for extracting cyclical components from time series. The main results of the thesis are the following. First, a generalization of the filter proposed by Christiano and Fitzgerald is provided for the smoothing of ARIMA(p,d,q) processes. Second, a local trigonometric filter is built, together with its statistical properties. Third, the convergence properties of trigonometric estimators and the problem of choosing the order of the model are discussed. A large-scale simulation experiment was designed to assess the performance of the proposed models and methods. The results show that local trigonometric regression may be a useful tool for periodic time series analysis.
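As a rough illustration of what a local trigonometric fit involves, the sketch below estimates the cyclical component at one time point by weighted least squares on a constant, a cosine and a sine term inside a moving window. It is a minimal sketch under stated assumptions, not the filter developed in the thesis: the single known frequency `omega`, the triangular kernel, the window `half_width` and all function names are illustrative choices that do not come from the abstract.

```python
import math


def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def local_trig_cycle(y, t0, omega, half_width):
    """Estimate the cycle at t0 from a local fit
    y_t ~ c + a*cos(omega*(t - t0)) + b*sin(omega*(t - t0))."""
    lo = max(0, t0 - half_width)
    hi = min(len(y) - 1, t0 + half_width)
    xtx = [[0.0] * 3 for _ in range(3)]
    xty = [0.0] * 3
    for t in range(lo, hi + 1):
        u = t - t0
        # Triangular kernel weight: an assumption made for this sketch.
        w = 1.0 - abs(u) / (half_width + 1)
        x = [1.0, math.cos(omega * u), math.sin(omega * u)]
        for i in range(3):
            xty[i] += w * x[i] * y[t]
            for j in range(3):
                xtx[i][j] += w * x[i] * x[j]
    c, a, b = solve_linear(xtx, xty)
    # At u = 0 the fitted cycle is a*cos(0) + b*sin(0) = a.
    return a


# Hypothetical monthly series: level 2 plus a cycle of amplitude 3 and period 12.
omega = 2 * math.pi / 12
series = [2.0 + 3.0 * math.cos(omega * t) for t in range(48)]
estimate = local_trig_cycle(series, t0=24, omega=omega, half_width=6)
```

Because the toy series lies exactly in the span of the three regressors, the local fit separates the level from the cycle; with real data the choice of window and model order matters, which is the problem the abstract mentions.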
Abstract:
Many developing countries face a crisis in water management due to population growth, water scarcity, water contamination and the effects of the world economic crisis. Water distribution systems in developing countries face many challenges in efficient repair and rehabilitation because information on the water network is very limited, which makes rehabilitation assessment plans very difficult. In developed countries, sufficient information and advanced technology make rehabilitation assessment easy. Developing countries have many difficulties in assessing water networks affected by system failure, deterioration of mains and poor water quality due to pipe corrosion and deterioration. The limited information highlights the urgent need for an economical rehabilitation assessment of water distribution systems adapted to water utilities. The Gaza Strip is the first case study; it suffers from a severe shortage in water supply, environmental problems and contamination of groundwater resources. This research focuses on improving the water supply network to reduce water losses, based on a limited database and using ArcGIS and commercial water network software (WaterCAD). A new approach to the rehabilitation of water pipes is presented for the Gaza City case study. An integrated rehabilitation assessment model has been developed for water pipes, comprising three components: a hydraulic assessment model, a physical assessment model and a structural assessment model. A WaterCAD model integrated with ArcGIS has been developed to produce the hydraulic assessment model for the water network. The model is based on pipe condition assessment, with 100 points as the maximum score for pipe condition. The results indicate that 40% of the water pipelines score less than 50 points and about 10% of the total pipe length scores less than 30 points.
Using this model, rehabilitation plans for each region of Gaza City can be drawn up based on the available budget and the condition of the pipes. The second case study is Kuala Lumpur, representing semi-developed countries, which was used to develop an approach for improving the water network under critical conditions using advanced statistical and GIS techniques. Kuala Lumpur (KL) has water losses of about 40% and a high failure rate, which is a severe problem. This case can represent cases in South Asian countries. Kuala Lumpur has faced major challenges in reducing water losses in its network during the last 5 years. One of these challenges is the severe deterioration of asbestos cement (AC) pipes: more than 6500 km of AC pipes need to be replaced, which requires a huge budget. Asbestos cement is subject to deterioration due to various chemical processes that either leach out the cement material or penetrate the concrete to form products that weaken the cement matrix. This case presents a geo-statistical approach to modelling pipe failures in a water distribution network. The database of Syabas (the Kuala Lumpur water company) was used to develop the model. The statistical models were calibrated, verified and used to predict failures for both networks and individual pipes. The mathematical formulation developed for failure frequency in Kuala Lumpur was based on different pipeline characteristics, reflecting several factors such as pipe diameter, length, pressure and failure history. Generalized linear models were applied to predict pipe failures at the District Meter Zone (DMZ) and individual pipe levels. The Kuala Lumpur case study yielded several outputs and implications. Correlations between spatial and temporal intervals of pipe failures were also analysed using ArcGIS software.
A Water Pipe Assessment Model (WPAM) has been developed from the analysis of historical pipe failures in Kuala Lumpur; it prioritizes pipe rehabilitation candidates using a ranking system. The Frankfurt water network in Germany is the third main case study. This case gives an overview of the survival analysis and neural network methods used for water networks. Rehabilitation strategies for water pipes were developed for the Frankfurt water network in cooperation with Mainova (the Frankfurt water company). This thesis also presents a methodology for the technical condition assessment of plastic pipes based on simple analysis. The thesis aims to contribute to improving the prediction of pipe failures in water networks using Geographic Information Systems (GIS) and Decision Support Systems (DSS). The output of the technical condition assessment model can be used to estimate future budget needs for rehabilitation and to identify pipes in poor condition that have high priority for replacement.
Abstract:
Information is nowadays a key resource: machine learning and data mining techniques have been developed to extract high-level information from large amounts of data. As most data comes in the form of unstructured text in natural languages, research on text mining is currently very active and deals with practical problems. Among these, text categorization deals with the automatic organization of large quantities of documents into predefined taxonomies of topic categories, possibly arranged in large hierarchies. In commonly proposed machine learning approaches, classifiers are automatically trained from pre-labeled documents: they can perform very accurate classification, but often require a sizeable training set and notable computational effort. Methods for cross-domain text categorization have been proposed, which make it possible to leverage a set of labeled documents from one domain to classify those of another. Most methods use advanced statistical techniques, usually involving parameter tuning. A first contribution presented here is a method based on nearest centroid classification, where profiles of categories are generated from the known domain and then iteratively adapted to the unknown one. Despite being conceptually simple and having easily tuned parameters, this method achieves state-of-the-art accuracy on most benchmark datasets with fast running times. A second, deeper contribution involves the design of a domain-independent model to distinguish the degree and type of relatedness between arbitrary documents and topics, inferred from the different types of semantic relationships between their respective representative words, identified by specific search algorithms. The application of this model is tested on both flat and hierarchical text categorization, where it potentially allows the efficient addition of new categories during classification.
Results show that classification accuracy still requires improvement, but models generated from one domain can be effectively reused in a different one.
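The idea of building category profiles on a known domain and then adapting them iteratively to an unknown one can be sketched as follows. This is a minimal illustration under assumptions, not the method evaluated in the thesis: the raw term-frequency vectors, the cosine similarity, the rule of rebuilding each profile from the target documents currently assigned to it, and the toy vocabulary are all hypothetical choices.

```python
import math
from collections import Counter, defaultdict


def centroid(docs):
    """Average the term-frequency vectors of a list of tokenized documents."""
    total = Counter()
    for doc in docs:
        total.update(doc)
    return {term: count / len(docs) for term, count in total.items()}


def cosine(u, v):
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def classify(doc, centroids):
    """Return the label of the most similar profile (ties go to the first label alphabetically)."""
    vector = Counter(doc)
    return max(sorted(centroids), key=lambda label: cosine(vector, centroids[label]))


def adapt_centroids(source, target_docs, max_rounds=10):
    """source maps each label to tokenized, labeled documents of the known domain."""
    centroids = {label: centroid(docs) for label, docs in source.items()}
    labels = None
    for _ in range(max_rounds):
        new_labels = [classify(doc, centroids) for doc in target_docs]
        if new_labels == labels:
            break  # assignments are stable, so the profiles would not change
        labels = new_labels
        groups = defaultdict(list)
        for doc, label in zip(target_docs, labels):
            groups[label].append(doc)
        # Assumption: a profile is rebuilt from target documents only;
        # labels that received no documents keep their previous profile.
        for label, docs in groups.items():
            centroids[label] = centroid(docs)
    return labels, centroids


# Hypothetical toy data: labeled source domain, unlabeled target domain.
source = {
    "sport": [["match", "goal", "team"], ["team", "coach", "goal"]],
    "tech": [["software", "code", "bug"], ["code", "release", "software"]],
}
target = [
    ["team", "fixture", "league"],
    ["goal", "league", "striker"],
    ["code", "patch", "kernel"],
    ["software", "kernel", "driver"],
]
labels, adapted = adapt_centroids(source, target)
```

After adaptation the profiles contain target-domain vocabulary, so a document such as `["fixture", "striker"]`, which shares no word with the source documents, can still be matched to a category.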
Abstract:
Revision hip arthroplasty is a surgical procedure that reconstructs the hip joint by replacing a damaged hip prosthesis. Several factors may give rise to the failure of the artificial device: aseptic loosening, infection and dislocation are the principal causes of failure worldwide. The main effect is the development of bone defects in the region closest to the prosthesis, which weaken the bone structure needed for the biological fixation of the new artificial hip. For this reason, bone reconstruction is necessary before the surgical revision operation. This work arises from the need to test the effects of bone reconstruction for particular bone defects in the acetabulum after hip prosthesis revision. To perform biomechanical in vitro tests on hip prostheses implanted in a human pelvis or hemipelvis, a practical definition of a reference frame for this kind of bone specimen is required. The aim of the current study is to create a repeatable protocol for aligning hemipelvic samples in the testing machine that relies on a reference system based on anatomical landmarks of the human pelvis. Chapter 1 presents a general overview of the human pelvic bone: anatomy, bone structure, loads and the principal devices for hip joint replacement. The purpose of chapter 2 is to identify the most common causes of revision hip arthroplasty by analysing data from the most reliable orthopaedic registries in the world. Chapter 3 presents an overview of the most widely used classifications of acetabular bone defects and fractures and the most common techniques for acetabular and bone reconstruction. After a critical review of the scientific literature on reference frames for the human pelvis, chapter 4 proposes the definition of a new reference frame. Based on this reference frame, the alignment protocol for the human hemipelvis is presented, together with the statistical analysis that confirms the good repeatability of the method.
Abstract:
Statistical models have recently been introduced in computational orthopaedics to investigate bone mechanical properties across several populations. A fundamental aspect of the construction of statistical models is the establishment of accurate anatomical correspondences among the objects of the training dataset. Various methods have been proposed to solve this problem, such as mesh morphing or image registration algorithms. The objective of this study is to compare a mesh-based and an image-based statistical appearance model approach for the creation of finite element (FE) meshes. A computed tomography (CT) dataset of 157 human left femurs was used for the comparison. For each approach, 30 finite element meshes were generated with the models. The quality of the resulting FE meshes was evaluated in terms of volume, size and shape of the elements. Results showed that the quality of the meshes obtained with the image-based approach was higher than that of the meshes obtained with the mesh-based approach. Future studies are required to evaluate the impact of this finding on the final mechanical simulations.
Abstract:
The aim of this in vitro study was to assess the agreement among four techniques used as gold standards for the validation of methods for occlusal caries detection. Sixty-five human permanent molars were selected and one site on each occlusal surface was chosen as the test site. The teeth were cut and prepared according to each technique: stereomicroscopy without coloring (1), dye enhancement with rhodamine B (2) and fuchsine/acetic light green (3), and semi-quantitative microradiography (4). Digital photographs of each prepared tooth were assessed by three examiners for caries extension. Weighted kappa, as well as Friedman's test with multiple comparisons, was used to compare all techniques and test for statistically significant differences. Results: kappa values varied from 0.62 to 0.78, the latter being found for both dye enhancement methods. Friedman's test showed a statistically significant difference (P < 0.001), and multiple comparisons identified differences among all techniques except between the two dye enhancement methods (rhodamine B and fuchsine/acetic light green). Cross-tabulation showed that stereomicroscopy overscored the lesions, while both dye enhancement methods showed good agreement. Furthermore, the outcome of caries diagnostic tests may be influenced by the validation method applied. Dye enhancement methods seem to be reliable as gold standard methods.
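For readers unfamiliar with the agreement statistic used above, the following sketch computes a weighted kappa for two sets of ordinal scores. It is only an illustration: the abstract does not state which weighting scheme was used, so linear disagreement weights are assumed here, and the scores and the function name are hypothetical.

```python
def weighted_kappa(scores_a, scores_b, n_categories):
    """Linear-weighted kappa for two lists of ordinal scores coded 0..n_categories-1."""
    k = n_categories
    n = len(scores_a)
    # Observed joint proportions of the two sets of scores.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(scores_a, scores_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)  # linear disagreement weight (assumption)
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * row[i] * col[j]
    # Undefined (division by zero) if every score falls in one single category.
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical lesion-depth scores (0 = sound, 2 = deepest) from two techniques.
technique_a = [0, 0, 1, 1, 2, 2]
technique_b = [0, 1, 1, 2, 2, 2]
kappa = weighted_kappa(technique_a, technique_b, n_categories=3)
```

With weights of this kind, a disagreement between adjacent categories is penalized less than one between distant categories, which suits ordinal caries-extension scores.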
Abstract:
The objective of this study was to estimate the potential of method restriction as a public health strategy in suicide prevention. Data from the Swiss Federal Statistical Office and the Swiss Institutes of Forensic Medicine from 2004 were gathered and categorized into suicide submethods according to accessibility to restriction of means. Of suicides in Switzerland, 39.2% are accessible to method restriction. The highest proportions were found in private weapons (13.2%), army weapons (10.4%), and jumps from hot-spots (4.6%). The presented method permits the estimation of the suicide prevention potential of a country by method restriction and the comparison of restriction potentials between suicide methods. In Switzerland, reduction of firearm suicides has the highest potential to reduce the total number of suicides.
Abstract:
The project aimed to use results on the contamination of city vegetation with heavy metals and sulphur compounds as the basis for analysing the integral response of trees and shrubs to contamination, through a complex method of phytoindication. The results were used to draw up recommendations on pollution reduction in the city and to develop phytoindication as a means of monitoring environmental pollution in St. Petersburg and other large cities. Field investigations were carried out in August 1996, and 66 descriptions of green areas were made in order to estimate the functional state of plants in the Vasileostrovsky district. Investigations of the spectral reflectance properties of plants showed considerable variation in leaf albedo values under the influence of various internal and external factors. The results indicated that lime trees most closely reflect the condition of the environment. Practically all the green areas studied were in poor condition, the only exceptions being areas of ash trees, which are more resistant to environmental pollution, and one lime-tree alley in a comparatively unpolluted street. The study identified which types of trees are more or less resistant to complex environmental pollution, and Ms. Terekhina recommends that the species in the present green areas be changed to include a higher number of the more resistant species. The turbidimetric analysis of tree bark for sulphates gave an indication of the level and spatial distribution of each pollutant, and the results also confirmed other findings that electrical conductivity is a significant feature in determining the extent of sulphate pollution. In testing for various metals, the lime tree showed the highest contents of all elements except magnesium, copper, zinc, cadmium and strontium, again confirming the species' vulnerability to pollution.
Mean concentration rates in the city and its environs showed that city plants concentrate 3 times as many different elements, and 10 times more chromium, copper and lead, than those in the suburbs. The second stage of the study was based on the concept of phytoindication, which presupposes that changes in the ratios of chemical elements in regional biological circulation under the influence of technogenesis provide a criterion for predicting shifts in people's health. There are certain basic factors in this concept. The first is that all living beings are related ecologically as well as by their evolutionary origin, and that the lower an organism is on the evolutionary scale, the less adaptational reserve it has. The second is that smaller concentrations of chemical elements are needed for a toxicological influence on plants than on people, so the former's reactions to geochemical factors are easier to characterise. Visual indicator features of urban plants are well defined and can form the basis of a complex "environment - public health" analysis. Specific plant reactions reflecting atmospheric pollution and other components of urbogeosystems make it possible to determine indication criteria for predicting possible disturbances in the general state of health of the population. Thirdly, the results of phytoindication investigations must be taken together with information about public health in the area. It only proved possible to analyse general indexes of public health based on statistical data from the late 1980s and early 1990s, as the data from later years were greatly influenced by social factors. These data show that the rates of illness in St. Petersburg (especially for children) are higher than in Russia as a whole for most classes of diseases, indicating that the population there is more sensitive to the ecological state of the urban environment.
The Vasileostrovsky district had the second highest sickness rate for adults, while its rate of infant mortality in the first year of life was the highest. Ms. Terekhina recommends further studies to assess more precisely the effectiveness of the methods she tested, but has drawn up a proposed map of environmental hazard for the population, taking into account prevailing wind directions.
Abstract:
We derive a new class of iterative schemes for accelerating the convergence of the EM algorithm by exploiting the connection between fixed point iterations and extrapolation methods. First, we present a general formulation of one-step iterative schemes, which are obtained by cycling with the extrapolation methods. We then square the one-step schemes to obtain the new class of methods, which we call SQUAREM. Squaring a one-step iterative scheme is simply applying it twice within each cycle of the extrapolation method. Here we focus on the first order or rank-one extrapolation methods for two reasons: (1) simplicity, and (2) computational efficiency. In particular, we study two first order extrapolation methods, the reduced rank extrapolation (RRE1) and minimal polynomial extrapolation (MPE1). The convergence of the new schemes, both one-step and squared, is non-monotonic with respect to the residual norm. The first order one-step and SQUAREM schemes are linearly convergent, like the EM algorithm, but they have a faster rate of convergence. We demonstrate, through five different examples, the effectiveness of the first order SQUAREM schemes, SqRRE1 and SqMPE1, in accelerating the EM algorithm. The SQUAREM schemes are also shown to be vastly superior to their one-step counterparts, RRE1 and MPE1, in terms of computational efficiency. The proposed extrapolation schemes can fail due to the numerical problems of stagnation and near breakdown. We have developed a new hybrid iterative scheme that combines the RRE1 and MPE1 schemes in such a manner that it overcomes both stagnation and near breakdown. The squared first order hybrid scheme, SqHyb1, emerges as the iterative scheme of choice based on our numerical experiments. It combines the fast convergence of SqMPE1, while avoiding near breakdowns, with the stability of SqRRE1, while avoiding stagnations. The SQUAREM methods can be incorporated very easily into an existing EM algorithm.
They require only the basic EM step for their implementation and do not require any auxiliary quantities such as the complete data log likelihood or its gradient or Hessian. They are an attractive option in problems with a very large number of parameters, and in problems where the statistical model is complex, the EM algorithm is slow and each EM step is computationally demanding.
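The claim that only the basic EM step is needed can be made concrete with a small sketch of one squared extrapolation cycle wrapped around an arbitrary fixed-point map. Everything specific here is an assumption rather than something the abstract states: the step-length formulas attached to the names SqRRE1 and SqMPE1 follow the form usually given for first order squared schemes, the extra map evaluation after each extrapolation is a common stabilising choice, and the linear `toy_map` merely stands in for an EM step. The hybrid scheme SqHyb1 is not attempted.

```python
import math


def squarem(step, theta, scheme="SqRRE1", tol=1e-10, max_cycles=1000):
    """Accelerate theta <- step(theta). Returns (theta, number of step evaluations)."""
    evaluations = 0
    for _ in range(max_cycles):
        theta1 = step(theta)
        theta2 = step(theta1)
        evaluations += 2
        r = [b - a for a, b in zip(theta, theta1)]
        v = [(c - b) - (b - a) for a, b, c in zip(theta, theta1, theta2)]
        rr = sum(x * x for x in r)
        rv = sum(x * y for x, y in zip(r, v))
        vv = sum(x * x for x in v)
        if math.sqrt(rr) < tol:
            return theta1, evaluations
        if rv == 0.0 or vv == 0.0:
            theta = theta2  # extrapolation undefined: fall back to two plain steps
            continue
        # Assumed step lengths: r.v / v.v for SqRRE1, r.r / r.v for SqMPE1.
        alpha = rv / vv if scheme == "SqRRE1" else rr / rv
        squared = [a - 2.0 * alpha * ri + alpha * alpha * vi
                   for a, ri, vi in zip(theta, r, v)]
        theta = step(squared)  # one extra map evaluation to stabilise the iterate
        evaluations += 1
    return theta, evaluations


def plain_iteration(step, theta, tol=1e-10, max_iter=100000):
    """Unaccelerated fixed-point iteration, used as a baseline."""
    for count in range(1, max_iter + 1):
        new = step(theta)
        if math.sqrt(sum((b - a) ** 2 for a, b in zip(theta, new))) < tol:
            return new, count
        theta = new
    return theta, max_iter


def toy_map(theta):
    """Hypothetical slowly converging linear map with fixed point (10, 4)."""
    return [0.9 * theta[0] + 1.0, 0.5 * theta[1] + 2.0]


accelerated, fast_evals = squarem(toy_map, [0.0, 0.0])
baseline, slow_evals = plain_iteration(toy_map, [0.0, 0.0])
```

The wrapper only ever calls `step`, so an existing EM update can be passed in unchanged; no likelihood, gradient or Hessian is required.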
Abstract:
This article gives an overview of the methods used in the low-level analysis of gene expression data generated using DNA microarrays. This type of experiment makes it possible to determine relative levels of nucleic acid abundance in a set of tissues or cell populations for thousands of transcripts or loci simultaneously. Careful statistical design and analysis are essential to improve the efficiency and reliability of microarray experiments throughout the data acquisition and analysis process. This includes the design of probes, the experimental design, the image analysis of microarray scanned images, the normalization of fluorescence intensities, the assessment of the quality of microarray data and incorporation of quality information in subsequent analyses, the combination of information across arrays and across sets of experiments, the discovery and recognition of patterns in expression at the single gene and multiple gene levels, and the assessment of the significance of these findings, given that the data contain considerable noise and therefore random features. For all of these components, access to a flexible and efficient statistical computing environment is essential.
Abstract:
For various reasons, it is important, if not essential, to integrate the computations and code used in data analyses, methodological descriptions, simulations, etc. with the documents that describe and rely on them. This integration allows readers to both verify and adapt the statements in the documents. Authors can easily reproduce them in the future, and they can present the document's contents in a different medium, e.g. with interactive controls. This paper describes a software framework for authoring and distributing these integrated, dynamic documents that contain text, code, data, and any auxiliary content needed to recreate the computations. The documents are dynamic in that the contents, including figures, tables, etc., can be recalculated each time a view of the document is generated. Our model treats a dynamic document as a master or "source" document from which one can generate different views in the form of traditional, derived documents for different audiences. We introduce the concept of a compendium as both a container for the different elements that make up the document and its computations (i.e. text, code, data, ...), and as a means for distributing, managing and updating the collection. The step from disseminating analyses via a compendium to reproducible research is a small one. By reproducible research, we mean research papers with accompanying software tools that allow the reader to directly reproduce the results and employ the methods that are presented in the research paper. Some of the issues involved in paradigms for the production, distribution and use of such reproducible research are discussed.
Abstract:
Studies of chronic life-threatening diseases often involve both mortality and morbidity. In observational studies, the data may also be subject to administrative left truncation and right censoring. Since mortality and morbidity may be correlated and mortality may censor morbidity, the Lynden-Bell estimator for left truncated and right censored data may be biased for estimating the marginal survival function of the non-terminal event. We propose a semiparametric estimator for this survival function based on a joint model for the two time-to-event variables, which utilizes the gamma frailty specification in the region of the observable data. Firstly, we develop a novel estimator for the gamma frailty parameter under left truncation. Using this estimator, we then derive a closed form estimator for the marginal distribution of the non-terminal event. The large sample properties of the estimators are established via asymptotic theory. The methodology performs well with moderate sample sizes, both in simulations and in an analysis of data from a diabetes registry.