971 results for statistical modelling
Abstract:
This article considers alternative methods for calculating the fair premium rate of crop insurance contracts based on county yields. The premium rate was calculated using parametric and nonparametric approaches to estimate the conditional density of agricultural yield. These methods were applied to county yield data provided by the Brazilian Institute of Geography and Statistics (IBGE) for soybean, corn and wheat in the State of Paraná, covering 1990 through 2002. We propose methodological alternatives for pricing crop insurance contracts that result in more accurate premium rates when data are limited.
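As a rough illustration of the nonparametric idea, the sketch below estimates a premium rate as the expected shortfall below a guaranteed yield, divided by that guarantee, using the empirical distribution of observed yields. The formula, the `coverage` parameter and the function name are assumptions made for illustration; the abstract does not state the authors' actual formulation or density estimator.

```python
def empirical_premium_rate(yields, coverage=0.7):
    """Expected indemnity per unit of liability under the empirical yield distribution.

    Assumptions (not from the abstract): the guarantee is
    coverage * mean(yields), and the indemnity is the shortfall of the
    realised yield below that guarantee.
    """
    if not yields:
        raise ValueError("yields must be non-empty")
    guarantee = coverage * sum(yields) / len(yields)
    # Average shortfall below the guarantee, with each observation weighted equally.
    expected_loss = sum(max(guarantee - y, 0.0) for y in yields) / len(yields)
    return expected_loss / guarantee
```

With only 13 annual observations per county, an equally weighted empirical estimate like this is very coarse, which is presumably why smoother density estimates matter in the limited-data setting the abstract describes.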
Abstract:
The goal of this work is to build a statistical model, based only on easily computable parameters of a constraint satisfaction problem (CSP), that predicts the runtime behaviour of solving algorithms and lets us choose the best algorithm for a given problem. Although MAC might seem the obvious choice, experimental results obtained so far show that, with large numbers of variables, other algorithms perform much better, especially on hard problems in the phase transition region.
Abstract:
Lecture notes for a first-year statistical modelling course.
Abstract:
The impact of projected climate change on wine production was analysed for the Demarcated Region of Douro, Portugal. A statistical grapevine yield model (GYM) was developed using climate parameters as predictors. Statistically significant correlations were identified between annual yield and monthly mean temperatures and monthly precipitation totals during the growing cycle. These atmospheric factors control grapevine yield in the region, with the GYM explaining 50.4% of the total variance in the yield time series in recent decades. Anomalously high March rainfall (during budburst, shoot and inflorescence development) favours yield, as well as anomalously high temperatures and low precipitation amounts in May and June (May: flowering and June: berry development). The GYM was applied to a regional climate model output, which was shown to realistically reproduce the GYM predictors. Finally, using ensemble simulations under the A1B emission scenario, projections for GYM-derived yield in the Douro Region, and for the whole of the twenty-first century, were analysed. A slight upward trend in yield is projected to occur until about 2050, followed by a steep and continuous increase until the end of the twenty-first century, when yield is projected to be about 800 kg/ha above current values. While this estimate is based on meteorological parameters alone, changes due to elevated CO2 may further enhance this effect. In spite of the associated uncertainties, it can be stated that projected climate change may significantly benefit wine yield in the Douro Valley.
Abstract:
Extreme rainfall events have triggered a significant number of flash floods on Madeira Island throughout its past and recent history. Madeira is a volcanic island where the spatial distribution of rainfall is strongly affected by the rugged topography. In this thesis, annual maxima of daily rainfall from 25 rain gauge stations on Madeira Island were modelled with the generalised extreme value distribution. The hypothesis of a Gumbel distribution was also tested by two methods, and the existence of a linear trend in the parameters of both distributions was analysed. Estimates of the 50- and 100-year return levels were also obtained. Still in a univariate context, the assumption that a distribution function belongs to the domain of attraction of an extreme value distribution was tested for monthly maximum rainfall data in the rainy season. The available data were then analysed to find the most suitable domain of attraction for the sampled distribution. In a different approach, a search for thresholds was performed for daily rainfall values through graphical analysis. In a multivariate context, the dependence between extreme rainfall values at the stations considered was studied using Kendall's τ. This study suggests that factors such as altitude, slope orientation, distance between stations and proximity to the sea influence the spatial distribution of extreme rainfall. Groups of three pairwise associated stations were also obtained, and a family of extreme value copulas involving the Marshall–Olkin family was fitted, whose parameters can be written as a function of the Kendall's τ association measures of the obtained pairs.
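To make the notion of a return level concrete, here is a minimal sketch for the Gumbel special case. It uses method-of-moments estimates rather than whatever fitting procedure the thesis used (the abstract does not say), and the function names are hypothetical. For a Gumbel distribution with location μ and scale β, the T-year return level is μ − β·ln(−ln(1 − 1/T)).

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def gumbel_moment_fit(annual_maxima):
    """Method-of-moments estimates (location, scale) for a Gumbel distribution.

    Uses the sample standard deviation (n - 1 denominator). This is a simple
    stand-in, not the estimation method of the thesis.
    """
    scale = statistics.stdev(annual_maxima) * math.sqrt(6) / math.pi
    location = statistics.mean(annual_maxima) - EULER_GAMMA * scale
    return location, scale


def gumbel_return_level(location, scale, period_years):
    """Level exceeded on average once every `period_years` years."""
    if period_years <= 1:
        raise ValueError("period_years must be greater than 1")
    return location - scale * math.log(-math.log(1 - 1 / period_years))
```

Calling `gumbel_return_level(*gumbel_moment_fit(maxima), 50)` on a station's annual maxima would give a 50-year estimate under these assumptions. A full GEV fit adds a shape parameter that this sketch fixes at zero.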
Abstract:
Anaerobic threshold (AT) is usually estimated as a change point problem by visual analysis of the cardiorespiratory response to incremental dynamic exercise. In this study, two-phase linear (TPL) models of the linear-linear and linear-quadratic type were used for the estimation of AT. The correlation coefficient between the classical and statistical approaches was 0.88, and 0.89 after outlier exclusion. The TPL models provide a simple method for estimating AT that can be easily implemented on a digital computer for the automatic pattern recognition of AT.
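The change-point idea can be sketched as a grid search: try every admissible split of the data, fit a least-squares line to each side, and keep the split with the smallest total squared error. This covers only a linear-linear model with two independently fitted lines. The function names and the `min_points` parameter are assumptions, and the study's own estimation procedure may differ (for example, by forcing the segments to join).

```python
def _line_sse(xs, ys):
    """Sum of squared residuals of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_y - slope * mean_x
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def two_phase_breakpoint(xs, ys, min_points=3):
    """Return the index at which the second segment starts, minimising total SSE.

    Assumes xs is sorted in ascending order. The two lines are fitted
    independently and are not constrained to meet at the breakpoint.
    """
    if len(xs) != len(ys) or len(xs) < 2 * min_points:
        raise ValueError("need equal-length inputs with at least 2 * min_points points")
    best_index, best_sse = None, float("inf")
    for k in range(min_points, len(xs) - min_points + 1):
        sse = _line_sse(xs[:k], ys[:k]) + _line_sse(xs[k:], ys[k:])
        if sse < best_sse:
            best_index, best_sse = k, sse
    return best_index
```

In an exercise test, `xs` might be workload or time and `ys` a ventilatory variable, with `xs[best_index]` read as the estimated threshold. Those variable choices are illustrative, not taken from the abstract.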
Abstract:
Peer reviewed
Abstract:
Poly(methylvinylether-co-maleic acid) (PMVE/MA) is commonly used as a component of pharmaceutical platforms, principally to enhance interactions with biological substrates (mucoadhesion). However, limited knowledge of the rheological properties of this polymer and their relationships with mucoadhesion has hindered its biomedical use as a mono-component platform. This work presents a comprehensive study of the rheological properties of aqueous PMVE/MA platforms and defines their relationships with mucoadhesion using multiple regression analysis. Using dilute solution viscometry, the intrinsic viscosities of un-neutralised PMVE/MA and of PMVE/MA neutralised using NaOH or triethylamine (TEA) were 22.32 ± 0.89 dL g-1, 274.80 ± 1.94 dL g-1 and 416.49 ± 2.21 dL g-1, respectively, illustrating greater polymer chain expansion following neutralisation with TEA. PMVE/MA platforms exhibited shear-thinning properties. Increasing polymer concentration increased the consistencies, zero shear rate (ZSR) viscosities (determined from flow rheometry), storage and loss moduli, dynamic viscosities (defined using oscillatory analysis) and mucoadhesive properties, yet decreased the loss tangents of the neutralised polymer platforms. TEA-neutralised systems possessed significantly and substantially greater consistencies, ZSR and dynamic viscosities, storage and loss moduli and mucoadhesion, and lower loss tangents, than their NaOH counterparts. Multiple regression analysis enabled identification of the dominant role of polymer viscoelasticity in mucoadhesion (r > 0.98). The mucoadhesive properties of PMVE/MA platforms were considerable and were greater than those of other platforms that have successfully been shown to enhance in vivo retention when applied to the oral cavity, indicating a positive role for PMVE/MA mono-component platforms in pharmaceutical and biomedical applications.
Abstract:
The main objective of this work was to develop a novel dimensionality reduction technique as part of an integrated pattern recognition solution capable of identifying adulterants such as hazelnut oil in extra virgin olive oil at low percentages, based on spectroscopic chemical fingerprints. A novel Continuous Locality Preserving Projections (CLPP) technique is proposed, which models the continuous nature of the in-house admixtures as data series instead of discrete points. Maintaining the continuous structure of the data manifold enables better visualisation of the classification problem and facilitates more accurate use of the manifold for detecting the adulterants. The performance of the proposed technique is validated with two different spectroscopic techniques (Raman and Fourier transform infrared, FT-IR). In all cases studied, CLPP combined with the k-Nearest Neighbors (kNN) algorithm was found to outperform other state-of-the-art pattern recognition techniques.
Abstract:
This paper proposes a template for modelling complex datasets that integrates traditional statistical modelling approaches with more recent advances in statistics and modelling through an exploratory framework. Our approach builds on the well-known and long-standing idea of 'good practice in statistics' by establishing a comprehensive framework for modelling that focuses on exploration, prediction, interpretation and reliability assessment, the last being a relatively new idea that allows individual assessment of predictions. The integrated framework we present comprises two stages. The first involves the use of exploratory methods to help visually understand the data and identify a parsimonious set of explanatory variables. The second encompasses a two-step modelling process, in which the use of non-parametric methods such as decision trees and generalized additive models is promoted to identify important variables and their modelling relationship with the response before a final predictive model is considered. We focus on fitting the predictive model using parametric, non-parametric and Bayesian approaches. This paper is motivated by a medical problem in which interest focuses on developing a risk stratification system for morbidity of 1,710 cardiac patients given a suite of demographic, clinical and preoperative variables. Although the methods we use are applied specifically to this case study, they can be applied in any field, irrespective of the type of response.
Abstract:
The identification of compositional changes in fumarolic gases of active and quiescent volcanoes is one of the most important targets of monitoring programs. In general, many systematic (often cyclic) and random processes control the chemistry of gas discharges, making it difficult to produce convincing mathematical-statistical models. Changes in the chemical composition of volcanic gases sampled at Vulcano Island (Aeolian Arc, Sicily, Italy) from eight different fumaroles located in the northern sector of the summit crater (La Fossa) have been analysed by considering their dependence on time in the period 2000-2007. Each intermediate chemical composition has been considered as potentially derived from the contribution of the two temporal extremes, represented by the 2000 and 2007 samples respectively, using inverse modelling methodologies for compositional data. Data from fumaroles F5 and F27, located on the rim and in the inner part of La Fossa crater, respectively, have been used for this purpose. The statistical approach has allowed us to highlight the presence of random and non-random fluctuations, features that help in understanding how the volcanic system works and that open new perspectives for sampling strategies and for the evaluation of the natural risk related to a quiescent volcano.
Abstract:
Dissertation submitted for the degree of Doctor in Chemical Engineering, specialising in Biochemical Engineering
Abstract:
1. Statistical modelling is often used to relate sparse biological survey data to remotely derived environmental predictors, thereby providing a basis for predictively mapping biodiversity across an entire region of interest. The most popular strategy for such modelling has been to model distributions of individual species one at a time. Spatial modelling of biodiversity at the community level may, however, confer significant benefits for applications involving very large numbers of species, particularly if many of these species are recorded infrequently.
2. Community-level modelling combines data from multiple species and produces information on spatial pattern in the distribution of biodiversity at a collective community level instead of, or in addition to, the level of individual species. Spatial outputs from community-level modelling include predictive mapping of community types (groups of locations with similar species composition), species groups (groups of species with similar distributions), axes or gradients of compositional variation, levels of compositional dissimilarity between pairs of locations, and various macro-ecological properties (e.g. species richness).
3. Three broad modelling strategies can be used to generate these outputs: (i) 'assemble first, predict later', in which biological survey data are first classified, ordinated or aggregated to produce community-level entities or attributes that are then modelled in relation to environmental predictors; (ii) 'predict first, assemble later', in which individual species are modelled one at a time as a function of environmental variables, to produce a stack of species distribution maps that is then subjected to classification, ordination or aggregation; and (iii) 'assemble and predict together', in which all species are modelled simultaneously, within a single integrated modelling process. These strategies each have particular strengths and weaknesses, depending on the intended purpose of modelling and the type, quality and quantity of data involved.
4. Synthesis and applications. The potential benefits of modelling large multispecies data sets using community-level, as opposed to species-level, approaches include faster processing, increased power to detect shared patterns of environmental response across rarely recorded species, and enhanced capacity to synthesize complex data into a form more readily interpretable by scientists and decision-makers. Community-level modelling therefore deserves to be considered more often, and more widely, as a potential alternative or supplement to modelling individual species.
Abstract:
In the PhD thesis “Sound Texture Modeling” we deal with the statistical modelling of textural sounds such as water, wind and rain, for synthesis and classification. Our initial model is based on a wavelet tree signal decomposition and on modelling the resulting sequence by means of a parametric probabilistic model that belongs to the family of models trainable via expectation maximization (the hidden Markov tree model). Our model is able to capture key characteristics of the source textures (water, rain, fire, applause, crowd chatter) and faithfully reproduces some of the sound classes. In terms of a more general taxonomy of natural events proposed by Graver, we worked on models for natural event classification and segmentation. While the event labels comprise physical interactions between materials that do not have textural properties in their entirety, those segmentation models can help in identifying textural portions of an audio recording that are useful for analysis and resynthesis. Following our work on concatenative synthesis of musical instruments, we have developed a pattern-based synthesis system that allows a database of units to be explored sonically by means of their representation in a perceptual feature space. Concatenative synthesis with “molecules” built from sparse atomic representations also makes it possible to capture low-level correlations in perceptual audio features, while facilitating the manipulation of textural sounds based on their physical and perceptual properties. We have approached the problem of sound texture modelling for synthesis from different directions, namely a low-level signal-theoretic point of view through a wavelet transform, and a higher-level point of view driven by perceptual audio features in the concatenative synthesis setting. The developed framework provides a unified approach to the high-quality resynthesis of natural texture sounds. Our research is embedded within the Metaverse 1 European project (2008-2011), where our models contribute as low-level building blocks within a semi-automated soundscape generation system.
Abstract:
This paper is a first draft of the principle of statistical modelling on coordinates. Several causes, which would take too long to detail, have led to this situation close to the deadline for submitting papers to CODAWORK’03. The main one is the fast development of the approach over the last few months, which has made previous drafts obsolete. The present paper contains the essential parts of the state of the art of this approach from my point of view. I would like to acknowledge many clarifying discussions with the group of people working in this field in Girona, Barcelona, Carrick Castle, Firenze, Berlin, Göttingen, and Freiberg. They have offered many suggestions and ideas. Nevertheless, there might still be errors or unclear aspects, which are exclusively my fault. I hope this contribution serves as a basis for further discussions and new developments.