993 results for Statistical distributions
Abstract:
In the context of Bayesian statistical analysis, elicitation is the process of formulating a prior density f(.) about one or more uncertain quantities to represent a person's knowledge and beliefs. Several different methods of eliciting prior distributions for one unknown parameter have been proposed. However, there are relatively few methods for specifying a multivariate prior distribution, and most apply only to specific classes of problems and/or rely on restrictive conditions, such as independence of variables. Moreover, many of these procedures require the elicitation of variances and correlations, and sometimes of hyperparameters, which are difficult for experts to specify in practice. Garthwaite et al. (2005) discuss the different methods proposed in the literature and the difficulties of eliciting multivariate prior distributions. We describe a flexible method of eliciting multivariate prior distributions that is applicable to a wide class of practical problems. Our approach does not assume a parametric form for the unknown prior density f(.); instead, we use nonparametric Bayesian inference, modelling f(.) by a Gaussian process prior distribution. The expert is then asked to specify certain summaries of his/her distribution, such as the mean, mode, marginal quantiles and a small number of joint probabilities. The analyst receives that information, treating it as a data set D with which to update his/her prior beliefs and obtain the posterior distribution for f(.). Theoretical properties of joint and marginal priors are derived, and numerical illustrations are given to demonstrate our approach. (C) 2010 Elsevier B.V. All rights reserved.
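The core computational step, conditioning a Gaussian process prior on elicited summaries treated as data, can be sketched in a few lines. This is a minimal illustration, not the authors' algorithm: the squared-exponential kernel, its hyperparameters and the elicited values are all hypothetical, and a full elicitation would encode quantile and joint-probability constraints rather than direct density values.

```python
import numpy as np

def gp_posterior_mean(x_train, y_train, x_test, ell=1.0, sf=1.0, jitter=1e-8):
    """Posterior mean of a GP with squared-exponential kernel,
    conditioned on (nearly) noise-free observations y_train at x_train."""
    def k(a, b):
        d = a[:, None] - b[None, :]
        return sf**2 * np.exp(-0.5 * (d / ell) ** 2)
    K = k(x_train, x_train) + jitter * np.eye(len(x_train))
    alpha = np.linalg.solve(K, y_train)
    return k(x_test, x_train) @ alpha

# Hypothetical elicited density values at a handful of points
x_e = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y_e = np.array([0.05, 0.24, 0.40, 0.24, 0.05])
grid = np.linspace(-3.0, 3.0, 61)
f_hat = gp_posterior_mean(x_e, y_e, grid)  # smooth estimate of f(.) on the grid
```

The posterior mean interpolates the elicited values and lets the kernel fill in the shape of f(.) between them.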
Abstract:
The nearest-neighbor spacing distributions proposed by four models, namely, the Berry-Robnik, Caurier-Grammaticos-Ramani, Lenz-Haake, and the deformed Gaussian orthogonal ensemble, as well as the ansatz by Brody, are applied to the transition between chaos and order that occurs in the isotropic quartic oscillator. The advantages and disadvantages of these five descriptions are discussed. In addition, the results of a simple extension of the expression for the Dyson-Mehta statistic Δ3 are compared with those of a more popular one, usually associated with the Berry-Robnik formalism. ©1999 The American Physical Society.
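The Brody ansatz mentioned above interpolates between the Poisson spacing law of regular dynamics (beta = 0) and the Wigner surmise of the Gaussian orthogonal ensemble (beta = 1). A short sketch of the density, with the standard normalisation fixing the mean spacing to one:

```python
import math

def brody(s, beta):
    """Brody nearest-neighbor spacing density
    P(s) = (beta + 1) * b * s**beta * exp(-b * s**(beta + 1)),
    with b = Gamma((beta + 2)/(beta + 1))**(beta + 1) so the mean spacing is 1."""
    b = math.gamma((beta + 2.0) / (beta + 1.0)) ** (beta + 1.0)
    return (beta + 1.0) * b * s ** beta * math.exp(-b * s ** (beta + 1.0))

# beta = 0 recovers the Poisson law exp(-s);
# beta = 1 recovers the Wigner surmise (pi/2) s exp(-pi s^2 / 4)
```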
Abstract:
Questions: We assess gap size and shape distributions, two important descriptors of the forest disturbance regime, by asking: which statistical model best describes gap size distribution; can simple geometric forms adequately describe gap shape; does gap size or shape vary with forest type, gap age or the method used for gap delimitation; and how similar are the studied forests to other tropical and temperate forests? Location: Southeastern Atlantic Forest, Brazil. Methods: Analysing over 150 gaps in two distinct forest types (seasonal and rain forests), we used a model selection framework to select appropriate probability distributions and functions to describe gap size and gap shape. The former was described using univariate probability distributions, whereas the latter was assessed based on the gap area-perimeter relationship. Comparisons of gap size and shape between sites, as well as between size and age classes, were then made based on the likelihood of models having different assumptions for the values of their parameters. Results: The log-normal distribution was the best descriptor of gap size distribution, independently of the forest type or gap delimitation method. Because gaps became more irregular as they increased in size, all geometric forms (triangle, rectangle and ellipse) were poor descriptors of gap shape. Only when small and large gaps (>100 or 400 m², depending on the delimitation method) were treated separately did the rectangle and isosceles triangle become accurate predictors of gap shape. Ellipsoidal shapes were poor descriptors. At both sites, gaps were at least 50% longer than they were wide, a finding with important implications for gap microclimate (e.g. light entrance regime) and, consequently, for gap regeneration. Conclusions: In addition to more appropriate descriptions of gap size and shape, the model selection framework used here efficiently provided a means by which to compare the patterns of two different types of forest.
With this framework we were able to recommend the log-normal parameters μ and σ for future comparisons of gap size distribution, and to propose possible mechanisms related to random rates of gap expansion and closure. We also showed that gap shape varied highly and that no single geometric form was able to predict the shape of all gaps; the ellipse in particular should no longer be used as a standard gap shape. © 2012 International Association for Vegetation Science.
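Choosing the log-normal over a competing size distribution can be illustrated with a small AIC comparison. The "gap areas" below are synthetic stand-ins for field data, and only two candidate families are compared for brevity; a real model selection exercise would include more candidates.

```python
import numpy as np

rng = np.random.default_rng(42)
areas = rng.lognormal(mean=3.5, sigma=1.5, size=2000)  # synthetic gap areas (m^2)

def aic_lognormal(x):
    # MLE of mu and sigma on the log scale, then AIC = 2k - 2 log L with k = 2
    mu, sigma = np.log(x).mean(), np.log(x).std()
    loglik = np.sum(-np.log(x * sigma * np.sqrt(2.0 * np.pi))
                    - (np.log(x) - mu) ** 2 / (2.0 * sigma ** 2))
    return 2 * 2 - 2 * loglik

def aic_exponential(x):
    lam = 1.0 / x.mean()  # MLE of the rate, k = 1
    return 2 * 1 - 2 * np.sum(np.log(lam) - lam * x)

best = min(("lognormal", aic_lognormal(areas)),
           ("exponential", aic_exponential(areas)),
           key=lambda t: t[1])[0]
```

On right-skewed size data of this kind the log-normal wins by a wide AIC margin.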
Abstract:
An extension of some standard likelihood-based procedures to heteroscedastic nonlinear regression models under scale mixtures of skew-normal (SMSN) distributions is developed. This novel class of models provides a useful generalization of the heteroscedastic symmetrical nonlinear regression models (Cysneiros et al., 2010), since the random term distributions cover symmetric as well as asymmetric and heavy-tailed distributions such as skew-t, skew-slash and skew-contaminated normal, among others. A simple EM-type algorithm for iteratively computing maximum likelihood estimates of the parameters is presented, and the observed information matrix is derived analytically. To examine the performance of the proposed methods, simulation studies are presented showing the robustness of this flexible class against outlying and influential observations, and showing that the maximum likelihood estimates based on the EM-type algorithm have good asymptotic properties. Furthermore, local influence measures and one-step approximations of the estimates in the case-deletion model are obtained. Finally, an illustration of the methodology is given considering a data set previously analyzed under the homoscedastic skew-t nonlinear regression model. (C) 2012 Elsevier B.V. All rights reserved.
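The flavour of such an EM-type algorithm can be conveyed with a much simpler special case: EM for the location and scale of a univariate Student-t model with known degrees of freedom (a scale mixture of normals). This is an illustrative sketch only, not the SMSN regression algorithm of the paper; the weighting step shows why heavy-tailed members of the class downweight outliers.

```python
import numpy as np

def t_location_scale_em(x, nu=4.0, iters=100):
    """EM estimates of mu and sigma^2 under x_i ~ t_nu(mu, sigma^2)."""
    mu, s2 = np.median(x), np.var(x)
    for _ in range(iters):
        d2 = (x - mu) ** 2 / s2
        w = (nu + 1.0) / (nu + d2)        # E-step: expected mixing weights
        mu = np.sum(w * x) / np.sum(w)    # M-step: weighted location
        s2 = np.mean(w * (x - mu) ** 2)   # M-step: weighted scale
    return mu, s2

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 200), [50.0, 60.0, 70.0]])  # gross outliers
mu_robust, _ = t_location_scale_em(x)
# mu_robust stays near 0 while the sample mean is pulled towards the outliers
```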
Abstract:
We present for the first time a justification on the basis of central limit theorems for the family of life distributions generated from scale-mixture of normals. This family was proposed by Balakrishnan et al. (2009) and can be used to accommodate unexpected observations for the usual Birnbaum-Saunders distribution generated from the normal one. The class of scale-mixture of normals includes normal, slash, Student-t, logistic, double-exponential, exponential power and many other distributions. We present a model for the crack extensions where the limiting distribution of total crack extensions is in the class of scale-mixture of normals. (C) 2012 Elsevier B.V. All rights reserved.
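A scale mixture of normals represents X as Z / sqrt(W), with Z standard normal and W a positive mixing variable; the choice of W yields the members listed above. A small simulation sketch (mixing laws for the Student-t and slash members shown; parameter values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
z = rng.standard_normal(n)

# Student-t with nu degrees of freedom: W ~ Gamma(nu/2, rate nu/2)
nu = 3.0
x_t = z / np.sqrt(rng.gamma(nu / 2.0, 2.0 / nu, n))

# Slash with tail parameter q: W = U**(2/q), U ~ Uniform(0, 1)
q = 2.0
x_slash = z / np.sqrt(rng.uniform(0.0, 1.0, n) ** (2.0 / q))

# Both mixtures have heavier tails than the generating normal sample
tail = lambda v: np.mean(np.abs(v) > 3.0)
```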
Abstract:
The objective of this study was to estimate the prevalence of inadequate micronutrient intake and excess sodium intake among adults age 19 years and older in the city of Sao Paulo, Brazil. Twenty-four-hour dietary recall and sociodemographic data were collected from each participant (n=1,663) in a cross-sectional study, Inquiry of Health of Sao Paulo, of a representative sample of the adult population of the city of Sao Paulo in 2003 (ISA-2003). The variability in intake was measured through two replications of the 24-hour recall in a subsample of this population in 2007 (ISA-2007). Usual intake was estimated with the PC-SIDE program (version 1.0, 2003, Department of Statistics, Iowa State University), which uses an approach developed by Iowa State University. The prevalence of nutrient inadequacy was calculated using the Estimated Average Requirement cut-point method for vitamins A and C, thiamin, riboflavin, niacin, copper, phosphorus, and selenium. For vitamin D, pantothenic acid, manganese, and sodium, the proportion of individuals with usual intake equal to or more than the Adequate Intake value was calculated. The percentage of individuals with intake equal to or more than the Tolerable Upper Intake Level was calculated for sodium. The highest prevalences of inadequacy for males and females, respectively, occurred for vitamin A (67% and 58%), vitamin C (52% and 62%), thiamin (41% and 50%), and riboflavin (29% and 19%). Adjusting for within-person variation yielded lower prevalences of inadequacy, owing to the removal of within-person variability. All adult residents of Sao Paulo had excess sodium intake, and the rates of nutrient inadequacy were high for certain key micronutrients. J Acad Nutr Diet. 2012;112:1614-1618.
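Once usual intakes are estimated, the Estimated Average Requirement (EAR) cut-point method reduces to the share of the population whose usual intake falls below the EAR. A minimal sketch with hypothetical intakes and an assumed EAR value:

```python
import numpy as np

def prevalence_inadequate(usual_intake, ear):
    """EAR cut-point method: fraction of individuals whose usual intake is below the EAR."""
    return float(np.mean(np.asarray(usual_intake, dtype=float) < ear))

# Hypothetical usual vitamin C intakes (mg/day) and an assumed EAR of 60 mg/day
intakes = [30.0, 45.0, 80.0, 120.0, 55.0, 70.0]
prev = prevalence_inadequate(intakes, ear=60.0)  # 3 of 6 below the EAR -> 0.5
```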
Abstract:
In this paper, we carry out robust modeling and influence diagnostics in Birnbaum-Saunders (BS) regression models. Specifically, we present some aspects related to BS and log-BS distributions and their generalizations from the Student-t distribution, and develop BS-t regression models, including maximum likelihood estimation based on the EM algorithm and diagnostic tools. In addition, we apply the obtained results to real data from insurance, which shows the uses of the proposed model. Copyright (c) 2011 John Wiley & Sons, Ltd.
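The Birnbaum-Saunders distribution has a convenient normal representation: if Z is standard normal, then T = beta * (alpha*Z/2 + sqrt((alpha*Z/2)**2 + 1))**2 follows BS(alpha, beta). A simulation sketch (parameter values arbitrary); replacing Z by a Student-t variate gives the heavier-tailed BS-t extension discussed in the paper.

```python
import numpy as np

def rbs(n, alpha, beta, rng):
    """Draw n samples from Birnbaum-Saunders(alpha, beta) via the normal representation."""
    a = alpha * rng.standard_normal(n) / 2.0
    return beta * (a + np.sqrt(a ** 2 + 1.0)) ** 2

rng = np.random.default_rng(1)
t = rbs(100_000, alpha=0.5, beta=2.0, rng=rng)
# the median of BS(alpha, beta) is exactly beta; the mean is beta * (1 + alpha**2 / 2)
```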
Abstract:
The objective of this thesis is to improve the understanding of which processes and mechanisms affect the distribution of polychlorinated biphenyls (PCBs) and organic carbon in coastal sediments. Because of the strong association of hydrophobic organic contaminants (HOCs) such as PCBs with organic matter in the aquatic environment, these two entities are naturally linked. The coastal environment is the most complex and dynamic part of the ocean when it comes to both the cycling of organic matter and HOCs, and is characterised by the largest fluxes and most diverse sources of both entities. A wide array of methods was used to study these processes throughout this thesis. At the field sites in the Stockholm archipelago of the Baltic proper, bottom sediments and settling particulate matter were retrieved using sediment coring devices and sediment traps at morphometrically and seismically well-characterised locations. In the laboratory, the samples were analysed for PCBs, stable carbon isotope ratios, carbon-nitrogen atom ratios and standard sediment properties. From the fieldwork in the Stockholm archipelago and the subsequent laboratory work it was concluded that the inner Stockholm archipelago has a low (≈4%) trapping efficiency for freshwater-derived organic carbon. The corollary is a large potential for long-range waterborne transport of OC and OC-associated nutrients and hydrophobic organic pollutants from urban Stockholm to more pristine offshore Baltic Sea ecosystems. Theoretical work was carried out using Geographical Information Systems (GIS) and statistical methods on a database of 4214 individual sediment samples, each with reported individual PCB congener concentrations. From this work it was concluded that continental shelf sediments are key global inventories and ultimate sinks of PCBs. Depending on the congener, 10-80% of the cumulative historical emissions to the environment are accounted for in continental shelf sediments.
It was further concluded that the many infamous and highly contaminated surface sediments of urban harbours and estuaries of contaminated rivers cannot be important as a secondary source sustaining the concentrations observed in remote sediments. Of the global shelf PCB inventory, <1% is in sediments near population centres, while ≥90% is in remote areas (>10 km from any dwellings). The remote sub-basin of the North Atlantic Ocean contains approximately half of the global shelf sediment inventory for most of the PCBs studied.
Abstract:
The uncertainty in the determination of the stratigraphic profile of natural soils is one of the main problems in geotechnics, in particular for landslide characterization and modeling. This study deals with a new approach to geotechnical modeling which relies on the stochastic generation of different soil layer distributions following a Boolean logic; the method has thus been called BoSG (Boolean Stochastic Generation). In this way, it is possible to randomize the presence of a specific material interdigitated in a uniform matrix. In building a geotechnical model it is common to discard some stratigraphic data in order to simplify the model itself, assuming that the significance of the results of the modeling procedure will not be affected. With the proposed technique it is possible to quantify the error associated with this simplification. Moreover, it can be used to determine the most significant zones, where further investigations and surveys would be most effective in building the geotechnical model of the slope. The commercial software FLAC was used for the 2D and 3D geotechnical models. The distribution of the materials was randomized through a specifically coded MatLab program that automatically generates text files, each of them representing a specific soil configuration. In addition, a routine was designed to automate the computation of FLAC with the different data files in order to maximize the sample number. The methodology is applied to a simplified slope in 2D, a simplified slope in 3D and an actual landslide, namely the Mortisa mudslide (Cortina d’Ampezzo, BL, Italy). However, it could be extended to numerous different cases, especially for hydrogeological analysis and landslide stability assessment, in different geological and geomorphological contexts.
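A minimal sketch of the Boolean generation step, assuming the simplest possible case: each cell of a regular grid independently hosts the interdigitated material with a fixed probability. The actual BoSG generation and the MatLab/FLAC coupling are more elaborate; the names and parameters here are hypothetical.

```python
import numpy as np

def bosg_realization(shape, p_inclusion, rng):
    """One Boolean soil configuration: True where the interdigitated
    material replaces the uniform matrix."""
    return rng.random(shape) < p_inclusion

rng = np.random.default_rng(7)
# ten random 40 x 80 cell configurations; in practice each would be
# written to a text file and fed to the geotechnical solver
configs = [bosg_realization((40, 80), 0.2, rng) for _ in range(10)]
fractions = [c.mean() for c in configs]  # inclusion fraction scatters around 0.2
```

Running the solver over many such realizations is what allows the error of a simplified stratigraphy to be quantified.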
Abstract:
In the setting of high-dimensional linear models with Gaussian noise, we investigate the possibility of confidence statements connected to model selection. Although there exist numerous procedures for adaptive (point) estimation, the construction of adaptive confidence regions is severely limited (cf. Li in Ann Stat 17:1001–1008, 1989). The present paper sheds new light on this gap. We develop exact and adaptive confidence regions for the best approximating model in terms of risk. One of our constructions is based on a multiscale procedure and a particular coupling argument. Utilizing exponential inequalities for noncentral χ2-distributions, we show that the risk and quadratic loss of all models within our confidence region are uniformly bounded by the minimal risk times a factor close to one.
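As a toy illustration of a confidence region over models (not the multiscale coupling construction of the paper), one can keep every candidate column subset whose excess residual sum of squares relative to the full model stays below a chi-square bound, assuming the noise variance is known. All names and the data below are hypothetical.

```python
import numpy as np

# 0.95 quantiles of the chi-square distribution with 1, 2, 3 degrees of freedom
CHI2_95 = {1: 3.841, 2: 5.991, 3: 7.815}

def models_in_region(X, y, candidates, sigma2=1.0):
    """Toy confidence region over models: keep candidates whose extra RSS
    over the full model is within the chi-square(0.95) bound for the
    number of dropped columns (known noise variance sigma2)."""
    def rss(cols):
        beta, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
        r = y - X[:, cols] @ beta
        return float(r @ r)
    full = rss(list(range(X.shape[1])))
    kept = []
    for cols in candidates:
        df = X.shape[1] - len(cols)
        if df == 0 or rss(cols) - full <= sigma2 * CHI2_95[df]:
            kept.append(cols)
    return kept

rng = np.random.default_rng(5)
X = rng.standard_normal((200, 2))
y = 2.0 * X[:, 0] + rng.standard_normal(200)   # only column 0 matters
kept = models_in_region(X, y, [[0], [1], [0, 1]])
# the model that drops the informative column is excluded from the region
```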
Abstract:
Macrozooplankton are an important link between higher and lower trophic levels in the oceans. They serve as the primary food for fish, reptiles, birds and mammals in some regions, and play a role in the export of carbon from the surface to the intermediate and deep ocean. Little, however, is known of their global distribution and biomass. Here we compiled a dataset of macrozooplankton abundance and biomass observations for the global ocean from a collection of four datasets. We harmonise the data to common units, calculate additional carbon biomass where possible, and bin the dataset in a global 1 x 1 degree grid. This dataset is part of a wider effort to provide a global picture of carbon biomass data for key plankton functional types, in particular to support the development of marine ecosystem models. Over 387 700 abundance data and 1330 carbon biomass data have been collected from pre-existing datasets. A further 34 938 abundance data were converted to carbon biomass data using species-specific length frequencies or using species-specific abundance to carbon biomass data. Depth-integrated values are used to calculate known epipelagic macrozooplankton biomass concentrations and global biomass. Global macrozooplankton biomass has a mean of 8.4 µg C l-1, median of 0.15 µg C l-1 and a standard deviation of 63.46 µg C l-1. The global annual average estimate of epipelagic macrozooplankton, based on the median value, is 0.02 Pg C. Biomass is highest in the tropics, decreasing in the sub-tropics and increasing slightly towards the poles. There are, however, limitations on the dataset; abundance observations have good coverage except in the South Pacific mid latitudes, but biomass observation coverage is only good at high latitudes. Biomass is restricted to data that is originally given in carbon or to data that can be converted from abundance to carbon. 
Carbon conversions from abundance are restricted in the most part by the lack of information on the size of the organism and/or the absence of taxonomic information. Distribution patterns of global macrozooplankton biomass and statistical information about biomass concentrations may be used to validate biogeochemical models and Plankton Functional Type models.
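The aggregation step described above, binning point observations into a 1 x 1 degree grid, can be sketched as follows. The coordinates and biomass values are hypothetical, and the published dataset carries many more fields per record.

```python
import numpy as np

def grid_biomass(lats, lons, values):
    """Median biomass per 1 x 1 degree cell, keyed by the cell's south-west corner."""
    cells = {}
    for la, lo, v in zip(lats, lons, values):
        key = (int(np.floor(la)), int(np.floor(lo)))
        cells.setdefault(key, []).append(v)
    return {k: float(np.median(v)) for k, v in cells.items()}

# Hypothetical observations: biomass in ug C per litre
grid = grid_biomass([10.2, 10.7, -35.5], [120.1, 120.9, 15.3], [0.1, 0.3, 2.0])
# the two tropical observations share cell (10, 120); their cell value is their median
```

Using the median per cell mirrors the dataset's own preference for median statistics, which are less sensitive to the strongly skewed biomass concentrations.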
Abstract:
Many existing engineering works model the statistical characteristics of the entities under study as normal distributions. These models are eventually used for decision making, requiring in practice the definition of the classification region corresponding to the desired confidence level. Surprisingly, however, a great number of computer vision works using multidimensional normal models leave confidence regions unspecified or fail to establish them correctly, due to misconceptions about the features of Gaussian functions or to wrong analogies with the unidimensional case. The resulting regions incur deviations that can be unacceptable in high-dimensional models. Here we provide a comprehensive derivation of the optimal confidence regions for multivariate normal distributions of arbitrary dimensionality. To this end, we first derive the condition for region optimality of general continuous multidimensional distributions, and then apply it to the widespread case of the normal probability density function. The obtained results are used to analyze the confidence error incurred by previous works related to vision research, showing that deviations caused by wrong regions may become unacceptable as dimensionality increases. To support the theoretical analysis, a quantitative example is given in the context of moving object detection by means of background modeling.
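The correct region and the cost of the unidimensional analogy are easy to check numerically. For X ~ N(mu, Sigma) in d dimensions, the optimal p-confidence region is the ellipsoid {x : (x - mu)^T Sigma^{-1} (x - mu) <= q_p}, with q_p the chi-square(d) quantile at p; reusing the 1-D radius 1.96 instead collapses the coverage as d grows. A sketch with d = 10 and identity covariance:

```python
import numpy as np

rng = np.random.default_rng(3)
d, n = 10, 200_000
x = rng.standard_normal((n, d))      # samples from N(0, I_d)
m2 = np.sum(x ** 2, axis=1)          # squared Mahalanobis distances, ~ chi-square(d)

Q95_CHI2_10 = 18.307                 # chi-square(10) quantile at 0.95
coverage_correct = np.mean(m2 <= Q95_CHI2_10)  # close to 0.95
coverage_naive = np.mean(m2 <= 1.96 ** 2)      # 1-D radius: far below 0.95 in 10 dims
```

With ten dimensions the naive region covers only about 5% of the mass, which is exactly the kind of deviation the abstract warns about.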
Abstract:
A “most probable state” equilibrium statistical theory for random distributions of hetons in a closed basin is developed here in the context of two-layer quasigeostrophic models for the spreading phase of open-ocean convection. The theory depends only on bulk conserved quantities such as energy, circulation, and the range of values of potential vorticity in each layer. The simplest theory is formulated for a uniform cooling event over the entire basin that triggers a homogeneous random distribution of convective towers. For a small Rossby deformation radius typical for open-ocean convection sites, the most probable states that arise from this theory strongly resemble the saturated baroclinic states of the spreading phase of convection, with a stabilizing barotropic rim current and localized temperature anomaly.