949 results for Geo-statistical model


Relevance:

90.00%

Publisher:

Abstract:

An extensive statistical ‘downscaling’ study is done to relate large-scale climate information from a general circulation model (GCM) to local-scale river flows in SW France for 51 gauging stations ranging from nival (snow-dominated) to pluvial (rainfall-dominated) river systems. This study helps to select the appropriate statistical method at a given spatial and temporal scale to downscale hydrology for future climate change impact assessment of hydrological resources. The four proposed statistical downscaling models use large-scale predictors (derived from climate model outputs or reanalysis data) that characterize precipitation and evaporation processes in the hydrological cycle to estimate summary flow statistics. The four statistical models used are generalized linear (GLM) and additive (GAM) models, aggregated boosted trees (ABT) and multi-layer perceptron neural networks (ANN). These four models were each applied at two different spatial scales, namely that of a single flow-gauging station (local downscaling) and that of a group of flow-gauging stations having the same hydrological behaviour (regional downscaling). For each statistical model and each spatial resolution, three temporal resolutions were considered, namely daily mean flows, summary statistics of fortnightly flows and a daily ‘integrated approach’. The results show that flow sensitivity to atmospheric factors differs significantly between nival and pluvial hydrological systems, which are mainly influenced by shortwave solar radiation and atmospheric temperature, respectively. The non-linear models (i.e. GAM, ABT and ANN) performed better than the linear GLM when simulating fortnightly flow percentiles. The aggregated boosted trees method showed higher and less variable R2 values when downscaling the hydrological variability in both nival and pluvial regimes. Based on the GCM CNRM-CM3 and scenarios A2 and A1B, future relative changes of fortnightly median flows were projected using the regional downscaling approach. The results suggest a general decrease of flow in both pluvial and nival regimes, especially in spring, summer and autumn, regardless of the scenario considered. The discussion considers the performance of each statistical method for downscaling flow at different spatial and temporal scales as well as the relationship between atmospheric processes and flow variability.
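
A minimal sketch of the kind of model comparison described above, on synthetic data: a linear model (GLM stand-in), gradient-boosted trees (ABT stand-in) and a small neural network are fitted to two invented atmospheric predictors and scored by R2. All data and settings are assumptions for illustration; the GAM variant and the paper's actual predictors are not reproduced.

```python
# Sketch: compare downscaling-style regressors on synthetic predictor/flow data.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
n = 2000
temp = rng.normal(10, 8, n)           # atmospheric temperature (degC), invented
swrad = rng.normal(200, 60, n)        # shortwave radiation (W m-2), invented
# Non-linear synthetic "flow": a nival-like response to radiation and temperature
flow = np.exp(0.02 * temp) + 0.5 * np.tanh((swrad - 180) / 40) + rng.normal(0, 0.2, n)

X = np.column_stack([temp, swrad])
X_tr, X_te, y_tr, y_te = train_test_split(X, flow, random_state=0)

models = {
    "GLM (linear)": LinearRegression(),
    "ABT-like": GradientBoostingRegressor(random_state=0),
    "ANN (MLP)": make_pipeline(StandardScaler(),
                               MLPRegressor(hidden_layer_sizes=(16,),
                                            max_iter=2000, random_state=0)),
}
for name, m in models.items():
    m.fit(X_tr, y_tr)
    print(f"{name:12s} R2 = {r2_score(y_te, m.predict(X_te)):.3f}")
```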

Relevance:

90.00%

Publisher:

Abstract:

An important element of the developing field of proteomics is to understand protein-protein interactions and other functional links amongst genes. Across-species correlation methods for detecting functional links work on the premise that functionally linked proteins will tend to show a common pattern of presence and absence across a range of genomes. We describe a maximum likelihood statistical model for predicting functional gene linkages. The method detects independent instances of the correlated gain or loss of pairs of proteins on phylogenetic trees, reducing the high rates of false positives observed in conventional across-species methods that do not explicitly incorporate a phylogeny. We show, in a dataset of 10,551 protein pairs, that the phylogenetic method improves on across-species analyses by up to 35% at identifying known functionally linked proteins. The method shows that protein pairs with at least two to three correlated events of gain or loss are almost certainly functionally linked. Contingent evolution, in which one gene's presence or absence depends upon the presence of another, can also be detected phylogenetically, and may identify genes whose functional significance depends upon their interactions with other genes. Incorporating phylogenetic information improves the prediction of functional linkages. The improvement derives from having a lower rate of false positives and from detecting trends that across-species analyses miss. Phylogenetic methods can easily be incorporated into the screening of large-scale bioinformatics datasets to identify sets of protein links and to characterise gene networks.
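
The building block of such phylogenetic methods is the likelihood of a presence/absence pattern on a tree under a two-state Markov model, computed with Felsenstein's pruning algorithm. A minimal sketch on a made-up four-taxon tree follows; the paper's method extends this to pairs of proteins, comparing independent- against correlated-evolution models by maximum likelihood, which is not shown here. Tree shape, branch lengths and rates are invented.

```python
# Sketch: pruning-algorithm likelihood for one binary (gene presence/absence) trait.
import numpy as np
from scipy.linalg import expm

def transition_matrix(rate01, rate10, t):
    """P(t) = exp(Qt) for a two-state Markov chain (0 = absent, 1 = present)."""
    Q = np.array([[-rate01, rate01], [rate10, -rate10]])
    return expm(Q * t)

def prune(node, rate01, rate10):
    """Return the partial-likelihood vector [L(state=0), L(state=1)] at a node."""
    if "state" in node:                      # leaf: observed presence/absence
        L = np.zeros(2)
        L[node["state"]] = 1.0
        return L
    L = np.ones(2)
    for child, branch_len in node["children"]:
        P = transition_matrix(rate01, rate10, branch_len)
        L *= P @ prune(child, rate01, rate10)
    return L

# Toy tree ((A:0.2, B:0.2):0.3, (C:0.4, D:0.1):0.1); gene present in A, B, C only
tree = {"children": [
    ({"children": [({"state": 1}, 0.2), ({"state": 1}, 0.2)]}, 0.3),
    ({"children": [({"state": 1}, 0.4), ({"state": 0}, 0.1)]}, 0.1),
]}
root_L = prune(tree, rate01=0.5, rate10=0.5)
print("log-likelihood:", np.log(root_L @ np.array([0.5, 0.5])))  # uniform root prior
```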

Relevance:

90.00%

Publisher:

Abstract:

Presented herein is an experimental design that allows the effects of several radiative forcing factors on climate to be estimated as precisely as possible from a limited suite of atmosphere-only general circulation model (GCM) integrations. The forcings include the combined effect of observed changes in sea surface temperatures, sea ice extent, stratospheric (volcanic) aerosols, and solar output, plus the individual effects of several anthropogenic forcings. A single linear statistical model is used to estimate the forcing effects, each of which is represented by its global mean radiative forcing. The strong colinearity in time between the various anthropogenic forcings poses a technical problem that is overcome through the design of the experiment. This design uses every combination of anthropogenic forcing rather than the few highly replicated ensembles more commonly used in climate studies. Not only is this design highly efficient for a given number of integrations, but it also allows the estimation of (nonadditive) interactions between pairs of anthropogenic forcings. The simulated land surface air temperature changes since 1871 have been analyzed. Changes in natural and oceanic forcing, the latter of which itself contains some forcing from anthropogenic and natural influences, have the greatest influence. For the global mean, increasing greenhouse gases and the indirect aerosol effect had the largest anthropogenic effects. It was also found that an interaction between these two anthropogenic effects exists in the atmosphere-only GCM. This interaction is similar in magnitude to the individual effects of changing tropospheric and stratospheric ozone concentrations or to the direct (sulfate) aerosol effect. Various diagnostics are used to evaluate the fit of the statistical model. For the global mean, this shows that the land temperature response is proportional to the global mean radiative forcing, reinforcing the use of radiative forcing as a measure of climate change. The diagnostic tests also show that the linear model was suitable for analyses of land surface air temperature at each GCM grid point. Therefore, the linear model provides precise estimates of the space-time signals for all forcing factors under consideration. For simulated 50-hPa temperatures, results show that tropospheric ozone increases have contributed to stratospheric cooling over the twentieth century almost as much as changes in well-mixed greenhouse gases.
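
A hedged sketch of the design idea: running every on/off combination of a few forcings (a full factorial) yields a well-conditioned design matrix even when the forcings are collinear in time, so main effects and pairwise interactions can be estimated by ordinary least squares. Forcing names, effect sizes and the simulated response below are invented.

```python
# Sketch: a 2^k factorial design over forcings, analysed with one linear model.
import itertools
import numpy as np

forcings = ["GHG", "indirect_aerosol", "ozone"]          # invented labels
design = np.array(list(itertools.product([0, 1], repeat=len(forcings))))  # 8 runs

# Simulated response with a GHG x indirect-aerosol interaction (all values invented)
rng = np.random.default_rng(1)
true_effects = np.array([0.9, -0.5, 0.2])
response = (design @ true_effects
            - 0.3 * design[:, 0] * design[:, 1]
            + rng.normal(0, 0.05, len(design)))

# Design matrix: intercept, main effects, and all pairwise interactions
cols = [np.ones(len(design))] + [design[:, i] for i in range(3)]
cols += [design[:, i] * design[:, j] for i, j in itertools.combinations(range(3), 2)]
X = np.column_stack(cols)
beta, *_ = np.linalg.lstsq(X, response, rcond=None)
print("estimated main effects:", np.round(beta[1:4], 3))
print("estimated pairwise interactions:", np.round(beta[4:], 3))
```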

Relevance:

90.00%

Publisher:

Abstract:

Geophysical time series sometimes exhibit serial correlations that are stronger than can be captured by the commonly used first‐order autoregressive model. In this study we demonstrate that a power law statistical model serves as a useful upper bound for the persistence of total ozone anomalies on monthly to interannual timescales. Such a model is usually characterized by the Hurst exponent. We show that the estimation of the Hurst exponent in time series of total ozone is sensitive to various choices made in the statistical analysis, especially whether and how the deterministic (including periodic) signals are filtered from the time series, and the frequency range over which the estimation is made. In particular, care must be taken to ensure that the estimate of the Hurst exponent accurately represents the low‐frequency limit of the spectrum, which is the part that is relevant to long‐term correlations and the uncertainty of estimated trends. Otherwise, spurious results can be obtained. Based on this analysis, and using an updated equivalent effective stratospheric chlorine (EESC) function, we predict that an increase in total ozone attributable to EESC should be detectable at the 95% confidence level by 2015 at the latest in southern midlatitudes, and by 2020–2025 at the latest over 30°–45°N, with the time to detection increasing rapidly with latitude north of this range.
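
A minimal sketch of the estimation issue discussed above: remove the deterministic (periodic) signal first, then fit the log-log periodogram slope only over the lowest frequencies. The synthetic series below is AR(1) noise plus an annual cycle, so the low-frequency limit should return H close to 0.5; fitting across all frequencies would instead pick up the AR(1) rolloff and inflate the estimate. The conversion H = (beta + 1)/2 assumes a spectrum S(f) proportional to f^(1-2H).

```python
# Sketch: Hurst exponent from the low-frequency end of the periodogram.
import numpy as np

rng = np.random.default_rng(2)
n = 4096
t = np.arange(n)
ar1 = np.zeros(n)
for i in range(1, n):                       # red (AR(1)) noise; H -> 0.5 at low f
    ar1[i] = 0.8 * ar1[i - 1] + rng.normal()
x = 1.5 * np.sin(2 * np.pi * t / 12) + ar1  # add an annual cycle (monthly data)

# 1. Filter the deterministic periodic signal (least-squares harmonic fit)
A = np.column_stack([np.sin(2 * np.pi * t / 12),
                     np.cos(2 * np.pi * t / 12), np.ones(n)])
x = x - A @ np.linalg.lstsq(A, x, rcond=None)[0]

# 2. Periodogram, then a log-log slope fit restricted to the lowest frequencies
freqs = np.fft.rfftfreq(n)[1:]
power = np.abs(np.fft.rfft(x - x.mean())[1:]) ** 2 / n
low = freqs < 0.01                          # keep the low-frequency limit only
beta = -np.polyfit(np.log(freqs[low]), np.log(power[low]), 1)[0]
H = (beta + 1) / 2                          # from S(f) ~ f^(1 - 2H)
print(f"slope beta = {beta:.2f}, Hurst exponent H = {H:.2f}")
```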

Relevance:

90.00%

Publisher:

Abstract:

In 2004 the National Household Survey (Pesquisa Nacional por Amostra de Domicílios - PNAD) estimated the prevalence of food and nutrition insecurity in Brazil. However, PNAD data cannot be disaggregated at the municipal level. The objective of this study was to build a statistical model to predict severe food insecurity for Brazilian municipalities based on the PNAD dataset. Exclusion criteria were: incomplete food security data (19.30%); informants younger than 18 years old (0.07%); collective households (0.05%); and households headed by indigenous persons (0.19%). The modeling was carried out in three stages, beginning with the selection of variables related to food insecurity using univariate logistic regression. The variables chosen to construct the municipal estimates were selected from those included in the PNAD as well as the 2000 Census. Multivariate logistic regression was then applied, removing the non-significant variables, with odds ratios adjusted by multiple logistic regression. The Wald test was applied to check the significance of the coefficients in the logistic equation. The final model included the variables: per capita income; years of schooling; race and gender of the household head; urban or rural residence; access to public water supply; presence of children; total number of household inhabitants; and state of residence. The adequacy of the model was tested using the Hosmer-Lemeshow test (p = 0.561) and the ROC curve (area = 0.823). These tests indicated that the model has strong predictive power and can be used to determine household food insecurity in Brazilian municipalities, suggesting that similar predictive models may be useful tools in other Latin American countries.
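
A hedged sketch of the final modelling stages on invented data: a multiple logistic regression, the Hosmer-Lemeshow statistic over ten risk deciles, and the ROC AUC. The variable names echo the abstract, but all coefficients and data are made up, and the variable-selection stages are omitted.

```python
# Sketch: logistic model for severe food insecurity with HL and ROC diagnostics.
import numpy as np
from scipy.stats import chi2
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(3)
n = 5000
income = rng.lognormal(6, 1, n)            # per capita income (invented scale)
schooling = rng.integers(0, 16, n)         # years of schooling of household head
rural = rng.integers(0, 2, n)              # 1 = rural residence
logit = 2.0 - 0.004 * income - 0.15 * schooling + 0.5 * rural
y = rng.random(n) < 1 / (1 + np.exp(-logit))

X = np.column_stack([income, schooling, rural])
model = LogisticRegression(max_iter=1000).fit(X, y)
p = model.predict_proba(X)[:, 1]

# Hosmer-Lemeshow: observed vs expected events within g = 10 risk deciles
g = 10
groups = np.array_split(np.argsort(p), g)
hl = sum((y[idx].sum() - p[idx].sum()) ** 2
         / (p[idx].sum() * (1 - p[idx].mean())) for idx in groups)
print(f"Hosmer-Lemeshow p = {chi2.sf(hl, g - 2):.3f}, AUC = {roc_auc_score(y, p):.3f}")
```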

Relevance:

90.00%

Publisher:

Abstract:

A number of recent works have introduced statistical methods for detecting genetic loci that affect phenotypic variability, which we refer to as variability-controlling quantitative trait loci (vQTL). These are genetic variants whose allelic state predicts how much phenotype values will vary about their expected means. Such loci are of great potential interest in both human and non-human genetic studies, one reason being that a detected vQTL could represent a previously undetected interaction with other genes or environmental factors. The simultaneous publication of these new methods in different journals has in many cases precluded the opportunity for comparison. We survey some of these methods, the respective trade-offs they imply, and the connections between them. The methods fall into three main groups: classical non-parametric, fully parametric, and semi-parametric two-stage approximations. Choosing between alternatives involves balancing the need for robustness, flexibility, and speed. For each method, we identify important assumptions and limitations, including those of practical importance, such as their scope for including covariates and random effects. We show in simulations that both parametric methods and their semi-parametric approximations can give elevated false positive rates when they ignore mean-variance relationships intrinsic to the data generation process. We conclude that choice of method depends on the trait distribution, the need to include non-genetic covariates, and the population size and structure, coupled with a critical evaluation of how these fit with the assumptions of the statistical model.
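
One of the classical non-parametric methods surveyed can be sketched in a few lines: a Brown-Forsythe (median-centred Levene) test of equal phenotype variance across genotype groups at a marker. The simulated marker below alters the variance but not the mean, the signature of a vQTL; the parametric and semi-parametric two-stage methods are not shown.

```python
# Sketch: a classical non-parametric vQTL test at a single simulated marker.
import numpy as np
from scipy.stats import levene

rng = np.random.default_rng(4)
n = 600
genotype = rng.integers(0, 3, n)                # 0/1/2 copies of the minor allele
# The allele changes the *variance*, not the mean: a variance-controlling locus
phenotype = rng.normal(0, 1 + 0.4 * genotype)

groups = [phenotype[genotype == g] for g in range(3)]
stat, pval = levene(*groups, center="median")   # Brown-Forsythe variant
print(f"Brown-Forsythe statistic = {stat:.2f}, p = {pval:.2g}")
```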

Relevance:

90.00%

Publisher:

Abstract:

This paper reviews the application of statistical models to planning and evaluating cancer screening programmes. Models used to analyse screening strategies can be classified as either surface models, which consider only those events which can be directly observed such as disease incidence, prevalence or mortality, or deep models, which incorporate hypotheses about the disease process that generates the observed events. This paper focuses on the latter type. These can be further classified as analytic models, which use a model of the disease to derive direct estimates of characteristics of the screening procedure and its consequent benefits, and simulation models, which use the disease model to simulate the course of the disease in a hypothetical population with and without screening and derive measures of the benefit of screening from the simulation outcomes. The main approaches to each type of model are described and an overview given of their historical development and strengths and weaknesses. A brief review of fitting and validating such models is given and finally a discussion of the current state of, and likely future trends in, cancer screening models is presented.
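
A toy example of a 'deep' simulation model in the above sense: the disease process (preclinical onset, sojourn time, clinical surfacing) is represented explicitly, and a measure of screening benefit is read off the simulated cohort. All rates and the screening schedule are invented, purely to illustrate the simulation style.

```python
# Sketch: a minimal "deep" simulation model of a screening programme.
import numpy as np

rng = np.random.default_rng(5)
n = 100_000
onset = rng.uniform(40, 80, n)                 # age at preclinical onset (invented)
sojourn = rng.exponential(4.0, n)              # screen-detectable years (invented)
clinical = onset + sojourn                     # age at clinical surfacing

screen_ages = np.arange(50, 75, 2)             # biennial screening, ages 50-74
detected = np.zeros(n, dtype=bool)
for a in screen_ages:
    detected |= (onset <= a) & (a < clinical)  # screen falls inside sojourn window

print(f"fraction screen-detected before clinical surfacing: {detected.mean():.2f}")
```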

Relevance:

90.00%

Publisher:

Abstract:

Developing water quality guidelines for Antarctic marine environments requires understanding the sensitivity of local biota to contaminant exposure. In previous experiments using standard toxicity tests, Antarctic invertebrates have shown slower contaminant responses than temperate and tropical species. Consequently, test methods that take into account the environmental conditions and biological characteristics of cold-climate species need to be developed. This study investigated the effects of five metals on the survival of a common Antarctic amphipod, Orchomenella pinguides. Multiple observations assessing mortality under metal exposure were made over the 30-day exposure period. Traditional toxicity tests with quantal data sets are analysed using methods such as maximum likelihood regression (probit analysis) and Spearman–Kärber, which treat individual time-period endpoints independently. A new statistical model was developed to integrate the time-series concentration–response data obtained in this study. Grouped survival data were modelled using a generalized additive mixed model (GAMM), which incorporates all the data obtained from multiple observation times to derive time-integrated point estimates. The sensitivity of the amphipod O. pinguides to metals increased with increasing exposure time. Response times varied for different metals, with amphipods responding faster to copper than to cadmium, lead or zinc. As indicated by 30-day lethal concentration (LC50) estimates, copper was the most toxic metal (31 µg/L), followed by cadmium (168 µg/L), lead (256 µg/L) and zinc (822 µg/L). Nickel exposure (up to 1.12 mg/L) did not affect amphipod survival. Using longer exposure durations and the GAMM provides an improved methodology for assessing the sensitivities of slow-responding Antarctic marine invertebrates to contaminants.
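
For contrast with the paper's GAMM, here is a sketch of the classical single-time-point building block it improves on: a binomial GLM with a probit link (probit analysis) fitted to grouped mortality counts and inverted for the LC50. The counts below are invented.

```python
# Sketch: probit dose-response fit and LC50 from grouped mortality counts.
import numpy as np
import statsmodels.api as sm

conc = np.array([10, 30, 100, 300, 1000.0])     # metal concentration (ug/L), invented
n_exposed = np.full(5, 20)
n_dead = np.array([1, 4, 12, 18, 20])           # mortality after 30 d (made up)

X = sm.add_constant(np.log10(conc))
glm = sm.GLM(np.column_stack([n_dead, n_exposed - n_dead]), X,
             family=sm.families.Binomial(link=sm.families.links.Probit()))
fit = glm.fit()
b0, b1 = fit.params
lc50 = 10 ** (-b0 / b1)                         # probit = 0 at 50% mortality
print(f"LC50 = {lc50:.0f} ug/L")
```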

Relevance:

90.00%

Publisher:

Abstract:

Statistical time-series methods have proven to be a promising technique in structural health monitoring, since they provide a direct form of data analysis and eliminate the requirement for domain transformation. Recent research in structural health monitoring presents a number of statistical models that have been successfully used to construct quantified models of vibration response signals. Although a majority of these studies present viable results, the aspects of practical implementation, statistical model construction and decision-making procedures are often vaguely defined or omitted from the presented work. In this article, a comprehensive methodology is developed that utilizes an auto-regressive moving average with exogenous input (ARMAX) model to create quantified model estimates of experimentally acquired response signals. An iterative self-fitting algorithm is proposed to construct and fit the ARMAX model, which is capable of integrally finding an optimum set of ARMAX model parameters. After creating a dataset of quantified response signals, an unlabelled response signal can be identified according to the 'closest fit' available in the dataset. A unique averaging method is proposed and implemented for multi-sensor data fusion to decrease the margin of error across sensors, thus increasing the reliability of global damage identification. To demonstrate the effectiveness of the developed methodology, a steel frame structure subjected to various bolt-connection damage scenarios is tested. Damage identification results from the experimental study suggest that the proposed methodology can be employed as an efficient and functional damage identification tool.
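
A hedged sketch of the identification loop: an ARMAX-type model is fitted to each reference signal (statsmodels' SARIMAX with an exogenous input serves as a stand-in here), and an unknown signal is labelled by the nearest parameter vector, a simple 'closest fit'. The signals are synthetic, and the paper's iterative self-fitting order selection and multi-sensor averaging are not reproduced.

```python
# Sketch: ARMAX parameter vectors as signatures for closest-fit identification.
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(6)

def simulate(ar, n=500):
    """AR(2) response to a white-noise excitation (stand-in for a test signal)."""
    u = rng.normal(size=n)
    y = np.zeros(n)
    for t in range(2, n):
        y[t] = ar[0] * y[t - 1] + ar[1] * y[t - 2] + u[t]
    return y, u

def armax_params(y, u):
    fit = SARIMAX(y, exog=u, order=(2, 0, 1)).fit(disp=False)
    return fit.params

reference = {"healthy": armax_params(*simulate([0.6, -0.3])),
             "damaged": armax_params(*simulate([0.2, 0.1]))}

unknown = armax_params(*simulate([0.21, 0.09]))     # unlabelled response signal
label = min(reference, key=lambda k: np.linalg.norm(reference[k] - unknown))
print("closest fit:", label)
```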

Relevance:

90.00%

Publisher:

Abstract:

The US term structure of interest rates plays a central role in fixed-income analysis. For example, accurately estimating the US term structure is a crucial step for those interested in analyzing Brazilian Brady bonds such as IDUs, DCBs, FLIRBs, EIs, etc. In this work we present a statistical model to estimate the US term structure of interest rates. We address all major issues that arose in implementing the model, concentrating on practical matters such as computational efficiency, robustness of the final implementation and the statistical properties of the final model. Numerical examples are provided to illustrate the use of the model on a daily basis.
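
The abstract does not name the curve family used, so, as an assumption, the sketch below fits a Nelson-Siegel curve, one common statistical model for term structures, to a handful of invented US yields by least squares, simply to illustrate the daily curve-fitting exercise involved.

```python
# Sketch: Nelson-Siegel yield-curve fit (an assumed model, not the paper's).
import numpy as np
from scipy.optimize import least_squares

maturities = np.array([0.25, 0.5, 1, 2, 5, 10, 30.0])   # years
yields = np.array([5.1, 5.2, 5.4, 5.7, 6.0, 6.2, 6.4])  # percent (made up)

def nelson_siegel(params, m):
    b0, b1, b2, tau = params
    x = m / tau
    return b0 + b1 * (1 - np.exp(-x)) / x + b2 * ((1 - np.exp(-x)) / x - np.exp(-x))

res = least_squares(lambda p: nelson_siegel(p, maturities) - yields,
                    x0=[6.0, -1.0, 1.0, 2.0])
print("beta0, beta1, beta2, tau =", np.round(res.x, 3))
```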

Relevance:

90.00%

Publisher:

Abstract:

Atypical points in the data may result in meaningless efficient frontiers. This follows since portfolios constructed using classical estimates may reflect neither the usual nor the unusual days' patterns. On the other hand, portfolios constructed using robust approaches are able to capture just the dynamics of the usual days, which constitute the majority of the business days. In this paper we propose a statistical model and a robust estimation procedure to obtain an efficient frontier that takes into account the behavior of both the usual and most of the atypical days. We show, using real data and simulations, that portfolios constructed in this way require less frequent rebalancing and may yield higher expected returns for any risk level.
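
A minimal sketch of why robust estimation matters here: minimum-variance portfolio weights computed from the classical sample covariance versus a robust minimum-covariance-determinant estimate, on synthetic returns contaminated with a few atypical days. The paper's own estimator, which balances usual and atypical days, is not reproduced.

```python
# Sketch: classical vs robust covariance in a minimum-variance portfolio.
import numpy as np
from sklearn.covariance import MinCovDet

rng = np.random.default_rng(7)
usual = rng.multivariate_normal([0.0004, 0.0003, 0.0005],
                                np.diag([1e-4, 2e-4, 1.5e-4]), 980)
atypical = rng.multivariate_normal([0, 0, 0], 25e-4 * np.eye(3), 20)  # crisis days
returns = np.vstack([usual, atypical])

def min_variance_weights(cov):
    w = np.linalg.solve(cov, np.ones(len(cov)))
    return w / w.sum()

print("classical:", np.round(min_variance_weights(np.cov(returns.T)), 3))
print("robust   :", np.round(min_variance_weights(MinCovDet().fit(returns).covariance_), 3))
```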

Relevance:

90.00%

Publisher:

Abstract:

The objective of this study is to propose the implementation of a statistical model for volatility estimation that is not widespread in the Brazilian literature, the local scale model (LSM), presenting its advantages and disadvantages relative to the models commonly used for risk measurement. Parameters are estimated using daily Ibovespa quotes from January 2009 to December 2014, and the empirical accuracy of the models is assessed through out-of-sample tests comparing the VaR estimates obtained for January to December 2014. Explanatory variables were introduced in an attempt to improve the models, and the American counterpart of the Ibovespa, the Dow Jones index, was chosen because it exhibited properties such as high correlation, Granger causality and a significant log-likelihood ratio. One of the innovations of the local scale model is that it does not use the variance directly but rather its reciprocal, called the 'precision' of the series, which follows a kind of multiplicative random walk. The LSM captured all the stylized facts of the financial series, and the results favoured its use; the model is thus an efficient and parsimonious specification for estimating and forecasting volatility, since it has only one parameter to be estimated, which represents a paradigm shift relative to conditional heteroskedasticity models.
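
A hedged sketch of the one-parameter precision recursion at the heart of this model family. The exact filtering equations below (a gamma-distributed precision state updated by discounting with a single parameter omega, which works out to an exponentially weighted accumulation of squared returns) follow one published formulation of the local scale model and should be treated as an assumption; the returns are simulated stand-ins for the Ibovespa series.

```python
# Sketch: local-scale-model-style precision filter with one discount parameter.
import numpy as np

rng = np.random.default_rng(8)
returns = rng.standard_t(5, 1500) * 0.01       # fat-tailed stand-in returns

omega = 0.95                                   # the single parameter to estimate
a = b = 1e-3                                   # diffuse gamma(a, b) prior on precision
vol = np.empty(len(returns))
for t, y in enumerate(returns):
    a = omega * a + 0.5                        # discount old evidence, add new
    b = omega * b + 0.5 * y ** 2
    vol[t] = np.sqrt(b / a)                    # filtered volatility scale

print("last filtered volatility:", round(vol[-1], 4))
```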

Relevance:

90.00%

Publisher:

Abstract:

An economic-statistical model is developed for variable-parameter (VP) X̄ charts in which all design parameters vary adaptively; that is, each of the design parameters (sample size, sampling interval and control-limit width) varies as a function of the most recent process information. The cost function for controlling process quality through a VP X̄ chart is derived. During the optimization of the cost function, constraints are imposed on the expected times to signal when the process is in and out of control; in this way, the required statistical properties can be assured. Through a numerical example, the proposed economic-statistical design approach for VP X̄ charts is compared with the economic design for VP X̄ charts and with the economic-statistical and economic designs for fixed-parameter (FP) X̄ charts, in terms of operating cost and expected times to signal. From this example it is possible to assess the benefits provided by the proposed model. Varying some input parameters, their effect on the optimal cost and on the optimal values of the design parameters is also analysed.
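
A hedged sketch of the optimization at the core of such designs, shown for the simpler fixed-parameter chart (the VP case adds state-dependent switching between parameter sets): an hourly cost in the spirit of Duncan's model is minimised over sample size n, sampling interval h and limit width k, subject to a constraint on the in-control average time to signal. The cost function and all coefficients are invented simplifications.

```python
# Sketch: constrained economic-statistical design of a simple X-bar chart.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

delta, lam = 1.0, 0.02            # shift size (in sigma units), failures per hour
c_sample, c_false, c_oc = 0.5, 50.0, 100.0   # invented cost coefficients

def cost(x):
    n, h, k = x
    alpha = 2 * norm.sf(k)                        # false-alarm probability
    power = norm.sf(k - delta * np.sqrt(n)) + norm.cdf(-k - delta * np.sqrt(n))
    arl1 = 1 / power                              # out-of-control average run length
    return (c_sample * n / h                      # sampling cost per hour
            + c_false * alpha / h                 # false-alarm cost per hour
            + c_oc * lam * arl1 * h)              # expected off-target-time cost

def in_control_ats(x):                            # statistical constraint: ATS0 >= 200 h
    n, h, k = x
    return h / (2 * norm.sf(k)) - 200

res = minimize(cost, x0=[4, 1.0, 3.0],
               bounds=[(1, 25), (0.1, 8), (1.5, 4)],
               constraints=[{"type": "ineq", "fun": in_control_ats}])
n, h, k = res.x
print(f"n = {n:.1f}, h = {h:.2f} h, k = {k:.2f}")
```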

Relevance:

90.00%

Publisher:

Abstract:

Questions: We assess gap size and shape distributions, two important descriptors of the forest disturbance regime, by asking: which statistical model best describes gap size distribution; can simple geometric forms adequately describe gap shape; does gap size or shape vary with forest type, gap age or the method used for gap delimitation; and how similar are the studied forests to other tropical and temperate forests? Location: Southeastern Atlantic Forest, Brazil. Methods: Analysing over 150 gaps in two distinct forest types (seasonal and rain forests), a model selection framework was used to select appropriate probability distributions and functions to describe gap size and gap shape. The former was described using univariate probability distributions, whereas the latter was assessed based on the gap area-perimeter relationship. Comparisons of gap size and shape between sites, as well as between size and age classes, were then made based on the likelihood of models having different assumptions for the values of their parameters. Results: The log-normal distribution was the best descriptor of gap size distribution, independently of the forest type or gap delimitation method. Because gaps became more irregular as they increased in size, all geometric forms (triangle, rectangle and ellipse) were poor descriptors of gap shape. Only when small and large gaps (> 100 or 400 m², depending on the delimitation method) were treated separately did the rectangle and the isosceles triangle become accurate predictors of gap shape; ellipsoidal shapes remained poor descriptors. At both sites, gaps were at least 50% longer than they were wide, a finding with important implications for gap microclimate (e.g. the light entrance regime) and, consequently, for gap regeneration. Conclusions: In addition to more appropriate descriptions of gap size and shape, the model selection framework used here efficiently provided a means by which to compare the patterns of two different types of forest. With this framework we were able to recommend the log-normal parameters μ and σ for future comparisons of gap size distribution, and to propose possible mechanisms related to random rates of gap expansion and closure. We also showed that gap shape varied greatly and that no single geometric form was able to predict the shape of all gaps; the ellipse in particular should no longer be used as a standard gap shape.
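
A sketch of the model-selection step for the size distribution: maximum-likelihood fits of a few candidate distributions to synthetic gap areas, compared by AIC. The candidate set and the data are assumptions; the paper's likelihood framework also handled gap shape and between-forest comparisons, which are not shown.

```python
# Sketch: AIC-based selection among candidate gap-size distributions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
areas = rng.lognormal(mean=3.5, sigma=0.9, size=150)   # gap areas in m^2 (simulated)

candidates = {
    "log-normal": stats.lognorm,
    "exponential": stats.expon,
    "gamma": stats.gamma,
}
for name, dist in candidates.items():
    params = dist.fit(areas, floc=0)                   # MLE with location pinned at 0
    k = len(params) - 1                                # loc was fixed, not estimated
    ll = dist.logpdf(areas, *params).sum()
    print(f"{name:12s} AIC = {2 * k - 2 * ll:.1f}")
```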

Relevance:

90.00%

Publisher:

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)