Biblioteca Digital

949 resultados para Statistical Model

Can knowledge of the state of the stratosphere be used to improve statistical forecasts of the troposphere?

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Recent analysis of the Arctic Oscillation (AO) in the stratosphere and troposphere has suggested that predictability of the state of the tropospheric AO may be obtained from the state of the stratospheric AO. However, much of this research has been of a purely qualitative nature. We present a more thorough statistical analysis of a long AO amplitude dataset which seeks to establish the magnitude of such a link. A relationship between the AO in the lower stratosphere and on the 1000 hPa surface on a 10-45 day time-scale is revealed. The relationship accounts for 5% of the variance of the 1000 hPa time series at its peak value and is significant at the 5% level. Over a similar time-scale the 1000 hPa time series accounts for 1% of itself and is not significant at the 5% level. Further investigation of the relationship reveals that it is only present during the winter season and in particular during February and March. It is also demonstrated that using stratospheric AO amplitude data as a predictor in a simple statistical model results in a gain of skill of 5% over a troposphere-only statistical model. This gain in skill is not repeated if an unrelated time series is included as a predictor in the model. Copyright © 2003 Royal Meteorological Society

Statistical downscaling of river flows

Relevância:

70.00% 70.00%

Publicador:

Resumo:

An extensive statistical ‘downscaling’ study is done to relate large-scale climate information from a general circulation model (GCM) to local-scale river flows in SW France for 51 gauging stations ranging from nival (snow-dominated) to pluvial (rainfall-dominated) river-systems. This study helps to select the appropriate statistical method at a given spatial and temporal scale to downscale hydrology for future climate change impact assessment of hydrological resources. The four proposed statistical downscaling models use large-scale predictors (derived from climate model outputs or reanalysis data) that characterize precipitation and evaporation processes in the hydrological cycle to estimate summary flow statistics. The four statistical models used are generalized linear (GLM) and additive (GAM) models, aggregated boosted trees (ABT) and multi-layer perceptron neural networks (ANN). These four models were each applied at two different spatial scales, namely at that of a single flow-gauging station (local downscaling) and that of a group of flow-gauging stations having the same hydrological behaviour (regional downscaling). For each statistical model and each spatial resolution, three temporal resolutions were considered, namely the daily mean flows, the summary statistics of fortnightly flows and a daily ‘integrated approach’. The results show that flow sensitivity to atmospheric factors is significantly different between nival and pluvial hydrological systems which are mainly influenced, respectively, by shortwave solar radiations and atmospheric temperature. The non-linear models (i.e. GAM, ABT and ANN) performed better than the linear GLM when simulating fortnightly flow percentiles. The aggregated boosted trees method showed higher and less variable R2 values to downscale the hydrological variability in both nival and pluvial regimes. Based on GCM cnrm-cm3 and scenarios A2 and A1B, future relative changes of fortnightly median flows were projected based on the regional downscaling approach. The results suggest a global decrease of flow in both pluvial and nival regimes, especially in spring, summer and autumn, whatever the considered scenario. The discussion considers the performance of each statistical method for downscaling flow at different spatial and temporal scales as well as the relationship between atmospheric processes and flow variability.

Predicting functional gene links from phylogenetic-statistical analyses of whole genomes

Relevância:

70.00% 70.00%

Publicador:

Resumo:

An important element of the developing field of proteomics is to understand protein-protein interactions and other functional links amongst genes. Across-species correlation methods for detecting functional links work on the premise that functionally linked proteins will tend to show a common pattern of presence and absence across a range of genomes. We describe a maximum likelihood statistical model for predicting functional gene linkages. The method detects independent instances of the correlated gain or loss of pairs of proteins on phylogenetic trees, reducing the high rates of false positives observed in conventional across-species methods that do not explicitly incorporate a phylogeny. We show, in a dataset of 10,551 protein pairs, that the phylogenetic method improves by up to 35% on across-species analyses at identifying known functionally linked proteins. The method shows that protein pairs with at least two to three correlated events of gain or loss are almost certainly functionally linked. Contingent evolution, in which one gene's presence or absence depends upon the presence of another, can also be detected phylogenetically, and may identify genes whose functional significance depends upon its interaction with other genes. Incorporating phylogenetic information improves the prediction of functional linkages. The improvement derives from having a lower rate of false positives and from detecting trends that across-species analyses miss. Phylogenetic methods can easily be incorporated into the screening of large-scale bioinformatics datasets to identify sets of protein links and to characterise gene networks.

Design and analysis of climate model experiments for the efficient estimation of anthropogenic signals

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Presented herein is an experimental design that allows the effects of several radiative forcing factors on climate to be estimated as precisely as possible from a limited suite of atmosphere-only general circulation model (GCM) integrations. The forcings include the combined effect of observed changes in sea surface temperatures, sea ice extent, stratospheric (volcanic) aerosols, and solar output, plus the individual effects of several anthropogenic forcings. A single linear statistical model is used to estimate the forcing effects, each of which is represented by its global mean radiative forcing. The strong colinearity in time between the various anthropogenic forcings provides a technical problem that is overcome through the design of the experiment. This design uses every combination of anthropogenic forcing rather than having a few highly replicated ensembles, which is more commonly used in climate studies. Not only is this design highly efficient for a given number of integrations, but it also allows the estimation of (nonadditive) interactions between pairs of anthropogenic forcings. The simulated land surface air temperature changes since 1871 have been analyzed. The changes in natural and oceanic forcing, which itself contains some forcing from anthropogenic and natural influences, have the most influence. For the global mean, increasing greenhouse gases and the indirect aerosol effect had the largest anthropogenic effects. It was also found that an interaction between these two anthropogenic effects in the atmosphere-only GCM exists. This interaction is similar in magnitude to the individual effects of changing tropospheric and stratospheric ozone concentrations or to the direct (sulfate) aerosol effect. Various diagnostics are used to evaluate the fit of the statistical model. For the global mean, this shows that the land temperature response is proportional to the global mean radiative forcing, reinforcing the use of radiative forcing as a measure of climate change. The diagnostic tests also show that the linear model was suitable for analyses of land surface air temperature at each GCM grid point. Therefore, the linear model provides precise estimates of the space time signals for all forcing factors under consideration. For simulated 50-hPa temperatures, results show that tropospheric ozone increases have contributed to stratospheric cooling over the twentieth century almost as much as changes in well-mixed greenhouse gases.

On the statistical modeling of persistence in total ozone anomalies

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Geophysical time series sometimes exhibit serial correlations that are stronger than can be captured by the commonly used first‐order autoregressive model. In this study we demonstrate that a power law statistical model serves as a useful upper bound for the persistence of total ozone anomalies on monthly to interannual timescales. Such a model is usually characterized by the Hurst exponent. We show that the estimation of the Hurst exponent in time series of total ozone is sensitive to various choices made in the statistical analysis, especially whether and how the deterministic (including periodic) signals are filtered from the time series, and the frequency range over which the estimation is made. In particular, care must be taken to ensure that the estimate of the Hurst exponent accurately represents the low‐frequency limit of the spectrum, which is the part that is relevant to long‐term correlations and the uncertainty of estimated trends. Otherwise, spurious results can be obtained. Based on this analysis, and using an updated equivalent effective stratospheric chlorine (EESC) function, we predict that an increase in total ozone attributable to EESC should be detectable at the 95% confidence level by 2015 at the latest in southern midlatitudes, and by 2020–2025 at the latest over 30°–45°N, with the time to detection increasing rapidly with latitude north of this range.

Use of a predictive model for food insecurity estimates in Brazil

Relevância:

70.00% 70.00%

Publicador:

Resumo:

In 2004 the National Household Survey (Pesquisa Nacional par Amostras de Domicilios - PNAD) estimated the prevalence of food and nutrition insecurity in Brazil. However, PNAD data cannot be disaggregated at the municipal level. The objective of this study was to build a statistical model to predict severe food insecurity for Brazilian municipalities based on the PNAD dataset. Exclusion criteria were: incomplete food security data (19.30%); informants younger than 18 years old (0.07%); collective households (0.05%); households headed by indigenous persons (0.19%). The modeling was carried out in three stages, beginning with the selection of variables related to food insecurity using univariate logistic regression. The variables chosen to construct the municipal estimates were selected from those included in PNAD as well as the 2000 Census. Multivariate logistic regression was then initiated, removing the non-significant variables with odds ratios adjusted by multiple logistic regression. The Wald Test was applied to check the significance of the coefficients in the logistic equation. The final model included the variables: per capita income; years of schooling; race and gender of the household head; urban or rural residence; access to public water supply; presence of children; total number of household inhabitants and state of residence. The adequacy of the model was tested using the Hosmer-Lemeshow test (p=0.561) and ROC curve (area=0.823). Tests indicated that the model has strong predictive power and can be used to determine household food insecurity in Brazilian municipalities, suggesting that similar predictive models may be useful tools in other Latin American countries.

Recent developments in statistical methods for detecting genetic loci affecting phenotypic variability

Relevância:

70.00% 70.00%

Publicador:

Resumo:

A number of recent works have introduced statistical methods for detecting genetic loci that affect phenotypic variability, which we refer to as variability-controlling quantitative trait loci (vQTL). These are genetic variants whose allelic state predicts how much phenotype values will vary about their expected means. Such loci are of great potential interest in both human and non-human genetic studies, one reason being that a detected vQTL could represent a previously undetected interaction with other genes or environmental factors. The simultaneous publication of these new methods in different journals has in many cases precluded opportunity for comparison. We survey some of these methods, the respective trade-offs they imply, and the connections between them. The methods fall into three main groups: classical non-parametric, fully parametric, and semi-parametric two-stage approximations. Choosing between alternatives involves balancing the need for robustness, flexibility, and speed. For each method, we identify important assumptions and limitations, including those of practical importance, such as their scope for including covariates and random effects. We show in simulations that both parametric methods and their semi-parametric approximations can give elevated false positive rates when they ignore mean-variance relationships intrinsic to the data generation process. We conclude that choice of method depends on the trait distribution, the need to include non-genetic covariates, and the population size and structure, coupled with a critical evaluation of how these fit with the assumptions of the statistical model.

A model to estimate the US term structure of interest rates

Relevância:

70.00% 70.00%

Publicador:

Resumo:

The US term structure of interest rates plays a central role in fixed-income analysis. For example, estimating accurately the US term structure is a crucial step for those interested in analyzing Brazilian Brady bonds such as IDUs, DCBs, FLIRBs, EIs, etc. In this work we present a statistical model to estimate the US term structure of interest rates. We address in this report all major issues which drove us in the process of implementing the model developed, concentrating on important practical issues such as computational efficiency, robustness of the final implementation, the statistical properties of the final model, etc. Numerical examples are provided in order to illustrate the use of the model on a daily basis.

Robust statistical modeling of portfolios

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Atypical points in the data may result in meaningless e±cient frontiers. This follows since portfolios constructed using classical estimates may re°ect neither the usual nor the unusual days patterns. On the other hand, portfolios constructed using robust approaches are able to capture just the dynamics of the usual days, which constitute the majority of the business days. In this paper we propose an statistical model and a robust estimation procedure to obtain an e±cient frontier which would take into account the behavior of both the usual and most of the atypical days. We show, using real data and simulations, that portfolios constructed in this way require less frequent rebalancing, and may yield higher expected returns for any risk level.

Cálculo do Value at Risk (VaR) para o Ibovespa, pós crise de 2008, por meio dos modelos de heterocedasticidade condicional (GARCH) e de volatilidade estocástica (Local Scale Model - LSM)

Relevância:

70.00% 70.00%

Publicador:

Resumo:

O objetivo deste estudo é propor a implementação de um modelo estatístico para cálculo da volatilidade, não difundido na literatura brasileira, o modelo de escala local (LSM), apresentando suas vantagens e desvantagens em relação aos modelos habitualmente utilizados para mensuração de risco. Para estimação dos parâmetros serão usadas as cotações diárias do Ibovespa, no período de janeiro de 2009 a dezembro de 2014, e para a aferição da acurácia empírica dos modelos serão realizados testes fora da amostra, comparando os VaR obtidos para o período de janeiro a dezembro de 2014. Foram introduzidas variáveis explicativas na tentativa de aprimorar os modelos e optou-se pelo correspondente americano do Ibovespa, o índice Dow Jones, por ter apresentado propriedades como: alta correlação, causalidade no sentido de Granger, e razão de log-verossimilhança significativa. Uma das inovações do modelo de escala local é não utilizar diretamente a variância, mas sim a sua recíproca, chamada de “precisão” da série, que segue uma espécie de passeio aleatório multiplicativo. O LSM captou todos os fatos estilizados das séries financeiras, e os resultados foram favoráveis a sua utilização, logo, o modelo torna-se uma alternativa de especificação eficiente e parcimoniosa para estimar e prever volatilidade, na medida em que possui apenas um parâmetro a ser estimado, o que representa uma mudança de paradigma em relação aos modelos de heterocedasticidade condicional.

Constrained optimization model for the design of an adaptive (X)over-bar chart

Relevância:

70.00% 70.00%

Publicador:

Resumo:

An economic-statistical model is developed for variable parameters (VP) (X) over bar charts in which all design parameters vary adaptively, that is, each of the design parameters (sample size, sampling interval and control-limit width) vary as a function of the most recent process information. The cost function due to controlling the process quality through a VP (X) over bar chart is derived. During the optimization of the cost function, constraints are imposed on the expected times to signal when the process is in and out of control. In this way, required statistical properties can be assured. Through a numerical example, the proposed economic-statistical design approach for VP (X) over bar charts is compared to the economic design for VP (X) over bar charts and to the economic-statistical and economic designs for fixed parameters (FP) (X) over bar charts in terms of the operating cost and the expected times to signal. From this example, it is possible to assess the benefits provided by the proposed model. Varying some input parameters, their effect on the optimal cost and on the optimal values of the design parameters was analysed.

Improving methods in gap ecology: Revisiting size and shape distributions using a model selection approach

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Questions: We assess gap size and shape distributions, two important descriptors of the forest disturbance regime, by asking: which statistical model best describes gap size distribution; can simple geometric forms adequately describe gap shape; does gap size or shape vary with forest type, gap age or the method used for gap delimitation; and how similar are the studied forests and other tropical and temperate forests? Location: Southeastern Atlantic Forest, Brazil. Methods: Analysing over 150 gaps in two distinct forest types (seasonal and rain forests), a model selection framework was used to select appropriate probability distributions and functions to describe gap size and gap shape. The first was described using univariate probability distributions, whereas the latter was assessed based on the gap area-perimeter relationship. Comparisons of gap size and shape between sites, as well as size and age classes were then made based on the likelihood of models having different assumptions for the values of their parameters. Results: The log-normal distribution was the best descriptor of gap size distribution, independently of the forest type or gap delimitation method. Because gaps became more irregular as they increased in size, all geometric forms (triangle, rectangle and ellipse) were poor descriptors of gap shape. Only when small and large gaps (> 100 or 400m2 depending on the delimitation method) were treated separately did the rectangle and isosceles triangle become accurate predictors of gap shape. Ellipsoidal shapes were poor descriptors. At both sites, gaps were at least 50% longer than they were wide, a finding with important implications for gap microclimate (e.g. light entrance regime) and, consequently, for gap regeneration. Conclusions: In addition to more appropriate descriptions of gap size and shape, the model selection framework used here efficiently provided a means by which to compare the patterns of two different types of forest. With this framework we were able to recommend the log-normal parameters μ and σ for future comparisons of gap size distribution, and to propose possible mechanisms related to random rates of gap expansion and closure. We also showed that gap shape varied highly and that no single geometric form was able to predict the shape of all gaps, the ellipse in particular should no longer be used as a standard gap shape. © 2012 International Association for Vegetation Science.

Non-destructive linear model for leaf area estimation in Vernonia ferruginea Less

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Accounting for uncertainty in ecological analysis: the strengths and limitations of hierarchical statistical modeling

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Analyses of ecological data should account for the uncertainty in the process(es) that generated the data. However, accounting for these uncertainties is a difficult task, since ecology is known for its complexity. Measurement and/or process errors are often the only sources of uncertainty modeled when addressing complex ecological problems, yet analyses should also account for uncertainty in sampling design, in model specification, in parameters governing the specified model, and in initial and boundary conditions. Only then can we be confident in the scientific inferences and forecasts made from an analysis. Probability and statistics provide a framework that accounts for multiple sources of uncertainty. Given the complexities of ecological studies, the hierarchical statistical model is an invaluable tool. This approach is not new in ecology, and there are many examples (both Bayesian and non-Bayesian) in the literature illustrating the benefits of this approach. In this article, we provide a baseline for concepts, notation, and methods, from which discussion on hierarchical statistical modeling in ecology can proceed. We have also planted some seeds for discussion and tried to show where the practical difficulties lie. Our thesis is that hierarchical statistical modeling is a powerful way of approaching ecological analysis in the presence of inevitable but quantifiable uncertainties, even if practical issues sometimes require pragmatic compromises.

Statistical methods for biomedical signal analysis and processing

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Statistical modelling and statistical learning theory are two powerful analytical frameworks for analyzing signals and developing efficient processing and classification algorithms. In this thesis, these frameworks are applied for modelling and processing biomedical signals in two different contexts: ultrasound medical imaging systems and primate neural activity analysis and modelling. In the context of ultrasound medical imaging, two main applications are explored: deconvolution of signals measured from a ultrasonic transducer and automatic image segmentation and classification of prostate ultrasound scans. In the former application a stochastic model of the radio frequency signal measured from a ultrasonic transducer is derived. This model is then employed for developing in a statistical framework a regularized deconvolution procedure, for enhancing signal resolution. In the latter application, different statistical models are used to characterize images of prostate tissues, extracting different features. These features are then uses to segment the images in region of interests by means of an automatic procedure based on a statistical model of the extracted features. Finally, machine learning techniques are used for automatic classification of the different region of interests. In the context of neural activity signals, an example of bio-inspired dynamical network was developed to help in studies of motor-related processes in the brain of primate monkeys. The presented model aims to mimic the abstract functionality of a cell population in 7a parietal region of primate monkeys, during the execution of learned behavioural tasks.

«
1
2
3
4
5
6
7
8
...
63
64
»