973 resultados para statistical speaker models
Resumo:
The estimation of the long-term wind resource at a prospective site based on a relatively short on-site measurement campaign is an indispensable task in the development of a commercial wind farm. The typical industry approach is based on the measure-correlate-predict �MCP� method where a relational model between the site wind velocity data and the data obtained from a suitable reference site is built from concurrent records. In a subsequent step, a long-term prediction for the prospective site is obtained from a combination of the relational model and the historic reference data. In the present paper, a systematic study is presented where three new MCP models, together with two published reference models �a simple linear regression and the variance ratio method�, have been evaluated based on concurrent synthetic wind speed time series for two sites, simulating the prospective and the reference site. The synthetic method has the advantage of generating time series with the desired statistical properties, including Weibull scale and shape factors, required to evaluate the five methods under all plausible conditions. In this work, first a systematic discussion of the statistical fundamentals behind MCP methods is provided and three new models, one based on a nonlinear regression and two �termed kernel methods� derived from the use of conditional probability density functions, are proposed. All models are evaluated by using five metrics under a wide range of values of the correlation coefficient, the Weibull scale, and the Weibull shape factor. Only one of all models, a kernel method based on bivariate Weibull probability functions, is capable of accurately predicting all performance metrics studied.
Resumo:
Accurate decadal climate predictions could be used to inform adaptation actions to a changing climate. The skill of such predictions from initialised dynamical global climate models (GCMs) may be assessed by comparing with predictions from statistical models which are based solely on historical observations. This paper presents two benchmark statistical models for predicting both the radiatively forced trend and internal variability of annual mean sea surface temperatures (SSTs) on a decadal timescale based on the gridded observation data set HadISST. For both statistical models, the trend related to radiative forcing is modelled using a linear regression of SST time series at each grid box on the time series of equivalent global mean atmospheric CO2 concentration. The residual internal variability is then modelled by (1) a first-order autoregressive model (AR1) and (2) a constructed analogue model (CA). From the verification of 46 retrospective forecasts with start years from 1960 to 2005, the correlation coefficient for anomaly forecasts using trend with AR1 is greater than 0.7 over parts of extra-tropical North Atlantic, the Indian Ocean and western Pacific. This is primarily related to the prediction of the forced trend. More importantly, both CA and AR1 give skillful predictions of the internal variability of SSTs in the subpolar gyre region over the far North Atlantic for lead time of 2 to 5 years, with correlation coefficients greater than 0.5. For the subpolar gyre and parts of the South Atlantic, CA is superior to AR1 for lead time of 6 to 9 years. These statistical forecasts are also compared with ensemble mean retrospective forecasts by DePreSys, an initialised GCM. DePreSys is found to outperform the statistical models over large parts of North Atlantic for lead times of 2 to 5 years and 6 to 9 years, however trend with AR1 is generally superior to DePreSys in the North Atlantic Current region, while trend with CA is superior to DePreSys in parts of South Atlantic for lead time of 6 to 9 years. These findings encourage further development of benchmark statistical decadal prediction models, and methods to combine different predictions.
Resumo:
We consider the relation between so called continuous localization models—i.e. non-linear stochastic Schrödinger evolutions—and the discrete GRW-model of wave function collapse. The former can be understood as scaling limit of the GRW process. The proof relies on a stochastic Trotter formula, which is of interest in its own right. Our Trotter formula also allows to complement results on existence theory of stochastic Schrödinger evolutions by Holevo and Mora/Rebolledo.
Resumo:
We review and structure some of the mathematical and statistical models that have been developed over the past half century to grapple with theoretical and experimental questions about the stochastic development of aging over the life course. We suggest that the mathematical models are in large part addressing the problem of partitioning the randomness in aging: How does aging vary between individuals, and within an individual over the lifecourse? How much of the variation is inherently related to some qualities of the individual, and how much is entirely random? How much of the randomness is cumulative, and how much is merely short-term flutter? We propose that recent lines of statistical inquiry in survival analysis could usefully grapple with these questions, all the more so if they were more explicitly linked to the relevant mathematical and biological models of aging. To this end, we describe points of contact among the various lines of mathematical and statistical research. We suggest some directions for future work, including the exploration of information-theoretic measures for evaluating components of stochastic models as the basis for analyzing experiments and anchoring theoretical discussions of aging.
Resumo:
Logistic models are studied as a tool to convert dynamical forecast information (deterministic and ensemble) into probability forecasts. A logistic model is obtained by setting the logarithmic odds ratio equal to a linear combination of the inputs. As with any statistical model, logistic models will suffer from overfitting if the number of inputs is comparable to the number of forecast instances. Computational approaches to avoid overfitting by regularization are discussed, and efficient techniques for model assessment and selection are presented. A logit version of the lasso (originally a linear regression technique), is discussed. In lasso models, less important inputs are identified and the corresponding coefficient is set to zero, providing an efficient and automatic model reduction procedure. For the same reason, lasso models are particularly appealing for diagnostic purposes.
Resumo:
Progress in functional neuroimaging of the brain increasingly relies on the integration of data from complementary imaging modalities in order to improve spatiotemporal resolution and interpretability. However, the usefulness of merely statistical combinations is limited, since neural signal sources differ between modalities and are related non-trivially. We demonstrate here that a mean field model of brain activity can simultaneously predict EEG and fMRI BOLD with proper signal generation and expression. Simulations are shown using a realistic head model based on structural MRI, which includes both dense short-range background connectivity and long-range specific connectivity between brain regions. The distribution of modeled neural masses is comparable to the spatial resolution of fMRI BOLD, and the temporal resolution of the modeled dynamics, importantly including activity conduction, matches the fastest known EEG phenomena. The creation of a cortical mean field model with anatomically sound geometry, extensive connectivity, and proper signal expression is an important first step towards the model-based integration of multimodal neuroimages.
Resumo:
The occurrence of mid-latitude windstorms is related to strong socio-economic effects. For detailed and reliable regional impact studies, large datasets of high-resolution wind fields are required. In this study, a statistical downscaling approach in combination with dynamical downscaling is introduced to derive storm related gust speeds on a high-resolution grid over Europe. Multiple linear regression models are trained using reanalysis data and wind gusts from regional climate model simulations for a sample of 100 top ranking windstorm events. The method is computationally inexpensive and reproduces individual windstorm footprints adequately. Compared to observations, the results for Germany are at least as good as pure dynamical downscaling. This new tool can be easily applied to large ensembles of general circulation model simulations and thus contribute to a better understanding of the regional impact of windstorms based on decadal and climate change projections.
Resumo:
Statistical diagnostics of mixing and transport are computed for a numerical model of forced shallow-water flow on the sphere and a middle-atmosphere general circulation model. In particular, particle dispersion statistics, transport fluxes, Liapunov exponents (probability density functions and ensemble averages), and tracer concentration statistics are considered. It is shown that the behavior of the diagnostics is in accord with that of kinematic chaotic advection models so long as stochasticity is sufficiently weak. Comparisons with random-strain theory are made.
Resumo:
The response of North Atlantic and European extratropical cyclones to climate change is investigated in the climate models participating in phase 5 of the Coupled Model Intercomparison Project (CMIP5). In contrast to previous multimodel studies, a feature-tracking algorithm is here applied to separately quantify the re- sponses in the number, the wind intensity, and the precipitation intensity of extratropical cyclones. Moreover, a statistical framework is employed to formally assess the uncertainties in the multimodel projections. Under the midrange representative concentration pathway (RCP4.5) emission scenario, the December–February (DJF) response is characterized by a tripolar pattern over Europe, with an increase in the number of cyclones in central Europe and a decreased number in the Norwegian and Mediterranean Seas. The June–August (JJA) response is characterized by a reduction in the number of North Atlantic cyclones along the southern flank of the storm track. The total number of cyclones decreases in both DJF (24%) and JJA (22%). Classifying cyclones according to their intensity indicates a slight basinwide reduction in the number of cy- clones associated with strong winds, but an increase in those associated with strong precipitation. However, in DJF, a slight increase in the number and intensity of cyclones associated with strong wind speeds is found over the United Kingdom and central Europe. The results are confirmed under the high-emission RCP8.5 scenario, where the signals tend to be larger. The sources of uncertainty in these projections are discussed.
Resumo:
We outline our first steps towards marrying two new and emerging technologies; the Virtual Observatory (e.g, Astro- Grid) and the computational grid. We discuss the construction of VOTechBroker, which is a modular software tool designed to abstract the tasks of submission and management of a large number of computational jobs to a distributed computer system. The broker will also interact with the AstroGrid workflow and MySpace environments. We present our planned usage of the VOTechBroker in computing a huge number of n–point correlation functions from the SDSS, as well as fitting over a million CMBfast models to the WMAP data.
Resumo:
Although financial theory rests heavily upon the assumption that asset returns are normally distributed, value indices of commercial real estate display significant departures from normality. In this paper, we apply and compare the properties of two recently proposed regime switching models for value indices of commercial real estate in the US and the UK, both of which relax the assumption that observations are drawn from a single distribution with constant mean and variance. Statistical tests of the models' specification indicate that the Markov switching model is better able to capture the non-stationary features of the data than the threshold autoregressive model, although both represent superior descriptions of the data than the models that allow for only one state. Our results have several implications for theoretical models and empirical research in finance.
Resumo:
Second language acquisition researchers often face particular challenges when attempting to generalize study findings to the wider learner population. For example, language learners constitute a heterogeneous group, and it is not always clear how a study’s findings may generalize to other individuals who may differ in terms of language background and proficiency, among many other factors. In this paper, we provide an overview of how mixed-effects models can be used to help overcome these and other issues in the field of second language acquisition. We provide an overview of the benefits of mixed-effects models and a practical example of how mixed-effects analyses can be conducted. Mixed-effects models provide second language researchers with a powerful statistical tool in the analysis of a variety of different types of data.
Resumo:
This paper investigates the feasibility of using approximate Bayesian computation (ABC) to calibrate and evaluate complex individual-based models (IBMs). As ABC evolves, various versions are emerging, but here we only explore the most accessible version, rejection-ABC. Rejection-ABC involves running models a large number of times, with parameters drawn randomly from their prior distributions, and then retaining the simulations closest to the observations. Although well-established in some fields, whether ABC will work with ecological IBMs is still uncertain. Rejection-ABC was applied to an existing 14-parameter earthworm energy budget IBM for which the available data consist of body mass growth and cocoon production in four experiments. ABC was able to narrow the posterior distributions of seven parameters, estimating credible intervals for each. ABC’s accepted values produced slightly better fits than literature values do. The accuracy of the analysis was assessed using cross-validation and coverage, currently the best available tests. Of the seven unnarrowed parameters, ABC revealed that three were correlated with other parameters, while the remaining four were found to be not estimable given the data available. It is often desirable to compare models to see whether all component modules are necessary. Here we used ABC model selection to compare the full model with a simplified version which removed the earthworm’s movement and much of the energy budget. We are able to show that inclusion of the energy budget is necessary for a good fit to the data. We show how our methodology can inform future modelling cycles, and briefly discuss how more advanced versions of ABC may be applicable to IBMs. We conclude that ABC has the potential to represent uncertainty in model structure, parameters and predictions, and to embed the often complex process of optimizing an IBM’s structure and parameters within an established statistical framework, thereby making the process more transparent and objective.
Resumo:
Individual-based models (IBMs) can simulate the actions of individual animals as they interact with one another and the landscape in which they live. When used in spatially-explicit landscapes IBMs can show how populations change over time in response to management actions. For instance, IBMs are being used to design strategies of conservation and of the exploitation of fisheries, and for assessing the effects on populations of major construction projects and of novel agricultural chemicals. In such real world contexts, it becomes especially important to build IBMs in a principled fashion, and to approach calibration and evaluation systematically. We argue that insights from physiological and behavioural ecology offer a recipe for building realistic models, and that Approximate Bayesian Computation (ABC) is a promising technique for the calibration and evaluation of IBMs. IBMs are constructed primarily from knowledge about individuals. In ecological applications the relevant knowledge is found in physiological and behavioural ecology, and we approach these from an evolutionary perspective by taking into account how physiological and behavioural processes contribute to life histories, and how those life histories evolve. Evolutionary life history theory shows that, other things being equal, organisms should grow to sexual maturity as fast as possible, and then reproduce as fast as possible, while minimising per capita death rate. Physiological and behavioural ecology are largely built on these principles together with the laws of conservation of matter and energy. To complete construction of an IBM information is also needed on the effects of competitors, conspecifics and food scarcity; the maximum rates of ingestion, growth and reproduction, and life-history parameters. Using this knowledge about physiological and behavioural processes provides a principled way to build IBMs, but model parameters vary between species and are often difficult to measure. A common solution is to manually compare model outputs with observations from real landscapes and so to obtain parameters which produce acceptable fits of model to data. However, this procedure can be convoluted and lead to over-calibrated and thus inflexible models. Many formal statistical techniques are unsuitable for use with IBMs, but we argue that ABC offers a potential way forward. It can be used to calibrate and compare complex stochastic models and to assess the uncertainty in their predictions. We describe methods used to implement ABC in an accessible way and illustrate them with examples and discussion of recent studies. Although much progress has been made, theoretical issues remain, and some of these are outlined and discussed.
Resumo:
In this paper, we compare the performance of two statistical approaches for the analysis of data obtained from the social research area. In the first approach, we use normal models with joint regression modelling for the mean and for the variance heterogeneity. In the second approach, we use hierarchical models. In the first case, individual and social variables are included in the regression modelling for the mean and for the variance, as explanatory variables, while in the second case, the variance at level 1 of the hierarchical model depends on the individuals (age of the individuals), and in the level 2 of the hierarchical model, the variance is assumed to change according to socioeconomic stratum. Applying these methodologies, we analyze a Colombian tallness data set to find differences that can be explained by socioeconomic conditions. We also present some theoretical and empirical results concerning the two models. From this comparative study, we conclude that it is better to jointly modelling the mean and variance heterogeneity in all cases. We also observe that the convergence of the Gibbs sampling chain used in the Markov Chain Monte Carlo method for the jointly modeling the mean and variance heterogeneity is quickly achieved.