939 resultados para Statistical Language Model
Resumo:
The well-known lack of power of unit root tests has often been attributed to the shortlength of macroeconomic variables and also to DGP s that depart from the I(1)-I(0)alternatives. This paper shows that by using long spans of annual real GNP and GNPper capita (133 years) high power can be achieved, leading to the rejection of both theunit root and the trend-stationary hypothesis. This suggests that possibly neither modelprovides a good characterization of these data. Next, more flexible representations areconsidered, namely, processes containing structural breaks (SB) and fractional ordersof integration (FI). Economic justification for the presence of these features in GNP isprovided. It is shown that the latter models (FI and SB) are in general preferred to theARIMA (I(1) or I(0)) ones. As a novelty in this literature, new techniques are appliedto discriminate between FI and SB models. It turns out that the FI specification ispreferred, implying that GNP and GNP per capita are non-stationary, highly persistentbut mean-reverting series. Finally, it is shown that the results are robust when breaksin the deterministic component are allowed for in the FI model. Some macroeconomicimplications of these findings are also discussed.
Resumo:
This paper presents and estimates a dynamic choice model in the attribute space considering rational consumers. In light of the evidence of several state-dependence patterns, the standard attribute-based model is extended by considering a general utility function where pure inertia and pure variety-seeking behaviors can be explained in the model as particular linear cases. The dynamics of the model are fully characterized by standard dynamic programming techniques. The model presents a stationary consumption pattern that can be inertial, where the consumer only buys one product, or a variety-seeking one, where the consumer shifts among varied products.We run some simulations to analyze the consumption paths out of the steady state. Underthe hybrid utility assumption, the consumer behaves inertially among the unfamiliar brandsfor several periods, eventually switching to a variety-seeking behavior when the stationary levels are approached. An empirical analysis is run using scanner databases for three different product categories: fabric softener, saltine cracker, and catsup. Non-linear specifications provide the best fit of the data, as hybrid functional forms are found in all the product categories for most attributes and segments. These results reveal the statistical superiority of the non-linear structure and confirm the gradual trend to seek variety as the level of familiarity with the purchased items increases.
Resumo:
Many dynamic revenue management models divide the sale period into a finite number of periods T and assume, invoking a fine-enough grid of time, that each period sees at most one booking request. These Poisson-type assumptions restrict the variability of the demand in the model, but researchers and practitioners were willing to overlook this for the benefit of tractability of the models. In this paper, we criticize this model from another angle. Estimating the discrete finite-period model poses problems of indeterminacy and non-robustness: Arbitrarily fixing T leads to arbitrary control values and on the other hand estimating T from data adds an additional layer of indeterminacy. To counter this, we first propose an alternate finite-population model that avoids this problem of fixing T and allows a wider range of demand distributions, while retaining the useful marginal-value properties of the finite-period model. The finite-population model still requires jointly estimating market size and the parameters of the customer purchase model without observing no-purchases. Estimation of market-size when no-purchases are unobservable has rarely been attempted in the marketing or revenue management literature. Indeed, we point out that it is akin to the classical statistical problem of estimating the parameters of a binomial distribution with unknown population size and success probability, and hence likely to be challenging. However, when the purchase probabilities are given by a functional form such as a multinomial-logit model, we propose an estimation heuristic that exploits the specification of the functional form, the variety of the offer sets in a typical RM setting, and qualitative knowledge of arrival rates. Finally we perform simulations to show that the estimator is very promising in obtaining unbiased estimates of population size and the model parameters.
Resumo:
We propose a method for brain atlas deformation in the presence of large space-occupying tumors, based on an a priori model of lesion growth that assumes radial expansion of the lesion from its starting point. Our approach involves three steps. First, an affine registration brings the atlas and the patient into global correspondence. Then, the seeding of a synthetic tumor into the brain atlas provides a template for the lesion. The last step is the deformation of the seeded atlas, combining a method derived from optical flow principles and a model of lesion growth. Results show that a good registration is performed and that the method can be applied to automatic segmentation of structures and substructures in brains with gross deformation, with important medical applications in neurosurgery, radiosurgery, and radiotherapy.
Resumo:
In recent years there has been an explosive growth in the development of adaptive and data driven methods. One of the efficient and data-driven approaches is based on statistical learning theory (Vapnik 1998). The theory is based on Structural Risk Minimisation (SRM) principle and has a solid statistical background. When applying SRM we are trying not only to reduce training error ? to fit the available data with a model, but also to reduce the complexity of the model and to reduce generalisation error. Many nonlinear learning procedures recently developed in neural networks and statistics can be understood and interpreted in terms of the structural risk minimisation inductive principle. A recent methodology based on SRM is called Support Vector Machines (SVM). At present SLT is still under intensive development and SVM find new areas of application (www.kernel-machines.org). SVM develop robust and non linear data models with excellent generalisation abilities that is very important both for monitoring and forecasting. SVM are extremely good when input space is high dimensional and training data set i not big enough to develop corresponding nonlinear model. Moreover, SVM use only support vectors to derive decision boundaries. It opens a way to sampling optimization, estimation of noise in data, quantification of data redundancy etc. Presentation of SVM for spatially distributed data is given in (Kanevski and Maignan 2004).
Resumo:
Analysis of variance is commonly used in morphometry in order to ascertain differences in parameters between several populations. Failure to detect significant differences between populations (type II error) may be due to suboptimal sampling and lead to erroneous conclusions; the concept of statistical power allows one to avoid such failures by means of an adequate sampling. Several examples are given in the morphometry of the nervous system, showing the use of the power of a hierarchical analysis of variance test for the choice of appropriate sample and subsample sizes. In the first case chosen, neuronal densities in the human visual cortex, we find the number of observations to be of little effect. For dendritic spine densities in the visual cortex of mice and humans, the effect is somewhat larger. A substantial effect is shown in our last example, dendritic segmental lengths in monkey lateral geniculate nucleus. It is in the nature of the hierarchical model that sample size is always more important than subsample size. The relative weight to be attributed to subsample size thus depends on the relative magnitude of the between observations variance compared to the between individuals variance.
Resumo:
The predictive potential of six selected factors was assessed in 72 patients with primary myelodysplastic syndrome using univariate and multivariate logistic regression analysis of survival at 18 months. Factors were age (above median of 69 years), dysplastic features in the three myeloid bone marrow cell lineages, presence of chromosome defects, all metaphases abnormal, double or complex chromosome defects (C23), and a Bournemouth score of 2, 3, or 4 (B234). In the multivariate approach, B234 and C23 proved to be significantly associated with a reduction in the survival probability. The similarity of the regression coefficients associated with these two factors means that they have about the same weight. Consequently, the model was simplified by counting the number of factors (0, 1, or 2) present in each patient, thus generating a scoring system called the Lausanne-Bournemouth score (LB score). The LB score combines the well-recognized and easy-to-use Bournemouth score (B score) with the chromosome defect complexity, C23 constituting an additional indicator of patient outcome. The predicted risk of death within 18 months calculated from the model is as follows: 7.1% (confidence interval: 1.7-24.8) for patients with an LB score of 0, 60.1% (44.7-73.8) for an LB score of 1, and 96.8% (84.5-99.4) for an LB score of 2. The scoring system presented here has several interesting features. The LB score may improve the predictive value of the B score, as it is able to recognize two prognostic groups in the intermediate risk category of patients with B scores of 2 or 3. It has also the ability to identify two distinct prognostic subclasses among RAEB and possibly CMML patients. In addition to its above-described usefulness in the prognostic evaluation, the LB score may bring new insights into the understanding of evolution patterns in MDS. We used the combination of the B score and chromosome complexity to define four classes which may be considered four possible states of myelodysplasia and which describe two distinct evolutional pathways.
Resumo:
The work presented evaluates the statistical characteristics of regional bias and expected error in reconstructions of real positron emission tomography (PET) data of human brain fluoro-deoxiglucose (FDG) studies carried out by the maximum likelihood estimator (MLE) method with a robust stopping rule, and compares them with the results of filtered backprojection (FBP) reconstructions and with the method of sieves. The task of evaluating radioisotope uptake in regions-of-interest (ROIs) is investigated. An assessment of bias and variance in uptake measurements is carried out with simulated data. Then, by using three different transition matrices with different degrees of accuracy and a components of variance model for statistical analysis, it is shown that the characteristics obtained from real human FDG brain data are consistent with the results of the simulation studies.
Resumo:
Ground clutter caused by anomalous propagation (anaprop) can affect seriously radar rain rate estimates, particularly in fully automatic radar processing systems, and, if not filtered, can produce frequent false alarms. A statistical study of anomalous propagation detected from two operational C-band radars in the northern Italian region of Emilia Romagna is discussed, paying particular attention to its diurnal and seasonal variability. The analysis shows a high incidence of anaprop in summer, mainly in the morning and evening, due to the humid and hot summer climate of the Po Valley, particularly in the coastal zone. Thereafter, a comparison between different techniques and datasets to retrieve the vertical profile of the refractive index gradient in the boundary layer is also presented. In particular, their capability to detect anomalous propagation conditions is compared. Furthermore, beam path trajectories are simulated using a multilayer ray-tracing model and the influence of the propagation conditions on the beam trajectory and shape is examined. High resolution radiosounding data are identified as the best available dataset to reproduce accurately the local propagation conditions, while lower resolution standard TEMP data suffers from interpolation degradation and Numerical Weather Prediction model data (Lokal Model) are able to retrieve a tendency to superrefraction but not to detect ducting conditions. Observing the ray tracing of the centre, lower and upper limits of the radar antenna 3-dB half-power main beam lobe it is concluded that ducting layers produce a change in the measured volume and in the power distribution that can lead to an additional error in the reflectivity estimate and, subsequently, in the estimated rainfall rate.
Resumo:
The aim of this paper is twofold. First, we study the determinants of economic growth among a wide set of potential variables for the Spanish provinces (NUTS3). Among others, we include various types of private, public and human capital in the group of growth factors. Also,we analyse whether Spanish provinces have converged in economic terms in recent decades. Thesecond objective is to obtain cross-section and panel data parameter estimates that are robustto model speci¯cation. For this purpose, we use a Bayesian Model Averaging (BMA) approach.Bayesian methodology constructs parameter estimates as a weighted average of linear regression estimates for every possible combination of included variables. The weight of each regression estimate is given by the posterior probability of each model.
Resumo:
A general formulation of boundary conditions for semiconductor-metal contacts follows from a phenomenological procedure sketched here. The resulting boundary conditions, which incorporate only physically well-defined parameters, are used to study the classical unipolar drift-diffusion model for the Gunn effect. The analysis of its stationary solutions reveals the presence of bistability and hysteresis for a certain range of contact parameters. Several types of Gunn effect are predicted to occur in the model, when no stable stationary solution exists, depending on the value of the parameters of the injecting contact appearing in the boundary condition. In this way, the critical role played by contacts in the Gunn effect is clearly established.
Resumo:
The aim of this paper is twofold. First, we study the determinants of economic growth among a wide set of potential variables for the Spanish provinces (NUTS3). Among others, we include various types of private, public and human capital in the group of growth factors. Also,we analyse whether Spanish provinces have converged in economic terms in recent decades. Thesecond objective is to obtain cross-section and panel data parameter estimates that are robustto model speci¯cation. For this purpose, we use a Bayesian Model Averaging (BMA) approach.Bayesian methodology constructs parameter estimates as a weighted average of linear regression estimates for every possible combination of included variables. The weight of each regression estimate is given by the posterior probability of each model.
Resumo:
Objective: Health status measures usually have an asymmetric distribution and present a highpercentage of respondents with the best possible score (ceiling effect), specially when they areassessed in the overall population. Different methods to model this type of variables have beenproposed that take into account the ceiling effect: the tobit models, the Censored Least AbsoluteDeviations (CLAD) models or the two-part models, among others. The objective of this workwas to describe the tobit model, and compare it with the Ordinary Least Squares (OLS) model,that ignores the ceiling effect.Methods: Two different data sets have been used in order to compare both models: a) real datacomming from the European Study of Mental Disorders (ESEMeD), in order to model theEQ5D index, one of the measures of utilities most commonly used for the evaluation of healthstatus; and b) data obtained from simulation. Cross-validation was used to compare thepredicted values of the tobit model and the OLS models. The following estimators werecompared: the percentage of absolute error (R1), the percentage of squared error (R2), the MeanSquared Error (MSE) and the Mean Absolute Prediction Error (MAPE). Different datasets werecreated for different values of the error variance and different percentages of individuals withceiling effect. The estimations of the coefficients, the percentage of explained variance and theplots of residuals versus predicted values obtained under each model were compared.Results: With regard to the results of the ESEMeD study, the predicted values obtained with theOLS model and those obtained with the tobit models were very similar. The regressioncoefficients of the linear model were consistently smaller than those from the tobit model. In thesimulation study, we observed that when the error variance was small (s=1), the tobit modelpresented unbiased estimations of the coefficients and accurate predicted values, specially whenthe percentage of individuals wiht the highest possible score was small. However, when theerrror variance was greater (s=10 or s=20), the percentage of explained variance for the tobitmodel and the predicted values were more similar to those obtained with an OLS model.Conclusions: The proportion of variability accounted for the models and the percentage ofindividuals with the highest possible score have an important effect in the performance of thetobit model in comparison with the linear model.
Resumo:
BACKGROUND: Qualitative frameworks, especially those based on the logical discrete formalism, are increasingly used to model regulatory and signalling networks. A major advantage of these frameworks is that they do not require precise quantitative data, and that they are well-suited for studies of large networks. While numerous groups have developed specific computational tools that provide original methods to analyse qualitative models, a standard format to exchange qualitative models has been missing. RESULTS: We present the Systems Biology Markup Language (SBML) Qualitative Models Package ("qual"), an extension of the SBML Level 3 standard designed for computer representation of qualitative models of biological networks. We demonstrate the interoperability of models via SBML qual through the analysis of a specific signalling network by three independent software tools. Furthermore, the collective effort to define the SBML qual format paved the way for the development of LogicalModel, an open-source model library, which will facilitate the adoption of the format as well as the collaborative development of algorithms to analyse qualitative models. CONCLUSIONS: SBML qual allows the exchange of qualitative models among a number of complementary software tools. SBML qual has the potential to promote collaborative work on the development of novel computational approaches, as well as on the specification and the analysis of comprehensive qualitative models of regulatory and signalling networks.
Resumo:
We study the behavior of the random-bond Ising model at zero temperature by numerical simulations for a variable amount of disorder. The model is an example of systems exhibiting a fluctuationless first-order phase transition similar to some field-induced phase transitions in ferromagnetic systems and the martensitic phase transition appearing in a number of metallic alloys. We focus on the study of the hysteresis cycles appearing when the external field is swept from positive to negative values. By using a finite-size scaling hypothesis, we analyze the disorder-induced phase transition between the phase exhibiting a discontinuity in the hysteresis cycle and the phase with the continuous hysteresis cycle. Critical exponents characterizing the transition are obtained. We also analyze the size and duration distributions of the magnetization jumps (avalanches).