925 results for MIXED LINEAR-MODELS
Abstract:
Index tracking has become one of the most common strategies in asset management. The index-tracking problem consists of constructing a portfolio that replicates the future performance of an index by including only a subset of the index constituents in the portfolio. Finding the most representative subset is challenging when the number of stocks in the index is large. We introduce a new three-stage approach that first identifies promising subsets by employing data-mining techniques, then determines the stock weights in the subsets using mixed-binary linear programming, and finally evaluates the subsets based on cross-validation. The best subset is returned as the tracking portfolio. Our approach outperforms state-of-the-art methods in terms of out-of-sample performance and running times.
Abstract:
Consider a nonparametric regression model Y=mu*(X) + e, where the explanatory variables X are endogenous and e satisfies the conditional moment restriction E[e|W]=0 w.p.1 for instrumental variables W. It is well known that in these models the structural parameter mu* is 'ill-posed' in the sense that the function mapping the data to mu* is not continuous. In this paper, we derive the efficiency bounds for estimating linear functionals E[p(X)mu*(X)] and int_{supp(X)}p(x)mu*(x)dx, where p is a known weight function and supp(X) the support of X, without assuming mu* to be well-posed or even identified.
Abstract:
The performance of the Hosmer-Lemeshow global goodness-of-fit statistic for logistic regression models was explored in a wide variety of conditions not previously fully investigated. Computer simulations, each consisting of 500 regression models, were run to assess the statistic in 23 different situations. The items which varied among the situations included the number of observations used in each regression, the number of covariates, the degree of dependence among the covariates, the combinations of continuous and discrete variables, and the generation of the values of the dependent variable for model fit or lack of fit. The study found that the $C_g^*$ statistic was adequate in tests of significance for most situations. However, when testing data which deviate from a logistic model, the statistic has low power to detect such deviation. Although grouping of the estimated probabilities into quantiles from 8 to 30 was studied, the deciles of risk approach was generally sufficient. Subdividing the estimated probabilities into more than 10 quantiles when there are many covariates in the model is not necessary, despite theoretical reasons which suggest otherwise. Because it does not follow a $\chi^2$ distribution, the statistic is not recommended for use in models containing only categorical variables with a limited number of covariate patterns. The statistic performed adequately when there were at least 10 observations per quantile. Large numbers of observations per quantile did not lead to incorrect conclusions that the model did not fit the data when it actually did. However, the statistic failed to detect lack of fit when it existed and should be supplemented with further tests for the influence of individual observations. Careful examination of the parameter estimates is also essential since the statistic did not perform as desired when there was moderate to severe collinearity among covariates. Two methods studied for handling tied values of the estimated probabilities made only a slight difference in conclusions about model fit. Neither method split observations with identical probabilities into different quantiles. Approaches which create equal size groups by separating ties should be avoided.
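As a rough illustration of the deciles-of-risk grouping described in this abstract, the following Python sketch computes a Hosmer-Lemeshow style statistic from observed 0/1 outcomes and fitted probabilities. The function name `hosmer_lemeshow` and the tie rule (a group boundary is pushed forward until it no longer separates identical probabilities) are assumptions made for this example; they are not necessarily either of the two tie-handling methods studied by the author.

```python
def hosmer_lemeshow(outcomes, probabilities, groups=10):
    """Return (statistic, number_of_groups_formed).

    outcomes: observed 0/1 responses; probabilities: fitted P(y = 1).
    Hypothetical helper written for illustration only.
    """
    if len(outcomes) != len(probabilities) or not outcomes:
        raise ValueError("need two non-empty sequences of equal length")
    # Sort observations by fitted probability ("risk").
    pairs = sorted(zip(probabilities, outcomes))
    n = len(pairs)
    # Build group boundaries; never cut between tied probabilities.
    cuts = [0]
    for g in range(1, groups):
        cut = g * n // groups
        while 0 < cut < n and pairs[cut][0] == pairs[cut - 1][0]:
            cut += 1
        if cuts[-1] < cut < n:
            cuts.append(cut)
    cuts.append(n)
    statistic = 0.0
    for lo, hi in zip(cuts, cuts[1:]):
        chunk = pairs[lo:hi]
        size = len(chunk)
        observed = sum(y for _, y in chunk)   # observed events in group
        expected = sum(p for p, _ in chunk)   # expected events in group
        mean_p = expected / size
        variance = size * mean_p * (1.0 - mean_p)
        if variance > 0:
            statistic += (observed - expected) ** 2 / variance
    return statistic, len(cuts) - 1
```

The statistic is usually referred to a chi-square distribution with the number of groups minus 2 degrees of freedom; because ties can merge groups, the sketch returns the number of groups actually formed so the caller can decide how to handle that.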
Abstract:
The authors are from UPM and form a loosely connected group; all have worked on academic or real cases on this subject, at different times and being of different ages. Following the precedent of the probabilistic safety models for concrete of E. Torroja and A. Páez (Madrid, Spain, about 1957), a line of work now represented in the ICOSSAR conferences, author J.M. Antón, involved since autumn 1967 in European steel construction within CECM, produced a mathematical model for the reduction of superposed independent loads and, using it, a load coefficient pattern for codes (Rome, Feb. 1969), which was practically adopted for European construction; at JCSS in Lisbon in Feb. 1974 he gave a suggestion of union for concrete-steel-al.. That model represents each load with a Gumbel type I law referred to 50 years for a single type of load; this is reduced to 1 year so that it can be added to other independent loads, and the sum is then set, by Gumbel theory, to a 50-year return period. Parallel models exist. A complete reliability system was produced, including non-linear effects such as those from buckling, phenomena that are considered to some extent in the current construction Eurocodes derived from the Model Codes. The author considered the system in CEB in the presence of hydraulic effects from rivers, floods and the sea, with reference to current practice. When a road drainage standard was drafted at MOPU, Spain, the authors built an optimization model that gives a way to determine the return period, from 10 to 50 years, for the hydraulic flows to be considered in road drainage. Satisfactory examples were a stream in the SE of Spain, with a Gumbel type I model, and a paper by Ven Te Chow on the Mississippi at Keokuk, using Gumbel type II; the model can be modernized with more varied extreme-value laws. In fact, for the MOPU drainage standard the drafting commission also acted as an expert panel to set a table of return periods for road drainage elements, in effect as a complex multi-criteria decision system. These earlier ideas were used, e.g., in widely applied codes and presented at symposia or meetings, but were not published in English-language journals; a condensed account of the authors' contributions is presented here. The authors are also involved in optimization for hydraulic and agricultural planning, and give modest hints of intended applications in agricultural and environmental planning, as a selection of the criteria and utility functions involved in Bayesian, multi-criteria or mixed decision systems. Modest consideration is given to changes in climate, in production and commercial systems, and in other areas such as the social and financial.
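The reduction of a Gumbel type I load from a 50-year to a 1-year reference period can be illustrated with a standard identity: if yearly maxima are independent and identically Gumbel distributed, the maximum over N years is again Gumbel with the same scale and with its location shifted by scale × ln N. The Python sketch below assumes exactly that (independence and identical distribution between years); the function names and the numerical values in the usage note are hypothetical and are not taken from the authors' model.

```python
import math


def gumbel_cdf(x, location, scale):
    """Gumbel type I (maxima) cumulative distribution function."""
    return math.exp(-math.exp(-(x - location) / scale))


def shift_reference_period(location, scale, years_from, years_to):
    """Convert Gumbel parameters between reference periods.

    Assumes independent, identically distributed yearly maxima, so the
    scale is unchanged and the location moves by scale * ln(ratio).
    """
    return location + scale * math.log(years_to / years_from), scale


def return_level(location, scale, return_period):
    """Value exceeded on average once every `return_period` reference periods."""
    return location - scale * math.log(-math.log(1.0 - 1.0 / return_period))
```

For example, `shift_reference_period(100.0, 10.0, 50, 1)` gives the 1-year parameters of a load whose 50-year maximum has location 100 and scale 10 (illustrative numbers only); raising the 1-year distribution to the 50th power recovers the 50-year distribution.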
Abstract:
Algorithms for distributed agreement are a powerful means for formulating distributed versions of existing centralized algorithms. We present a toolkit for this task and show how it can be used systematically to design fully distributed algorithms for static linear Gaussian models, including principal component analysis, factor analysis, and probabilistic principal component analysis. These algorithms do not rely on a fusion center, require only low-volume local (1-hop neighborhood) communications, and are thus efficient, scalable, and robust. We show how they are also guaranteed to asymptotically converge to the same solution as the corresponding existing centralized algorithms. Finally, we illustrate the functioning of our algorithms on two examples, and examine the inherent cost-performance tradeoff.
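A minimal sketch of the kind of distributed agreement primitive this abstract refers to is average consensus: each node repeatedly nudges its value toward those of its 1-hop neighbours, and on a connected undirected graph with a small enough step all nodes approach the network-wide mean without any fusion center. The function name, the ring topology and the step size below are assumptions chosen for illustration; the paper's toolkit may be built from different primitives.

```python
def average_consensus(values, neighbors, step, iterations):
    """Run synchronous average-consensus iterations.

    values: initial local value at each node.
    neighbors: neighbors[i] lists the 1-hop neighbours of node i
               (the graph is assumed undirected and connected).
    step: update gain; assumed smaller than 1 / (maximum node degree).
    """
    x = list(values)
    for _ in range(iterations):
        # Each node uses only its own value and its neighbours' values.
        x = [
            xi + step * sum(x[j] - xi for j in neighbors[i])
            for i, xi in enumerate(x)
        ]
    return x


# Hypothetical 4-node ring: every node talks to two neighbours.
ring = [[1, 3], [0, 2], [1, 3], [2, 0]]
estimates = average_consensus([1.0, 2.0, 3.0, 6.0], ring, step=0.25, iterations=50)
```

Because the graph is undirected, each update preserves the sum of the values, so the common limit is the mean of the initial values (3.0 in this toy example).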
Abstract:
Assessing wind conditions becomes harder as terrain complexity increases. There is therefore a need to extrapolate reliably the wind parameters that determine wind farm viability, such as annual average wind speed at all hub heights and turbulence intensity. Work on these tasks began in the early 1990s with the widely used linear models WAsP and WAsP Engineering, which were designed for simple terrain and give remarkable results there, but perform less well on complex orography. Simultaneously, non-linearized Navier-Stokes solvers have developed rapidly in the last decade through CFD (Computational Fluid Dynamics) codes, which simulate atmospheric boundary layer flows over steep, complex terrain more accurately and so reduce uncertainties. This paper describes the features of these models and validates them against meteorological masts installed in highly complex terrain. The study compares the results of these models in terms of wind speed and turbulence intensity.
Abstract:
Short-run forecasting of electricity prices has become necessary for scheduling power generation units, since it is the basis of every profit maximization strategy. In this article a new and very easy method to compute accurate forecasts for electricity prices using mixed models is proposed. The main idea is to develop an efficient tool for one-step-ahead forecasting, combining several prediction methods whose forecasting performance has been checked and compared over a span of several years. Also as a novelty, the 24 hourly time series have been modelled separately, instead of the complete time series of prices. This allows one to take advantage of the homogeneity of these 24 time series. The purpose of this paper is to select the model that leads to smaller prediction errors and to obtain the appropriate length of time to use for forecasting. These results have been obtained by means of a computational experiment. A mixed model which combines the advantages of the two new models discussed is proposed. Some numerical results for the Spanish market are shown, but this new methodology can be applied to other electricity markets as well.
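The idea of modelling the 24 hourly series separately can be sketched as follows in Python. The split itself follows the abstract; the per-hour predictor (a plain average over the last few days) is only a placeholder assumed for this example and is not one of the models proposed in the article, and both function names are hypothetical.

```python
def split_by_hour(prices):
    """Split a flat hourly price series into 24 per-hour series.

    prices: list ordered in time, starting at hour 0, with whole days only.
    Returns a list of 24 lists; element h holds the price at hour h of each day.
    """
    if len(prices) % 24 != 0:
        raise ValueError("expected a whole number of 24-hour days")
    return [prices[hour::24] for hour in range(24)]


def forecast_next_day(prices, window=7):
    """One-step-ahead forecast for each of the 24 hours of the next day.

    Placeholder predictor: mean of the last `window` days of the same hour.
    """
    forecasts = []
    for series in split_by_hour(prices):
        recent = series[-window:]
        forecasts.append(sum(recent) / len(recent))
    return forecasts
```

Any per-hour model could be substituted for the placeholder average; the `window` argument plays the role of the length of history used for forecasting, which the article selects by computational experiment.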
Abstract:
Purely data-driven approaches for machine learning present difficulties when data are scarce relative to the complexity of the model or when the model is forced to extrapolate. On the other hand, purely mechanistic approaches need to identify and specify all the interactions in the problem at hand (which may not be feasible) and still leave the issue of how to parameterize the system. In this paper, we present a hybrid approach using Gaussian processes and differential equations to combine data-driven modeling with a physical model of the system. We show how different, physically inspired, kernel functions can be developed through sensible, simple, mechanistic assumptions about the underlying system. The versatility of our approach is illustrated with three case studies from motion capture, computational biology, and geostatistics.
Radar track segmentation with cubic splines for collision risk models in high density terminal areas
Abstract:
This paper presents a method to segment airplane radar tracks in high density terminal areas where the air traffic follows trajectories with several changes in heading, speed and altitude. The radar tracks are modelled with different types of segments: straight lines, cubic spline functions and shape-preserving cubic functions. The longitudinal, lateral and vertical deviations are calculated for terminal manoeuvring area scenarios. The most promising model of the radar tracks resulted from a mixed interpolation using straight lines for linear segments and cubic spline functions for curved segments. A sensitivity analysis is used to optimise the size of the window for the segmentation process.
Abstract:
In the present thesis we develop a framework for the numerical simulation of the mechanical behaviour of the human aorta using non-linear finite element models. Special attention is paid to three key aspects related to the biomechanics of soft tissues. First, the modelling of the characteristic anisotropic behaviour of soft tissue due to the collagen fibre families. Second, the modelling of the damage-related softening that blood vessels exhibit when subjected to loads beyond their physiological range. And finally, the inclusion of residual stresses in the simulations, in accordance with the opening-angle experiment. The modelling of damage is addressed with two different approaches. In the first approach a continuum local damage formulation with regularisation is presented. This formulation has two principal ingredients. On the one hand, it makes use of the principles of the smeared crack theory to avoid the mesh size dependence of the structural response in softening. On the other hand, it uses a two-dimensional Hodge-Petruska model to describe the fibrils as staggered arrays of tropocollagen molecules, and from this mesoscopic model the macroscopic material properties of the collagen fibres are obtained using a homogenisation process. In the second approach a non-local gradient-enhanced damage formulation is introduced. The model is built around the enhancement of the free energy function by means of a term that contains the referential gradient of the non-local damage variable. The inclusion of this term ensures an implicit regularisation of the finite element implementation, yielding mesh-objective simulation results. The applicability of the latter model to biomechanical problems is studied by means of the simulation of a typical surgical procedure, namely balloon angioplasty.
Abstract:
To effectively assess and mitigate risk of permafrost disturbance, disturbance-prone areas can be predicted through the application of susceptibility models. In this study we developed regional susceptibility models for permafrost disturbances using a field disturbance inventory to test the transferability of the model to a broader region in the Canadian High Arctic. Resulting maps of susceptibility were then used to explore the effect of terrain variables on the occurrence of disturbances within this region. To account for a large range of landscape characteristics, the model was calibrated using two locations: Sabine Peninsula, Melville Island, NU, and Fosheim Peninsula, Ellesmere Island, NU. Spatial patterns of disturbance were predicted with a generalized linear model (GLM) and generalized additive model (GAM), each calibrated using disturbed and randomized undisturbed locations from both locations and GIS-derived terrain predictor variables including slope, potential incoming solar radiation, wetness index, topographic position index, elevation, and distance to water. Each model was validated for the Sabine and Fosheim Peninsulas using independent data sets while the transferability of the model to an independent site was assessed at Cape Bounty, Melville Island, NU. The regional GLM and GAM validated well for both calibration sites (Sabine and Fosheim) with the area under the receiver operating curves (AUROC) > 0.79. Both models were applied directly to Cape Bounty without calibration and validated equally with AUROCs of 0.76; however, each model predicted disturbed and undisturbed samples differently. Additionally, the sensitivity of the transferred model was assessed using data sets with different sample sizes. Results indicated that models based on larger sample sizes transferred more consistently and captured the variability within the terrain attributes in the respective study areas. Terrain attributes associated with the initiation of disturbances were similar regardless of the location. Disturbances commonly occurred on slopes between 4 and 15°, below Holocene marine limit, and in areas with low potential incoming solar radiation.
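The AUROC values quoted in this abstract can be read as the probability that a randomly chosen disturbed location receives a higher predicted susceptibility than a randomly chosen undisturbed one. The Python sketch below computes that quantity directly from pairwise comparisons; the function name `auroc` and the toy data are illustrative assumptions, not the study's validation code or data.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for disturbed, 0 for undisturbed (assumed coding).
    scores: predicted susceptibility for each location.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example with made-up scores: three of four pairs are ordered correctly.
example = auroc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

The pairwise form is quadratic in the sample size, which is acceptable for a small validation set; rank-based formulas give the same value more efficiently on large ones.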
Abstract:
Pspline uses xtmixed to fit a penalized spline regression and plots the smoothed function. Additional covariates can be specified to adjust the smooth and plot partial residuals.
Abstract:
Cover title.