21 results for Data modeling
at Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
Panel data can be arranged into a matrix in two ways, called 'long' and 'wide' formats (LF and WF). The two formats suggest two alternative model approaches for analyzing panel data: (i) univariate regression with varying intercept; and (ii) multivariate regression with latent variables (a particular case of structural equation model, SEM). The present paper compares the two approaches, showing in which circumstances they yield equivalent (in some cases, even numerically equal) results. We show that the univariate approach gives results equivalent to the multivariate approach when restrictions of time invariance (in the paper, the TI assumption) are imposed on the parameters of the multivariate model. It is shown that the restrictions implicit in the univariate approach can be assessed by chi-square difference testing of two nested multivariate models. In addition, common tests encountered in the econometric analysis of panel data, such as the Hausman test, are shown to have an equivalent representation as chi-square difference tests. Commonalities and differences between the univariate and multivariate approaches are illustrated using an empirical panel data set of firms' profitability as well as a simulated panel data set.
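The long/wide distinction this abstract builds on can be illustrated with a minimal sketch; the firm/year/profit names and values below are illustrative, not taken from the paper:

```python
import pandas as pd

# Long format (LF): one row per firm-year observation.
long_df = pd.DataFrame({
    "firm": ["A", "A", "B", "B"],
    "year": [2001, 2002, 2001, 2002],
    "profit": [1.0, 1.5, 0.8, 0.9],
})

# Wide format (WF): one row per firm, one column per year.
wide_df = long_df.pivot(index="firm", columns="year", values="profit")

# Melting the wide table recovers the long format.
back = wide_df.reset_index().melt(id_vars="firm", value_name="profit")
```

The univariate (varying-intercept) approach operates naturally on the long table, while the multivariate (SEM) approach treats each year column of the wide table as a separate response variable.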
Abstract:
Roughly fifteen years ago, the Church of Jesus Christ of Latter-day Saints published a new proposed standard file format, called GEDCOM. It was designed to allow different genealogy programs to exchange data. Five years later, in May 2000, the GENTECH Data Modeling Project appeared, with the support of the Federation of Genealogical Societies (FGS) and other American genealogical societies. It attempted to define a genealogical logical data model to facilitate data exchange between different genealogy programs. Although genealogists deal with an enormous variety of data sources, one of the central concepts of this data model was that all genealogical data could be broken down into a series of short, formal genealogical statements. This was more versatile than merely exporting and importing data records with predefined fields. The project was finally absorbed in 2004 by the National Genealogical Society (NGS). Despite being a genealogical reference for many applications, these models have serious drawbacks when adapting to different cultural and social environments. At present we have no formal proposal for a recognized standard to represent the family domain. Here we propose an alternative conceptual model, largely inherited from the aforementioned models. The design is intended to overcome their limitations. However, its major innovation lies in applying the ontological paradigm when modeling statements and entities.
Abstract:
We explore the determinants of usage of six different types of health care services, using the Medical Expenditure Panel Survey data, years 1996-2000. We apply a number of models for univariate count data, including semiparametric, semi-nonparametric and finite mixture models. We find that the complexity of the model that is required to fit the data well depends upon the way in which the data is pooled across sexes and over time, and upon the characteristics of the usage measure. Pooling across time and sexes is almost always favored, but when more heterogeneous data is pooled it is often the case that a more complex statistical model is required.
Abstract:
We review recent likelihood-based approaches to modeling demand for medical care. A semi-nonparametric model along the lines of Cameron and Johansson's Poisson polynomial model, but using a negative binomial baseline model, is introduced. We apply these models, as well as a semiparametric Poisson, a hurdle semiparametric Poisson, and finite mixtures of negative binomial models, to six measures of health care usage taken from the Medical Expenditure Panel Survey. We conclude that most of the models lead to statistically similar results, both in terms of information criteria and conditional and unconditional prediction. This suggests that applied researchers may not need to be overly concerned with the choice of which of these models they use to analyze data on health care demand.
Abstract:
Research project based on a stay at the National Oceanography Centre, Southampton (NOCS), Great Britain, between May and July 2006. The ability to obtain an accurate estimate of sea surface salinity (SSS) is important for investigating and predicting the extent of climate change. The Soil Moisture and Ocean Salinity (SMOS) mission was selected by the European Space Agency (ESA) to obtain sea surface salinity maps on a global scale with a short revisit time. Before the launch of SMOS, analyses are planned of the horizontal variability of SSS and of the potential of data retrieved from SMOS measurements to reproduce known oceanographic behaviour. The overall objective is to fill the gap between reliable input/auxiliary data sources and the tools developed to simulate and process data acquired in the SMOS configuration. The SMOS End-to-end Performance Simulator (SEPS) is an ad hoc simulator developed by the Universitat Politècnica de Catalunya (UPC) to generate data according to the SMOS configuration. SEPS input data were taken from the Ocean Circulation and Climate Advanced Modeling (OCCAM) project, used at NOCS, at different spatial resolutions. By modifying SEPS to accept OCCAM data as input, simulated brightness temperature data were obtained over one month of different ascending passes covering the selected area. The tasks carried out during the stay at NOCS aimed to provide a reliable technique for external calibration and thus bias cancellation, a methodology for temporally averaging the different acquisitions during ascending passes, and to determine the best configuration of the cost function before exploiting and investigating the potential of the SEPS/OCCAM data for deriving retrieved SSS with high-resolution patterns.
Abstract:
This paper presents a two-factor (Vasicek-CIR) model of the term structure of interest rates and develops its pricing and empirical properties. We assume that default-free discount bond prices are determined by the time to maturity and two factors: the long-term interest rate and the spread. Assuming a certain process for both factors, a general bond pricing equation is derived and a closed-form expression for bond prices is obtained. Empirical evidence of the model's performance in comparison with a double Vasicek model is presented. The main conclusion is that modeling the volatility in the long-term rate process helps considerably to fit the observed data and improves, to a reasonable extent, the prediction of future movements in medium- and long-term interest rates. However, for shorter maturities, the pricing errors are basically negligible and it is not so clear which model is best.
Abstract:
The goal of this paper is to estimate time-varying covariance matrices. Since the covariance matrix of financial returns is known to change through time and is an essential ingredient in risk measurement, portfolio selection, and tests of asset pricing models, this is a very important problem in practice. Our model of choice is the Diagonal-Vech version of the Multivariate GARCH(1,1) model. The problem is that the estimation of the general Diagonal-Vech model is numerically infeasible in dimensions higher than 5. The common approach is to estimate more restrictive models which are tractable but may not conform to the data. Our contribution is to propose an alternative estimation method that is numerically feasible, produces positive semi-definite conditional covariance matrices, and does not impose unrealistic a priori restrictions. We provide an empirical application in the context of international stock markets, comparing the new estimator to a number of existing ones.
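The Diagonal-Vech GARCH(1,1) recursion the abstract refers to can be sketched as follows; the dimension, parameter matrices, and horizon are illustrative assumptions, not estimates from the paper:

```python
import numpy as np

# Diagonal-Vech GARCH(1,1) recursion, elementwise (Hadamard) form:
#   H_t = C + A * (eps_{t-1} eps_{t-1}') + B * H_{t-1}
# The constant rank-one A and B below keep every H_t positive
# semi-definite (Schur product theorem); illustrative values only.
rng = np.random.default_rng(0)
n, T = 3, 200
C = 0.05 * np.eye(n) + 0.01   # constant term (PSD)
A = np.full((n, n), 0.05)     # ARCH loadings (rank-one, PSD)
B = np.full((n, n), 0.90)     # GARCH loadings (rank-one, PSD)

H = np.eye(n)                 # initial conditional covariance
eps = rng.standard_normal(n)
for _ in range(T):
    H = C + A * np.outer(eps, eps) + B * H
    eps = rng.multivariate_normal(np.zeros(n), H)
```

In the general model each entry of C, A, and B is a free parameter, which is what makes unrestricted estimation infeasible beyond about five assets and is why the paper's estimator targets feasibility and positive semi-definiteness directly.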
Abstract:
…ic first-order transition line ending in a critical point. This critical point is responsible for the existence of large premartensitic fluctuations, which manifest as broad peaks in the specific heat, not always associated with a true phase transition. The main conclusion is that premartensitic effects result from the interplay between the softness of the anomalous phonon driving the modulation and the magnetoelastic coupling. In particular, the premartensitic transition occurs when such coupling is strong enough to freeze the phonon mode involved. The implications of the results in relation to the available experimental data are discussed.
Abstract:
Stress-strain trajectories associated with pseudoelastic behavior of a Cu-19.4 Zn-13.1 Al (at.%) single crystal at room temperature have been determined experimentally. For a constant cross-head speed the trajectories and the associated hysteresis behavior are perfectly reproducible; the trajectories exhibit memory properties, dependent only on the values of the return points, where the transformation direction is reverted. An adapted version of the Preisach model for hysteresis has been implemented to predict the observed trajectories, using a set of experimental first-order reversal curves as input data. Explicit formulas have been derived giving all trajectories in terms of this data set, with no adjustable parameters. Comparison between experimental and calculated trajectories shows a much better agreement for descending than for ascending paths, an indication of a dissymmetry between the dissipation mechanisms operative in the forward and reverse directions of the martensitic transformation.
Abstract:
We study theoretical and empirical aspects of the mean exit time (MET) of financial time series. The theoretical modeling is done within the framework of the continuous-time random walk. We empirically verify that the mean exit time follows a quadratic scaling law, with an associated prefactor that is specific to the analyzed stock. We perform a series of statistical tests to determine which kinds of correlation are responsible for this specificity. The main contribution is associated with the autocorrelation property of stock returns. We introduce and solve analytically both two-state and three-state Markov chain models. The analytical results obtained with the two-state Markov chain model allow us to collapse the 20 measured MET profiles onto a single master curve.
Abstract:
We formulate a new mixing model to explore the hydrological and chemical conditions under which the stream-catchment interface (SCI) influences the release of reactive solutes into stream water during storms. Physically, the SCI corresponds to the hyporheic/riparian sediments. In the new model this interface is coupled through a bidirectional water exchange to the conventional two-component mixing model. Simulations show that the influence of the SCI on stream solute dynamics during storms is detectable when the runoff event is dominated by the infiltrated groundwater component that flows through the SCI before entering the stream, and when the flux of solutes released from SCI sediments is similar to, or higher than, the solute flux carried by the groundwater. Dissolved organic carbon (DOC) and nitrate data from two small Mediterranean streams obtained during storms are compared to results from simulations using the new model to discern the circumstances under which the SCI is likely to control the dynamics of reactive solutes in streams. The simulations and the comparisons with empirical data suggest that the new mixing model may be especially appropriate for streams in which periodic, or persistent, abrupt changes in the level of riparian groundwater exert hydrologic control on the flux of biologically reactive solutes between the riparian/hyporheic compartment and the stream water.
Abstract:
Based on previous work (Hemelrijk 1998; Puga-González, Hildenbrant & Hemelrijk 2009), we have developed an agent-based model and software, called A-KinGDom, which allows us to simulate the emergence of the social structure in a group of non-human primates. The model includes dominance and affiliative interactions and incorporates two main innovations (preliminary dominance interactions and a kinship factor), which allow us to define four different attack and affiliative strategies. In accordance with these strategies, we compared the data obtained under four simulation conditions with the results obtained in a previous study (Dolado & Beltran 2012) involving empirical observations of a captive group of mangabeys (Cercocebus torquatus).
Abstract:
High-energy charged particles in the Van Allen radiation belts and in solar energetic particle events can damage satellites on orbit, leading to malfunctions and loss of satellite service. Here we describe some recent results from the SPACECAST project on modelling and forecasting the radiation belts, and on modelling solar energetic particle events. We describe the SPACECAST forecasting system, which uses physical models that include wave-particle interactions to forecast the electron radiation belts up to 3 h ahead. We show that the forecasts were able to reproduce the >2 MeV electron flux at GOES 13 during the moderate storm of 7-8 October 2012, and the period following a fast solar wind stream on 25-26 October 2012, to within a factor of 5 or so. At lower energies, from 10 keV to a few 100 keV, we show that the electron flux at geostationary orbit depends sensitively on the high-energy tail of the source distribution near 10 RE on the nightside of the Earth, and that the source is best represented by a kappa distribution. We present a new model of whistler mode chorus determined from multiple satellite measurements, which shows that the effects of wave-particle interactions beyond geostationary orbit are likely to be very significant. We also present radial diffusion coefficients calculated from satellite data at geostationary orbit which vary with Kp by over four orders of magnitude. We describe a new automated method, which takes entropy into account, to determine the position at the shock that is magnetically connected to the Earth for modelling solar energetic particle events, and we predict from analytical theory the form of the mean free path in the foreshock and the particle injection efficiency at the shock, which can be tested in simulations.
Abstract:
Background: During the last part of the 1990s the chance of surviving breast cancer increased. Changes in survival functions reflect a mixture of effects: both the introduction of adjuvant treatments and early screening with mammography played a role in the decline in mortality. Evaluating the contribution of these interventions using mathematical models requires survival functions before and after their introduction. Furthermore, the required survival functions may differ by age group and are related to disease stage at diagnosis. Sometimes detailed information is not available, as was the case for the region of Catalonia (Spain); one may then derive the functions using information from other geographical areas. This work presents the methodology used to estimate age- and stage-specific Catalan breast cancer survival functions from scarce Catalan survival data by adapting the age- and stage-specific US functions. Methods: Cubic splines were used to smooth data and obtain continuous hazard rate functions. We then fitted a Poisson model, with time as a covariate, to derive hazard ratios. The hazard ratios were applied to US survival functions detailed by age and stage to obtain Catalan estimations. Results: We first estimated the hazard ratios for Catalonia versus the USA before and after the introduction of screening. The hazard ratios were then multiplied by the age- and stage-specific breast cancer hazard rates from the USA to obtain the Catalan hazard rates. We also compared breast cancer survival in Catalonia and the USA in two time periods: before cancer control interventions (USA 1975–79, Catalonia 1980–89) and after (USA and Catalonia 1990–2001). Survival in Catalonia in the 1980–89 period was worse than in the USA during 1975–79, but the differences disappeared in 1990–2001. Conclusion: Our results suggest that access to better treatments and quality of care contributed to large improvements in survival in Catalonia. In addition, we obtained detailed breast cancer survival functions that will be used for modeling the effect of screening and adjuvant treatments in Catalonia.
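The adaptation step described in this last abstract (scaling US hazard rates by a Catalonia-vs-US hazard ratio, then converting hazards into a survival function) can be sketched minimally; the hazard values and the single ratio below are illustrative placeholders, not the study's estimates:

```python
import numpy as np

# Illustrative per-year US hazard rates for one age/stage group,
# and a single Catalonia-vs-USA hazard ratio from the Poisson model.
us_hazard = np.array([0.02, 0.03, 0.035, 0.04, 0.045])
hazard_ratio = 1.3

# Catalan hazards, and the implied survival function via
# S(t) = exp(-cumulative hazard up to t).
cat_hazard = hazard_ratio * us_hazard
cat_survival = np.exp(-np.cumsum(cat_hazard))
```

In the study this scaling is applied separately for each age group and disease stage, with the hazard ratios estimated before and after the introduction of screening.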