52 results for Vector Space Model
at Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
In this article we present a hybrid approach to the automatic summarization of Spanish medical texts. Many systems perform automatic summarization using either statistical or linguistic techniques, but only a few combine both. Our premise is that producing a good summary requires the linguistic aspects of texts, while also benefiting from the advantages of statistical techniques. We have integrated the Cortex (Vector Space Model) and Enertex (statistical physics) systems, coupled with the Yate term extractor, and the Disicosum system (linguistics). We have compared these systems and then integrated them into a hybrid approach. Finally, we have applied this hybrid system to a corpus of medical articles and evaluated its performance, obtaining good results.
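The Vector Space Model component can be illustrated with a minimal extractive-scoring sketch (an illustration of the general technique, not the actual Cortex system): each sentence becomes a bag-of-words vector, and sentences are ranked by cosine similarity to the vector of the whole document.

```python
import math
from collections import Counter

def cosine(a, b):
    # Cosine similarity between two sparse bag-of-words vectors.
    num = sum(w * b[t] for t, w in a.items() if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return num / (na * nb) if na and nb else 0.0

def rank_sentences(sentences):
    # Represent each sentence and the whole document as term-frequency
    # vectors; score sentences by cosine similarity to the document.
    doc = Counter(w for s in sentences for w in s.lower().split())
    scores = [cosine(Counter(s.lower().split()), doc) for s in sentences]
    return sorted(range(len(sentences)), key=lambda i: -scores[i])
```

Sentences closest to the document's overall term distribution rank first, which is the usual starting point for VSM-based extractive summarizers.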
Abstract:
Viruses evolve rapidly, and HIV in particular is known to be one of the fastest-evolving human viruses. It is now commonly accepted that viral evolution is the cause of the intriguing dynamics exhibited during HIV infections and of the ultimate success of the virus in its struggle with the immune system. To study viral evolution, we use a simple mathematical model of the within-host dynamics of HIV which incorporates random mutations. In this model, we assume a continuous distribution of viral strains in a one-dimensional phenotype space, where random mutations are modelled by diffusion. Numerical simulations show that random mutations combined with competition result in evolution towards higher Darwinian fitness: a stable traveling wave of evolution, moving towards higher levels of fitness, is formed in the phenotype space.
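A toy discretization of such a mutation-selection model (an illustrative sketch under simple assumptions, not the authors' exact equations): population density lives on a 1-D phenotype grid whose replication rate increases with the index, mutation acts as diffusion, and competition is a shared logistic term. The population-mean fitness then drifts upward over time.

```python
def evolve(steps, grid=60, diff=0.1, dt=0.01):
    # Population density over a 1-D phenotype grid; replication rate
    # (Darwinian fitness) grows with the grid index.
    fitness = [i / grid for i in range(grid)]
    n = [0.0] * grid
    n[5] = 1.0  # a single low-fitness founder strain
    for _ in range(steps):
        total = sum(n)
        new = n[:]
        for i in range(grid):
            # random mutation as diffusion, with zero-flux boundaries
            left = n[i - 1] if i > 0 else n[i]
            right = n[i + 1] if i < grid - 1 else n[i]
            mut = diff * (left - 2.0 * n[i] + right)
            # fitness-dependent growth with shared competition
            growth = n[i] * (fitness[i] - total / grid)
            new[i] = n[i] + dt * (growth + mut)
        n = new
    # population-mean fitness
    return sum(f * d for f, d in zip(fitness, n)) / sum(n)
```

Running the simulation longer yields a higher mean fitness, the discrete analogue of the traveling wave described in the abstract.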
Abstract:
This note describes how the Kalman filter can be modified to allow for the vector of observables to be a function of lagged variables without increasing the dimension of the state vector in the filter. This is useful in applications where it is desirable to keep the dimension of the state vector low. The modified filter and accompanying code (which nests the standard filter) can be used to compute (i) the steady-state Kalman filter, (ii) the log likelihood of a parameterized state space model conditional on a history of observables, (iii) a smoothed estimate of latent state variables, and (iv) a draw from the distribution of latent states conditional on a history of observables.
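For reference, the standard filter that the modified version nests can be sketched in its simplest form, a scalar local-level model (a hypothetical minimal example, not the note's accompanying code). The recursion below returns item (ii), the log likelihood conditional on a history of observables.

```python
import math

def kalman_loglik(y, q, r, x0=0.0, p0=1.0):
    # Log likelihood of a scalar local-level model:
    #   state:       x_t = x_{t-1} + w_t,  w_t ~ N(0, q)
    #   observation: y_t = x_t + v_t,      v_t ~ N(0, r)
    x, p, ll = x0, p0, 0.0
    for obs in y:
        p = p + q                      # predict state variance
        e = obs - x                    # innovation
        s = p + r                      # innovation variance
        ll -= 0.5 * (math.log(2 * math.pi * s) + e * e / s)
        k = p / s                      # Kalman gain
        x = x + k * e                  # update state estimate
        p = (1 - k) * p                # update state variance
    return ll
```

The same prediction-update loop, written with matrices, gives the multivariate filter; smoothing (item iii) runs a backward pass over the stored predictions.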
Abstract:
The deficit in our country regarding the availability of quantitative indicators with which to carry out a short-term analysis of regional industrial activity has opened a debate centred on which methodology is most appropriate for constructing indicators of this kind. Within this framework, this paper presents the main conclusions obtained in previous studies (Clar et al., 1997a, 1997b and 1998) on the suitability of extending the methodologies currently applied to the Spanish regions to construct indicators of industrial activity by indirect methods. These conclusions lead us to propose a strategy different from those currently in use. Specifically, following Israilevich and Kuttner (1993), we propose a latent-variable model to estimate the regional industrial production indicator. This type of model can be specified in state-space form and estimated with the Kalman filter. To validate the proposed methodology, indicators are estimated accordingly for three of the four Spanish regions that have an Industrial Production Index (IPI) compiled by the direct method (Andalusia, Asturias and the Basque Country) and are compared with the published (official) IPIs. The results show the good performance of the proposed strategy, opening a line of work with which to remedy the deficit mentioned above.
Abstract:
Macroeconomic activity has become less volatile over the past three decades in most G7 economies. The current literature focuses on characterizing the volatility reduction and on explanations for this so-called "moderation" in each G7 economy separately. In contrast to individual-country and individual-variable analyses, this paper focuses on common characteristics of the reduction and common explanations for the moderation across the G7 countries. In particular, we study three explanations: structural changes in the economy, changes in common international shocks, and changes in domestic shocks. We study these explanations within a unified model structure. To this end, we propose a Bayesian factor structural vector autoregressive model. Using the proposed model, we investigate whether we can find common explanations for all G7 economies when information is pooled from multiple domestic and international sources. Our empirical analysis suggests that the volatility reductions can largely be attributed to the decline in the magnitudes of the shocks in most G7 countries, while only for the U.K., the U.S. and Italy can they partially be attributed to structural changes in the economy. Analyzing the components of the volatility, we also find that domestic shocks, rather than common international shocks, account for a large part of the volatility reduction in most of the G7 countries. Finally, we find that after the mid-1980s the structure of the economy changed substantially in five of the G7 countries: Germany, Italy, Japan, the U.K. and the U.S.
Abstract:
Error-correcting codes and matroids have been widely used in the study of ordinary secret sharing schemes. In this paper, the connections between codes, matroids, and a special class of secret sharing schemes, namely, multiplicative linear secret sharing schemes (LSSSs), are studied. Such schemes are known to enable multiparty computation protocols secure against general (nonthreshold) adversaries. Two open problems related to the complexity of multiplicative LSSSs are considered in this paper. The first one deals with strongly multiplicative LSSSs. As opposed to the case of multiplicative LSSSs, it is not known whether there is an efficient method to transform an LSSS into a strongly multiplicative LSSS for the same access structure with a polynomial increase of the complexity. A property of strongly multiplicative LSSSs that could be useful in solving this problem is proved. Namely, using a suitable generalization of the well-known Berlekamp–Welch decoder, it is shown that all strongly multiplicative LSSSs enable efficient reconstruction of a shared secret in the presence of malicious faults. The second one is to characterize the access structures of ideal multiplicative LSSSs. Specifically, the considered open problem is to determine whether all self-dual vector space access structures are in this situation. By the aforementioned connection, this in fact constitutes an open problem about matroid theory, since it can be restated in terms of representability of identically self-dual matroids by self-dual codes. A new concept, the flat-partition, is introduced that provides a useful classification of identically self-dual matroids. Uniform identically self-dual matroids, which are known to be representable by self-dual codes, form one of the classes. It is proved that this property also holds for the family of matroids that, in a natural way, is the next class in the above classification: the identically self-dual bipartite matroids.
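The multiplicative property can be made concrete with Shamir's scheme, the canonical multiplicative LSSS (an illustrative sketch, not one of the paper's constructions): the pointwise product of two share vectors is itself a sharing, of doubled polynomial degree, of the product of the two secrets.

```python
import random

P = 2_147_483_647  # prime field modulus (2**31 - 1)

def share(secret, deg, n):
    # Shamir shares: evaluations of a random degree-`deg` polynomial
    # with constant term `secret` at the points x = 1..n.
    coeffs = [secret] + [random.randrange(P) for _ in range(deg)]
    return [(x, sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]

def reconstruct(shares):
    # Lagrange interpolation of the shared polynomial at x = 0.
    total = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

# Multiplicativity: multiplying shares coordinatewise yields a degree-2t
# sharing of the product of the secrets, recoverable from 2t+1 shares.
t, n = 1, 5
a = share(6, t, n)
b = share(7, t, n)
prod_shares = [(x, ya * yb % P) for (x, ya), (_, yb) in zip(a, b)]
```

The degree doubling is exactly why strong multiplicativity (robustness against faulty shares) is harder: the product polynomial needs more honest shares to interpolate.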
Abstract:
The Aitchison vector space structure for the simplex is generalized to a Hilbert space structure A2(P) for distributions and likelihoods on arbitrary spaces. Central notions of statistics, such as information or likelihood, can be identified within the algebraic structure of A2(P), together with their corresponding notions in compositional data analysis, such as the Aitchison distance or the centered log-ratio transform. In this way, very elaborate aspects of mathematical statistics can be understood easily in the light of a simple vector space structure and of compositional data analysis. For example, combinations of statistical information such as Bayesian updating, combination of likelihoods, and robust M-estimation functions are simple additions/perturbations in A2(Pprior). Weighting observations corresponds to a weighted addition of the corresponding evidence. Likelihood-based statistics for general exponential families turns out to have a particularly easy interpretation in terms of A2(P). Regular exponential families form finite-dimensional linear subspaces of A2(P), and they correspond to finite-dimensional subspaces formed by their posteriors in the dual information space A2(Pprior). The Aitchison norm can be identified with mean Fisher information. The closing constant itself is identified with a generalization of the cumulant function and shown to be the Kullback–Leibler directed information. Fisher information is the local geometry of the manifold induced by the A2(P) derivative of the Kullback–Leibler information, and the space A2(P) can therefore be seen as the tangential geometry of statistical inference at the distribution P. The discussion of A2(P)-valued random variables, such as estimation functions or likelihoods, gives a further interpretation of Fisher information as the expected squared norm of evidence and a scale-free understanding of unbiased reasoning.
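The compositional-data side of this correspondence is easy to sketch (an illustration of the clr transform and Aitchison distance on the simplex, not of A2(P) itself): the clr transform maps a composition to log-ratios against its geometric mean, and the Aitchison distance is the Euclidean distance between clr images.

```python
import math

def clr(x):
    # Centered log-ratio transform of a composition with positive parts:
    # subtract the log of the geometric mean from each log-part.
    g = sum(math.log(v) for v in x) / len(x)
    return [math.log(v) - g for v in x]

def aitchison_distance(x, y):
    # Aitchison distance = Euclidean distance between clr images.
    return math.dist(clr(x), clr(y))
```

Scale invariance falls out immediately: rescaling a composition leaves its clr image, and hence all Aitchison distances, unchanged.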
Abstract:
We estimate the response of stock prices to exogenous monetary policy shocks using a vector-autoregressive model with time-varying parameters. Our evidence points to protracted episodes in which, after a short-run decline, stock prices increase persistently in response to an exogenous tightening of monetary policy. That response is clearly at odds with the "conventional" view on the effects of monetary policy on bubbles, as well as with the predictions of bubbleless models. We also argue that it is unlikely that such evidence can be accounted for by an endogenous response of the equity premium to the monetary policy shocks.
Abstract:
The final-year project came to us as an opportunity to get involved in a topic that we found attractive while majoring in economics: statistics and its application to the analysis of economic data, i.e. econometrics. Moreover, the combination of econometrics and computer science is a very hot topic nowadays, given the Information Technologies boom of the last decades and the consequent exponential increase in the amount of data collected and stored day by day. Data analysts able to deal with Big Data and to extract useful results from it are in high demand these days and, to our understanding, the work they do, although sometimes controversial in terms of ethics, is a clear source of added value both for private corporations and for the public sector. For these reasons, the essence of this project is the study of a statistical instrument, directly related to computer science, that is valid for the analysis of large datasets: Partial Correlation Networks. The structure of the project has been determined by our objectives throughout its development. First, the characteristics of the studied instrument are explained, from the basic ideas up to the features of the model behind it, with the final goal of presenting the SPACE model as a tool for estimating interconnections between the elements of large data sets. Afterwards, an illustrated simulation is performed in order to show the power and efficiency of the model presented. Finally, the model is put into practice by analyzing a relatively large set of real-world data, with the objective of assessing whether the proposed statistical instrument is valid and useful when applied to a real multivariate time series.
In short, our main goals are to present the model and to evaluate whether Partial Correlation Network Analysis is an effective, useful instrument that allows finding valuable results in Big Data. The findings throughout this project suggest that the Partial Correlation Estimation by Joint Sparse Regression Models approach presented by Peng et al. (2009) works well under the assumption of sparsity of the data. Moreover, partial correlation networks are shown to be a very valid tool to represent cross-sectional interconnections between the elements of large data sets. The scope of this project is, however, limited, as there are some sections in which deeper analysis would have been appropriate. Considering intertemporal connections between elements, the choice of the tuning parameter lambda, and a deeper analysis of the results in the real-data application are examples of aspects in which this project could be extended. To sum up, the analyzed statistical tool has proved to be a very useful instrument for finding relationships that connect the elements present in a large data set. And, after all, partial correlation networks allow the owner of such a set to observe and analyze existing linkages that could otherwise have been overlooked.
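The core object of a partial correlation network can be sketched directly (an illustration of the general idea, not of the SPACE estimator of Peng et al., which fits sparse regressions jointly): given a precision (inverse covariance) matrix Ω, the partial correlation between variables i and j given all the others is −ω_ij/√(ω_ii ω_jj), and zero off-diagonal entries are exactly the missing edges of the network.

```python
def partial_correlations(omega):
    # Partial correlation of variables i and j given all the others,
    # read off the precision (inverse covariance) matrix omega:
    #   rho_ij = -omega_ij / sqrt(omega_ii * omega_jj)
    n = len(omega)
    return [[1.0 if i == j else
             -omega[i][j] / (omega[i][i] * omega[j][j]) ** 0.5
             for j in range(n)] for i in range(n)]

# Hypothetical sparse precision matrix for a chain graph 0 - 1 - 2:
# variables 0 and 2 are conditionally independent given variable 1.
omega = [[2.0, -0.8, 0.0],
         [-0.8, 2.0, -0.8],
         [0.0, -0.8, 2.0]]
rho = partial_correlations(omega)
edges = [(i, j) for i in range(3) for j in range(i + 1, 3)
        if abs(rho[i][j]) > 1e-9]
```

In high dimensions Ω must itself be estimated under a sparsity penalty (the role of the tuning parameter lambda mentioned above), which is what sparse joint-regression approaches provide.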
Abstract:
Available empirical evidence regarding the degree of symmetry between European economies in the context of Monetary Unification is not conclusive. This paper offers new empirical evidence concerning this issue related to the manufacturing sector. Instead of using a static approach as most empirical studies do, we analyse the dynamic evolution of shock symmetry using a state-space model. The results show a clear reduction of asymmetries in terms of demand shocks between 1975 and 1996, with an increase in terms of supply shocks at the end of the period.
Abstract:
This paper characterizes a mixed strategy Nash equilibrium in a one-dimensional Downsian model of two-candidate elections with a continuous policy space, where candidates are office motivated and one candidate enjoys a non-policy advantage over the other candidate. We assume that voters have quadratic preferences over policies and that their ideal points are drawn from a uniform distribution over the unit interval. In our equilibrium the advantaged candidate chooses the expected median voter with probability one and the disadvantaged candidate uses a mixed strategy that is symmetric around it. We show that this equilibrium exists if the number of voters is large enough relative to the size of the advantage.
Abstract:
The space subdivision in cells resulting from a process of random nucleation and growth is a subject of interest in many scientific fields. In this paper, we deduce the expected value and variance of the resulting cell-size distributions while assuming that the space subdivision process is in accordance with the premises of the Kolmogorov-Johnson-Mehl-Avrami model. We have not imposed restrictions on the time dependency of nucleation and growth rates. We have also developed an approximate analytical cell size probability density function. Finally, we have applied our approach to the distributions resulting from solid phase crystallization under isochronal heating conditions.
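The classical KJMA kinetics underlying this model can be sketched for the textbook case of constant nucleation and growth rates (the paper itself allows time-dependent rates): the transformed fraction follows the Avrami expression X(t) = 1 − exp(−(kt)^n), with the exponent n set by the growth dimensionality and nucleation mode.

```python
import math

def avrami_fraction(t, k, n):
    # KJMA (Avrami) transformed fraction for constant nucleation and
    # growth rates: X(t) = 1 - exp(-(k * t)**n)
    return 1.0 - math.exp(-((k * t) ** n))
```

X rises sigmoidally from 0 towards 1 as impinging grains fill the space; at kt = 1 the transformed fraction equals 1 − e⁻¹, regardless of n.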
Abstract:
This paper presents a dynamic choice model in the attribute space considering rational consumers that discount the future. In light of the evidence of several state-dependence patterns, the model is further extended by considering a utility function that allows for the different types of behavior described in the literature: pure inertia, pure variety seeking, and hybrid. The model presents a stationary consumption pattern that can be inertial, where the consumer only buys one product, or variety-seeking, where the consumer buys several products simultaneously. Under the inverted-U marginal utility assumption, the consumer behaves inertially among the existing brands for several periods and eventually, once the stationary levels are approached, turns to a variety-seeking behavior. An empirical analysis is run using a scanner database for fabric softener, and significant evidence of hybrid behavior for most attributes is found, which supports the functional form considered in the theory.
Abstract:
This paper presents and estimates a dynamic choice model in the attribute space considering rational consumers. In light of the evidence of several state-dependence patterns, the standard attribute-based model is extended by considering a general utility function where pure inertia and pure variety-seeking behaviors can be explained in the model as particular linear cases. The dynamics of the model are fully characterized by standard dynamic programming techniques. The model presents a stationary consumption pattern that can be inertial, where the consumer only buys one product, or a variety-seeking one, where the consumer shifts among varied products. We run some simulations to analyze the consumption paths out of the steady state. Under the hybrid utility assumption, the consumer behaves inertially among the unfamiliar brands for several periods, eventually switching to a variety-seeking behavior when the stationary levels are approached. An empirical analysis is run using scanner databases for three different product categories: fabric softener, saltine cracker, and catsup. Non-linear specifications provide the best fit of the data, as hybrid functional forms are found in all the product categories for most attributes and segments. These results reveal the statistical superiority of the non-linear structure and confirm the gradual trend to seek variety as the level of familiarity with the purchased items increases.