936 resultados para Vector Space Model
Resumo:
In this article we present a hybrid approach for automatic summarization of Spanish medical texts. There are a lot of systems for automatic summarization using statistics or linguistics, but only a few of them combining both techniques. Our idea is that to reach a good summary we need to use linguistic aspects of texts, but as well we should benefit of the advantages of statistical techniques. We have integrated the Cortex (Vector Space Model) and Enertex (statistical physics) systems coupled with the Yate term extractor, and the Disicosum system (linguistics). We have compared these systems and afterwards we have integrated them in a hybrid approach. Finally, we have applied this hybrid system over a corpora of medical articles and we have evaluated their performances obtaining good results.
Resumo:
In this paper, we propose a text mining method called LRD (latent relation discovery), which extends the traditional vector space model of document representation in order to improve information retrieval (IR) on documents and document clustering. Our LRD method extracts terms and entities, such as person, organization, or project names, and discovers relationships between them by taking into account their co-occurrence in textual corpora. Given a target entity, LRD discovers other entities closely related to the target effectively and efficiently. With respect to such relatedness, a measure of relation strength between entities is defined. LRD uses relation strength to enhance the vector space model, and uses the enhanced vector space model for query based IR on documents and clustering documents in order to discover complex relationships among terms and entities. Our experiments on a standard dataset for query based IR shows that our LRD method performed significantly better than traditional vector space model and other five standard statistical methods for vector expansion.
Resumo:
This paper introduces a State Space approach to explain the dynamics of rent growth, expected returns and Price-Rent ratio in housing markets. According to the present value model, movements in price to rent ratio should be matched by movements in expected returns and expected rent growth. The state space framework assume that both variables follow an autoregressive process of order one. The model is applied to the US and UK housing market, which yields series of the latent variables given the behaviour of the Price-Rent ratio. Resampling techniques and bootstrapped likelihood ratios show that expected returns tend to be highly persistent compared to rent growth. The Öltered expected returns is considered in a simple predictability of excess returns model with high statistical predictability evidenced for the UK. Overall, it is found that the present value model tends to have strong statistical predictability in the UK housing markets.
Resumo:
This paper introduces a State Space approach to explain the dynamics of rent growth, expected returns and Price-Rent ratio in housing markets. According to the present value model, movements in price to rent ratio should be matched by movements in expected returns and expected rent growth. The state space framework assume that both variables follow an autoregression process of order one. The model is applied to the US and UK housing market, which yields series of the latent variables given the behaviour of the Price-Rent ratio. Resampling techniques and bootstrapped likelihood ratios show that expected returns tend to be highly persistent compared to rent growth. The filtered expected returns is considered in a simple predictability of excess returns model with high statistical predictability evidence for the UK. Overall, it is found that the present value model tends to have strong statistical predictability in the UK housing markets.
Resumo:
Viruses rapidly evolve, and HIV in particular is known to be one of the fastest evolving human viruses. It is now commonly accepted that viral evolution is the cause of the intriguing dynamics exhibited during HIV infections and the ultimate success of the virus in its struggle with the immune system. To study viral evolution, we use a simple mathematical model of the within-host dynamics of HIV which incorporates random mutations. In this model, we assume a continuous distribution of viral strains in a one-dimensional phenotype space where random mutations are modelled by di ffusion. Numerical simulations show that random mutations combined with competition result in evolution towards higher Darwinian fitness: a stable traveling wave of evolution, moving towards higher levels of fi tness, is formed in the phenoty space.
Resumo:
We model the large scale fading of wireless THz communications links deployed in a metropolitan area taking into account reception through direct line of sight, ground or wall reflection and diffraction. The movement of the receiver in the three dimensions is modelled by an autonomous dynamic linear system in state-space whereas the geometric relations involved in the attenuation and multi-path propagation of the electric field are described by a static non-linear mapping. A subspace algorithm in conjunction with polynomial regression is used to identify a Wiener model from time-domain measurements of the field intensity.
Resumo:
Recently, two international standard organizations, ISO and OGC, have done the work of standardization for GIS. Current standardization work for providing interoperability among GIS DB focuses on the design of open interfaces. But, this work has not considered procedures and methods for designing river geospatial data. Eventually, river geospatial data has its own model. When we share the data by open interface among heterogeneous GIS DB, differences between models result in the loss of information. In this study a plan was suggested both to respond to these changes in the information envirnment and to provide a future Smart River-based river information service by understanding the current state of river geospatial data model, improving, redesigning the database. Therefore, primary and foreign key, which can distinguish attribute information and entity linkages, were redefined to increase the usability. Database construction of attribute information and entity relationship diagram have been newly redefined to redesign linkages among tables from the perspective of a river standard database. In addition, this study was undertaken to expand the current supplier-oriented operating system to a demand-oriented operating system by establishing an efficient management of river-related information and a utilization system, capable of adapting to the changes of a river management paradigm.
Resumo:
Abstract Background To understand the molecular mechanisms underlying important biological processes, a detailed description of the gene products networks involved is required. In order to define and understand such molecular networks, some statistical methods are proposed in the literature to estimate gene regulatory networks from time-series microarray data. However, several problems still need to be overcome. Firstly, information flow need to be inferred, in addition to the correlation between genes. Secondly, we usually try to identify large networks from a large number of genes (parameters) originating from a smaller number of microarray experiments (samples). Due to this situation, which is rather frequent in Bioinformatics, it is difficult to perform statistical tests using methods that model large gene-gene networks. In addition, most of the models are based on dimension reduction using clustering techniques, therefore, the resulting network is not a gene-gene network but a module-module network. Here, we present the Sparse Vector Autoregressive model as a solution to these problems. Results We have applied the Sparse Vector Autoregressive model to estimate gene regulatory networks based on gene expression profiles obtained from time-series microarray experiments. Through extensive simulations, by applying the SVAR method to artificial regulatory networks, we show that SVAR can infer true positive edges even under conditions in which the number of samples is smaller than the number of genes. Moreover, it is possible to control for false positives, a significant advantage when compared to other methods described in the literature, which are based on ranks or score functions. By applying SVAR to actual HeLa cell cycle gene expression data, we were able to identify well known transcription factor targets. Conclusion The proposed SVAR method is able to model gene regulatory networks in frequent situations in which the number of samples is lower than the number of genes, making it possible to naturally infer partial Granger causalities without any a priori information. In addition, we present a statistical test to control the false discovery rate, which was not previously possible using other gene regulatory network models.