2 results for Latent Semantic Indexing
in Open University Netherlands
Abstract:
Text cohesion is an important element of discourse processing. This paper presents a new approach to modeling, quantifying, and visualizing text cohesion using automated cohesion flow indices that capture semantic links among paragraphs. Cohesion flow is calculated by applying Cohesion Network Analysis, which combines semantic distances, Latent Semantic Analysis, and Latent Dirichlet Allocation with Social Network Analysis. Experiments performed on 315 timed essays indicated that cohesion flow indices are significantly correlated with human ratings of text coherence and essay quality. Visualizations of the global cohesion indices are also included to make it easier to understand how cohesion flow affects coherence in terms of semantic dependencies between paragraphs.
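To illustrate the general idea of linking paragraphs by semantic similarity, here is a minimal Python sketch. It is not the authors' Cohesion Network Analysis: it assumes plain term-frequency cosine similarity in place of the LSA and LDA models named in the abstract, and the function names (`term_vector`, `cosine`, `cohesion_graph`, `adjacent_cohesion`) and the `threshold` parameter are hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter


def term_vector(paragraph):
    """Lowercased bag-of-words counts for one paragraph."""
    return Counter(re.findall(r"[a-z']+", paragraph.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cohesion_graph(paragraphs, threshold=0.1):
    """Weighted links between paragraph pairs whose similarity exceeds the
    threshold (an assumed cut-off, not a value from the paper)."""
    vectors = [term_vector(p) for p in paragraphs]
    edges = {}
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            sim = cosine(vectors[i], vectors[j])
            if sim > threshold:
                edges[(i, j)] = sim
    return edges


def adjacent_cohesion(paragraphs):
    """Mean similarity between consecutive paragraphs: a crude stand-in
    for a single cohesion flow index."""
    vectors = [term_vector(p) for p in paragraphs]
    sims = [cosine(a, b) for a, b in zip(vectors, vectors[1:])]
    return sum(sims) / len(sims) if sims else 0.0
```

Calling `cohesion_graph` on a list of paragraph strings returns a dictionary keyed by paragraph index pairs. A network built this way could then be passed to standard graph measures, which is roughly where Social Network Analysis would enter. A fuller version would replace `cosine` over raw counts with similarity in an LSA or LDA topic space.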
Abstract:
This paper introduces a novel, in-depth approach to analyzing the differences in writing style between two famous Romanian orators, based on automated textual complexity indices for the Romanian language. The authors considered are: (a) Mihai Eminescu, Romania’s national poet and a remarkable journalist of his time, and (b) Ion C. Brătianu, one of the most important Romanian politicians of the mid-19th century. Both orators shared a journalistic interest: the desire to spread the word about political issues in Romania via the printing press, the most important public voice at that time. In addition, both authors exhibit particularities of writing style, and our aim is to explore these differences through our ReaderBench framework, which computes a wide range of lexical and semantic textual complexity indices for Romanian and other languages. The corpus contains two collections of speeches for each orator that cover the period 1857–1880. The results highlight the lexical and cohesion-based textual complexity indices that best reflect the differences in writing style, including measures relying on Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA) semantic models.