906 results for Text mining, Classificazione, Stemming, Text categorization
Abstract:
Affiliation: Disalvo, Santiago Aníbal. Universidad Nacional de La Plata. Facultad de Humanidades y Ciencias de la Educación; Argentina.
Abstract:
This paper reports findings on features of task-specific reformulation observed in university students in the middle stretch of the Psychology degree course (N=58) and in a reference group of students from the degree courses in Modern Languages, Spanish and Library Studies (N=33) at the National University of La Plata (Argentina). Three types of reformulation were modeled: summary, comprehensive and productive reformulation. The study was based on a corpus of 621 reformulations rendered from different kinds of text. The versions obtained were categorised according to the presence or absence of normative, morphosyntactic and semantic difficulties. Findings show that problems arise particularly with paraphrase and summary writing. Observation showed difficulties concerning punctuation, text cohesion and coherence, and semantic distortion or omission when extracting and/or substituting gist, together with limited lexical resources and confusion as to the suitability of style/register in writing. The findings in this study match those of earlier, more comprehensive research on the issue and report on problems experienced by a significant number of university students when interacting with both academic texts and others of a general nature. Moreover, they raise questions, on the one hand, as to the nature of such difficulties, which appear to be production-related problems and indirectly account for inadequate text comprehension, and on the other hand, as to the features of university tuition when it comes to text handling.
Abstract:
by J. A. Schlipf
Abstract:
Mobile phones are becoming increasingly popular and are already the primary technology for accessing information and communication. However, people with disabilities face many barriers when using this kind of technology. This paper presents an Accessible Contact Manager and a Real Time Text application, designed to be used by all users with disabilities. Both applications aim to improve the accessibility of mobile phones.
Abstract:
This paper proposes an architecture, based on statistical machine translation, for developing the text normalization module of a text-to-speech conversion system. The main target is to generate a language-independent text normalization module, based on data and flexible enough to deal with all situations presented in this task. The proposed architecture is composed of three main modules: a tokenizer module for splitting the text input into a token graph (tokenization), a phrase-based translation module (token translation) and a post-processing module for removing some tokens. This paper presents initial experiments for numbers and abbreviations. The very good results obtained validate the proposed architecture.
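The three-module flow described in this abstract (tokenization, token translation, post-processing) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `PHRASE_TABLE` dictionary is a hypothetical hand-written stand-in for a trained phrase-based translation model, the tokenizer yields a flat token list rather than a token graph, and the function names are invented.

```python
import re

# Hypothetical toy phrase table standing in for a trained phrase-based
# translation model; the entries are illustrative, not taken from the paper.
PHRASE_TABLE = {
    ("Dr", "."): ["doctor"],
    ("3",): ["three"],
    ("km",): ["kilometres"],
}
MAX_PHRASE_LEN = max(len(key) for key in PHRASE_TABLE)


def tokenize(text):
    """Split text into letter runs, digit runs and single punctuation marks.

    Simplification: returns a flat list, not the token graph of the abstract.
    """
    return re.findall(r"[A-Za-z]+|\d+|[^\sA-Za-z\d]", text)


def translate(tokens):
    """Greedy longest-match lookup of token phrases in the phrase table."""
    out, i = [], 0
    while i < len(tokens):
        for n in range(min(MAX_PHRASE_LEN, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in PHRASE_TABLE:
                out.extend(PHRASE_TABLE[key])
                i += n
                break
        else:
            # No phrase matched: copy the token through unchanged.
            out.append(tokens[i])
            i += 1
    return out


def postprocess(tokens):
    """Remove tokens that contain no letters or digits (leftover punctuation)."""
    return [t for t in tokens if any(c.isalnum() for c in t)]


def normalize(text):
    """Run the three stages in sequence and join the result into a string."""
    return " ".join(postprocess(translate(tokenize(text))))


print(normalize("Dr. Smith ran 3 km."))
```

A real system would learn the phrase table from parallel data of raw and normalized text and would score competing translations instead of matching greedily.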
Abstract:
This paper describes a low-complexity strategy for detecting and recognizing text signs automatically. Traditional approaches use large image algorithms for detecting the text sign, followed by the application of an Optical Character Recognition (OCR) algorithm in the previously identified areas. This paper proposes a new architecture that applies the OCR to a whole, lightly treated image and then carries out the text detection process on the OCR output. The strategy presented in this paper significantly reduces the processing time required for text localization in an image, while guaranteeing a high recognition rate. This strategy will facilitate the incorporation of video processing-based applications into the automatic detection of text signs on devices similar to a smartphone. These applications will increase the autonomy of visually impaired people in their daily life.
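The "OCR first, detect afterwards" idea can be sketched as a filter over the raw output of an OCR engine. The abstract does not state which detection criteria the authors apply, so the `OcrWord` record, the thresholds and the alphanumeric-ratio test below are assumptions chosen only to show the shape of the approach.

```python
from dataclasses import dataclass


@dataclass
class OcrWord:
    """One word as an OCR engine might report it (hypothetical record)."""
    text: str
    confidence: float  # assumed scale: 0 to 100
    box: tuple         # assumed layout: (left, top, width, height)


def detect_text(words, min_conf=60.0, min_len=2, min_alnum_ratio=0.8):
    """Keep OCR words that plausibly come from a real text sign.

    A word is kept when it is long enough, the engine is confident enough,
    and most of its characters are letters or digits. All thresholds are
    illustrative assumptions.
    """
    kept = []
    for word in words:
        text = word.text.strip()
        if len(text) < min_len or word.confidence < min_conf:
            continue
        alnum = sum(c.isalnum() for c in text)
        if alnum / len(text) >= min_alnum_ratio:
            kept.append(word)
    return kept


# Simulated OCR output for a whole image: two sign words plus background noise.
raw = [
    OcrWord("EXIT", 91.0, (40, 10, 120, 40)),
    OcrWord("|~_", 35.0, (5, 200, 30, 8)),
    OcrWord("#@!x", 88.0, (300, 50, 44, 12)),
    OcrWord("Gate", 75.0, (40, 60, 110, 38)),
]
print([w.text for w in detect_text(raw)])
```

Because the filter works on a short list of recognized words rather than on pixels, it is cheap compared with running a dedicated localization stage before the OCR.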
Abstract:
This paper describes the text normalization module of a fully trainable text-to-speech conversion system and its application to number transcription. The main target is to generate a language-independent text normalization module, based on data instead of on expert rules. This paper proposes a general architecture based on statistical machine translation techniques. This proposal is composed of three main modules: a tokenizer for splitting the text input into a token graph, a phrase-based translation module for token translation, and a post-processing module for removing some tokens. This architecture has been evaluated for number transcription in several languages: English, Spanish and Romanian. Number transcription is an important aspect of the text normalization problem.
Abstract:
Modern sensor technologies and simulators applied to large and complex dynamic systems (such as road traffic networks, sets of river channels, etc.) produce large amounts of behavior data that are difficult for users to interpret and analyze. Software tools that generate presentations combining text and graphics can help users understand this data. In this paper we describe the results of our research on automatic multimedia presentation generation (including text, graphics, maps, images, etc.) for interactive exploration of behavior datasets. We designed a novel user interface that combines automatically generated text and graphical resources. We describe the general knowledge-based design of our presentation generation tool. We also present applications that we developed to validate the method, and a comparison with related work.
Abstract:
The skills required for reading comprehension are compared and contrasted with those needed to produce correct, coherent and cohesive written texts in English. The related teaching activities are discussed. The aim of this article is to establish the relevance of teaching reading and writing skills to students at Madrid Polytechnic University, and to show the relationship and interdependence of these activities in EAP courses. The skills involved in reading and writing processes for academic purposes for L2 students are compared and commented on from a rhetorical point of view. Learning tasks based on text-type analysis are recommended as adequate activities to build schemata for writing and represent a synthesis of the teaching objectives proposed for reading and writing English courses.