954 results for 280205 Text Processing


Relevance:

80.00%

Publisher:

Abstract:

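The scripts of some writing systems (such as Mongolian, Uyghur, and Tibetan) have more complex characteristics than the Latin script. When a computer processes such scripts, traditional font technology (such as TrueType) makes it almost impossible to display the standard written forms while also supporting the Unicode standard encoding. This paper introduces a processing model based on OpenType fonts to address this problem. Practice has shown that this is a feasible approach.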

Relevance:

80.00%

Publisher:

Abstract:

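Interaction between computers and different users usually has to be carried out through the input and output of information in multiple scripts, so the degree to which an operating system supports multiple scripts is one measure of its functionality. The great differences among the characteristics of the various scripts make text processing in modern operating systems very complex to implement. This paper summarizes the scope and content of operating-system text processing, including text input and storage, text processing, and user-interaction handling; outlines general text-processing models and the possible technical approaches together with their advantages and disadvantages; analyzes the text-processing implementations of commonly used operating systems; and finally looks ahead to the challenges that text processing still faces.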

Relevance:

80.00%

Publisher:

Abstract:

R. Jensen and Q. Shen. Fuzzy-Rough Sets Assisted Attribute Selection. IEEE Transactions on Fuzzy Systems, vol. 15, no. 1, pp. 73-89, 2007.

Relevance:

80.00%

Publisher:

Abstract:

R. Jensen and Q. Shen. Semantics-Preserving Dimensionality Reduction: Rough and Fuzzy-Rough Based Approaches. IEEE Transactions on Knowledge and Data Engineering, 16(12): 1457-1471. 2004.

Relevance:

80.00%

Publisher:

Abstract:

R. Jensen and Q. Shen, 'Fuzzy-Rough Data Reduction with Ant Colony Optimization,' Fuzzy Sets and Systems, vol. 149, no. 1, pp. 5-20, 2005.

Relevance:

80.00%

Publisher:

Abstract:

R. Jensen, Q. Shen and A. Tuson, 'Finding Rough Set Reducts with SAT,' Proceedings of the 10th International Conference on Rough Sets, Fuzzy Sets, Data Mining and Granular Computing, LNAI 3641, pp. 194-203, 2005.

Relevance:

80.00%

Publisher:

Abstract:

This article takes as its main point of departure a body of empirical research on reading and text processing, and makes particular reference to the type of experiments conducted in Egidi and Gerrig (2006) and Rapp and Gerrig (2006). Broadly put, these experiments (i) explore the psychology of readers’ preferences for narrative outcomes, (ii) examine the way readers react to characters’ goals and actions, and (iii) investigate how readers tend to identify with characters’ goals the more ‘urgently’ those goals are narrated. The present article signals how stylistics can productively enrich such experimental work. Stylistics, it is argued, is well equipped to deal with subtle and nuanced variations in textual patterns without losing sight of the broader cognitive and discoursal positioning of readers in relation to these patterns. Making particular reference to what might constitute narrative ‘urgency’, the article develops a model which amalgamates different strands of contemporary research in narrative stylistics. This model advances and elaborates three key components: a Stylistic Profile, a Burlesque Block and a Kuleshov Monitor. Developing analyses of, and informal informant tests on, examples of both fiction and film, the article calls for a more rounded and sophisticated understanding of style in empirical research on subjects’ responses to patterns in narrative.

Relevance:

80.00%

Publisher:

Abstract:

Building on the study by Luegi (2006), which analyzed the eye movements of 20 Portuguese university students as they read text passages, this article discusses some methodological issues concerning eye-tracking measures used to evaluate reading difficulties. Relating syntactic complexity, grammaticality and ambiguity to eye movements, we discuss the use of several dependent variables that indicate immediate and delayed processes in text processing. We propose a new measure, which we call Progression-Path, that makes it possible to analyze, in the critical region, what happens when the reader proceeds through the sentence instead of going back to resolve a problem that s/he has found (regression being the most commonly expected behavior but not the only one, as some of our examples illustrate).

Relevance:

80.00%

Publisher:

Abstract:

Objectives: The current study examined younger and older adults’ error detection accuracy, prediction calibration, and postdiction calibration on a proofreading task, to determine whether age-related differences would be present in this type of common error detection task. Method: Participants were given text passages and were first asked to predict the percentage of errors they would detect in the passage. They then read the passage and circled errors (which varied in complexity and locality), and made postdictions regarding their performance, before repeating this with another passage and completing a comprehension test on both passages. Results: There were no age-related differences in error detection accuracy, text comprehension, or metacognitive calibration, though participants in both age groups were overconfident overall in their metacognitive judgments. Both groups gave similar ratings of motivation to complete the task. The older adults rated the passages as more interesting than younger adults did, although this level of interest did not appear to influence error-detection performance. Discussion: The age equivalence in both proofreading ability and calibration suggests that the ability to proofread text passages and the associated metacognitive monitoring used in judging one’s own performance are maintained in aging. These age-related similarities persisted when younger adults completed the proofreading tasks on a computer screen, rather than with paper and pencil. The findings provide novel insights regarding the influence that cognitive aging may have on metacognitive accuracy and text processing in an everyday task.

Relevance:

80.00%

Publisher:

Abstract:

The problem of projecting multidimensional data into lower dimensions has been pursued by many researchers due to its potential application to data analyses of various kinds. This paper presents a novel multidimensional projection technique based on least square approximations. The approximations compute the coordinates of a set of projected points based on the coordinates of a reduced number of control points with defined geometry. We name the technique Least Square Projections (LSP). From an initial projection of the control points, LSP defines the positioning of their neighboring points through a numerical solution that aims at preserving a similarity relationship between the points given by a metric in mD. To perform the projection, only a small number of distance calculations is necessary, and no repositioning of the points is required to obtain a final solution with satisfactory precision. The results show the capability of the technique to form groups of points by degree of similarity in 2D. We illustrate that capability through its application to mapping collections of textual documents from varied sources, a strategic yet difficult application. LSP is faster and more accurate than other existing high-quality methods, particularly in the setting where it was mostly tested, that is, mapping text sets.
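
The abstract above describes placing points from a few control points by solving a least-squares problem. A minimal Python sketch of that general idea follows; it is not the authors' implementation. The neighbour-averaging rows, the equal neighbour weights, the normal-equations solver, and the function names (`nearest`, `solve`, `project`) are all assumptions made for illustration, and the control-point positions are supplied by hand rather than obtained from an initial projection.

```python
def nearest(points, i, k):
    """Indices of the k points closest to points[i] (squared Euclidean distance)."""
    def d2(j):
        return sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
    others = [j for j in range(len(points)) if j != i]
    return sorted(others, key=d2)[:k]


def solve(M, rhs):
    """Solve the square system M x = rhs by Gaussian elimination with pivoting."""
    n = len(M)
    aug = [row[:] + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def project(points, controls, k=2):
    """Project points to 2D; controls maps a point index to its fixed (x, y)."""
    n = len(points)
    A = []
    # One row per point: the point should sit at the average of its neighbours.
    for i in range(n):
        row = [0.0] * n
        row[i] = 1.0
        for j in nearest(points, i, k):
            row[j] = -1.0 / k
        A.append(row)
    # One extra row per control point, tying it to its given 2D position.
    targets = []
    for idx, pos in controls.items():
        row = [0.0] * n
        row[idx] = 1.0
        A.append(row)
        targets.append(pos)
    # Least squares via the normal equations (A^T A) x = A^T b, per coordinate.
    rows = range(len(A))
    AtA = [[sum(A[r][i] * A[r][j] for r in rows) for j in range(n)]
           for i in range(n)]
    coords = []
    for d in range(2):
        b = [0.0] * n + [t[d] for t in targets]
        Atb = [sum(A[r][i] * b[r] for r in rows) for i in range(n)]
        coords.append(solve(AtA, Atb))
    return list(zip(coords[0], coords[1]))


# Hypothetical data: five points on a line in 3D, with both ends as controls.
points = [[0, 0, 0], [1, 0, 0], [2, 0, 0], [3, 0, 0], [4, 0, 0]]
layout = project(points, {0: (0.0, 0.0), 4: (4.0, 0.0)}, k=2)
```

The dense normal-equations solve keeps the sketch dependency-free; for collections of realistic size a sparse least-squares solver would be the more sensible choice.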

Relevance:

80.00%

Publisher:

Abstract:

This text aims to elucidate the textual processing strategies that give meaning to the chronicle "The Miracle of the Leaves" by Clarice Lispector (1920-1977). It starts from the basic postulate that meaning does not reside in the verbal or imagistic fabric of the text but is constructed from the elements of language that the reader encounters there (BRONCKART, 2009, p. 257). This reading strategy is justified because Lispector's chronicle reveals multiple meanings and a dialogic use of language. Text processing is discussed with a potentially young audience in mind, an element that brings out characteristics relevant to understanding how this type of text functions for younger readers. To achieve this goal, we start from the assumption that the chronicle, being a literary text endowed with aesthetic validity, not only conveys content but recreates it, adding new meanings to it.

Relevance:

80.00%

Publisher:

Abstract:

The exponential increase in subjective, user-generated content since the birth of the Social Web has made it necessary to develop automatic text processing systems able to extract, process and present relevant knowledge. In this paper, we tackle the Opinion Retrieval, Mining and Summarization task by proposing a unified framework composed of three crucial components (information retrieval, opinion mining and text summarization) that allow the retrieval, classification and summarization of subjective information. An extensive analysis is conducted, in which different configurations of the framework are suggested and analyzed in order to determine which is best, and under which conditions. The evaluation carried out and the results obtained show the appropriateness of the individual components, as well as of the framework as a whole. Having achieved an improvement of over 10% compared to state-of-the-art approaches in the context of blogs, we conclude that subjective text can be dealt with efficiently by means of our proposed framework.

Relevance:

80.00%

Publisher:

Abstract:

The field of natural language processing (NLP) has grown considerably in recent years; its research areas include information retrieval and extraction, data mining, machine translation, question answering systems, automatic summarization, and sentiment analysis, among others. This article presents concepts and some tools intended to contribute to the understanding of text processing with NLP techniques, with the aim of extracting relevant information that can be used in a wide range of applications. Automatic classifiers can be developed to categorize documents and recommend tags; these classifiers should be platform-independent, easily customizable so that they can be integrated into different projects, and able to learn from examples. The article introduces these classification algorithms, analyzes some currently available open-source tools for carrying out these tasks, and compares several implementations using the F-measure to evaluate the classifiers.
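
The abstract above names the F-measure as its evaluation metric but does not say which variant was used. As a rough illustration, the sketch below computes the standard F-beta score for one class from precision and recall; the function name `f_measure`, the default `beta=1.0` and the sample labels are assumptions, not details taken from the article.

```python
def f_measure(y_true, y_pred, positive, beta=1.0):
    """F-beta score for the class `positive`, given true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Precision and recall default to 0.0 when their denominators are empty.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical document labels: what a classifier should and did assign.
expected = ["sports", "sports", "politics", "politics"]
predicted = ["sports", "politics", "politics", "politics"]
score = f_measure(expected, predicted, positive="sports")
```

With `beta=1.0` precision and recall are weighted equally; a larger `beta` gives recall more weight.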

Relevance:

80.00%

Publisher:

Abstract:

In this paper we present a complete system for the treatment of both the geographical and temporal dimensions in text and its application to information retrieval. The system has been evaluated in the GeoTime task of both the 8th and 9th NTCIR workshops, in 2010 and 2011 respectively, making it possible to compare it with contemporary approaches to the topic. In order to participate in this task we added the temporal dimension to our GIR system. The proposed system has a modular architecture so that features can be added or modified. In developing it, we followed a QA-based approach and used multiple search engines to improve system performance.

Relevance:

80.00%

Publisher:

Abstract:

This paper describes the methodology followed to automatically generate titles for a corpus of questions that belong to sociological opinion polls. Titles for questions have a twofold function: (1) they are the input of user searches and (2) they inform about the whole contents of the question and its possible answer options. Thus, the generation of titles can be considered a case of automatic summarization. However, the fact that summarization had to be performed over very short texts, together with the aforementioned quality conditions imposed on newly generated titles, led the authors to follow knowledge-rich and domain-dependent strategies for summarization rather than the more common extractive techniques.