960 results for 280205 Text Processing


Relevance:

100.00%

Abstract:

The activities of the Institute of Information Technologies in the area of automatic text processing are outlined. Major problems related to different steps of processing are pointed out together with the shortcomings of the existing solutions.

Relevance:

100.00%

Abstract:

The principal feature of the ontology, which was developed for text processing, is a wider representation of knowledge about the external world, achieved by introducing a three-level hierarchy. This improves the semantic interpretation of natural language texts.

Relevance:

100.00%

Abstract:

An extensive literature review failed to uncover an adequate operational definition of dyslexia applicable to education. The predominant fields of research that have produced most of the studies on dyslexia are neurology, neurolinguistics and genetics. Their perspectives were shown to be more pertinent to medical experts than to teachers. The categorization into surface and deep dyslexia was shown to be the best description of dyslexia in an educational context, where dyslexia is defined as a reading disorder. The purpose of the present thesis was to develop a theoretical conceptual framework that describes a link between dyslexia, a text-processing model and problem solving, and in which the reading difficulties of dyslexic children are attributed to a problem-solving difficulty in information processing. This conceptual framework was validated by three experts, each specializing in a specific field (cognitive psychology, dyslexia or teaching). The concept of problem solving was based on information-processing theories in cognitive psychology. This framework applies specifically to reading difficulties manifested by dyslexic children.

Relevance:

100.00%

Abstract:

In this paper we explore the use of text-mining methods for identifying the author of a text. We apply the support vector machine (SVM) to this problem: because it can cope with half a million inputs, it requires no feature selection and can process the frequency vector of all words of a text. We performed a number of experiments with texts from a German newspaper. With nearly perfect reliability the SVM was able to reject other authors, and it detected the target author in 60–80% of the cases. In a second experiment, we ignored nouns, verbs and adjectives and replaced them with grammatical tags and bigrams. This resulted in slightly reduced performance. Author detection with SVMs on full word forms was remarkably robust even if the author wrote about different topics.
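The idea of feeding the frequency vector of all words to a linear SVM can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function names, the whitespace tokenizer, and the Pegasos-style stochastic subgradient training loop (with no bias term) are assumptions made here to keep the example self-contained in standard-library Python.

```python
import random
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization stands in for real preprocessing.
    return text.lower().split()


def build_vocab(texts):
    # Every distinct word gets an index; no feature selection is applied.
    vocab = {}
    for text in texts:
        for word in tokenize(text):
            vocab.setdefault(word, len(vocab))
    return vocab


def frequency_vector(text, vocab):
    # Sparse relative-frequency vector over all known words of the text.
    counts = Counter(w for w in tokenize(text) if w in vocab)
    total = sum(counts.values()) or 1
    return {vocab[w]: c / total for w, c in counts.items()}


def decision_value(weights, vector):
    # Positive values vote for the target author, negative values reject.
    return sum(weights[j] * v for j, v in vector.items())


def train_linear_svm(vectors, labels, dim, lam=0.01, epochs=50, seed=0):
    # Pegasos-style subgradient descent on the L2-regularized hinge loss.
    # Labels are +1 (target author) or -1 (any other author).
    rng = random.Random(seed)
    weights = [0.0] * dim
    order = list(range(len(vectors)))
    step = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            step += 1
            eta = 1.0 / (lam * step)
            margin = labels[i] * decision_value(weights, vectors[i])
            shrink = 1.0 - 1.0 / step  # equals 1 - eta * lam
            weights = [w * shrink for w in weights]
            if margin < 1.0:
                # The sample violates the margin: move the weights toward it.
                for j, v in vectors[i].items():
                    weights[j] += eta * labels[i] * v
    return weights
```

To use the sketch, build the vocabulary from the training texts, convert each text with `frequency_vector`, train with labels of +1 for the target author and -1 for all others, and treat a positive `decision_value` on a new text as a detection of the target author.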

Relevance:

100.00%

Abstract:

Machine learning techniques for prediction and for rule extraction from artificial neural networks are used. The hypothesis that market sentiment and IPO-specific attributes are equally responsible for first-day IPO returns in the US stock market is tested. The machine learning methods used are Bayesian classification, support vector machines, decision tree techniques, rule learners and artificial neural networks. The outcomes of the research are predictions and rules associated with first-day returns of technology IPOs. The hypothesis that first-day returns of technology IPOs are equally determined by IPO-specific and market sentiment attributes is rejected. Instead, lower-yielding IPOs are determined by both IPO-specific and market sentiment attributes, while higher-yielding IPOs are largely dependent on IPO-specific attributes.

Relevance:

90.00%

Abstract:

Experience has shown that developing business applications based on text analysis normally requires considerable time and expertise in computational linguistics. Several approaches to integrating text analysis systems with business applications have been proposed, but so far there has been no coordinated approach that would enable building scalable and flexible text analysis applications in enterprise scenarios. In this paper, a service-oriented architecture for text processing applications in the business domain is introduced. It comprises various groups of processing components and knowledge resources. The architecture, created as a result of our experience with building natural language processing applications in business scenarios, allows for the reuse of text analysis and other components, and facilitates the development of business applications. We verify our approach by showing how the proposed architecture can be applied to create a text-analytics-enabled business application that addresses a concrete business scenario. © 2010 IEEE.
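The abstract does not describe the architecture in detail, so the following is only a hypothetical sketch of the general idea of reusable processing components combined with a knowledge resource. The names `ComponentRegistry`, `tokenizer` and `make_lexicon_tagger`, and the dictionary-based document format, are assumptions introduced for illustration.

```python
from typing import Any, Callable, Dict, List, Set

# Assumption: a document is a plain dictionary that components enrich step by step.
Document = Dict[str, Any]
Component = Callable[[Document], Document]


class ComponentRegistry:
    """Holds named processing components so applications can reuse them."""

    def __init__(self) -> None:
        self._components: Dict[str, Component] = {}

    def register(self, name: str, component: Component) -> None:
        self._components[name] = component

    def build_pipeline(self, names: List[str]) -> Component:
        # Look up the components once, then apply them in the given order.
        steps = [self._components[name] for name in names]

        def run(doc: Document) -> Document:
            for step in steps:
                doc = step(doc)
            return doc

        return run


def tokenizer(doc: Document) -> Document:
    # Processing component: adds lowercase whitespace tokens.
    return {**doc, "tokens": str(doc["text"]).lower().split()}


def make_lexicon_tagger(lexicon: Set[str]) -> Component:
    # The lexicon plays the role of a knowledge resource shared across applications.
    def tag(doc: Document) -> Document:
        return {**doc, "entities": [t for t in doc["tokens"] if t in lexicon]}

    return tag


registry = ComponentRegistry()
registry.register("tokenize", tokenizer)
registry.register("tag_companies", make_lexicon_tagger({"acme"}))

pipeline = registry.build_pipeline(["tokenize", "tag_companies"])
result = pipeline({"text": "Invoice from Acme overdue"})
print(result["entities"])
```

In this sketch a business application only names the components it needs; swapping the lexicon or adding a component does not require changes to the others.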

Relevance:

90.00%

Abstract:

Clinical text understanding (CTU) is of interest to health informatics because critical clinical information, frequently represented as unconstrained text in electronic health records, is used extensively by human experts to guide clinical practice and decision making and to document delivery of care, but is largely unusable by information systems for queries and computations. Recent initiatives advocating translational research call for technologies that can integrate structured clinical data with unstructured data, provide a unified interface to all data, and contextualize clinical information for reuse in the multidisciplinary and collaborative environment envisioned by the CTSA program. This implies that technologies for the processing and interpretation of clinical text should be evaluated not only in terms of their validity and reliability in their intended environment, but also in light of their interoperability and their ability to support information integration and contextualization in a distributed and dynamic environment. This vision adds a new layer of information representation requirements that needs to be accounted for when conceptualizing the implementation or acquisition of clinical text processing tools and technologies for multidisciplinary research. On the other hand, electronic health records frequently contain unconstrained clinical text with high variability in the use of terms and documentation practices, and without commitment to the grammatical or syntactic structure of the language (e.g., triage notes, physician and nurse notes, chief complaints). This hinders the performance of natural language processing technologies, which typically rely heavily on the syntax of the language and the grammatical structure of the text.
This document introduces our method for transforming unconstrained clinical text found in electronic health information systems into a formal (computationally understandable) representation that is suitable for querying, integration, contextualization and reuse, and is resilient to the grammatical and syntactic irregularities of clinical text. We present our design rationale, method, and evaluation results from processing chief complaints and triage notes from 8 different emergency departments in Houston, Texas. Finally, we discuss the significance of our contribution in enabling the use of clinical text in a practical bio-surveillance setting.