Digital Library

906 results for Handwritten text

Automatic ICD-10 classification of cancers from free-text death certificates

Relevance:

20.00%

Publisher:

Abstract:

Objective: Death certificates provide an invaluable source for cancer mortality statistics; however, this value can only be realised if accurate, quantitative data can be extracted from certificates, an aim hampered by both the volume and variable nature of certificates written in natural language. This paper proposes an automatic classification system for identifying cancer-related causes of death from death certificates. Methods: Detailed features, including terms, n-grams and SNOMED CT concepts, were extracted from a collection of 447,336 death certificates. These features were used to train Support Vector Machine classifiers (one classifier for each cancer type). The classifiers were deployed in a cascaded architecture: the first level identified the presence of cancer (i.e., binary cancer/no cancer) and the second level identified the type of cancer (according to the ICD-10 classification system). A held-out test set was used to evaluate the effectiveness of the classifiers according to precision, recall and F-measure. In addition, detailed feature analysis was performed to reveal the characteristics of a successful cancer classification model. Results: The system was highly effective at identifying cancer as the underlying cause of death (F-measure 0.94). The system was also effective at determining the type of cancer for common cancers (F-measure 0.7). Rare cancers, for which there was little training data, were difficult to classify accurately (F-measure 0.12). Factors influencing performance were the amount of training data and certain ambiguous cancers (e.g., those in the stomach region). The feature analysis revealed that a combination of features was important for cancer type classification, with SNOMED CT concept and oncology-specific morphology features proving the most valuable. Conclusion: The system proposed in this study provides automatic identification and characterisation of cancers from large collections of free-text death certificates. This allows organisations such as Cancer Registries to monitor and report on cancer mortality in a timely and accurate manner. In addition, the methods and findings are generally applicable beyond cancer classification and to other sources of medical text besides death certificates.
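
The two-level cascade and the evaluation measures described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' system: the keyword rules below are hypothetical stand-ins for the trained Support Vector Machine classifiers, the sample certificate texts are invented, and the "cancer-unspecified" fallback label is an assumption made for the example. C34 (lung) and C50 (breast) are ICD-10 codes used only as sample labels.

```python
from typing import Callable, List, Optional, Tuple

Classifier = Callable[[str], bool]


def cascade_predict(text: str,
                    is_cancer: Classifier,
                    type_classifiers: List[Tuple[str, Classifier]]) -> Optional[str]:
    """Level 1 decides cancer / no cancer; level 2 assigns a cancer type."""
    if not is_cancer(text):
        return None  # non-cancer certificates never reach level 2
    for code, classifier in type_classifiers:
        if classifier(text):
            return code
    return "cancer-unspecified"  # hypothetical fallback label


def precision_recall_f(gold, predicted, label):
    """Precision, recall and balanced F-measure for one label."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Toy keyword rules standing in for trained classifiers (illustration only).
def is_cancer(text: str) -> bool:
    return any(word in text.lower() for word in ("carcinoma", "cancer", "neoplasm"))


def mentions(word: str) -> Classifier:
    return lambda text: word in text.lower()


type_classifiers = [("C34", mentions("lung")), ("C50", mentions("breast"))]

# Invented example certificates with their gold labels.
texts = [
    "Metastatic carcinoma of the lung",
    "Acute myocardial infarction",
    "Breast cancer",
    "Lung cancer",
    "Carcinoma of bronchus",
]
gold = ["C34", None, "C50", "C34", "C34"]
predicted = [cascade_predict(t, is_cancer, type_classifiers) for t in texts]
precision, recall, f_measure = precision_recall_f(gold, predicted, "C34")
```

On this toy data the last certificate passes level 1 but matches no type rule, so recall for C34 falls below precision. The example shows only how the three measures are computed and does not model the paper's results.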

Handwritten dedication on verso of reproduction of painted portrait of Marcus Elias Marcus.

Relevance:

20.00%

Publisher:

Abstract:

See F 83565, Reproduction of painted portrait of Marcus Elias Marcus, age 76

Portrait of man (name handwritten on verso illegible).

Relevance:

20.00%

Publisher:

Abstract:

Digital Image

Dynamic Space Warping Of Strokes For Recognition Of Online Handwritten Characters

Relevance:

20.00%

Publisher:

Abstract:

This paper proposes a scheme for classifying online handwritten characters, based on dynamic space warping of the strokes within the characters. A method for segmenting components into strokes using velocity profiles is proposed. Each stroke is a simple arbitrary shape and is encoded using three attributes. Correspondence between strokes is established using Dynamic Space Warping. A distance measure that reliably differentiates between two corresponding simple shapes (strokes) has been formulated, yielding a perceptual distance measure between any two characters. Tests indicate an accuracy of over 85% on two different datasets of characters.
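
As a rough illustration of stroke correspondence by warping, the sketch below aligns two stroke sequences with the standard dynamic-programming recurrence used in dynamic time warping. It is a stand-in, not the paper's Dynamic Space Warping: the abstract does not specify the three stroke attributes or the stroke distance, so the 3-tuples and the Euclidean distance here are assumptions, and the templates are invented.

```python
import math
from typing import Dict, List, Sequence

Stroke = Sequence[float]  # three attributes per stroke (hypothetical encoding)


def stroke_distance(a: Stroke, b: Stroke) -> float:
    """Euclidean distance between two stroke encodings (an assumption)."""
    return math.dist(a, b)  # requires Python 3.8+


def warping_distance(xs: List[Stroke], ys: List[Stroke]) -> float:
    """Cost of the cheapest monotonic alignment between two stroke sequences."""
    n, m = len(xs), len(ys)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = stroke_distance(xs[i - 1], ys[j - 1])
            # Extend the cheapest of: skip in xs, skip in ys, or match both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify(sample: List[Stroke], templates: Dict[str, List[Stroke]]) -> str:
    """Nearest-template classification under the warping distance."""
    return min(templates, key=lambda label: warping_distance(sample, templates[label]))


# Invented templates: two characters, each written with two strokes.
templates = {
    "A": [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    "B": [(5.0, 5.0, 5.0), (9.0, 9.0, 9.0)],
}
sample = [(0.1, 0.0, 0.0), (1.0, 1.0, 0.9)]
label = classify(sample, templates)

# Warping absorbs a repeated stroke at no extra cost.
repeated = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
repeat_cost = warping_distance(templates["A"], repeated)
```

The warping step lets a character written with an extra or split stroke still align with its template, which is the property the abstract relies on when comparing characters stroke by stroke.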

Mediation of improvements in sun protective and skin self-examination behaviours: Results from the healthy text study

Relevance:

20.00%

Publisher:

Abstract:

Objective: Melanoma is on the rise, especially in Caucasian populations exposed to high ultraviolet radiation such as in Australia. This paper examined the psychological components facilitating change in skin cancer prevention or early detection behaviours following a text message intervention. Methods: The Queensland-based participants were 18 to 42 years old, from the Healthy Text study (N = 546). Overall, 512 (94%) participants completed the 12-month follow-up questionnaires. Following the social cognitive model, potential mediators of skin self-examination (SSE) and sun protection behaviour change were examined using stepwise logistic regression models. Results: At 12-month follow-up, odds of performing an SSE in the past 12 months were mediated by baseline confidence in finding time to check skin (an outcome expectation), with a change in odds ratio of 11.9% in the SSE group versus the control group when including the mediator. Odds of a greater-than-average sun protective habits index at 12-month follow-up were mediated by (a) an attempt to get a suntan at baseline (an outcome expectation) and (b) baseline sun protective habits index, with a change in odds ratio of 10.0% and 11.8%, respectively, in the SSE group versus the control group. Conclusions: Few of the suspected mediation pathways were confirmed, with the exception of outcome expectations and past behaviours. Future intervention programmes could use alternative theoretical models to elucidate how improvements in health behaviours can optimally be facilitated.

Democratic Party invitation (form letter)

Relevance:

20.00%

Publisher:

Abstract:

Form political letter. Printed on Executive Committee stationery with handwritten salutation in blue ink. Invitation to be a Vice President to ratify nominations of Greely and Brown, Democratic party.

Letter to Dr. Simeon Leo regarding a donation to Mount Sinai Hospital

Relevance:

20.00%

Publisher:

Abstract:

Letter to Dr. Simeon Leo from Joseph Scherer about a donation to Mount Sinai Hospital. Handwritten in English script. Folded.

Tree Matching Problems with Applications to Structured Text Databases

Relevance:

20.00%

Publisher:

Repetition-Based Text Indexes

Relevance:

20.00%

Publisher:

Indexing Heterogeneous XML for Full-Text Search

Relevance:

20.00%

Publisher:

Abstract:

XML documents are becoming more and more common in various environments. In particular, enterprise-scale document management is commonly centred around XML, and desktop applications as well as online document collections are soon to follow. The growing number of XML documents increases the importance of appropriate indexing methods and search tools in keeping the information accessible. Therefore, we focus on content that is stored in XML format as we develop such indexing methods. Because XML is used for different kinds of content, ranging all the way from records of data fields to narrative full-texts, the methods for Information Retrieval are facing a new challenge in identifying which content is subject to data queries and which should be indexed for full-text search. In response to this challenge, we analyse the relation of character content and XML tags in XML documents in order to separate the full-text from data. As a result, we are able to both reduce the size of the index by 5-6% and improve the retrieval precision as we select the XML fragments to be indexed. Besides being challenging, XML comes with many unexplored opportunities that have received little attention in the literature. For example, authors often tag the content they want to emphasise by using a typeface that stands out. The tagged content constitutes phrases that are descriptive of the content and useful for full-text search. They are simple to detect in XML documents, but also easy to confuse with other inline-level text. Nonetheless, the search results seem to improve when the detected phrases are given additional weight in the index. Similar improvements are reported when related content, including titles, captions, and references, is associated with the indexed full-text. Experimental results show that, at least for certain types of document collections, the proposed methods help us find the relevant answers. Even when we know nothing about the document structure but the XML syntax, we are able to take advantage of the XML structure when the content is indexed for full-text search.
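
One way to picture the separation of full-text from data by relating character content to XML tags is the heuristic sketched below. It is an illustration under stated assumptions, not the method of the work described above: the ratio (characters of text per element in a fragment) and the threshold of 10 are hypothetical choices, and the sample document is invented. Only the Python standard library's xml.etree.ElementTree is used.

```python
import xml.etree.ElementTree as ET


def text_length(elem: ET.Element) -> int:
    """Number of non-whitespace-padded characters of text inside a fragment."""
    return sum(len(chunk.strip()) for chunk in elem.itertext())


def element_count(elem: ET.Element) -> int:
    """Number of elements in the fragment, including the fragment root."""
    return sum(1 for _ in elem.iter())


def is_full_text(elem: ET.Element, min_chars_per_element: float = 10.0) -> bool:
    """Heuristic: narrative text carries many characters per tag, data few."""
    return text_length(elem) / element_count(elem) >= min_chars_per_element


def full_text_fragments(root: ET.Element, min_chars_per_element: float = 10.0):
    """Top-level children of root that look like narrative full-text."""
    return [child for child in root if is_full_text(child, min_chars_per_element)]


# Invented document mixing a data record with a narrative section.
doc = ET.fromstring(
    "<doc>"
    "<record><id>42</id><year>2008</year><lang>en</lang></record>"
    "<section><title>Indexing</title><p>XML is used for many kinds of content, "
    "from data records to <em>narrative full-text</em> documents.</p></section>"
    "</doc>"
)
selected = [fragment.tag for fragment in full_text_fragments(doc)]
```

Here the record averages two characters per element while the section averages more than twenty, so only the section would be passed on to a full-text index. Inline elements such as em are the kind of emphasised phrase the abstract suggests weighting more heavily.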

vSpeak: Edge detection based feature extraction for sign to text conversion

Relevance:

20.00%

Publisher:

Abstract:

This paper presents 'vSpeak', the first initiative taken in Pakistan for ICT-enabled conversion of dynamic Sign Urdu gestures into natural language sentences. To realize this, vSpeak adopts a novel approach to feature extraction using edge detection and image compression, which provides the input to the Artificial Neural Network that recognizes the gesture. This technique also handles blurred images. Training and testing are currently being performed on a dataset of 200 patterns of 20 words from Sign Urdu, with a target accuracy of 90% and above.

Line removal and restoration of handwritten strokes

Relevance:

20.00%

Publisher:

Abstract:

In document images, we often find printed lines overlapping with handwritten elements, especially in the case of signatures. Typical examples of such images are bank cheques and payment slips. Although the detection and removal of horizontal lines has been addressed, the restoration of the handwritten area after removal of the lines remains a problem of interest. In this paper, we propose a method for line removal and restoration of the erased areas of the handwritten elements. Subjective evaluation of the results has been conducted to analyze the effectiveness of the proposed method. The results are promising, with an accuracy of 86.33%. The entire process takes less than half a second for a document image on a 2.4 GHz Pentium IV PC with 512 MB of RAM.
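
The general idea of removing a printed horizontal line and restoring the handwriting it crossed can be sketched on a tiny binary image. This is a generic illustration, not the method proposed in the paper: detecting lines by row fill ratio, the 0.8 threshold, and restoring only pixels that have ink directly above and below the line are all simplifying assumptions.

```python
from typing import List, Tuple

Image = List[List[int]]  # 1 = ink, 0 = background


def find_line_rows(img: Image, min_fill: float = 0.8) -> List[int]:
    """Rows whose share of ink pixels suggests a printed horizontal line."""
    width = len(img[0])
    return [y for y, row in enumerate(img) if sum(row) >= min_fill * width]


def group_bands(rows: List[int]) -> List[Tuple[int, int]]:
    """Merge consecutive row indices into (top, bottom) bands for thick lines."""
    bands: List[List[int]] = []
    for y in rows:
        if bands and y == bands[-1][1] + 1:
            bands[-1][1] = y
        else:
            bands.append([y, y])
    return [(top, bottom) for top, bottom in bands]


def remove_lines(img: Image, min_fill: float = 0.8) -> Image:
    """Erase line bands, keeping columns where a stroke crosses the band."""
    out = [row[:] for row in img]
    height, width = len(img), len(img[0])
    for top, bottom in group_bands(find_line_rows(img, min_fill)):
        for x in range(width):
            above = img[top - 1][x] if top > 0 else 0
            below = img[bottom + 1][x] if bottom + 1 < height else 0
            keep = 1 if (above and below) else 0  # restore crossing strokes only
            for y in range(top, bottom + 1):
                out[y][x] = keep
    return out


# Invented example: a vertical pen stroke crossed by a printed line in row 2.
image = [
    [0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
]
restored = remove_lines(image)
```

In the example the printed line disappears while the vertical stroke stays continuous. Strokes that meet the line at a slant or end on it would need a more careful restoration step than this sketch provides.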

Overview of the 2015 ALTA shared task: Identifying French cognates in English text

Relevance:

20.00%

Publisher:

Abstract:

This paper presents an overview of the 6th ALTA shared task that ran in 2015. The task was to identify in English texts all the potential cognates from the perspective of the French language. In other words, identify all the words in the English text that would acceptably translate into a similar word in French. We present the motivations for the task, the description of the data and the results of the 4 participating teams. We discuss the results against a baseline and prior work.

Machine recognition of online handwritten Devanagari characters

Relevance:

20.00%

Publisher:

Abstract:

In this paper, we describe a system for the automatic recognition of isolated handwritten Devanagari characters obtained by linearizing consonant conjuncts. Owing to the large number of characters and the resulting demands on data acquisition, we use structural recognition techniques to reduce some characters to others. The residual characters are then classified using the subspace method. Finally, the results of structural recognition and feature-based matching are mapped to give the final output. The proposed system is evaluated for the writer-dependent scenario.

New Method for Delexicalization and its Application to Prosodic Tagging for Text-to-Speech Synthesis

Relevance:

20.00%

Publisher:

Abstract:

This paper describes a new flexible delexicalization method based on a glottal-excited parametric speech synthesis scheme. The system utilizes inverse-filtered glottal flow and all-pole modelling of the vocal tract. The method makes it possible to retain and manipulate all relevant prosodic features of any kind of speech. Most importantly, the features include voice quality, which has not been properly modeled in earlier delexicalization methods. The functionality of the new method was tested in a prosodic tagging experiment aimed at providing word prominence data for a text-to-speech synthesis system. The experiment confirmed the usefulness of the method and further corroborated earlier evidence that linguistic factors influence the perception of prosodic prominence.
