987 results for HTML (Language for Labelling Documents)


Relevance:

30.00%

Publisher:

Abstract:

This article suggests that long-term contact between Irish, Scots and English in the province of Ulster led to a hybridisation of accent which challenges traditional ethnolinguistic differentiations, namely the myth that Catholics and Protestants can be differentiated by their accent. The digitisation of archive recordings from the Tape Recorded Survey of Hiberno-English (TRSHE) permitted a detailed phonetic analysis of two speakers from Atticall, a rural townland in the Mourne Mountains with a unique geographical and linguistic setting, due to the close proximity of Ulster Scots and Irish speakers in the area. Phonological features associated with Irish, Northern English and Lowland Scots were gathered from previous dialectological research on Irish, English and Scots phonologies, which aided the interpretation of the data. Other contemporaneous recordings from the TRSHE allowed further comparison of phonological features with areas of Ulster in which linguistic interaction between Scots and Irish was expected to be less prevalent, such as Arranmore, Donegal (primarily Irish) and Glarryford, Antrim (primarily Scots). Accommodation theory and substrate/superstrate interaction illuminate patterns of phonological transfer in Mourne, Arranmore and Glarryford English, supporting the conclusion that accent in contemporary Northern Ireland is built upon a linguistic heritage of contact and exchange, rather than political or ethnolinguistic division.

Relevance:

30.00%

Publisher:

Abstract:

Background and aims: Machine learning techniques for the text mining of cancer-related clinical documents have not been sufficiently explored. Here some techniques are presented for the pre-processing of free-text breast cancer pathology reports, with the aim of facilitating the extraction of information relevant to cancer staging.

Materials and methods: The first technique was implemented using the freely available software RapidMiner to classify the reports according to their general layout: ‘semi-structured’ and ‘unstructured’. The second technique was developed using the open source language engineering framework GATE and aimed at the prediction of chunks of the report text containing information pertaining to the cancer morphology, the tumour size, its hormone receptor status and the number of positive nodes. The classifiers were trained and tested respectively on sets of 635 and 163 manually classified or annotated reports, from the Northern Ireland Cancer Registry.

Results: The best result, 99.4% accuracy (with only one semi-structured report predicted as unstructured), was produced by the layout classifier with the k-nearest neighbour algorithm, using the binary term occurrence word vector type with stopword filter and pruning. For chunk recognition, the best results were found using the PAUM algorithm with the same parameters for all cases, except for the prediction of chunks containing cancer morphology. For semi-structured reports, precision ranged from 0.97 to 0.94 and recall from 0.92 to 0.83, while for unstructured reports precision ranged from 0.91 to 0.64 and recall from 0.68 to 0.41. Poor results were found when the classifier was trained on semi-structured reports but tested on unstructured ones.

Conclusions: These results show that it is possible and beneficial to predict the layout of reports and that the accuracy of prediction of which segments of a report may contain certain information is sensitive to the report layout and the type of information sought.
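The layout classification step described above can be illustrated with a minimal sketch. It is not the authors' RapidMiner process: the stopword list, the Jaccard similarity, the value of k and the toy reports are all assumptions made for illustration. Each report is reduced to a binary term occurrence vector (here a set of tokens), a k-nearest neighbour vote predicts the layout, and precision and recall are computed in the usual way.

```python
from collections import Counter

# Assumed toy stopword list; the abstract does not say which filter was used.
STOPWORDS = {"the", "a", "of", "and", "is", "in", "to"}


def binary_terms(text):
    """Binary term occurrence: the set of non-stopword tokens present in text."""
    tokens = (t.strip(".,:;()").lower() for t in text.split())
    return {t for t in tokens if t and t not in STOPWORDS}


def jaccard(a, b):
    """Similarity between two binary term vectors represented as sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def knn_predict(train, text, k=3):
    """Majority vote among the k training reports most similar to text."""
    query = binary_terms(text)
    ranked = sorted(
        train,
        key=lambda item: jaccard(binary_terms(item[0]), query),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def precision_recall(gold, predicted, positive):
    """Precision and recall for one class, given parallel label lists."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Invented example reports, not data from the Northern Ireland Cancer Registry.
TRAIN = [
    ("Tumour size: 22 mm. Nodes positive: 2. ER status: positive.", "semi-structured"),
    ("Tumour size: 15 mm. Grade: 2. Nodes positive: 0.", "semi-structured"),
    ("Grade: 3. ER status: negative. Tumour size: 30 mm.", "semi-structured"),
    ("The specimen shows an invasive ductal carcinoma measuring about 22 mm with two involved nodes.", "unstructured"),
    ("Sections show a carcinoma that extends close to the margin and several nodes contain tumour.", "unstructured"),
    ("Microscopy shows a lobular carcinoma with no evidence of nodal spread.", "unstructured"),
]

if __name__ == "__main__":
    print(knn_predict(TRAIN, "Tumour size: 18 mm. Nodes positive: 1. ER status: negative."))
    print(knn_predict(TRAIN, "Sections show an invasive carcinoma with spread to several nodes."))
```

A set-based representation keeps the sketch dependency-free; a real pipeline would more likely use sparse vectors and the pruning step mentioned in the abstract.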

Relevance:

30.00%

Publisher:

Abstract:

The paper examines the relationship between football and language from a sociological point of view. This has often been couched in negative terms but the paper argues that such a view distorts the majority of ‘Football Talk’. The discourse surrounding football within everyday interactions is often positive and integrative. ‘Football Talk’ acts as a lingua franca amongst football supporters. This language code is therefore both inclusive and exclusive.

Relevance:

30.00%

Publisher:

Abstract:

Speech and language ability is not a unitary concept; rather, it is made up of multiple abilities such as grammar, articulation and vocabulary. Young children from socio-economically deprived areas are more likely to experience language difficulties than those living in more affluent areas. However, less is known about individual differences in language difficulties amongst young children from socio-economically deprived backgrounds. The present research examined 172 four-year-old children from socio-economically deprived areas on standardised measures of core language, receptive vocabulary, articulation, information conveyed and grammar. Of the total sample, 26% had difficulty in at least one area of language. While most children with speech and language difficulty had generally low performance in all areas, around one in 10 displayed more uneven language abilities. For example, some children had generally good speech and language ability, but had specific difficulty with grammar. In such cases their difficulty is masked somewhat by good overall performance on language tests, but they could still benefit from intervention in a specific area. The analysis also identified a number of typically achieving children who had borderline speech and language difficulty and should be closely monitored.

Relevance:

30.00%

Publisher:

Abstract:

Objective: To assess the quality of the labels for clinical trial samples against current regulations, and to analyze its potential correlation with the specific characteristics of each sample. Method: A transversal multicenter study in which the clinical trial samples from two third level hospitals were analyzed. The eleven items from Directive 2003/94/EC, as well as the name of the clinical trial and the dose on the label cover, were considered variables for labelling quality. The influence of the characteristics of each sample on labelling quality was also analyzed. Outcome: The study included 503 samples from 220 clinical trials. The mean quality of labelling, understood as the proportion of items from Appendix 13, was 91.9%. Out of these, 6.6% did not include the name of the sample on the outer face of the label, while in 9.7% the dose was missing. The samples with clinical trial-type samples presented a higher quality (p < 0.049), blinding reduced their quality (p = 0.017), and identification by kit number or by patient increased it (p < 0.01). The promoter was the variable which introduced the highest variability into the analysis. Conclusions: The mean quality of labelling is adequate in the majority of clinical trial samples. The lack of essential information in some samples, such as the clinical trial code and the period of validity, is alarming and might be a potential source of dispensing or administration errors.

Relevance:

30.00%

Publisher:

Abstract:

Two complementary de facto standards for the publication of electronic documents are HTML on the World Wide Web and Adobe's PDF (Portable Document Format) language for use with Acrobat viewers. Both these formats provide support for hypertext features to be embedded within documents. We present a method that allows links and other hypertext material to be kept in an abstract form in separate link databases. The links can then be interpreted or compiled at any stage and applied, in the correct format, to some specific representation such as HTML or PDF. This approach is of great value in keeping hyperlinks relevant, up-to-date and in a form which is independent of the finally delivered electronic document format. Four models are discussed for allowing publishers to insert links into documents at a late stage. The techniques discussed have been implemented using a combination of Acrobat plug-ins, Web servers and Web browsers.
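The idea of keeping links in an abstract form and compiling them late into a target format can be sketched as follows. This is an illustrative assumption, not the system described in the abstract: the Link record, its fields and both rendering functions are hypothetical, and the PDF side is represented only by a pdfmark-style annotation string rather than by an Acrobat plug-in.

```python
import html
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One format-independent entry in a hypothetical link database."""
    anchor_text: str                # text in the document that carries the link
    target: str                     # destination URI
    rect: tuple = (0, 0, 0, 0)      # page rectangle, only used for the PDF form


def to_html(link):
    """Compile an abstract link into an HTML anchor element."""
    return '<a href="{}">{}</a>'.format(
        html.escape(link.target, quote=True),
        html.escape(link.anchor_text),
    )


def to_pdfmark(link):
    """Compile the same link into a pdfmark-style URI link annotation."""
    x0, y0, x1, y1 = link.rect
    return (
        "[ /Rect [{} {} {} {}] "
        "/Action << /Subtype /URI /URI ({}) >> "
        "/Subtype /Link /ANN pdfmark"
    ).format(x0, y0, x1, y1, link.target)


def apply_links_html(text, links):
    """Insert each link at the first occurrence of its anchor text.

    Assumes text is plain and that anchor texts do not overlap.
    """
    for link in links:
        text = text.replace(link.anchor_text, to_html(link), 1)
    return text


if __name__ == "__main__":
    # Hypothetical database entry; the URL is a placeholder.
    database = [Link("PDF reference", "https://example.org/pdf?a=1&b=2", (70, 550, 210, 575))]
    print(apply_links_html("See the PDF reference.", database))
    print(to_pdfmark(database[0]))
```

Because the database entry carries no markup of its own, the same record can be re-rendered whenever the target changes or a new delivery format is added.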

Relevance:

30.00%

Publisher:

Abstract:

This paper draws a parallel between document preparation and the traditional processes of compilation and link editing for computer programs. A block-based document model is described which allows for separate compilation of various portions of a document. These portions are brought together and merged by a linker program, called dlink, whose pilot implementation is based on ditroff and on its underlying intermediate code. In the light of experiences with dlink, the requirements for a universal object-module language for documents are discussed. These requirements often resemble the characteristics of the intermediate codes used by programming-language compilers, but with interesting extra constraints which arise from the way documents are executed.
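The analogy with link editing can be made concrete with a small sketch. It does not reproduce dlink or the ditroff intermediate code: the Block record, the {ref:label} placeholder syntax and the numbering scheme are all hypothetical. Each separately prepared block exports the labels it defines and leaves its references unresolved; the linker builds a symbol table, resolves the references and merges the blocks in order.

```python
import re
from dataclasses import dataclass, field

# Hypothetical placeholder syntax for an unresolved cross-reference.
REF_PATTERN = re.compile(r"\{ref:([A-Za-z0-9_-]+)\}")


@dataclass
class Block:
    """A separately compiled portion of a document."""
    name: str
    text: str                                       # may contain {ref:label}
    defines: list = field(default_factory=list)     # labels exported by the block


def link(blocks):
    """Merge blocks in order and resolve cross-references between them."""
    # Pass 1: build the symbol table, mapping each label to its block number.
    symbols = {}
    for number, block in enumerate(blocks, start=1):
        for label in block.defines:
            if label in symbols:
                raise ValueError("duplicate label: " + label)
            symbols[label] = number

    # Pass 2: replace every placeholder with the number of the defining block.
    def resolve(match):
        label = match.group(1)
        if label not in symbols:
            raise KeyError("unresolved reference: " + label)
        return str(symbols[label])

    return "\n".join(REF_PATTERN.sub(resolve, block.text) for block in blocks)


if __name__ == "__main__":
    blocks = [
        Block("intro", "See section {ref:methods}.", ["intro"]),
        Block("methods", "Methods follow section {ref:intro}.", ["methods"]),
    ]
    print(link(blocks))
```

The two passes mirror a program linker: forward references such as the one from the first block to the second can only be resolved once every block's exports are known.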