4 results for Text-Encoding of Medieval Manuscripts


Abstract:

In Harley 2782, Servius’s late antique commentary on Vergil was transmitted as an independent text, edited, corrected, glossed, marked for mythological information, provided with NOTA monograms and headings, as well as interspersed and augmented with scholia adespota and non-Servian material. The scholarly conventions attested in this manuscript show the kinds of critical apparatus that fed into the early medieval appropriation of Vergil and above all demonstrate that Servius was a staple of the Carolingian world.

Abstract:

Background and aims: Machine learning techniques for the text mining of cancer-related clinical documents have not been sufficiently explored. Here some techniques are presented for the pre-processing of free-text breast cancer pathology reports, with the aim of facilitating the extraction of information relevant to cancer staging.

Materials and methods: The first technique was implemented using the freely available software RapidMiner to classify the reports according to their general layout, either ‘semi-structured’ or ‘unstructured’. The second technique was developed using the open-source language engineering framework GATE and aimed to predict which chunks of the report text contain information on the cancer morphology, the tumour size, its hormone receptor status and the number of positive nodes. The classifiers were trained on 635 and tested on 163 manually classified or annotated reports from the Northern Ireland Cancer Registry.

Results: The best result, 99.4% accuracy (only one semi-structured report was predicted as unstructured), was produced by the layout classifier with the k-nearest-neighbour algorithm, using the binary term occurrence word vector type with stopword filter and pruning. For chunk recognition, the best results were found using the PAUM algorithm with the same parameters for all cases, except for the prediction of chunks containing cancer morphology. For semi-structured reports, precision ranged from 0.94 to 0.97 and recall from 0.83 to 0.92; for unstructured reports, precision ranged from 0.64 to 0.91 and recall from 0.41 to 0.68. Results were poor when the classifier was trained on semi-structured reports but tested on unstructured ones.
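The layout classification described above can be illustrated with a minimal sketch in plain Python. It is not the authors' RapidMiner process: the toy reports, the stopword list, the pruning threshold `min_docs`, the choice of `k=3` and the use of Hamming distance are all assumptions made for illustration, since the abstract does not state them.

```python
from collections import Counter

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "and", "of", "in", "is", "was", "were"}


def tokenize(text):
    """Return the set of lowercased, punctuation-stripped, non-stopword terms."""
    tokens = (t.strip(".,:;()").lower() for t in text.split())
    return {t for t in tokens if t and t not in STOPWORDS}


def build_vocabulary(docs, min_docs=2):
    """Pruning step: keep only terms that occur in at least min_docs documents."""
    doc_freq = Counter(term for doc in docs for term in tokenize(doc))
    return {term for term, n in doc_freq.items() if n >= min_docs}


def vectorize(text, vocabulary):
    """Binary term occurrence: a term is either present (in the set) or absent."""
    return tokenize(text) & vocabulary


def knn_predict(text, training, vocabulary, k=3):
    """Majority vote among the k training vectors closest in Hamming distance."""
    query = vectorize(text, vocabulary)
    # For binary vectors, Hamming distance equals the size of the symmetric difference.
    neighbours = sorted(training, key=lambda item: len(query ^ item[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Invented toy reports; the real data came from a cancer registry and is not shown here.
REPORTS = [
    ("Tumour size: 22 mm. Nodes positive: 2. ER status: positive.", "semi-structured"),
    ("Tumour size: 15 mm. Nodes positive: 0. ER status: negative.", "semi-structured"),
    ("Tumour size: 30 mm. Nodes positive: 4. ER status: positive.", "semi-structured"),
    ("The specimen shows an invasive carcinoma measuring 22 mm and two nodes were involved.", "unstructured"),
    ("Sections show an invasive carcinoma; the specimen measuring 15 mm, nodes were clear.", "unstructured"),
    ("The specimen shows invasive carcinoma and several nodes were involved.", "unstructured"),
]

vocabulary = build_vocabulary([text for text, _ in REPORTS])
training = [(vectorize(text, vocabulary), label) for text, label in REPORTS]

print(knn_predict("Tumour size: 18 mm. Nodes positive: 1. ER status: negative.", training, vocabulary))
print(knn_predict("The specimen shows an invasive carcinoma and one node was involved.", training, vocabulary))
```

Representing each report as a set of terms keeps the binary occurrence idea explicit: term counts are discarded, so a layout keyword such as "status" counts the same whether it appears once or many times.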

Conclusions: These results show that it is possible and beneficial to predict the layout of reports, and that the accuracy of predicting which segments of a report contain a given piece of information depends on the report layout and the type of information sought.

Abstract:

This project surveyed the Sundarban Anchalik Sangrahasala Collection housed in the Sundarban area of South 24 Parganas District of West Bengal.
The survey includes the historical background to the collection, the challenges the team faced, a listing of the manuscripts that were digitised, and an inventory of the manuscripts housed at Sundarban Anchalik Sangrahasala (pages 18-35).
Further details about the project, as well as the digital versions of the manuscripts that were copied, can be found on the EAP website.