6 results for scholarly text editing
in AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Abstract:
This work is concerned with the growing relationship between two distinct multidisciplinary research fields, Semantic Web technologies and scholarly publishing, which in this context converge into one precise research topic: Semantic Publishing. In the spirit of the original aim of Semantic Publishing, i.e. the improvement of scientific communication by means of semantic technologies, this thesis proposes theories, formalisms and applications for opening up semantic publishing to an effective interaction between scholarly documents (e.g., journal articles) and their related semantic and formal descriptions. The main aim of this work is to increase users' comprehension of documents and to allow document enrichment, discovery and linkage to document-related resources and contexts, such as other articles and raw scientific data. In order to achieve these goals, this thesis investigates and proposes solutions for three of the main issues that semantic publishing promises to address, namely: the need for tools for linking document text to a formal representation of its meaning, the lack of complete metadata schemas for describing documents according to the publishing vocabulary, and the absence of effective user interfaces for easily acting on semantic publishing models and theories.
Abstract:
The need for a convergence between semi-structured data management and Information Retrieval techniques is manifest to the scientific community. In order to meet this growing demand, W3C has recently proposed XQuery Full Text, an IR-oriented extension of XQuery. However, query optimization requires the study of important properties such as query equivalence and containment; to this end, a formal representation of documents and queries is needed. The goal of this thesis is to establish such a formal background. We define a data model for XML documents and propose an algebra able to represent most XQuery Full Text expressions. We show how an XQuery Full Text expression can be translated into an algebraic expression and how an algebraic expression can be optimized.
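To illustrate why query equivalence matters for optimization, here is a minimal Python sketch. It is not the algebra defined in the thesis: the node classes `Word`, `And`, `Or` and `Not` and the functions `matches` and `simplify` are hypothetical names, loosely modelled on the `ftand`, `ftor` and `ftnot` operators of XQuery Full Text. The sketch represents a full-text selection as an expression tree, evaluates it against a set of tokens, and applies two equivalence-preserving rewrites (double-negation elimination and idempotence).

```python
from dataclasses import dataclass


# Hypothetical expression nodes; not the operators of the thesis's algebra.
@dataclass(frozen=True)
class Word:
    text: str


@dataclass(frozen=True)
class And:
    left: object
    right: object


@dataclass(frozen=True)
class Or:
    left: object
    right: object


@dataclass(frozen=True)
class Not:
    operand: object


def matches(expr, tokens):
    """Evaluate a full-text expression against a set of document tokens."""
    if isinstance(expr, Word):
        return expr.text in tokens
    if isinstance(expr, And):
        return matches(expr.left, tokens) and matches(expr.right, tokens)
    if isinstance(expr, Or):
        return matches(expr.left, tokens) or matches(expr.right, tokens)
    if isinstance(expr, Not):
        return not matches(expr.operand, tokens)
    raise TypeError(f"unsupported expression node: {expr!r}")


def simplify(expr):
    """Rewrite an expression into an equivalent, smaller one.

    Applies two rules bottom-up:
      Not(Not(e)) -> e        (double negation)
      And(e, e), Or(e, e) -> e (idempotence)
    """
    if isinstance(expr, Not):
        inner = simplify(expr.operand)
        if isinstance(inner, Not):
            return inner.operand
        return Not(inner)
    if isinstance(expr, (And, Or)):
        left, right = simplify(expr.left), simplify(expr.right)
        if left == right:
            return left
        return type(expr)(left, right)
    return expr


# Roughly: "xml" ftand ftnot (ftnot "xml")
query = And(Word("xml"), Not(Not(Word("xml"))))
optimized = simplify(query)
print(optimized)
print(matches(query, {"xml", "query"}), matches(optimized, {"xml", "query"}))
```

Because each rewrite preserves the result for every possible token set, the optimizer may substitute the smaller expression freely; establishing such equivalences formally is what a data model and an algebra make possible.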
Abstract:
The Ph chromosome is the most frequent cytogenetic aberration associated with adult ALL and represents the single most significant adverse prognostic marker. Although imatinib has led to significant improvements in the treatment of patients with Ph+ ALL, in the majority of cases resistance developed quickly and the disease progressed. Some mechanisms of resistance have been widely described, but the full set of contributing factors driving both the disease and resistance remains to be defined. The observation of rapid development of lymphoblastic leukemia in mice expressing altered Ikaros (Ik) isoforms provided the background for this study. Ikaros is a zinc finger transcription factor required for normal hemopoietic differentiation and proliferation, particularly in the lymphoid lineages. By means of alternative splicing, Ikaros encodes several proteins that differ in their abilities to bind to a consensus DNA-binding site. Shorter, non-DNA-binding isoforms exert a dominant negative effect, inhibiting the ability of longer heterodimer partners to bind DNA. The differential expression pattern of Ik isoforms in Ph+ ALL patients was analyzed in order to determine whether molecular abnormalities involving the Ik gene are associated with resistance to imatinib and dasatinib. Bone marrow and peripheral blood samples were collected from 46 adult patients (median age 55 yrs, range 18-76) with Ph+ ALL at diagnosis and during treatment with imatinib (16 pts) or dasatinib (30 pts). We set up a fast, high-throughput method based on capillary electrophoresis technology to detect and quantify splice variants. 41% of Ph+ ALL patients expressed high levels of the non-DNA-binding dominant negative Ik6 isoform, which lacks critical N-terminal zinc fingers and displays an abnormal subcellular compartmentalization pattern. Nuclear extracts from patients expressing Ik6 failed to bind DNA in a mobility shift assay using a DNA probe containing an Ikaros-specific DNA binding sequence. 
In 59% of Ph+ ALL patients, many splice variants corresponding to the Ik1, Ik2, Ik4, Ik4A, Ik5A, Ik6, Ik6 and Ik8 isoforms coexisted in the same PCR sample. In these patients, aberrant full-length Ikaros isoforms were also identified, characterized by a 60-bp insertion immediately downstream of exon 3 and a recurring 30-bp in-frame deletion at the end of exon 7, most frequently involving the Ik2 and Ik4 isoforms. Both the insertion and the deletion were due to the selection of alternative splice donor and acceptor sites. Molecular monitoring of minimal residual disease showed for the first time in vivo that Ik6 expression strongly correlated with BCR-ABL transcript levels, suggesting that this alteration could depend on Bcr-Abl activity. Patient-derived leukaemia cells expressed dominant-negative Ik6 at diagnosis and at the time of relapse, but never during remission. To test mechanistically whether overexpression of Ik6 in vitro impairs the response to tyrosine kinase inhibitors (TKIs) and contributes to resistance, an imatinib-sensitive, Ik6-negative Ph+ ALL cell line (SUP-B15) was transfected with the complete Ik6 DNA coding sequence. The expression of Ik6 strongly increased proliferation and inhibited apoptosis in TKI-sensitive cells, establishing a previously unknown link between specific molecular defects involving the Ikaros gene and resistance to TKIs in Ph+ ALL patients. Amplification and genomic sequence analysis of the exon splice junction regions showed the presence of 2 single nucleotide polymorphisms (SNPs): rs10251980 [A/G] in the exon 2/3 splice junction and rs10262731 [A/G] in the exon 7/8 splice junction, in 50% and 36% of patients, respectively. A variant of rs11329346 [-/C] was also found in 16% of patients. Two other single nucleotide substitutions not recognized as SNPs were observed. 
Some mutations were predicted by computational analyses (RESCUE approach) to alter cis-splicing elements. In conclusion, these findings demonstrate that the post-transcriptional regulation of alternative splicing of the Ikaros gene is defective in the majority of Ph+ ALL patients treated with TKIs. The overexpression of Ik6, which blocks B-cell differentiation, could contribute to resistance by opening a time window during which leukaemia cells acquire secondary transforming events that confer definitive resistance to imatinib and dasatinib.
Abstract:
This study aims to develop juridical and administrative terminology for the Ladin language, specifically for the Ladin idiom spoken in Val Badia. The need for this study stems from the fact that in South Tyrol the Ladin language is not merely safeguarded: the drafting of administrative and normative texts in Ladin is guaranteed by law. This means that a unified terminology is needed to support translators and editors of specialised texts. The starting points of this study are, on the one hand, the need for a unified terminology and, on the other, the translation work done so far by public administration employees working in Ladin. In order to document their efforts, a corpus of digitized administrative and normative documents was built. The first two chapters focus on the state of the art of projects on terminology and corpus linguistics for lesser-used languages. The information was collected with the help of institutes, universities and researchers dealing with lesser-used languages. The third chapter focuses on the development of administrative language in Ladin, and the fourth chapter focuses on the creation of the trilingual Italian – German – Ladin corpus of administrative and normative documents. The last chapter deals with the methodologies applied to develop the terminology entries in Ladin through the use of the trilingual corpus. Starting from the terminology entry, all steps are described, from term extraction to the extraction of equivalents, contexts and definitions, as well as the elaboration of translation proposals for equivalents that could not be found. Finally, the problems involved in developing terminology in Ladin are illustrated.
Abstract:
This work consists of a translation of adage 2001, Herculei labores, with a commentary on lines 1-116, which comprise the account of Hercules' labour against the Lernaean Hydra and the interpretations that Erasmus offers of it in order to introduce philology as a Herculean undertaking in an autobiographical key. The introduction presents a summary of the notable elements of the commentary and some observations on the humanist's self-representation. Erasmus makes his identification with Hercules a topos of his self-description in an ironic key, but he also presents himself as an emulator of Jerome, whose letters he edits. Finally, this work considers the portrait of Erasmus painted by Holbein and kept at Longford Castle in relation to the text of the adage, to which it alludes with the inscription in the foreground, ΗΡΑΚΛΕΙΟΙ ΠΟΝΟΙ.
Abstract:
Information is nowadays a key resource: machine learning and data mining techniques have been developed to extract high-level information from large amounts of data. As most data comes in the form of unstructured text in natural languages, research on text mining is currently very active and deals with practical problems. Among these, text categorization deals with the automatic organization of large quantities of documents into predefined taxonomies of topic categories, possibly arranged in large hierarchies. In commonly proposed machine learning approaches, classifiers are automatically trained from pre-labeled documents: they can perform very accurate classification, but often require a substantial training set and notable computational effort. Methods for cross-domain text categorization have been proposed, which make it possible to leverage a set of labeled documents from one domain to classify those of another. Most methods use advanced statistical techniques, usually involving parameter tuning. A first contribution presented here is a method based on nearest centroid classification, where profiles of categories are generated from the known domain and then iteratively adapted to the unknown one. Despite being conceptually simple and having easily tuned parameters, this method achieves state-of-the-art accuracy on most benchmark datasets with fast running times. A second, deeper contribution involves the design of a domain-independent model to distinguish the degree and type of relatedness between arbitrary documents and topics, inferred from the different types of semantic relationships between their respective representative words, which are identified by specific search algorithms. The application of this model is tested on both flat and hierarchical text categorization, where it potentially allows the efficient addition of new categories during classification. 
Results show that classification accuracy still requires improvement, but models generated from one domain can be effectively reused in a different one.
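The nearest-centroid idea described in this abstract can be sketched roughly as follows. This is an illustrative approximation under stated assumptions, not the thesis's actual method: documents are assumed to be plain term-count vectors, similarity is assumed to be cosine, and the function name `adapt_centroids`, its helpers and the toy data are all hypothetical. Category profiles (centroids) are built from the labeled source domain, then repeatedly re-estimated from the target documents currently assigned to each category until the assignments stop changing.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(a, b):
    """Dot product; equals cosine similarity when both inputs are unit length."""
    return sum(x * y for x, y in zip(a, b))


def nearest(vec, centroids):
    """Label of the centroid most similar to vec."""
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


def adapt_centroids(source_docs, source_labels, target_docs, max_iter=20):
    """Assumed variant: build profiles on the source, adapt them to the target."""
    source_docs = [normalize(d) for d in source_docs]
    target_docs = [normalize(d) for d in target_docs]

    # Initial category profiles come from the labeled (known) domain.
    centroids = {}
    for label in sorted(set(source_labels)):
        members = [d for d, l in zip(source_docs, source_labels) if l == label]
        centroids[label] = normalize(mean_vector(members))

    assigned = None
    for _ in range(max_iter):
        new_assigned = [nearest(d, centroids) for d in target_docs]
        if new_assigned == assigned:
            break  # assignments are stable, so the profiles have converged
        assigned = new_assigned
        # Re-estimate each profile from the target documents assigned to it;
        # a category with no assigned documents keeps its previous profile.
        for label in centroids:
            members = [d for d, l in zip(target_docs, assigned) if l == label]
            if members:
                centroids[label] = normalize(mean_vector(members))
    return assigned, centroids


# Toy data over a hypothetical 4-term vocabulary.
source_docs = [[3, 1, 0, 0], [2, 0, 0, 0], [0, 0, 2, 1], [0, 0, 3, 0]]
source_labels = ["sport", "sport", "politics", "politics"]
target_docs = [[1, 2, 0, 0], [0, 3, 0, 0], [0, 0, 1, 3], [0, 0, 0, 2]]

assigned, centroids = adapt_centroids(source_docs, source_labels, target_docs)
print(assigned)
```

In the toy data the target documents emphasize different terms than the source documents of the same category, so the adapted profiles drift toward the target domain's vocabulary while the category labels are carried over from the source.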