2 results for vocabularies
in AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Abstract:
This thesis investigates a new approach to document analysis based on the idea of structural patterns in XML vocabularies. My work is founded on the belief that authors naturally converge on a reasonable use of markup languages and that extreme yet valid instances are rare and limited. Actual documents, therefore, may be used to derive classes of elements (patterns) that persist across documents and distill the conceptualization of the documents and their components, and may provide a basis for automatic tools and services that rely on no background information (such as schemas) at all. The central part of my work consists of introducing, from the ground up, a formal theory of eight structural patterns (with three sub-patterns) that are able to express the logical organization of any XML document, and verifying their identifiability in a number of different vocabularies. This model is characterized by and validated against three main dimensions: terseness (i.e. the ability to represent the structure of a document with a small number of objects and composition rules), coverage (i.e. the ability to capture any possible situation in any document) and expressiveness (i.e. the ability to make explicit the semantics of structures, relations and dependencies). An algorithm for the automatic recognition of structural patterns is then presented, together with an evaluation of the results of a test performed on a set of more than 1100 documents from eight very different vocabularies. This language-independent analysis confirms the ability of patterns to capture and summarize the guidelines that authors follow in their everyday practice. Finally, I present some systems that work directly on the pattern-based representation of documents. The ability of these tools to cover very different situations and contexts confirms the effectiveness of the model.
Abstract:
Cultural heritage is the expression of the community to which it belongs, and digital technology can be a valuable tool for telling the stories behind cultural assets, so that they are not only studied but also understood in their deepest meaning by multiple audiences. Publishing manuscript texts on the web using Linked Data technologies makes the text more accessible to non-specialist users and supports the creation of research tools. The digitization proposal of this thesis concerns the life of Federico da Montefeltro written by Vespasiano da Bisticci, using the schema.org, FOAF and Relationship vocabularies to mark up the text and Content Management Systems to publish the data. This will make it possible to build a website whose graphic design can also be curated according to the principles of user experience and information architecture, in order to highlight the figures of the Duke of Urbino and the Florentine bookseller.