968 resultados para XML-document
Resumo:
XML similarity evaluation has become a central issue in the database and information communities, its applications ranging over document clustering, version control, data integration and ranked retrieval. Various algorithms for comparing hierarchically structured data, XML documents in particular, have been proposed in the literature. Most of them make use of techniques for finding the edit distance between tree structures, XML documents being commonly modeled as Ordered Labeled Trees. Yet, a thorough investigation of current approaches led us to identify several similarity aspects, i.e., sub-tree related structural and semantic similarities, which are not sufficiently addressed while comparing XML documents. In this paper, we provide an integrated and fine-grained comparison framework to deal with both structural and semantic similarities in XML documents (detecting the occurrences and repetitions of structurally and semantically similar sub-trees), and to allow the end-user to adjust the comparison process according to her requirements. Our framework consists of four main modules for (i) discovering the structural commonalities between sub-trees, (ii) identifying sub-tree semantic resemblances, (iii) computing tree-based edit operations costs, and (iv) computing tree edit distance. Experimental results demonstrate higher comparison accuracy with respect to alternative methods, while timing experiments reflect the impact of semantic similarity on overall system performance.
Resumo:
XML Schema is one of the most used specifications for defining types of XML documents. It provides an extensive set of primitive data types, ways to extend and reuse definitions and an XML syntax that simplifies automatic manipulation. However, many features that make XML Schema Definitions (XSD) so interesting also make them rather cumbersome to read. Several tools to visualize and browse schema definitions have been proposed to cope with this issue. The novel approach proposed in this paper is to base XSD visualization and navigation on the XML document itself, using solely the web browser, without requiring a pre-processing step or an intermediate representation. We present the design and implementation of a web-based XML Schema browser called schem@Doc that operates over the XSD file itself. With this approach, XSD visualization is synchronized with the source file and always reflects its current state. This tool fits well in the schema development process and is easy to integrate in web repositories containing large numbers of XSD files.
Resumo:
En aquest TFC s'ha proposat dissenyar i implementar en llenguatge Java un sistema segur de descàrrega anònima de fitxers. Per fer aquesta tasca es proposa un conjunt d'aplicacions web que es s'intercanviaran dades en format XML de forma segura, sempre sota protocol HTTPS i mantenint la integritat, autenticitat i autenticació de les parts amb l'ajut d'un PKI.
Resumo:
Conèixer les diferents opcions a l'hora d'emmagatzemar dades i documents amb format XML. Familiaritzar-se amb els SGBD nadius i l'accés i maneig de la informació. Conèixer diferents formes d'accés al SGBD, conèixer les APIs disponible Centre en el llenguatge Java. Integració de tots els coneixements adquirits, desenvolupant una aplicació que accedeixi i gestioni dades emmagatzemades al SGBD.
Resumo:
El objetivo de este proyecto es familiarizarse con las tecnologías de Semántica, entender que es una ontología y aprender a modelar una en un dominio elegido por nosotros. Realizar un parser que conectándose a la la Wikipedia y/o DBpedia rellene dicha ontología permitiendo al usuario navegar por sus conceptos y estudiar sus relaciones.
Resumo:
Much has been written about the Semantic Web, on the theory, its possible applications, how to implement it, how to make it effective and commercially attractive ... However, this work aims to be more specific. It is attempted to address a detailed analysis of the technologies that use concepts and tools of the Semantic Web, verifying their actual usefulness, quantifying as far as possible the impact and trying to extrapolate data about its future.It will be an analysis of what the Semantic Web is, how it is defined, which languages are the most appropriate for their development, the commercial applications that can be developed with Semantic Web technology, the applications that are currently running with this technology what kind of deployment of the Semantic Web currently exists, the real use of the Semantic Web technology...
Resumo:
Nykyään kolmeen kerrokseen perustuvat client-server –sovellukset ovat suuri kinnostuskohde sekä niiden kehittäjille etta käyttäjille. Tietotekniikan nopean kehityksen ansiosta näillä sovelluksilla on monipuolinen käyttö teollisuuden eri alueilla. Tällä hetkellä on olemassa paljon työkaluja client-server –sovellusten kehittämiseen, jotka myös tyydyttävät asiakkaiden asettamia vaatimuksia. Nämä työkalut eivät kuitenkaan mahdollista joustavaa toimintaa graafisen käyttöliittyman kanssa. Tämä diplomityö käsittelee client-server –sovellusten kehittamistä XML –kielen avulla. Tämä lähestymistapa mahdollistaa client-server –sovellusten rakentamista niin, että niiden graafinen käyttöliittymä ja ulkonäkö olisivat helposti muokattavissa ilman ohjelman ytimen uudelleenkääntämistä. Diplomityö koostuu kahdesta ostasta: teoreettisesta ja käytännöllisestä. Teoreettinen osa antaa yleisen tiedon client-server –arkkitehtuurista ja kuvailee ohjelmistotekniikan pääkohdat. Käytannöllinen osa esittää tulokset, client-server –sovellusten kehittämisteknologian kehittämislähestymistavan XML: ää käyttäen ja tuloksiin johtavat usecase– ja sekvenssidiagrammit. Käytännöllinen osa myos sisältää esimerkit toteutetuista XML-struktuureista, jotka kuvaavat client –sovellusten kuvaruutukaavakkeiden esintymisen ja serverikyselykaaviot.
Resumo:
This thesis aims at investigating a new approach to document analysis based on the idea of structural patterns in XML vocabularies. My work is founded on the belief that authors do naturally converge to a reasonable use of markup languages and that extreme, yet valid instances are rare and limited. Actual documents, therefore, may be used to derive classes of elements (patterns) persisting across documents and distilling the conceptualization of the documents and their components, and may give ground for automatic tools and services that rely on no background information (such as schemas) at all. The central part of my work consists in introducing from the ground up a formal theory of eight structural patterns (with three sub-patterns) that are able to express the logical organization of any XML document, and verifying their identifiability in a number of different vocabularies. This model is characterized by and validated against three main dimensions: terseness (i.e. the ability to represent the structure of a document with a small number of objects and composition rules), coverage (i.e. the ability to capture any possible situation in any document) and expressiveness (i.e. the ability to make explicit the semantics of structures, relations and dependencies). An algorithm for the automatic recognition of structural patterns is then presented, together with an evaluation of the results of a test performed on a set of more than 1100 documents from eight very different vocabularies. This language-independent analysis confirms the ability of patterns to capture and summarize the guidelines used by the authors in their everyday practice. Finally, I present some systems that work directly on the pattern-based representation of documents. The ability of these tools to cover very different situations and contexts confirms the effectiveness of the model.
Resumo:
This paper describes a tool for recombining the logical structure from an XML document with the typeset appearance of the corresponding PDF document. The tool uses the XML representation as a template for the insertion of the logical structure into the existing PDF document, thereby creating a Structured/Tagged PDF. The addition of logical structure adds value to the PDF in three ways: the accessibility is improved (PDF screen readers for visually impaired users perform better), media options are enhanced (the ability to reflow PDF documents, using structure as a guide, makes PDF viable for use on hand-held devices) and the re-usability of the PDF documents benefits greatly from the presence of an XML-like structure tree to guide the process of text retrieval in reading order (e.g. when interfacing to XML applications and databases).
Resumo:
Documents are often marked up in XML-based tagsets to delineate major structural components such as headings, paragraphs, figure captions and so on, without much regard to their eventual displayed appearance. And yet these same abstract documents, after many transformations and 'typesetting' processes, often emerge in the popular format of Adobe PDF, either for dissemination or archiving. Until recently PDF has been a totally display-based document representation, relying on the underlying PostScript semantics of PDF. Early versions of PDF had no mechanism for retaining any form of abstract document structure but recent releases have now introduced an internal structure tree to create the so called 'Tagged PDF'. This paper describes the development of a plugin for Adobe Acrobat which creates a two-window display. In one window is shown an XML document original and in the other its Tagged PDF counterpart is seen, with an internal structure tree that, in some sense, matches the one seen in XML. If a component is highlighted in either window then the corresponding structured item, with any attendant text, is also highlighted in the other window. Important applications of correctly Tagged PDF include making PDF documents reflow intelligently on small screen devices and enabling them to be read out in correct reading order, via speech synthesiser software, for the visually impaired. By tracing structure transformation from source document to destination one can implement the repair of damaged PDF structure or the adaptation of an existing structure tree to an incrementally updated document.
Resumo:
In this work, we propose a Geographical Information System that can be used as a tool for the treatment and study of problems related with environmental and city management issues. It is based on the Scalable Vector Graphics (SVG) standard for Web development of graphics. The project uses the concept of remate and real-time mar creation by database access through instructions executed by browsers on the Internet. As a way of proving the system effectiveness, we present two study cases;.the first on a region named Maracajaú Coral Reefs, located in Rio Grande do Norte coast, and the second in the Switzerland Northeast in which we intended to promote the substitution of MapServer by the system proposed here. We also show some results that demonstrate the larger geographical data capability achieved by the use of the standardized codes and open source tools, such as Extensible Markup Language (XML), Document Object Model (DOM), script languages ECMAScript/ JavaScript, Hypertext Preprocessor (PHP) and PostgreSQL and its extension, PostGIS
Resumo:
El trabajo presentado a lo largo de este documento es el resultado del TFG1 realizado por Israel Suárez Santiago, alumno de la Escuela Técnica Superior de Ingenieros Informáticos (ETSIINF) de la Universidad Politécnica de Madrid (UPM). Dicho trabajo tiene como finalidad proporcionar una herramienta que, basada en estándares previamente estudiados, permita la fácil creación y gestión de plantillas de mensajes HL7v32 a las que posteriormente se le añadirán datos clínicos que serán insertados en una base de datos para su fácil acceso y consulta. La herramienta desarrollada únicamente facilita una serie de opciones para la creación de la plantilla en sí, que servirá como base para la creación de mensajes HL7v3, es decir, no permite la inclusión de datos específicos en las plantillas generadas, que deberá hacerse con alguna herramienta externa o bien manualmente. Las plantillas generadas por la herramienta se basan principalmente en el estándar CDA3, que proporciona una amplia guía para la correcta generación de mensajes HL7v3. La herramienta garantiza que las plantillas resultantes estarán correctamente formadas, siendo acordes al estándar anteriormente citado y siendo, además, sintácticamente correctas, es decir, el documento .xml generado no contendrá errores. ---ABSTRACT---This document is the result of the TFG developed by Israel Suárez Santiago, student of Escuela Técnica Superior de Ingenieros Informáticos (ETSIINF) of the Universidad Politécnica de Madrid (UPM). This work aims to offer a tool based on standards that can facilitate and manage the creation of HL7v3 templates. Clinical data will be added to those templates in order to load them into a database and query them fast and easily. The tool only facilitates several options to create the template, that will be used to generate the HL7v3 messages, but it does not permit the inclusion of data on them. The inclusion of data will be done manually or using an external tool. The generated templates are based mainly on the CDA1 standard, that provides a widely guide to create HL7v32 messages. The tool guarantees that the resulting templates have been correctly generated, following the previous standard and with no errors in the .xml document generated.
Resumo:
We argue that, for certain constrained domains, elaborate model transformation technologies-implemented from scratch in general-purpose programming languages-are unnecessary for model-driven engineering; instead, lightweight configuration of commercial off-the-shelf productivity tools suffices. In particular, in the CancerGrid project, we have been developing model-driven techniques for the generation of software tools to support clinical trials. A domain metamodel captures the community's best practice in trial design. A scientist authors a trial protocol, modelling their trial by instantiating the metamodel; customized software artifacts to support trial execution are generated automatically from the scientist's model. The metamodel is expressed as an XML Schema, in such a way that it can be instantiated by completing a form to generate a conformant XML document. The same process works at a second level for trial execution: among the artifacts generated from the protocol are models of the data to be collected, and the clinician conducting the trial instantiates such models in reporting observations-again by completing a form to create a conformant XML document, representing the data gathered during that observation. Simple standard form management tools are all that is needed. Our approach is applicable to a wide variety of information-modelling domains: not just clinical trials, but also electronic public sector computing, customer relationship management, document workflow, and so on. © 2012 Springer-Verlag.
Resumo:
Clinical decision support systems (CDSSs) often base their knowledge and advice on human expertise. Knowledge representation needs to be in a format that can be easily understood by human users as well as supporting ongoing knowledge engineering, including evolution and consistency of knowledge. This paper reports on the development of an ontology specification for managing knowledge engineering in a CDSS for assessing and managing risks associated with mental-health problems. The Galatean Risk and Safety Tool, GRiST, represents mental-health expertise in the form of a psychological model of classification. The hierarchical structure was directly represented in the machine using an XML document. Functionality of the model and knowledge management were controlled using attributes in the XML nodes, with an accompanying paper manual for specifying how end-user tools should behave when interfacing with the XML. This paper explains the advantages of using the web-ontology language, OWL, as the specification, details some of the issues and problems encountered in translating the psychological model to OWL, and shows how OWL benefits knowledge engineering. The conclusions are that OWL can have an important role in managing complex knowledge domains for systems based on human expertise without impeding the end-users' understanding of the knowledge base. The generic classification model underpinning GRiST makes it applicable to many decision domains and the accompanying OWL specification facilitates its implementation.
Resumo:
Este proyecto presenta el desarrollo de una aplicación que permite traducir Redes de Petri Coloreadas diseñadas en CPN Tools a un lenguaje para la generación de ficheros de entrada a un simulador/optimizador de Redes de Petri Coloreadas. De esta manera se podrán optimizar modelos creados en CPN Tools ya que esta herramienta no facilita la optimización. Todo el proyecto se ha realizado en C++.