977 resultados para XML Markup Language
Resumo:
In this paper, we classify, review, and experimentally compare major methods that are exploited in the definition, adoption, and utilization of element similarity measures in the context of XML schema matching. We aim at presenting a unified view which is useful when developing a new element similarity measure, when implementing an XML schema matching component, when using an XML schema matching system, and when comparing XML schema matching systems.
Resumo:
This paper describes the approach taken to the clustering task at INEX 2009 by a group at the Queensland University of Technology. The Random Indexing (RI) K-tree has been used with a representation that is based on the semantic markup available in the INEX 2009 Wikipedia collection. The RI K-tree is a scalable approach to clustering large document collections. This approach has produced quality clustering when evaluated using two different methodologies.
Resumo:
Nowadays people heavily rely on the Internet for information and knowledge. Wikipedia is an online multilingual encyclopaedia that contains a very large number of detailed articles covering most written languages. It is often considered to be a treasury of human knowledge. It includes extensive hypertext links between documents of the same language for easy navigation. However, the pages in different languages are rarely cross-linked except for direct equivalent pages on the same subject in different languages. This could pose serious difficulties to users seeking information or knowledge from different lingual sources, or where there is no equivalent page in one language or another. In this thesis, a new information retrieval task—cross-lingual link discovery (CLLD) is proposed to tackle the problem of the lack of cross-lingual anchored links in a knowledge base such as Wikipedia. In contrast to traditional information retrieval tasks, cross language link discovery algorithms actively recommend a set of meaningful anchors in a source document and establish links to documents in an alternative language. In other words, cross-lingual link discovery is a way of automatically finding hypertext links between documents in different languages, which is particularly helpful for knowledge discovery in different language domains. This study is specifically focused on Chinese / English link discovery (C/ELD). Chinese / English link discovery is a special case of cross-lingual link discovery task. It involves tasks including natural language processing (NLP), cross-lingual information retrieval (CLIR) and cross-lingual link discovery. To justify the effectiveness of CLLD, a standard evaluation framework is also proposed. The evaluation framework includes topics, document collections, a gold standard dataset, evaluation metrics, and toolkits for run pooling, link assessment and system evaluation. With the evaluation framework, performance of CLLD approaches and systems can be quantified. This thesis contributes to the research on natural language processing and cross-lingual information retrieval in CLLD: 1) a new simple, but effective Chinese segmentation method, n-gram mutual information, is presented for determining the boundaries of Chinese text; 2) a voting mechanism of name entity translation is demonstrated for achieving a high precision of English / Chinese machine translation; 3) a link mining approach that mines the existing link structure for anchor probabilities achieves encouraging results in suggesting cross-lingual Chinese / English links in Wikipedia. This approach was examined in the experiments for better, automatic generation of cross-lingual links that were carried out as part of the study. The overall major contribution of this thesis is the provision of a standard evaluation framework for cross-lingual link discovery research. It is important in CLLD evaluation to have this framework which helps in benchmarking the performance of various CLLD systems and in identifying good CLLD realisation approaches. The evaluation methods and the evaluation framework described in this thesis have been utilised to quantify the system performance in the NTCIR-9 Crosslink task which is the first information retrieval track of this kind.
Resumo:
基于内容的Pub/Sub系统能够为大范围软件集成提供松耦合的、可扩展的集成能力,越来越引起人们的重视.现存的方法主要集中在对单一事件的匹配,缺乏对复合事件模型及其订阅语言的研究.针对此问题,文章给出一种基于XML事件的复合事件模型及其订阅语言,该模型包括事件时序逻辑模型和事件复合模式,并扩展了XML—QL语言得到复合事件订阅语言EXML—QL支持事件复合模型和模式;最后,依据该订阅语言的特点构造复合事件匹配算法,给出该算法的分析和验证.
Resumo:
We present a type system, StaXML, which employs the stacked type syntax to represent essential aspects of the potential roles of XML fragments to the structure of complete XML documents. The simplest application of this system is to enforce well-formedness upon the construction of XML documents without requiring the use of templates or balanced "gap plugging" operators; this allows it to be applied to programs written according to common imperative web scripting idioms, particularly the echoing of unbalanced XML fragments to an output buffer. The system can be extended to verify particular XML applications such as XHTML and identifying individual XML tags constructed from their lexical components. We also present StaXML for PHP, a prototype precompiler for the PHP4 scripting language which infers StaXML types for expressions without assistance from the programmer.
A policy-definition language and prototype implementation library for policy-based autonomic systems
Resumo:
This paper presents work towards generic policy toolkit support for autonomic computing systems in which the policies themselves can be adapted dynamically and automatically. The work is motivated by three needs: the need for longer-term policy-based adaptation where the policy itself is dynamically adapted to continually maintain or improve its effectiveness despite changing environmental conditions; the need to enable non autonomics-expert practitioners to embed self-managing behaviours with low cost and risk; and the need for adaptive policy mechanisms that are easy to deploy into legacy code. A policy definition language is presented; designed to permit powerful expression of self-managing behaviours. The language is very flexible through the use of simple yet expressive syntax and semantics, and facilitates a very diverse policy behaviour space through both hierarchical and recursive uses of language elements. A prototype library implementation of the policy support mechanisms is described. The library reads and writes policies in well-formed XML script. The implementation extends the state of the art in policy-based autonomics through innovations which include support for multiple policy versions of a given policy type, multiple configuration templates, and meta-policies to dynamically select between policy instances and templates. Most significantly, the scheme supports hot-swapping between policy instances. To illustrate the feasibility and generalised applicability of these tools, two dissimilar example deployment scenarios are examined. The first is taken from an exploratory implementation of self-managing parallel processing, and is used to demonstrate the simple and efficient use of the tools. The second example demonstrates more-advanced functionality, in the context of an envisioned multi-policy stock trading scheme which is sensitive to environmental volatility
Resumo:
This paper outlines the design and development of a Java-based, unified and flexible natural language dialogue system that enables users to interact using natural language, e.g. speech. A number of software development issues are considered with the aim of designing an architecture that enables different discourse components to be readily and flexibly combined in a manner that permits information to be easily shared. Use of XML schemas assists this component interaction. The paper describes how a range of Java language features were employed to support the development of the architecture, providing an illustration of how a modern programming language makes tractable the development of a complex dialogue system.
Resumo:
El artículo está incluido en un número monográfico especial con los trabajos del I Simposio Pluridisciplinar sobre Diseño, Evaluación y Descripción de Contenidos Educativos Reutilizables (Guadalajara, Octubre 2004).Resumen tomado de la publicación
Resumo:
XML has become an important medium for data exchange, and is frequently used as an interface to - i.e. a view of - a relational database. Although lots of work have been done on querying relational databases through XML views, the problem of updating relational databases through XML views has not received much attention. In this work, we give the rst steps towards solving this problem. Using query trees to capture the notions of selection, projection, nesting, grouping, and heterogeneous sets found throughout most XML query languages, we show how XML views expressed using query trees can be mapped to a set of corresponding relational views. Thus, we transform the problem of updating relational databases through XML views into a classical problem of updating relational databases through relational views. We then show how updates on the XML view are mapped to updates on the corresponding relational views. Existing work on updating relational views can then be leveraged to determine whether or not the relational views are updatable with respect to the relational updates, and if so, to translate the updates to the underlying relational database. Since query trees are a formal characterization of view de nition queries, they are not well suited for end-users. We then investigate how a subset of XQuery can be used as a top level language, and show how query trees can be used as an intermediate representation of view de nitions expressed in this subset.
Resumo:
This thesis aims at investigating a new approach to document analysis based on the idea of structural patterns in XML vocabularies. My work is founded on the belief that authors do naturally converge to a reasonable use of markup languages and that extreme, yet valid instances are rare and limited. Actual documents, therefore, may be used to derive classes of elements (patterns) persisting across documents and distilling the conceptualization of the documents and their components, and may give ground for automatic tools and services that rely on no background information (such as schemas) at all. The central part of my work consists in introducing from the ground up a formal theory of eight structural patterns (with three sub-patterns) that are able to express the logical organization of any XML document, and verifying their identifiability in a number of different vocabularies. This model is characterized by and validated against three main dimensions: terseness (i.e. the ability to represent the structure of a document with a small number of objects and composition rules), coverage (i.e. the ability to capture any possible situation in any document) and expressiveness (i.e. the ability to make explicit the semantics of structures, relations and dependencies). An algorithm for the automatic recognition of structural patterns is then presented, together with an evaluation of the results of a test performed on a set of more than 1100 documents from eight very different vocabularies. This language-independent analysis confirms the ability of patterns to capture and summarize the guidelines used by the authors in their everyday practice. Finally, I present some systems that work directly on the pattern-based representation of documents. The ability of these tools to cover very different situations and contexts confirms the effectiveness of the model.
Resumo:
In this paper we present XSAMPL3D, a novel language for the high-level representation of actions performed on objects by (virtual) humans. XSAMPL3D was designed to serve as action representation language in an imitation-based approach to character animation: First, a human demonstrates a sequence of object manipulations in an immersive Virtual Reality (VR) environment. From this demonstration, an XSAMPL3D description is automatically derived that represents the actions in terms of high-level action types and involved objects. The XSAMPL3D action description can then be used for the synthesis of animations where virtual humans of different body sizes and proportions reproduce the demonstrated action. Actions are encoded in a compact and human-readable XML-format. Thus, XSAMPL3D describtions are also amenable to manual authoring, e.g. for rapid prototyping of animations when no immersive VR environment is at the animator's disposal. However, when XSAMPL3D descriptions are derived from VR interactions, they can accomodate many details of the demonstrated action, such as motion trajectiories,hand shapes and other hand-object relations during grasping. Such detail would be hard to specify with manual motion authoring techniques only. Through the inclusion of language features that allow the representation of all relevant aspects of demonstrated object manipulations, XSAMPL3D is a suitable action representation language for the imitation-based approach to character animation.
Resumo:
I work in the field of Armenian historiography. This means I get to play with medieval manuscripts. The things I'm doing with the manuscripts are theoretically interesting, but pretty boring in practice, so I'm using Perl to program away the most boring bits. I will talk about the problems of text criticism in general, what sorts of things can and can't be done by the computer, my initial aversion to XML, how I was shown (some of) the error of my ways, and how I'm combining a bunch of isolated pieces of technology that were mostly already in use to achieve fame and fortune in the world of Armenian studies.
Resumo:
Clinical decision support systems (CDSSs) often base their knowledge and advice on human expertise. Knowledge representation needs to be in a format that can be easily understood by human users as well as supporting ongoing knowledge engineering, including evolution and consistency of knowledge. This paper reports on the development of an ontology specification for managing knowledge engineering in a CDSS for assessing and managing risks associated with mental-health problems. The Galatean Risk and Safety Tool, GRiST, represents mental-health expertise in the form of a psychological model of classification. The hierarchical structure was directly represented in the machine using an XML document. Functionality of the model and knowledge management were controlled using attributes in the XML nodes, with an accompanying paper manual for specifying how end-user tools should behave when interfacing with the XML. This paper explains the advantages of using the web-ontology language, OWL, as the specification, details some of the issues and problems encountered in translating the psychological model to OWL, and shows how OWL benefits knowledge engineering. The conclusions are that OWL can have an important role in managing complex knowledge domains for systems based on human expertise without impeding the end-users' understanding of the knowledge base. The generic classification model underpinning GRiST makes it applicable to many decision domains and the accompanying OWL specification facilitates its implementation.