994 results for Langage de balisage XML


Relevance:

10.00% 10.00%

Publisher:

Abstract:

Despite being poised as a standard for the exchange of operation and maintenance data, the database heritage of the MIMOSA OSA-EAI is clearly evident from its use of a relational model at its core. The XML schema (XSD) definitions, which are used for communication between asset management systems, are based on the MIMOSA common relational information schema (CRIS), a relational model, and consequently many database concepts permeate the communications layer. The adoption of a relational model leads to several deficiencies and overlooks advances in object-oriented approaches. For an upcoming version of the specification, the common conceptual object model (CCOM) sees a transition to fully utilising object-oriented features for the standard. Unified modelling language (UML) is used as a medium for documentation as well as for facilitating XSD code generation. This paper details some of the decisions faced in developing the CCOM and provides a glimpse into the future of asset management and data exchange models.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This chapter deals with technical aspects of how USDL service descriptions can be read from and written to different representations for use by humans and tools. A combination of techniques for representing and exchanging USDL has been drawn from Model-Driven Engineering and Semantic Web technologies. The USDL language's structural definition is specified as a MOF meta-model, but some modules were originally defined using the OWL language from the Semantic Web community and translated to the meta-model format. We begin with the important topic of serializing USDL descriptions into XML, so that they can be exchanged between editors, repositories, and other tools. The following topic is how USDL can be made available through the Semantic Web as a network of linked data, connected via URIs. Finally, consideration is given to human-readable representations of USDL descriptions, and how they can be generated, in large part, from the contents of a stored USDL model.
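As a rough illustration of the XML round trip this abstract describes, the sketch below writes a small service description to XML and reads it back using Python's standard `xml.etree.ElementTree` module. The element and attribute names (`service`, `name`, `provider`, `uri`) and both function names are hypothetical and are not taken from the USDL meta-model; the actual USDL serialization is defined by the chapter, not by this example.

```python
import xml.etree.ElementTree as ET


def service_to_xml(name, provider, uri):
    """Serialize a toy service description to an XML string.

    The vocabulary used here is an assumption for illustration only.
    """
    root = ET.Element("service", attrib={"uri": uri})
    ET.SubElement(root, "name").text = name
    ET.SubElement(root, "provider").text = provider
    return ET.tostring(root, encoding="unicode")


def service_from_xml(text):
    """Parse the XML produced by service_to_xml back into a dictionary."""
    root = ET.fromstring(text)
    return {
        "name": root.findtext("name"),
        "provider": root.findtext("provider"),
        "uri": root.get("uri"),
    }


if __name__ == "__main__":
    xml_text = service_to_xml("Parcel tracking", "Example Logistics",
                              "http://example.org/services/tracking")
    print(xml_text)
    print(service_from_xml(xml_text))
```

Identifying the description by a URI attribute mirrors, very loosely, the linked-data idea mentioned in the abstract, where descriptions are connected via URIs.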

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Divergence from a random baseline is a technique for the evaluation of document clustering. It ensures that cluster quality measures are performing useful work, which prevents clusterings that provide no useful result from receiving high scores. These concepts are defined and analysed using intrinsic and extrinsic approaches to the evaluation of document cluster quality. This includes the classical clusters-to-categories approach and a novel approach that uses ad hoc information retrieval. The divergence from a random baseline approach is able to differentiate ineffective clusterings encountered in the INEX XML Mining track. It also appears to perform a normalisation similar to the Normalised Mutual Information (NMI) measure, but it can be applied to any measure of cluster quality. When it is applied to the intrinsic measure of distortion as measured by RMSE, subtraction from a random baseline provides a clear optimum that is not apparent otherwise. This approach can be applied to any clustering evaluation. This paper describes its use in the context of document clustering evaluation.
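The idea in this abstract can be sketched in a few lines of Python. The sketch assumes that the random baseline is built by shuffling the cluster assignments, so that cluster sizes are preserved, and it uses purity as the quality measure. Both choices, and the names `purity`, `random_baseline` and `divergence_from_random`, are assumptions made for illustration and are not necessarily the paper's exact procedure.

```python
import random
from collections import Counter


def purity(labels, categories):
    """Fraction of documents that belong to the majority category of their cluster."""
    clusters = {}
    for label, category in zip(labels, categories):
        clusters.setdefault(label, []).append(category)
    majority = sum(Counter(cats).most_common(1)[0][1] for cats in clusters.values())
    return majority / len(labels)


def random_baseline(labels, rng):
    """Shuffle cluster assignments while keeping the same cluster sizes (assumed baseline)."""
    shuffled = list(labels)
    rng.shuffle(shuffled)
    return shuffled


def divergence_from_random(labels, categories, measure=purity, trials=100, seed=0):
    """Score of the clustering minus the mean score of size-matched random clusterings."""
    rng = random.Random(seed)
    baseline = sum(
        measure(random_baseline(labels, rng), categories) for _ in range(trials)
    ) / trials
    return measure(labels, categories) - baseline


if __name__ == "__main__":
    categories = ["a", "a", "a", "b", "b", "b"]
    print(divergence_from_random([0, 0, 0, 1, 1, 1], categories))  # informative clustering
    print(divergence_from_random([0, 1, 2, 3, 4, 5], categories))  # one document per cluster
    print(divergence_from_random([0, 0, 0, 0, 0, 0], categories))  # a single cluster
```

Placing every document in its own cluster gives a raw purity of 1.0, but the shuffled baseline scores exactly the same, so the divergence is zero. This is the sense in which the subtraction stops a degenerate clustering from receiving a high score.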

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Person re-identification involves recognising individuals in different locations across a network of cameras and is a challenging task due to a large number of varying factors such as pose (both subject and camera) and ambient lighting conditions. Existing databases do not adequately capture these variations, making evaluations of proposed techniques difficult. In this paper, we present a new challenging multi-camera surveillance database designed for the task of person re-identification. This database consists of 150 unscripted sequences of subjects travelling in a building environment through up to eight camera views, appearing from various angles and in varying illumination conditions. A flexible XML-based evaluation protocol is provided to allow a highly configurable evaluation setup, enabling a variety of scenarios relating to pose and lighting conditions to be evaluated. A baseline person re-identification system consisting of colour, height and texture models is demonstrated on this database.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Children and the environment covers a broad, interdisciplinary field of research and practice. The social sciences often use the word “environment” to mean the social, political, or economic context of children’s lives, but this bibliography covers physical settings. It focuses on a place-based scale that children can see, hear, taste, smell, touch, and navigate: not large, abstract scales such as national identities or population dynamics, or small scales such as environmental impacts on genes or cell functions. Attention to the everyday settings of children’s lives grew in the 18th century, when Romantic literature introduced the theme of children and nature. In the 19th century, concern for children’s welfare included an interest in conditions for children in burgeoning industrial cities, and justifications for early streetcar and railroad suburbs included claims that they would save children from the dangers of cities and provide the healthful benefits of natural surroundings. In the 20th century, academic disciplines developed different lines of inquiry about the impact of the physical environment on children and how children relate to places: ethnographic studies of children in different parts of the world in the fields of anthropology and geography; sociological studies of different populations of children in different settings; educational research on the learning opportunities that different school and out-of-school settings afford; medical research to understand disease vectors and the impact of pollutants on children; and efforts in the field of environment and behavior research more broadly, to understand how built and designed environments affect children physically, cognitively, socially, and emotionally.
At the beginning of the 21st century, children and the environment is an active area of inquiry seeking to understand rapidly changing conditions for children as the world urbanizes, opportunities for free play outdoors and independent mobility erode in many parts of the world, media environments consume more of children’s time, and awareness grows that children need opportunities to contribute to creating sustainable societies.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

In this paper, we describe a machine-translated parallel English corpus for the NTCIR Chinese, Japanese and Korean (CJK) Wikipedia collections. This document collection is named the CJK2E Wikipedia XML corpus. The corpus could be used in many ways by the information retrieval research community and for knowledge sharing in Wikipedia; for example, it could be used for experiments in cross-lingual information retrieval, cross-lingual link discovery, or omni-lingual information retrieval research. Furthermore, the translated CJK articles could be used to further expand the current coverage of the English Wikipedia.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Due to the development of XML and other data models such as OWL and RDF, sharing data is an increasingly common task since these data models allow simple syntactic translation of data between applications. However, in order for data to be shared semantically, there must be a way to ensure that concepts are the same. One approach is to employ commonly used schemas, called standard schemas, which help guarantee that syntactically identical objects have semantically similar meanings. As a result of the spread of data sharing, there has been widespread adoption of standard schemas in a broad range of disciplines and for a wide variety of applications within a very short period of time. However, standard schemas are still in their infancy and have not yet matured or been thoroughly evaluated. It is imperative that the data management research community takes a closer look at how well these standard schemas have fared in real-world applications to identify not only their advantages, but also the operational challenges that real users face. In this paper, we both examine the usability of standard schemas in a comparison that spans multiple disciplines, and describe our first step at resolving some of these issues in our Semantic Modeling System. We evaluate our Semantic Modeling System through a careful case study of the use of standard schemas in architecture, engineering, and construction, which we conducted with domain experts. We discuss how our Semantic Modeling System can help the broader problem and also discuss a number of challenges that still remain.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Nowadays people rely heavily on the Internet for information and knowledge. Wikipedia is an online multilingual encyclopaedia that contains a very large number of detailed articles covering most written languages. It is often considered to be a treasury of human knowledge. It includes extensive hypertext links between documents of the same language for easy navigation. However, the pages in different languages are rarely cross-linked except for direct equivalent pages on the same subject in different languages. This could pose serious difficulties to users seeking information or knowledge from different lingual sources, or where there is no equivalent page in one language or another. In this thesis, a new information retrieval task, cross-lingual link discovery (CLLD), is proposed to tackle the problem of the lack of cross-lingual anchored links in a knowledge base such as Wikipedia. In contrast to traditional information retrieval tasks, cross-lingual link discovery algorithms actively recommend a set of meaningful anchors in a source document and establish links to documents in an alternative language. In other words, cross-lingual link discovery is a way of automatically finding hypertext links between documents in different languages, which is particularly helpful for knowledge discovery in different language domains. This study is specifically focused on Chinese / English link discovery (C/ELD). Chinese / English link discovery is a special case of the cross-lingual link discovery task. It involves tasks including natural language processing (NLP), cross-lingual information retrieval (CLIR) and cross-lingual link discovery. To justify the effectiveness of CLLD, a standard evaluation framework is also proposed. The evaluation framework includes topics, document collections, a gold standard dataset, evaluation metrics, and toolkits for run pooling, link assessment and system evaluation.
With the evaluation framework, the performance of CLLD approaches and systems can be quantified. This thesis contributes to the research on natural language processing and cross-lingual information retrieval in CLLD: 1) a new, simple but effective Chinese segmentation method, n-gram mutual information, is presented for determining the boundaries of Chinese text; 2) a voting mechanism for named entity translation is demonstrated for achieving a high precision of English / Chinese machine translation; 3) a link mining approach that mines the existing link structure for anchor probabilities achieves encouraging results in suggesting cross-lingual Chinese / English links in Wikipedia. This approach was examined in the experiments for better, automatic generation of cross-lingual links that were carried out as part of the study. The overall major contribution of this thesis is the provision of a standard evaluation framework for cross-lingual link discovery research. It is important in CLLD evaluation to have this framework, which helps in benchmarking the performance of various CLLD systems and in identifying good CLLD realisation approaches. The evaluation methods and the evaluation framework described in this thesis have been utilised to quantify the system performance in the NTCIR-9 Crosslink task, which is the first information retrieval track of this kind.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This project was a step forward in developing and evaluating a novel, mathematical model that can deduce the meaning of words based on their use in language. This model can be applied to a wide range of natural language applications, including the information seeking process most of us undertake on a daily basis.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The INEX workshop is concerned with evaluating the effectiveness of XML retrieval systems. In 2004 a natural language query task was added to the INEX Ad hoc track. Standard INEX Ad hoc topic titles are specified in NEXI -- a simplified and restricted subset of XPath, with a similar feel, and yet with a distinct IR flavour and interpretation. The syntax of NEXI is rigid and it imposes some limitations on the kind of information need that it can faithfully capture. At INEX 2004 the NLP question to be answered was simple -- is it practical to use a natural language query that is the equivalent of the formal NEXI title? The results of this experiment are reported and some information on the future direction of the NLP task is presented.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This report describes the available functionality and use of the ClusterEval evaluation software. It implements novel and standard measures for the evaluation of cluster quality. This software has been used at the INEX XML Mining track and in the MediaEval Social Event Detection task.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The Web is a steadily evolving resource comprising much more than mere HTML pages. With its ever-growing data sources in a variety of formats, it provides great potential for knowledge discovery. In this article, we shed light on some interesting phenomena of the Web: the deep Web, which surfaces database records as Web pages; the Semantic Web, which defines meaningful data exchange formats; XML, which has established itself as a lingua franca for Web data exchange; and domain-specific markup languages, which are designed based on XML syntax with the goal of preserving semantics in targeted domains. We detail these four developments in Web technology, and explain how they can be used for data mining. Our goal is to show that all these areas can be as useful for knowledge discovery as the HTML-based part of the Web.