9 resultados para Markup Language for Manuscript Images
em Aston University Research Archive
Resumo:
Traditionally, geostatistical algorithms are contained within specialist GIS and spatial statistics software. Such packages are often expensive, with relatively complex user interfaces and steep learning curves, and cannot be easily integrated into more complex process chains. In contrast, Service Oriented Architectures (SOAs) promote interoperability and loose coupling within distributed systems, typically using XML (eXtensible Markup Language) and Web services. Web services provide a mechanism for a user to discover and consume a particular process, often as part of a larger process chain, with minimal knowledge of how it works. Wrapping current geostatistical algorithms with a Web service layer would thus increase their accessibility, but raises several complex issues. This paper discusses a solution to providing interoperable, automatic geostatistical processing through the use of Web services, developed in the INTAMAP project (INTeroperability and Automated MAPping). The project builds upon Open Geospatial Consortium standards for describing observations, typically used within sensor webs, and employs Geography Markup Language (GML) to describe the spatial aspect of the problem domain. Thus the interpolation service is extremely flexible, being able to support a range of observation types, and can cope with issues such as change of support and differing error characteristics of sensors (by utilising descriptions of the observation process provided by SensorML). XML is accepted as the de facto standard for describing Web services, due to its expressive capabilities which allow automatic discovery and consumption by ‘naive’ users. Any XML schema employed must therefore be capable of describing every aspect of a service and its processes. However, no schema currently exists that can define the complex uncertainties and modelling choices that are often present within geostatistical analysis. We show a solution to this problem, developing a family of XML schemata to enable the description of a full range of uncertainty types. These types will range from simple statistics, such as the kriging mean and variances, through to a range of probability distributions and non-parametric models, such as realisations from a conditional simulation. By employing these schemata within a Web Processing Service (WPS) we show a prototype moving towards a truly interoperable geostatistical software architecture.
Resumo:
This thesis provides an interoperable language for quantifying uncertainty using probability theory. A general introduction to interoperability and uncertainty is given, with particular emphasis on the geospatial domain. Existing interoperable standards used within the geospatial sciences are reviewed, including Geography Markup Language (GML), Observations and Measurements (O&M) and the Web Processing Service (WPS) specifications. The importance of uncertainty in geospatial data is identified and probability theory is examined as a mechanism for quantifying these uncertainties. The Uncertainty Markup Language (UncertML) is presented as a solution to the lack of an interoperable standard for quantifying uncertainty. UncertML is capable of describing uncertainty using statistics, probability distributions or a series of realisations. The capabilities of UncertML are demonstrated through a series of XML examples. This thesis then provides a series of example use cases where UncertML is integrated with existing standards in a variety of applications. The Sensor Observation Service - a service for querying and retrieving sensor-observed data - is extended to provide a standardised method for quantifying the inherent uncertainties in sensor observations. The INTAMAP project demonstrates how UncertML can be used to aid uncertainty propagation using a WPS by allowing UncertML as input and output data. The flexibility of UncertML is demonstrated with an extension to the GML geometry schemas to allow positional uncertainty to be quantified. Further applications and developments of UncertML are discussed.
Resumo:
Most object-based approaches to Geographical Information Systems (GIS) have concentrated on the representation of geometric properties of objects in terms of fixed geometry. In our road traffic marking application domain we have a requirement to represent the static locations of the road markings but also enforce the associated regulations, which are typically geometric in nature. For example a give way line of a pedestrian crossing in the UK must be within 1100-3000 mm of the edge of the crossing pattern. In previous studies of the application of spatial rules (often called 'business logic') in GIS emphasis has been placed on the representation of topological constraints and data integrity checks. There is very little GIS literature that describes models for geometric rules, although there are some examples in the Computer Aided Design (CAD) literature. This paper introduces some of the ideas from so called variational CAD models to the GIS application domain, and extends these using a Geography Markup Language (GML) based representation. In our application we have an additional requirement; the geometric rules are often changed and vary from country to country so should be represented in a flexible manner. In this paper we describe an elegant solution to the representation of geometric rules, such as requiring lines to be offset from other objects. The method uses a feature-property model embraced in GML 3.1 and extends the possible relationships in feature collections to permit the application of parameterized geometric constraints to sub features. We show the parametric rule model we have developed and discuss the advantage of using simple parametric expressions in the rule base. We discuss the possibilities and limitations of our approach and relate our data model to GML 3.1. © 2006 Springer-Verlag Berlin Heidelberg.
Resumo:
INTAMAP is a web processing service for the automatic interpolation of measured point data. Requirements were (i) using open standards for spatial data such as developed in the context of the open geospatial consortium (OGC), (ii) using a suitable environment for statistical modelling and computation, and (iii) producing an open source solution. The system couples the 52-North web processing service, accepting data in the form of an observations and measurements (O&M) document with a computing back-end realized in the R statistical environment. The probability distribution of interpolation errors is encoded with UncertML, a new markup language to encode uncertain data. Automatic interpolation needs to be useful for a wide range of applications and the algorithms have been designed to cope with anisotropies and extreme values. In the light of the INTAMAP experience, we discuss the lessons learnt.
Resumo:
INTAMAP is a Web Processing Service for the automatic spatial interpolation of measured point data. Requirements were (i) using open standards for spatial data such as developed in the context of the Open Geospatial Consortium (OGC), (ii) using a suitable environment for statistical modelling and computation, and (iii) producing an integrated, open source solution. The system couples an open-source Web Processing Service (developed by 52°North), accepting data in the form of standardised XML documents (conforming to the OGC Observations and Measurements standard) with a computing back-end realised in the R statistical environment. The probability distribution of interpolation errors is encoded with UncertML, a markup language designed to encode uncertain data. Automatic interpolation needs to be useful for a wide range of applications and the algorithms have been designed to cope with anisotropy, extreme values, and data with known error distributions. Besides a fully automatic mode, the system can be used with different levels of user control over the interpolation process.
Resumo:
Models are central tools for modern scientists and decision makers, and there are many existing frameworks to support their creation, execution and composition. Many frameworks are based on proprietary interfaces, and do not lend themselves to the integration of models from diverse disciplines. Web based systems, or systems based on web services, such as Taverna and Kepler, allow composition of models based on standard web service technologies. At the same time the Open Geospatial Consortium has been developing their own service stack, which includes the Web Processing Service, designed to facilitate the executing of geospatial processing - including complex environmental models. The current Open Geospatial Consortium service stack employs Extensible Markup Language as a default data exchange standard, and widely-used encodings such as JavaScript Object Notation can often only be used when incorporated with Extensible Markup Language. Similarly, no successful engagement of the Web Processing Service standard with the well-supported technologies of Simple Object Access Protocol and Web Services Description Language has been seen. In this paper we propose a pure Simple Object Access Protocol/Web Services Description Language processing service which addresses some of the issues with the Web Processing Service specication and brings us closer to achieving a degree of interoperability between geospatial models, and thus realising the vision of a useful 'model web'.
Resumo:
The Semantic Web relies on carefully structured, well defined, data to allow machines to communicate and understand one another. In many domains (e.g. geospatial) the data being described contains some uncertainty, often due to incomplete knowledge; meaningful processing of this data requires these uncertainties to be carefully analysed and integrated into the process chain. Currently, within the SemanticWeb there is no standard mechanism for interoperable description and exchange of uncertain information, which renders the automated processing of such information implausible, particularly where error must be considered and captured as it propagates through a processing sequence. In particular we adopt a Bayesian perspective and focus on the case where the inputs / outputs are naturally treated as random variables. This paper discusses a solution to the problem in the form of the Uncertainty Markup Language (UncertML). UncertML is a conceptual model, realised as an XML schema, that allows uncertainty to be quantified in a variety of ways i.e. realisations, statistics and probability distributions. UncertML is based upon a soft-typed XML schema design that provides a generic framework from which any statistic or distribution may be created. Making extensive use of Geography Markup Language (GML) dictionaries, UncertML provides a collection of definitions for common uncertainty types. Containing both written descriptions and mathematical functions, encoded as MathML, the definitions within these dictionaries provide a robust mechanism for defining any statistic or distribution and can be easily extended. Universal Resource Identifiers (URIs) are used to introduce semantics to the soft-typed elements by linking to these dictionary definitions. The INTAMAP (INTeroperability and Automated MAPping) project provides a use case for UncertML. This paper demonstrates how observation errors can be quantified using UncertML and wrapped within an Observations & Measurements (O&M) Observation. The interpolation service uses the information within these observations to influence the prediction outcome. The output uncertainties may be encoded in a variety of UncertML types, e.g. a series of marginal Gaussian distributions, a set of statistics, such as the first three marginal moments, or a set of realisations from a Monte Carlo treatment. Quantifying and propagating uncertainty in this way allows such interpolation results to be consumed by other services. This could form part of a risk management chain or a decision support system, and ultimately paves the way for complex data processing chains in the Semantic Web.
Resumo:
This paper describes part of the corpus collection efforts underway in the EC funded Companions project. The Companions project is collecting substantial quantities of dialogue a large part of which focus on reminiscing about photographs. The texts are in English and Czech. We describe the context and objectives for which this dialogue corpus is being collected, the methodology being used and make observations on the resulting data. The corpora will be made available to the wider research community through the Companions Project web site.
Resumo:
Time after time… and aspect and mood. Over the last twenty five years, the study of time, aspect and - to a lesser extent - mood acquisition has enjoyed increasing popularity and a constant widening of its scope. In such a teeming field, what can be the contribution of this book? We believe that it is unique in several respects. First, this volume encompasses studies from different theoretical frameworks: functionalism vs generativism or function-based vs form-based approaches. It also brings together various sub-fields (first and second language acquisition, child and adult acquisition, bilingualism) that tend to evolve in parallel rather than learn from each other. A further originality is that it focuses on a wide range of typologically different languages, and features less studied languages such as Korean and Bulgarian. Finally, the book gathers some well-established scholars, young researchers, and even research students, in a rich inter-generational exchange, that ensures the survival but also the renewal and the refreshment of the discipline. The book at a glance The first part of the volume is devoted to the study of child language acquisition in monolingual, impaired and bilingual acquisition, while the second part focuses on adult learners. In this section, we will provide an overview of each chapter. The first study by Aviya Hacohen explores the acquisition of compositional telicity in Hebrew L1. Her psycholinguistic approach contributes valuable data to refine theoretical accounts. Through an innovating methodology, she gathers information from adults and children on the influence of definiteness, number, and the mass vs countable distinction on the constitution of a telic interpretation of the verb phrase. She notices that the notion of definiteness is mastered by children as young as 10, while the mass/count distinction does not appear before 10;7. However, this does not entail an adult-like use of telicity. She therefore concludes that beyond definiteness and noun type, pragmatics may play an important role in the derivation of Hebrew compositional telicity. For the second chapter we move from a Semitic language to a Slavic one. Milena Kuehnast focuses on the acquisition of negative imperatives in Bulgarian, a form that presents the specificity of being grammatical only with the imperfective form of the verb. The study examines how 40 Bulgarian children distributed in two age-groups (15 between 2;11-3;11, and 25 between 4;00 and 5;00) develop with respect to the acquisition of imperfective viewpoints, and the use of imperfective morphology. It shows an evolution in the recourse to expression of force in the use of negative imperatives, as well as the influence of morphological complexity on the successful production of forms. With Yi-An Lin’s study, we concentrate both on another type of informant and of framework. Indeed, he studies the production of children suffering from Specific Language Impairment (SLI), a developmental language disorder the causes of which exclude cognitive impairment, psycho-emotional disturbance, and motor-articulatory disorders. Using the Leonard corpus in CLAN, Lin aims to test two competing accounts of SLI (the Agreement and Tense Omission Model [ATOM] and his own Phonetic Form Deficit Model [PFDM]) that conflicts on the role attributed to spellout in the impairment. Spellout is the point at which the Computational System for Human Language (CHL) passes over the most recently derived part of the derivation to the interface components, Phonetic Form (PF) and Logical Form (LF). ATOM claims that SLI sufferers have a deficit in their syntactic representation while PFDM suggests that the problem only occurs at the spellout level. After studying the corpus from the point of view of tense / agreement marking, case marking, argument-movement and auxiliary inversion, Lin finds further support for his model. Olga Gupol, Susan Rohstein and Sharon Armon-Lotem’s chapter offers a welcome bridge between child language acquisition and multilingualism. Their study explores the influence of intensive exposure to L2 Hebrew on the development of L1 Russian tense and aspect morphology through an elicited narrative. Their informants are 40 Russian-Hebrew sequential bilingual children distributed in two age groups 4;0 – 4;11 and 7;0 - 8;0. They come to the conclusion that bilingual children anchor their narratives in perfective like monolinguals. However, while aware of grammatical aspect, bilinguals lack the full form-function mapping and tend to overgeneralize the imperfective on the principles of simplicity (as imperfective are the least morphologically marked forms), universality (as it covers more functions) and interference. Rafael Salaberry opens the second section on foreign language learners. In his contribution, he reflects on the difficulty L2 learners of Spanish encounter when it comes to distinguishing between iterativity (conveyed with the use of the preterite) and habituality (expressed through the imperfect). He examines in turn the theoretical views that see, on the one hand, habituality as part of grammatical knowledge and iterativity as pragmatic knowledge, and on the other hand both habituality and iterativity as grammatical knowledge. He comes to the conclusion that the use of preterite as a default past tense marker may explain the impoverished system of aspectual distinctions, not only at beginners but also at advanced levels, which may indicate that the system is differentially represented among L1 and L2 speakers. Acquiring the vast array of functions conveyed by a form is therefore no mean feat, as confirmed by the next study. Based on the prototype theory, Kathleen Bardovi-Harlig’s chapter focuses on the development of the progressive in L2 English. It opens with an overview of the functions of the progressive in English. Then, a review of acquisition research on the progressive in English and other languages is provided. The bulk of the chapter reports on a longitudinal study of 16 learners of L2 English and shows how their use of the progressive expands from the prototypical uses of process and continuousness to the less prototypical uses of repetition and future. The study concludes that the progressive spreads in interlanguage in accordance with prototype accounts. However, it suggests additional stages, not predicted by the Aspect Hypothesis, in the development from activities and accomplishments at least for the meaning of repeatedness. A similar theoretical framework is adopted in the following chapter, but it deals with a lesser studied language. Hyun-Jin Kim revisits the claims of the Aspect Hypothesis in relation to the acquisition of L2 Korean by two L1 English learners. Inspired by studies on L2 Japanese, she focuses on the emergence and spread of the past / perfective marker ¬–ess- and the progressive – ko iss- in the interlanguage of her informants throughout their third and fourth semesters of study. The data collected through six sessions of conversational interviews and picture description tasks seem to support the Aspect Hypothesis. Indeed learners show a strong association between past tense and accomplishments / achievements at the start and a gradual extension to other types; a limited use of past / perfective marker with states and an affinity of progressive with activities / accomplishments and later achievements. In addition, - ko iss– moves from progressive to resultative in the specific category of Korean verbs meaning wear / carry. While the previous contributions focus on function, Evgeniya Sergeeva and Jean-Pierre Chevrot’s is interested in form. The authors explore the acquisition of verbal morphology in L2 French by 30 instructed native speakers of Russian distributed in a low and high levels. They use an elicitation task for verbs with different models of stem alternation and study how token frequency and base forms influence stem selection. The analysis shows that frequency affects correct production, especially among learners with high proficiency. As for substitution errors, it appears that forms with a simple structure are systematically more frequent than the target form they replace. When a complex form serves as a substitute, it is more frequent only when it is replacing another complex form. As regards the use of base forms, the 3rd person singular of the present – and to some extent the infinitive – play this role in the corpus. The authors therefore conclude that the processing of surface forms can be influenced positively or negatively by the frequency of the target forms and of other competing stems, and by the proximity of the target stem to a base form. Finally, Martin Howard’s contribution takes up the challenge of focusing on the poorer relation of the TAM system. On the basis of L2 French data obtained through sociolinguistic interviews, he studies the expression of futurity, conditional and subjunctive in three groups of university learners with classroom teaching only (two or three years of university teaching) or with a mixture of classroom teaching and naturalistic exposure (2 years at University + 1 year abroad). An analysis of relative frequencies leads him to suggest a continuum of use going from futurate present to conditional with past hypothetic conditional clauses in si, which needs to be confirmed by further studies. Acknowledgements The present volume was inspired by the conference Acquisition of Tense – Aspect – Mood in First and Second Language held on 9th and 10th February 2008 at Aston University (Birmingham, UK) where over 40 delegates from four continents and over a dozen countries met for lively and enjoyable discussions. This collection of papers was double peer-reviewed by an international scientific committee made of Kathleen Bardovi-Harlig (Indiana University), Christine Bozier (Lund Universitet), Alex Housen (Vrije Universiteit Brussel), Martin Howard (University College Cork), Florence Myles (Newcastle University), Urszula Paprocka (Catholic University of Lublin), †Clive Perdue (Université Paris 8), Michel Pierrard (Vrije Universiteit Brussel), Rafael Salaberry (University of Texas at Austin), Suzanne Schlyter (Lund Universitet), Richard Towell (Salford University), and Daniel Véronique (Université d’Aix-en-Provence). We are very much indebted to that scientific committee for their insightful input at each step of the project. We are also thankful for the financial support of the Association for French Language Studies through its workshop grant, and to the Aston Modern Languages Research Foundation for funding the proofreading of the manuscript.