Biblioteca Digital

16 resultados para English for Speakers of Other Languages (ESOL)

em Universidad Politécnica de Madrid

Challenges for the Multilingual Web of Data

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The Web has witnessed an enormous growth in the amount of semantic information published in recent years. This growth has been stimulated to a large extent by the emergence of Linked Data. Although this brings us a big step closer to the vision of a Semantic Web, it also raises new issues such as the need for dealing with information expressed in different natural languages. Indeed, although the Web of Data can contain any kind of information in any language, it still lacks explicit mechanisms to automatically reconcile such information when it is expressed in different languages. This leads to situations in which data expressed in a certain language is not easily accessible to speakers of other languages. The Web of Data shows the potential for being extended to a truly multilingual web as vocabularies and data can be published in a language-independent fashion, while associated language-dependent (linguistic) information supporting the access across languages can be stored separately. In this sense, the multilingual Web of Data can be realized in our view as a layer of services and resources on top of the existing Linked Data infrastructure adding i) linguistic information for data and vocabularies in different languages, ii) mappings between data with labels in different languages, and iii) services to dynamically access and traverse Linked Data across different languages. In this article we present this vision of a multilingual Web of Data. We discuss challenges that need to be addressed to make this vision come true and discuss the role that techniques such as ontology localization, ontology mapping, and cross-lingual ontology-based information access and presentation will play in achieving this. Further, we propose an initial architecture and describe a roadmap that can provide a basis for the implementation of this vision.

Applications of Formal Languages to Management of Manned and Unmanned Aircraft

Relevância:

100.00% 100.00%

Publicador:

Resumo:

En el futuro, la gestión del tráfico aéreo (ATM, del inglés air traffic management) requerirá un cambio de paradigma, de la gestión principalmente táctica de hoy, a las denominadas operaciones basadas en trayectoria. Un incremento en el nivel de automatización liberará al personal de ATM —controladores, tripulación, etc.— de muchas de las tareas que realizan hoy. Las personas seguirán siendo el elemento central en la gestión del tráfico aéreo del futuro, pero lo serán mediante la gestión y toma de decisiones. Se espera que estas dos mejoras traigan un incremento en la eficiencia de la gestión del tráfico aéreo que permita hacer frente al incremento previsto en la demanda de transporte aéreo. Para aplicar el concepto de operaciones basadas en trayectoria, el usuario del espacio aéreo (la aerolínea, piloto, u operador) y el proveedor del servicio de navegación aérea deben negociar las trayectorias mediante un proceso de toma de decisiones colaborativo. En esta negociación, es necesaria una forma adecuada de compartir dichas trayectorias. Compartir la trayectoria completa requeriría un gran ancho de banda, y la trayectoria compartida podría invalidarse si cambiase la predicción meteorológica. En su lugar, podría compartirse una descripción de la trayectoria independiente de las condiciones meteorológicas, de manera que la trayectoria real se pudiese calcular a partir de dicha descripción. Esta descripción de la trayectoria debería ser fácil de procesar usando un programa de ordenador —ya que parte del proceso de toma de decisiones estará automatizado—, pero también fácil de entender para un operador humano —que será el que supervise el proceso y tome las decisiones oportunas—. Esta tesis presenta una serie de lenguajes formales que pueden usarse para este propósito. Estos lenguajes proporcionan los medios para describir trayectorias de aviones durante todas las fases de vuelo, desde la maniobra de push-back (remolcado hasta la calle de rodaje), hasta la llegada a la terminal del aeropuerto de destino. También permiten describir trayectorias tanto de aeronaves tripuladas como no tripuladas, incluyendo aviones de ala fija y cuadricópteros. Algunos de estos lenguajes están estrechamente relacionados entre sí, y organizados en una jerarquía. Uno de los lenguajes fundamentales de esta jerarquía, llamado aircraft intent description language (AIDL), ya había sido desarrollado con anterioridad a esta tesis. Este lenguaje fue derivado de las ecuaciones del movimiento de los aviones de ala fija, y puede utilizarse para describir sin ambigüedad trayectorias de este tipo de aeronaves. Una variante de este lenguaje, denominada quadrotor AIDL (QR-AIDL), ha sido desarrollada en esta tesis para permitir describir trayectorias de cuadricópteros con el mismo nivel de detalle. Seguidamente, otro lenguaje, denominado intent composite description language (ICDL), se apoya en los dos lenguajes anteriores, ofreciendo más flexibilidad para describir algunas partes de la trayectoria y dejar otras sin especificar. El ICDL se usa para proporcionar descripciones genéricas de maniobras comunes, que después se particularizan y combinan para formar descripciones complejas de un vuelo. Otro lenguaje puede construirse a partir del ICDL, denominado flight intent description language (FIDL). El FIDL especifica requisitos de alto nivel sobre las trayectorias —incluyendo restricciones y objetivos—, pero puede utilizar características del ICDL para proporcionar niveles de detalle arbitrarios en las distintas partes de un vuelo. Tanto el ICDL como el FIDL han sido desarrollados en colaboración con Boeing Research & Technology Europe (BR&TE). También se ha desarrollado un lenguaje para definir misiones en las que interactúan varias aeronaves, el mission intent description language (MIDL). Este lenguaje se basa en el FIDL y mantiene todo su poder expresivo, a la vez que proporciona nuevas semánticas para describir tareas, restricciones y objetivos relacionados con la misión. En ATM, los movimientos de un avión en la superficie de aeropuerto también tienen que ser monitorizados y gestionados. Otro lenguaje formal ha sido diseñado con este propósito, llamado surface movement description language (SMDL). Este lenguaje no pertenece a la jerarquía de lenguajes descrita en el párrafo anterior, y se basa en las clearances (autorizaciones del controlador) utilizadas durante las operaciones en superficie de aeropuerto. También proporciona medios para expresar incertidumbre y posibilidad de cambios en las distintas partes de la trayectoria. Finalmente, esta tesis explora las aplicaciones de estos lenguajes a la predicción de trayectorias y a la planificación de misiones. El concepto de trajectory language processing engine (TLPE) se usa en ambas aplicaciones. Un TLPE es una función de ATM cuya principal entrada y salida se expresan en cualquiera de los lenguajes incluidos en la jerarquía descrita en esta tesis. El proceso de predicción de trayectorias puede definirse como una combinación de TLPEs, cada uno de los cuales realiza una pequeña sub-tarea. Se le ha dado especial importancia a uno de estos TLPEs, que se encarga de generar el perfil horizontal, vertical y de configuración de la trayectoria. En particular, esta tesis presenta un método novedoso para la generación del perfil vertical. El proceso de planificar una misión también se puede ver como un TLPE donde la entrada se expresa en MIDL y la salida consiste en cierto número de trayectorias —una por cada aeronave disponible— descritas utilizando FIDL. Se ha formulado este problema utilizando programación entera mixta. Además, dado que encontrar caminos óptimos entre distintos puntos es un problema fundamental en la planificación de misiones, también se propone un algoritmo de búsqueda de caminos. Este algoritmo permite calcular rápidamente caminos cuasi-óptimos que esquivan todos los obstáculos en un entorno urbano. Los diferentes lenguajes formales definidos en esta tesis pueden utilizarse como una especificación estándar para la difusión de información entre distintos actores de la gestión del tráfico aéreo. En conjunto, estos lenguajes permiten describir trayectorias con el nivel de detalle necesario en cada aplicación, y se pueden utilizar para aumentar el nivel de automatización explotando esta información utilizando sistemas de soporte a la toma de decisiones. La aplicación de estos lenguajes a algunas funciones básicas de estos sistemas, como la predicción de trayectorias, han sido analizadas. ABSTRACT Future air traffic management (ATM) will require a paradigm shift from today’s mainly tactical ATM to trajectory-based operations (TBOs). An increase in the level of automation will also relieve humans —air traffic control officers (ATCOs), flight crew, etc.— from many of the tasks they perform today. Humans will still be central in this future ATM, as decision-makers and managers. These two improvements (TBOs and increased automation) are expected to provide the increase in ATM performance that will allow coping with the expected increase in air transport demand. Under TBOs, trajectories are negotiated between the airspace user (an airline, pilot, or operator) and the air navigation service provider (ANSP) using a collaborative decision making (CDM) process. A suitable method for sharing aircraft trajectories is necessary for this negotiation. Sharing a whole trajectory would require a high amount of bandwidth, and the shared trajectory might become invalid if the weather forecast changed. Instead, a description of the trajectory, decoupled from the weather conditions, could be shared, so that the actual trajectory could be computed from this trajectory description. This trajectory description should be easy to process using a computing program —as some of the CDM processes will be automated— but also easy to understand for a human operator —who will be supervising the process and making decisions. This thesis presents a series of formal languages that can be used for this purpose. These languages provide the means to describe aircraft trajectories during all phases of flight, from push back to arrival at the gate. They can also describe trajectories of both manned and unmanned aircraft, including fixedwing and some rotary-wing aircraft (quadrotors). Some of these languages are tightly interrelated and organized in a language hierarchy. One of the key languages in this hierarchy, the aircraft intent description language (AIDL), had already been developed prior to this thesis. This language was derived from the equations of motion of fixed-wing aircraft, and can provide an unambiguous description of fixed-wing aircraft trajectories. A variant of this language, the quadrotor AIDL (QR-AIDL), is developed in this thesis to allow describing a quadrotor aircraft trajectory with the same level of detail. Then, the intent composite description language (ICDL) is built on top of these two languages, providing more flexibility to describe some parts of the trajectory while leaving others unspecified. The ICDL is used to provide generic descriptions of common aircraft manoeuvres, which can be particularized and combined to form complex descriptions of flight. Another language is built on top of the ICDL, the flight intent description language (FIDL). The FIDL specifies high-level requirements on trajectories —including constraints and objectives—, but can use features of the ICDL to provide arbitrary levels of detail in different parts of the flight. The ICDL and FIDL have been developed in collaboration with Boeing Research & Technology Europe (BR&TE). Also, the mission intent description language (MIDL) has been developed to allow describing missions involving multiple aircraft. This language is based on the FIDL and keeps all its expressive power, while it also provides new semantics for describing mission tasks, mission objectives, and constraints involving several aircraft. In ATM, the movement of aircraft while on the airport surface also has to be monitored and managed. Another formal language has been designed for this purpose, denoted surface movement description language (SMDL). This language does not belong to the language hierarchy described above, and it is based on the clearances used in airport surface operations. Means to express uncertainty and mutability of different parts of the trajectory are also provided. Finally, the applications of these languages to trajectory prediction and mission planning are explored in this thesis. The concept of trajectory language processing engine (TLPE) is used in these two applications. A TLPE is an ATM function whose main input and output are expressed in any of the languages in the hierarchy described in this thesis. A modular trajectory predictor is defined as a combination of multiple TLPEs, each of them performing a small subtask. Special attention is given to the TLPE that builds the horizontal, vertical, and configuration profiles of the trajectory. In particular, a novel method for the generation of the vertical profile is presented. The process of planning a mission can also be seen as a TLPE, where the main input is expressed in the MIDL and the output consists of a number of trajectory descriptions —one for each aircraft available in the mission— expressed in the FIDL. A mixed integer linear programming (MILP) formulation for the problem of assigning mission tasks to the available aircraft is provided. In addition, since finding optimal paths between locations is a key problem to mission planning, a novel path finding algorithm is presented. This algorithm can compute near-shortest paths avoiding all obstacles in an urban environment in very short times. The several formal languages described in this thesis can serve as a standard specification to share trajectory information among different actors in ATM. In combination, these languages can describe trajectories with the necessary level of detail for any application, and can be used to increase automation by exploiting this information using decision support tools (DSTs). Their applications to some basic functions of DSTs, such as trajectory prediction, have been analized.

A new way of teaching different subjects in a foreign language in the Building Engineering Degree at the Universidad Politécnica.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The European Union has been promoting linguistic diversity for many years as one of its main educational goals. This is an element that facilitates student mobility and student exchanges between different universities and countries and enriches the education of young undergraduates. In particular, a higher degree of competence in the English language is becoming essential for engineers, architects and researchers in general, as English has become the lingua franca that opens up horizons to internationalisation and the transfer of knowledge in today’s world. Many experts point to the Integrated Approach to Contents and Foreign Languages System as being an option that has certain benefits over the traditional method of teaching a second language that is exclusively based on specific subjects. This system advocates teaching the different subjects in the syllabus in a language other than one’s mother tongue, without prioritising knowledge of the language over the subject. This was the idea that in the 2009/10 academic year gave rise to the Second Language Integration Programme (SLI Programme) at the Escuela Arquitectura Técnica in the Universidad Politécnica Madrid (EUATM-UPM), just at the beginning of the tuition of the new Building Engineering Degree, which had been adapted to the European Higher Education Area (EHEA) model. This programme is an interdisciplinary initiative for the set of subjects taught during the semester and is coordinated through the Assistant Director Office for Educational Innovation. The SLI Programme has a dual goal; to familiarise students with the specific English terminology of the subject being taught, and at the same time improve their communication skills in English. A total of thirty lecturers are taking part in the teaching of eleven first year subjects and twelve in the second year, with around 120 students who have voluntarily enrolled in a special group in each semester. During the 2010/2011 academic year the degree of acceptance and the results of the SLI Programme have been monitored. Tools have been designed to aid interdisciplinary coordination and to analyse satisfaction, such as coordination records and surveys. The results currently available refer to the first and second year and are divided into specific aspects of the different subjects involved and into general aspects of the ongoing experience.

Application of Ultraintense Lasers to validate materials for laser fusion: production of ions and other relevant species

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Justification of the need and demand of experimental facilities to test and validate materials for first wall in laser fusion reactors - Characteristics of the laser fusion products - Current ?possible? facilities for tests Ultraintense Lasers as ?complete? solution facility - Generation of ion pulses - Generation of X-ray pulses - Generation of other relevant particles (electrons, neutrons..)

OntoTag - A Linguistic and Ontological Annotation Model Suitable for the Semantic Web

Relevância:

100.00% 100.00%

Publicador:

Resumo:

OntoTag - A Linguistic and Ontological Annotation Model Suitable for the Semantic Web 1. INTRODUCTION. LINGUISTIC TOOLS AND ANNOTATIONS: THEIR LIGHTS AND SHADOWS Computational Linguistics is already a consolidated research area. It builds upon the results of other two major ones, namely Linguistics and Computer Science and Engineering, and it aims at developing computational models of human language (or natural language, as it is termed in this area). Possibly, its most well-known applications are the different tools developed so far for processing human language, such as machine translation systems and speech recognizers or dictation programs. These tools for processing human language are commonly referred to as linguistic tools. Apart from the examples mentioned above, there are also other types of linguistic tools that perhaps are not so well-known, but on which most of the other applications of Computational Linguistics are built. These other types of linguistic tools comprise POS taggers, natural language parsers and semantic taggers, amongst others. All of them can be termed linguistic annotation tools. Linguistic annotation tools are important assets. In fact, POS and semantic taggers (and, to a lesser extent, also natural language parsers) have become critical resources for the computer applications that process natural language. Hence, any computer application that has to analyse a text automatically and ‘intelligently’ will include at least a module for POS tagging. The more an application needs to ‘understand’ the meaning of the text it processes, the more linguistic tools and/or modules it will incorporate and integrate. However, linguistic annotation tools have still some limitations, which can be summarised as follows: 1. Normally, they perform annotations only at a certain linguistic level (that is, Morphology, Syntax, Semantics, etc.). 2. They usually introduce a certain rate of errors and ambiguities when tagging. This error rate ranges from 10 percent up to 50 percent of the units annotated for unrestricted, general texts. 3. Their annotations are most frequently formulated in terms of an annotation schema designed and implemented ad hoc. A priori, it seems that the interoperation and the integration of several linguistic tools into an appropriate software architecture could most likely solve the limitations stated in (1). Besides, integrating several linguistic annotation tools and making them interoperate could also minimise the limitation stated in (2). Nevertheless, in the latter case, all these tools should produce annotations for a common level, which would have to be combined in order to correct their corresponding errors and inaccuracies. Yet, the limitation stated in (3) prevents both types of integration and interoperation from being easily achieved. In addition, most high-level annotation tools rely on other lower-level annotation tools and their outputs to generate their own ones. For example, sense-tagging tools (operating at the semantic level) often use POS taggers (operating at a lower level, i.e., the morphosyntactic) to identify the grammatical category of the word or lexical unit they are annotating. Accordingly, if a faulty or inaccurate low-level annotation tool is to be used by other higher-level one in its process, the errors and inaccuracies of the former should be minimised in advance. Otherwise, these errors and inaccuracies would be transferred to (and even magnified in) the annotations of the high-level annotation tool. Therefore, it would be quite useful to find a way to (i) correct or, at least, reduce the errors and the inaccuracies of lower-level linguistic tools; (ii) unify the annotation schemas of different linguistic annotation tools or, more generally speaking, make these tools (as well as their annotations) interoperate. Clearly, solving (i) and (ii) should ease the automatic annotation of web pages by means of linguistic tools, and their transformation into Semantic Web pages (Berners-Lee, Hendler and Lassila, 2001). Yet, as stated above, (ii) is a type of interoperability problem. There again, ontologies (Gruber, 1993; Borst, 1997) have been successfully applied thus far to solve several interoperability problems. Hence, ontologies should help solve also the problems and limitations of linguistic annotation tools aforementioned. Thus, to summarise, the main aim of the present work was to combine somehow these separated approaches, mechanisms and tools for annotation from Linguistics and Ontological Engineering (and the Semantic Web) in a sort of hybrid (linguistic and ontological) annotation model, suitable for both areas. This hybrid (semantic) annotation model should (a) benefit from the advances, models, techniques, mechanisms and tools of these two areas; (b) minimise (and even solve, when possible) some of the problems found in each of them; and (c) be suitable for the Semantic Web. The concrete goals that helped attain this aim are presented in the following section. 2. GOALS OF THE PRESENT WORK As mentioned above, the main goal of this work was to specify a hybrid (that is, linguistically-motivated and ontology-based) model of annotation suitable for the Semantic Web (i.e. it had to produce a semantic annotation of web page contents). This entailed that the tags included in the annotations of the model had to (1) represent linguistic concepts (or linguistic categories, as they are termed in ISO/DCR (2008)), in order for this model to be linguistically-motivated; (2) be ontological terms (i.e., use an ontological vocabulary), in order for the model to be ontology-based; and (3) be structured (linked) as a collection of ontology-based triples, as in the usual Semantic Web languages (namely RDF(S) and OWL), in order for the model to be considered suitable for the Semantic Web. Besides, to be useful for the Semantic Web, this model should provide a way to automate the annotation of web pages. As for the present work, this requirement involved reusing the linguistic annotation tools purchased by the OEG research group (http://www.oeg-upm.net), but solving beforehand (or, at least, minimising) some of their limitations. Therefore, this model had to minimise these limitations by means of the integration of several linguistic annotation tools into a common architecture. Since this integration required the interoperation of tools and their annotations, ontologies were proposed as the main technological component to make them effectively interoperate. From the very beginning, it seemed that the formalisation of the elements and the knowledge underlying linguistic annotations within an appropriate set of ontologies would be a great step forward towards the formulation of such a model (henceforth referred to as OntoTag). Obviously, first, to combine the results of the linguistic annotation tools that operated at the same level, their annotation schemas had to be unified (or, preferably, standardised) in advance. This entailed the unification (id. standardisation) of their tags (both their representation and their meaning), and their format or syntax. Second, to merge the results of the linguistic annotation tools operating at different levels, their respective annotation schemas had to be (a) made interoperable and (b) integrated. And third, in order for the resulting annotations to suit the Semantic Web, they had to be specified by means of an ontology-based vocabulary, and structured by means of ontology-based triples, as hinted above. Therefore, a new annotation scheme had to be devised, based both on ontologies and on this type of triples, which allowed for the combination and the integration of the annotations of any set of linguistic annotation tools. This annotation scheme was considered a fundamental part of the model proposed here, and its development was, accordingly, another major objective of the present work. All these goals, aims and objectives could be re-stated more clearly as follows: Goal 1: Development of a set of ontologies for the formalisation of the linguistic knowledge relating linguistic annotation. Sub-goal 1.1: Ontological formalisation of the EAGLES (1996a; 1996b) de facto standards for morphosyntactic and syntactic annotation, in a way that helps respect the triple structure recommended for annotations in these works (which is isomorphic to the triple structures used in the context of the Semantic Web). Sub-goal 1.2: Incorporation into this preliminary ontological formalisation of other existing standards and standard proposals relating the levels mentioned above, such as those currently under development within ISO/TC 37 (the ISO Technical Committee dealing with Terminology, which deals also with linguistic resources and annotations). Sub-goal 1.3: Generalisation and extension of the recommendations in EAGLES (1996a; 1996b) and ISO/TC 37 to the semantic level, for which no ISO/TC 37 standards have been developed yet. Sub-goal 1.4: Ontological formalisation of the generalisations and/or extensions obtained in the previous sub-goal as generalisations and/or extensions of the corresponding ontology (or ontologies). Sub-goal 1.5: Ontological formalisation of the knowledge required to link, combine and unite the knowledge represented in the previously developed ontology (or ontologies). Goal 2: Development of OntoTag’s annotation scheme, a standard-based abstract scheme for the hybrid (linguistically-motivated and ontological-based) annotation of texts. Sub-goal 2.1: Development of the standard-based morphosyntactic annotation level of OntoTag’s scheme. This level should include, and possibly extend, the recommendations of EAGLES (1996a) and also the recommendations included in the ISO/MAF (2008) standard draft. Sub-goal 2.2: Development of the standard-based syntactic annotation level of the hybrid abstract scheme. This level should include, and possibly extend, the recommendations of EAGLES (1996b) and the ISO/SynAF (2010) standard draft. Sub-goal 2.3: Development of the standard-based semantic annotation level of OntoTag’s (abstract) scheme. Sub-goal 2.4: Development of the mechanisms for a convenient integration of the three annotation levels already mentioned. These mechanisms should take into account the recommendations included in the ISO/LAF (2009) standard draft. Goal 3: Design of OntoTag’s (abstract) annotation architecture, an abstract architecture for the hybrid (semantic) annotation of texts (i) that facilitates the integration and interoperation of different linguistic annotation tools, and (ii) whose results comply with OntoTag’s annotation scheme. Sub-goal 3.1: Specification of the decanting processes that allow for the classification and separation, according to their corresponding levels, of the results of the linguistic tools annotating at several different levels. Sub-goal 3.2: Specification of the standardisation processes that allow (a) complying with the standardisation requirements of OntoTag’s annotation scheme, as well as (b) combining the results of those linguistic tools that share some level of annotation. Sub-goal 3.3: Specification of the merging processes that allow for the combination of the output annotations and the interoperation of those linguistic tools that share some level of annotation. Sub-goal 3.4: Specification of the merge processes that allow for the integration of the results and the interoperation of those tools performing their annotations at different levels. Goal 4: Generation of OntoTagger’s schema, a concrete instance of OntoTag’s abstract scheme for a concrete set of linguistic annotations. These linguistic annotations result from the tools and the resources available in the research group, namely • Bitext’s DataLexica (http://www.bitext.com/EN/datalexica.asp), • LACELL’s (POS) tagger (http://www.um.es/grupos/grupo-lacell/quees.php), • Connexor’s FDG (http://www.connexor.eu/technology/machinese/glossary/fdg/), and • EuroWordNet (Vossen et al., 1998). This schema should help evaluate OntoTag’s underlying hypotheses, stated below. Consequently, it should implement, at least, those levels of the abstract scheme dealing with the annotations of the set of tools considered in this implementation. This includes the morphosyntactic, the syntactic and the semantic levels. Goal 5: Implementation of OntoTagger’s configuration, a concrete instance of OntoTag’s abstract architecture for this set of linguistic tools and annotations. This configuration (1) had to use the schema generated in the previous goal; and (2) should help support or refute the hypotheses of this work as well (see the next section). Sub-goal 5.1: Implementation of the decanting processes that facilitate the classification and separation of the results of those linguistic resources that provide annotations at several different levels (on the one hand, LACELL’s tagger operates at the morphosyntactic level and, minimally, also at the semantic level; on the other hand, FDG operates at the morphosyntactic and the syntactic levels and, minimally, at the semantic level as well). Sub-goal 5.2: Implementation of the standardisation processes that allow (i) specifying the results of those linguistic tools that share some level of annotation according to the requirements of OntoTagger’s schema, as well as (ii) combining these shared level results. In particular, all the tools selected perform morphosyntactic annotations and they had to be conveniently combined by means of these processes. Sub-goal 5.3: Implementation of the merging processes that allow for the combination (and possibly the improvement) of the annotations and the interoperation of the tools that share some level of annotation (in particular, those relating the morphosyntactic level, as in the previous sub-goal). Sub-goal 5.4: Implementation of the merging processes that allow for the integration of the different standardised and combined annotations aforementioned, relating all the levels considered. Sub-goal 5.5: Improvement of the semantic level of this configuration by adding a named entity recognition, (sub-)classification and annotation subsystem, which also uses the named entities annotated to populate a domain ontology, in order to provide a concrete application of the present work in the two areas involved (the Semantic Web and Corpus Linguistics). 3. MAIN RESULTS: ASSESSMENT OF ONTOTAG’S UNDERLYING HYPOTHESES The model developed in the present thesis tries to shed some light on (i) whether linguistic annotation tools can effectively interoperate; (ii) whether their results can be combined and integrated; and, if they can, (iii) how they can, respectively, interoperate and be combined and integrated. Accordingly, several hypotheses had to be supported (or rejected) by the development of the OntoTag model and OntoTagger (its implementation). The hypotheses underlying OntoTag are surveyed below. Only one of the hypotheses (H.6) was rejected; the other five could be confirmed. H.1 The annotations of different levels (or layers) can be integrated into a sort of overall, comprehensive, multilayer and multilevel annotation, so that their elements can complement and refer to each other. • CONFIRMED by the development of: o OntoTag’s annotation scheme, o OntoTag’s annotation architecture, o OntoTagger’s (XML, RDF, OWL) annotation schemas, o OntoTagger’s configuration. H.2 Tool-dependent annotations can be mapped onto a sort of tool-independent annotations and, thus, can be standardised. • CONFIRMED by means of the standardisation phase incorporated into OntoTag and OntoTagger for the annotations yielded by the tools. H.3 Standardisation should ease: H.3.1: The interoperation of linguistic tools. H.3.2: The comparison, combination (at the same level and layer) and integration (at different levels or layers) of annotations. • H.3 was CONFIRMED by means of the development of OntoTagger’s ontology-based configuration: o Interoperation, comparison, combination and integration of the annotations of three different linguistic tools (Connexor’s FDG, Bitext’s DataLexica and LACELL’s tagger); o Integration of EuroWordNet-based, domain-ontology-based and named entity annotations at the semantic level. o Integration of morphosyntactic, syntactic and semantic annotations. H.4 Ontologies and Semantic Web technologies (can) play a crucial role in the standardisation of linguistic annotations, by providing consensual vocabularies and standardised formats for annotation (e.g., RDF triples). • CONFIRMED by means of the development of OntoTagger’s RDF-triple-based annotation schemas. H.5 The rate of errors introduced by a linguistic tool at a given level, when annotating, can be reduced automatically by contrasting and combining its results with the ones coming from other tools, operating at the same level. However, these other tools might be built following a different technological (stochastic vs. rule-based, for example) or theoretical (dependency vs. HPS-grammar-based, for instance) approach. • CONFIRMED by the results yielded by the evaluation of OntoTagger. H.6 Each linguistic level can be managed and annotated independently. • REJECTED: OntoTagger’s experiments and the dependencies observed among the morphosyntactic annotations, and between them and the syntactic annotations. In fact, Hypothesis H.6 was already rejected when OntoTag’s ontologies were developed. We observed then that several linguistic units stand on an interface between levels, belonging thereby to both of them (such as morphosyntactic units, which belong to both the morphological level and the syntactic level). Therefore, the annotations of these levels overlap and cannot be handled independently when merged into a unique multileveled annotation. 4. OTHER MAIN RESULTS AND CONTRIBUTIONS First, interoperability is a hot topic for both the linguistic annotation community and the whole Computer Science field. The specification (and implementation) of OntoTag’s architecture for the combination and integration of linguistic (annotation) tools and annotations by means of ontologies shows a way to make these different linguistic annotation tools and annotations interoperate in practice. Second, as mentioned above, the elements involved in linguistic annotation were formalised in a set (or network) of ontologies (OntoTag’s linguistic ontologies). • On the one hand, OntoTag’s network of ontologies consists of − The Linguistic Unit Ontology (LUO), which includes a mostly hierarchical formalisation of the different types of linguistic elements (i.e., units) identifiable in a written text; − The Linguistic Attribute Ontology (LAO), which includes also a mostly hierarchical formalisation of the different types of features that characterise the linguistic units included in the LUO; − The Linguistic Value Ontology (LVO), which includes the corresponding formalisation of the different values that the attributes in the LAO can take; − The OIO (OntoTag’s Integration Ontology), which  Includes the knowledge required to link, combine and unite the knowledge represented in the LUO, the LAO and the LVO;  Can be viewed as a knowledge representation ontology that describes the most elementary vocabulary used in the area of annotation. • On the other hand, OntoTag’s ontologies incorporate the knowledge included in the different standards and recommendations for linguistic annotation released so far, such as those developed within the EAGLES and the SIMPLE European projects or by the ISO/TC 37 committee: − As far as morphosyntactic annotations are concerned, OntoTag’s ontologies formalise the terms in the EAGLES (1996a) recommendations and their corresponding terms within the ISO Morphosyntactic Annotation Framework (ISO/MAF, 2008) standard; − As for syntactic annotations, OntoTag’s ontologies incorporate the terms in the EAGLES (1996b) recommendations and their corresponding terms within the ISO Syntactic Annotation Framework (ISO/SynAF, 2010) standard draft; − Regarding semantic annotations, OntoTag’s ontologies generalise and extend the recommendations in EAGLES (1996a; 1996b) and, since no stable standards or standard drafts have been released for semantic annotation by ISO/TC 37 yet, they incorporate the terms in SIMPLE (2000) instead; − The terms coming from all these recommendations and standards were supplemented by those within the ISO Data Category Registry (ISO/DCR, 2008) and also of the ISO Linguistic Annotation Framework (ISO/LAF, 2009) standard draft when developing OntoTag’s ontologies. Third, we showed that the combination of the results of tools annotating at the same level can yield better results (both in precision and in recall) than each tool separately. In particular, 1. OntoTagger clearly outperformed two of the tools integrated into its configuration, namely DataLexica and FDG in all the combination sub-phases in which they overlapped (i.e. POS tagging, lemma annotation and morphological feature annotation). As far as the remaining tool is concerned, i.e. LACELL’s tagger, it was also outperformed by OntoTagger in POS tagging and lemma annotation, and it did not behave better than OntoTagger in the morphological feature annotation layer. 2. As an immediate result, this implies that a) This type of combination architecture configurations can be applied in order to improve significantly the accuracy of linguistic annotations; and b) Concerning the morphosyntactic level, this could be regarded as a way of constructing more robust and more accurate POS tagging systems. Fourth, Semantic Web annotations are usually performed by humans or else by machine learning systems. Both of them leave much to be desired: the former, with respect to their annotation rate; the latter, with respect to their (average) precision and recall. In this work, we showed how linguistic tools can be wrapped in order to annotate automatically Semantic Web pages using ontologies. This entails their fast, robust and accurate semantic annotation. As a way of example, as mentioned in Sub-goal 5.5, we developed a particular OntoTagger module for the recognition, classification and labelling of named entities, according to the MUC and ACE tagsets (Chinchor, 1997; Doddington et al., 2004). These tagsets were further specified by means of a domain ontology, namely the Cinema Named Entities Ontology (CNEO). This module was applied to the automatic annotation of ten different web pages containing cinema reviews (that is, around 5000 words). In addition, the named entities annotated with this module were also labelled as instances (or individuals) of the classes included in the CNEO and, then, were used to populate this domain ontology. • The statistical results obtained from the evaluation of this particular module of OntoTagger can be summarised as follows. On the one hand, as far as recall (R) is concerned, (R.1) the lowest value was 76,40% (for file 7); (R.2) the highest value was 97, 50% (for file 3); and (R.3) the average value was 88,73%. On the other hand, as far as the precision rate (P) is concerned, (P.1) its minimum was 93,75% (for file 4); (R.2) its maximum was 100% (for files 1, 5, 7, 8, 9, and 10); and (R.3) its average value was 98,99%. • These results, which apply to the tasks of named entity annotation and ontology population, are extraordinary good for both of them. They can be explained on the basis of the high accuracy of the annotations provided by OntoTagger at the lower levels (mainly at the morphosyntactic level). However, they should be conveniently qualified, since they might be too domain- and/or language-dependent. It should be further experimented how our approach works in a different domain or a different language, such as French, English, or German. • In any case, the results of this application of Human Language Technologies to Ontology Population (and, accordingly, to Ontological Engineering) seem very promising and encouraging in order for these two areas to collaborate and complement each other in the area of semantic annotation. Fifth, as shown in the State of the Art of this work, there are different approaches and models for the semantic annotation of texts, but all of them focus on a particular view of the semantic level. Clearly, all these approaches and models should be integrated in order to bear a coherent and joint semantic annotation level. OntoTag shows how (i) these semantic annotation layers could be integrated together; and (ii) they could be integrated with the annotations associated to other annotation levels. Sixth, we identified some recommendations, best practices and lessons learned for annotation standardisation, interoperation and merge. They show how standardisation (via ontologies, in this case) enables the combination, integration and interoperation of different linguistic tools and their annotations into a multilayered (or multileveled) linguistic annotation, which is one of the hot topics in the area of Linguistic Annotation. And last but not least, OntoTag’s annotation scheme and OntoTagger’s annotation schemas show a way to formalise and annotate coherently and uniformly the different units and features associated to the different levels and layers of linguistic annotation. This is a great scientific step ahead towards the global standardisation of this area, which is the aim of ISO/TC 37 (in particular, Subcommittee 4, dealing with the standardisation of linguistic annotations and resources).

An overview of the ciao multiparadigm language and program development environment and its design philosophy

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We describe some of the novel aspects and motivations behind the design and implementation of the Ciao multiparadigm programming system. An important aspect of Ciao is that it provides the programmer with a large number of useful features from different programming paradigms and styles, and that the use of each of these features can be turned on and off at will for each program module. Thus, a given module may be using e.g. higher order functions and constraints, while another module may be using objects, predicates, and concurrency. Furthermore, the language is designed to be extensible in a simple and modular way. Another important aspect of Ciao is its programming environment, which provides a powerful preprocessor (with an associated assertion language) capable of statically finding non-trivial bugs, verifying that programs comply with specifications, and performing many types of program optimizations. Such optimizations produce code that is highly competitive with other dynamic languages or, when the highest levéis of optimization are used, even that of static languages, all while retaining the interactive development environment of a dynamic language. The environment also includes a powerful auto-documenter. The paper provides an informal overview of the language and program development environment. It aims at illustrating the design philosophy rather than at being exhaustive, which would be impossible in the format of a paper, pointing instead to the existing literature on the system.

Some design issues in the visualization of constraint logic program execution

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Visualization of program executions has been found useful in applications which include education and debugging. However, traditional visualization techniques often fall short of expectations or are altogether inadequate for new programming paradigms, such as Constraint Logic Programming (CLP), whose declarative and operational semantics differ in some crucial ways from those of other paradigms. In particular, traditional ideas regarding flow control and the behavior of data often cannot be lifted in a straightforward way to (C)LP from other families of programming languages. In this paper we discuss techniques for visualizing program execution and data evolution in CLP. We briefly review some previously proposed visualization paradigms, and also propose a number of (to our knowledge) novel ones. The graphical representations have been chosen based on the perceived needs of a programmer trying to analyze the behavior and characteristics of an execution. In particular, we concéntrate on the representation of the program execution behavior (control), the runtime valúes of the variables, and the runtime constraints. Given our interest in visualizing large executions, we also pay attention to abstraction techniques, Le., techniques which are intended to help in reducing the complexity of the visual information.

An overview of ciao and its design philosophy

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We provide an overall description of the Ciao multiparadigm programming system emphasizing some of the novel aspects and motivations behind its design and implementation. An important aspect of Ciao is that, in addition to supporting logic programming (and, in particular, Prolog), it provides the programmer with a large number of useful features from different programming paradigms and styles and that the use of each of these features (including those of Prolog) can be turned on and off at will for each program module. Thus, a given module may be using, e.g., higher order functions and constraints, while another module may be using assignment, predicates, Prolog meta-programming, and concurrency. Furthermore, the language is designed to be extensible in a simple and modular way. Another important aspect of Ciao is its programming environment, which provides a powerful preprocessor (with an associated assertion language) capable of statically finding non-trivial bugs, verifying that programs comply with specifications, and performing many types of optimizations (including automatic parallelization). Such optimizations produce code that is highly competitive with other dynamic languages or, with the (experimental) optimizing compiler, even that of static languages, all while retaining the flexibility and interactive development of a dynamic language. This compilation architecture supports modularity and separate compilation throughout. The environment also includes a powerful autodocumenter and a unit testing framework, both closely integrated with the assertion system. The paper provides an informal overview of the language and program development environment. It aims at illustrating the design philosophy rather than at being exhaustive, which would be impossible in a single journal paper, pointing instead to previous Ciao literature.

An abstract machine based execution model for computer architecture design and efficient implementation of logic programs in parallel.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The term "Logic Programming" refers to a variety of computer languages and execution models which are based on the traditional concept of Symbolic Logic. The expressive power of these languages offers promise to be of great assistance in facing the programming challenges of present and future symbolic processing applications in Artificial Intelligence, Knowledge-based systems, and many other areas of computing. The sequential execution speed of logic programs has been greatly improved since the advent of the first interpreters. However, higher inference speeds are still required in order to meet the demands of applications such as those contemplated for next generation computer systems. The execution of logic programs in parallel is currently considered a promising strategy for attaining such inference speeds. Logic Programming in turn appears as a suitable programming paradigm for parallel architectures because of the many opportunities for parallel execution present in the implementation of logic programs. This dissertation presents an efficient parallel execution model for logic programs. The model is described from the source language level down to an "Abstract Machine" level suitable for direct implementation on existing parallel systems or for the design of special purpose parallel architectures. Few assumptions are made at the source language level and therefore the techniques developed and the general Abstract Machine design are applicable to a variety of logic (and also functional) languages. These techniques offer efficient solutions to several areas of parallel Logic Programming implementation previously considered problematic or a source of considerable overhead, such as the detection and handling of variable binding conflicts in AND-Parallelism, the specification of control and management of the execution tree, the treatment of distributed backtracking, and goal scheduling and memory management issues, etc. A parallel Abstract Machine design is offered, specifying data areas, operation, and a suitable instruction set. This design is based on extending to a parallel environment the techniques introduced by the Warren Abstract Machine, which have already made very fast and space efficient sequential systems a reality. Therefore, the model herein presented is capable of retaining sequential execution speed similar to that of high performance sequential systems, while extracting additional gains in speed by efficiently implementing parallel execution. These claims are supported by simulations of the Abstract Machine on sample programs.

Independence and search space preservation in dynamically scheduled constraint logic languages

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper performs a further generalization of the notion of independence in constraint logic programs to the context of constraint logic programs with dynamic scheduling. The complexity of this new environment made necessary to first formally define the relationship between independence and search space preservation in the context of CLP languages. In particular, we show that search space preservation is, in the context of CLP languages, not only a sufficient but also a necessary condition for ensuring that both the intended solutions and the number of transitions performed do not change. These results are then extended to dynamically scheduled languages and used as the basis for the extension of the concepts of independence. We also propose several a priori sufficient conditions for independence and also give correctness and efficiency results for parallel execution of constraint logic programs based on the proposed notions of independence.

Enhancing the expressiveness of linguistic structures

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In the information society large amounts of information are being generated and transmitted constantly, especially in the most natural way for humans, i.e., natural language. Social networks, blogs, forums, and Q&A sites are a dynamic Large Knowledge Repository. So, Web 2.0 contains structured data but still the largest amount of information is expressed in natural language. Linguistic structures for text recognition enable the extraction of structured information from texts. However, the expressiveness of the current structures is limited as they have been designed with a strict order in their phrases, limiting their applicability to other languages and making them more sensible to grammatical errors. To overcome these limitations, in this paper we present a linguistic structure named ?linguistic schema?, with a richer expressiveness that introduces less implicit constraints over annotations.

Implementation of Boundary Elements in Microcomputers

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this chapter we are going to develop some aspects of the implementation of the boundary element method (BEM)in microcomputers. At the moment the BEM is established as a powerful tool for problem-solving and several codes have been developed and maintained on an industrial basis for large computers. It is also well known that one of the more attractive features of the BEM is the reduction of the discretization to the boundary of the domain under study. As drawbacks, we found the non-bandedness of the final matrix, wich is a full asymmetric one, and the computational difficulties related to obtaining the integrals which appear in the influence coefficients. Te reduction in dimensionality is crucial from the point of view of microcomputers, and we believe that it can be used to obtain competitive results against other domain methods. We shall discuss two applications in this chapter. The first one is related to plane linear elastostatic situations, and the second refers to plane potential problems. In the first case we shall present the classical isoparametric BEM approach, using linear elements to represent both the geometry and the variables. The second case shows how to implement a p-adaptive procedure using the BEM. This latter case has not been studied until recently, and we think that the future of the BEM will be related to its development and to the judicious exploitation of the graphics capabilities of modern micros. Some examples will be included to demonstrate the kind of results that can be expected and sections of printouts will show useful details of implementation. In order to broaden their applicability, these printouts have been prepared in Basic, although no doubt other languages may be more appropiate for effective implementation.

Transformation�based implementation and optimization of programs exploiting the basic Andorra model.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The characteristics of CC and CLP systems are in principle very dierent However a recent trend towards convergence in the implementation techniques for these systems can be observed While CLP and Prolog systems have been incorporating capabilities to deal with userdened suspension and coroutining CC compilers have been trying to coalesce negrained tasks into coarsergrained sequential threads This convergence of techniques opens up the possibility of having a general purpose kernel language and abstract machine to serve as a compilation target for a variety of userlevel languages We propose a transformation technique directed towards such an objective In particular we report on techniques to support the Andorra computational model essentially emulating the AndorraI system via program transformation into a sequential language with delay primitives The system is automatic comprising an optional program analyzer and a basic transformer to the kernel language It turns out that a simple parallel CLP or Prolog system with dynamic scheduling is sucient as a kernel language for this purpose The preliminary results are quite encouraging performance of the resulting system is comparable to the current AndorraI implementation.

Integrated approach to foreign language in the building engineering degree at the Universidad Politécnica Madrid

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The European Union has been promoting linguistic diversity for many years as one of its main educational goals. This is an element that facilitates student mobility and student exchanges between different universities and countries and enriches the education of young undergraduates. In particular,a higher degree of competence in the English language is becoming essential for engineers, architects and researchers in general, as English has become the lingua franca that opens up horizons to internationalisation and the transfer of knowledge in today’s world. Many experts point to the Integrated Approach to Contents and Foreign Languages System as being an option that has certain benefits over the traditional method of teaching a second language that is exclusively based on specific subjects. This system advocates teaching the different subjects in the syllabus in a language other than one’s mother tongue, without prioritising knowledge of the language over the subject. This was the idea that in the 2009/10 academic year gave rise to the Second Language Integration Programme (SLI Programme) at the Escuela Arquitectura Tecnica in the Universidad Politecnica Madrid (EUATM-UPM), just at the beginning of the tuition of the new Building Engineering Degree, which had been adapted to the European Higher Education Area (EHEA) model. This programme is an interdisciplinary initiative for the set of subjects taught during the semester and is coordinated through the Assistant Director Office for Educational Innovation. The SLI Programme has a dual goal; to familiarise students with the specific English terminology of the subject being taught, and at the same time improve their communication skills in English. A total of thirty lecturers are taking part in the teaching of eleven first year subjects and twelve in the second year, with around 120 students who have voluntarily enrolled in a special group in each semester. During the 2010/2011 academic year the degree of acceptance and the results of the SLI Programme are being monitored. Tools have been designed to aid interdisciplinary coordination and to analyse satisfaction, such as coordination records and surveys. The results currently available refer to the first semester of the year and are divided into specific aspects of the different subjects involved and into general aspects of the ongoing experience.

Development of a genre-dependent TTS system with cross-speaker speaking-style transplantation

Relevância:

100.00% 100.00%

Publicador:

Resumo:

One of the biggest challenges in speech synthesis is the production of contextually-appropriate naturally sounding synthetic voices. This means that a Text-To-Speech system must be able to analyze a text beyond the sentence limits in order to select, or even modulate, the speaking style according to a broader context. Our current architecture is based on a two-step approach: text genre identification and speaking style synthesis according to the detected discourse genre. For the final implementation, a set of four genres and their corresponding speaking styles were considered: broadcast news, live sport commentaries, interviews and political speeches. In the final TTS evaluation, the four speaking styles were transplanted to the neutral voices of other speakers not included in the training database. When the transplanted styles were compared to the neutral voices, transplantation was significantly preferred and the similarity to the target speaker was as high as 78%.

«
1
2
»