941 results for modularisation of ontologies


Relevance:

80.00%

Publisher:

Abstract:

This paper presents an overview of the Web 3.0 stage and the fundamental questions it raises. It analyses the current characteristics of bibliographic systems structured according to the entity-relationship model. The conceptual, logical and physical levels of information systems are defined; the characteristics of FRBR are then presented, and the relationships between work and document in the FRBR conceptual model are examined. FRBRoo is described as an object-oriented interpretation of the same functional requirements. Finally, future trends are outlined, such as moving from entity-relationship modelling to object modelling, making semantics explicit through consistent semantic annotation, mapping existing bibliographic databases, and developing ontologies so that document systems can be integrated into the Semantic Web.

Relevance:

80.00%

Publisher:

Abstract:

At the beginning of the 1990s, ontology development was closer to an art than to a discipline: ontology developers had no clear guidelines on how to build ontologies, only some design criteria to follow. Work on principles, methods and methodologies, together with supporting technologies and languages, turned ontology development into an engineering discipline, known as Ontology Engineering. Ontology Engineering refers to the set of activities that concern the ontology development process and the ontology life cycle, the methods and methodologies for building ontologies, and the tool suites and languages that support them. Thanks to the work done in the Ontology Engineering field, the development of ontologies within and between teams has increased and improved, as has the possibility of reusing ontologies in other developments and in final applications. Currently, ontologies are widely used in (a) Knowledge Engineering, Artificial Intelligence and Computer Science, (b) applications related to knowledge management, natural language processing, e-commerce, intelligent information integration, information retrieval, database design and integration, bio-informatics and education, and (c) the Semantic Web, the Semantic Grid, and the Linked Data initiative. In this paper, we provide an overview of Ontology Engineering, covering the most prominent and widely used methodologies, languages, and tools for building ontologies. In addition, we briefly discuss how all these elements can be used in the Linked Data initiative.

Relevance:

80.00%

Publisher:

Abstract:

OntoTag - A Linguistic and Ontological Annotation Model Suitable for the Semantic Web

1. INTRODUCTION. LINGUISTIC TOOLS AND ANNOTATIONS: THEIR LIGHTS AND SHADOWS

Computational Linguistics is already a consolidated research area. It builds upon the results of two other major areas, namely Linguistics and Computer Science and Engineering, and it aims at developing computational models of human language (or natural language, as it is termed in this area). Possibly its best-known applications are the tools developed so far for processing human language, such as machine translation systems, speech recognizers and dictation programs. These tools are commonly referred to as linguistic tools. Apart from the examples mentioned above, there are other types of linguistic tools that are perhaps less well known, but on which most other applications of Computational Linguistics are built. They comprise POS taggers, natural language parsers and semantic taggers, amongst others. All of them can be termed linguistic annotation tools.

Linguistic annotation tools are important assets. In fact, POS and semantic taggers (and, to a lesser extent, natural language parsers) have become critical resources for computer applications that process natural language. Hence, any computer application that has to analyse a text automatically and ‘intelligently’ will include at least a module for POS tagging. The more an application needs to ‘understand’ the meaning of the text it processes, the more linguistic tools and/or modules it will incorporate and integrate.

However, linguistic annotation tools still have some limitations, which can be summarised as follows:
1. Normally, they perform annotations only at a certain linguistic level (that is, Morphology, Syntax, Semantics, etc.).
2. They usually introduce a certain rate of errors and ambiguities when tagging.
This error rate ranges from 10 percent up to 50 percent of the units annotated for unrestricted, general texts.
3. Their annotations are most frequently formulated in terms of an annotation schema designed and implemented ad hoc.

A priori, it seems that the interoperation and the integration of several linguistic tools into an appropriate software architecture could most likely solve the limitation stated in (1). Besides, integrating several linguistic annotation tools and making them interoperate could also minimise the limitation stated in (2). In the latter case, however, all these tools should produce annotations for a common level, which would have to be combined in order to correct their corresponding errors and inaccuracies. Yet the limitation stated in (3) prevents both types of integration and interoperation from being easily achieved.

In addition, most high-level annotation tools rely on other lower-level annotation tools and their outputs to generate their own. For example, sense-tagging tools (operating at the semantic level) often use POS taggers (operating at a lower level, i.e., the morphosyntactic) to identify the grammatical category of the word or lexical unit they are annotating. Accordingly, if a faulty or inaccurate low-level annotation tool is to be used by a higher-level one in its process, the errors and inaccuracies of the former should be minimised in advance. Otherwise, these errors and inaccuracies would be transferred to (and even magnified in) the annotations of the high-level annotation tool.

Therefore, it would be quite useful to find a way to (i) correct or, at least, reduce the errors and the inaccuracies of lower-level linguistic tools; (ii) unify the annotation schemas of different linguistic annotation tools or, more generally speaking, make these tools (as well as their annotations) interoperate.
Clearly, solving (i) and (ii) should ease the automatic annotation of web pages by means of linguistic tools, and their transformation into Semantic Web pages (Berners-Lee, Hendler and Lassila, 2001). Yet, as stated above, (ii) is a type of interoperability problem, and ontologies (Gruber, 1993; Borst, 1997) have been successfully applied thus far to solve several interoperability problems. Hence, ontologies should also help solve the aforementioned problems and limitations of linguistic annotation tools.

Thus, to summarise, the main aim of the present work was to combine these separate approaches, mechanisms and tools for annotation from Linguistics and Ontological Engineering (and the Semantic Web) in a hybrid (linguistic and ontological) annotation model, suitable for both areas. This hybrid (semantic) annotation model should (a) benefit from the advances, models, techniques, mechanisms and tools of these two areas; (b) minimise (and even solve, when possible) some of the problems found in each of them; and (c) be suitable for the Semantic Web. The concrete goals that helped attain this aim are presented in the following section.

2. GOALS OF THE PRESENT WORK

As mentioned above, the main goal of this work was to specify a hybrid (that is, linguistically-motivated and ontology-based) model of annotation suitable for the Semantic Web (i.e. it had to produce a semantic annotation of web page contents).
This entailed that the tags included in the annotations of the model had to (1) represent linguistic concepts (or linguistic categories, as they are termed in ISO/DCR (2008)), in order for this model to be linguistically-motivated; (2) be ontological terms (i.e., use an ontological vocabulary), in order for the model to be ontology-based; and (3) be structured (linked) as a collection of ontology-based triples, as in the usual Semantic Web languages (namely RDF(S) and OWL), in order for the model to be considered suitable for the Semantic Web.

Besides, to be useful for the Semantic Web, this model should provide a way to automate the annotation of web pages. As for the present work, this requirement involved reusing the linguistic annotation tools purchased by the OEG research group (http://www.oeg-upm.net), but solving beforehand (or, at least, minimising) some of their limitations. Therefore, this model had to minimise these limitations by means of the integration of several linguistic annotation tools into a common architecture. Since this integration required the interoperation of tools and their annotations, ontologies were proposed as the main technological component to make them effectively interoperate. From the very beginning, it seemed that the formalisation of the elements and the knowledge underlying linguistic annotations within an appropriate set of ontologies would be a great step forward towards the formulation of such a model (henceforth referred to as OntoTag).

First, to combine the results of the linguistic annotation tools that operated at the same level, their annotation schemas had to be unified (or, preferably, standardised) in advance. This entailed the unification (or standardisation) of their tags (both their representation and their meaning), and of their format or syntax.
Second, to merge the results of the linguistic annotation tools operating at different levels, their respective annotation schemas had to be (a) made interoperable and (b) integrated. And third, in order for the resulting annotations to suit the Semantic Web, they had to be specified by means of an ontology-based vocabulary, and structured by means of ontology-based triples, as hinted above. Therefore, a new annotation scheme had to be devised, based both on ontologies and on this type of triples, which allowed for the combination and the integration of the annotations of any set of linguistic annotation tools. This annotation scheme was considered a fundamental part of the model proposed here, and its development was, accordingly, another major objective of the present work.

All these goals, aims and objectives could be re-stated more clearly as follows:

Goal 1: Development of a set of ontologies for the formalisation of the linguistic knowledge relating to linguistic annotation.
Sub-goal 1.1: Ontological formalisation of the EAGLES (1996a; 1996b) de facto standards for morphosyntactic and syntactic annotation, in a way that helps respect the triple structure recommended for annotations in these works (which is isomorphic to the triple structures used in the context of the Semantic Web).
Sub-goal 1.2: Incorporation into this preliminary ontological formalisation of other existing standards and standard proposals relating to the levels mentioned above, such as those currently under development within ISO/TC 37 (the ISO Technical Committee dealing with Terminology, which also deals with linguistic resources and annotations).
Sub-goal 1.3: Generalisation and extension of the recommendations in EAGLES (1996a; 1996b) and ISO/TC 37 to the semantic level, for which no ISO/TC 37 standards have been developed yet.
Sub-goal 1.4: Ontological formalisation of the generalisations and/or extensions obtained in the previous sub-goal as generalisations and/or extensions of the corresponding ontology (or ontologies).
Sub-goal 1.5: Ontological formalisation of the knowledge required to link, combine and unite the knowledge represented in the previously developed ontology (or ontologies).

Goal 2: Development of OntoTag’s annotation scheme, a standard-based abstract scheme for the hybrid (linguistically-motivated and ontology-based) annotation of texts.
Sub-goal 2.1: Development of the standard-based morphosyntactic annotation level of OntoTag’s scheme. This level should include, and possibly extend, the recommendations of EAGLES (1996a) and also the recommendations included in the ISO/MAF (2008) standard draft.
Sub-goal 2.2: Development of the standard-based syntactic annotation level of the hybrid abstract scheme. This level should include, and possibly extend, the recommendations of EAGLES (1996b) and the ISO/SynAF (2010) standard draft.
Sub-goal 2.3: Development of the standard-based semantic annotation level of OntoTag’s (abstract) scheme.
Sub-goal 2.4: Development of the mechanisms for a convenient integration of the three annotation levels already mentioned. These mechanisms should take into account the recommendations included in the ISO/LAF (2009) standard draft.

Goal 3: Design of OntoTag’s (abstract) annotation architecture, an abstract architecture for the hybrid (semantic) annotation of texts (i) that facilitates the integration and interoperation of different linguistic annotation tools, and (ii) whose results comply with OntoTag’s annotation scheme.
Sub-goal 3.1: Specification of the decanting processes that allow for the classification and separation, according to their corresponding levels, of the results of the linguistic tools annotating at several different levels.
Sub-goal 3.2: Specification of the standardisation processes that allow (a) complying with the standardisation requirements of OntoTag’s annotation scheme, as well as (b) combining the results of those linguistic tools that share some level of annotation.
Sub-goal 3.3: Specification of the merging processes that allow for the combination of the output annotations and the interoperation of those linguistic tools that share some level of annotation.
Sub-goal 3.4: Specification of the merge processes that allow for the integration of the results and the interoperation of those tools performing their annotations at different levels.

Goal 4: Generation of OntoTagger’s schema, a concrete instance of OntoTag’s abstract scheme for a concrete set of linguistic annotations. These linguistic annotations result from the tools and the resources available in the research group, namely
• Bitext’s DataLexica (http://www.bitext.com/EN/datalexica.asp),
• LACELL’s (POS) tagger (http://www.um.es/grupos/grupo-lacell/quees.php),
• Connexor’s FDG (http://www.connexor.eu/technology/machinese/glossary/fdg/), and
• EuroWordNet (Vossen et al., 1998).
This schema should help evaluate OntoTag’s underlying hypotheses, stated below. Consequently, it should implement, at least, those levels of the abstract scheme dealing with the annotations of the set of tools considered in this implementation. This includes the morphosyntactic, the syntactic and the semantic levels.

Goal 5: Implementation of OntoTagger’s configuration, a concrete instance of OntoTag’s abstract architecture for this set of linguistic tools and annotations. This configuration (1) had to use the schema generated in the previous goal; and (2) should help support or refute the hypotheses of this work as well (see the next section).
Sub-goal 5.1: Implementation of the decanting processes that facilitate the classification and separation of the results of those linguistic resources that provide annotations at several different levels (on the one hand, LACELL’s tagger operates at the morphosyntactic level and, minimally, also at the semantic level; on the other hand, FDG operates at the morphosyntactic and the syntactic levels and, minimally, at the semantic level as well).
Sub-goal 5.2: Implementation of the standardisation processes that allow (i) specifying the results of those linguistic tools that share some level of annotation according to the requirements of OntoTagger’s schema, as well as (ii) combining these shared level results. In particular, all the tools selected perform morphosyntactic annotations and they had to be conveniently combined by means of these processes.
Sub-goal 5.3: Implementation of the merging processes that allow for the combination (and possibly the improvement) of the annotations and the interoperation of the tools that share some level of annotation (in particular, those relating to the morphosyntactic level, as in the previous sub-goal).
Sub-goal 5.4: Implementation of the merging processes that allow for the integration of the different standardised and combined annotations aforementioned, relating to all the levels considered.
Sub-goal 5.5: Improvement of the semantic level of this configuration by adding a named entity recognition, (sub-)classification and annotation subsystem, which also uses the named entities annotated to populate a domain ontology, in order to provide a concrete application of the present work in the two areas involved (the Semantic Web and Corpus Linguistics).

3. MAIN RESULTS: ASSESSMENT OF ONTOTAG’S UNDERLYING HYPOTHESES

The model developed in the present thesis tries to shed some light on (i) whether linguistic annotation tools can effectively interoperate; (ii) whether their results can be combined and integrated; and, if they can, (iii) how they can, respectively, interoperate and be combined and integrated. Accordingly, several hypotheses had to be supported (or rejected) by the development of the OntoTag model and OntoTagger (its implementation). The hypotheses underlying OntoTag are surveyed below. Only one of the hypotheses (H.6) was rejected; the other five could be confirmed.

H.1 The annotations of different levels (or layers) can be integrated into a sort of overall, comprehensive, multilayer and multilevel annotation, so that their elements can complement and refer to each other.
• CONFIRMED by the development of:
o OntoTag’s annotation scheme,
o OntoTag’s annotation architecture,
o OntoTagger’s (XML, RDF, OWL) annotation schemas,
o OntoTagger’s configuration.

H.2 Tool-dependent annotations can be mapped onto a sort of tool-independent annotations and, thus, can be standardised.
• CONFIRMED by means of the standardisation phase incorporated into OntoTag and OntoTagger for the annotations yielded by the tools.

H.3 Standardisation should ease:
H.3.1: The interoperation of linguistic tools.
H.3.2: The comparison, combination (at the same level and layer) and integration (at different levels or layers) of annotations.
• H.3 was CONFIRMED by means of the development of OntoTagger’s ontology-based configuration:
o Interoperation, comparison, combination and integration of the annotations of three different linguistic tools (Connexor’s FDG, Bitext’s DataLexica and LACELL’s tagger);
o Integration of EuroWordNet-based, domain-ontology-based and named entity annotations at the semantic level;
o Integration of morphosyntactic, syntactic and semantic annotations.
H.4 Ontologies and Semantic Web technologies (can) play a crucial role in the standardisation of linguistic annotations, by providing consensual vocabularies and standardised formats for annotation (e.g., RDF triples).
• CONFIRMED by means of the development of OntoTagger’s RDF-triple-based annotation schemas.

H.5 The rate of errors introduced by a linguistic tool at a given level, when annotating, can be reduced automatically by contrasting and combining its results with the ones coming from other tools operating at the same level. However, these other tools might be built following a different technological (stochastic vs. rule-based, for example) or theoretical (dependency vs. HPS-grammar-based, for instance) approach.
• CONFIRMED by the results yielded by the evaluation of OntoTagger.

H.6 Each linguistic level can be managed and annotated independently.
• REJECTED by OntoTagger’s experiments and by the dependencies observed among the morphosyntactic annotations, and between them and the syntactic annotations.
In fact, Hypothesis H.6 was already rejected when OntoTag’s ontologies were developed. We observed then that several linguistic units stand on an interface between levels, belonging thereby to both of them (such as morphosyntactic units, which belong to both the morphological level and the syntactic level). Therefore, the annotations of these levels overlap and cannot be handled independently when merged into a unique multileveled annotation.

4. OTHER MAIN RESULTS AND CONTRIBUTIONS

First, interoperability is a hot topic for both the linguistic annotation community and the whole Computer Science field. The specification (and implementation) of OntoTag’s architecture for the combination and integration of linguistic (annotation) tools and annotations by means of ontologies shows a way to make these different linguistic annotation tools and annotations interoperate in practice.
Second, as mentioned above, the elements involved in linguistic annotation were formalised in a set (or network) of ontologies (OntoTag’s linguistic ontologies).
• On the one hand, OntoTag’s network of ontologies consists of
− The Linguistic Unit Ontology (LUO), which includes a mostly hierarchical formalisation of the different types of linguistic elements (i.e., units) identifiable in a written text;
− The Linguistic Attribute Ontology (LAO), which also includes a mostly hierarchical formalisation of the different types of features that characterise the linguistic units included in the LUO;
− The Linguistic Value Ontology (LVO), which includes the corresponding formalisation of the different values that the attributes in the LAO can take;
− The OIO (OntoTag’s Integration Ontology), which
o Includes the knowledge required to link, combine and unite the knowledge represented in the LUO, the LAO and the LVO;
o Can be viewed as a knowledge representation ontology that describes the most elementary vocabulary used in the area of annotation.
• On the other hand, OntoTag’s ontologies incorporate the knowledge included in the different standards and recommendations for linguistic annotation released so far, such as those developed within the EAGLES and the SIMPLE European projects or by the ISO/TC 37 committee:
− As far as morphosyntactic annotations are concerned, OntoTag’s ontologies formalise the terms in the EAGLES (1996a) recommendations and their corresponding terms within the ISO Morphosyntactic Annotation Framework (ISO/MAF, 2008) standard;
− As for syntactic annotations, OntoTag’s ontologies incorporate the terms in the EAGLES (1996b) recommendations and their corresponding terms within the ISO Syntactic Annotation Framework (ISO/SynAF, 2010) standard draft;
− Regarding semantic annotations, OntoTag’s ontologies generalise and extend the recommendations in EAGLES (1996a; 1996b) and, since no stable standards or standard drafts have been released for semantic annotation by ISO/TC 37 yet, they incorporate the terms in SIMPLE (2000) instead;
− The terms coming from all these recommendations and standards were supplemented by those within the ISO Data Category Registry (ISO/DCR, 2008) and also by those of the ISO Linguistic Annotation Framework (ISO/LAF, 2009) standard draft when developing OntoTag’s ontologies.

Third, we showed that the combination of the results of tools annotating at the same level can yield better results (both in precision and in recall) than each tool separately. In particular,
1. OntoTagger clearly outperformed two of the tools integrated into its configuration, namely DataLexica and FDG, in all the combination sub-phases in which they overlapped (i.e. POS tagging, lemma annotation and morphological feature annotation). As far as the remaining tool is concerned, i.e. LACELL’s tagger, it was also outperformed by OntoTagger in POS tagging and lemma annotation, and it did not behave better than OntoTagger in the morphological feature annotation layer.
2. As an immediate result, this implies that
a) This type of combination architecture configuration can be applied in order to improve significantly the accuracy of linguistic annotations; and
b) Concerning the morphosyntactic level, this could be regarded as a way of constructing more robust and more accurate POS tagging systems.

Fourth, Semantic Web annotations are usually performed by humans or else by machine learning systems. Both of them leave much to be desired: the former, with respect to their annotation rate; the latter, with respect to their (average) precision and recall. In this work, we showed how linguistic tools can be wrapped in order to annotate Semantic Web pages automatically using ontologies. This entails their fast, robust and accurate semantic annotation. By way of example, as mentioned in Sub-goal 5.5, we developed a particular OntoTagger module for the recognition, classification and labelling of named entities, according to the MUC and ACE tagsets (Chinchor, 1997; Doddington et al., 2004). These tagsets were further specified by means of a domain ontology, namely the Cinema Named Entities Ontology (CNEO). This module was applied to the automatic annotation of ten different web pages containing cinema reviews (that is, around 5000 words). In addition, the named entities annotated with this module were also labelled as instances (or individuals) of the classes included in the CNEO and, then, were used to populate this domain ontology.
• The statistical results obtained from the evaluation of this particular module of OntoTagger can be summarised as follows. On the one hand, as far as recall (R) is concerned, (R.1) the lowest value was 76.40% (for file 7); (R.2) the highest value was 97.50% (for file 3); and (R.3) the average value was 88.73%.
On the other hand, as far as the precision rate (P) is concerned, (P.1) its minimum was 93.75% (for file 4); (P.2) its maximum was 100% (for files 1, 5, 7, 8, 9, and 10); and (P.3) its average value was 98.99%.
• These results, which apply to the tasks of named entity annotation and ontology population, are extraordinarily good for both of them. They can be explained on the basis of the high accuracy of the annotations provided by OntoTagger at the lower levels (mainly at the morphosyntactic level). However, they should be conveniently qualified, since they might be too domain- and/or language-dependent. Further experiments are needed to determine how our approach works in a different domain or a different language, such as French, English, or German.
• In any case, the results of this application of Human Language Technologies to Ontology Population (and, accordingly, to Ontological Engineering) seem very promising and encouraging in order for these two areas to collaborate and complement each other in the area of semantic annotation.

Fifth, as shown in the State of the Art of this work, there are different approaches and models for the semantic annotation of texts, but all of them focus on a particular view of the semantic level. Clearly, all these approaches and models should be integrated in order to yield a coherent and joint semantic annotation level. OntoTag shows how (i) these semantic annotation layers could be integrated together; and (ii) they could be integrated with the annotations associated with other annotation levels.

Sixth, we identified some recommendations, best practices and lessons learned for annotation standardisation, interoperation and merge. They show how standardisation (via ontologies, in this case) enables the combination, integration and interoperation of different linguistic tools and their annotations into a multilayered (or multileveled) linguistic annotation, which is one of the hot topics in the area of Linguistic Annotation.
And last but not least, OntoTag’s annotation scheme and OntoTagger’s annotation schemas show a way to formalise and annotate coherently and uniformly the different units and features associated with the different levels and layers of linguistic annotation. This is a significant scientific step towards the global standardisation of this area, which is the aim of ISO/TC 37 (in particular, Subcommittee 4, dealing with the standardisation of linguistic annotations and resources).
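
The triple-structured annotations and the same-level combination described in this abstract can be pictured with a small sketch. It assumes that each tool's output has already been standardised into (unit, attribute, value) triples, loosely mirroring the LUO, LAO and LVO split, and it uses a plain majority vote as a stand-in for the combination step. The abstract does not say how OntoTagger actually combines annotations, so the voting rule and the names `Annotation`, `combine` and the toy tool outputs are all hypothetical.

```python
from collections import Counter
from typing import NamedTuple


class Annotation(NamedTuple):
    """One standardised annotation triple (hypothetical structure)."""
    unit: str       # identifier of the linguistic unit, e.g. a token
    attribute: str  # feature being annotated, e.g. "pos" or "lemma"
    value: str      # value assigned to that feature


def combine(tool_outputs):
    """Majority vote per (unit, attribute) across tools.

    Assumption: voting stands in for the unspecified combination step.
    Ties keep the value that was seen first.
    """
    votes = {}
    for annotations in tool_outputs:
        for ann in annotations:
            key = (ann.unit, ann.attribute)
            votes.setdefault(key, Counter())[ann.value] += 1
    return [
        Annotation(unit, attribute, counter.most_common(1)[0][0])
        for (unit, attribute), counter in votes.items()
    ]


# Toy outputs from three hypothetical taggers for a single token "w1".
tool_a = [Annotation("w1", "pos", "noun"), Annotation("w1", "lemma", "film")]
tool_b = [Annotation("w1", "pos", "verb"), Annotation("w1", "lemma", "film")]
tool_c = [Annotation("w1", "pos", "noun")]

merged = combine([tool_a, tool_b, tool_c])
```

Here two of the three toy taggers agree on the part of speech, so the disagreeing tag is outvoted. A real configuration would also need the standardisation step that maps each tool's own tagset onto the shared vocabulary before any comparison is possible.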

Relevance:

80.00%

Publisher:

Abstract:

Lexica and terminology databases play a vital role in many NLP applications, but currently most such resources are published in application-specific formats, or with custom access interfaces, leading to the problem that much of this data is in “data silos” and hence difficult to access. The Semantic Web and in particular the Linked Data initiative provide effective solutions to this problem, as well as possibilities for data reuse by inter-lexicon linking, and incorporation of data categories by dereferenceable URIs. The Semantic Web focuses on the use of ontologies to describe semantics on the Web, but currently there is no standard for providing complex lexical information for such ontologies and for describing the relationship between the lexicon and the ontology. We present our model, lemon, which aims to address these gaps.

Relevance:

80.00%

Publisher:

Abstract:

The Semantic Web aims to allow machines to make inferences using the explicit conceptualisations contained in ontologies. By pointing to ontologies, Semantic Web-based applications are able to inter-operate and share common information easily. Nevertheless, multilingual semantic applications are still rare, because most online ontologies are monolingual in English. In order to solve this issue, techniques for ontology localisation and translation are needed. However, traditional machine translation is difficult to apply to ontologies, because ontology labels tend to be quite short and linguistically different from free text. In this paper, we propose an approach to enhance machine translation of ontologies based on exploiting the well-structured concept descriptions contained in the ontology. In particular, our approach leverages the semantics contained in the ontology by using Cross Lingual Explicit Semantic Analysis (CLESA) for context-based disambiguation in phrase-based Statistical Machine Translation (SMT). The presented work is novel in that, to the best of our knowledge, CLESA has not previously been applied in SMT.

Relevance:

80.00%

Publisher:

Abstract:

In this paper we present the MultiFarm dataset, which has been designed as a benchmark for multilingual ontology matching. The MultiFarm dataset is composed of a set of ontologies translated into different languages and the corresponding alignments between these ontologies. It is based on the OntoFarm dataset, which has been used successfully for several years in the Ontology Alignment Evaluation Initiative (OAEI). By translating the ontologies of the OntoFarm dataset into eight different languages (Chinese, Czech, Dutch, French, German, Portuguese, Russian, and Spanish) we created a comprehensive set of realistic test cases. Based on these test cases, it is possible to evaluate and compare the performance of matching approaches with a special focus on multilingualism.
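
Evaluating a matcher against a benchmark of this kind usually comes down to comparing the alignment it produces with a reference alignment. The sketch below shows that comparison in terms of precision, recall and F-measure, treating an alignment as a set of entity pairs. The function name `evaluate` and the entity labels are hypothetical and are not taken from MultiFarm itself.

```python
def evaluate(found, reference):
    """Return (precision, recall, f_measure) of a found alignment.

    Both arguments are collections of (source_entity, target_entity) pairs.
    """
    found, reference = set(found), set(reference)
    correct = len(found & reference)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical English-German correspondences, for illustration only.
reference = {
    ("en:Paper", "de:Artikel"),
    ("en:Author", "de:Autor"),
    ("en:Review", "de:Gutachten"),
    ("en:Conference", "de:Konferenz"),
}
found = {
    ("en:Paper", "de:Artikel"),
    ("en:Author", "de:Autor"),
    ("en:Review", "de:Konferenz"),  # a wrong correspondence
}

precision, recall, f_measure = evaluate(found, reference)
```

Treating correspondences as exact pairs is a simplification: a real evaluation may also take the relation type and a confidence value into account.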

Relevance:

80.00%

Publisher:

Abstract:

In recent years there has been rapid growth in international trade in semi-finished products that are designed, produced and assembled in different locations across different countries, mainly for the following reasons: the development of information technologies, the reduction of transport costs, the liberalisation of capital markets, the harmonisation of institutional factors, regional economic integration (which entails the reduction and elimination of trade barriers), the economic development of emerging countries, the use of economies of scale, and the deregulation of international trade. All this has increased competition in markets worldwide and has given companies easier access to potential markets, to the acquisition of capabilities and knowledge in other countries, and to international strategic alliances with third parties. The result is a more uncertain and more demanding environment for the companies that make up an industry, with direct consequences for their operations and for the organisation of their production. In order to adapt, remain competitive and benefit from this new globalised and more competitive scenario, companies have outsourced parts of the production process to specialised suppliers, creating a new intermediate market that divides the production process, previously integrated within the companies that make up an industry, between two sets of companies specialised in that industry. This process usually takes place while preserving the industry in which it occurs, the same services and products, the technology used, and the original companies that made up the industry before the vertical disintegration.
Todo ello es así debido a que es beneficioso tanto para las compañías originales de la industria como para las nuevas compañías de este mercado intermedio por diversos motivos. La desintegración vertical en una industria tiene unas consecuencias que la transforman completamente, así como la forma de operar de las compañías que la integran, incluso para aquellas que permanecen verticalmente integradas. Una de las características más importantes de esta desintegración vertical en una industria es la posibilidad que tiene una compañía de adquirir a una tercera la primera parte del proceso productivo o un bien semielaborado, que posteriormente será finalizado por la compañía adquiriente con la práctica del outsourcing; así mismo, una compañía puede realizar la primera parte del proceso productivo o un bien semielaborado, que posteriormente será finalizado por una tercera compañía con la práctica de la fragmentación. El principal objetivo de la presente investigación es el estudio de los motivos, los facilitadores, los efectos, las consecuencias y los principales factores significativos, microeconómicos y macroeconómicos, que desencadenan o incrementan la práctica de la desintegración vertical en una industria; para ello, la investigación se divide en dos líneas completamente diferenciadas: el estudio de la práctica del outsourcing y, por otro lado, el estudio de la fragmentación por parte de las compañías que componen la industria del automóvil en España, puesto que se trata de una de las industrias más desintegradas verticalmente y fragmentadas, y este sector posee una gran importancia en la economía del país. En primer lugar, se hace una revisión de la literatura existente relativa a los siguientes aspectos: desintegración vertical, outsourcing, fragmentación, teoría del comercio internacional, historia de la industria del automóvil en España y el uso de las aglomeraciones geográficas y las tecnologías de la información en el sector del automóvil. 
La metodología empleada en cada uno de ellos ha sido diferente en función de la disponibilidad de los datos y del enfoque de investigación: los factores microeconómicos, utilizando el outsourcing, y los factores macroeconómicos, empleando la fragmentación. En el estudio del outsourcing, se usa un índice basado en las compras externas sobre el valor total de la producción. Así mismo, se estudia su correlación y significación con las variables económicas más importantes que definen a una compañía del sector del automóvil, utilizando la técnica estadística de regresión lineal. Aquellas variables relacionadas con la competencia en el mercado, la externalización de las actividades de menor valor añadido y el incremento de la modularización de las actividades de la cadena de valor, han resultado significativas con la práctica del outsourcing. En el estudio de la fragmentación se seleccionan un conjunto de factores macroeconómicos, comúnmente usados en este tipo de investigaciones, relacionados con las principales magnitudes económicas de un país, y un conjunto de factores macroeconómicos, no comúnmente usados en este tipo de investigaciones, relacionados con la libertad económica y el comercio internacional de un país. Se emplea un modelo de regresión logística para identificar qué factores son significativos en la práctica de la fragmentación. De entre todos los factores usados en el modelo, los relacionados con las economías de escala y los costes de servicio han resultado significativos. Los resultados obtenidos de los test estadísticos realizados en el modelo de regresión logística han resultado satisfactorios; por ello, el modelo propuesto de regresión logística puede ser considerado sólido, fiable y versátil; además, acorde con la realidad. 
De los resultados obtenidos en el estudio del outsourcing y de la fragmentación, combinados conjuntamente con el estado del arte, se concluye que el principal factor que desencadena la desintegración vertical en la industria del automóvil es la competencia en el mercado de vehículos. Cuanto mayor es la demanda de vehículos, más se reducen los beneficios y la rentabilidad para sus fabricantes. Estos, para ser competitivos, diferencian sus productos de la competencia centrándose en las actividades que mayor valor añadido aportan al producto final, externalizando las actividades de menor valor añadido a proveedores especializados, e incrementando la modularidad de las actividades de la cadena de valor. Las compañías de la industria del automóvil se especializan en alguna o varias de estas actividades modularizadas que, combinadas con el uso de factores facilitadores como las economías de escala, las tecnologías de la información, las ventajas de la globalización económica y la aglomeración geográfica de una industria, incrementan y motivan la desintegración vertical en la industria del automóvil, desencadenando la coespecialización en dos sectores claramente diferenciados: el sector de fabricantes de vehículos y el sector de proveedores especializados. 
Cada uno de ellos se especializa en unas actividades y en unos productos o servicios específicos de la cadena de valor, lo cual genera las siguientes consecuencias en la industria del automóvil: se reducen los costes de transacción en los productos o servicios intercambiados; se incrementa la relación de dependencia entre fabricantes de vehículos y proveedores especializados, provocando un aumento en la cooperación y la coordinación, acelerando el proceso de aprendizaje, posibilitando a ambos adquirir nuevas capacidades, conocimientos y recursos, y creando nuevas ventajas competitivas para ambos; por último, las barreras de entrada a la industria del automóvil y el número de compañías se ven alteradas cambiando su estructura. Como futura línea de investigación, los fabricantes de vehículos tenderán a centrarse en investigar, diseñar y comercializar el producto o servicio, delegando el ensamblaje en manos de nuevos especialistas en la materia, el contract manufacturer; por ello, sería conveniente investigar qué factores motivantes o facilitadores existen y qué consecuencias tendría la implantación de los contract manufacturer en la industria del automóvil.

ABSTRACT

In recent years there has been a rapid growth of international trade in semi-finished products designed, produced and assembled in different locations across different countries, mainly due to the following reasons: development of information technologies, reduction of transportation costs, liberalisation of capital markets, harmonisation of institutional factors, regional economic integration, which involves the reduction and elimination of trade barriers, economic development of emerging countries, use of economies of scale and deregulation of international trade.
All these factors have increased competition in markets at a global level and have given companies easier access to potential markets, to the acquisition of skills and knowledge in other countries, and to the formation of international strategic alliances with third parties, thus creating a more demanding and uncertain environment for the companies that make up an industry, which has a direct impact on the companies' operations and the organization of their production. In order to adapt, be competitive and benefit from this new and more competitive global scenario, companies have outsourced some parts of their production process to specialist suppliers, generating a new intermediate market which divides the production process, previously integrated in the companies that made up the industry, into two sets of companies specialized in that industry. This process often occurs while preserving the industry where it takes place, its services and products, the technology used and the original companies that formed it prior to vertical disintegration. This is because it is beneficial, for various reasons, both for the industry's original companies and for the companies belonging to this new intermediate market. Vertical disintegration has consequences which completely transform the industry where it takes place as well as the modus operandi of the companies that are part of it, even of those that remain vertically integrated. One of the most important features of vertical disintegration of an industry is the possibility for a company to acquire from a third party the first part of the production process or a semi-finished product, which will then be finished by the acquiring company through the practice of outsourcing; likewise, a company can perform the first part of the production process or produce a semi-finished product, which will then be completed by a third company through the practice of fragmentation.
The main objective of this research is to study the motives, facilitators, effects, consequences and major significant microeconomic and macroeconomic factors that trigger or increase the practice of vertical disintegration in a certain industry; in order to do so, the research is divided into two completely differentiated lines: on the one hand, the study of the practice of outsourcing and, on the other, the study of fragmentation by the companies that make up the automotive industry in Spain, since this is one of the most vertically disintegrated and fragmented industries and this particular sector is of major significance in the country's economy. First, a review is made of the existing literature on the following aspects: vertical disintegration, outsourcing, fragmentation, international trade theory, history of the automobile industry in Spain and the use of geographical agglomeration and information technologies in the automotive sector. The methodology used for each of these lines has been different depending on the availability of data and the research approach: the microeconomic factors are studied through outsourcing, and the macroeconomic factors through fragmentation. In the study on outsourcing, an index is used based on external purchases in relation to the total value of production. Likewise, its significance and correlation with the major economic variables that define an automotive company are studied, using the statistical technique of linear regression. Variables related to market competition, outsourcing of the lowest value-added activities and increased modularisation of the activities of the value chain have turned out to be significant for the practice of outsourcing.
In the study of fragmentation, two sets of macroeconomic factors are selected: one commonly used in this type of research, related to the main economic indicators of a country, and another not commonly used in this type of research, related to the economic freedom and international trade of a country. A logistic regression model is used to identify which factors are significant in the practice of fragmentation. Amongst all factors used in the model, those related to economies of scale and service costs have turned out to be significant. The results obtained from the statistical tests performed on the logistic regression model have been satisfactory; hence, the proposed logistic regression model can be considered solid, reliable and versatile, as well as in line with reality. From the results obtained in the study of outsourcing and fragmentation, combined with the state of the art, it is concluded that the main factor that triggers vertical disintegration in the automotive industry is competition within the vehicle market. The greater the vehicle demand, the lower the earnings and profitability for manufacturers. These, in order to be competitive, differentiate their products from the competition by focusing on those activities that contribute the highest added value to the final product, outsourcing the lower value-added activities to specialist suppliers, and increasing the modularity of the activities of the value chain.
Companies in the automotive industry specialize in one or more of these modularised activities which, combined with the use of enabling factors such as economies of scale, information technologies, the advantages of economic globalisation and the geographical agglomeration of an industry, increase and encourage vertical disintegration in the automotive industry, triggering co-specialization in two clearly distinct sectors: the vehicle manufacturers sector and the specialist suppliers sector. Each of them specializes in certain activities and specific products or services of the value chain, generating the following consequences in the automotive industry: reduction of the transaction costs of the goods or services exchanged; growth of the relationship of dependency between vehicle manufacturers and specialist suppliers, which causes an increase in cooperation and coordination, accelerates the learning process, enables both to acquire new skills, knowledge and resources, and creates new competitive advantages for both; finally, barriers to entry into the automotive industry and the number of companies are altered, changing the structure of the industry. Vehicle manufacturers will tend to focus on researching, designing and marketing the product or service, delegating assembly to new specialists in the field, the contract manufacturers; as a future line of research, it would therefore be useful to investigate what motivating or facilitating factors exist in this respect and what consequences the implementation of contract manufacturers would have in the automotive industry.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This chapter presents methodological guidelines that allow engineers to reuse generic ontologies. Ontologies of this kind represent notions that are generic across many fields (is part of, temporal interval, etc.). The guidelines help the developer (a) to identify the type of generic ontology to be reused, (b) to find out the axioms and definitions that should be reused and (c) to adapt and integrate the selected generic ontology into the domain ontology to be developed. For each task of the methodology, a set of heuristics with examples is presented. We hope that after reading this chapter, you will have acquired some basic ideas on how to take advantage of the great deal of well-founded explicit knowledge that formalizes generic notions such as time concepts and the part-of relation.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

In contrast to other approaches that provide methodological guidance for ontology engineering, the NeOn Methodology does not prescribe a rigid workflow, but instead it suggests a variety of pathways for developing ontologies. The nine scenarios proposed in the methodology cover commonly occurring situations, for example, when available ontologies need to be re-engineered, aligned, modularized, localized to support different languages and cultures, and integrated with ontology design patterns and non-ontological resources, such as folksonomies or thesauri. In addition, the NeOn Methodology framework provides (a) a glossary of processes and activities involved in the development of ontologies, (b) two ontology life cycle models, and (c) a set of methodological guidelines for different processes and activities, which are described (a) functionally, in terms of goals, inputs, outputs, and relevant constraints; (b) procedurally, by means of workflow specifications; and (c) empirically, through a set of illustrative examples.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

La seguridad en redes informáticas es un área que ha sido ampliamente estudiada y objeto de una extensa investigación en los últimos años. Debido al continuo incremento en la complejidad y sofisticación de los ataques informáticos, el aumento de su velocidad de difusión, y la lentitud de reacción frente a las intrusiones existente en la actualidad, se hace patente la necesidad de mecanismos de detección y respuesta a intrusiones, que detecten y además sean capaces de bloquear el ataque, y mitiguen su impacto en la medida de lo posible. Los Sistemas de Detección de Intrusiones o IDSs son tecnologías bastante maduras cuyo objetivo es detectar cualquier comportamiento malicioso que ocurra en las redes. Estos sistemas han evolucionado rápidamente en los últimos años convirtiéndose en herramientas muy maduras basadas en diferentes paradigmas, que mejoran su capacidad de detección y le otorgan un alto nivel de fiabilidad. Por otra parte, un Sistema de Respuesta a Intrusiones (IRS) es un componente de seguridad que puede estar presente en la arquitectura de una red informática, capaz de reaccionar frente a los incidentes detectados por un Sistema de Detección de Intrusiones (IDS). Por desgracia, esta tecnología no ha evolucionado al mismo ritmo que los IDSs, y la reacción contra los ataques detectados es lenta y básica, y los sistemas presentan problemas para ejecutar respuestas de forma automática. Esta tesis doctoral trata de hacer frente al problema existente en la reacción automática frente a intrusiones, mediante el uso de ontologías, lenguajes formales de especificación de comportamiento y razonadores semánticos como base de la arquitectura del sistema de un sistema de respuesta automática frente a intrusiones o AIRS. El objetivo de la aproximación es aprovechar las ventajas de las ontologías en entornos heterogéneos, además de su capacidad para especificar comportamiento sobre los objetos que representan los elementos del dominio modelado. 
Esta capacidad para especificar comportamiento será de gran utilidad para que el AIRS infiera la respuesta óptima frente a una intrusión en el menor tiempo posible.

Abstract

Security in networks is an area that has been widely studied and has been the focus of extensive research over the past few years. Because security events are increasing in number, becoming more sophisticated and spreading faster, while reaction to intrusions remains slow, there is a need for intrusion detection and response systems that adapt dynamically so as to better detect and respond to attacks, mitigating them or reducing their impact. Intrusion Detection Systems (IDSs) are mature technologies whose aim is to detect malicious behavior in networks. These systems have evolved quickly and there are now very mature tools based on different paradigms (statistical anomaly-based, signature-based and hybrid) with a high level of reliability. On the other hand, an Intrusion Response System (IRS) is a security technology able to react against the intrusions detected by an IDS. Unfortunately, the state of the art in IRSs is not as mature as in IDSs. The reaction against intrusions is slow and simple, and these systems have difficulty triggering automated responses. This dissertation addresses the existing problem of automated reaction against intrusions by using ontologies, formal behaviour specification languages and semantic reasoners as the basis of the architecture of an Automated Intrusion Response System (AIRS). The aim is to take advantage of ontologies in heterogeneous environments, in addition to their ability to specify the behavior of the objects that represent the elements of the modeled domain. This ability to specify behavior will help the AIRS infer the optimum response to an intrusion as quickly as possible.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

El progresivo envejecimiento de la población está produciendo una elevada demanda de servicios socio‐asistenciales por parte de las personas mayores para mantener su vida independiente y el consiguiente “envejecimiento activo”. La iniciativa Ambient Assisted Living (AAL) promueve el “envejecimiento activo” a través de las Tecnologías de la Información y las Comunicaciones (TIC) y es en ella donde se centrará el trabajo de esta tesis doctoral. Una característica fundamental de los servicios AAL es su adaptación y personalización a las características y preferencias del usuario y su contexto. Así, el paradigma “context awareness” presenta una gran relevancia en la provisión de servicios AAL y en el soporte a la vida independiente de las personas mayores. Concretamente, la utilización de ontologías permite crear modelos de usuarios y contexto que pueden ser utilizadas para los mecanismos de razonamiento incluidos en los servicios context‐aware. Por otra parte, los usuarios actualmente precisan acceder a un conjunto de servicios desde cualquier red de acceso y desde cualquier dispositivo. Las redes de próxima generación (Next Generation Networks‐NGN) lo hacen posible pues ofrecen una convergencia dispositivo‐red‐servicio. La tecnología IMS (IP Multimedia Subsystem) es una arquitectura que implementa el paradigma NGN y ofrece una serie de servicios de red genéricos llamados servicios habilitadores o enablers que pueden ser reutilizados en cualquier aplicación, soportando mecanismos de interoperabilidad entre aplicaciones y permitiendo un desarrollo robusto, rápido y sencillo. Además, los servicios enablers permiten mecanismos de gestión de la información de usuario para realizar una provisión adaptada del servicio en función de la información del estado del usuario. 
El objetivo de esta tesis doctoral se centra en establecer un marco de convergencia entre estos dos campos diseñando y desarrollando un conjunto de servicios enablers soportados en una arquitectura IMS implementada para soportar la provisión de aplicaciones AAL bajo el paradigma context‐awareness y la triple convergencia red‐dispositivo‐servicio cubriendo así las necesidades y requisitos de las personas mayores. Entre las aportaciones de la presente tesis se destaca la realización de un modelo de plataforma de servicios AAL, denominado Residencia Virtual Asistiva, para su provisión en el domicilio de la persona mayor, así como la propuesta de implementación de sus servicios a través de servicios enablers. Por otra parte se define una ontología destinada a modelar servicios AAL así como sus usuarios (personas mayores) para lograr una provisión personalizada y adaptada de servicios AAL. Esta ontología se ha implementado a través del servicio de presencia de la arquitectura IMS para poder crear perfiles de usuario y así poder realizar dicha provisión personalizada. Además, se desarrolla una aplicación de teleconsulta, como ejemplo de servicio AAL, que utiliza una serie de servicios enablers desarrollados para ofrecer funcionalidades avanzadas a la aplicación. Bajo el paradigma context‐awareness se ha desarrollado y evaluado técnicamente un servicio enabler para ofrecer soporte a la movilidad y a la independencia de las personas mayores con deterioro cognitivo que sufren episodios de desorientación espacial.

ABSTRACT

The progressive ageing of the population is making elderly people demand socio‐healthcare services to maintain an independent living and therefore an “active ageing”. The initiative Ambient Assisted Living (AAL), on which the current PhD thesis is focused, promotes the “active ageing” by means of Information and Communication Technologies (ICT).
Essential features of AAL services are their adaptation and personalization to the user’s characteristics and preferences as well as to the user’s context. Thus, the “context‐awareness” paradigm is highly relevant to AAL service provision and to the support of independent living for elderly people. In particular, the use of ontologies allows the creation of user and context models to be employed in the reasoning mechanisms of context‐aware services. On the other hand, users currently need to access a set of services from any access network and any device. Next‐Generation Networks (NGN) support this need by offering a service‐network‐device convergence. The IP Multimedia Subsystem (IMS) technology is an architecture that implements the NGN paradigm and offers a set of generic network services, known as service enablers, which can be reused by any application, supporting application interoperability mechanisms and allowing simple, fast and robust application development. Furthermore, the service enablers offer user information management procedures to adapt service provision to the user’s status. The objective of this PhD thesis is to establish a convergence framework between these two fields by designing and developing a group of service enablers deployed in an IMS architecture. The enablers developed will support the provision of AAL applications under the context‐awareness paradigm and the service‐network‐device convergence in order to cover elderly people’s requirements and needs. Among the contributions of this PhD thesis, the definition of an AAL service platform model, named “Assisted Virtual Nursing Home”, to be deployed in the older adult’s home, is emphasised. In addition, a proposal is made of service enablers to support the AAL services defined in the model. Furthermore, an ontology is defined to model AAL services as well as their users, with the aim of achieving a personalized and adapted AAL service provision.
This ontology has been implemented by means of the IMS presence service in order to create user profiles to be used in the personalized AAL services. As an example of an AAL service, a teleconsultation application has been developed that uses a set of service enablers developed to offer advanced functionalities to the application. Under the context‐awareness paradigm, a service enabler has been developed and technically evaluated to support the mobility and independence of elderly people with mild cognitive impairment who suffer spatial disorientation episodes.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This paper describes the approach to adapting reusable knowledge representation components used in the KSM environment for the formulation and operationalisation of structured knowledge models. Reusable knowledge representation components in KSM are called primitives of representation. A primitive of representation provides: (1) a knowledge representation formalism; (2) a set of tasks that use this knowledge, together with several problem-solving methods to carry out these tasks; (3) a knowledge acquisition module that provides different services to acquire and validate this knowledge; and (4) an abstract terminology about the linguistic categories included in the representation language associated with the primitive. Primitives of representation are usually domain independent. A primitive of representation can be adapted to support knowledge in a given domain by importing concepts from this domain. The paper describes how this activity can be carried out by means of a terminological importation. Informally, a terminological importation partially populates an abstract terminology with concepts taken from a given domain. The information provided by the importation can be used by the acquisition and validation facilities to constrain, according to the domain knowledge, the classes of knowledge that can be described using the representation formalism. KSM provides the LINK-S language to specify terminological importation from a domain terminology to an abstract one. These terminologies are described in KSM by means of the CONCEL language. Terminological importation is used to adapt reusable primitives of representation in order to increase the usability of such components in these domains. In addition, two primitives of representation can share a common vocabulary by importing common domain CONCEL terminologies (conceptual vocabularies).
Sharing a vocabulary in this way is a necessary condition for interoperability between different, heterogeneous knowledge representation components in the framework of complex knowledge-based architectures.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

En los últimos años la evolución de la información compartida por internet ha cambiado enormemente, llegando a convertirse en lo que llamamos hoy la Web Semántica. Este término, acuñado en 2004, muestra una manera más “inteligente” de compartir los datos, de tal manera que éstos puedan ser entendibles por una máquina o por cualquier persona en el mundo. Ahora mismo se encuentra en fase de expansión, prueba de ello es la cantidad de grupos de investigación que están actualmente dedicando sus esfuerzos al desarrollo e implementación de la misma y la amplitud de temáticas que tienen sus trabajos. Con la aparición de la Web Semántica, la tendencia de las bases de datos de nueva creación se está empezando a inclinar hacia la creación de ontologías más o menos sencillas que describan las bases de datos y así beneficiarse de las posibilidades de interoperabilidad que aporta. Con el presente trabajo se pretende el estudio de los beneficios que aporta la implementación de una ontología en una base de datos relacional ya creada, los trabajos necesarios para ello y las herramientas necesarias para hacerlo. Para ello se han tomado unos datos de gran interés y, como continuación a su trabajo, se ha implementado la ontología. Estos datos provienen del estudio de un método para la obtención automatizada del linaje de las parcelas registradas en el catastro español.

Abstract

In recent years, the information shared on the Internet has changed dramatically, giving rise to what is now called the Semantic Web. This term, which appeared in 2004, defines a “smarter” way of sharing data, so that the data can be understood by machines or by any person around the world. The Semantic Web is currently in an expansion phase, as shown by the number of research groups working on this approach and the wide thematic range of their work.
With the appearance of the Semantic Web, newly created databases are beginning to be accompanied by more or less simple ontologies that describe them, and thus benefit from the interoperability possibilities that ontologies provide. This work studies the benefits of implementing an ontology on an existing relational database, the steps to follow and the tools needed to do so. The study uses data of considerable interest, which come from a study of a method for automatically obtaining the lineage of parcels registered in the Spanish cadastre. As a continuation of that work, the ontology has been implemented.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The purpose of this thesis is the automatic construction of ontologies from texts, within the area known as Ontology Learning. This discipline aims to automate the construction of domain models from structured or unstructured information sources. It originated at the beginning of the millennium as a result of the exponential growth in the volume of information accessible on the Internet. Since most information on the web is presented as text, automatic ontology learning has focused on the analysis of this type of source, drawing over the years on very diverse techniques from areas such as Information Retrieval, Information Extraction, Summarization and, in general, areas related to natural language processing.

The main contribution of this thesis is that, unlike most current techniques, the proposed method does not analyze the surface syntactic structure of the language but instead studies its deep semantic level. Its objective, therefore, is to infer the domain model from the way the meanings of sentences are articulated in natural language. Since the deep semantic level is language-independent, the method can operate in multilingual scenarios, where information from texts in different languages must be combined. To access this level of language, the method uses the interlingua model. These formalisms, which come from the area of machine translation, represent the meaning of sentences independently of the language. Specifically, UNL (Universal Networking Language) is used, which is considered to be the only standardized general-purpose interlingua.

The approach continues previous work by the author and the research group to which he belongs, which studied how to use the interlingua model in multilingual information extraction and retrieval. Basically, the procedure defined in the method tries to identify, in the UNL representation of the texts, certain regularities from which the pieces of the domain ontology can be deduced. Since UNL is a formalism based on semantic networks, these regularities take the form of graphs, which are generalized into structures called linguistic patterns. UNL also still preserves certain discourse cohesion mechanisms inherited from natural languages, such as anaphora. To improve the understanding of expressions, the method provides, as another significant contribution, an algorithm for resolving pronominal anaphora within the interlingua model, limited to third-person personal pronouns whose antecedent is a proper noun.

The proposed method rests on a formal framework, built by adapting certain definitions from graph theory and adding new ones, in order to accommodate the notions of UNL expression, linguistic pattern, and the pattern matching operations that underlie the method's processes. Both the formal framework and all the processes defined by the method have been implemented in order to carry out the experimentation, which was applied to an article from the UNESCO "Encyclopedia of Life Support Systems" (EOLSS) collection.
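
The idea of matching a linguistic pattern against a graph of semantic relations can be illustrated with a minimal sketch. This is not the thesis's algorithm or formal framework: it assumes a sentence is encoded as a list of labelled binary relations, that pattern terms starting with `?` are variables, and that the relation labels (`agt`, `obj`, `aoj`), their argument order, and the example sentence are purely illustrative.

```python
from typing import Dict, Iterator, List, Optional, Tuple

# Hypothetical encoding: (relation label, source node, target node).
Edge = Tuple[str, str, str]
Binding = Dict[str, str]


def is_var(term: str) -> bool:
    """Pattern terms that start with '?' are treated as variables."""
    return term.startswith("?")


def unify(term: str, value: str, binding: Binding) -> Optional[Binding]:
    """Bind a pattern term to a graph node, or return None on conflict."""
    if is_var(term):
        if term in binding:
            return binding if binding[term] == value else None
        extended = dict(binding)
        extended[term] = value
        return extended
    return binding if term == value else None


def match(pattern: List[Edge], graph: List[Edge],
          binding: Optional[Binding] = None) -> Iterator[Binding]:
    """Yield every variable binding under which all pattern edges occur in the graph."""
    binding = {} if binding is None else binding
    if not pattern:
        yield binding
        return
    (p_rel, p_src, p_tgt), rest = pattern[0], pattern[1:]
    for g_rel, g_src, g_tgt in graph:
        if g_rel != p_rel:
            continue
        b = unify(p_src, g_src, binding)
        if b is None:
            continue
        b = unify(p_tgt, g_tgt, b)
        if b is None:
            continue
        # Backtracking search: try to match the remaining pattern edges.
        yield from match(rest, graph, b)


# Illustrative graph for a sentence such as "The river carries sediment".
graph = [
    ("aoj", "body_of_water", "river"),
    ("agt", "carry", "river"),
    ("obj", "carry", "sediment"),
]

# Pattern: some action ?verb with an agent ?x and an object ?y.
pattern = [("agt", "?verb", "?x"), ("obj", "?verb", "?y")]

results = list(match(pattern, graph))
```

Each binding in `results` could then be read as a candidate ontology relation between the concepts bound to `?x` and `?y`, named after `?verb`; how such candidates are validated and generalized is outside the scope of this sketch.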

Relevance:

80.00% 80.00%

Publisher:

Abstract:

The project builds on an earlier project in which a model was constructed to represent higher education information through a network of ontologies, providing a common definition of important concepts. This project consists of developing a tool capable of generating educational data from that ontology network, following the Linked Data paradigm [1]. The tool will extract data from different educational sources and transform it into Linked Data. This work was carried out using GATE Developer [2], a development environment that provides a comprehensive set of interactive graphical tools for creating, measuring, and maintaining software components for human language processing.
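
The transformation step, turning an extracted record into Linked Data, can be sketched as follows. This is only an illustration under stated assumptions, not the project's tool: the `http://example.org/edu/` namespace, the field names, and the `record_to_ntriples` helper are hypothetical, the extraction stage (done with GATE Developer in the project) is not modelled, and the output is serialized as N-Triples using only the Python standard library.

```python
from typing import Dict, List

# Hypothetical namespace; the real ontology network's IRIs are not given in the abstract.
BASE = "http://example.org/edu/"


def escape_literal(value: str) -> str:
    """Escape backslashes, double quotes, and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def record_to_ntriples(record_id: str, record: Dict[str, str]) -> List[str]:
    """Turn one extracted record into N-Triples lines.

    Assumes record_id and field names are already IRI-safe. Values that look
    like HTTP(S) IRIs become links to other resources; everything else
    becomes a plain string literal.
    """
    subject = f"<{BASE}course/{record_id}>"
    lines = []
    for field, value in sorted(record.items()):
        predicate = f"<{BASE}vocab#{field}>"
        if value.startswith(("http://", "https://")):
            obj = f"<{value}>"
        else:
            obj = f'"{escape_literal(value)}"'
        lines.append(f"{subject} {predicate} {obj} .")
    return lines


# Illustrative record, as it might come out of an extraction step.
triples = record_to_ntriples(
    "c1",
    {
        "title": "Ontology Engineering",
        "taughtAt": "http://example.org/edu/university/u1",
    },
)
```

Emitting links as IRIs rather than strings is what makes the data "linked": the `taughtAt` value can be dereferenced or joined with other datasets, whereas the title stays a literal.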