48 results for Boolean-like laws. Fuzzy implications. Fuzzy rule based systems. Fuzzy set theories

at Universidad Politécnica de Madrid


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Diabetes mellitus is a disease characterized by insufficient or absent insulin production by the pancreas, or by reduced sensitivity of the body to this hormone, which helps glucose reach the tissues and the nervous system to supply energy. Diabetes is more prevalent in developed countries because of multiple factors, including obesity, a sedentary lifestyle and endocrine dysfunctions related to the pancreas. Type 1 diabetes is a chronic, incurable disease in which the insulin-producing beta cells of the pancreas are destroyed, so exogenous insulin is required to control blood glucose levels. The patient must follow a subcutaneous insulin therapy adapted to his or her metabolic needs and lifestyle. This therapy tries to imitate the insulin profile of a healthy pancreas. Current technology makes it possible to address the development of the so-called “endocrine artificial pancreas” (EAP), which would provide accuracy, efficacy and safety in the application of insulin therapies and would give patients, who are currently tied to constant decision making, greater independence from their disease. The EAP consists of a continuous glucose sensor, an insulin infusion pump and a control algorithm that computes the insulin to be infused using the patient’s glucose levels as the main source of information.

This work presents a modification of the closed-loop control method proposed in a previous project. The reference controller is composed of a Boolean basal controller and a postprandial rule-based fuzzy controller that inherits its rules from the basal controller. The postprandial controller administers 50% of the manual bolus (calculated from the amount of carbohydrates that the patient is going to ingest) at the moment of the meal announcement and distributes the remainder at later instants. The goal is to achieve optimal regulation of the glucose level in the postprandial period. To reduce postprandial hyperglycemia, an insulin transport is carried out: part of the basal insulin of the postprandial period is brought forward and delivered together with a variable percentage of the manual bolus. This percentage is linked to the metabolic state of the patient before the intake. In addition, the knowledge base is modified to fit the controller’s behavior to the postprandial period.

This project focuses on improving the previous postprandial fuzzy controller by modifying two aspects: the inference of the postprandial controller, and the addition of automatic decision making on the percentage of the manual bolus and the transport. A fuzzy controller with a new inference, which does not inherit the characteristics of the basal controller, has been proposed and adapted to the postprandial period. A fuzzy inference has been added that modifies both the amount of insulin to administer at the meal announcement and the amount of basal insulin to transport from the postprandial period to the manual bolus. The algorithm was validated through simulation experiments using a population of 10 synthetic patients from the UVA/PADOVA simulator; the results were evaluated with statistical parameters and then compared with those obtained with the previous control method. The evaluation shows that the new postprandial controller, combined with the automatic decision making, achieves better glycemic control in the postprandial period and lowers the levels of hyperglycemia.
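The bolus split described in this abstract (50% at the meal announcement, the remainder at later instants) can be illustrated with a minimal Python sketch. The function names, the insulin-to-carbohydrate ratio and the even spread of the remainder are assumptions made for illustration only; in the work described, the later doses are decided by the fuzzy controller, not by a fixed schedule.

```python
def carb_bolus(carbs_g, grams_per_unit):
    """Manual bolus (insulin units) from the announced carbohydrates.

    grams_per_unit is a hypothetical insulin-to-carbohydrate ratio.
    """
    if grams_per_unit <= 0:
        raise ValueError("grams_per_unit must be positive")
    return carbs_g / grams_per_unit


def split_bolus(total_units, upfront_fraction=0.5, later_steps=4):
    """Split a bolus into an upfront dose plus equal later doses.

    The even spread over later_steps is an assumption of this sketch,
    standing in for the controller's own decisions.
    """
    if not 0.0 <= upfront_fraction <= 1.0:
        raise ValueError("upfront_fraction must be in [0, 1]")
    if later_steps < 1:
        raise ValueError("later_steps must be at least 1")
    upfront = total_units * upfront_fraction
    remainder = total_units - upfront
    return [upfront] + [remainder / later_steps] * later_steps


if __name__ == "__main__":
    bolus = carb_bolus(carbs_g=60.0, grams_per_unit=10.0)
    print(split_bolus(bolus))
```

Making `upfront_fraction` a parameter mirrors the modification the abstract proposes, where the percentage delivered at the announcement becomes variable instead of fixed at 50%.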

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper discusses a novel hybrid approach for text categorization that combines a machine learning algorithm, which provides a base model trained with a labeled corpus, with a rule-based expert system, which is used to improve the results provided by the previous classifier, by filtering false positives and dealing with false negatives. The main advantage is that the system can be easily fine-tuned by adding specific rules for those noisy or conflicting categories that have not been successfully trained. We also describe an implementation based on k-Nearest Neighbor and a simple rule language to express lists of positive, negative and relevant (multiword) terms appearing in the input text. The system is evaluated in several scenarios, including the popular Reuters-21578 news corpus for comparison to other approaches, and categorization using IPTC metadata, EUROVOC thesaurus and others. Results show that this approach achieves a precision that is comparable to top-ranked methods, with the added value that it does not require a demanding human expert workload to train.
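The two-stage idea in this abstract (a k-Nearest Neighbor base model followed by rules that filter false positives and recover false negatives) can be sketched as follows. Everything here is an illustrative assumption: the Jaccard similarity, the dictionary-based rule format and the semantics given to positive and negative terms are not taken from the paper, which defines its own rule language.

```python
from collections import Counter


def tokens(text):
    """Lowercased bag of words as a set (a deliberately simple representation)."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def knn_categories(text, labeled_corpus, k=3):
    """Return the category or categories with the most votes among the k nearest documents."""
    query = tokens(text)
    ranked = sorted(labeled_corpus,
                    key=lambda item: jaccard(query, tokens(item[0])),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    top = max(votes.values())
    return {label for label, count in votes.items() if count == top}


def apply_rules(text, categories, rules):
    """Post-process the base model's output with term lists.

    Assumed semantics: a negative term removes a category (false positive),
    a positive term adds it (false negative).
    """
    lowered = text.lower()
    result = set(categories)
    for category, rule in rules.items():
        if any(term in lowered for term in rule.get("negative", [])):
            result.discard(category)
        elif any(term in lowered for term in rule.get("positive", [])):
            result.add(category)
    return result
```

Keeping the rules outside the trained model is what allows a noisy category to be tuned by editing a term list, without retraining.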

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Objective: This study assessed the efficacy of a closed-loop (CL) system consisting of a predictive rule-based algorithm (pRBA) on achieving nocturnal and postprandial normoglycemia in patients with type 1 diabetes mellitus (T1DM). The algorithm is personalized for each patient’s data using two different strategies to control nocturnal and postprandial periods. Research Design and Methods: We performed a randomized crossover clinical study in which 10 T1DM patients treated with continuous subcutaneous insulin infusion (CSII) spent two nonconsecutive nights in the research facility: one with their usual CSII pattern (open-loop [OL]) and one controlled by the pRBA (CL). The CL period lasted from 10 p.m. to 10 a.m., including overnight control, and control of breakfast. Venous samples for blood glucose (BG) measurement were collected every 20 min. Results: Time spent in normoglycemia (BG, 3.9–8.0 mmol/L) during the nocturnal period (12 a.m.–8 a.m.), expressed as median (interquartile range), increased from 66.6% (8.3–75%) with OL to 95.8% (73–100%) using the CL algorithm (P<0.05). Median time in hypoglycemia (BG, <3.9 mmol/L) was reduced from 4.2% (0–21%) in the OL night to 0.0% (0.0–0.0%) in the CL night (P<0.05). Nine hypoglycemic events (<3.9 mmol/L) were recorded with OL compared with one using CL. The postprandial glycemic excursion was not lower when the CL system was used in comparison with conventional preprandial bolus: time in target (3.9–10.0 mmol/L) 58.3% (29.1–87.5%) versus 50.0% (50–100%). Conclusions: A highly precise personalized pRBA obtains nocturnal normoglycemia, without significant hypoglycemia, in T1DM patients. There appears to be no clear benefit of CL over prandial bolus on the postprandial glycemia
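The time-in-range figures reported in this abstract are percentages of the glucose samples that fall inside a target band. A minimal sketch of that calculation follows, assuming equally spaced samples (the study sampled every 20 min); the function names and the sample values in the usage note are hypothetical and are not study data.

```python
def time_in_range(bg_mmol_l, low=3.9, high=8.0):
    """Percentage of samples with low <= BG <= high (mmol/L).

    Assumes equally spaced samples, so sample share equals time share.
    """
    if not bg_mmol_l:
        raise ValueError("at least one sample is required")
    inside = sum(1 for value in bg_mmol_l if low <= value <= high)
    return 100.0 * inside / len(bg_mmol_l)


def time_below(bg_mmol_l, threshold=3.9):
    """Percentage of samples strictly below the hypoglycemia threshold."""
    if not bg_mmol_l:
        raise ValueError("at least one sample is required")
    below = sum(1 for value in bg_mmol_l if value < threshold)
    return 100.0 * below / len(bg_mmol_l)
```

For example, `time_in_range([3.5, 4.0, 6.0, 8.0, 9.1, 7.2])` counts four of six made-up samples inside 3.9–8.0 mmol/L; passing `high=10.0` gives the postprandial target band used in the abstract.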

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Type 1 diabetes mellitus implies a life-threatening absolute insulin deficiency. The artificial pancreas (CGM sensor, insulin pump and control algorithm) promises to outperform current open-loop therapies.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The PhD thesis CONTRIBUCIÓN AL ESTUDIO DE DOS CONCEPTOS BÁSICOS DE LA LÓGICA FUZZY presents new contributions to the analysis of two basic elements of fuzzy logic: inference mechanisms and the representation of vague predicates. The dissertation is divided into two parts, one for each of these aspects. Part I studies the basic concept of a «fuzzy logic state». A fuzzy logic state is a fixed point of the mapping generated by the rule of inference known as generalized modus ponens. Moreover, a fuzzy preorder can be represented by the elementary preorders generated by the set of its fuzzy logic states. Chapter 1 characterizes when two logic states give rise to the same elementary preorder, and also obtains a representative of the class of all logic states that generate the same elementary preorder. The chapter ends with the characterization of the set of fuzzy logic states of an elementary preorder. In Chapter 2 a trapezoidal fuzzy set is obtained as a class of an indistinguishability relation. Finally, Chapter 3 studies two types of classical logic states: irreducible and minimal ones.

Chapter 4, which opens Part II, addresses the problem of obtaining the compatibility function of a vague predicate. When the use of the predicate is known through a set of rules and certain distinguished elements, the proposed method yields a general expression for the generalized membership function of a fuzzy subset that acts as the extension of the fuzzy predicate. In some cases this method also allows a set of multivalued connectives associated with the predicate to be defined. The last chapter studies the representation of antonyms and synonyms in fuzzy logic by means of automorphisms. The automorphisms of the unit interval [0,1] are characterized when it is endowed with two operations, a t-norm and a t-conorm, both Archimedean; it is shown that in this case the automorphism group reduces to the identity function.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper, we axiomatically introduce fuzzy multi-measures on bounded lattices. In particular, we make a distinction between four different types of fuzzy set multi-measures on a universe X, considering both the usual or inverse real number ordering of this lattice and increasing or decreasing monotonicity with respect to the number of arguments. We provide results from which we can derive families of measures that hold for the applicable conditions in each case.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This work describes a semantic extension for a user-smart object interaction model based on the ECA paradigm (Event-Condition-Action). In this approach, smart objects publish their sensing (event) and action capabilities in the cloud, and mobile devices are prepared to retrieve them and act as mediators to configure personalized behaviours for the objects. In this paper, the information handled by this interaction system has been shaped according to several semantic models that, together with the integration of an embedded ontological and rule-based reasoner, are exploited in order to (i) automatically detect incompatible ECA rule configurations and (ii) support complex ECA rule definitions and execution. This semantic extension may significantly improve the management of smart spaces populated with numerous smart objects from mobile personal devices, as it facilitates the configuration of coherent ECA rules.
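To make the notion of an incompatible ECA rule configuration concrete, here is a deliberately naive sketch. The `EcaRule` fields, the `OPPOSITES` table and the pairwise check are hypothetical; the work described relies on semantic models and an embedded ontological and rule-based reasoner, which this string comparison does not attempt to reproduce.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class EcaRule:
    event: str      # what the smart object senses
    condition: str  # guard evaluated when the event fires
    target: str     # smart object that performs the action
    action: str     # capability invoked on the target


# Hypothetical pairs of actions that cannot both be applied to one target.
OPPOSITES = {
    frozenset({"turn_on", "turn_off"}),
    frozenset({"open", "close"}),
}


def incompatible(a, b):
    """Two rules clash if the same trigger drives opposite actions on one target."""
    same_trigger = a.event == b.event and a.condition == b.condition
    same_target = a.target == b.target
    return same_trigger and same_target and frozenset({a.action, b.action}) in OPPOSITES


def find_conflicts(rules):
    """Return every incompatible pair in a rule configuration."""
    return [(a, b) for a, b in combinations(rules, 2) if incompatible(a, b)]
```

A reasoner-based approach can also catch conflicts that are only implied, for instance through subsumed conditions, which exact string equality misses.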

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Type-2 fuzzy sets (T2FSs) were introduced by L.A. Zadeh in 1975 [65] as an extension of type-1 fuzzy sets (FSs). Whereas in FSs the degree of membership of an element in the set is a value in the interval [0,1], in T2FSs the degree of membership of an element is itself a fuzzy set in [0,1]; that is, a T2FS is determined by a membership function μ : X → M, where M = [0,1]^[0,1] = Map([0,1], [0,1]) is the set of functions from [0,1] to [0,1] (see [39], [42], [43], [61]). Since T2FSs were introduced, many of the definitions, operations, properties and results known for FSs have been generalized to them (see, for example, [39], [42], [43], [61]) by means of Zadeh’s Extension Principle [65] (see Theorem 1.1). However, as in any area of research, many gaps and open problems remain, and they are a challenge for anyone who wants to study this field in depth. The present work takes up this challenge and makes significant progress in filling gaps in the theory of T2FSs, especially regarding the properties of self-contradiction and N-self-contradiction and the operations of negation, t-norm (triangular norm) and t-conorm on T2FSs. Walker and Walker justify in [61] that the operations on the T2FSs (Map(X,M)) can be defined naturally from the operations on M and satisfy the same properties. Therefore, for simplicity, this work takes M and some of its subsets as the object of study, instead of Map(X,M).

Regarding negation, the operation usually employed in the framework of T2FSs to represent negation on M is one associated with the standard negation on [0,1]. However, that operation does not satisfy the axioms that any operation should intuitively satisfy to be considered a negation on M. This work presents the axioms of negation and strong negation for T2FSs. It also defines an operation associated with any surjective negation on [0,1], including the standard negation, and studies, among other properties, whether it is a negation and a strong negation on L (the set of normal and convex functions of M). In addition, it determines under which conditions De Morgan’s laws hold for a wide set of pairs of binary operations on M.

The properties of N-self-contradiction and self-contradiction have been extensively studied for FSs and for Atanassov’s intuitionistic fuzzy sets (AIFSs). This work begins the study of these properties in the framework of T2FSs whose membership degrees are in L. To that end, the concepts of N-self-contradiction and self-contradiction are extended to L, and some criteria for verifying these properties are established.

As for other operations, Walker et al. ([61], [63]) defined two families of binary operations on M and determined that, under certain conditions, these operations are t-norms or t-conorms on L. This work introduces binary operations on M, some more general than and others different from those given by Walker et al., and studies several of their properties, including the minimal conditions under which they satisfy each of the t-norm and t-conorm axioms, in order to derive new t-norms and t-conorms on L.
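As background for the De Morgan question raised in this abstract, the sketch below checks De Morgan’s law numerically in the much simpler type-1 setting, where membership degrees are plain numbers in [0,1]. It does not implement the operations on M or L studied in the thesis; the function names and the grid-based check are assumptions of this illustration, and a grid check is evidence, not a proof.

```python
def t_min(a, b):
    """Minimum t-norm."""
    return min(a, b)


def s_max(a, b):
    """Maximum t-conorm."""
    return max(a, b)


def t_prod(a, b):
    """Product t-norm."""
    return a * b


def s_prob(a, b):
    """Probabilistic sum t-conorm."""
    return a + b - a * b


def standard_negation(a):
    """Standard negation on [0,1]."""
    return 1.0 - a


def de_morgan_holds(t_norm, t_conorm, negation, grid_size=11, tol=1e-9):
    """Check N(T(a, b)) == S(N(a), N(b)) on a uniform grid of [0,1]^2."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return all(
        abs(negation(t_norm(a, b)) - t_conorm(negation(a), negation(b))) <= tol
        for a in grid
        for b in grid
    )
```

With the standard negation, the pairs (minimum, maximum) and (product, probabilistic sum) pass the check, while the mismatched pair (minimum, probabilistic sum) fails at a = b = 0.5, which shows why the law has to be studied pair by pair.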

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Embedded context management in resource-constrained devices (e.g. mobile phones, autonomous sensors or smart objects) imposes special requirements in terms of lightness for data modelling and reasoning. In this paper, we explore the state of the art in data representation and reasoning tools for embedded mobile reasoning and propose a light inference system (LIS) that aims to simplify embedded inference processes by offering a set of functionalities that avoid redundancy in context management operations. The system is part of a service-oriented mobile software framework, conceived to facilitate the creation of context-aware applications: it decouples sensor data acquisition and context processing from the application logic. LIS, composed of several modules, encapsulates existing lightweight tools for ontology data management and rule-based reasoning, and it is ready to run on Java-enabled handheld devices. Data management and reasoning processes are designed to handle a general ontology that enables communication among framework components. Both the applications running on top of the framework and the framework components themselves can configure the rule and query sets in order to retrieve the information they need from LIS. In order to test LIS features in a real application scenario, an ‘Activity Monitor’ has been designed and implemented: a personal health-persuasive application that provides feedback on the user’s lifestyle, combining data from physical and virtual sensors. In this use case, LIS is used to evaluate the user’s activity level in a timely manner, to decide on the convenience of triggering notifications and to determine the best interface or channel to deliver these context-aware alerts.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper describes the development of an Advanced Speech Communication System for Deaf People and its field evaluation in a real application domain: the renewal of Driver’s License. The system is composed of two modules. The first one is a Spanish into Spanish Sign Language (LSE: Lengua de Signos Española) translation module made up of a speech recognizer, a natural language translator (for converting a word sequence into a sequence of signs), and a 3D avatar animation module (for playing back the signs). The second module is a Spoken Spanish generator from sign-writing composed of a visual interface (for specifying a sequence of signs), a language translator (for generating the sequence of words in Spanish), and finally, a text to speech converter. For language translation, the system integrates three technologies: an example-based strategy, a rule-based translation method and a statistical translator. This paper also includes a detailed description of the evaluation carried out in the Local Traffic Office in the city of Toledo (Spain) involving real government employees and deaf people. This evaluation includes objective measurements from the system and subjective information from questionnaires. Finally, the paper reports an analysis of the main problems and a discussion about possible solutions.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper we present a novel Radio Frequency Identification (RFID) system for accurate indoor localization. The system is composed of a standard Ultra High Frequency (UHF), ISO 18000-6C compliant RFID reader, a large set of standard passive RFID tags whose locations are known, and a newly developed tag-like RFID component that is attached to the items that need to be localized. The new semi-passive component, referred to as sensatag (sense-a-tag), has a dual functionality: it can sense the communication between the reader and standard tags in its proximity, and it can also communicate with the reader like standard tags using backscatter modulation. Based on the information conveyed by the sensatags to the reader, localization algorithms based on binary sensor principles can be developed. We present results from real measurements that show the accuracy of the proposed system.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper describes a novel method to enhance current airport surveillance systems used in Advanced Surface Movement Guidance and Control Systems (A-SMGCS). The proposed method allows for the automatic calibration of measurement models and enhanced detection of nonideal situations, increasing the integrity of surveillance products. It is based on the definition of a set of observables from the surveillance processing chain and a rule-based expert system aimed at changing the data processing methods.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

A land classification method was designed for the Community of Madrid (CM), which has lands suitable for either agricultural use or natural spaces. The process started from an extensive previous CM study that contains sets of land attributes with data for 122 types and a minimum-requirements method providing a land quality classification (SQ) for each land. Borrowing some tools from Operations Research (OR) and from Decision Science, that SQ has been complemented by an additive valuation method that involves a more restricted set of 13 representative attributes analysed using Attribute Valuation Functions to obtain a quality index, QI, and by an original composite method that uses a fuzzy set procedure to obtain a combined quality index, CQI, that contains relevant information from both the SQ and the QI methods.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

OntoTag - A Linguistic and Ontological Annotation Model Suitable for the Semantic Web 1. INTRODUCTION. LINGUISTIC TOOLS AND ANNOTATIONS: THEIR LIGHTS AND SHADOWS Computational Linguistics is already a consolidated research area. It builds upon the results of other two major ones, namely Linguistics and Computer Science and Engineering, and it aims at developing computational models of human language (or natural language, as it is termed in this area). Possibly, its most well-known applications are the different tools developed so far for processing human language, such as machine translation systems and speech recognizers or dictation programs. These tools for processing human language are commonly referred to as linguistic tools. Apart from the examples mentioned above, there are also other types of linguistic tools that perhaps are not so well-known, but on which most of the other applications of Computational Linguistics are built. These other types of linguistic tools comprise POS taggers, natural language parsers and semantic taggers, amongst others. All of them can be termed linguistic annotation tools. Linguistic annotation tools are important assets. In fact, POS and semantic taggers (and, to a lesser extent, also natural language parsers) have become critical resources for the computer applications that process natural language. Hence, any computer application that has to analyse a text automatically and ‘intelligently’ will include at least a module for POS tagging. The more an application needs to ‘understand’ the meaning of the text it processes, the more linguistic tools and/or modules it will incorporate and integrate. However, linguistic annotation tools have still some limitations, which can be summarised as follows: 1. Normally, they perform annotations only at a certain linguistic level (that is, Morphology, Syntax, Semantics, etc.). 2. They usually introduce a certain rate of errors and ambiguities when tagging. 
This error rate ranges from 10 percent up to 50 percent of the units annotated for unrestricted, general texts. 3. Their annotations are most frequently formulated in terms of an annotation schema designed and implemented ad hoc. A priori, it seems that the interoperation and the integration of several linguistic tools into an appropriate software architecture could most likely solve the limitations stated in (1). Besides, integrating several linguistic annotation tools and making them interoperate could also minimise the limitation stated in (2). Nevertheless, in the latter case, all these tools should produce annotations for a common level, which would have to be combined in order to correct their corresponding errors and inaccuracies. Yet, the limitation stated in (3) prevents both types of integration and interoperation from being easily achieved. In addition, most high-level annotation tools rely on other lower-level annotation tools and their outputs to generate their own ones. For example, sense-tagging tools (operating at the semantic level) often use POS taggers (operating at a lower level, i.e., the morphosyntactic) to identify the grammatical category of the word or lexical unit they are annotating. Accordingly, if a faulty or inaccurate low-level annotation tool is to be used by other higher-level one in its process, the errors and inaccuracies of the former should be minimised in advance. Otherwise, these errors and inaccuracies would be transferred to (and even magnified in) the annotations of the high-level annotation tool. Therefore, it would be quite useful to find a way to (i) correct or, at least, reduce the errors and the inaccuracies of lower-level linguistic tools; (ii) unify the annotation schemas of different linguistic annotation tools or, more generally speaking, make these tools (as well as their annotations) interoperate. 
Clearly, solving (i) and (ii) should ease the automatic annotation of web pages by means of linguistic tools, and their transformation into Semantic Web pages (Berners-Lee, Hendler and Lassila, 2001). Yet, as stated above, (ii) is a type of interoperability problem. There again, ontologies (Gruber, 1993; Borst, 1997) have been successfully applied thus far to solve several interoperability problems. Hence, ontologies should also help solve the aforementioned problems and limitations of linguistic annotation tools. Thus, to summarise, the main aim of the present work was to combine these separate approaches, mechanisms and tools for annotation from Linguistics and Ontological Engineering (and the Semantic Web) in a hybrid (linguistic and ontological) annotation model, suitable for both areas. This hybrid (semantic) annotation model should (a) benefit from the advances, models, techniques, mechanisms and tools of these two areas; (b) minimise (and even solve, when possible) some of the problems found in each of them; and (c) be suitable for the Semantic Web. The concrete goals that helped attain this aim are presented in the following section. 2. GOALS OF THE PRESENT WORK As mentioned above, the main goal of this work was to specify a hybrid (that is, linguistically-motivated and ontology-based) model of annotation suitable for the Semantic Web (i.e. it had to produce a semantic annotation of web page contents). 
This entailed that the tags included in the annotations of the model had to (1) represent linguistic concepts (or linguistic categories, as they are termed in ISO/DCR (2008)), in order for this model to be linguistically-motivated; (2) be ontological terms (i.e., use an ontological vocabulary), in order for the model to be ontology-based; and (3) be structured (linked) as a collection of ontology-based triples, as in the usual Semantic Web languages (namely RDF(S) and OWL), in order for the model to be considered suitable for the Semantic Web. Besides, to be useful for the Semantic Web, this model should provide a way to automate the annotation of web pages. As for the present work, this requirement involved reusing the linguistic annotation tools purchased by the OEG research group (http://www.oeg-upm.net), but solving beforehand (or, at least, minimising) some of their limitations. Therefore, this model had to minimise these limitations by means of the integration of several linguistic annotation tools into a common architecture. Since this integration required the interoperation of tools and their annotations, ontologies were proposed as the main technological component to make them effectively interoperate. From the very beginning, it seemed that the formalisation of the elements and the knowledge underlying linguistic annotations within an appropriate set of ontologies would be a great step forward towards the formulation of such a model (henceforth referred to as OntoTag). Obviously, first, to combine the results of the linguistic annotation tools that operated at the same level, their annotation schemas had to be unified (or, preferably, standardised) in advance. This entailed the unification (or standardisation) of their tags (both their representation and their meaning), and their format or syntax. 
Second, to merge the results of the linguistic annotation tools operating at different levels, their respective annotation schemas had to be (a) made interoperable and (b) integrated. And third, in order for the resulting annotations to suit the Semantic Web, they had to be specified by means of an ontology-based vocabulary, and structured by means of ontology-based triples, as hinted above. Therefore, a new annotation scheme had to be devised, based both on ontologies and on this type of triples, which allowed for the combination and the integration of the annotations of any set of linguistic annotation tools. This annotation scheme was considered a fundamental part of the model proposed here, and its development was, accordingly, another major objective of the present work. All these goals, aims and objectives could be re-stated more clearly as follows: Goal 1: Development of a set of ontologies for the formalisation of the linguistic knowledge relating linguistic annotation. Sub-goal 1.1: Ontological formalisation of the EAGLES (1996a; 1996b) de facto standards for morphosyntactic and syntactic annotation, in a way that helps respect the triple structure recommended for annotations in these works (which is isomorphic to the triple structures used in the context of the Semantic Web). Sub-goal 1.2: Incorporation into this preliminary ontological formalisation of other existing standards and standard proposals relating the levels mentioned above, such as those currently under development within ISO/TC 37 (the ISO Technical Committee dealing with Terminology, which deals also with linguistic resources and annotations). Sub-goal 1.3: Generalisation and extension of the recommendations in EAGLES (1996a; 1996b) and ISO/TC 37 to the semantic level, for which no ISO/TC 37 standards have been developed yet. 
Sub-goal 1.4: Ontological formalisation of the generalisations and/or extensions obtained in the previous sub-goal as generalisations and/or extensions of the corresponding ontology (or ontologies). Sub-goal 1.5: Ontological formalisation of the knowledge required to link, combine and unite the knowledge represented in the previously developed ontology (or ontologies). Goal 2: Development of OntoTag’s annotation scheme, a standard-based abstract scheme for the hybrid (linguistically-motivated and ontology-based) annotation of texts. Sub-goal 2.1: Development of the standard-based morphosyntactic annotation level of OntoTag’s scheme. This level should include, and possibly extend, the recommendations of EAGLES (1996a) and also the recommendations included in the ISO/MAF (2008) standard draft. Sub-goal 2.2: Development of the standard-based syntactic annotation level of the hybrid abstract scheme. This level should include, and possibly extend, the recommendations of EAGLES (1996b) and the ISO/SynAF (2010) standard draft. Sub-goal 2.3: Development of the standard-based semantic annotation level of OntoTag’s (abstract) scheme. Sub-goal 2.4: Development of the mechanisms for a convenient integration of the three annotation levels already mentioned. These mechanisms should take into account the recommendations included in the ISO/LAF (2009) standard draft. Goal 3: Design of OntoTag’s (abstract) annotation architecture, an abstract architecture for the hybrid (semantic) annotation of texts (i) that facilitates the integration and interoperation of different linguistic annotation tools, and (ii) whose results comply with OntoTag’s annotation scheme. Sub-goal 3.1: Specification of the decanting processes that allow for the classification and separation, according to their corresponding levels, of the results of the linguistic tools annotating at several different levels. 
Sub-goal 3.2: Specification of the standardisation processes that allow (a) complying with the standardisation requirements of OntoTag’s annotation scheme, as well as (b) combining the results of those linguistic tools that share some level of annotation. Sub-goal 3.3: Specification of the merging processes that allow for the combination of the output annotations and the interoperation of those linguistic tools that share some level of annotation. Sub-goal 3.4: Specification of the merge processes that allow for the integration of the results and the interoperation of those tools performing their annotations at different levels. Goal 4: Generation of OntoTagger’s schema, a concrete instance of OntoTag’s abstract scheme for a concrete set of linguistic annotations. These linguistic annotations result from the tools and the resources available in the research group, namely • Bitext’s DataLexica (http://www.bitext.com/EN/datalexica.asp), • LACELL’s (POS) tagger (http://www.um.es/grupos/grupo-lacell/quees.php), • Connexor’s FDG (http://www.connexor.eu/technology/machinese/glossary/fdg/), and • EuroWordNet (Vossen et al., 1998). This schema should help evaluate OntoTag’s underlying hypotheses, stated below. Consequently, it should implement, at least, those levels of the abstract scheme dealing with the annotations of the set of tools considered in this implementation. This includes the morphosyntactic, the syntactic and the semantic levels. Goal 5: Implementation of OntoTagger’s configuration, a concrete instance of OntoTag’s abstract architecture for this set of linguistic tools and annotations. This configuration (1) had to use the schema generated in the previous goal; and (2) should help support or refute the hypotheses of this work as well (see the next section). 
Sub-goal 5.1: Implementation of the decanting processes that facilitate the classification and separation of the results of those linguistic resources that provide annotations at several different levels (on the one hand, LACELL’s tagger operates at the morphosyntactic level and, minimally, also at the semantic level; on the other hand, FDG operates at the morphosyntactic and the syntactic levels and, minimally, at the semantic level as well). Sub-goal 5.2: Implementation of the standardisation processes that allow (i) specifying the results of those linguistic tools that share some level of annotation according to the requirements of OntoTagger’s schema, as well as (ii) combining these shared level results. In particular, all the tools selected perform morphosyntactic annotations and they had to be conveniently combined by means of these processes. Sub-goal 5.3: Implementation of the merging processes that allow for the combination (and possibly the improvement) of the annotations and the interoperation of the tools that share some level of annotation (in particular, those relating the morphosyntactic level, as in the previous sub-goal). Sub-goal 5.4: Implementation of the merging processes that allow for the integration of the different standardised and combined annotations aforementioned, relating all the levels considered. Sub-goal 5.5: Improvement of the semantic level of this configuration by adding a named entity recognition, (sub-)classification and annotation subsystem, which also uses the named entities annotated to populate a domain ontology, in order to provide a concrete application of the present work in the two areas involved (the Semantic Web and Corpus Linguistics). 3. 
MAIN RESULTS: ASSESSMENT OF ONTOTAG’S UNDERLYING HYPOTHESES The model developed in the present thesis tries to shed some light on (i) whether linguistic annotation tools can effectively interoperate; (ii) whether their results can be combined and integrated; and, if they can, (iii) how they can, respectively, interoperate and be combined and integrated. Accordingly, several hypotheses had to be supported (or rejected) by the development of the OntoTag model and OntoTagger (its implementation). The hypotheses underlying OntoTag are surveyed below. Only one of the hypotheses (H.6) was rejected; the other five could be confirmed. H.1 The annotations of different levels (or layers) can be integrated into a sort of overall, comprehensive, multilayer and multilevel annotation, so that their elements can complement and refer to each other. • CONFIRMED by the development of: o OntoTag’s annotation scheme, o OntoTag’s annotation architecture, o OntoTagger’s (XML, RDF, OWL) annotation schemas, o OntoTagger’s configuration. H.2 Tool-dependent annotations can be mapped onto a sort of tool-independent annotations and, thus, can be standardised. • CONFIRMED by means of the standardisation phase incorporated into OntoTag and OntoTagger for the annotations yielded by the tools. H.3 Standardisation should ease: H.3.1: The interoperation of linguistic tools. H.3.2: The comparison, combination (at the same level and layer) and integration (at different levels or layers) of annotations. • H.3 was CONFIRMED by means of the development of OntoTagger’s ontology-based configuration: o Interoperation, comparison, combination and integration of the annotations of three different linguistic tools (Connexor’s FDG, Bitext’s DataLexica and LACELL’s tagger); o Integration of EuroWordNet-based, domain-ontology-based and named entity annotations at the semantic level. o Integration of morphosyntactic, syntactic and semantic annotations. 
H.4 Ontologies and Semantic Web technologies (can) play a crucial role in the standardisation of linguistic annotations, by providing consensual vocabularies and standardised formats for annotation (e.g., RDF triples). • CONFIRMED by means of the development of OntoTagger’s RDF-triple-based annotation schemas. H.5 The rate of errors introduced by a linguistic tool at a given level, when annotating, can be reduced automatically by contrasting and combining its results with the ones coming from other tools, operating at the same level. However, these other tools might be built following a different technological (stochastic vs. rule-based, for example) or theoretical (dependency vs. HPS-grammar-based, for instance) approach. • CONFIRMED by the results yielded by the evaluation of OntoTagger. H.6 Each linguistic level can be managed and annotated independently. • REJECTED on the basis of OntoTagger’s experiments and the dependencies observed among the morphosyntactic annotations, and between them and the syntactic annotations. In fact, Hypothesis H.6 was already rejected when OntoTag’s ontologies were developed. We observed then that several linguistic units stand on an interface between levels, belonging thereby to both of them (such as morphosyntactic units, which belong to both the morphological level and the syntactic level). Therefore, the annotations of these levels overlap and cannot be handled independently when merged into a single multilevel annotation. 4. OTHER MAIN RESULTS AND CONTRIBUTIONS First, interoperability is a hot topic for both the linguistic annotation community and the whole Computer Science field. The specification (and implementation) of OntoTag’s architecture for the combination and integration of linguistic (annotation) tools and annotations by means of ontologies shows a way to make these different linguistic annotation tools and annotations interoperate in practice. 
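Hypotheses H.2 and H.5 can be illustrated with a small sketch. It assumes three hypothetical taggers whose tool-specific POS tags are first mapped onto a shared tagset (standardisation) and then combined by a simple majority vote. The tool names, tag mappings and voting rule are illustrative assumptions only; this summary does not describe OntoTagger's actual combination rules.

```python
from collections import Counter

# Hypothetical mappings from tool-specific tags to a shared tagset.
# None of these tag names come from the tools named in the text.
TAG_MAPS = {
    "tool_a": {"N": "Noun", "V": "Verb", "A": "Adjective"},
    "tool_b": {"NN": "Noun", "VB": "Verb", "JJ": "Adjective"},
    "tool_c": {"noun": "Noun", "verb": "Verb", "adj": "Adjective"},
}


def standardise(tool, tag):
    """Map a tool-dependent tag onto the shared tagset (None if unknown)."""
    return TAG_MAPS[tool].get(tag)


def combine(token_annotations):
    """Majority vote over the standardised tags proposed for one token.

    token_annotations maps a tool name to that tool's own tag.
    Unknown tags are ignored; ties go to the tag that was seen first.
    """
    votes = Counter()
    for tool, tag in token_annotations.items():
        std = standardise(tool, tag)
        if std is not None:
            votes[std] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]
```

For example, `combine({"tool_a": "N", "tool_b": "NN", "tool_c": "verb"})` returns `"Noun"`: two tools agree after standardisation, so the dissenting tag is outvoted. A real configuration would likely weight tools differently per sub-task rather than count votes equally.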
Second, as mentioned above, the elements involved in linguistic annotation were formalised in a set (or network) of ontologies (OntoTag’s linguistic ontologies). • On the one hand, OntoTag’s network of ontologies consists of − The Linguistic Unit Ontology (LUO), which includes a mostly hierarchical formalisation of the different types of linguistic elements (i.e., units) identifiable in a written text; − The Linguistic Attribute Ontology (LAO), which includes also a mostly hierarchical formalisation of the different types of features that characterise the linguistic units included in the LUO; − The Linguistic Value Ontology (LVO), which includes the corresponding formalisation of the different values that the attributes in the LAO can take; − The OIO (OntoTag’s Integration Ontology), which (i) includes the knowledge required to link, combine and unite the knowledge represented in the LUO, the LAO and the LVO; and (ii) can be viewed as a knowledge representation ontology that describes the most elementary vocabulary used in the area of annotation. 
• On the other hand, OntoTag’s ontologies incorporate the knowledge included in the different standards and recommendations for linguistic annotation released so far, such as those developed within the EAGLES and the SIMPLE European projects or by the ISO/TC 37 committee: − As far as morphosyntactic annotations are concerned, OntoTag’s ontologies formalise the terms in the EAGLES (1996a) recommendations and their corresponding terms within the ISO Morphosyntactic Annotation Framework (ISO/MAF, 2008) standard; − As for syntactic annotations, OntoTag’s ontologies incorporate the terms in the EAGLES (1996b) recommendations and their corresponding terms within the ISO Syntactic Annotation Framework (ISO/SynAF, 2010) standard draft; − Regarding semantic annotations, OntoTag’s ontologies generalise and extend the recommendations in EAGLES (1996a; 1996b) and, since no stable standards or standard drafts have been released for semantic annotation by ISO/TC 37 yet, they incorporate the terms in SIMPLE (2000) instead; − The terms coming from all these recommendations and standards were supplemented by those within the ISO Data Category Registry (ISO/DCR, 2008) and also of the ISO Linguistic Annotation Framework (ISO/LAF, 2009) standard draft when developing OntoTag’s ontologies. Third, we showed that the combination of the results of tools annotating at the same level can yield better results (both in precision and in recall) than each tool separately. In particular, 1. OntoTagger clearly outperformed two of the tools integrated into its configuration, namely DataLexica and FDG in all the combination sub-phases in which they overlapped (i.e. POS tagging, lemma annotation and morphological feature annotation). As far as the remaining tool is concerned, i.e. LACELL’s tagger, it was also outperformed by OntoTagger in POS tagging and lemma annotation, and it did not behave better than OntoTagger in the morphological feature annotation layer. 2. 
As an immediate result, this implies that a) This type of combination architecture can be applied in order to improve significantly the accuracy of linguistic annotations; and b) Concerning the morphosyntactic level, this could be regarded as a way of constructing more robust and more accurate POS tagging systems. Fourth, Semantic Web annotations are usually performed by humans or else by machine learning systems. Both of them leave much to be desired: the former, with respect to their annotation rate; the latter, with respect to their (average) precision and recall. In this work, we showed how linguistic tools can be wrapped in order to annotate automatically Semantic Web pages using ontologies. This entails their fast, robust and accurate semantic annotation. By way of example, as mentioned in Sub-goal 5.5, we developed a particular OntoTagger module for the recognition, classification and labelling of named entities, according to the MUC and ACE tagsets (Chinchor, 1997; Doddington et al., 2004). These tagsets were further specified by means of a domain ontology, namely the Cinema Named Entities Ontology (CNEO). This module was applied to the automatic annotation of ten different web pages containing cinema reviews (that is, around 5000 words). In addition, the named entities annotated with this module were also labelled as instances (or individuals) of the classes included in the CNEO and, then, were used to populate this domain ontology. • The statistical results obtained from the evaluation of this particular module of OntoTagger can be summarised as follows. On the one hand, as far as recall (R) is concerned, (R.1) the lowest value was 76.40% (for file 7); (R.2) the highest value was 97.50% (for file 3); and (R.3) the average value was 88.73%. 
On the other hand, as far as the precision rate (P) is concerned, (P.1) its minimum was 93.75% (for file 4); (P.2) its maximum was 100% (for files 1, 5, 7, 8, 9, and 10); and (P.3) its average value was 98.99%. • These results, which apply to the tasks of named entity annotation and ontology population, are extraordinarily good for both of them. They can be explained on the basis of the high accuracy of the annotations provided by OntoTagger at the lower levels (mainly at the morphosyntactic level). However, they should be conveniently qualified, since they might be too domain- and/or language-dependent. Further experiments are needed to determine how our approach works in a different domain or a different language, such as French, English, or German. • In any case, the results of this application of Human Language Technologies to Ontology Population (and, accordingly, to Ontological Engineering) seem very promising and encouraging in order for these two areas to collaborate and complement each other in the area of semantic annotation. Fifth, as shown in the State of the Art of this work, there are different approaches and models for the semantic annotation of texts, but all of them focus on a particular view of the semantic level. Clearly, all these approaches and models should be integrated in order to yield a coherent and joint semantic annotation level. OntoTag shows how (i) these semantic annotation layers could be integrated together; and (ii) they could be integrated with the annotations associated with other annotation levels. Sixth, we identified some recommendations, best practices and lessons learned for annotation standardisation, interoperation and merge. They show how standardisation (via ontologies, in this case) enables the combination, integration and interoperation of different linguistic tools and their annotations into a multilayered (or multileveled) linguistic annotation, which is one of the hot topics in the area of Linguistic Annotation. 
And last but not least, OntoTag’s annotation scheme and OntoTagger’s annotation schemas show a way to formalise and annotate coherently and uniformly the different units and features associated with the different levels and layers of linguistic annotation. This is a significant step towards the global standardisation of this area, which is the aim of ISO/TC 37 (in particular, Subcommittee 4, dealing with the standardisation of linguistic annotations and resources).
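
The triple-based structure of the annotations described above can be sketched as follows. The example expresses one linguistic unit as a set of (subject, predicate, object) triples, with the unit type, its attributes and their values drawn from three vocabularies named after the LUO, LAO and LVO ontologies. The prefixes `luo:`, `lao:` and `lvo:`, the identifier `ex:token_1` and the term names are hypothetical placeholders, not the actual OntoTag vocabulary.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A minimal (subject, predicate, object) statement, as in RDF."""
    subject: str
    predicate: str
    obj: str


def annotate(unit_id, unit_type, features):
    """Describe one linguistic unit as a list of triples.

    unit_type is looked up in a unit vocabulary (prefix luo:),
    each attribute in an attribute vocabulary (prefix lao:),
    and each value in a value vocabulary (prefix lvo:).
    All prefixes here are illustrative assumptions.
    """
    triples = [Triple(unit_id, "rdf:type", f"luo:{unit_type}")]
    for attribute, value in features.items():
        triples.append(Triple(unit_id, f"lao:{attribute}", f"lvo:{value}"))
    return triples


if __name__ == "__main__":
    for t in annotate("ex:token_1", "Noun",
                      {"gender": "Feminine", "number": "Singular"}):
        print(t.subject, t.predicate, t.obj)
```

Keeping units, attributes and values in separate vocabularies means that annotations from different tools can be compared term by term once each tool's tags have been mapped onto those vocabularies.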

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The most successful unfolding rules used nowadays in the partial evaluation of logic programs are based on well quasi orders (wqo) applied over (covering) ancestors, i.e., a subsequence of the atoms selected during a derivation. Ancestor (sub)sequences are used to increase the specialization power of unfolding while still guaranteeing termination and also to reduce the number of atoms for which the wqo has to be checked. Unfortunately, maintaining the structure of the ancestor relation during unfolding introduces significant overhead. We propose an efficient, practical local unfolding rule based on the notion of covering ancestors which can be used in combination with a wqo and allows a stack-based implementation without losing any opportunities for specialization. Using our technique, certain non-leftmost unfoldings are allowed as long as local unfolding is performed, i.e., we cover depth-first strategies. To deal with practical programs, we propose assertion-based techniques which allow our approach to treat programs that include (Prolog) built-ins and external predicates in a very extensible manner, for the case of leftmost unfolding. Finally, we report on our implementation of these techniques embedded in a practical partial evaluator, which shows that our techniques, in addition to dealing with practical programs, are also significantly more efficient in time and somewhat more efficient in memory than traditional tree-based implementations. To appear in Theory and Practice of Logic Programming (TPLP).
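
To make the role of a wqo concrete, the sketch below implements homeomorphic embedding, a well quasi order commonly used to control unfolding in partial evaluation, and uses it to decide whether an atom may be unfolded given its ancestors. The abstract does not say which wqo the authors use, so this choice is an assumption, as are the term encoding (strings for variables and constants, tuples for compound terms) and the function names. It does not reproduce the stack-based ancestor handling the authors propose.

```python
def is_var(t):
    """Variables are strings starting with an uppercase letter or '_'."""
    return isinstance(t, str) and (t[0].isupper() or t[0] == "_")


def functor_args(t):
    """Return (name, args) for a non-variable term.

    A constant is a lowercase string; a compound term is a tuple
    (functor, arg1, ..., argN).
    """
    if isinstance(t, str):
        return t, ()
    return t[0], t[1:]


def embeds(s, t):
    """True if s is homeomorphically embedded in t."""
    if is_var(s) and is_var(t):
        return True  # any variable embeds in any variable
    if is_var(t):
        return False  # a non-variable never embeds in a variable
    t_name, t_args = functor_args(t)
    # Diving: s is embedded in some argument of t.
    if any(embeds(s, arg) for arg in t_args):
        return True
    if is_var(s):
        return False
    # Coupling: same functor and arity, arguments embedded pairwise.
    s_name, s_args = functor_args(s)
    return (s_name == t_name and len(s_args) == len(t_args)
            and all(embeds(a, b) for a, b in zip(s_args, t_args)))


def may_unfold(atom, ancestors):
    """Allow unfolding only if no ancestor is embedded in the atom."""
    return not any(embeds(anc, atom) for anc in ancestors)
```

With this encoding, `may_unfold(("p", ("s", "Y")), [("p", "X")])` is `False`, because the ancestor `p(X)` is embedded in `p(s(Y))` and unfolding could grow without bound. Restricting the check to covering ancestors, as the abstract describes, shortens the `ancestors` sequence that has to be scanned.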