59 results for Simplification of Ontologies
at Universidad Politécnica de Madrid
Abstract:
Ontologies and taxonomies are widely used to organize concepts, providing the basis for activities such as indexing and serving as background knowledge for NLP tasks. As such, translating these resources would prove useful for adapting such systems to new languages. However, we show that the nature of these resources differs significantly from the "free-text" paradigm used to train most statistical machine translation systems. In particular, these resources differ linguistically from free text and carry rich additional semantics. We demonstrate that, as a result of these linguistic differences, standard SMT methods, and in particular evaluation metrics, can perform poorly. We then turn to the task of leveraging these semantics for translation, which we approach in three ways: by adapting the translation system to the domain of the resource; by examining whether semantics can help predict the syntactic structure used in translation; and by evaluating whether existing translated taxonomies can be used to disambiguate translations. We present early results from these experiments, which shed light on the degree of success we may expect from each approach.
Abstract:
In this paper we define the notion of an axiom dependency hypergraph, which explicitly represents how axioms are included in a module by the algorithm for computing locality-based modules. A locality-based module of an ontology corresponds to a set of connected nodes in the hypergraph, and the atoms of an ontology to its strongly connected components. Collapsing the strongly connected components into single nodes yields a condensed hypergraph that comprises a representation of the atomic decomposition of the ontology. To speed up the condensation of the hypergraph, we first reduce its size by collapsing the strongly connected components of its graph fragment using a linear-time graph algorithm. This approach significantly reduces the time needed to compute the atomic decomposition of an ontology. We provide an experimental evaluation of computing the atomic decomposition of large biomedical ontologies. We also demonstrate a significant improvement in the time needed to extract locality-based modules from an axiom dependency hypergraph and its condensed version.
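The condensation step described above can be illustrated on the directed-graph fragment of the hypergraph. Below is a minimal Python sketch, assuming plain directed edges and using Kosaraju's two-pass algorithm, one standard linear-time choice (the abstract does not prescribe a specific algorithm): strongly connected components are computed and then collapsed into single nodes.

```python
from collections import defaultdict

def strongly_connected_components(nodes, edges):
    """Kosaraju's two-pass algorithm: DFS on the graph records a finish
    order; DFS on the reversed graph in reverse finish order yields one
    SCC per traversal."""
    graph, rev = defaultdict(list), defaultdict(list)
    for u, v in edges:
        graph[u].append(v)
        rev[v].append(u)

    visited = set()

    def dfs(start, adj, out):
        # Iterative DFS that appends each node to `out` once finished.
        stack = [(start, iter(adj[start]))]
        visited.add(start)
        while stack:
            node, it = stack[-1]
            advanced = False
            for w in it:
                if w not in visited:
                    visited.add(w)
                    stack.append((w, iter(adj[w])))
                    advanced = True
                    break
            if not advanced:
                stack.pop()
                out.append(node)

    order = []
    for u in nodes:
        if u not in visited:
            dfs(u, graph, order)

    visited.clear()
    components = []
    for u in reversed(order):
        if u not in visited:
            comp = []
            dfs(u, rev, comp)
            components.append(comp)
    return components

def condense(nodes, edges):
    """Collapse each SCC into a single node and drop internal edges."""
    sccs = strongly_connected_components(nodes, edges)
    comp_of = {u: i for i, comp in enumerate(sccs) for u in comp}
    condensed_edges = {(comp_of[u], comp_of[v]) for u, v in edges
                       if comp_of[u] != comp_of[v]}
    return sccs, condensed_edges

# e.g. condense("abcd", [("a","b"), ("b","a"), ("b","c"), ("c","d")])
# yields SCCs {a,b}, {c}, {d} and the edges between them.
```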
Abstract:
Directed hypergraphs are an intuitive modelling formalism that has been used in problems related to propositional logic, relational databases, computational linguistics and machine learning. Directed hypergraphs are also presented as an alternative to directed (bipartite) graphs to facilitate the study of the interactions between components of complex systems that cannot naturally be modelled as binary relations. In this context, they are known as hyper-networks. A directed hypergraph is a generalization of a directed graph suitable for representing many-to-many relationships. While an edge in a directed graph defines a relation between two nodes of the graph, a hyperedge in a directed hypergraph defines a relation between two sets of nodes. Strong connectivity is an equivalence relation that induces a partition of the set of nodes of a directed hypergraph into strongly connected components. These components can be collapsed into single nodes. As a result, the size of the original hypergraph can be reduced significantly if the strongly connected components have many nodes. This approach can contribute to a better understanding of how the nodes of a hypergraph are connected, in particular when the hypergraphs are large. In the case of directed graphs, there are efficient algorithms that can be used to compute the strongly connected components of large graphs. For instance, it has been shown that the macroscopic structure of the World Wide Web can be represented as a "bow-tie" diagram in which more than 70% of the nodes are distributed into three large sets, one of which is a large strongly connected component.
This particular structure has also been observed in complex networks in other fields, such as biology. Similar studies could not be conducted on directed hypergraphs because no algorithms existed for computing the strongly connected components of such hypergraphs. In this thesis, we investigate how to compute the strongly connected components of directed hypergraphs. We present two new algorithms and show their correctness and computational complexity. One of these algorithms is inspired by Tarjan's algorithm for directed graphs. The second algorithm follows a simple approach based on the fact that two strongly connected nodes of a graph reach exactly the same nodes; in other words, their connected components coincide. Both algorithms are empirically evaluated to compare their performance. To this end, we have produced a selection of random directed hypergraphs inspired by well-known random graph models such as Erdős-Rényi, Newman-Watts-Strogatz and Barabási-Albert. Besides the application examples mentioned earlier, directed hypergraphs have also been employed in the field of knowledge representation. In particular, they have been used to compute the modules of an ontology. An ontology is defined as a collection of axioms that provides a formal specification of a set of terms and their relationships; a module is a subset of an ontology that completely captures the meaning of certain terms as defined in the ontology. We focus on modules computed using the notion of syntactic locality. As ontologies can be very large, the computation of modules facilitates their reuse and maintenance. Analysing all modules of an ontology, however, is in general not feasible, as the number of modules grows exponentially in the number of terms and axioms of the ontology. Nevertheless, the modules can be represented succinctly using the atomic decomposition of an ontology. Under this representation, an ontology is partitioned into atoms, which are maximal sets of axioms that co-occur in every module. The atomic decomposition is then defined as a directed graph in which each node corresponds to an atom and each edge represents a dependency relation between two atoms. In this thesis, we introduce the notion of an axiom dependency hypergraph, which generalizes the atomic decomposition of an ontology. A module in the ontology corresponds to a connected component in the hypergraph, and the atoms of the ontology to the strongly connected components. We apply our algorithms for directed hypergraphs to axiom dependency hypergraphs and, in this manner, compute the atoms of an ontology. To demonstrate the viability of this approach, we have implemented the algorithms in HyS, an application that computes the modules of ontologies and their atomic decomposition. In the thesis, we provide an experimental evaluation of HyS on a selection of large and prominent biomedical ontologies, most of which are available in the NCBO BioPortal. HyS outperforms state-of-the-art implementations in the tasks of extracting modules and computing the atomic decomposition of these ontologies.
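The reachability-based idea behind the second algorithm can be sketched naively in Python. The sketch below assumes B-connectivity semantics (a hyperedge "fires" only once all of its tail nodes are reachable), which is one common reachability notion for directed hypergraphs; the thesis algorithms are considerably more refined than this quadratic illustration.

```python
def b_reachable(source, hyperedges):
    """Nodes B-reachable from {source}: a hyperedge (tail, head) fires
    only when *all* of its tail nodes are already reachable."""
    reached = {source}
    changed = True
    while changed:
        changed = False
        for tail, head in hyperedges:
            if set(tail) <= reached and not set(head) <= reached:
                reached |= set(head)
                changed = True
    return reached

def scc_by_reachability(nodes, hyperedges):
    """Group mutually reachable nodes into components: u and v belong
    together exactly when each lies in the other's reachable set."""
    reach = {v: b_reachable(v, hyperedges) for v in nodes}
    components, assigned = [], set()
    for v in nodes:
        if v in assigned:
            continue
        comp = {u for u in reach[v] if v in reach[u]}
        components.append(comp)
        assigned |= comp
    return components

# Hyperedges relate node sets, e.g. ({"b","c"} -> {"a"}):
# scc_by_reachability("abc", [(("a",), ("b",)), (("a",), ("c",)),
#                             (("b", "c"), ("a",))])
```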
Abstract:
A growing number of ontologies are already available thanks to development initiatives in many different fields. In such developments, ontology developers must tackle a wide range of difficulties, which can result in anomalies in the resulting ontologies. Ontology evaluation therefore plays a key role in ontology development projects. OOPS! is an online tool that automatically detects pitfalls, understood as potential errors or problems, and thus may help ontology developers improve their ontologies. To gain insight into the existence of pitfalls, and to assess whether there are differences among ontologies developed by novices, a random set of already scanned ontologies, and existing well-known ones, 406 OWL ontologies were analysed against OOPS!'s 21 pitfalls; 24 of these ontologies were also examined manually for the detected pitfalls. The analyses performed show only minor differences between the three sets of ontologies, thereby providing a general landscape of pitfalls in ontologies.
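To give a flavour of what an automated pitfall check can look like, here is a minimal Python sketch using rdflib that flags OWL classes lacking an rdfs:label (missing human-readable annotations is one commonly cited pitfall). This is purely illustrative and is not OOPS!'s actual rule set or implementation.

```python
from rdflib import Graph
from rdflib.namespace import OWL, RDF, RDFS

def classes_missing_labels(ontology_path):
    """Return the OWL classes in the ontology that have no rdfs:label,
    in the spirit of an automated pitfall scanner (illustrative only)."""
    g = Graph()
    g.parse(ontology_path)  # rdflib guesses the serialization format
    return [cls for cls in g.subjects(RDF.type, OWL.Class)
            if (cls, RDFS.label, None) not in g]

# e.g. classes_missing_labels("myontology.owl")
```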
Abstract:
In the context of the Semantic Web, natural language descriptions associated with ontologies have proven to be of major importance, not only to support ontology developers and adopters, but also to assist in tasks such as ontology mapping, information extraction, and natural language generation. The state of the art includes some attempts to provide guidelines for URI local names in English, as well as some disagreement on the use of URIs for describing ontology elements. When trying to extrapolate these ideas to a multilingual scenario, some of these approaches fail to provide a valid solution. On the basis of real experiences in translating ontologies from English into Spanish, we provide a preliminary set of guidelines for naming and labeling ontologies in a multilingual scenario.
Abstract:
Web 2.0 applications enabled users to classify information resources using their own vocabularies. The bottom-up nature of these user-generated classification systems has turned them into interesting knowledge sources, since they provide a rich terminology generated by potentially large user communities. Previous research has shown that it is possible to elicit some emergent semantics from the aggregation of individual classifications in these systems. However, the generation of ontologies from them is still an open research problem. In this thesis we address the problem of how to tap into user-generated classification systems for building domain ontologies. Our objective is to design a method to develop domain ontologies from user-generated classification systems. To do so, we rely on ontologies in the Web of Data to formalize the semantics of the knowledge collected from the classification system. Current ontology development methodologies have recognized the importance of reusing knowledge from existing resources. Thus, our work is framed within the NeOn methodology scenario for building ontologies by reusing and reengineering non-ontological resources. The main contributions of this work are: (1) an integrated method to develop ontologies from user-generated classification systems, with which we extract a domain terminology from the classification system and then formalize the semantics of this terminology by reusing ontologies in the Web of Data; (2) the identification and adaptation of existing techniques for implementing the activities in the method so that they can fulfil the requirements of each activity; and (3) a novel study of emergent semantics in user-generated lists.
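To make the terminology-extraction step concrete, a minimal sketch: a tag becomes a candidate domain term once enough independent users have employed it. The threshold and the normalization below are illustrative assumptions; the thesis adapts more elaborate techniques for this activity.

```python
from collections import Counter

def candidate_terminology(user_classifications, min_users=3):
    """Extract candidate domain terms from user-generated classifications:
    tags used independently by at least `min_users` users pass a simple
    emergence threshold (illustrative, not the thesis's full method).
    user_classifications: one iterable of tag strings per user."""
    users_per_tag = Counter()
    for tags in user_classifications:
        # Count each user at most once per tag, after light normalization.
        for tag in {t.strip().lower() for t in tags}:
            users_per_tag[tag] += 1
    return [t for t, n in users_per_tag.most_common() if n >= min_users]
```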
Abstract:
Modeling and predicting the overall elastic–plastic response and local damage mechanisms in heterogeneous materials, in particular particle-reinforced composites, is a very complex problem. Microstructural complexities, such as the inhomogeneous spatial distribution of particles, irregular particle morphology, and anisotropy in particle orientation after secondary processing such as extrusion, significantly affect deformation behavior. We have studied the effect of particle/matrix interface debonding in SiC particle reinforced Al alloy matrix composites with (a) the actual microstructure, consisting of angular SiC particles, and (b) idealized ellipsoidal SiC particles. Tensile deformation in SiC particle reinforced Al matrix composites was modeled using actual microstructures reconstructed by a serial sectioning approach. Interfacial debonding was modeled using user-defined cohesive zone elements. Modeling with the actual microstructure (versus idealized ellipsoids) has a significant influence on (a) the localized stresses and strains in particle and matrix, and (b) the far-field strain at which localized debonding takes place. The angular particles exhibited a higher degree of load transfer and are more sensitive to interfacial debonding. Larger stress decreases are observed in the angular particles because of their flat surfaces normal to the loading axis, which bear load. Thus, simplification of particle morphology may lead to erroneous results.
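For readers unfamiliar with cohesive zone elements, the sketch below shows a generic bilinear traction-separation law of the kind such elements typically implement: traction rises linearly to a peak and then softens to zero at complete debonding. The functional form is a common textbook assumption, not necessarily the law used in the paper's user-defined elements.

```python
def bilinear_traction(delta, delta0, delta_f, t_max):
    """Bilinear traction-separation law for a cohesive interface.
    delta:   current separation
    delta0:  separation at peak traction t_max (end of elastic branch)
    delta_f: separation at complete debonding (zero traction)."""
    if delta <= 0.0:
        return 0.0
    if delta < delta0:
        return t_max * delta / delta0                       # elastic loading
    if delta < delta_f:
        return t_max * (delta_f - delta) / (delta_f - delta0)  # softening
    return 0.0                                              # fully debonded
```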
Abstract:
Aerodynamic design significantly influences several aspects of high-speed train performance. Given that new aerodynamic problems have also arisen with increased cruise speeds and lighter vehicles, the need for an optimization study of train aerodynamics is evident. This thesis therefore presents the aerodynamic optimization of the nose shape of a high-speed train, based on advanced optimization methods; among these, genetic algorithms and the adjoint method have been selected. A theoretical description of their foundations, characteristics and implementation is given in the thesis, explaining the reasons for their selection as well as the advantages and drawbacks of each. Genetic algorithms require the geometrical parameterization of every optimal candidate and the generation of a metamodel, or surrogate model, that completes the optimization process. These points are addressed with special attention in the first block of the thesis, which focuses on the methodology followed in this study. The second block concerns the use of these methods to optimize the aerodynamic performance of a high-speed train in several scenarios. These scenarios encompass the most representative operating conditions of high-speed trains, as well as some of the most demanding aerodynamic problems: front-wind and cross-wind situations in open air, and the entrance of a high-speed train into a tunnel. Genetic algorithms and the adjoint method have both been applied to minimize the aerodynamic drag on the train under front wind in open air. The comparison of these methods allows the methodology and computational cost of each to be evaluated, along with the resulting minimization of aerodynamic drag. Simplicity and robustness, the straightforward realization of multi-objective optimization, and the capability of finding a global optimum are the main attributes of genetic algorithms. However, the requirement to geometrically parameterize every optimal candidate is a significant drawback, which is avoided by the adjoint method: its independence from the number of design variables leads to a relevant reduction of pre-processing and computational cost. For cross-wind stability, both methods are again used to minimize the side force. In this case, a simplified geometric parameterization of the train nose is adopted, which dramatically reduces the computational cost of the optimization process while still describing the most important geometrical characteristics. This analysis identifies and quantifies the influence of each design variable on the side force on the train. It is observed that the A-pillar roundness is the most influential design parameter, with a more important effect than the nose length or the train cross-section area. Finally, a third scenario is considered to validate these methods in the aerodynamic optimization of a high-speed train: the entrance of a train into a tunnel, one of the most demanding train aerodynamic problems.
The aerodynamic consequences of high-speed trains running in a tunnel essentially come down to two correlated phenomena: the generation of pressure waves and an increase in aerodynamic drag. This multi-objective optimization problem is solved with genetic algorithms. The result is a Pareto front containing the set of optimal solutions that minimize both objectives.
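The Pareto front mentioned here is simply the set of non-dominated designs. A minimal Python sketch for two minimization objectives (the objective names are illustrative stand-ins for the thesis's drag and pressure-peak objectives):

```python
def pareto_front(designs):
    """Return the non-dominated designs for two minimization objectives.
    designs: list of (drag, pressure_peak) tuples; a design is dominated
    if another is no worse in both objectives and strictly better in one."""
    front = []
    for i, (f1, f2) in enumerate(designs):
        dominated = any(g1 <= f1 and g2 <= f2 and (g1 < f1 or g2 < f2)
                        for j, (g1, g2) in enumerate(designs) if j != i)
        if not dominated:
            front.append((f1, f2))
    return sorted(front)

# e.g. pareto_front([(1.0, 3.0), (2.0, 2.0), (3.0, 1.0), (3.0, 3.0)])
# keeps the first three designs and discards the dominated (3.0, 3.0).
```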
Abstract:
Multiple indicators are of interest in smart cities, at different scales and for different stakeholders. In open environments such as the Web, or when indicator information has to be interchanged across systems, contextual information (e.g., unit of measurement, measurement method) should be transmitted together with the data; the lack of such information might cause undesirable effects. Describing the data by means of ontologies increases interoperability among datasets and applications. However, methodological guidance is crucial during ontology development in order to transform the art of modelling into an engineering activity. In this paper, we present a methodological approach for modelling data about Key Performance Indicators and their context, with an application example of such guidelines.
Abstract:
The dissertation argues that the practice of architecture can find instrumental support in the practices of communication, overcoming any classical simplification of the use of media as merely superficial, post-produced or simply promotional. Cases from the last decades of the 20th century are presented, and it is detected that threats such as the risk of trivialization, the saturation of the public image, or the foreseeable wrong association with other individuals in group or thematic presentations may have encouraged a noticeable increase in the control taken by architects over their media opportunities. In other words, architecture seems to have started to overcome and optimize the inevitable: the fact that exhibition formulas and publications, or rather the practices of (self-)exhibition and (self-)publication, are tools at its disposal for activating some kind of intellectual management of the communication and information circulating about itself. This practice of "self-edition" is analysed in a specific period of the trajectory of OMA (Office for Metropolitan Architecture), an office considered a pioneer in the efficient, opportunistic and personalized use of media. The second part of the thesis then dissects its well-known monograph S,M,L,XL (1995), a volume in which its protagonists were deeply involved during the edition and design, and whose production process had barely been investigated. This publication marked a turning point in its genre, disrupting all previous formats and restrictions, and it has become such an emblematic volume for the discipline that no subsequent attempt at a replica has been able to surpass it. Here, the book is also presented as the trigger for the construction of a "big event" that concludes in the transformation of OMA's identity over ten years, paradoxically between the birth of the Groszstadt Foundation and the start of the activity of AMO, two key parallel entities attached to OMA. This position emerges from how the research reveals that S,M,L,XL is one more piece, central but not independent, within a sum of actions and individuals, as well as other publications, exhibitions, events, essayed articles and projects, in particular Bigness, Generic City, Euralille and the competitions of 1989. Significant aspects include the openness to multiple authorship, headed by Rem Koolhaas and the graphic designer Bruce Mau, who share the acknowledgements page with the editor Jennifer Sigler and nearly a hundred names whose contributions are not necessarily specific fragments of the book. The dissolution of certain limits also made it possible to go beyond the tasks initially considered relevant in the edition of a publication. A general goal of the thesis is likewise to open a debate on typically questioned relations, particularly between architecture and markets or the economy.
Using the idea of "design intelligence", outlined by Michael Speaks in 2001, the thesis extracts its essence: the interest in detecting the singularity, or particular intelligence, of every office of architecture and design. It then explores whether, in the construction of this kind of ingenious formulas, one can find interesting and productive combinations of issues such as efficiency and creativity, or organization and ideas. This dynamic of bidirectional relations, especially urgent in the present moment of information excess, grounds the proposal of a more evident equivalence between the "socialization" of the work of architecture, whenever it is shared in public and new conversations are introduced, and the inverse relation, working on the act of "socialization" itself. As if an awareness of the use of media could indeed be instrumental and contribute to the development of the practice of architecture, from an ideally committed and intellectual perspective.
Abstract:
In the beginning of the 1990s, ontology development was similar to an art: ontology developers did not have clear guidelines on how to build ontologies, only some design criteria to be followed. Work on principles, methods and methodologies, together with supporting technologies and languages, turned ontology development into an engineering discipline: Ontology Engineering. Ontology Engineering refers to the set of activities that concern the ontology development process and the ontology life cycle, the methods and methodologies for building ontologies, and the tool suites and languages that support them. Thanks to the work done in the Ontology Engineering field, the development of ontologies within and between teams has increased and improved, as has the possibility of reusing ontologies in other developments and in final applications. Currently, ontologies are widely used in (a) Knowledge Engineering, Artificial Intelligence and Computer Science, (b) applications related to knowledge management, natural language processing, e-commerce, intelligent information integration, information retrieval, database design and integration, bio-informatics and education, and (c) the Semantic Web, the Semantic Grid, and the Linked Data initiative. In this paper, we provide an overview of Ontology Engineering, covering the most prominent and widely used methodologies, languages, and tools for building ontologies. In addition, we comment on how all these elements can be used in the Linked Data initiative.
Abstract:
OntoTag - A Linguistic and Ontological Annotation Model Suitable for the Semantic Web
1. INTRODUCTION. LINGUISTIC TOOLS AND ANNOTATIONS: THEIR LIGHTS AND SHADOWS
Computational Linguistics is already a consolidated research area. It builds upon the results of two other major ones, namely Linguistics and Computer Science and Engineering, and it aims at developing computational models of human language (or natural language, as it is termed in this area). Possibly its best-known applications are the different tools developed so far for processing human language, such as machine translation systems and speech recognizers or dictation programs.
These tools for processing human language are commonly referred to as linguistic tools. Apart from the examples mentioned above, there are also other types of linguistic tools that perhaps are not so well-known, but on which most of the other applications of Computational Linguistics are built. These other types of linguistic tools comprise POS taggers, natural language parsers and semantic taggers, amongst others. All of them can be termed linguistic annotation tools.
Linguistic annotation tools are important assets. In fact, POS and semantic taggers (and, to a lesser extent, also natural language parsers) have become critical resources for the computer applications that process natural language. Hence, any computer application that has to analyse a text automatically and ‘intelligently’ will include at least a module for POS tagging. The more an application needs to ‘understand’ the meaning of the text it processes, the more linguistic tools and/or modules it will incorporate and integrate.
However, linguistic annotation tools have still some limitations, which can be summarised as follows:
1. Normally, they perform annotations only at a certain linguistic level (that is, Morphology, Syntax, Semantics, etc.).
2. They usually introduce a certain rate of errors and ambiguities when tagging. This error rate ranges from 10 percent up to 50 percent of the units annotated for unrestricted, general texts.
3. Their annotations are most frequently formulated in terms of an annotation schema designed and implemented ad hoc.
A priori, it seems that the interoperation and integration of several linguistic tools into an appropriate software architecture could most likely solve the limitation stated in (1). Besides, integrating several linguistic annotation tools and making them interoperate could also minimise the limitation stated in (2). Nevertheless, in the latter case, all these tools should produce annotations for a common level, which would have to be combined in order to correct their corresponding errors and inaccuracies. Yet, the limitation stated in (3) prevents both types of integration and interoperation from being easily achieved.
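As a toy illustration of combining annotations produced for a common level, the sketch below merges the outputs of several POS taggers over the same token sequence by majority vote. This is one simple combination strategy under the assumption of aligned token sequences, not the annotation model proposed in this work.

```python
from collections import Counter

def combine_pos_annotations(tagger_outputs):
    """Combine the tag sequences of several POS taggers for the same
    tokens by majority vote, letting the taggers correct one another.
    tagger_outputs: list of equal-length tag sequences, one per tagger."""
    combined = []
    for tags_for_token in zip(*tagger_outputs):
        tag, _count = Counter(tags_for_token).most_common(1)[0]
        combined.append(tag)
    return combined

# e.g. combine_pos_annotations([["DET", "NOUN", "VERB"],
#                               ["DET", "NOUN", "NOUN"],
#                               ["DET", "NOUN", "VERB"]])
# returns ["DET", "NOUN", "VERB"]
```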
In addition, most high-level annotation tools rely on other, lower-level annotation tools and their outputs to generate their own. For example, sense-tagging tools (operating at the semantic level) often use POS taggers (operating at a lower level, i.e., the morphosyntactic one) to identify the grammatical category of the word or lexical unit they are annotating. Accordingly, if a faulty or inaccurate low-level annotation tool is to be used by another, higher-level one in its process, the errors and inaccuracies of the former should be minimised in advance. Otherwise, these errors and inaccuracies would be transferred to (and even magnified in) the annotations of the high-level annotation tool.
Therefore, it would be quite useful to find a way to
(i) correct or, at least, reduce the errors and the inaccuracies of lower-level linguistic tools;
(ii) unify the annotation schemas of different linguistic annotation tools or, more generally speaking, make these tools (as well as their annotations) interoperate.
Clearly, solving (i) and (ii) should ease the automatic annotation of web pages by means of linguistic tools, and their transformation into Semantic Web pages (Berners-Lee, Hendler and Lassila, 2001). Yet, as stated above, (ii) is a type of interoperability problem. Then again, ontologies (Gruber, 1993; Borst, 1997) have been successfully applied thus far to solve several interoperability problems. Hence, ontologies should also help solve the aforementioned problems and limitations of linguistic annotation tools.
Thus, to summarise, the main aim of the present work was to combine these hitherto separate approaches, mechanisms and tools for annotation from Linguistics and Ontological Engineering (and the Semantic Web) into a sort of hybrid (linguistic and ontological) annotation model, suitable for both areas. This hybrid (semantic) annotation model should (a) benefit from the advances, models, techniques, mechanisms and tools of these two areas; (b) minimise (and even solve, when possible) some of the problems found in each of them; and (c) be suitable for the Semantic Web. The concrete goals that helped attain this aim are presented in the following section.
2. GOALS OF THE PRESENT WORK
As mentioned above, the main goal of this work was to specify a hybrid (that is, linguistically-motivated and ontology-based) model of annotation suitable for the Semantic Web (i.e. it had to produce a semantic annotation of web page contents). This entailed that the tags included in the annotations of the model had to (1) represent linguistic concepts (or linguistic categories, as they are termed in ISO/DCR (2008)), in order for this model to be linguistically-motivated; (2) be ontological terms (i.e., use an ontological vocabulary), in order for the model to be ontology-based; and (3) be structured (linked) as a collection of ontology-based
Abstract:
Lexica and terminology databases play a vital role in many NLP applications, but currently most such resources are published in application-specific formats or with custom access interfaces, leading to the problem that much of this data sits in "data silos" and is hence difficult to access. The Semantic Web, and in particular the Linked Data initiative, provides effective solutions to this problem, as well as possibilities for data reuse through inter-lexicon linking and for the incorporation of data categories via dereferenceable URIs. The Semantic Web focuses on the use of ontologies to describe semantics on the Web, but currently there is no standard for providing complex lexical information for such ontologies or for describing the relationship between the lexicon and the ontology. We present our model, lemon, which aims to address these gaps.
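To give a flavour of the model, the sketch below builds a minimal lemon-style lexical entry with rdflib, linking the written form "cat" to an ontology concept via a lexical sense. The example.org namespaces are hypothetical placeholders; the core properties used (canonicalForm, writtenRep, sense, reference) come from the published lemon vocabulary.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

LEMON = Namespace("http://lemon-model.net/lemon#")
EX = Namespace("http://example.org/lexicon/")     # hypothetical lexicon
ONT = Namespace("http://example.org/ontology/")   # hypothetical ontology

g = Graph()
g.bind("lemon", LEMON)

entry, form, sense = EX["cat"], EX["cat-form"], EX["cat-sense"]

g.add((entry, RDF.type, LEMON.LexicalEntry))
g.add((entry, LEMON.canonicalForm, form))
g.add((form, LEMON.writtenRep, Literal("cat", lang="en")))
g.add((entry, LEMON.sense, sense))
# The sense's reference is what ties the lexicon to the ontology.
g.add((sense, LEMON.reference, ONT["Cat"]))

print(g.serialize(format="turtle"))
```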
Abstract:
The Semantic Web aims to allow machines to make inferences using the explicit conceptualisations contained in ontologies. By pointing to ontologies, Semantic Web-based applications are able to interoperate and share common information easily. Nevertheless, multilingual semantic applications are still rare, since most online ontologies are monolingual in English. To address this issue, techniques for ontology localisation and translation are needed. However, traditional machine translation is difficult to apply to ontologies, because ontology labels tend to be quite short and linguistically different from the free-text paradigm. In this paper, we propose an approach to enhance the machine translation of ontologies by exploiting the well-structured concept descriptions contained in the ontology. In particular, our approach leverages the semantics contained in the ontology by using Cross-Lingual Explicit Semantic Analysis (CLESA) for context-based disambiguation in phrase-based Statistical Machine Translation (SMT). The work is novel in that, to the best of our knowledge, CLESA has not previously been applied in SMT.
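The disambiguation idea can be sketched as follows: represent the source concept's description and each candidate translation as vectors over a shared cross-lingual concept space, and pick the candidate closest to the context. The sketch assumes the concept vectors are already computed (in Explicit Semantic Analysis they are derived from Wikipedia articles, aligned across languages in CLESA); it is a simplification of the paper's SMT pipeline, not its implementation.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between sparse concept vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = (sqrt(sum(x * x for x in u.values()))
            * sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0

def pick_translation(source_context_vec, candidates):
    """Choose the candidate translation whose cross-lingual concept
    vector best matches the source concept's context vector.
    candidates: {translation_string: concept_vector}."""
    return max(candidates,
               key=lambda c: cosine(source_context_vec, candidates[c]))

# e.g. pick_translation({"Finance": 0.9, "Rivers": 0.1},
#                       {"banco":  {"Finance": 0.8, "Furniture": 0.2},
#                        "orilla": {"Rivers": 0.9}})
# returns "banco" when the ontology context is about finance.
```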
Abstract:
In this paper we present the MultiFarm dataset, which has been designed as a benchmark for multilingual ontology matching. The MultiFarm dataset is composed of a set of ontologies translated into different languages and the corresponding alignments between these ontologies. It is based on the OntoFarm dataset, which has been used successfully for several years in the Ontology Alignment Evaluation Initiative (OAEI). By translating the ontologies of the OntoFarm dataset into eight different languages (Chinese, Czech, Dutch, French, German, Portuguese, Russian, and Spanish) we created a comprehensive set of realistic test cases. Based on these test cases, it is possible to evaluate and compare the performance of matching approaches with a special focus on multilingualism.
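Evaluation against such a dataset reduces to comparing a system's alignment with the reference alignment. A minimal sketch of the standard precision/recall/F-measure computation, with correspondences modelled as hashable (entity1, entity2, relation) triples:

```python
def evaluate_alignment(system, reference):
    """Precision, recall and F1 of a system alignment against a
    reference alignment, both given as iterables of correspondence
    triples, in the usual OAEI style of evaluation."""
    system, reference = set(system), set(reference)
    correct = len(system & reference)
    precision = correct / len(system) if system else 0.0
    recall = correct / len(reference) if reference else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# e.g. evaluate_alignment(
#     [("o1#Paper", "o2#Article", "="), ("o1#Author", "o2#Writer", "=")],
#     [("o1#Paper", "o2#Article", "=")])
# returns (0.5, 1.0, 0.666...)
```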