36 results for Integrated semantic resources
at Universidad Politécnica de Madrid
Abstract:
We present a methodology for legacy language resource adaptation that generates domain-specific sentiment lexicons organized around domain entities described with lexical information and sentiment words described in the context of these entities. We explain the steps of the methodology and we give a working example of our initial results. The resulting lexicons are modelled as Linked Data resources by use of established formats for Linguistic Linked Data (lemon, NIF) and for linked sentiment expressions (Marl), thereby contributing and linking to existing Language Resources in the Linguistic Linked Open Data cloud.
Abstract:
This paper presents a Focused Crawler to Get Semantic Web Resources (CSR). Structured data on the Web is available in formats such as Extensible Markup Language (XML), Resource Description Framework (RDF) and Web Ontology Language (OWL) that can be used for processing. One of the main challenges of manually searching for and downloading Semantic Web resources is that the task is very time-consuming. Our research work proposes a focused crawler that downloads these resources automatically and stores them on disk, in order to build a collection that will be used for data processing. CSR consists of three layers: (a) the User Interface Layer, (b) the Focus Crawler Layer and (c) the Base Crawler Layer. CSR uses the Shark-Search method as its selection policy. CSR was evaluated in two experiments. The first one started on December 15, 2012 at 7:11 am and ended on December 16, 2012 at 4:01, and obtained 448,123,537 bytes of data. CSR stopped by itself after analyzing 80,4375 seeds with unlimited depth. It collected 16,576 semantic resource files, of which 89% were RDF, 10% were XML and 1% were OWL. The second one was based on the Web Data Commons work of the Research Group Data and Web Science at the University of Mannheim and the Institute AIFB at the Karlsruhe Institute of Technology. It began at 4:46 am on June 2, 2013 and ended at 1:37 am on June 9, 2013. After 162.51 hours of execution, the result was 285,279 semantic resources, predominantly XML (99%), with OWL and RDF at 1% each.
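The selection policy mentioned in this abstract can be illustrated with a minimal sketch. It is not the CSR implementation: it is a simplified, Shark-Search-style frontier in which a link's score mixes a decayed score inherited from its parent page with the relevance of its anchor text. The `relevance` measure, the `decay` and `gamma` parameters, the `fetch` callback and the toy page graph are all assumptions made for illustration.

```python
import heapq


def relevance(text, query_terms):
    """Share of words in `text` that are query terms (a crude similarity)."""
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(word in query_terms for word in words) / len(words)


def crawl(seeds, fetch, query_terms, decay=0.5, gamma=0.5, max_pages=100):
    """Visit pages best-first and return the URLs in visiting order.

    `fetch(url)` is a hypothetical callback returning
    (page_text, [(anchor_text, target_url), ...]).
    """
    # heapq is a min-heap, so scores are stored negated.
    frontier = [(-1.0, url) for url in seeds]
    heapq.heapify(frontier)
    seen = set(seeds)
    visited = []
    while frontier and len(visited) < max_pages:
        negative_score, url = heapq.heappop(frontier)
        text, links = fetch(url)
        visited.append(url)
        page_relevance = relevance(text, query_terms)
        # Simplification: a relevant page passes on its own relevance,
        # an irrelevant one passes on the score it was queued with.
        basis = page_relevance if page_relevance > 0 else -negative_score
        inherited = decay * basis
        for anchor_text, target in links:
            if target in seen:
                continue
            seen.add(target)
            anchor_score = relevance(anchor_text, query_terms)
            score = gamma * inherited + (1 - gamma) * anchor_score
            heapq.heappush(frontier, (-score, target))
    return visited


# Toy in-memory "web" standing in for real HTTP requests.
PAGES = {
    "seed": ("rdf ontology data", [("rdf dump", "a"), ("contact us", "b")]),
    "a": ("rdf triples", [("owl ontology", "c")]),
    "b": ("about page", [("more about", "d")]),
    "c": ("owl", []),
    "d": ("nothing", []),
}

order = crawl(["seed"], PAGES.__getitem__, {"rdf", "owl", "ontology"})
```

In this toy graph the links with query terms in their anchor text are followed first, so the branch leading to RDF and OWL content is exhausted before the unrelated one.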
Abstract:
Following the Integrated Water Resources Management approach, the European Water Framework Directive requires Member States to develop water management plans at the catchment level. Those plans have to integrate the different interests and must be developed with stakeholder participation. To meet these requirements, managers need tools to assess the impacts of possible management alternatives on natural and socio-economic systems. These tools should ideally be able to address the complexity and uncertainties of the water system, while serving as a platform for stakeholder participation. The objective of our research was to develop a participatory integrated assessment model, based on the combination of a crop model, an economic model and a participatory Bayesian network, with an application in the middle Guadiana sub-basin, in Spain. The methodology is intended to capture the complexity of water management problems, incorporating the relevant sectors, as well as the relevant scales involved in water management decision making. The integrated model has allowed us to test different management, market and climate change scenarios and to assess the impacts of such scenarios on the natural system (crops), on the socio-economic system (farms) and on the environment (water resources). Finally, this integrated assessment modelling process has allowed stakeholder participation, complying with the main requirements of current European water laws.
Abstract:
Extracting opinions and emotions from text is becoming increasingly important, especially since the advent of micro-blogging and social networking. Opinion mining is particularly popular and now gathers many public services, datasets and lexical resources. Unfortunately, there are few available lexical and semantic resources for emotion recognition that could foster the development of new emotion-aware services and applications. The diversity of theories of emotion and the absence of a common vocabulary are two of the main barriers to the development of such resources. This situation motivated the creation of Onyx, a semantic vocabulary of emotions with a focus on lexical resources and emotion analysis services. It follows a linguistic Linked Data approach, it is aligned with the Provenance Ontology, and it has been integrated with the Lexicon Model for Ontologies (lemon), a popular RDF model for representing lexical entries. This approach also offers a new and interesting way to work with different theories of emotion. As part of this work, Onyx has been aligned with EmotionML and WordNet-Affect.
Abstract:
In spite of the increasing presence of Semantic Web facilities, only a limited number of the resources available on the Internet provide semantic access. Recent initiatives such as the emerging Linked Data Web are providing semantic access to available data by porting existing resources to the Semantic Web using different technologies, such as database-semantic mapping and scraping. Nevertheless, existing scraping solutions are ad hoc, complemented with graphical interfaces for speeding up scraper development. This article proposes a generic framework for web scraping based on semantic technologies. This framework is structured in three levels: scraping services, semantic scraping model and syntactic scraping. The first level provides an interface to generic applications or intelligent agents for gathering information from the web at a high level. The second level defines a semantic RDF model of the scraping process, in order to provide a declarative approach to the scraping task. Finally, the third level provides an implementation of the RDF scraping model for specific technologies. The work has been validated in a scenario that illustrates its application to mashup technologies.
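The split between a declarative scraping model and its syntactic implementation can be sketched roughly as follows. This is not the framework described in the abstract: a plain Python dictionary stands in for the RDF model, and the mapping keys, property names and sample HTML are hypothetical. The interpreter uses only the standard-library `html.parser` and handles flat, non-nested elements.

```python
from html.parser import HTMLParser

# Hypothetical declarative model: (tag, class attribute) -> property name.
# The abstract describes an RDF model; a dict is used here as a stand-in.
MODEL = {
    ("h1", "headline"): "title",
    ("span", "author"): "creator",
}


class ModelScraper(HTMLParser):
    """Syntactic level: walks the HTML and fills the properties the model names."""

    def __init__(self, model):
        super().__init__()
        self.model = model
        self.current = None  # property currently being captured, if any
        self.record = {}

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        self.current = self.model.get((tag, css_class))

    def handle_data(self, data):
        if self.current and data.strip():
            self.record[self.current] = data.strip()

    def handle_endtag(self, tag):
        self.current = None


def scrape(html, model):
    """Service level: return the extracted properties as a dictionary."""
    parser = ModelScraper(model)
    parser.feed(html)
    parser.close()
    return parser.record


SAMPLE = (
    '<h1 class="headline">Water plan approved</h1>'
    "<p>Body text that the model does not mention.</p>"
    '<span class="author">A. Writer</span>'
)

result = scrape(SAMPLE, MODEL)
```

Changing what is extracted only requires editing `MODEL`, which is the point of keeping the description of the scraping task separate from the code that executes it.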
Abstract:
OntoTag - A Linguistic and Ontological Annotation Model Suitable for the Semantic Web
1. INTRODUCTION. LINGUISTIC TOOLS AND ANNOTATIONS: THEIR LIGHTS AND SHADOWS
Computational Linguistics is already a consolidated research area. It builds upon the results of two other major areas, namely Linguistics and Computer Science and Engineering, and it aims at developing computational models of human language (or natural language, as it is termed in this area). Possibly, its best-known applications are the different tools developed so far for processing human language, such as machine translation systems and speech recognizers or dictation programs.
These tools for processing human language are commonly referred to as linguistic tools. Apart from the examples mentioned above, there are also other types of linguistic tools that perhaps are not so well-known, but on which most of the other applications of Computational Linguistics are built. These other types of linguistic tools comprise POS taggers, natural language parsers and semantic taggers, amongst others. All of them can be termed linguistic annotation tools.
Linguistic annotation tools are important assets. In fact, POS and semantic taggers (and, to a lesser extent, also natural language parsers) have become critical resources for the computer applications that process natural language. Hence, any computer application that has to analyse a text automatically and ‘intelligently’ will include at least a module for POS tagging. The more an application needs to ‘understand’ the meaning of the text it processes, the more linguistic tools and/or modules it will incorporate and integrate.
However, linguistic annotation tools still have some limitations, which can be summarised as follows:
1. Normally, they perform annotations only at a certain linguistic level (that is, Morphology, Syntax, Semantics, etc.).
2. They usually introduce a certain rate of errors and ambiguities when tagging. This error rate ranges from 10 percent up to 50 percent of the units annotated for unrestricted, general texts.
3. Their annotations are most frequently formulated in terms of an annotation schema designed and implemented ad hoc.
A priori, it seems that the interoperation and the integration of several linguistic tools into an appropriate software architecture could most likely solve the limitations stated in (1). Besides, integrating several linguistic annotation tools and making them interoperate could also minimise the limitation stated in (2). Nevertheless, in the latter case, all these tools should produce annotations for a common level, which would have to be combined in order to correct their corresponding errors and inaccuracies. Yet, the limitation stated in (3) prevents both types of integration and interoperation from being easily achieved.
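The idea of combining annotations produced for a common level can be illustrated with a small majority-vote sketch. It is not the OntoTag procedure: the tool names, their tagsets and the mapping to a shared tagset are invented for illustration, and a simple vote is only one possible combination strategy.

```python
from collections import Counter

# Hypothetical mappings from each tool's ad hoc tagset to a common one.
TO_COMMON = {
    "tagger_a": {"NN": "NOUN", "VB": "VERB", "JJ": "ADJ"},
    "tagger_b": {"N": "NOUN", "V": "VERB", "A": "ADJ"},
    "tagger_c": {"noun": "NOUN", "verb": "VERB", "adj": "ADJ"},
}


def combine(annotations):
    """Merge per-token tags from several tools by majority vote.

    `annotations` maps a tool name to its list of tags, one per token.
    A token whose top vote is tied gets None, leaving it unresolved.
    """
    lengths = {len(tags) for tags in annotations.values()}
    if len(lengths) != 1:
        raise ValueError("all tools must annotate the same tokens")
    combined = []
    for position in range(lengths.pop()):
        votes = Counter(
            TO_COMMON[tool][tags[position]]
            for tool, tags in annotations.items()
        )
        (top_tag, top_count), *others = votes.most_common()
        tied = bool(others) and others[0][1] == top_count
        combined.append(None if tied else top_tag)
    return combined


# Three tools annotate the same three tokens; each makes one mistake at most.
merged = combine({
    "tagger_a": ["NN", "VB", "JJ"],
    "tagger_b": ["N", "V", "N"],
    "tagger_c": ["noun", "noun", "adj"],
})
```

The vote is only possible once every tool's tags have been translated into the shared tagset, which is why ad hoc annotation schemas stand in the way of this kind of integration.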
In addition, most high-level annotation tools rely on other lower-level annotation tools and their outputs to generate their own. For example, sense-tagging tools (operating at the semantic level) often use POS taggers (operating at a lower level, i.e., the morphosyntactic) to identify the grammatical category of the word or lexical unit they are annotating. Accordingly, if a faulty or inaccurate low-level annotation tool is to be used by another higher-level one in its process, the errors and inaccuracies of the former should be minimised in advance. Otherwise, these errors and inaccuracies would be transferred to (and even magnified in) the annotations of the high-level annotation tool.
Therefore, it would be quite useful to find a way to
(i) correct or, at least, reduce the errors and the inaccuracies of lower-level linguistic tools;
(ii) unify the annotation schemas of different linguistic annotation tools or, more generally speaking, make these tools (as well as their annotations) interoperate.
Clearly, solving (i) and (ii) should ease the automatic annotation of web pages by means of linguistic tools, and their transformation into Semantic Web pages (Berners-Lee, Hendler and Lassila, 2001). Yet, as stated above, (ii) is a type of interoperability problem. There again, ontologies (Gruber, 1993; Borst, 1997) have been successfully applied thus far to solve several interoperability problems. Hence, ontologies should also help solve the aforementioned problems and limitations of linguistic annotation tools.
Thus, to summarise, the main aim of the present work was to combine these separate approaches, mechanisms and tools for annotation from Linguistics and Ontological Engineering (and the Semantic Web) in a hybrid (linguistic and ontological) annotation model, suitable for both areas. This hybrid (semantic) annotation model should (a) benefit from the advances, models, techniques, mechanisms and tools of these two areas; (b) minimise (and even solve, when possible) some of the problems found in each of them; and (c) be suitable for the Semantic Web. The concrete goals that helped attain this aim are presented in the following section.
2. GOALS OF THE PRESENT WORK
As mentioned above, the main goal of this work was to specify a hybrid (that is, linguistically-motivated and ontology-based) model of annotation suitable for the Semantic Web (i.e. it had to produce a semantic annotation of web page contents). This entailed that the tags included in the annotations of the model had to (1) represent linguistic concepts (or linguistic categories, as they are termed in ISO/DCR (2008)), in order for this model to be linguistically-motivated; (2) be ontological terms (i.e., use an ontological vocabulary), in order for the model to be ontology-based; and (3) be structured (linked) as a collection of ontology-based
Abstract:
Lexica and terminology databases play a vital role in many NLP applications, but currently most such resources are published in application-specific formats, or with custom access interfaces, leading to the problem that much of this data is in "data silos" and hence difficult to access. The Semantic Web and in particular the Linked Data initiative provide effective solutions to this problem, as well as possibilities for data reuse by inter-lexicon linking, and incorporation of data categories by dereferenceable URIs. The Semantic Web focuses on the use of ontologies to describe semantics on the Web, but currently there is no standard for providing complex lexical information for such ontologies and for describing the relationship between the lexicon and the ontology. We present our model, lemon, which aims to address these gaps.
Abstract:
One of the challenges facing the current web is the efficient use of all the available information. The Web 2.0 phenomenon has favored the creation of content by average users, and thus the amount of information that can be found on diverse topics has grown exponentially in recent years. Initiatives such as linked data are helping to build the Semantic Web, in which a set of standards is proposed for the exchange of data among heterogeneous systems. However, these standards are sometimes not used, and there are still plenty of websites that require naive techniques to discover their contents and services. This paper proposes an integrated framework for content and service discovery and extraction. The framework is divided into several layers where the discovery of contents and services is made in a Representational State Transfer (REST) system such as the web. It employs several web mining techniques as well as feature-oriented modeling for the discovery of cross-cutting features in web resources. The framework is used in a scenario of electronic newspapers. An intelligent agent crawls the web for related news, and uses services and visits links automatically according to its goal. This scenario illustrates how the discovery is made at different levels and how the use of semantics helps implement an agent that performs high-level tasks.
Abstract:
The Semantic Web is an extension of the traditional Web in which the meaning of information is well defined, thus allowing a better interaction between people and computers. To accomplish its goals, mechanisms are required to make explicit the semantics of Web resources, to be automatically processed by software agents (this semantics being described by means of online ontologies). Nevertheless, issues arise from the semantic heterogeneity that naturally occurs on the Web, namely redundancy and ambiguity. To tackle these issues, we present an approach to discover and represent, in a non-redundant way, the intended meaning of words in Web applications, while taking into account the (often unstructured) context in which they appear. To that end, we have developed novel ontology matching, clustering, and disambiguation techniques. Our work is intended to help bridge the gap between syntax and semantics for the Semantic Web construction.
Abstract:
Water is a vital resource, but also a critical limiting factor for economic and social development in many parts of the world. The recent rapid growth in human population and water use for social and economic development is increasing the pressure on water resources and the environment, as well as leading to growing conflicts among competing water use sectors (agriculture, urban, tourism, industry) and regions (Gleick et al., 2009; World Bank, 2006). In Spain, as in many other arid and semi-arid regions affected by drought and wide climate variability, irrigated agriculture is responsible for most consumptive water use and plays an important role in sustaining rural livelihoods (Varela-Ortega, 2007). Historically, the evolution of irrigation has been based on publicly-funded irrigation development plans that promoted economic growth and improved the socio-economic conditions of rural farmers in agrarian Spain, but increased environmental damage and led to excessive and inefficient exploitation of water resources (Garrido and Llamas, 2010; Varela-Ortega et al., 2010). Currently, water policies in Spain focus on rehabilitating and improving the efficiency of irrigation systems, and are moving from technocratic towards integrated water management strategies driven by the European Union (EU) Water Framework Directive (WFD).
Abstract:
This paper describes a novel architecture to introduce automatic annotation and processing of semantic sensor data within context-aware applications. Based on the well-known state-chart technologies, and represented using the W3C SCXML language combined with Semantic Web technologies, our architecture is able to provide enriched higher-level semantic representations of the user's context. This capability to detect and model relevant user situations allows a seamless modeling of the actual interaction situation, which can be integrated during the design of multimodal user interfaces (also based on SCXML) for them to be adequately adapted. Therefore, the final result of this contribution can be described as a flexible context-aware SCXML-based architecture, suitable for both designing a wide range of multimodal context-aware user interfaces, and implementing the automatic enrichment of sensor data, making it available to the entire Semantic Sensor Web.
Abstract:
Ontologies and taxonomies are widely used to organize concepts, providing the basis for activities such as indexing, and as background knowledge for NLP tasks. As such, translation of these resources would prove useful to adapt these systems to new languages. However, we show that the nature of these resources is significantly different from the "free-text" paradigm used to train most statistical machine translation systems. In particular, we see significant differences in the linguistic nature of these resources, and such resources have rich additional semantics. We demonstrate that as a result of these linguistic differences, standard SMT methods, in particular evaluation metrics, can produce poor performance. We then look to the task of leveraging these semantics for translation, which we approach in three ways: by adapting the translation system to the domain of the resource; by examining if semantics can help to predict the syntactic structure used in translation; and by evaluating if we can use existing translated taxonomies to disambiguate translations. We present some early results from these experiments, which shed light on the degree of success we may have with each approach.
Abstract:
In arid countries worldwide, social conflicts between irrigation-based human development and the conservation of aquatic ecosystems are widespread and attract many public debates. This research focuses on the analysis of water and agricultural policies aimed at conserving groundwater resources and maintaining rural livelihoods in a basin in Spain's central arid region. Intensive groundwater mining for irrigation has caused overexploitation of the basin's large aquifer and the degradation of reputed wetlands, and has given rise to notable social conflicts over the years. With the aim of tackling the multifaceted socio-ecological interactions of complex water systems, the methodology used in this study consists of a novel integration, into a common platform, of an economic optimization model and the hydrology model WEAP (Water Evaluation And Planning system). This robust tool is used to analyze the spatial and temporal effects of different water and agricultural policies under different climate scenarios. It permits the prediction of different climate and policy outcomes across farm types (water stress impacts and adaptation), at the basin level (aquifer recovery), and along the policies' implementation horizon (short and long run). Results show that the region's current quota-based water policies may contribute to reducing water consumption on the farms but will not be able to recover the aquifer and will inflict income losses on the rural communities. This situation would worsen in case of drought. Economies of scale and technology are evidenced, as larger farms with cropping diversification and those equipped with modern irrigation will better adapt to water stress conditions. However, the long-term sustainability of the aquifer and the maintenance of rural livelihoods will be attained only if additional policy measures are put in place, such as the control of illegal abstractions and the establishment of a water bank.
Within the policy domain, the research contributes to the new sustainable development strategy of the EU by concluding that, in water-scarce regions, effective integration of water and agricultural policies is essential for achieving the water protection objectives of the EU policies. Therefore, the design and enforcement of well-balanced region-specific policies is a major task faced by policy makers for achieving successful water management that will ensure nature protection and human development at tolerable social costs. From a methodological perspective, this research initiative contributes to better addressing hydrological questions as well as economic and social issues in complex water and human systems. Its integrated vision provides a valuable illustration to inform water policy and management decisions within contexts of water-related conflicts worldwide.
Abstract:
Cloud computing is one of the most relevant computing paradigms available nowadays. Its adoption has increased in recent years due to large investment and research from business enterprises and academic institutions. Among all the services cloud providers usually offer, Infrastructure as a Service (IaaS) has gained momentum for solving HPC problems in a more dynamic way without the need for expensive investments. The integration of a large number of providers is a major goal, as it enables the improvement of the quality of the selected resources in terms of pricing, speed, redundancy, etc. In this paper, we propose a system architecture, based on semantic solutions, to build an interoperable scheduler for federated clouds that works with several IaaS providers in a uniform way. Based on this architecture we implement a proof-of-concept prototype and test it with two different cloud solutions to provide some experimental results about the viability of our approach.
Abstract:
Virtualized infrastructures are a promising way of providing flexible and dynamic computing solutions for resource-consuming tasks. Scientific workflows are one such kind of task, as they need a large amount of computational resources during certain periods of time. To provide the best infrastructure configuration for a workflow, it is necessary to explore as many providers as possible, taking into account different criteria like Quality of Service, pricing, response time, network latency, etc. Moreover, each one of these new resources must be tuned to provide the tools and dependencies required by each of the steps of the workflow. Working with different infrastructure providers, either public or private, using their own concepts and terms, and with a set of heterogeneous applications requires a framework for integrating all the information about these elements. This work proposes semantic technologies for describing and integrating all the information about the different components of the overall system and a set of policies created by the user. Based on this information, a scheduling process will be performed to generate an infrastructure configuration defining the set of virtual machines that must be run and the tools that must be deployed on them.
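The scheduling step described in this abstract can be pictured with a deliberately small sketch. It does not reproduce the proposed system: the `Offer` fields, the single latency policy, the "cheapest eligible provider" rule and the one-virtual-machine-per-step layout are all assumptions chosen to keep the example short, and plain Python objects stand in for the semantic descriptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """Hypothetical description of what one infrastructure provider offers."""
    provider: str
    price_per_hour: float
    latency_ms: float


def plan(steps, offers, max_latency_ms):
    """Return one virtual machine description per workflow step.

    `steps` maps a step name to the set of tools it requires.
    The user policy here is a single latency ceiling; among the offers
    that satisfy it, the cheapest one is chosen for every step.
    """
    eligible = [offer for offer in offers if offer.latency_ms <= max_latency_ms]
    if not eligible:
        raise ValueError("no offer satisfies the policy")
    cheapest = min(eligible, key=lambda offer: offer.price_per_hour)
    return [
        {"step": name, "provider": cheapest.provider, "tools": sorted(tools)}
        for name, tools in steps.items()
    ]


OFFERS = [
    Offer("provider_1", 0.10, 200.0),  # cheapest, but too slow for the policy
    Offer("provider_2", 0.20, 50.0),
    Offer("provider_3", 0.15, 80.0),
]
STEPS = {"preprocess": {"tool_b", "tool_a"}, "analyse": {"tool_c"}}

configuration = plan(STEPS, OFFERS, max_latency_ms=100.0)
```

A fuller scheduler would weigh several criteria at once and could place different steps with different providers; the sketch only shows how provider descriptions, workflow requirements and a user policy meet in one configuration.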