22 results for Web, Search Engine, Overlap
at Universidad Politécnica de Madrid
Abstract:
This Final Year Project (Proyecto Fin de Carrera, PFC) covers the analysis, design and implementation of a web system that lets users become familiar with the Human Development Index (HDI), published annually by the United Nations, and that offers a service for managing and downloading a mobile application related to the index. The mobile application, developed in parallel with this project, is an educational game based on questions about the HDI of different countries. The web service implemented in this project supports the download, administration and updating of content as well as interaction between users through the cooperative game. The system consists of a web server, a database of users and content, and a web portal from which users can download the mobile application, query game statistics and learn about the HDI without having to play. The advanced search engine developed for exploring the HDI lets users acquire skills and train on their own to improve their game results. System administrators can manage the portal content, the users who request registration and the functionality offered, that is, game updates, forums and news. Installing the system on a web server made it possible to verify it successfully and to provide the information and awareness service on the HDI, kept up to date with United Nations data, which was the original motivation of the project.
Abstract:
This paper presents the main results of the eContent HARMOS project. The project has developed a web-based educational system for professional musicians. The main idea of the project consists of recording master classes taught by highly recognised maestros and annotating this multimedia material using an educational musical taxonomy and automatic annotation tools. Users of the system access a multi-criteria search engine that allows them to find and play video segments according to a combination of criteria, which include instrument, teacher, composer, composition, movement and pedagogical concept. In order to preserve teachers' and students' rights, a DRM and protection system has been developed. The system is being publicly exploited. This model preserves musical heritage, since these valuable master classes are usually not recorded, and it also provides a sustainable model for musical institutions.
Abstract:
The author of this project is a recent member of the SoloBoulder association, which is dedicated to the climbing discipline known as bouldering, to news and current affairs, multimedia content, the promotion of a team of climbers and the defence of environmental values in the mountains. The association's main content distribution channel is a website that predates this project. The association has detected that online guides to bouldering areas are scarce and of poor quality, and this circumstance motivates this final-year project. The general objective is to develop a new application that provides users worldwide with an interactive guide to bouldering areas and other points of interest, a social network that allows content to be created cooperatively and organically, and web services so that other platforms or organizations can consume the information. The new software is independent of the earlier SoloBoulder website; nevertheless, both parts are integrated under the same web domain and share the same appearance.
The new application offers climbers and tourists a high-quality informative and interactive service, which is expected to increase the number of visits to the whole site and to help spread environmental values, diversify bouldering areas and relieve the overcrowded ones, encourage the sport and give climbers an opportunity for self-promotion. A further strong motivation for the author is the process of research and training in technologies, architectural design patterns and working methodologies adapted to current trends in software engineering, with particular curiosity about the web. In this regard, the following stand out: project working methodologies, project analysis, software architectures, software design, databases, programming and good practices, security, web graphical interfaces, graphic design, Web Performance Optimization, Search Engine Optimization, etc.
In summary, this project involves learning and putting into practice a range of knowledge acquired during its execution, as well as consolidating subjects studied during the degree. In addition, the product developed offers a quality service to users and favours the sport and the self-promotion of climbers.
Abstract:
Biomedical researchers and clinicians working with molecular technologies in routine clinical practice often need to review the available literature to gather information regarding specific sequences of nucleic acids. This includes, for instance, finding articles related to a specific DNA sequence, or identifying empirically validated primer/probe sequences to evaluate the presence of different micro-organisms. Unfortunately, these hard and time-consuming tasks often need to be performed manually by the researchers themselves, since no publicly available biomedical literature search engine, e.g. PubMed or PubMed Central (PMC), provides the required search functionalities. In this article, we describe PubDNA Finder, a web service that enables users to perform advanced searches on PubMed Central-indexed full-text articles with sequences of nucleic acids.
Abstract:
Over the last few decades, the ever-increasing output of scientific publications has led to new challenges to keep up to date with the literature. In the biomedical area, this growth has introduced new requirements for professionals, e.g., physicians, who have to locate the exact papers that they need for their clinical and research work amongst a huge number of publications. Against this backdrop, novel information retrieval methods are even more necessary. While web search engines are widespread in many areas, facilitating access to all kinds of information, additional tools are required to automatically link information retrieved from these engines to specific biomedical applications. In the case of clinical environments, this also means considering aspects such as patient data security and confidentiality or structured contents, e.g., electronic health records (EHRs). In this scenario, we have developed a new tool to facilitate query building to retrieve scientific literature related to EHRs. Results: We have developed CDAPubMed, an open-source web browser extension to integrate EHR features in biomedical literature retrieval approaches. Clinical users can use CDAPubMed to: (i) load patient clinical documents, i.e., EHRs based on the Health Level 7-Clinical Document Architecture Standard (HL7-CDA), (ii) identify relevant terms for scientific literature search in these documents, i.e., Medical Subject Headings (MeSH), automatically driven by the CDAPubMed configuration, which advanced users can optimize to adapt to each specific situation, and (iii) generate and launch literature search queries to a major search engine, i.e., PubMed, to retrieve citations related to the EHR under examination. Conclusions: CDAPubMed is a platform-independent tool designed to facilitate literature searching using keywords contained in specific EHRs. CDAPubMed is visually integrated, as an extension of a widespread web browser, within the standard PubMed interface. 
It has been tested on a public dataset of HL7-CDA documents, returning significantly fewer citations because queries are focused on characteristics identified within the EHR. For instance, whereas a search for breast neoplasm alone retrieved more than 200,000 citations, fewer than ten were retrieved when ten patient features were added using CDAPubMed. CDAPubMed is an open-source tool that can be freely used for non-profit purposes and integrated with other existing systems.
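The query-building step described above, where each added patient feature narrows the result set, can be pictured with a minimal Python sketch. This is an illustration only and not CDAPubMed's actual implementation: the function names `build_pubmed_query` and `pubmed_search_url` and the example headings are assumptions. It relies only on PubMed's `[MeSH Terms]` field tag and the `term` URL parameter of the public PubMed search page.

```python
from urllib.parse import urlencode


def build_pubmed_query(mesh_terms):
    """Join MeSH headings into one conjunctive PubMed query string.

    Hypothetical helper: every heading is quoted and restricted to the
    MeSH field, and the clauses are combined with AND, so each extra
    term can only narrow the set of citations.
    """
    clauses = [f'"{term}"[MeSH Terms]' for term in mesh_terms]
    return " AND ".join(clauses)


def pubmed_search_url(query):
    """Build a PubMed search URL for the given query string."""
    return "https://pubmed.ncbi.nlm.nih.gov/?" + urlencode({"term": query})


# Example headings standing in for terms identified in a patient document
# (assumed values, not taken from the evaluation dataset).
terms = ["Breast Neoplasms", "Female", "Middle Aged"]
query = build_pubmed_query(terms)
print(query)
print(pubmed_search_url(query))
```

Because the clauses are joined with AND, the broad one-term query is progressively restricted as patient features are appended, which is the effect reported in the evaluation above.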
Abstract:
Enriching knowledge bases with multimedia information makes it possible to complement textual descriptions with visual and audio information. Such complementary information can help users to understand the meaning of assertions, and in general improve the user experience with the knowledge base. In this paper we address the problem of how to enrich ontology instances with candidate images retrieved from existing Web search engines. DBpedia has evolved into a major hub in the Linked Data cloud, interconnecting millions of entities organized under a consistent ontology. Our approach taps into the Wikipedia corpus to gather context information for DBpedia instances and takes advantage of image tagging information when this is available to calculate semantic relatedness between instances and candidate images. We performed experiments with focus on the particularly challenging problem of highly ambiguous names. Both methods presented in this work outperformed the baseline. Our best method leveraged context words from Wikipedia, tags from Flickr and type information from DBpedia to achieve an average precision of 80%.
Abstract:
Enabling Subject Matter Experts (SMEs) to formulate knowledge without the intervention of Knowledge Engineers (KEs) requires providing SMEs with methods and tools that abstract the underlying knowledge representation and allow them to focus on modeling activities. Bridging the gap between SME-authored models and their representation is challenging, especially in the case of complex knowledge types like processes, where aspects like frame management, data, and control flow need to be addressed. In this paper, we describe how SME-authored process models can be provided with an operational semantics and grounded in a knowledge representation language like F-logic in order to support process-related reasoning. The main results of this work include a formalism for process representation and a mechanism for automatically translating process diagrams into executable code following that formalism. Of all the process models authored by SMEs during evaluation, 82% were well-formed, and all of these executed correctly. Additionally, the two optimizations applied to the code generation mechanism improved performance at reasoning time by 25% and 30%, respectively, with respect to the base case.
Abstract:
The web is in a process of constant change, driven by greater user interaction. The current wave of paradigms and technologies associated with web 2.0 has produced a number of very useful standards that meet the needs of today's web development. Among them are web components: user-defined HTML tags that fulfil a specific function within a page. There is a need to measure the quality of such developments in order to determine whether the web component concept represents a revolutionary change in web 2.0 development. This calls for an exploitation of web components, understood as metric-based quality measurement together with the definition of a component interconnection model.
The PicBit platform arises in response to these questions. It is a social platform for building profiles out of web components. From the end user's perspective it is a tool for creating profiles and social communities, while from an academic perspective it is a testing environment, or sandbox, for web components. To that end, the server side of the platform must be implemented with a focus on exploitation, by defining a REST interface of operations and a system for collecting user events on the platform. With this platform it will be possible to determine which parameters positively influence the user experience of a component and to uncover the future potential of this kind of development.
Abstract:
Interlinking text documents with Linked Open Data enables the Web of Data to be used as background knowledge within document-oriented applications such as search and faceted browsing. As a step towards interconnecting the Web of Documents with the Web of Data, we developed DBpedia Spotlight, a system for automatically annotating text documents with DBpedia URIs. DBpedia Spotlight allows users to configure the annotations to their specific needs through the DBpedia Ontology and quality measures such as prominence, topical pertinence, contextual ambiguity and disambiguation confidence. We compare our approach with the state of the art in disambiguation, and evaluate our results in light of three baselines and six publicly available annotation systems, demonstrating the competitiveness of our system. DBpedia Spotlight is shared as open source and deployed as a Web Service freely available for public use.
Abstract:
It has taken more than a decade of intense technical and market developments for mobile Internet to take off as a mass phenomenon. And it has arrived with great intensity: an avalanche of mobile content and applications is now overrunning us. Similar to its wired counterpart, wireless Web users will continuously demand access to data and content in an efficient and user-friendly manner.
Abstract:
OntoTag - A Linguistic and Ontological Annotation Model Suitable for the Semantic Web
1. INTRODUCTION. LINGUISTIC TOOLS AND ANNOTATIONS: THEIR LIGHTS AND SHADOWS
Computational Linguistics is already a consolidated research area. It builds upon the results of two other major areas, namely Linguistics and Computer Science and Engineering, and it aims at developing computational models of human language (or natural language, as it is termed in this area). Possibly its best-known applications are the different tools developed so far for processing human language, such as machine translation systems and speech recognizers or dictation programs.
These tools for processing human language are commonly referred to as linguistic tools. Apart from the examples mentioned above, there are also other types of linguistic tools that perhaps are not so well-known, but on which most of the other applications of Computational Linguistics are built. These other types of linguistic tools comprise POS taggers, natural language parsers and semantic taggers, amongst others. All of them can be termed linguistic annotation tools.
Linguistic annotation tools are important assets. In fact, POS and semantic taggers (and, to a lesser extent, also natural language parsers) have become critical resources for the computer applications that process natural language. Hence, any computer application that has to analyse a text automatically and ‘intelligently’ will include at least a module for POS tagging. The more an application needs to ‘understand’ the meaning of the text it processes, the more linguistic tools and/or modules it will incorporate and integrate.
However, linguistic annotation tools still have some limitations, which can be summarised as follows:
1. Normally, they perform annotations only at a certain linguistic level (that is, Morphology, Syntax, Semantics, etc.).
2. They usually introduce a certain rate of errors and ambiguities when tagging. This error rate ranges from 10 percent up to 50 percent of the units annotated for unrestricted, general texts.
3. Their annotations are most frequently formulated in terms of an annotation schema designed and implemented ad hoc.
A priori, it seems that the interoperation and the integration of several linguistic tools into an appropriate software architecture could most likely solve the limitations stated in (1). Besides, integrating several linguistic annotation tools and making them interoperate could also minimise the limitation stated in (2). Nevertheless, in the latter case, all these tools should produce annotations for a common level, which would have to be combined in order to correct their corresponding errors and inaccuracies. Yet, the limitation stated in (3) prevents both types of integration and interoperation from being easily achieved.
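The idea of combining annotations that several tools produce for a common level can be illustrated with a minimal Python sketch. It uses simple majority voting, which is only one possible combination strategy and is not presented here as the method of this work; the function name `combine_annotations` and the tag names are assumptions. Note that the sketch presupposes that all taggers already share one tagset, which is precisely what limitation (3) makes hard to obtain in practice.

```python
from collections import Counter


def combine_annotations(tagger_outputs):
    """Combine per-token tags from several taggers by majority vote.

    tagger_outputs is a list with one tag sequence per tagger; all
    sequences must annotate the same tokens with the same tagset.
    A token without a strict majority is returned as None so that it
    can be flagged for further disambiguation.
    """
    lengths = {len(tags) for tags in tagger_outputs}
    if len(lengths) != 1:
        raise ValueError("all taggers must annotate the same token sequence")

    combined = []
    for token_tags in zip(*tagger_outputs):
        tag, votes = Counter(token_tags).most_common(1)[0]
        # Keep the winning tag only if more than half of the taggers agree.
        combined.append(tag if votes > len(token_tags) / 2 else None)
    return combined


# Three hypothetical taggers annotating the tokens "the", "can", "rusts".
tagger_a = ["DET", "NOUN", "VERB"]
tagger_b = ["DET", "VERB", "VERB"]
tagger_c = ["DET", "NOUN", "NOUN"]
print(combine_annotations([tagger_a, tagger_b, tagger_c]))
```

Each tagger makes one error on a different token here, and the vote recovers a consistent annotation, which is the sense in which interoperation can reduce limitation (2).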
In addition, most high-level annotation tools rely on other lower-level annotation tools and their outputs to generate their own. For example, sense-tagging tools (operating at the semantic level) often use POS taggers (operating at a lower level, i.e., the morphosyntactic level) to identify the grammatical category of the word or lexical unit they are annotating. Accordingly, if a faulty or inaccurate low-level annotation tool is to be used by another, higher-level one in its process, the errors and inaccuracies of the former should be minimised in advance. Otherwise, these errors and inaccuracies would be transferred to (and even magnified in) the annotations of the high-level annotation tool.
Therefore, it would be quite useful to find a way to
(i) correct or, at least, reduce the errors and the inaccuracies of lower-level linguistic tools;
(ii) unify the annotation schemas of different linguistic annotation tools or, more generally speaking, make these tools (as well as their annotations) interoperate.
Clearly, solving (i) and (ii) should ease the automatic annotation of web pages by means of linguistic tools, and their transformation into Semantic Web pages (Berners-Lee, Hendler and Lassila, 2001). Yet, as stated above, (ii) is a type of interoperability problem. There again, ontologies (Gruber, 1993; Borst, 1997) have been successfully applied thus far to solve several interoperability problems. Hence, ontologies should also help solve the aforementioned problems and limitations of linguistic annotation tools.
Thus, to summarise, the main aim of the present work was to combine these separate approaches, mechanisms and tools for annotation from Linguistics and Ontological Engineering (and the Semantic Web) in a hybrid (linguistic and ontological) annotation model, suitable for both areas. This hybrid (semantic) annotation model should (a) benefit from the advances, models, techniques, mechanisms and tools of these two areas; (b) minimise (and even solve, when possible) some of the problems found in each of them; and (c) be suitable for the Semantic Web. The concrete goals that helped attain this aim are presented in the following section.
2. GOALS OF THE PRESENT WORK
As mentioned above, the main goal of this work was to specify a hybrid (that is, linguistically-motivated and ontology-based) model of annotation suitable for the Semantic Web (i.e. it had to produce a semantic annotation of web page contents). This entailed that the tags included in the annotations of the model had to (1) represent linguistic concepts (or linguistic categories, as they are termed in ISO/DCR (2008)), in order for this model to be linguistically-motivated; (2) be ontological terms (i.e., use an ontological vocabulary), in order for the model to be ontology-based; and (3) be structured (linked) as a collection of ontology-based
Resumo:
This PhD thesis contributes to the problem of resource and service discovery in the context of the composable web. In the current web, mashup technologies allow developers reusing services and contents to build new web applications. However, developers face a problem of information flood when searching for appropriate services or resources for their combination. To contribute to overcoming this problem, a framework is defined for the discovery of services and resources. In this framework, three levels are defined for performing discovery at content, discovery and agente levels. The content level involves the information available in web resources. The web follows the Representational Stateless Transfer (REST) architectural style, in which resources are returned as representations from servers to clients. These representations usually employ the HyperText Markup Language (HTML), which, along with Content Style Sheets (CSS), describes the markup employed to render representations in a web browser. Although the use of SemanticWeb standards such as Resource Description Framework (RDF) make this architecture suitable for automatic processes to use the information present in web resources, these standards are too often not employed, so automation must rely on processing HTML. This process, often referred as Screen Scraping in the literature, is the content discovery according to the proposed framework. At this level, discovery rules indicate how the different pieces of data in resources’ representations are mapped onto semantic entities. By processing discovery rules on web resources, semantically described contents can be obtained out of them. The service level involves the operations that can be performed on the web. The current web allows users to perform different tasks such as search, blogging, e-commerce, or social networking. To describe the possible services in RESTful architectures, a high-level feature-oriented service methodology is proposed at this level. 
This lightweight description framework allows defining service discovery rules to identify operations in interactions with REST resources. The discovery is thus performed by applying discovery rules to contents discovered in REST interactions, in a novel process called service probing. Also, service discovery can be performed by modelling services as contents, i.e., by retrieving Application Programming Interface (API) documentation and API listings in service registries such as ProgrammableWeb. For this, a unified model for composable components in Mashup-Driven Development (MDD) has been defined after the analysis of service repositories from the web. The agent level involves the orchestration of the discovery of services and contents. At this level, agent rules allow to specify behaviours for crawling and executing services, which results in the fulfilment of a high-level goal. Agent rules are plans that allow introspecting the discovered data and services from the web and the knowledge present in service and content discovery rules to anticipate the contents and services to be found on specific resources from the web. By the definition of plans, an agent can be configured to target specific resources. The discovery framework has been evaluated on different scenarios, each one covering different levels of the framework. Contenidos a la Carta project deals with the mashing-up of news from electronic newspapers, and the framework was used for the discovery and extraction of pieces of news from the web. Similarly, in Resulta and VulneraNET projects the discovery of ideas and security knowledge in the web is covered, respectively. The service level is covered in the OMELETTE project, where mashup components such as services and widgets are discovered from component repositories from the web. 
The agent level is applied to the crawling of services and news in these scenarios, highlighting how the semantic description of rules and extracted data can provide complex behaviours and orchestrations of tasks in the web. The main contributions of the thesis are the unified framework for discovery, which allows configuring agents to perform automated tasks. Also, a scraping ontology has been defined for the construction of mappings for scraping web resources. A novel first-order logic rule induction algorithm is defined for the automated construction and maintenance of these mappings out of the visual information in web resources. Additionally, a common unified model for the discovery of services is defined, which allows sharing service descriptions. Future work comprises the further extension of service probing, resource ranking, the extension of the Scraping Ontology, extensions of the agent model, and contructing a base of discovery rules. Resumen La presente tesis doctoral contribuye al problema de descubrimiento de servicios y recursos en el contexto de la web combinable. En la web actual, las tecnologías de combinación de aplicaciones permiten a los desarrolladores reutilizar servicios y contenidos para construir nuevas aplicaciones web. Pese a todo, los desarrolladores afrontan un problema de saturación de información a la hora de buscar servicios o recursos apropiados para su combinación. Para contribuir a la solución de este problema, se propone un marco de trabajo para el descubrimiento de servicios y recursos. En este marco, se definen tres capas sobre las que se realiza descubrimiento a nivel de contenido, servicio y agente. El nivel de contenido involucra a la información disponible en recursos web. La web sigue el estilo arquitectónico Representational Stateless Transfer (REST), en el que los recursos son devueltos como representaciones por parte de los servidores a los clientes. 
Estas representaciones normalmente emplean el lenguaje de marcado HyperText Markup Language (HTML), que, unido al estándar Content Style Sheets (CSS), describe el marcado empleado para mostrar representaciones en un navegador web. Aunque el uso de estándares de la web semántica como Resource Description Framework (RDF) hace apta esta arquitectura para su uso por procesos automatizados, estos estándares no son empleados en muchas ocasiones, por lo que cualquier automatización debe basarse en el procesado del marcado HTML. Este proceso, normalmente conocido como Screen Scraping en la literatura, es el descubrimiento de contenidos en el marco de trabajo propuesto. En este nivel, un conjunto de reglas de descubrimiento indican cómo los diferentes datos en las representaciones de recursos se corresponden con entidades semánticas. Al procesar estas reglas sobre recursos web, pueden obtenerse contenidos descritos semánticamente. El nivel de servicio involucra las operaciones que pueden ser llevadas a cabo en la web. Actualmente, los usuarios de la web pueden realizar diversas tareas como búsqueda, blogging, comercio electrónico o redes sociales. Para describir los posibles servicios en arquitecturas REST, se propone en este nivel una metodología de alto nivel para descubrimiento de servicios orientada a funcionalidades. Este marco de descubrimiento ligero permite definir reglas de descubrimiento de servicios para identificar operaciones en interacciones con recursos REST. Este descubrimiento es por tanto llevado a cabo al aplicar las reglas de descubrimiento sobre contenidos descubiertos en interacciones REST, en un nuevo procedimiento llamado sondeo de servicios. Además, el descubrimiento de servicios puede ser llevado a cabo mediante el modelado de servicios como contenidos. Es decir, mediante la recuperación de documentación de Application Programming Interfaces (APIs) y listas de APIs en registros de servicios como ProgrammableWeb. 
To this end, a unified model of composable components for Mashup-Driven Development (MDD) has been defined after analysing service repositories on the web. The agent level involves the orchestration of service and content discovery. At this level, agent rules allow behaviours to be specified for crawling and executing services, which enables higher-level goals to be achieved. Agent rules are plans that allow introspection over the discovered data and services, as well as over the knowledge present in service and content discovery rules, in order to anticipate the content and services to be found on specific web resources. By defining plans, an agent can be configured to discover specific resources. The discovery framework has been evaluated on different scenarios, each covering different levels of the framework. The Contenidos a la Carta project deals with the mashing-up of news from online newspapers, and in it the framework was used for the discovery and extraction of news from the web. Similarly, in the Resulta and VulneraNET projects, the discovery of ideas and of security knowledge, respectively, was carried out. The service level is covered in the OMELETTE project, in which mashup components such as services and widgets are discovered in component repositories on the web. The agent level is applied to the crawling of services and news in these scenarios, showing how the semantic description of rules and extracted data can provide complex behaviours and orchestrations of tasks on the web. The main contribution of the thesis is the unified framework for discovery, which allows agents to be configured to perform automated tasks.
In addition, a scraping ontology has been defined for the construction of mappings for extracting information from web resources. Likewise, a first-order logic rule induction algorithm has been defined for the construction and maintenance of these mappings from the visual information in web resources. Additionally, a common, unified model for service discovery has been defined, which allows service descriptions to be shared. Future work includes the extension of service probing, resource ranking, the extension of the scraping ontology and the construction of a base of discovery rules.
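As an illustration of the content level described in this abstract, the following minimal Python sketch shows how a set of discovery rules might map fragments of an HTML representation to semantic properties. It is not the scraping ontology or rule language of the thesis: the rule format, the `news:` property names and the `discover` function are assumptions made for this example only.

```python
from html.parser import HTMLParser

# Hypothetical discovery rules: (tag, class attribute) -> semantic property.
# The property names are illustrative and are not taken from the thesis.
RULES = {
    ("h1", "headline"): "news:title",
    ("span", "author"): "news:creator",
}


class RuleScraper(HTMLParser):
    """Collects the text of elements that match a discovery rule."""

    def __init__(self, rules):
        super().__init__()
        self.rules = rules
        self.current = None  # property of the element being read, if any
        self.entity = {}     # semantic property -> extracted text

    def handle_starttag(self, tag, attrs):
        # Look up the rule for this element; None when no rule matches.
        css_class = dict(attrs).get("class")
        self.current = self.rules.get((tag, css_class))

    def handle_endtag(self, tag):
        # Simplification: nested markup inside a matched element is not handled.
        self.current = None

    def handle_data(self, data):
        text = data.strip()
        if self.current and text:
            self.entity[self.current] = text


def discover(html, rules=RULES):
    """Apply the rules to an HTML string and return the described entity."""
    scraper = RuleScraper(rules)
    scraper.feed(html)
    scraper.close()
    return scraper.entity


page = (
    '<div><h1 class="headline">Budget approved</h1>'
    '<span class="author">A. Writer</span><p>Body text</p></div>'
)
print(discover(page))
```

Keeping the rules as data, separate from the parser, loosely mirrors the idea that mappings can be constructed and maintained independently of the code that applies them.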
Abstract:
The goal of the W3C's Media Annotation Working Group (MAWG) is to promote interoperability between multimedia metadata formats on the Web. Audiovisual data is omnipresent on today's Web. However, different interaction interfaces and especially diverse metadata formats prevent unified search, access, and navigation. MAWG has addressed this issue by developing an interlingua ontology and an associated API. This article discusses the rationale and core concepts of the ontology and API for media resources. The specifications developed by MAWG enable interoperable, contextualized and semantic annotation and search, independent of the source metadata format, and connect multimedia data to the Linked Data cloud. Some demonstrators of such applications are also presented in this article.