60 resultados para Knowledge retrieval, Ontology, User information needs, User profiles, Information retrieval
em Universidad Politécnica de Madrid
Resumo:
Nanotechnology represents an area of particular promise and significant opportunity across multiple scientific disciplines. Ongoing nanotechnology research ranges from the characterization of nanoparticles and nanomaterials to the analysis and processing of experimental data seeking correlations between nanoparticles and their functionalities and side effects. Due to their special properties, nanoparticles are suitable for cellular-level diagnostics and therapy, offering numerous applications in medicine, e.g. development of biomedical devices, tissue repair, drug delivery systems and biosensors. In nanomedicine, recent studies are producing large amounts of structural and property data, highlighting the role for computational approaches in information management. While in vitro and in vivo assays are expensive, the cost of computing is falling. Furthermore, improvements in the accuracy of computational methods (e.g. data mining, knowledge discovery, modeling and simulation) have enabled effective tools to automate the extraction, management and storage of these vast data volumes. Since this information is widely distributed, one major issue is how to locate and access data where it resides (which also poses data-sharing limitations). The novel discipline of nanoinformatics addresses the information challenges related to nanotechnology research. In this paper, we summarize the needs and challenges in the field and present an overview of extant initiatives and efforts.
Resumo:
Aeronautical charts underlie the representation of aeronautic geographic information that supports pilots in flight. Nevertheless, charts become complex due to the high density of data and the different kinds that support each phase of flight. These features make difficult using them on board. After conducting a study that aims to understand and to evaluate pilot’s needs related to Geographic Information, it is proposed a solution to implement a platform based on geographic information standards (OGC, ISO) and supported by a distributed Web architecture. This platform facilitates the use, retrieval, updating of information and its exchange among different institutions through private and public users. As a first element to ensure interoperability and the harmonisation of information, we propose an aeronautical metadata profile that sets guidelines and elements for its description. This profile meets the standards set by ICAO, Eurocontrol and ISO. The platform offers three levels of access to data through different types of devices and user profiles. This paper suggests an alternative and reliable way for distributing aeronautical geoinformation, focusing on specific functions or displaying and querying.
Resumo:
Enriching knowledge bases with multimedia information makes it possible to complement textual descriptions with visual and audio information. Such complementary information can help users to understand the meaning of assertions, and in general improve the user experience with the knowledge base. In this paper we address the problem of how to enrich ontology instances with candidate images retrieved from existing Web search engines. DBpedia has evolved into a major hub in the Linked Data cloud, interconnecting millions of entities organized under a consistent ontology. Our approach taps into the Wikipedia corpus to gather context information for DBpedia instances and takes advantage of image tagging information when this is available to calculate semantic relatedness between instances and candidate images. We performed experiments with focus on the particularly challenging problem of highly ambiguous names. Both methods presented in this work outperformed the baseline. Our best method leveraged context words from Wikipedia, tags from Flickr and type information from DBpedia to achieve an average precision of 80%.
Resumo:
Aeronautical charts underlie the representation of aeronautic geographic information that supports pilots in flight. Nevertheless, the charts become complex due to the high density of data and the different kinds of charts that support each phase of flight. These features make difficult using them on board. After conducting a study, with civil Spaniard pilots, that aims to understand and to evaluate their needs related to Geographic Information, it is proposed a solution to implement a platform based on geographic information standards (OGC, ISO) and supported by a distributed Web architecture. This platform facilitates the use, retrieval, updating of information and its exchange among different institutions through private and public users. As a first element to ensure interoperability of information, we suggest an aeronautical metadata profile that sets guidelines and elements for its description. The metadata profile meets the standards set by ICAO, Eurocontrol and ISO. The platform offers three levels of access to data through different types of devices and user profiles. Thus, aeronautical institutions could edit data while pilot is on board accessing digital aeronautical charts through a laptop or Table PC. This paper suggests an alternative and reliable way for distributing aeronautical geoinformation, focusing on specific functions or displaying and querying.
Resumo:
Los servicios telemáticos han transformando la mayoría de nuestras actividades cotidianas y ofrecen oportunidades sin precedentes con características como, por ejemplo, el acceso ubicuo, la disponibilidad permanente, la independencia del dispositivo utilizado, la multimodalidad o la gratuidad, entre otros. No obstante, los beneficios que destacan en cuanto se reflexiona sobre estos servicios, tienen como contrapartida una serie de riesgos y amenazas no tan obvios, ya que éstos se nutren de y tratan con datos personales, lo cual suscita dudas respecto a la privacidad de las personas. Actualmente, las personas que asumen el rol de usuarios de servicios telemáticos generan constantemente datos digitales en distintos proveedores. Estos datos reflejan parte de su intimidad, de sus características particulares, preferencias, intereses, relaciones sociales, hábitos de consumo, etc. y lo que es más controvertido, toda esta información se encuentra bajo la custodia de distintos proveedores que pueden utilizarla más allá de las necesidades y el control del usuario. Los datos personales y, en particular, el conocimiento sobre los usuarios que se puede extraer a partir de éstos (modelos de usuario) se han convertido en un nuevo activo económico para los proveedores de servicios. De este modo, estos recursos se pueden utilizar para ofrecer servicios centrados en el usuario basados, por ejemplo, en la recomendación de contenidos, la personalización de productos o la predicción de su comportamiento, lo cual permite a los proveedores conectar con los usuarios, mantenerlos, involucrarlos y en definitiva, fidelizarlos para garantizar el éxito de un modelo de negocio. Sin embargo, dichos recursos también pueden utilizarse para establecer otros modelos de negocio que van más allá de su procesamiento y aplicación individual por parte de un proveedor y que se basan en su comercialización y compartición con otras entidades. Bajo esta perspectiva, los usuarios sufren una falta de control sobre los datos que les refieren, ya que esto depende de la voluntad y las condiciones impuestas por los proveedores de servicios, lo cual implica que habitualmente deban enfrentarse ante la disyuntiva de ceder sus datos personales o no acceder a los servicios telemáticos ofrecidos. Desde el sector público se trata de tomar medidas que protejan a los usuarios con iniciativas y legislaciones que velen por su privacidad y que aumenten el control sobre sus datos personales, a la vez que debe favorecer el desarrollo económico propiciado por estos proveedores de servicios. En este contexto, esta tesis doctoral propone una arquitectura y modelo de referencia para un ecosistema de intercambio de datos personales centrado en el usuario que promueve la creación, compartición y utilización de datos personales y modelos de usuario entre distintos proveedores, al mismo tiempo que ofrece a los usuarios las herramientas necesarias para ejercer su control en cuanto a la cesión y uso de sus recursos personales y obtener, en su caso, distintos incentivos o contraprestaciones económicas. Las contribuciones originales de la tesis son la especificación y diseño de una arquitectura que se apoya en un proceso de modelado distribuido que se ha definido en el marco de esta investigación. Éste se basa en el aprovechamiento de recursos que distintas entidades (fuentes de datos) ofrecen para generar modelos de usuario enriquecidos que cubren las necesidades específicas de terceras entidades, considerando la participación del usuario y el control sobre sus recursos personales (datos y modelos de usuario). Lo anterior ha requerido identificar y caracterizar las fuentes de datos con potencial de abastecer al ecosistema, determinar distintos patrones para la generación de modelos de usuario a partir de datos personales distribuidos y heterogéneos y establecer una infraestructura para la gestión de identidad y privacidad que permita a los usuarios expresar sus preferencias e intereses respecto al uso y compartición de sus recursos personales. Además, se ha definido un modelo de negocio de referencia que sustenta las investigaciones realizadas y que ha sido particularizado en dos ámbitos de aplicación principales, en concreto, el sector de publicidad en redes sociales y el sector financiero para la implantación de nuevos servicios. Finalmente, cabe destacar que las contribuciones de esta tesis han sido validadas en el contexto de distintos proyectos de investigación industrial aplicada y también en el marco de proyectos fin de carrera que la autora ha tutelado o en los que ha colaborado. Los resultados obtenidos han originado distintos méritos de investigación como dos patentes en explotación, la publicación de un artículo en una revista con índice de impacto y diversos artículos en congresos internacionales de relevancia. Algunos de éstos han sido galardonados con premios de distintas instituciones, así como en las conferencias donde han sido presentados. ABSTRACT Information society services have changed most of our daily activities, offering unprecedented opportunities with certain characteristics, such as: ubiquitous access, permanent availability, device independence, multimodality and free-of-charge services, among others. However, all the positive aspects that emerge when thinking about these services have as counterpart not-so-obvious threats and risks, because they feed from and use personal data, thus creating concerns about peoples’ privacy. Nowadays, people that play the role of user of services are constantly generating digital data in different service providers. These data reflect part of their intimacy, particular characteristics, preferences, interests, relationships, consumer behavior, etc. Controversy arises because this personal information is stored and kept by the mentioned providers that can use it beyond the user needs and control. Personal data and, in particular, the knowledge about the user that can be obtained from them (user models) have turned into a new economic asset for the service providers. In this way, these data and models can be used to offer user centric services based, for example, in content recommendation, tailored-products or user behavior, all of which allows connecting with the users, keeping them more engaged and involved with the provider, finally reaching customer loyalty in order to guarantee the success of a business model. However, these resources can be used to establish a different kind of business model; one that does not only processes and individually applies personal data, but also shares and trades these data with other entities. From that perspective, the users lack control over their referred data, because it depends from the conditions imposed by the service providers. The consequence is that the users often face the following dilemma: either giving up their personal data or not using the offered services. The Public Sector takes actions in order to protect the users approving, for example, laws and legal initiatives that reinforce privacy and increase control over personal data, while at the same time the authorities are also key players in the economy development that derives from the information society services. In this context, this PhD Dissertation proposes an architecture and reference model to achieve a user-centric personal data ecosystem that promotes the creation, sharing and use of personal data and user models among different providers, while offering users the tools to control who can access which data and why and if applicable, to obtain different incentives. The original contributions obtained are the specification and design of an architecture that supports a distributed user modelling process defined by this research. This process is based on leveraging scattered resources of heterogeneous entities (data sources) to generate on-demand enriched user models that fulfill individual business needs of third entities, considering the involvement of users and the control over their personal resources (data and user models). This has required identifying and characterizing data sources with potential for supplying resources, defining different generation patterns to produce user models from scattered and heterogeneous data, and establishing identity and privacy management infrastructures that allow users to set their privacy preferences regarding the use and sharing of their resources. Moreover, it has also been proposed a reference business model that supports the aforementioned architecture and this has been studied for two application fields: social networks advertising and new financial services. Finally, it has to be emphasized that the contributions obtained in this dissertation have been validated in the context of several national research projects and master thesis that the author has directed or has collaborated with. Furthermore, these contributions have produced different scientific results such as two patents and different publications in relevant international conferences and one magazine. Some of them have been awarded with different prizes.
Resumo:
Sensor networks are increasingly becoming one of the main sources of Big Data on the Web. However, the observations that they produce are made available with heterogeneous schemas, vocabularies and data formats, making it difficult to share and reuse these data for other purposes than those for which they were originally set up. In this thesis we address these challenges, considering how we can transform streaming raw data to rich ontology-based information that is accessible through continuous queries for streaming data. Our main contribution is an ontology-based approach for providing data access and query capabilities to streaming data sources, allowing users to express their needs at a conceptual level, independent of implementation and language-specific details. We introduce novel query rewriting and data translation techniques that rely on mapping definitions relating streaming data models to ontological concepts. Specific contributions include: • The syntax and semantics of the SPARQLStream query language for ontologybased data access, and a query rewriting approach for transforming SPARQLStream queries into streaming algebra expressions. • The design of an ontology-based streaming data access engine that can internally reuse an existing data stream engine, complex event processor or sensor middleware, using R2RML mappings for defining relationships between streaming data models and ontology concepts. Concerning the sensor metadata of such streaming data sources, we have investigated how we can use raw measurements to characterize streaming data, producing enriched data descriptions in terms of ontological models. Our specific contributions are: • A representation of sensor data time series that captures gradient information that is useful to characterize types of sensor data. • A method for classifying sensor data time series and determining the type of data, using data mining techniques, and a method for extracting semantic sensor metadata features from the time series.
Resumo:
The emergence of cloud datacenters enhances the capability of online data storage. Since massive data is stored in datacenters, it is necessary to effectively locate and access interest data in such a distributed system. However, traditional search techniques only allow users to search images over exact-match keywords through a centralized index. These techniques cannot satisfy the requirements of content based image retrieval (CBIR). In this paper, we propose a scalable image retrieval framework which can efficiently support content similarity search and semantic search in the distributed environment. Its key idea is to integrate image feature vectors into distributed hash tables (DHTs) by exploiting the property of locality sensitive hashing (LSH). Thus, images with similar content are most likely gathered into the same node without the knowledge of any global information. For searching semantically close images, the relevance feedback is adopted in our system to overcome the gap between low-level features and high-level features. We show that our approach yields high recall rate with good load balance and only requires a few number of hops.
Resumo:
OntoTag - A Linguistic and Ontological Annotation Model Suitable for the Semantic Web
1. INTRODUCTION. LINGUISTIC TOOLS AND ANNOTATIONS: THEIR LIGHTS AND SHADOWS
Computational Linguistics is already a consolidated research area. It builds upon the results of other two major ones, namely Linguistics and Computer Science and Engineering, and it aims at developing computational models of human language (or natural language, as it is termed in this area). Possibly, its most well-known applications are the different tools developed so far for processing human language, such as machine translation systems and speech recognizers or dictation programs.
These tools for processing human language are commonly referred to as linguistic tools. Apart from the examples mentioned above, there are also other types of linguistic tools that perhaps are not so well-known, but on which most of the other applications of Computational Linguistics are built. These other types of linguistic tools comprise POS taggers, natural language parsers and semantic taggers, amongst others. All of them can be termed linguistic annotation tools.
Linguistic annotation tools are important assets. In fact, POS and semantic taggers (and, to a lesser extent, also natural language parsers) have become critical resources for the computer applications that process natural language. Hence, any computer application that has to analyse a text automatically and ‘intelligently’ will include at least a module for POS tagging. The more an application needs to ‘understand’ the meaning of the text it processes, the more linguistic tools and/or modules it will incorporate and integrate.
However, linguistic annotation tools have still some limitations, which can be summarised as follows:
1. Normally, they perform annotations only at a certain linguistic level (that is, Morphology, Syntax, Semantics, etc.).
2. They usually introduce a certain rate of errors and ambiguities when tagging. This error rate ranges from 10 percent up to 50 percent of the units annotated for unrestricted, general texts.
3. Their annotations are most frequently formulated in terms of an annotation schema designed and implemented ad hoc.
A priori, it seems that the interoperation and the integration of several linguistic tools into an appropriate software architecture could most likely solve the limitations stated in (1). Besides, integrating several linguistic annotation tools and making them interoperate could also minimise the limitation stated in (2). Nevertheless, in the latter case, all these tools should produce annotations for a common level, which would have to be combined in order to correct their corresponding errors and inaccuracies. Yet, the limitation stated in (3) prevents both types of integration and interoperation from being easily achieved.
In addition, most high-level annotation tools rely on other lower-level annotation tools and their outputs to generate their own ones. For example, sense-tagging tools (operating at the semantic level) often use POS taggers (operating at a lower level, i.e., the morphosyntactic) to identify the grammatical category of the word or lexical unit they are annotating. Accordingly, if a faulty or inaccurate low-level annotation tool is to be used by other higher-level one in its process, the errors and inaccuracies of the former should be minimised in advance. Otherwise, these errors and inaccuracies would be transferred to (and even magnified in) the annotations of the high-level annotation tool.
Therefore, it would be quite useful to find a way to
(i) correct or, at least, reduce the errors and the inaccuracies of lower-level linguistic tools;
(ii) unify the annotation schemas of different linguistic annotation tools or, more generally speaking, make these tools (as well as their annotations) interoperate.
Clearly, solving (i) and (ii) should ease the automatic annotation of web pages by means of linguistic tools, and their transformation into Semantic Web pages (Berners-Lee, Hendler and Lassila, 2001). Yet, as stated above, (ii) is a type of interoperability problem. There again, ontologies (Gruber, 1993; Borst, 1997) have been successfully applied thus far to solve several interoperability problems. Hence, ontologies should help solve also the problems and limitations of linguistic annotation tools aforementioned.
Thus, to summarise, the main aim of the present work was to combine somehow these separated approaches, mechanisms and tools for annotation from Linguistics and Ontological Engineering (and the Semantic Web) in a sort of hybrid (linguistic and ontological) annotation model, suitable for both areas. This hybrid (semantic) annotation model should (a) benefit from the advances, models, techniques, mechanisms and tools of these two areas; (b) minimise (and even solve, when possible) some of the problems found in each of them; and (c) be suitable for the Semantic Web. The concrete goals that helped attain this aim are presented in the following section.
2. GOALS OF THE PRESENT WORK
As mentioned above, the main goal of this work was to specify a hybrid (that is, linguistically-motivated and ontology-based) model of annotation suitable for the Semantic Web (i.e. it had to produce a semantic annotation of web page contents). This entailed that the tags included in the annotations of the model had to (1) represent linguistic concepts (or linguistic categories, as they are termed in ISO/DCR (2008)), in order for this model to be linguistically-motivated; (2) be ontological terms (i.e., use an ontological vocabulary), in order for the model to be ontology-based; and (3) be structured (linked) as a collection of ontology-based
Resumo:
The products and services designed for Smart Cities provide the necessary tools to improve the management of modern cities in a more efficient way. These tools need to gather citizens’ information about their activity, preferences, habits, etc. opening up the possibility of tracking them. Thus, privacy and security policies must be developed in order to satisfy and manage the legislative heterogeneity surrounding the services provided and comply with the laws of the country where they are provided. This paper presents one of the possible solutions to manage this heterogeneity, bearing in mind these types of networks, such as Wireless Sensor Networks, have important resource limitations. A knowledge and ontology management system is proposed to facilitate the collaboration between the business, legal and technological areas. This will ease the implementation of adequate specific security and privacy policies for a given service. All these security and privacy policies are based on the information provided by the deployed platforms and by expert system processing.
Resumo:
In the context of the Semantic Web, natural language descriptions associated with ontologies have proven to be of major importance not only to support ontology developers and adopters, but also to assist in tasks such as ontology mapping, information extraction, or natural language generation. In the state-of-the-art we find some attempts to provide guidelines for URI local names in English, and also some disagreement on the use of URIs for describing ontology elements. When trying to extrapolate these ideas to a multilingual scenario, some of these approaches fail to provide a valid solution. On the basis of some real experiences in the translation of ontologies from English into Spanish, we provide a preliminary set of guidelines for naming and labeling ontologies in a multilingual scenario.
Resumo:
We review the evolution, state of the art and future lines of research on the sources, transport pathways, and sinks of particulate trace elements in urban terrestrial environments to include the atmosphere, soils, and street and indoor dusts. Such studies reveal reductions in the emissions of some elements of historical concern such as Pb, with interest consequently focusing on other toxic trace elements such as As, Cd, Hg, Zn, and Cu. While establishment of levels of these elements is important in assessing the potential impacts of human society on the urban environment, it is also necessary to apply this knowledge in conjunction with information on the toxicity of those trace elements and the degree of exposure of human receptors to an assessment of whether such contamination represents a real risk to the city’s inhabitants and therefore how this risk can be addressed.
Resumo:
The Web has witnessed an enormous growth in the amount of semantic information published in recent years. This growth has been stimulated to a large extent by the emergence of Linked Data. Although this brings us a big step closer to the vision of a Semantic Web, it also raises new issues such as the need for dealing with information expressed in different natural languages. Indeed, although the Web of Data can contain any kind of information in any language, it still lacks explicit mechanisms to automatically reconcile such information when it is expressed in different languages. This leads to situations in which data expressed in a certain language is not easily accessible to speakers of other languages. The Web of Data shows the potential for being extended to a truly multilingual web as vocabularies and data can be published in a language-independent fashion, while associated language-dependent (linguistic) information supporting the access across languages can be stored separately. In this sense, the multilingual Web of Data can be realized in our view as a layer of services and resources on top of the existing Linked Data infrastructure adding i) linguistic information for data and vocabularies in different languages, ii) mappings between data with labels in different languages, and iii) services to dynamically access and traverse Linked Data across different languages. In this article we present this vision of a multilingual Web of Data. We discuss challenges that need to be addressed to make this vision come true and discuss the role that techniques such as ontology localization, ontology mapping, and cross-lingual ontology-based information access and presentation will play in achieving this. Further, we propose an initial architecture and describe a roadmap that can provide a basis for the implementation of this vision.
Resumo:
At the present time almost all map libraries on the Internet are image collections generated by the digitization of early maps. This type of graphics files provides researchers with the possibility of accessing and visualizing historical cartographic information keeping in mind that this information has a degree of quality that depends upon elements such as the accuracy of the digitization process and proprietary constraints (e.g. visualization, resolution downloading options, copyright, use constraints). In most cases, access to these map libraries is useful only as a first approach and it is not possible to use those maps for scientific work due to the sparse tools available to measure, match, analyze and/or combine those resources with different kinds of cartography. This paper presents a method to enrich virtual map rooms and provide historians and other professional with a tool that let them to make the most of libraries in the digital era.
Resumo:
In this paper, we propose a solution to an NP-complete problem, namely the "3-colorability problem", based on a network of polarized processors. Our solution is uniform and time efficient.
Resumo:
This paper aims to analyse the use of anabolic drugs among Greek students participating in school championships of physical education (PE). In order to do it, a survey was conducted during the 2008 to 2009 academic year in suburban, urban and metropolitan areas in Greece. The sample was 2,535 high school students from the 10 to 12th grade, participating in the school physical education championships. The results showed that 9.6% of boys and 3.7% of girls reported that they had used anabolic drugs sometime in the past whereas 11.2% boys and 4.8% girls reported that they would intend to use them in the future. This confirms that anabolic steroids are an important problem among adolescents, and educational programs should increase their knowledge about these drugs. Information should come not only from the state, but also from coaches, teachers, trainers and parents.