932 results for Retrieval


Relevance:

10.00%

Publisher:

Abstract:

We investigate whether dimensionality reduction using a latent generative model is beneficial for the task of weakly supervised scene classification. In detail, we are given a set of labeled images of scenes (for example, coast, forest, city, river, etc.), and our objective is to classify a new image into one of these categories. Our approach consists of first discovering latent "topics" using probabilistic Latent Semantic Analysis (pLSA), a generative model from the statistical text literature, here applied to a bag-of-visual-words representation of each image, and subsequently training a multiway classifier on the topic distribution vector of each image. We compare this approach to representing each image directly by a bag-of-visual-words vector and training a multiway classifier on these vectors. To this end, we introduce a novel vocabulary using dense color SIFT descriptors and then investigate classification performance under changes in the size of the visual vocabulary, the number of latent topics learned, and the type of discriminative classifier used (k-nearest neighbor or SVM). In all cases, using the authors' own data sets and testing protocols, we achieve classification performance superior to recent publications that have used a bag-of-visual-words representation. We also investigate the gain from adding spatial information. We show applications to image retrieval with relevance feedback and to scene classification in videos.
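As a rough illustration of the pipeline described above, the sketch below runs the pLSA EM updates on synthetic bag-of-visual-words counts and produces the per-image topic distribution vectors that would then feed a k-NN or SVM classifier. The image counts, vocabulary size and topic count are invented for the example; this is not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 20 images, each a bag-of-visual-words count
# vector over a vocabulary of 50 visual words.
counts = rng.poisson(2.0, size=(20, 50)).astype(float)

def plsa(counts, n_topics=4, n_iter=50):
    """Fit pLSA with EM; returns P(z|d) (per-image topic mixture)
    and P(w|z) (per-topic word distribution)."""
    n_docs, n_words = counts.shape
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: responsibilities P(z|d,w), shape (docs, words, topics)
        joint = p_z_d[:, None, :] * p_w_z.T[None, :, :]
        joint /= joint.sum(axis=2, keepdims=True) + 1e-12
        weighted = counts[:, :, None] * joint
        # M-step: re-estimate both distributions from the responsibilities
        p_w_z = weighted.sum(axis=0).T
        p_w_z /= p_w_z.sum(axis=1, keepdims=True)
        p_z_d = weighted.sum(axis=1)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    return p_z_d, p_w_z

topics_per_image, _ = plsa(counts)
# Each row of topics_per_image is the low-dimensional feature vector
# fed to a k-NN or SVM classifier in place of the raw 50-dim counts.
```

The dimensionality reduction is from the 50-word count space down to the 4-topic simplex, which is the trade-off the paper evaluates.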


Consumer reviews, opinions and shared experiences in the use of a product are a powerful source of information about consumer preferences that can be used in recommender systems. Despite the importance and value of such information, there is no comprehensive mechanism that formalizes the selection and retrieval of opinions and the utilization of the retrieved opinions, owing to the difficulty of extracting information from text data. In this paper, a new recommender system built on consumer product reviews is proposed, together with a prioritizing mechanism developed for the system. The proposed approach is illustrated using the case study of a recommender system for digital cameras.


The new degree programmes adapted to the European Higher Education Area are already being implemented in our universities, and this has led us to ask what changes we should incorporate as teachers so that our students acquire the competences, abilities and skills needed to become competent professionals in the near future.


This class introduces the basics of web mining and information retrieval, including an introduction to the Vector Space Model and text mining.
Guest Lecturer: Dr. Michael Granitzer
Optional reading: Modeling the Internet and the Web: Probabilistic Methods and Algorithms, Pierre Baldi, Paolo Frasconi, Padhraic Smyth, Wiley, 2003 (Chapter 4, Text Analysis)
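As a minimal illustration of the Vector Space Model the class covers, the sketch below ranks a toy corpus against a query by TF-IDF weighting and cosine similarity; the documents and query are invented for the example.

```python
import math
from collections import Counter

# Toy corpus (illustrative only)
docs = ["web mining finds patterns in web data",
        "information retrieval ranks documents for a query",
        "text mining extracts knowledge from text"]

tokenized = [d.split() for d in docs]
n_docs = len(tokenized)
vocab = sorted({t for doc in tokenized for t in doc})

# idf(t) = log(N / df(t)): rare terms weigh more
df = Counter(t for doc in tokenized for t in set(doc))
idf = {t: math.log(n_docs / df[t]) for t in vocab}

def tfidf(tokens):
    """Represent a token list as a TF-IDF vector over the vocabulary."""
    tf = Counter(tokens)
    return [tf[t] * idf.get(t, 0.0) for t in vocab]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

query_vec = tfidf("information retrieval".split())
scores = [cosine(query_vec, tfidf(doc)) for doc in tokenized]
# The second document shares both query terms, so it scores highest.
```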


Building on a study of asynchronous communication schemes, this article proposes a web-services-based framework for the collaborative retrieval of educational documents in distributed environments. More concretely, it 1) proposes explicitly describing the intent of inter-process messages using performatives from the FIPA standard; then, on the basis of that description, it 2) presents a set of metrics that evaluate information management, and 3) defines two algorithms that improve the quality of service of information exchange by improving the educational-document retrieval process.
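A minimal sketch of the first point: tagging inter-process messages with FIPA ACL performatives so that their intent is explicit. The message fields and service names are hypothetical, and only a few of FIPA's communicative acts are listed.

```python
# A small subset of the FIPA ACL communicative acts; the envelope
# layout below is an illustrative assumption, not the FIPA wire format.
FIPA_PERFORMATIVES = {"request", "inform", "query-ref", "agree",
                      "refuse", "failure"}

def make_message(performative, sender, receiver, content):
    """Build a message whose intent is declared by its performative."""
    if performative not in FIPA_PERFORMATIVES:
        raise ValueError("unknown performative: %s" % performative)
    return {"performative": performative, "sender": sender,
            "receiver": receiver, "content": content}

# A retrieval request and its reply, each tagged by intent:
req = make_message("query-ref", "client-1", "repo-service",
                   "documents about information retrieval")
rep = make_message("inform", "repo-service", "client-1",
                   ["doc-17", "doc-42"])
```

Declaring intent this way is what lets the article's metrics and algorithms reason about message traffic without parsing message bodies.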


Information management is a fundamental process in the development of society, and educational documentation is a cornerstone in this context. The main objective of this paper is to substantiate the direct impact that semantic management has on: 1) the organisation of educational documentation, 2) easier information retrieval through more intelligent, semantics-based search systems, and 3) support for tools that integrate different services. The article presents the results achieved within Educative, a project for the integral management of educational documentation in universities, aimed at a more well-rounded university education in the Cuban context.


These slides support students in responding to the challenge: "I've been told not to use Google or Wikipedia to research my essay. What else is there?" The PowerPoint guides students in identifying high-quality, up-to-date and relevant resources on the web that they can reliably draw upon for their academic assignments. The slides were created by Fiona Nichols, the subject liaison librarian who supports the School of Electronics and Computer Science at the University of Southampton.


Finding journal articles from full-text sources such as IEEE Xplore, ACM and LNCS (Lecture Notes in Computer Science).


Real-time geoparsing of social media streams (e.g. Twitter, YouTube, Instagram, Flickr, Foursquare) is providing a new 'virtual sensor' capability to end users such as emergency response agencies (e.g. tsunami early warning centres, civil protection authorities) and news agencies (e.g. Deutsche Welle, BBC News). Challenges in this area include scaling up natural language processing (NLP) and information retrieval (IR) approaches to handle real-time traffic volumes, reducing false positives, creating real-time infographic displays useful for effective decision support, and providing support for trust and credibility analysis using geosemantics. In this seminar I will present on-going work by the IT Innovation Centre over the last 4 years (the TRIDEC and REVEAL FP7 projects) on building such systems, and highlight our research towards improving the trustworthiness and credibility of crisis-map displays and real-time analytics for trending topics and influential social networks during major newsworthy events.


Abstract: Big data is nowadays a fashionable topic, independently of what people mean when they use this term. But being big is just a matter of volume, although there is no clear agreement on the size threshold. On the other hand, it is easy to capture large amounts of data using a brute-force approach. So the real goal should not be big data but to ask ourselves, for a given problem, what the right data is and how much of it is needed. For some problems this implies big data, but for the majority of problems much less data is needed. In this talk we explore the trade-offs involved and the main problems that come with big data, using the Web as a case study: scalability, redundancy, bias, noise, spam, and privacy.

Speaker Biography: Ricardo Baeza-Yates has been VP of Research for Yahoo Labs, leading teams in the United States, Europe and Latin America, since 2006, and has been based in Sunnyvale, California, since August 2014. During this time he has led the labs in Barcelona and Santiago de Chile, and between 2008 and 2012 he also oversaw the Haifa lab. He is also a part-time Professor at the Dept. of Information and Communication Technologies of the Universitat Pompeu Fabra in Barcelona, Spain. During 2005 he was an ICREA research professor at the same university. Until 2004 he was Professor and, before that, founder and Director of the Center for Web Research at the Dept. of Computing Science of the University of Chile (on leave of absence until today). He obtained a Ph.D. in CS from the University of Waterloo, Canada, in 1989. Before that he obtained two masters (M.Sc. CS & M.Eng. EE) and the electronics engineer degree from the University of Chile in Santiago. He is co-author of the best-selling textbook Modern Information Retrieval, published in 1999 by Addison-Wesley with a second, enlarged edition in 2011 that won the ASIST 2012 Book of the Year award. He is also co-author of the 2nd edition of the Handbook of Algorithms and Data Structures, Addison-Wesley, 1991, and co-editor of Information Retrieval: Algorithms and Data Structures, Prentice-Hall, 1992, among more than 500 other publications. From 2002 to 2004 he was elected to the board of governors of the IEEE Computer Society, and in 2012 he was elected to the ACM Council. He has received the Organization of American States award for young researchers in exact sciences (1993), the Graham Medal for innovation in computing, given by the University of Waterloo to distinguished ex-alumni (2007), the CLEI Latin American distinction for contributions to CS in the region (2009), and the National Award of the Chilean Association of Engineers (2010), among other distinctions. In 2003 he became the first computer scientist elected to the Chilean Academy of Sciences, and since 2010 he has been a founding member of the Chilean Academy of Engineering. In 2009 he was named an ACM Fellow and in 2011 an IEEE Fellow.


Shape complexity has recently received attention from different fields, such as computer vision and psychology. In this paper, integral geometry and information theory tools are applied to quantify shape complexity from two different perspectives: from the inside of the object, we evaluate its degree of structure or correlation between its surfaces (inner complexity), and from the outside, we compute its degree of interaction with the circumscribing sphere (outer complexity). Our shape complexity measures are based on the following two facts: uniformly distributed global lines crossing an object define a continuous information channel, and the continuous mutual information of this channel is independent of the object discretisation and invariant to translations, rotations, and changes of scale. The measures introduced in this paper can potentially be used as shape descriptors for object recognition, image retrieval, object localisation, tumour analysis, and protein docking, among others.
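Both measures rest on the continuous mutual information of the line channel. In its standard form (generic symbols, not necessarily the paper's notation), with X the channel input and Y its output:

```latex
I(X;Y) = \int_{\mathcal{X}} \int_{\mathcal{Y}} p(x,y)\,
         \log \frac{p(x,y)}{p(x)\,p(y)} \, dy \, dx
```

It is this quantity that, for uniformly distributed global lines, is independent of the discretisation and invariant to translations, rotations and changes of scale.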


When publishing information on the web, one expects it to reach all the people that could be interested in it. This is mainly achieved with general-purpose indexing and search engines like Google, the most widely used today. In the particular case of the geographic information (GI) domain, exposing content to mainstream search engines is a complex task that needs specific actions. On many occasions it is convenient to provide a web site with a specially tailored search engine. Such is the case for on-line dictionaries (Wikipedia, WordReference), stores (Amazon, eBay), and generally all those holding thematic databases. Due to the proliferation of these engines, A9.com proposed a standard interface called OpenSearch, used by modern web browsers to manage custom search engines. Geographic information can also benefit from the use of specific search engines. We can distinguish two main approaches in GI retrieval efforts: on one hand, classical OGC standardization (CSW, WFS filters), which is very complex for the mainstream user; on the other, the neogeographer's approach, usually in the form of specific APIs lacking a common query interface and standard geographic formats. A draft 'geo' extension for OpenSearch has been proposed. It adds geographic filtering for queries and recommends a set of simple, standard geographic response formats, such as KML, Atom and GeoRSS. This proposal enables standardization while keeping simplicity, thus covering a wide range of use cases in both the OGC and neogeography paradigms. In this article we analyze the OpenSearch geo extension in detail and its use cases, demonstrating its applicability to both the SDI and the geoweb. Open source implementations are presented as well.
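As an illustration of how an OpenSearch 'geo' query might look, the sketch below fills a URL template carrying the geo:box parameter (west,south,east,north in decimal degrees). The endpoint URL is hypothetical; the brace-delimited parameter names follow OpenSearch URL-template syntax.

```python
# Hypothetical search endpoint; {searchTerms} is core OpenSearch,
# {geo:box} comes from the draft geo extension.
TEMPLATE = ("https://example.org/search?"
            "q={searchTerms}&bbox={geo:box}&format=atom")

def fill_template(template, values):
    """Substitute OpenSearch template parameters into a URL."""
    url = template
    for name, value in values.items():
        url = url.replace("{%s}" % name, value)
    return url

url = fill_template(TEMPLATE, {
    "searchTerms": "hydrography",
    "geo:box": "2.0,41.2,2.3,41.5",  # roughly a box around Barcelona
})
```

A client that understands the geo extension can thus restrict any compliant search engine to a bounding box without knowing anything else about the backend.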


This paper describes an experience in organising the curriculum of students in the degree in Geography, Spatial Planning and Environmental Management at the Universitat de Girona, concerning skills in processing and exploiting Geographic Information, especially with Geographic Information Systems (GIS). Within this framework, profiles are defined with the goal of training professionals able to take up positions as GIS technicians, analysts or project managers; through the degree's own specific courses, a programme is designed that, together with other educational resources external to the Geography degree, lets students choose a different, personalised depth of study according to their abilities and interests.


The methodology is focused on the use of digital air photos to monitor changes in land cover and to study its dynamics and patterns over the last 50 years. The dissertation also takes into account the relationship between open-habitat patterns/dynamics and biodiversity persistence, increased risk of fire, land ownership and management. A Geographic Information System (GIS) is therefore a very useful mapping tool, enabling the capture, storage, retrieval, manipulation, analysis and modeling of geographic or spatial data. Finally, this research develops a heuristic model to create sites using suitability maps, and a reserve design model to select the optimal sites in order to increase landscape heterogeneity at the lowest cost.
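A toy version of such a site-selection heuristic could look as follows; the suitability and cost grids, the budget, and the suitability-per-cost greedy rule are illustrative assumptions, not the dissertation's actual model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative grids: per-cell habitat suitability and acquisition/
# management cost on a 10x10 landscape (invented data).
suitability = rng.random((10, 10))
cost = 0.5 + rng.random((10, 10))

def greedy_select(suitability, cost, budget):
    """Greedy heuristic: repeatedly take the cell with the best
    suitability-per-cost ratio until the budget is spent."""
    ratio = suitability / cost
    order = np.argsort(ratio, axis=None)[::-1]  # best cells first
    chosen, spent = [], 0.0
    for idx in order:
        cell = np.unravel_index(idx, cost.shape)
        if spent + cost[cell] <= budget:
            chosen.append(cell)
            spent += cost[cell]
    return chosen, spent

sites, total_cost = greedy_select(suitability, cost, budget=5.0)
```

A real reserve-design model would add spatial constraints (contiguity, minimum patch size); the greedy ratio rule is only the simplest budget-respecting baseline.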


Circuit testing is a phase of the production process that becomes ever more important when a new product is developed. Test and diagnosis techniques for digital circuits have been developed and automated successfully, whereas this is not yet the case for analogue circuits. Among all the methods proposed for diagnosing analogue circuits, the most widely used are fault dictionaries. This thesis describes some of them, analysing their advantages and drawbacks. In recent years, Artificial Intelligence techniques have become one of the most important research fields for fault diagnosis. This thesis develops two such techniques to cover some of the shortcomings of fault dictionaries. The first proposal builds a fuzzy system as an identification tool. The results obtained are quite good, since the fault is located in a high percentage of cases; on the other hand, the success rate is not good enough when the deviation must also be determined. Since fault dictionaries can be seen as a simplified approximation to Case-Based Reasoning (CBR), the second proposal extends fault dictionaries towards a CBR system. The aim is not to give a general solution to the problem but to contribute a new methodology: improving fault-dictionary diagnosis by adding and adapting new cases so as to become a Case-Based Reasoning system. The structure of the case base is described, together with the retrieval, reuse, revision and retention tasks, with emphasis on the learning process.
Throughout the text several circuits are used to illustrate the test methods described, but the biquadratic filter in particular is used to evaluate the proposed methodologies, since it is one of the benchmarks proposed in the analogue-circuit context. The faults considered are parametric, permanent, independent and simple, although the methodology can easily be extrapolated to the diagnosis of multiple and catastrophic faults. The method focuses on testing the passive components, although it could also be extended to faults in the active ones.
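The basic fault-dictionary lookup that the thesis builds on can be sketched as a nearest-signature search; the component faults and signature values below are invented for illustration (a real system would normalise the features so that no single measurement dominates the distance).

```python
import math

# Hypothetical fault dictionary for an analogue filter: each fault is
# keyed to a pre-simulated signature (here, gain and cutoff frequency
# in Hz); the values are illustrative, not the thesis benchmark.
dictionary = {
    "nominal":         (1.00, 1000.0),
    "R1 +50%":         (0.70, 1000.0),
    "C2 -50%":         (1.00, 1800.0),
    "R1 +50% C2 -50%": (0.70, 1800.0),
}

def diagnose(measurement, dictionary):
    """Return the fault whose stored signature is nearest (Euclidean
    distance) to the measured one -- the basic dictionary lookup."""
    return min(dictionary,
               key=lambda fault: math.dist(measurement,
                                           dictionary[fault]))

fault = diagnose((0.71, 1000.0), dictionary)  # nearest entry is "R1 +50%"
```

A CBR extension, as proposed in the thesis, would then adapt and retain measurements that no dictionary entry explains well, instead of always forcing them onto the nearest pre-simulated case.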