973 resultados para Unstructured content search
Resumo:
The paper proposes an ISE (Information goal, Search strategy, Evaluation threshold) user classification model based on Information Foraging Theory for understanding user interaction with content-based image retrieval (CBIR). The proposed model is verified by a multiple linear regression analysis based on 50 users' interaction features collected from a task-based user study of interactive CBIR systems. To our best knowledge, this is the first principled user classification model in CBIR verified by a formal and systematic qualitative analysis of extensive user interaction data. Copyright 2010 ACM.
Resumo:
In order to bridge the Semantic gap, a number of relevance feedback (RF) mechanisms have been applied to content-based image retrieval (CBIR). However current RF techniques in most existing CBIR systems still lack satisfactory user interaction although some work has been done to improve the interaction as well as the search accuracy. In this paper, we propose a four-factor user interaction model and investigate its effects on CBIR by an empirical evaluation. Whilst the model was developed for our research purposes, we believe the model could be adapted to any content-based search system.
Resumo:
This paper presents an interactive content-based image retrieval frameworkuInteract, for delivering a novel four-factor user interaction model visually. The four-factor user interaction model is an interactive relevance feedback mechanism that we proposed, aiming to improve the interaction between users and the CBIR system and in turn users overall search experience. In this paper, we present how the framework is developed to deliver the four-factor user interaction model, and how the visual interface is designed to support user interaction activities. From our preliminary user evaluation result on the ease of use and usefulness of the proposed framework, we have learnt what the users like about the framework and the aspects we could improve in future studies. Whilst the framework is developed for our research purposes, we believe the functionalities could be adapted to any content-based image search framework.
Resumo:
Dissimilarity measurement plays a crucial role in content-based image retrieval, where data objects and queries are represented as vectors in high-dimensional content feature spaces. Given the large number of dissimilarity measures that exist in many fields, a crucial research question arises: Is there a dependency, if yes, what is the dependency, of a dissimilarity measures retrieval performance, on different feature spaces? In this paper, we summarize fourteen core dissimilarity measures and classify them into three categories. A systematic performance comparison is carried out to test the effectiveness of these dissimilarity measures with six different feature spaces and some of their combinations on the Corel image collection. From our experimental results, we have drawn a number of observations and insights on dissimilarity measurement in content-based image retrieval, which will lay a foundation for developing more effective image search technologies.
Resumo:
Background - This study investigates the coverage of adherence to medicine by the UK and US newsprint media. Adherence to medicine is recognised as an important issue facing healthcare professionals and the newsprint media is a key source of health information, however, little is known about newspaper coverage of medication adherence. Methods - A search of the newspaper database NexisUK from 20042011 was performed. Content analysis of newspaper articles which referenced medication adherence from the twelve highest circulating UK and US daily newspapers and their Sunday equivalents was carried out. A second researcher coded a 15% sample of newspaper articles to establish the inter-rater reliability of coding. Results - Searches of newspaper coverage of medication adherence in the UK and US yielded 181 relevant articles for each country. There was a large increase in the number of scientific articles on medication adherence in PubMed over the study period, however, this was not reflected in the frequency of newspaper articles published on medication adherence. UK newspaper articles were significantly more likely to report the benefits of adherence (p = 0.005), whereas US newspaper articles were significantly more likely to report adherence issues in the elderly population (p = 0.004) and adherence associated with diseases of the central nervous system (p = 0.046). The most commonly reported barriers to adherence were patient factors e.g. poor memory, beliefs and age, whereas, the most commonly reported facilitators to adherence were medication factors including simplified regimens, shorter treatment duration and combination tablets. HIV/AIDS was the single most frequently cited disease (reported in 20% of newspaper articles). Poor quality reporting of medication adherence was identified in 62% of newspaper articles. Conclusion - Adherence is not well covered in the newspaper media despite a significant presence in the medical literature. The mass media have the potential to help educate and shape the publics knowledge regarding the importance of medication adherence; this potential is not being realised at present.
Resumo:
The expansion of the Internet has made the task of searching a crucial one. Internet users, however, have to make a great effort in order to formulate a search query that returns the required results. Many methods have been devised to assist in this task by helping the users modify their query to give better results. In this paper we propose an interactive method for query expansion. It is based on the observation that documents are often found to contain terms with high information content, which can summarise their subject matter. We present experimental results, which demonstrate that our approach significantly shortens the time required in order to accomplish a certain task by performing web searches.
Resumo:
his paper presents an ontological model of the knowledge about Bulgarian iconographical artefacts. It also describes content-sensitive services for access, browse, search and group iconographical objects, based on the presented ontology that will be implemented in the multimedia digital library Virtual encyclopedia of Bulgarian iconography.
Resumo:
Search engines sometimes apply the search on the full text of documents or web-pages; but sometimes they can apply the search on selected parts of the documents only, e.g. their titles. Full-text search may consume a lot of computing resources and time. It may be possible to save resources by applying the search on the titles of documents only, assuming that a title of a document provides a concise representation of its content. We tested this assumption using Google search engine. We ran search queries that have been defined by users, distinguishing between two types of queries/users: queries of users who are familiar with the area of the search, and queries of users who are not familiar with the area of the search. We found that searches which use titles provide similar and sometimes even (slightly) better results compared to searches which use the full-text. These results hold for both types of queries/users. Moreover, we found an advantage in title-search when searching in unfamiliar areas because the general terms used in queries in unfamiliar areas match better with general terms which tend to be used in document titles.
Resumo:
As the volume of image data and the need of using it in various applications is growing significantly in the last days it brings a necessity of retrieval efficiency and effectiveness. Unfortunately, existing indexing methods are not applicable to a wide range of problem-oriented fields due to their operating time limitations and strong dependency on the traditional descriptors extracted from the image. To meet higher requirements, a novel distance-based indexing method for region-based image retrieval has been proposed and investigated. The method creates premises for considering embedded partitions of images to carry out the search with different refinement or roughening level and so to seek the image meaningful content.
Resumo:
This dissertation is a study of customer relationship management theory and practice. Customer Relationship Management (CRM) is a business strategy whereby companies build strong relationships with existing and prospective customers with the goal of increasing organizational profitability. It is also a learning process involving managing change in processes, people, and technology. CRM implementation and its ramifications are also not completely understood as evidenced by the high number of failures in CRM implementation in organizations and the resulting disappointments. ^ The goal of this dissertation is to study emerging issues and trends in CRM, including the effect of computer software and the accompanying new management processes on organizations, and the dynamics of the alignment of marketing, sales and services, and all other functions responsible for delivering customers a satisfying experience. ^ In order to understand CRM better a content analysis of more than a hundred articles and documents from academic and industry sources was undertaken using a new methodological twist to the traditional method. An Internet domain name (http://crm.fiu.edu) was created for the purpose of this research by uploading an initial one hundred plus abstracts of articles and documents onto it to form a knowledge database. Once the database was formed a search engine was developed to enable the search of abstracts using relevant CRM keywords to reveal emergent dominant CRM topics. The ultimate aim of this website is to serve as an information hub for CRM research, as well as a search engine where interested parties can enter CRM-relevant keywords or phrases to access abstracts, as well as submit abstracts to enrich the knowledge hub. ^ Research questions were investigated and answered by content analyzing the interpretation and discussion of dominant CRM topics and then amalgamating the findings. This was supported by comparisons within and across individual, paired, and sets-of-three occurrences of CRM keywords in the article abstracts. ^ Results show that there is a lack of holistic thinking and discussion of CRM in both academics and industry which is required to understand how the people, process, and technology in CRM impact each other to affect successful implementation. Industry has to get their heads around CRM and holistically understand how these important dimensions affect each other. Only then will organizational learning occur, and overtime result in superior processes leading to strong profitable customer relationships and a hard to imitate competitive advantage. ^
Resumo:
The advent of smart TVs has reshaped the TV-consumer interaction by combining TVs with mobile-like applications and access to the Internet. However, consumers are still unable to seamlessly interact with the contents being streamed. An example of such limitation is TV shopping, in which a consumer makes a purchase of a product or item displayed in the current TV show. Currently, consumers can only stop the current show and attempt to find a similar item in the Web or an actual store. It would be more convenient if the consumer could interact with the TV to purchase interesting items. ^ Towards the realization of TV shopping, this dissertation proposes a scalable multimedia content processing framework. Two main challenges in TV shopping are addressed: the efficient detection of products in the content stream, and the retrieval of similar products given a consumer-selected product. The proposed framework consists of three components. The first component performs computational and temporal aware multimedia abstraction to select a reduced number of frames that summarize the important information in the video stream. By both reducing the number of frames and taking into account the computational cost of the subsequent detection phase, this component component allows the efficient detection of products in the stream. The second component realizes the detection phase. It executes scalable product detection using multi-cue optimization. Additional information cues are formulated into an optimization problem that allows the detection of complex products, i.e., those that do not have a rigid form and can appear in various poses. After the second component identifies products in the video stream, the consumer can select an interesting one for which similar ones must be located in a product database. To this end, the third component of the framework consists of an efficient, multi-dimensional, tree-based indexing method for multimedia databases. The proposed index mechanism serves as the backbone of the search. Moreover, it is able to efficiently bridge the semantic gap and perception subjectivity issues during the retrieval process to provide more relevant results.^
Resumo:
As the Web evolves unexpectedly fast, information grows explosively. Useful resources become more and more difficult to find because of their dynamic and unstructured characteristics. A vertical search engine is designed and implemented towards a specific domain. Instead of processing the giant volume of miscellaneous information distributed in the Web, a vertical search engine targets at identifying relevant information in specific domains or topics and eventually provides users with up-to-date information, highly focused insights and actionable knowledge representation. As the mobile device gets more popular, the nature of the search is changing. So, acquiring information on a mobile device poses unique requirements on traditional search engines, which will potentially change every feature they used to have. To summarize, users are strongly expecting search engines that can satisfy their individual information needs, adapt their current situation, and present highly personalized search results. ^ In my research, the next generation vertical search engine means to utilize and enrich existing domain information to close the loop of vertical search engine's system that mutually facilitate knowledge discovering, actionable information extraction, and user interests modeling and recommendation. I investigate three problems in which domain taxonomy plays an important role, including taxonomy generation using a vertical search engine, actionable information extraction based on domain taxonomy, and the use of ensemble taxonomy to catch user's interests. As the fundamental theory, ultra-metric, dendrogram, and hierarchical clustering are intensively discussed. Methods on taxonomy generation using my research on hierarchical clustering are developed. The related vertical search engine techniques are practically used in Disaster Management Domain. Especially, three disaster information management systems are developed and represented as real use cases of my research work.^
Resumo:
A substantial amount of information on the Internet is present in the form of text. The value of this semi-structured and unstructured data has been widely acknowledged, with consequent scientific and commercial exploitation. The ever-increasing data production, however, pushes data analytic platforms to their limit. This thesis proposes techniques for more efficient textual big data analysis suitable for the Hadoop analytic platform. This research explores the direct processing of compressed textual data. The focus is on developing novel compression methods with a number of desirable properties to support text-based big data analysis in distributed environments. The novel contributions of this work include the following. Firstly, a Content-aware Partial Compression (CaPC) scheme is developed. CaPC makes a distinction between informational and functional content in which only the informational content is compressed. Thus, the compressed data is made transparent to existing software libraries which often rely on functional content to work. Secondly, a context-free bit-oriented compression scheme (Approximated Huffman Compression) based on the Huffman algorithm is developed. This uses a hybrid data structure that allows pattern searching in compressed data in linear time. Thirdly, several modern compression schemes have been extended so that the compressed data can be safely split with respect to logical data records in distributed file systems. Furthermore, an innovative two layer compression architecture is used, in which each compression layer is appropriate for the corresponding stage of data processing. Peripheral libraries are developed that seamlessly link the proposed compression schemes to existing analytic platforms and computational frameworks, and also make the use of the compressed data transparent to developers. The compression schemes have been evaluated for a number of standard MapReduce analysis tasks using a collection of real-world datasets. In comparison with existing solutions, they have shown substantial improvement in performance and significant reduction in system resource requirements.
Resumo:
As the Web evolves unexpectedly fast, information grows explosively. Useful resources become more and more difficult to find because of their dynamic and unstructured characteristics. A vertical search engine is designed and implemented towards a specific domain. Instead of processing the giant volume of miscellaneous information distributed in the Web, a vertical search engine targets at identifying relevant information in specific domains or topics and eventually provides users with up-to-date information, highly focused insights and actionable knowledge representation. As the mobile device gets more popular, the nature of the search is changing. So, acquiring information on a mobile device poses unique requirements on traditional search engines, which will potentially change every feature they used to have. To summarize, users are strongly expecting search engines that can satisfy their individual information needs, adapt their current situation, and present highly personalized search results. In my research, the next generation vertical search engine means to utilize and enrich existing domain information to close the loop of vertical search engine's system that mutually facilitate knowledge discovering, actionable information extraction, and user interests modeling and recommendation. I investigate three problems in which domain taxonomy plays an important role, including taxonomy generation using a vertical search engine, actionable information extraction based on domain taxonomy, and the use of ensemble taxonomy to catch user's interests. As the fundamental theory, ultra-metric, dendrogram, and hierarchical clustering are intensively discussed. Methods on taxonomy generation using my research on hierarchical clustering are developed. The related vertical search engine techniques are practically used in Disaster Management Domain. Especially, three disaster information management systems are developed and represented as real use cases of my research work.
Resumo:
In the last several years there has been an increase in the amount of qualitative research using in-depth interviews and comprehensive content analyses in sport psychology. However, no explicit method has been provided to deal with the large amount of unstructured data. This article provides common guidelines for organizing and interpreting unstructured data. Two main operations are suggested and discussed: first, coding meaningful text segments, or creating tags, and second, regrouping similar text segments,or creating categories. Furthermore, software programs for the microcomputer are presented as away to facilitate the organization and interpretation of qualitative data