955 resultados para Open source information retrieval
Resumo:
The international perspectives on these issues are especially valuable in an increasingly connected, but still institutionally and administratively diverse world. The research addressed in several chapters in this volume includes issues around technical standards bodies like EpiDoc and the TEI, engaging with ways these standards are implemented, documented, taught, used in the process of transcribing and annotating texts, and used to generate publications and as the basis for advanced textual or corpus research. Other chapters focus on various aspects of philological research and content creation, including collaborative or community driven efforts, and the issues surrounding editorial oversight, curation, maintenance and sustainability of these resources. Research into the ancient languages and linguistics, in particular Greek, and the language teaching that is a staple of our discipline, are also discussed in several chapters, in particular for ways in which advanced research methods can lead into language technologies and vice versa and ways in which the skills around teaching can be used for public engagement, and vice versa. A common thread through much of the volume is the importance of open access publication or open source development and distribution of texts, materials, tools and standards, both because of the public good provided by such models (circulating materials often already paid for out of the public purse), and the ability to reach non-standard audiences, those who cannot access rich university libraries or afford expensive print volumes. Linked Open Data is another technology that results in wide and free distribution of structured information both within and outside academic circles, and several chapters present academic work that includes ontologies and RDF, either as a direct research output or as essential part of the communication and knowledge representation. Several chapters focus not on the literary and philological side of classics, but on the study of cultural heritage, archaeology, and the material supports on which original textual and artistic material are engraved or otherwise inscribed, addressing both the capture and analysis of artefacts in both 2D and 3D, the representation of data through archaeological standards, and the importance of sharing information and expertise between the several domains both within and without academia that study, record and conserve ancient objects. Almost without exception, the authors reflect on the issues of interdisciplinarity and collaboration, the relationship between their research practice and teaching and/or communication with a wider public, and the importance of the role of the academic researcher in contemporary society and in the context of cutting edge technologies. How research is communicated in a world of instant- access blogging and 140-character micromessaging, and how our expectations of the media affect not only how we publish but how we conduct our research, are questions about which all scholars need to be aware and self-critical.
Resumo:
Mode of access: Internet.
Resumo:
Thesis (Ph.D.)--University of Washington, 2016-06
Resumo:
The main aim of the proposed approach presented in this paper is to improve Web information retrieval effectiveness by overcoming the problems associated with a typical keyword matching retrieval system, through the use of concepts and an intelligent fusion of confidence values. By exploiting the conceptual hierarchy of the WordNet (G. Miller, 1995) knowledge base, we show how to effectively encode the conceptual information in a document using the semantic information implied by the words that appear within it. Rather than treating a word as a string made up of a sequence of characters, we consider a word to represent a concept.
Resumo:
In Information Filtering (IF) a user may be interested in several topics in parallel. But IF systems have been built on representational models derived from Information Retrieval and Text Categorization, which assume independence between terms. The linearity of these models results in user profiles that can only represent one topic of interest. We present a methodology that takes into account term dependencies to construct a single profile representation for multiple topics, in the form of a hierarchical term network. We also introduce a series of non-linear functions for evaluating documents against the profile. Initial experiments produced positive results.
Resumo:
Interpolated data are an important part of the environmental information exchange as many variables can only be measured at situate discrete sampling locations. Spatial interpolation is a complex operation that has traditionally required expert treatment, making automation a serious challenge. This paper presents a few lessons learnt from INTAMAP, a project that is developing an interoperable web processing service (WPS) for the automatic interpolation of environmental data using advanced geostatistics, adopting a Service Oriented Architecture (SOA). The “rainbow box” approach we followed provides access to the functionality at a whole range of different levels. We show here how the integration of open standards, open source and powerful statistical processing capabilities allows us to automate a complex process while offering users a level of access and control that best suits their requirements. This facilitates benchmarking exercises as well as the regular reporting of environmental information without requiring remote users to have specialized skills in geostatistics.
Resumo:
Existing theories of semantic cognition propose models of cognitive processing occurring in a conceptual space, where ‘meaning’ is derived from the spatial relationships between concepts’ mapped locations within the space. Information visualisation is a growing area of research within the field of information retrieval, and methods for presenting database contents visually in the form of spatial data management systems (SDMSs) are being developed. This thesis combined these two areas of research to investigate the benefits associated with employing spatial-semantic mapping (documents represented as objects in two- and three-dimensional virtual environments are proximally mapped dependent on the semantic similarity of their content) as a tool for improving retrieval performance and navigational efficiency when browsing for information within such systems. Positive effects associated with the quality of document mapping were observed; improved retrieval performance and browsing behaviour were witnessed when mapping was optimal. It was also shown using a third dimension for virtual environment (VE) presentation provides sufficient additional information regarding the semantic structure of the environment that performance is increased in comparison to using two-dimensions for mapping. A model that describes the relationship between retrieval performance and browsing behaviour was proposed on the basis of findings. Individual differences were not found to have any observable influence on retrieval performance or browsing behaviour when mapping quality was good. The findings from this work have implications for both cognitive modelling of semantic information, and for designing and testing information visualisation systems. These implications are discussed in the conclusions of this work.
Resumo:
Evaluation and benchmarking in content-based image retrieval has always been a somewhat neglected research area, making it difficult to judge the efficacy of many presented approaches. In this paper we investigate the issue of benchmarking for colour-based image retrieval systems, which enable users to retrieve images from a database based on lowlevel colour content alone. We argue that current image retrieval evaluation methods are not suited to benchmarking colour-based image retrieval systems, due in main to not allowing users to reflect upon the suitability of retrieved images within the context of a creative project and their reliance on highly subjective ground-truths. As a solution to these issues, the research presented here introduces the Mosaic Test for evaluating colour-based image retrieval systems, in which test-users are asked to create an image mosaic of a predetermined target image, using the colour-based image retrieval system that is being evaluated. We report on our findings from a user study which suggests that the Mosaic Test overcomes the major drawbacks associated with existing image retrieval evaluation methods, by enabling users to reflect upon image selections and automatically measuring image relevance in a way that correlates with the perception of many human assessors. We therefore propose that the Mosaic Test be adopted as a standardised benchmark for evaluating and comparing colour-based image retrieval systems.
Resumo:
Procedural knowledge is the knowledge required to perform certain tasks. It forms an important part of expertise, and is crucial for learning new tasks. This paper summarises existing work on procedural knowledge acquisition, and identifies two major challenges that remain to be solved in this field; namely, automating the acquisition process to tackle bottleneck in the formalization of procedural knowledge, and enabling machine understanding and manipulation of procedural knowledge. It is believed that recent advances in information extraction techniques can be applied compose a comprehensive solution to address these challenges. We identify specific tasks required to achieve the goal, and present detailed analyses of new research challenges and opportunities. It is expected that these analyses will interest researchers of various knowledge management tasks, particularly knowledge acquisition and capture.
Resumo:
In order to bridge the “Semantic gap”, a number of relevance feedback (RF) mechanisms have been applied to content-based image retrieval (CBIR). However current RF techniques in most existing CBIR systems still lack satisfactory user interaction although some work has been done to improve the interaction as well as the search accuracy. In this paper, we propose a four-factor user interaction model and investigate its effects on CBIR by an empirical evaluation. Whilst the model was developed for our research purposes, we believe the model could be adapted to any content-based search system.
Resumo:
This paper presents an interactive content-based image retrieval framework—uInteract, for delivering a novel four-factor user interaction model visually. The four-factor user interaction model is an interactive relevance feedback mechanism that we proposed, aiming to improve the interaction between users and the CBIR system and in turn users overall search experience. In this paper, we present how the framework is developed to deliver the four-factor user interaction model, and how the visual interface is designed to support user interaction activities. From our preliminary user evaluation result on the ease of use and usefulness of the proposed framework, we have learnt what the users like about the framework and the aspects we could improve in future studies. Whilst the framework is developed for our research purposes, we believe the functionalities could be adapted to any content-based image search framework.
Resumo:
Dissimilarity measurement plays a crucial role in content-based image retrieval, where data objects and queries are represented as vectors in high-dimensional content feature spaces. Given the large number of dissimilarity measures that exist in many fields, a crucial research question arises: Is there a dependency, if yes, what is the dependency, of a dissimilarity measure’s retrieval performance, on different feature spaces? In this paper, we summarize fourteen core dissimilarity measures and classify them into three categories. A systematic performance comparison is carried out to test the effectiveness of these dissimilarity measures with six different feature spaces and some of their combinations on the Corel image collection. From our experimental results, we have drawn a number of observations and insights on dissimilarity measurement in content-based image retrieval, which will lay a foundation for developing more effective image search technologies.
Resumo:
* This research is partially supported by a grant (bourse Lavoisier) from the French Ministry of Foreign Affairs (Ministère des Affaires Etrangères).