24 results for Open source information retrieval

in Aston University Research Archive


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Extensible Business Reporting Language (XBRL) is being adopted by European regulators as a data standard for the exchange of business information. This paper examines the approach of XBRL International (XII) to the meta-data standard's development and diffusion. We theorise the development of XBRL using concepts drawn from a model of successful open source projects. Comparison of the open source model to XBRL enables us to identify a number of interesting similarities and differences. In common with open source projects, the benefits and progress of XBRL have been overstated and 'hyped' by enthusiastic participants. While XBRL is an open data standard in terms of access to the equivalent of its 'source code', we find that the governance structure of the XBRL consortium is significantly different to a model open source approach. The barrier to participation that is created by requiring paid membership and a focus on transacting business at physical conferences and meetings is identified as particularly critical. Decisions about the technical structure of XBRL, the regulator-led pattern of adoption and the organisation of XII are discussed. Finally, areas for future research are identified.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Monitoring land-cover changes on sites of conservation importance allows environmental problems to be detected, solutions to be developed and the effectiveness of actions to be assessed. However, the remoteness of many sites or a lack of resources means these data are frequently not available. Remote sensing may provide a solution, but large-scale mapping and change detection may not be appropriate, necessitating site-level assessments. These need to be easy to undertake, rapid and cheap. We present an example of a Web-based solution based on free and open-source software and standards (including PostGIS, OpenLayers, Web Map Services, Web Feature Services and GeoServer) to support assessments of land-cover change (and validation of global land-cover maps). Authorised users are provided with means to assess land-cover visually and may optionally provide uncertainty information at various levels: from a general rating of their confidence in an assessment to a quantification of the proportions of land-cover types within a reference area. Versions of this tool have been developed for two applications: the TREES-3 initiative (Simonetti, Beuchle and Eva, 2011), which monitors tropical land-cover change through ground-truthing at latitude/longitude degree confluence points, and the monitoring of change within and around Important Bird Areas (IBAs) by Birdlife International and the Royal Society for the Protection of Birds (RSPB). In this paper we present results from the second of these applications. We also present further details on the potential use of the land-cover change assessment tool on sites of recognised conservation importance, in combination with NDVI and other time series data from the eStation (a system for receiving, processing and disseminating environmental data). We show how the tool can be used to increase the usability of earth observation data by local stakeholders and experts, and assist in evaluating the impact of protection regimes on land-cover change.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper summarizes the scientific work presented at the 32nd European Conference on Information Retrieval. It demonstrates that information retrieval (IR) as a research area continues to thrive, with progress being made in three complementary sub-fields: IR theory and formal methods, together with indexing and query representation issues; Web IR as a primary application area; and research into evaluation methods and metrics. It is the combination of these areas that gives IR its solid scientific foundations. The paper also illustrates that significant progress has been made in other areas of IR. The keynote speakers addressed three such subject fields: social search engines using personalization and recommendation technologies, the renewed interest in applying natural language processing to IR, and multimedia IR as another fast-growing area.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper, we propose a text mining method called LRD (latent relation discovery), which extends the traditional vector space model of document representation in order to improve information retrieval (IR) on documents and document clustering. Our LRD method extracts terms and entities, such as person, organization, or project names, and discovers relationships between them by taking into account their co-occurrence in textual corpora. Given a target entity, LRD discovers other entities closely related to the target effectively and efficiently. With respect to such relatedness, a measure of relation strength between entities is defined. LRD uses relation strength to enhance the vector space model, and uses the enhanced vector space model for query-based IR on documents and for clustering documents in order to discover complex relationships among terms and entities. Our experiments on a standard dataset for query-based IR show that our LRD method performed significantly better than the traditional vector space model and five other standard statistical methods for vector expansion.
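
As a rough illustration of the co-occurrence idea in this abstract, the Python sketch below counts how often entity pairs appear in the same document and turns the counts into a relation strength. The abstract does not state the LRD formula, so the Jaccard-style ratio, the function names `relation_strengths` and `related_to`, and the toy corpus are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter
from itertools import combinations


def relation_strengths(docs):
    """Return {(a, b): strength} for entity pairs that co-occur in docs.

    docs is a list of entity sets, one per document. The strength is a
    Jaccard-style ratio (an assumption, not the LRD measure):
    documents containing both entities / documents containing either.
    """
    doc_freq = Counter()
    pair_freq = Counter()
    for entities in docs:
        unique = sorted(set(entities))
        doc_freq.update(unique)
        pair_freq.update(combinations(unique, 2))
    return {
        (a, b): n / (doc_freq[a] + doc_freq[b] - n)
        for (a, b), n in pair_freq.items()
    }


def related_to(target, strengths, top_k=3):
    """Rank entities by relation strength to target, strongest first."""
    scored = []
    for (a, b), s in strengths.items():
        if a == target:
            scored.append((b, s))
        elif b == target:
            scored.append((a, s))
    return sorted(scored, key=lambda item: (-item[1], item[0]))[:top_k]


# Hypothetical corpus: each document reduced to the entities it mentions.
corpus = [
    {"Alice", "ProjectX", "Aston"},
    {"Alice", "ProjectX"},
    {"Bob", "Aston"},
]
strengths = relation_strengths(corpus)
ranking = related_to("Alice", strengths)
```

In this toy corpus "ProjectX" ranks above "Aston" for the target "Alice", because the first pair co-occurs in every document that mentions either entity. A fuller system could use such strengths to add related entities to a document or query vector, which is the kind of vector expansion the abstract refers to.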

Relevance:

100.00% 100.00%

Publisher:

Abstract:

DUE TO COPYRIGHT RESTRICTIONS ONLY AVAILABLE FOR CONSULTATION AT ASTON UNIVERSITY LIBRARY AND INFORMATION SERVICES WITH PRIOR ARRANGEMENT

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Developing Cyber-Physical Systems requires methods and tools to support simulation and verification of hybrid (both continuous and discrete) models. The Acumen modeling and simulation language is an open source testbed for exploring the design space of what rigorous-but-practical next-generation tools can deliver to developers of Cyber-Physical Systems. Like verification tools, a design goal for Acumen is to provide rigorous results. Like simulation tools, it aims to be intuitive, practical, and scalable. However, it is far from evident whether these two goals can be achieved simultaneously. This paper explains the primary design goals for Acumen, the core challenges that must be addressed in order to achieve these goals, the “agile research method” taken by the project, the steps taken to realize these goals, the key lessons learned, and the emerging language design.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

There has been a dramatic change in the U.K. government policy regarding the establishment of new towns. The emphasis is now on the redevelopment of existing cities rather than on building new ones. This has created an urgent need to carry out detailed surveys and inventories of many aspects of urban land use in metropolitan areas: this study concentrates on just one aspect, urban open space. In the first stage a comparison was made between 1:10,000 scale black and white and 1:10,000 scale colour infra-red aerial photographs, to compare the type and amount of open space information which could be obtained from these two sources. The advantages of using colour infra-red photography were clearly demonstrated in this comparison. The second stage was the use of colour infra-red photography as the sole source of data to survey and map the urban open space of a sample area in Merseyside Metropolitan County. This sample area comprised eleven 1/4 km² squares, on each of which a grid of 20 m × 20 m cells was placed to record, directly from the photography, 625 sets of data. Each set of data recorded the type and amount of open space, its surface cover, maintenance status and management. The data recorded were fed into a computer and a suite of programs was developed to provide output in both computer map and statistical form, for each of the eleven 1/4 km² sample areas. The third stage involved a comparison of open space data with socio-economic status. Merseyside County Planning Authority had previously conducted a socio-economic survey of the county, and this information was used to identify the socio-economic status of the population in the eleven 1/4 km² areas of this project. This comparison revealed many interesting and useful relationships between the provision of urban open space and socio-economic status.
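
The figure of 625 data sets per sample square can be checked with a short calculation. The sketch below assumes each quarter-square-kilometre sample area is a true square (500 m on a side) tiled completely by 20 m cells; the variable names are illustrative, and the total across all eleven squares is derived here rather than quoted from the abstract.

```python
import math

SQUARE_AREA_M2 = 250_000   # one 1/4 km^2 sample square, expressed in m^2
CELL_SIDE_M = 20           # grid cell side length given in the abstract
N_SQUARES = 11             # number of sample squares given in the abstract

# Assumption: the sample area is an exact square, so its side is sqrt(area).
side_m = math.isqrt(SQUARE_AREA_M2)            # 500 m
cells_per_side = side_m // CELL_SIDE_M         # 25 cells along each side
cells_per_square = cells_per_side ** 2         # 625 cells, matching the abstract
total_cells = cells_per_square * N_SQUARES     # derived total over all squares
```

Under these assumptions the grid yields 25 × 25 = 625 cells per square, which agrees with the 625 sets of data reported for each sample area.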

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In Information Filtering (IF) a user may be interested in several topics in parallel. But IF systems have been built on representational models derived from Information Retrieval and Text Categorization, which assume independence between terms. The linearity of these models results in user profiles that can only represent one topic of interest. We present a methodology that takes into account term dependencies to construct a single profile representation for multiple topics, in the form of a hierarchical term network. We also introduce a series of non-linear functions for evaluating documents against the profile. Initial experiments produced positive results.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Interpolated data are an important part of the environmental information exchange, as many variables can only be measured at discrete sampling locations. Spatial interpolation is a complex operation that has traditionally required expert treatment, making automation a serious challenge. This paper presents a few lessons learnt from INTAMAP, a project that is developing an interoperable web processing service (WPS) for the automatic interpolation of environmental data using advanced geostatistics, adopting a Service Oriented Architecture (SOA). The “rainbow box” approach we followed provides access to the functionality at a whole range of different levels. We show here how the integration of open standards, open source and powerful statistical processing capabilities allows us to automate a complex process while offering users a level of access and control that best suits their requirements. This facilitates benchmarking exercises as well as the regular reporting of environmental information without requiring remote users to have specialized skills in geostatistics.
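
To make the notion of spatial interpolation from discrete sampling locations concrete, here is a minimal Python sketch of inverse distance weighting (IDW). IDW is a deliberately simpler stand-in for the advanced geostatistics that the abstract attributes to INTAMAP; the function name `idw`, its parameters, and the sample values are assumptions for illustration only.

```python
import math


def idw(samples, x, y, power=2.0):
    """Inverse-distance-weighted estimate of a variable at point (x, y).

    samples is a list of (xi, yi, value) observations at known locations.
    Nearer observations receive larger weights (1 / distance ** power).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for xi, yi, value in samples:
        distance = math.hypot(x - xi, y - yi)
        if distance == 0.0:
            return value  # the query point coincides with a sample
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("at least one sample is required")
    return weighted_sum / weight_total


# Hypothetical measurements at two sampling locations.
observations = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint_estimate = idw(observations, 1.0, 0.0)
at_sample_estimate = idw(observations, 0.0, 0.0)
```

Unlike geostatistical methods such as kriging, IDW gives no estimate of prediction uncertainty, which is one reason expert-level methods are harder to automate.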

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Existing theories of semantic cognition propose models of cognitive processing occurring in a conceptual space, where ‘meaning’ is derived from the spatial relationships between concepts’ mapped locations within the space. Information visualisation is a growing area of research within the field of information retrieval, and methods for presenting database contents visually in the form of spatial data management systems (SDMSs) are being developed. This thesis combined these two areas of research to investigate the benefits associated with employing spatial-semantic mapping (documents represented as objects in two- and three-dimensional virtual environments are proximally mapped dependent on the semantic similarity of their content) as a tool for improving retrieval performance and navigational efficiency when browsing for information within such systems. Positive effects associated with the quality of document mapping were observed; improved retrieval performance and browsing behaviour were witnessed when mapping was optimal. It was also shown that using a third dimension for virtual environment (VE) presentation provides enough additional information about the semantic structure of the environment to increase performance in comparison to using two dimensions for mapping. A model that describes the relationship between retrieval performance and browsing behaviour was proposed on the basis of the findings. Individual differences were not found to have any observable influence on retrieval performance or browsing behaviour when mapping quality was good. The findings from this work have implications both for cognitive modelling of semantic information and for designing and testing information visualisation systems. These implications are discussed in the conclusions of this work.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Evaluation and benchmarking in content-based image retrieval has always been a somewhat neglected research area, making it difficult to judge the efficacy of many presented approaches. In this paper we investigate the issue of benchmarking for colour-based image retrieval systems, which enable users to retrieve images from a database based on low-level colour content alone. We argue that current image retrieval evaluation methods are not suited to benchmarking colour-based image retrieval systems, mainly because they do not allow users to reflect upon the suitability of retrieved images within the context of a creative project and because they rely on highly subjective ground-truths. As a solution to these issues, the research presented here introduces the Mosaic Test for evaluating colour-based image retrieval systems, in which test-users are asked to create an image mosaic of a predetermined target image, using the colour-based image retrieval system that is being evaluated. We report on our findings from a user study which suggests that the Mosaic Test overcomes the major drawbacks associated with existing image retrieval evaluation methods, by enabling users to reflect upon image selections and automatically measuring image relevance in a way that correlates with the perception of many human assessors. We therefore propose that the Mosaic Test be adopted as a standardised benchmark for evaluating and comparing colour-based image retrieval systems.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Procedural knowledge is the knowledge required to perform certain tasks. It forms an important part of expertise, and is crucial for learning new tasks. This paper summarises existing work on procedural knowledge acquisition, and identifies two major challenges that remain to be solved in this field: automating the acquisition process to tackle the bottleneck in the formalization of procedural knowledge, and enabling machine understanding and manipulation of procedural knowledge. It is believed that recent advances in information extraction techniques can be applied to compose a comprehensive solution to address these challenges. We identify specific tasks required to achieve the goal, and present detailed analyses of new research challenges and opportunities. It is expected that these analyses will interest researchers of various knowledge management tasks, particularly knowledge acquisition and capture.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In order to bridge the “semantic gap”, a number of relevance feedback (RF) mechanisms have been applied to content-based image retrieval (CBIR). However, current RF techniques in most existing CBIR systems still lack satisfactory user interaction, although some work has been done to improve the interaction as well as the search accuracy. In this paper, we propose a four-factor user interaction model and investigate its effects on CBIR by an empirical evaluation. Whilst the model was developed for our research purposes, we believe it could be adapted to any content-based search system.
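
For readers unfamiliar with relevance feedback, the sketch below shows the classic Rocchio update, a standard textbook RF technique in which a query vector is moved towards results the user marked relevant and away from those marked non-relevant. It is not the four-factor interaction model proposed in this abstract, which the abstract does not specify; the function name, weights, and feature vectors are illustrative assumptions.

```python
def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return an updated query vector using the Rocchio formula.

    new_query = alpha * query
                + beta * centroid(relevant)
                - gamma * centroid(non_relevant)
    """

    def centroid(vectors):
        # An empty feedback set contributes a zero vector.
        if not vectors:
            return [0.0] * len(query)
        return [sum(column) / len(vectors) for column in zip(*vectors)]

    relevant_centroid = centroid(relevant)
    non_relevant_centroid = centroid(non_relevant)
    return [
        alpha * q + beta * r - gamma * n
        for q, r, n in zip(query, relevant_centroid, non_relevant_centroid)
    ]


# Hypothetical two-dimensional image feature vectors (e.g. two colour bins).
updated_query = rocchio(
    query=[1.0, 0.0],
    relevant=[[0.0, 1.0], [0.0, 3.0]],
    non_relevant=[[2.0, 0.0]],
)
```

After one round of feedback the query shifts weight from the first feature to the second, reflecting the images the hypothetical user judged relevant.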