14 results for Open source information retrieval
in CentAUR: Central Archive University of Reading - UK
Abstract:
In many data mining applications, automated retrieval of text and image information is needed. This becomes essential with the growth of the Internet and digital libraries. Our approach is based on latent semantic indexing (LSI) and the corresponding term-by-document matrix suggested by Berry and his co-authors. Instead of using deterministic methods to find the required number of first "k" singular triplets, we propose a stochastic approach. First, we use a Monte Carlo method to sample and build a much smaller term-by-document matrix (e.g. a k x k matrix), from which we then find the first "k" triplets using standard deterministic methods. Second, we investigate how the problem can be reduced to finding the "k" largest eigenvalues using parallel Monte Carlo methods. We apply these methods to the initial matrix and also to the reduced one. The algorithms run on a cluster of workstations under MPI; results of experiments in textual retrieval of Web documents, together with a comparison of the proposed stochastic methods, are presented. (C) 2003 IMACS. Published by Elsevier Science B.V. All rights reserved.
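A minimal sketch of the sampling idea described above, assuming uniform random selection of rows and columns; the paper's actual sampling scheme and its MPI parallelisation are not reproduced here, and the matrix sizes and function name are illustrative only.

```python
# Minimal sketch (not the paper's implementation): approximate the first k
# singular triplets of a term-by-document matrix A by Monte Carlo sampling
# of a much smaller k x k submatrix, then applying a deterministic SVD to it.
import numpy as np

def sampled_svd(A, k, rng=None):
    """Uniformly sample k rows and k columns of A (for illustration only)
    and return the singular triplets of the reduced k x k matrix."""
    rng = rng or np.random.default_rng(0)
    rows = rng.choice(A.shape[0], size=k, replace=False)
    cols = rng.choice(A.shape[1], size=k, replace=False)
    A_small = A[np.ix_(rows, cols)]
    return np.linalg.svd(A_small)  # deterministic SVD on the small matrix

# Toy term-by-document matrix: 1000 terms x 200 documents.
A = np.random.default_rng(1).random((1000, 200))
U, s, Vt = sampled_svd(A, k=20)
print(s[:5])  # leading approximate singular values
```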
Abstract:
Measurements of the ionospheric E-region during total solar eclipses have been used to provide information about the evolution of the solar magnetic field and EUV and X-ray emissions from the solar corona and chromosphere. By measuring levels of ionisation during an eclipse and comparing these measurements with an estimate of the unperturbed ionisation levels (such as those made during a control day, where available) it is possible to estimate the percentage of ionising radiation being emitted by the solar corona and chromosphere. Previously unpublished data from the two eclipses presented here are particularly valuable as they provide information that supplements the data published to date. The eclipse of 23 October 1976 over Australia provides information in a data gap that would otherwise have spanned the years 1966 to 1991. The eclipse of 4 December 2002 over Southern Africa is important as it extends the published sequence of measurements. Comparing measurements from eclipses between 1932 and 2002 with the solar magnetic source flux reveals that changes in the solar EUV and X-ray flux lag the open source flux measurements by approximately 1.5 years. We suggest that this unexpected result comes about from changes to the relative size of the limb corona between eclipses, with the lag representing the time taken to populate the coronal field with plasma hot enough to emit the EUV and X-rays ionising our atmosphere.
Abstract:
A large volume of visual content remains inaccessible until effective and efficient indexing and retrieval of such data is achieved. In this paper, we introduce the DREAM system, a knowledge-assisted, semantic-driven, context-aware visual information retrieval system applied in the film post-production domain. We focus mainly on the automatic labelling and topic-map-related aspects of the framework. The use of context-related collateral knowledge, represented by a novel probabilistic visual keyword co-occurrence matrix, has been proven effective in the experiments conducted during system evaluation. The automatically generated semantic labels were fed into the Topic Map Engine, which automatically constructs ontological networks using Topic Maps technology and thereby substantially enhances the indexing and retrieval performance of the system at a higher semantic level.
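As an illustration of the co-occurrence idea, the following hedged sketch estimates keyword co-occurrence probabilities from per-image keyword sets; the data, names and normalisation are placeholders and do not reproduce DREAM's actual formulation.

```python
# Illustrative sketch (not DREAM's implementation): estimate P(b | a) for
# visual keywords from per-image keyword annotations, the kind of collateral
# knowledge a co-occurrence matrix captures.
from collections import Counter
from itertools import combinations

images = [                      # toy per-image keyword annotations
    {"car", "road", "building"},
    {"car", "road"},
    {"tree", "road", "building"},
]

pair_counts = Counter()
keyword_counts = Counter()
for keywords in images:
    keyword_counts.update(keywords)
    pair_counts.update(combinations(sorted(keywords), 2))

def cooccurrence_prob(a, b):
    """P(b | a) estimated from co-occurrence counts."""
    pair = tuple(sorted((a, b)))
    return pair_counts[pair] / keyword_counts[a] if keyword_counts[a] else 0.0

print(cooccurrence_prob("car", "road"))   # 1.0 for this toy data
```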
Abstract:
A quasi-optical interferometric technique capable of measuring antenna phase patterns without the need for a heterodyne receiver is presented. It is particularly suited to the characterization of terahertz antennas feeding power detectors or mixers employing quasi-optical local oscillator injection. Examples of recorded antenna phase patterns at frequencies of 1.4 and 2.5 THz using homodyne detectors are presented. To our knowledge, these are the highest-frequency antenna phase patterns ever recovered. Knowledge of both the amplitude and phase patterns in the far field enables a Gauss-Hermite or Gauss-Laguerre beam-mode analysis to be carried out for the antenna, which is of importance in performance optimization calculations, such as antenna gain and beam efficiency parameters, at the design and prototype stage of antenna development. A full description of the beam would also be required if the antenna is to be used to feed a quasi-optical system in the near-field to far-field transition region. This situation could often arise when the device is fitted directly at the back of telescopes in flying observatories. A further benefit of the proposed technique is its simplicity for characterizing systems in situ, an advantage of considerable importance since, in many situations, the components may not be removable for further characterization once assembled. The proposed methodology is generic and should be useful across the wider sensing community, e.g., in single-detector acoustic imaging or in adaptive imaging array applications. Furthermore, it is applicable across other frequencies of the EM spectrum, provided adequate spatial and temporal phase stability of the source can be maintained throughout the measurement process. Phase information retrieval is also of importance to emergent research areas, such as band-gap structure characterization, meta-materials research, electromagnetic cloaking, slow light, super-lens design, as well as near-field and virtual imaging applications.
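To illustrate the beam-mode analysis mentioned above, here is a hedged one-dimensional sketch of a Gauss-Hermite decomposition of a complex (amplitude and phase) field; the beam waist, scan axis and test field are assumed values, modes are normalised numerically, and a real antenna measurement would use two-dimensional modes.

```python
# Sketch of a 1-D Gauss-Hermite beam-mode decomposition of a measured complex
# field (amplitude and phase). Waist w and the test field are illustrative.
import numpy as np
from numpy.polynomial.hermite import hermval

x = np.linspace(-10e-3, 10e-3, 2001)   # scan axis [m]
w = 3e-3                               # assumed beam waist [m]

def gh_mode(m, x, w):
    """Gauss-Hermite beam mode of order m, normalised numerically."""
    coeffs = np.zeros(m + 1)
    coeffs[m] = 1.0
    psi = hermval(np.sqrt(2) * x / w, coeffs) * np.exp(-(x / w) ** 2)
    return psi / np.sqrt(np.trapz(np.abs(psi) ** 2, x))

# Toy "measured" field: fundamental mode with a small linear phase tilt.
field = gh_mode(0, x, w) * np.exp(1j * 50.0 * x)

coeffs = [np.trapz(field * np.conj(gh_mode(m, x, w)), x) for m in range(6)]
print([round(abs(c) ** 2, 3) for c in coeffs])   # fractional power per mode
```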
Abstract:
The ability to create accurate geometric models of neuronal morphology is important for understanding the role of shape in information processing. Despite a significant amount of research on automating neuron reconstruction from image stacks obtained via microscopy, in practice most data are still collected manually. This paper describes Neuromantic, an open source system for three-dimensional digital tracing of neurites. Neuromantic reconstructions are comparable in quality to those of existing commercial and freeware systems while balancing the speed and accuracy of manual reconstruction. The combination of semi-automatic tracing, intuitive editing, and the ability to visualize large image stacks on standard computing platforms provides a versatile tool that can help address the bottleneck in the availability of reconstructions. Practical considerations for reducing the computational time and space requirements of the extended algorithm are also discussed.
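Digital tracings of this kind are commonly exchanged in the plain-text SWC morphology format; the hedged sketch below (an assumption about file layout, not part of the paper) parses such a file and computes total cable length.

```python
# Hedged sketch: parse a neuron reconstruction stored in SWC format (the
# plain-text morphology format used by many tracing tools) and compute the
# total cable length. Assumed column layout per sample line:
# id, type, x, y, z, radius, parent_id (-1 for the root).
import math

def total_cable_length(swc_path):
    nodes = {}
    with open(swc_path) as fh:
        for line in fh:
            if line.startswith("#") or not line.strip():
                continue
            nid, _ntype, x, y, z, _radius, parent = line.split()[:7]
            nodes[int(nid)] = (float(x), float(y), float(z), int(parent))
    length = 0.0
    for x, y, z, parent in nodes.values():
        if parent != -1 and parent in nodes:
            px, py, pz, _ = nodes[parent]
            length += math.dist((x, y, z), (px, py, pz))
    return length

# Example (hypothetical file name):
# print(total_cable_length("neuron.swc"))
```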
Abstract:
The large scale urban consumption of energy (LUCY) model simulates all components of anthropogenic heat flux (QF) from the global to the individual city scale at 2.5 × 2.5 arc-minute resolution. This includes a database of different working patterns and public holidays, vehicle use and energy consumption in each country. The databases can be edited to include specific diurnal and seasonal vehicle and energy consumption patterns, local holidays and flows of people within a city. If better information about individual cities becomes available within this (open-source) database, the accuracy of the model can only improve, providing data to the community from the global scale of climate modelling down to the individual city scale in the future. The results show that QF varied widely through the year, through the day, between countries and between urban areas. An assessment of the estimated heat emissions revealed that they are reasonably close to those produced by a global model and a number of small-scale city models, so results from LUCY can be used with a degree of confidence. From LUCY, the global mean urban QF has a diurnal range of 0.7–3.6 W m−2, and is greater on weekdays than at weekends. Heat release from buildings is the largest contributor to heat emissions globally (89–96%). Differences between months are greatest in the middle of the day (up to 1 W m−2 at 1 pm). December to February, the coldest months in the Northern Hemisphere, have the highest heat emissions; July and August are at the higher end, while the least QF is emitted in May. The highest individual grid-cell heat fluxes in urban areas (in W m−2) were located in New York (577), Paris (261.5), Tokyo (178), San Francisco (173.6), Vancouver (119) and London (106.7). Copyright © 2010 Royal Meteorological Society
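As a rough illustration of the accounting LUCY performs, the following sketch sums building, vehicle and metabolic contributions to QF over a day; the diurnal profiles and magnitudes are made-up placeholders, not LUCY's databases or coefficients.

```python
# Hedged sketch of the kind of accounting LUCY performs: anthropogenic heat
# flux QF as the sum of building, vehicle and metabolic components, weighted
# by a crude day/night profile. All numbers are illustrative placeholders.
import numpy as np

hours = np.arange(24)
daytime = ((hours >= 7) & (hours <= 19)).astype(float)

building = 10.0 * (0.7 + 0.6 * daytime)    # W m-2, placeholder building energy release
vehicles = 3.0 * (0.3 + 1.4 * daytime)     # W m-2, placeholder traffic heat release
metabolism = 0.5 * (0.8 + 0.4 * daytime)   # W m-2, placeholder human metabolism

qf = building + vehicles + metabolism
print(f"illustrative diurnal range of QF: {qf.min():.1f}-{qf.max():.1f} W m-2")
```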
Abstract:
The CHARMe project enables the annotation of climate data with key pieces of supporting information that we term “commentary”. Commentary reflects the experience that has built up in the user community, and can help new or less-expert users (such as consultants, SMEs, experts in other fields) to understand and interpret complex data. In the context of global climate services, the CHARMe system will record, retain and disseminate this commentary on climate datasets, and provide a means for feeding back this experience to the data providers. Based on novel linked data techniques and standards, the project has developed a core system, data model and suite of open-source tools to enable this information to be shared, discovered and exploited by the community.
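A hedged sketch of how a single piece of commentary could look when expressed with the W3C Web Annotation (Open Annotation) linked-data vocabulary on which this approach builds; the URIs and comment text below are made-up placeholders, not CHARMe's actual data model.

```python
# Illustrative sketch: one commentary record expressed in the style of the
# W3C Web Annotation / Open Annotation linked-data model. All URIs and the
# comment text are made-up placeholders.
import json

annotation = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "type": "Annotation",
    "body": {
        "type": "TextualBody",
        "value": "Known cold bias over sea ice before 1995; see technical report.",
    },
    "target": "http://example.org/datasets/sst-reanalysis-v2",   # placeholder dataset URI
    "creator": "http://example.org/users/data-user-42",          # placeholder user URI
}

print(json.dumps(annotation, indent=2))
```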
Abstract:
For users of climate services, the ability to quickly determine the datasets that best fit one's needs would be invaluable. The volume, variety and complexity of climate data make this judgment difficult. The ambition of CHARMe ("Characterization of metadata to enable high-quality climate services") is to give a wider interdisciplinary community access to a range of supporting information, such as journal articles, technical reports or feedback on previous applications of the data. The capture and discovery of this "commentary" information, often created by data users rather than data providers, and currently not linked to the data themselves, has not been significantly addressed previously. CHARMe applies the principles of Linked Data and open web standards to associate, record, search and publish user-derived annotations in a way that can be read both by users and by automated systems. Tools have been developed within the CHARMe project that add annotation capability to data delivery systems already in wide use for discovering climate data. In addition, the project has developed advanced tools for exploring data and commentary in innovative ways, including an interactive data explorer and comparator ("CHARMe Maps") and a tool for correlating climate time series with external "significant events" (e.g. instrument failures or large volcanic eruptions) that affect the data quality. Although the project focuses on climate science, the concepts are general and could be applied to other fields. All CHARMe system software is open-source, released under a liberal licence, permitting future projects to re-use the source code as they wish.
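As a hedged illustration of the "significant events" idea, the sketch below flags time-series samples that fall within a window after known events; the events, dates and window lengths are invented for illustration and are not CHARMe data.

```python
# Sketch of the kind of cross-check the "significant events" tool supports:
# flag time-series samples falling within a window after known events that
# affect data quality. All dates and windows below are invented.
from datetime import date, timedelta

events = {
    "large volcanic eruption": (date(1991, 6, 15), timedelta(days=365)),
    "instrument failure": (date(1994, 3, 1), timedelta(days=30)),
}

samples = [date(1991, 7, 1), date(1993, 1, 1), date(1994, 3, 10)]

for t in samples:
    hits = [name for name, (start, window) in events.items()
            if start <= t <= start + window]
    print(t, "affected by:", ", ".join(hits) if hits else "none")
```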
Abstract:
Geospatial information of many kinds, from topographic maps to scientific data, is increasingly being made available through web mapping services. These allow georeferenced map images to be served from data stores and displayed in websites and geographic information systems, where they can be integrated with other geographic information. The Open Geospatial Consortium's Web Map Service (WMS) standard has been widely adopted in diverse communities for sharing data in this way. However, current services typically provide little or no information about the quality or accuracy of the data they serve. In this paper we describe the design and implementation of a new "quality-enabled" profile of WMS, which we call "WMS-Q". This describes how information about data quality can be transmitted to the user through WMS. Such information can exist at many levels, from entire datasets to individual measurements, and includes the many different ways in which data uncertainty can be expressed. We also describe proposed extensions to the Symbology Encoding specification, which include provision for visualizing uncertainty in raster data in a number of different ways, including contours, shading and bivariate colour maps. Finally, we describe new open-source implementations of these specifications, including both clients and servers.
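For orientation, here is a hedged sketch of a standard WMS 1.3.0 GetMap request assembled in Python; the endpoint, layer name and the style requesting an uncertainty visualization are placeholders, and the actual WMS-Q profile and Symbology Encoding extensions are defined in the paper rather than reproduced here.

```python
# Sketch: assemble a standard WMS 1.3.0 GetMap request. The endpoint, the
# layer name and the uncertainty-related style are placeholders; they do not
# reproduce the WMS-Q profile or the proposed Symbology Encoding extensions.
from urllib.parse import urlencode

params = {
    "SERVICE": "WMS",
    "VERSION": "1.3.0",
    "REQUEST": "GetMap",
    "LAYERS": "sea_surface_temperature",    # placeholder layer name
    "STYLES": "uncertainty_contours",       # hypothetical quality-aware style
    "CRS": "CRS:84",
    "BBOX": "-180,-90,180,90",
    "WIDTH": "1024",
    "HEIGHT": "512",
    "FORMAT": "image/png",
}

print("http://example.org/wms?" + urlencode(params))   # placeholder endpoint
```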
Abstract:
Climate change poses new challenges to cities, and new flexible forms of governance are required that are able to take into account the uncertainty and abruptness of changes. The purpose of this paper is to discuss adaptive climate change governance for urban resilience. The paper identifies and reviews three traditions of literature on the idea of transitions and transformations, and assesses to what extent the transitions encompass elements of adaptive governance. It then uses the open source Urban Transitions Project database to assess how urban experiments take into account principles of adaptive governance. The results show that: the experiments give no explicit information on ecological knowledge; the leadership of cities comes primarily from local authorities; and evidence of partnerships and of anticipatory or planned adaptation is limited or absent. The analysis shows that neither technological, political nor ecological solutions alone are sufficient to further our understanding of the analytical aspects of transition thinking in urban climate governance. In conclusion, the paper argues that the future research agenda for urban climate governance needs to explore further the links between the three traditions in order to better identify contradictions, complementarities or compatibilities, and what this means in practice for creating and assessing urban experiments.
Abstract:
This paper presents an open-source canopy height profile (CHP) toolkit designed for processing small-footprint full-waveform LiDAR data to obtain estimates of effective leaf area index (LAIe) and CHPs. The use of the toolkit is presented with a case study of LAIe estimation in discontinuous-canopy fruit plantations. The experiments are carried out in two study areas, namely, orange and almond plantations, with different percentages of canopy cover (48% and 40%, respectively). For comparison, two commonly used discrete-point LAIe estimation methods are also tested. The LiDAR LAIe values are first computed for each site and each method as a whole, providing an "apparent" site-level LAIe, which disregards the discontinuity of the plantations' canopies. Since the toolkit allows the study-area LAIe to be calculated at different spatial scales, between-tree-level clumping can be easily accounted for and is then used to illustrate the impact of the discontinuity of canopy cover on LAIe retrieval. The LiDAR LAIe estimates are therefore also computed at smaller scales, as the mean of LAIe over grid cells of various sizes, providing estimates of the "actual" site-level LAIe. Subsequently, the LiDAR LAIe results are compared with theoretical models of "apparent" versus "actual" LAIe, based on the known percentage of canopy cover in each site. Comparing these models with the LiDAR LAIe derived from the smallest grid-cell sizes and with the whole-site estimates shows that the LAIe estimates obtained from the CHP toolkit are closest to those of the theoretical models.
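To make the "apparent" versus "actual" distinction concrete, the following hedged sketch applies a simple gap-fraction (Beer-Lambert type) LAIe estimate to a toy grid of return counts, once for the whole site and once as a mean over cells; the counts and extinction coefficient are invented and the sketch does not reproduce the toolkit's full-waveform processing.

```python
# Illustrative sketch (not the CHP toolkit's waveform processing): a simple
# gap-fraction (Beer-Lambert type) LAIe estimate computed for the whole site
# and as a mean over grid cells, showing how canopy discontinuity (clumping)
# lowers the "apparent" site-level value. Counts and k are invented.
import numpy as np

k = 0.5  # assumed extinction coefficient (spherical leaf angle distribution)

# Toy grid cells: (canopy returns, ground returns) per cell.
cells = [(900, 100), (850, 150), (0, 1000), (0, 1000)]  # two tree cells, two gaps

def laie(canopy, ground):
    gap_fraction = ground / (canopy + ground)
    return -np.log(gap_fraction) / k if gap_fraction > 0 else np.nan

site_canopy = sum(c for c, _ in cells)
site_ground = sum(g for _, g in cells)

apparent = laie(site_canopy, site_ground)             # whole site treated as one canopy
actual = np.nanmean([laie(c, g) for c, g in cells])   # mean of per-cell estimates
print(f"apparent LAIe: {apparent:.2f}, grid-mean LAIe: {actual:.2f}")
```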