930 resultados para Memory Management (Computer science)
Resumo:
In this paper, we present a novel indexing technique called Multi-scale Similarity Indexing (MSI) to index image's multi-features into a single one-dimensional structure. Both for text and visual feature spaces, the similarity between a point and a local partition's center in individual space is used as the indexing key, where similarity values in different features are distinguished by different scale. Then a single indexing tree can be built on these keys. Based on the property that relevant images have similar similarity values from the center of the same local partition in any feature space, certain number of irrelevant images can be fast pruned based on the triangle inequity on indexing keys. To remove the dimensionality curse existing in high dimensional structure, we propose a new technique called Local Bit Stream (LBS). LBS transforms image's text and visual feature representations into simple, uniform and effective bit stream (BS) representations based on local partition's center. Such BS representations are small in size and fast for comparison since only bit operation are involved. By comparing common bits existing in two BSs, most of irrelevant images can be immediately filtered. To effectively integrate multi-features, we also investigated the following evidence combination techniques-Certainty Factor, Dempster Shafer Theory, Compound Probability, and Linear Combination. Our extensive experiment showed that single one-dimensional index on multi-features improves multi-indices on multi-features greatly. Our LBS method outperforms sequential scan on high dimensional space by an order of magnitude. And Certainty Factor and Dempster Shafer Theory perform best in combining multiple similarities from corresponding multiple features.
Resumo:
Summarizing topological relations is fundamental to many spatial applications including spatial query optimization. In this article, we present several novel techniques to effectively construct cell density based spatial histograms for range (window) summarizations restricted to the four most important level-two topological relations: contains, contained, overlap, and disjoint. We first present a novel framework to construct a multiscale Euler histogram in 2D space with the guarantee of the exact summarization results for aligned windows in constant time. To minimize the storage space in such a multiscale Euler histogram, an approximate algorithm with the approximate ratio 19/12 is presented, while the problem is shown NP-hard generally. To conform to a limited storage space where a multiscale histogram may be allowed to have only k Euler histograms, an effective algorithm is presented to construct multiscale histograms to achieve high accuracy in approximately summarizing aligned windows. Then, we present a new approximate algorithm to query an Euler histogram that cannot guarantee the exact answers; it runs in constant time. We also investigate the problem of nonaligned windows and the problem of effectively partitioning the data space to support nonaligned window queries. Finally, we extend our techniques to 3D space. Our extensive experiments against both synthetic and real world datasets demonstrate that the approximate multiscale histogram techniques may improve the accuracy of the existing techniques by several orders of magnitude while retaining the cost efficiency, and the exact multiscale histogram technique requires only a storage space linearly proportional to the number of cells for many popular real datasets.
Resumo:
In many advanced applications, data are described by multiple high-dimensional features. Moreover, different queries may weight these features differently; some may not even specify all the features. In this paper, we propose our solution to support efficient query processing in these applications. We devise a novel representation that compactly captures f features into two components: The first component is a 2D vector that reflects a distance range ( minimum and maximum values) of the f features with respect to a reference point ( the center of the space) in a metric space and the second component is a bit signature, with two bits per dimension, obtained by analyzing each feature's descending energy histogram. This representation enables two levels of filtering: The first component prunes away points that do not share similar distance ranges, while the bit signature filters away points based on the dimensions of the relevant features. Moreover, the representation facilitates the use of a single index structure to further speed up processing. We employ the classical B+-tree for this purpose. We also propose a KNN search algorithm that exploits the access orders of critical dimensions of highly selective features and partial distances to prune the search space more effectively. Our extensive experiments on both real-life and synthetic data sets show that the proposed solution offers significant performance advantages over sequential scan and retrieval methods using single and multiple VA-files.
Resumo:
A set of techniques referred to as circular statistics has been developed for the analysis of directional and orientational data. The unit of measure for such data is angular (usually in either degrees or radians), and the statistical distributions underlying the techniques are characterised by their cyclic nature-for example, angles of 359.9 degrees are considered close to angles of 0 degrees. In this paper, we assert that such approaches can be easily adapted to analyse time-of-day and time-of-week data, and in particular daily cycles in the numbers of incidents reported to the police. We begin the paper by describing circular statistics. We then discuss how these may be modified, and demonstrate the approach with some examples for reported incidents in the Cardiff area of Wales. (c) 2005 Elsevier Ltd. All rights reserved.
Resumo:
This paper presents a scientific and technical description of the modelling framework and the main results of modelling the long-term average sediment delivery at hillslope to medium-scale catchments over the entire Murray Darling Basin (MDB). A theoretical development that relates long-term averaged sediment delivery to the statistics of rainfall and catchment parameters is presented. The derived flood frequency approach was adapted to investigate the problem of regionalization of the sediment delivery ratio (SDR) across the Basin. SDR, a measure of catchment response to the upland erosion rate, was modeled by two lumped linear stores arranged in series: hillslope transport to the nearest streams and flow routing in the channel network. The theory shows that the ratio of catchment sediment residence time (SRT) to average effective rainfall duration is the most important control in the sediment delivery processes. In this study, catchment SRTs were estimated using travel time for overland flow multiplied by an enlargement factor which is a function of particle size. Rainfall intensity and effective duration statistics were regionalized by using long-term measurements from 195 pluviograph sites within and around the Basin. Finally, the model was implemented across the MDB by using spatially distributed soil, vegetation, topographical and land use properties under Geographic Information System (GIs) environment. The results predict strong variations in SDR from close to 0 in floodplains to 70% in the eastern uplands of the Basin. (c) 2005 Elsevier Ltd. All rights reserved.
Resumo:
Irrigation practices that are profligate in their use of water have come under closer scrutiny by water managers and the public. Trickle irrigation has the propensity to increase water use efficiency but only if the system is designed to meet the soil and plant conditions. Recently we have provided a software tool, WetUp (http://www.clw.csiro.au/products/wetup/), to calculate the wetting patterns from trickle irrigation emitters. WetUp uses an analytical solution to calculate the wetted perimeter for both buried and surface emitters. This analytical solution has a number of assumptions, two of which are that the wetting front is defined by water content at which the hydraulic conductivity (K) is I mm day(-1) and that the flow occurs from a point source. Here we compare the wetting patterns calculated with a 2-dimensional numerical model, HYDRUS2D, for solving the water flow into typical soils with the analytical solution. The results show that the wetting patterns are similar, except when the soil properties result in the assumption of a point source no longer being a good description of the flow regime. Difficulties were also experienced with getting stable solutions with HYDRUS2D for soils with low hydraulic conductivities. (c) 2005 Elsevier Ltd. All rights reserved.
Resumo:
For second-hand products sold with warranty, the expected warranty cost for an item to the manufacturer, depends on (i) the age and/or usage as well as the maintenance history for the item and (ii) the terms of the warranty policy. The paper develops probabilistic models to compute the expected warranty cost to the manufacturer when the items are sold with free replacement or pro rata warranties. (C) 2000 Elsevier Science Ltd. All rights reserved.
Resumo:
Business environments have become exceedingly dynamic and competitive in recent times. This dynamism is manifested in the form of changing process requirements and time constraints. Workflow technology is currently one of the most promising fields of research in business process automation. However, workflow systems to date do not provide the flexibility necessary to support the dynamic nature of business processes. In this paper we primarily discuss the issues and challenges related to managing change and time in workflows representing dynamic business processes. We also present an analysis of workflow modifications and provide feasibility considerations for the automation of this process.
Resumo:
Multiresolution (or multi-scale) techniques make it possible for Web-based GIS applications to access large dataset. The performance of such systems relies on data transmission over network and multiresolution query processing. In the literature the latter has received little research attention so far, and the existing methods are not capable of processing large dataset. In this paper, we aim to improve multiresolution query processing in an online environment. A cost model for such query is proposed first, followed by three strategies for its optimization. Significant theoretical improvement can be observed when comparing against available methods. Application of these strategies is also discussed, and similar performance enhancement can be expected if implemented in online GIS applications.
Resumo:
A major task of traditional temporal event sequence mining is to predict the occurrences of a special type of event (called target event) in a long temporal sequence. Our previous work has defined a new type of pattern, called event-oriented pattern, which can potentially predict the target event within a certain period of time. However, in the event-oriented pattern discovery, because the size of interval for prediction is pre-defined, the mining results could be inaccurate and carry misleading information. In this paper, we introduce a new concept, called temporal feature, to rectify this shortcoming. Generally, for any event-oriented pattern discovered under the pre-given size of interval, the temporal feature is the minimal size of interval that makes the pattern interesting. Thus, by further investigating the temporal features of discovered event-oriented patterns, we can refine the knowledge for the target event prediction.
Resumo:
In this paper, we present a novel indexing technique called Multi-scale Similarity Indexing (MSI) to index imagersquos multi-features into a single one-dimensional structure. Both for text and visual feature spaces, the similarity between a point and a local partitionrsquos center in individual space is used as the indexing key, where similarity values in different features are distinguished by different scale. Then a single indexing tree can be built on these keys. Based on the property that relevant images haves similar similarity values from the center of the same local partition in any feature space, certain number of irrelevant images can be fast pruned based on the triangle inequity on indexing keys. To remove the ldquodimensionality curserdquo existing in high dimensional structure, we propose a new technique called Local Bit Stream (LBS). LBS transforms imagersquos text and visual feature representations into simple, uniform and effective bit stream (BS) representations based on local partitionrsquos center. Such BS representations are small in size and fast for comparison since only bit operation are involved. By comparing common bits existing in two BSs, most of irrelevant images can be immediately filtered. Our extensive experiment showed that single one-dimensional index on multi-features improves multi-indices on multi-features greatly. Our LBS method outperforms sequential scan on high dimensional space by an order of magnitude.
Resumo:
Collaborative recommendation is one of widely used recommendation systems, which recommend items to visitor on a basis of referring other's preference that is similar to current user. User profiling technique upon Web transaction data is able to capture such informative knowledge of user task or interest. With the discovered usage pattern information, it is likely to recommend Web users more preferred content or customize the Web presentation to visitors via collaborative recommendation. In addition, it is helpful to identify the underlying relationships among Web users, items as well as latent tasks during Web mining period. In this paper, we propose a Web recommendation framework based on user profiling technique. In this approach, we employ Probabilistic Latent Semantic Analysis (PLSA) to model the co-occurrence activities and develop a modified k-means clustering algorithm to build user profiles as the representatives of usage patterns. Moreover, the hidden task model is derived by characterizing the meaningful latent factor space. With the discovered user profiles, we then choose the most matched profile, which possesses the closely similar preference to current user and make collaborative recommendation based on the corresponding page weights appeared in the selected user profile. The preliminary experimental results performed on real world data sets show that the proposed approach is capable of making recommendation accurately and efficiently.
Resumo:
For determining functionality dependencies between two proteins, both represented as 3D structures, it is an essential condition that they have one or more matching structural regions called patches. As 3D structures for proteins are large, complex and constantly evolving, it is computationally expensive and very time-consuming to identify possible locations and sizes of patches for a given protein against a large protein database. In this paper, we address a vector space based representation for protein structures, where a patch is formed by the vectors within the region. Based on our previews work, a compact representation of the patch named patch signature is applied here. A similarity measure of two patches is then derived based on their signatures. To achieve fast patch matching in large protein databases, a match-and-expand strategy is proposed. Given a query patch, a set of small k-sized matching patches, called candidate patches, is generated in match stage. The candidate patches are further filtered by enlarging k in expand stage. Our extensive experimental results demonstrate encouraging performances with respect to this biologically critical but previously computationally prohibitive problem.