96 results for Database management -- Computer programs
Abstract:
In this paper, we present a novel indexing technique called Multi-scale Similarity Indexing (MSI) to index an image's multiple features in a single one-dimensional structure. For both text and visual feature spaces, the similarity between a point and the center of its local partition in the individual space is used as the indexing key, where similarity values in different features are distinguished by different scales. A single indexing tree can then be built on these keys. Based on the property that relevant images have similar similarity values with respect to the center of the same local partition in any feature space, a certain number of irrelevant images can be quickly pruned using the triangle inequality on the indexing keys. To remove the "dimensionality curse" that exists in high-dimensional structures, we propose a new technique called Local Bit Stream (LBS). LBS transforms an image's text and visual feature representations into simple, uniform and effective bit stream (BS) representations based on the local partition's center. Such BS representations are small in size and fast to compare, since only bit operations are involved. By comparing the common bits in two BSs, most irrelevant images can be filtered out immediately. Our extensive experiments showed that a single one-dimensional index on multi-features greatly improves on multi-indices on multi-features. Our LBS method outperforms sequential scan in high-dimensional space by an order of magnitude.
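The pruning step described in this abstract can be illustrated with a minimal sketch. It is not the MSI implementation: it assumes Euclidean distance to a single partition center as the one-dimensional key, and the function names (`build_index`, `range_query`) are hypothetical. It only shows how the triangle inequality lets a sorted list of keys discard points before any full comparison.

```python
import bisect
import math


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_index(points, center):
    """Return (key, point_id) pairs sorted by distance to the partition center."""
    return sorted((distance(p, center), i) for i, p in enumerate(points))


def range_query(index, points, center, query, radius):
    """Return ids of points within `radius` of `query`, pruning on keys first."""
    dq = distance(query, center)
    keys = [k for k, _ in index]
    # Triangle inequality: |d(q, c) - d(p, c)| <= d(q, p), so any point whose
    # key lies outside [dq - radius, dq + radius] cannot be a match.
    lo = bisect.bisect_left(keys, dq - radius)
    hi = bisect.bisect_right(keys, dq + radius)
    candidates = [i for _, i in index[lo:hi]]
    # Only the surviving candidates need a full distance computation.
    return sorted(i for i in candidates if distance(points[i], query) <= radius)


points = [(0, 0), (1, 0), (5, 5), (0.5, 0.5)]
center = (0, 0)
index = build_index(points, center)
matches = range_query(index, points, center, query=(1, 1), radius=1.0)
```

In this toy data set the key range excludes two of the four points before any full distance is computed against the query.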
Abstract:
One of the critical challenges in the automatic recognition of TV commercials is to generate a unique, robust and compact signature. Uniqueness refers to the ability to identify the similarity among commercial video clips that may have slight content variations. Robustness means the ability to match commercial video clips that contain the same content but possibly differ in digitization/encoding, noise, and/or transmission and recording distortion. Efficiency concerns the capability of effectively matching commercial video sequences with low computation cost and storage overhead. In this paper, we present a binary signature based method, which meets all three criteria above, by combining the techniques of ordinal and color measurements. Experimental results on a large, real commercial video database show that our novel approach delivers significantly better performance compared to the existing methods.
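A compact binary signature built from block intensities can be sketched as follows. This is an illustrative simplification, not the method of the paper: it assumes a grayscale frame given as a list of rows, sets one bit per block depending on whether the block mean exceeds the mean over all blocks, and compares signatures by Hamming distance. All names are hypothetical, and the color measurement is omitted.

```python
def block_means(frame, rows, cols):
    """Average intensity of each block in a rows x cols grid over the frame."""
    h, w = len(frame), len(frame[0])
    means = []
    for r in range(rows):
        for c in range(cols):
            ys = range(r * h // rows, (r + 1) * h // rows)
            xs = range(c * w // cols, (c + 1) * w // cols)
            vals = [frame[y][x] for y in ys for x in xs]
            means.append(sum(vals) / len(vals))
    return means


def binary_signature(frame, rows=2, cols=2):
    """Pack one bit per block: 1 if the block is brighter than the average block."""
    means = block_means(frame, rows, cols)
    overall = sum(means) / len(means)
    bits = 0
    for m in means:
        bits = (bits << 1) | (1 if m > overall else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two signatures."""
    return bin(a ^ b).count("1")


frame = [[10, 10, 200, 200],
         [10, 10, 200, 200],
         [50, 50, 90, 90],
         [50, 50, 90, 90]]
# The same content with a uniform brightness shift, standing in for re-encoding.
brighter = [[v + 20 for v in row] for row in frame]
# The same content mirrored left to right, standing in for a different clip.
mirrored = [list(reversed(row)) for row in frame]
```

Because the bits depend only on the relative order of block intensities, the uniformly brightened frame yields the same signature, while the mirrored frame differs in every bit.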
Abstract:
Collaborative recommendation is one of the widely used recommendation techniques; it recommends items to a visitor by referring to the preferences of other users who are similar to the current user. User profiling on Web transaction data can capture such informative knowledge of user tasks or interests. With the discovered usage pattern information, it becomes possible to recommend more preferred content to Web users or to customize the Web presentation for visitors via collaborative recommendation. In addition, it is helpful to identify the underlying relationships among Web users, items and latent tasks during Web mining. In this paper, we propose a Web recommendation framework based on user profiling. In this approach, we employ Probabilistic Latent Semantic Analysis (PLSA) to model the co-occurrence activities and develop a modified k-means clustering algorithm to build user profiles as representatives of usage patterns. Moreover, the hidden task model is derived by characterizing the meaningful latent factor space. With the discovered user profiles, we then choose the best-matching profile, the one whose preferences are most similar to those of the current user, and make collaborative recommendations based on the corresponding page weights in the selected user profile. Preliminary experimental results on real-world data sets show that the proposed approach is capable of making recommendations accurately and efficiently.
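The final recommendation step (choose the best-matching profile, then rank pages by their weights in it) might look like the sketch below. It assumes profiles are already available as page-to-weight dictionaries and uses cosine similarity for matching; the abstract names neither, so both are assumptions, as are the function names and the sample data. The PLSA and clustering stages are not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse page-weight dictionaries."""
    dot = sum(w * v.get(page, 0.0) for page, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(session, profiles, top_n=2):
    """Pick the profile closest to the session and return its top unseen pages."""
    best = max(profiles, key=lambda profile: cosine(session, profile))
    unseen = [(weight, page) for page, weight in best.items() if page not in session]
    # Highest weight first; ties broken alphabetically for a stable result.
    unseen.sort(key=lambda wp: (-wp[0], wp[1]))
    return [page for _, page in unseen[:top_n]]


# Hypothetical profiles: page identifiers mapped to weights.
profiles = [
    {"a": 0.9, "b": 0.8, "c": 0.5, "d": 0.1},
    {"x": 0.9, "y": 0.7, "a": 0.1},
]
session = {"a": 1.0, "b": 1.0}  # pages visited by the current user
suggestions = recommend(session, profiles)
```

Pages already in the session are excluded so that only new content is suggested.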
Abstract:
For determining functionality dependencies between two proteins, both represented as 3D structures, it is an essential condition that they have one or more matching structural regions called patches. As 3D structures for proteins are large, complex and constantly evolving, it is computationally expensive and very time-consuming to identify possible locations and sizes of patches for a given protein against a large protein database. In this paper, we address a vector space based representation for protein structures, where a patch is formed by the vectors within the region. Based on our previous work, a compact representation of the patch, named the patch signature, is applied here. A similarity measure for two patches is then derived from their signatures. To achieve fast patch matching in large protein databases, a match-and-expand strategy is proposed. Given a query patch, a set of small k-sized matching patches, called candidate patches, is generated in the match stage. The candidate patches are further filtered by enlarging k in the expand stage. Our extensive experimental results demonstrate encouraging performance on this biologically critical but previously computationally prohibitive problem.
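The match-and-expand idea can be shown on a deliberately simplified stand-in. The sketch below treats a structure as a one-dimensional list of numbers and a patch as a contiguous window, which is an assumption made purely for illustration; real patches are regions of 3D vectors compared through their signatures. The functions `similar`, `match_stage` and `expand_stage` and the tolerance parameter are hypothetical.

```python
def similar(a, b, tol):
    """True if two equal-length windows agree element-wise within `tol`."""
    return all(abs(x - y) <= tol for x, y in zip(a, b))


def match_stage(query, target, k, tol):
    """Offsets in `target` where a k-sized window matches the first k query values."""
    return [i for i in range(len(target) - k + 1)
            if similar(query[:k], target[i:i + k], tol)]


def expand_stage(query, target, candidates, k, tol):
    """Enlarge the window step by step, dropping candidates that stop matching."""
    size = k
    while size < len(query) and candidates:
        size += 1
        candidates = [i for i in candidates
                      if i + size <= len(target)
                      and similar(query[:size], target[i:i + size], tol)]
    return candidates


query = [1.0, 2.0, 3.0, 4.0]
target = [1.0, 2.0, 9.0, 1.0, 2.0, 3.0, 4.0, 5.0]
candidates = match_stage(query, target, k=2, tol=0.1)
survivors = expand_stage(query, target, candidates, k=2, tol=0.1)
```

The cheap small-k comparison produces a short candidate list, and only those candidates pay for the larger comparisons in the expand stage.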
Abstract:
A Geographic Information System (GIS) was used to model datasets of Leyte Island, the Philippines, to identify land suitable for a forest extension program on the island. The datasets were modelled to provide maps of the distance of land from cities and towns, of land at a suitable elevation and slope for smallholder forestry, and of land of various soil types. An expert group was used to assign numeric site suitabilities to the soil types, and maps of site suitability were used to assist the selection of municipalities for the provision of extension assistance to smallholders. Modelling of the datasets was facilitated by recent developments in the ArcGIS® suite of computer programs, and derivation of elevation and slope was assisted by the availability of digital elevation models (DEM) produced by the Shuttle Radar Topography Mission (SRTM). The usefulness of GIS software as a decision support tool for small-scale forestry extension programs is discussed.
Abstract:
The second edition of An Introduction to Efficiency and Productivity Analysis is designed to be a general introduction for those who wish to study efficiency and productivity analysis. The book provides an accessible, well-written introduction to the four principal methods involved: econometric estimation of average response models; index numbers; data envelopment analysis (DEA); and stochastic frontier analysis (SFA). For each method, a detailed introduction to the basic concepts is presented, numerical examples are provided, and some of the more important extensions to the basic methods are discussed. Of special interest is the systematic use of detailed empirical applications using real-world data throughout the book. In recent years, a number of excellent advanced-level books have been published on performance measurement. This book, however, is the first systematic survey of performance measurement with the express purpose of introducing the field to a wide audience of students, researchers, and practitioners. Indeed, the 2nd Edition maintains its uniqueness: (1) It is a well-written introduction to the field. (2) It outlines, discusses and compares the four principal methods for efficiency and productivity analysis in a well-motivated presentation. (3) It provides detailed advice on computer programs that can be used to implement these performance measurement methods. The book contains computer instructions and output listings for the SHAZAM, LIMDEP, TFPIP, DEAP and FRONTIER computer programs. More extensive listings of data and computer instruction files are available on the book's website: (www.uq.edu.au/economics/cepa/crob2005).