53 resultados para random indexing


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Motivation: The clustering of gene profiles across some experimental conditions of interest contributes significantly to the elucidation of unknown gene function, the validation of gene discoveries and the interpretation of biological processes. However, this clustering problem is not straightforward as the profiles of the genes are not all independently distributed and the expression levels may have been obtained from an experimental design involving replicated arrays. Ignoring the dependence between the gene profiles and the structure of the replicated data can result in important sources of variability in the experiments being overlooked in the analysis, with the consequent possibility of misleading inferences being made. We propose a random-effects model that provides a unified approach to the clustering of genes with correlated expression levels measured in a wide variety of experimental situations. Our model is an extension of the normal mixture model to account for the correlations between the gene profiles and to enable covariate information to be incorporated into the clustering process. Hence the model is applicable to longitudinal studies with or without replication, for example, time-course experiments by using time as a covariate, and to cross-sectional experiments by using categorical covariates to represent the different experimental classes. Results: We show that our random-effects model can be fitted by maximum likelihood via the EM algorithm for which the E(expectation) and M(maximization) steps can be implemented in closed form. Hence our model can be fitted deterministically without the need for time-consuming Monte Carlo approximations. The effectiveness of our model-based procedure for the clustering of correlated gene profiles is demonstrated on three real datasets, representing typical microarray experimental designs, covering time-course, repeated-measurement and cross-sectional data. In these examples, relevant clusters of the genes are obtained, which are supported by existing gene-function annotation. A synthetic dataset is considered too.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

With the rapid increase in both centralized video archives and distributed WWW video resources, content-based video retrieval is gaining its importance. To support such applications efficiently, content-based video indexing must be addressed. Typically, each video is represented by a sequence of frames. Due to the high dimensionality of frame representation and the large number of frames, video indexing introduces an additional degree of complexity. In this paper, we address the problem of content-based video indexing and propose an efficient solution, called the Ordered VA-File (OVA-File) based on the VA-file. OVA-File is a hierarchical structure and has two novel features: 1) partitioning the whole file into slices such that only a small number of slices are accessed and checked during k Nearest Neighbor (kNN) search and 2) efficient handling of insertions of new vectors into the OVA-File, such that the average distance between the new vectors and those approximations near that position is minimized. To facilitate a search, we present an efficient approximate kNN algorithm named Ordered VA-LOW (OVA-LOW) based on the proposed OVA-File. OVA-LOW first chooses possible OVA-Slices by ranking the distances between their corresponding centers and the query vector, and then visits all approximations in the selected OVA-Slices to work out approximate kNN. The number of possible OVA-Slices is controlled by a user-defined parameter delta. By adjusting delta, OVA-LOW provides a trade-off between the query cost and the result quality. Query by video clip consisting of multiple frames is also discussed. Extensive experimental studies using real video data sets were conducted and the results showed that our methods can yield a significant speed-up over an existing VA-file-based method and iDistance with high query result quality. Furthermore, by incorporating temporal correlation of video content, our methods achieved much more efficient performance.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The random switching of measurement bases is commonly assumed to be a necessary step of quantum key distribution protocols. In this paper we present a no-switching protocol and show that switching is not required for coherent-state continuous-variable quantum key distribution. Further, this protocol achieves higher information rates and a simpler experimental setup compared to previous protocols that rely on switching. We propose an optimal eavesdropping attack against this protocol, assuming individual Gaussian attacks. Finally, we investigate and compare the no-switching protocol applied to the original Bennett-Brassard 1984 scheme.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, we present a novel indexing technique called Multi-scale Similarity Indexing (MSI) to index image's multi-features into a single one-dimensional structure. Both for text and visual feature spaces, the similarity between a point and a local partition's center in individual space is used as the indexing key, where similarity values in different features are distinguished by different scale. Then a single indexing tree can be built on these keys. Based on the property that relevant images have similar similarity values from the center of the same local partition in any feature space, certain number of irrelevant images can be fast pruned based on the triangle inequity on indexing keys. To remove the dimensionality curse existing in high dimensional structure, we propose a new technique called Local Bit Stream (LBS). LBS transforms image's text and visual feature representations into simple, uniform and effective bit stream (BS) representations based on local partition's center. Such BS representations are small in size and fast for comparison since only bit operation are involved. By comparing common bits existing in two BSs, most of irrelevant images can be immediately filtered. To effectively integrate multi-features, we also investigated the following evidence combination techniques-Certainty Factor, Dempster Shafer Theory, Compound Probability, and Linear Combination. Our extensive experiment showed that single one-dimensional index on multi-features improves multi-indices on multi-features greatly. Our LBS method outperforms sequential scan on high dimensional space by an order of magnitude. And Certainty Factor and Dempster Shafer Theory perform best in combining multiple similarities from corresponding multiple features.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This work deals with the random free vibration of functionally graded laminates with general boundary conditions and subjected to a temperature change, taking into account the randomness in a number of independent input variables such as Young's modulus, Poisson's ratio and thermal expansion coefficient of each constituent material. Based on third-order shear deformation theory, the mixed-type formulation and a semi-analytical approach are employed to derive the standard eigenvalue problem in terms of deflection, mid-plane rotations and stress function. A mean-centered first-order perturbation technique is adopted to obtain the second-order statistics of vibration frequencies. A detailed parametric study is conducted, and extensive numerical results are presented in both tabular and graphical forms for laminated plates that contain functionally graded material which is made of aluminum and zirconia, showing the effects of scattering in thermo-clastic material constants, temperature change, edge support condition, side-to-thickness ratio, and plate aspect ratio on the stochastic characteristics of natural frequencies. (c) 2005 Elsevier B.V. All rights reserved.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Niche apportionment models have only been applied once to parasite communities. Only the random assortment model (RA), which indicates that species abundances are independent from each other and that interspecific competition is unimportant, provided a good fit to 3 out of 6 parasite communities investigated. The generality of this result needs to be validated, however. In this study we apply 5 niche apportionment models to the parasite communities of 14 fish species from the Great Barrier Reef. We determined which model fitted the data when using either numerical abundance or biomass as an estimate of parasite abundance, and whether the fit of niche apportionment models depends on how the parasite community is defined (e.g. ecto, endoparasites or all parasites considered together). The RA model provided a good fit for the whole community of parasites in 7 fish species when using biovolume (as a surrogate of biomass) as a measure of species abundance. The RA model also fitted observed data when ecto- and endoparasites were considered separately, using abundance or biovolume, but less frequently. Variation in fish sizes among species was not associated with the probability of a model fitting the data. Total numerical abundance and biovolume of parasites were not related across host species, suggesting that they capture different aspects of abundance. Biovolume is not only a better measurement to use with niche-orientated models, it should also be the preferred descriptor to analyse parasite community structure in other contexts. Most of the biological assumptions behind the RA model, i.e. randomness in apportioning niche space, lack of interspecific competition, independence of abundance among different species, and species with variable niches in changeable environments, are in accordance with some previous findings on parasite communities. Thus, parasite communities may generally be unsaturated with species, with empty niches, and interspecific interactions may generally be unimportant in determining parasite community structure.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We consider the problems of computing the power and exponential moments EXs and EetX of square Gaussian random matrices X=A+BWC for positive integer s and real t, where W is a standard normal random vector and A, B, C are appropriately dimensioned constant matrices. We solve the problems by a matrix product scalarization technique and interpret the solutions in system-theoretic terms. The results of the paper are applicable to Bayesian prediction in multivariate autoregressive time series and mean-reverting diffusion processes.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Indexing high dimensional datasets has attracted extensive attention from many researchers in the last decade. Since R-tree type of index structures are known as suffering curse of dimensionality problems, Pyramid-tree type of index structures, which are based on the B-tree, have been proposed to break the curse of dimensionality. However, for high dimensional data, the number of pyramids is often insufficient to discriminate data points when the number of dimensions is high. Its effectiveness degrades dramatically with the increase of dimensionality. In this paper, we focus on one particular issue of curse of dimensionality; that is, the surface of a hypercube in a high dimensional space approaches 100% of the total hypercube volume when the number of dimensions approaches infinite. We propose a new indexing method based on the surface of dimensionality. We prove that the Pyramid tree technology is a special case of our method. The results of our experiments demonstrate clear priority of our novel method.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

With rapid advances in video processing technologies and ever fast increments in network bandwidth, the popularity of video content publishing and sharing has made similarity search an indispensable operation to retrieve videos of user interests. The video similarity is usually measured by the percentage of similar frames shared by two video sequences, and each frame is typically represented as a high-dimensional feature vector. Unfortunately, high complexity of video content has posed the following major challenges for fast retrieval: (a) effective and compact video representations, (b) efficient similarity measurements, and (c) efficient indexing on the compact representations. In this paper, we propose a number of methods to achieve fast similarity search for very large video database. First, each video sequence is summarized into a small number of clusters, each of which contains similar frames and is represented by a novel compact model called Video Triplet (ViTri). ViTri models a cluster as a tightly bounded hypersphere described by its position, radius, and density. The ViTri similarity is measured by the volume of intersection between two hyperspheres multiplying the minimal density, i.e., the estimated number of similar frames shared by two clusters. The total number of similar frames is then estimated to derive the overall similarity between two video sequences. Hence the time complexity of video similarity measure can be reduced greatly. To further reduce the number of similarity computations on ViTris, we introduce a new one dimensional transformation technique which rotates and shifts the original axis system using PCA in such a way that the original inter-distance between two high-dimensional vectors can be maximally retained after mapping. An efficient B+-tree is then built on the transformed one dimensional values of ViTris' positions. Such a transformation enables B+-tree to achieve its optimal performance by quickly filtering a large portion of non-similar ViTris. Our extensive experiments on real large video datasets prove the effectiveness of our proposals that outperform existing methods significantly.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, we present a novel indexing technique called Multi-scale Similarity Indexing (MSI) to index imagersquos multi-features into a single one-dimensional structure. Both for text and visual feature spaces, the similarity between a point and a local partitionrsquos center in individual space is used as the indexing key, where similarity values in different features are distinguished by different scale. Then a single indexing tree can be built on these keys. Based on the property that relevant images haves similar similarity values from the center of the same local partition in any feature space, certain number of irrelevant images can be fast pruned based on the triangle inequity on indexing keys. To remove the ldquodimensionality curserdquo existing in high dimensional structure, we propose a new technique called Local Bit Stream (LBS). LBS transforms imagersquos text and visual feature representations into simple, uniform and effective bit stream (BS) representations based on local partitionrsquos center. Such BS representations are small in size and fast for comparison since only bit operation are involved. By comparing common bits existing in two BSs, most of irrelevant images can be immediately filtered. Our extensive experiment showed that single one-dimensional index on multi-features improves multi-indices on multi-features greatly. Our LBS method outperforms sequential scan on high dimensional space by an order of magnitude.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

While others have attempted to determine, by way of mathematical formulae, optimal resource duplication strategies for random walk protocols, this paper is concerned with studying the emergent effects of dynamic resource propagation and replication. In particular, we show, via modelling and experimentation, that under any given decay (purge) rate the number of nodes that have knowledge of particular resource converges to a fixed point or a limit cycle. We also show that even for high rates of decay - that is, when few nodes have knowledge of a particular resource - the number of hops required to find that resource is small.