882 resultados para Interest Similarity


Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we consider the problem of learning an n × n kernel matrix from m(1) similarity matrices under general convex loss. Past research have extensively studied the m = 1 case and have derived several algorithms which require sophisticated techniques like ACCP, SOCP, etc. The existing algorithms do not apply if one uses arbitrary losses and often can not handle m > 1 case. We present several provably convergent iterative algorithms, where each iteration requires either an SVM or a Multiple Kernel Learning (MKL) solver for m > 1 case. One of the major contributions of the paper is to extend the well knownMirror Descent(MD) framework to handle Cartesian product of psd matrices. This novel extension leads to an algorithm, called EMKL, which solves the problem in O(m2 log n 2) iterations; in each iteration one solves an MKL involving m kernels and m eigen-decomposition of n × n matrices. By suitably defining a restriction on the objective function, a faster version of EMKL is proposed, called REKL,which avoids the eigen-decomposition. An alternative to both EMKL and REKL is also suggested which requires only an SVMsolver. Experimental results on real world protein data set involving several similarity matrices illustrate the efficacy of the proposed algorithms.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In computational molecular biology, the aim of restriction mapping is to locate the restriction sites of a given enzyme on a DNA molecule. Double digest and partial digest are two well-studied techniques for restriction mapping. While double digest is NP-complete, there is no known polynomial-time algorithm for partial digest. Another disadvantage of the above techniques is that there can be multiple solutions for reconstruction. In this paper, we study a simple technique called labeled partial digest for restriction mapping. We give a fast polynomial time (O(n(2) log n) worst-case) algorithm for finding all the n sites of a DNA molecule using this technique. An important advantage of the algorithm is the unique reconstruction of the DNA molecule from the digest. The technique is also robust in handling errors in fragment lengths which arises in the laboratory. We give a robust O(n(4)) worst-case algorithm that can provably tolerate an absolute error of O(Delta/n) (where Delta is the minimum inter-site distance), while giving a unique reconstruction. We test our theoretical results by simulating the performance of the algorithm on a real DNA molecule. Motivated by the similarity to the labeled partial digest problem, we address a related problem of interest-the de novo peptide sequencing problem (ACM-SIAM Symposium on Discrete Algorithms (SODA), 2000, pp. 389-398), which arises in the reconstruction of the peptide sequence of a protein molecule. We give a simple and efficient algorithm for the problem without using dynamic programming. The algorithm runs in time O(k log k), where k is the number of ions and is an improvement over the algorithm in Chen et al. (C) 2002 Elsevier Science (USA). All rights reserved.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The rapidly growing structure databases enhance the probability of finding identical sequences sharing structural similarity. Structure prediction methods are being used extensively to abridge the gap between known protein sequences and the solved structures which is essential to understand its specific biochemical and cellular functions. In this work, we plan to study the ambiguity between sequence-structure relationships and examine if sequentially identical peptide fragments adopt similar three-dimensional structures. Fragments of varying lengths (five to ten residues) were used to observe the behavior of sequence and its three-dimensional structures. The STAMP program was used to superpose the three-dimensional structures and the two parameters (Sequence Structure Similarity Score (Sc) and Root Mean Square Deviation value) were employed to classify them into three categories: similar, intermediate and dissimilar structures. Furthermore, the same approach was carried out on all the three-dimensional protein structures solved in the two organisms, Mycobacterium tuberculosis and Plasmodium falciparum to validate our results.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Competition theory predicts that local communities should consist of species that are more dissimilar than expected by chance. We find a strikingly different pattern in a multicontinent data set (55 presence-absence matrices from 24 locations) on the composition of mixed-species bird flocks, which are important sub-units of local bird communities the world over. By using null models and randomization tests followed by meta-analysis, we find the association strengths of species in flocks to be strongly related to similarity in body size and foraging behavior and higher for congeneric compared with noncongeneric species pairs. Given the local spatial scales of our individual analyses, differences in the habitat preferences of species are unlikely to have caused these association patterns; the patterns observed are most likely the outcome of species interactions. Extending group-living and social-information-use theory to a heterospecific context, we discuss potential behavioral mechanisms that lead to positive interactions among similar species in flocks, as well as ways in which competition costs are reduced. Our findings highlight the need to consider positive interactions along with competition when seeking to explain community assembly.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

How do we perform rapid visual categorization?It is widely thought that categorization involves evaluating the similarity of an object to other category items, but the underlying features and similarity relations remain unknown. Here, we hypothesized that categorization performance is based on perceived similarity relations between items within and outside the category. To this end, we measured the categorization performance of human subjects on three diverse visual categories (animals, vehicles, and tools) and across three hierarchical levels (superordinate, basic, and subordinate levels among animals). For the same subjects, we measured their perceived pair-wise similarities between objects using a visual search task. Regardless of category and hierarchical level, we found that the time taken to categorize an object could be predicted using its similarity to members within and outside its category. We were able to account for several classic categorization phenomena, such as (a) the longer times required to reject category membership; (b) the longer times to categorize atypical objects; and (c) differences in performance across tasks and across hierarchical levels. These categorization times were also accounted for by a model that extracts coarse structure from an image. The striking agreement observed between categorization and visual search suggests that these two disparate tasks depend on a shared coarse object representation.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Time series classification deals with the problem of classification of data that is multivariate in nature. This means that one or more of the attributes is in the form of a sequence. The notion of similarity or distance, used in time series data, is significant and affects the accuracy, time, and space complexity of the classification algorithm. There exist numerous similarity measures for time series data, but each of them has its own disadvantages. Instead of relying upon a single similarity measure, our aim is to find the near optimal solution to the classification problem by combining different similarity measures. In this work, we use genetic algorithms to combine the similarity measures so as to get the best performance. The weightage given to different similarity measures evolves over a number of generations so as to get the best combination. We test our approach on a number of benchmark time series datasets and present promising results.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Accurate and timely prediction of weather phenomena, such as hurricanes and flash floods, require high-fidelity compute intensive simulations of multiple finer regions of interest within a coarse simulation domain. Current weather applications execute these nested simulations sequentially using all the available processors, which is sub-optimal due to their sub-linear scalability. In this work, we present a strategy for parallel execution of multiple nested domain simulations based on partitioning the 2-D processor grid into disjoint rectangular regions associated with each domain. We propose a novel combination of performance prediction, processor allocation methods and topology-aware mapping of the regions on torus interconnects. Experiments on IBM Blue Gene systems using WRF show that the proposed strategies result in performance improvement of up to 33% with topology-oblivious mapping and up to additional 7% with topology-aware mapping over the default sequential strategy.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Background: The number of genome-wide association studies (GWAS) has increased rapidly in the past couple of years, resulting in the identification of genes associated with different diseases. The next step in translating these findings into biomedically useful information is to find out the mechanism of the action of these genes. However, GWAS studies often implicate genes whose functions are currently unknown; for example, MYEOV, ANKLE1, TMEM45B and ORAOV1 are found to be associated with breast cancer, but their molecular function is unknown. Results: We carried out Bayesian inference of Gene Ontology (GO) term annotations of genes by employing the directed acyclic graph structure of GO and the network of protein-protein interactions (PPIs). The approach is designed based on the fact that two proteins that interact biophysically would be in physical proximity of each other, would possess complementary molecular function, and play role in related biological processes. Predicted GO terms were ranked according to their relative association scores and the approach was evaluated quantitatively by plotting the precision versus recall values and F-scores (the harmonic mean of precision and recall) versus varying thresholds. Precisions of similar to 58% and similar to 40% for localization and functions respectively of proteins were determined at a threshold of similar to 30 (top 30 GO terms in the ranked list). Comparison with function prediction based on semantic similarity among nodes in an ontology and incorporation of those similarities in a k nearest neighbor classifier confirmed that our results compared favorably. Conclusions: This approach was applied to predict the cellular component and molecular function GO terms of all human proteins that have interacting partners possessing at least one known GO annotation. The list of predictions is available at http://severus.dbmi.pitt.edu/engo/GOPRED.html. We present the algorithm, evaluations and the results of the computational predictions, especially for genes identified in GWAS studies to be associated with diseases, which are of translational interest.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Regions in video streams attracting human interest contribute significantly to human understanding of the video. Being able to predict salient and informative Regions of Interest (ROIs) through a sequence of eye movements is a challenging problem. Applications such as content-aware retargeting of videos to different aspect ratios while preserving informative regions and smart insertion of dialog (closed-caption text) into the video stream can significantly be improved using the predicted ROIs. We propose an interactive human-in-the-loop framework to model eye movements and predict visual saliency into yet-unseen frames. Eye tracking and video content are used to model visual attention in a manner that accounts for important eye-gaze characteristics such as temporal discontinuities due to sudden eye movements, noise, and behavioral artifacts. A novel statistical-and algorithm-based method gaze buffering is proposed for eye-gaze analysis and its fusion with content-based features. Our robust saliency prediction is instantiated for two challenging and exciting applications. The first application alters video aspect ratios on-the-fly using content-aware video retargeting, thus making them suitable for a variety of display sizes. The second application dynamically localizes active speakers and places dialog captions on-the-fly in the video stream. Our method ensures that dialogs are faithful to active speaker locations and do not interfere with salient content in the video stream. Our framework naturally accommodates personalisation of the application to suit biases and preferences of individual users.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Visual tracking is an important task in various computer vision applications including visual surveillance, human computer interaction, event detection, video indexing and retrieval. Recent state of the art sparse representation (SR) based trackers show better robustness than many of the other existing trackers. One of the issues with these SR trackers is low execution speed. The particle filter framework is one of the major aspects responsible for slow execution, and is common to most of the existing SR trackers. In this paper,(1) we propose a robust interest point based tracker in l(1) minimization framework that runs at real-time with performance comparable to the state of the art trackers. In the proposed tracker, the target dictionary is obtained from the patches around target interest points. Next, the interest points from the candidate window of the current frame are obtained. The correspondence between target and candidate points is obtained via solving the proposed l(1) minimization problem. In order to prune the noisy matches, a robust matching criterion is proposed, where only the reliable candidate points that mutually match with target and candidate dictionary elements are considered for tracking. The object is localized by measuring the displacement of these interest points. The reliable candidate patches are used for updating the target dictionary. The performance and accuracy of the proposed tracker is benchmarked with several complex video sequences. The tracker is found to be considerably fast as compared to the reported state of the art trackers. The proposed tracker is further evaluated for various local patch sizes, number of interest points and regularization parameters. The performance of the tracker for various challenges including illumination change, occlusion, and background clutter has been quantified with a benchmark dataset containing 50 videos. (C) 2014 Elsevier B.V. All rights reserved.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Let R be a (commutative) local principal ideal ring of length two, for example, the ring R = Z/p(2)Z with p prime. In this paper, we develop a theory of normal forms for similarity classes in the matrix rings M-n (R) by interpreting them in terms of extensions of R t]-modules. Using this theory, we describe the similarity classes in M-n (R) for n <= 4, along with their centralizers. Among these, we characterize those classes which are similar to their transposes. Non-self-transpose classes are shown to exist for all n > 3. When R has finite residue field of order q, we enumerate the similarity classes and the cardinalities of their centralizers as polynomials in q. Surprisingly, the polynomials representing the number of similarity classes in M-n (R) turn out to have non-negative integer coefficients.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Query suggestion is an important feature of the search engine with the explosive and diverse growth of web contents. Different kind of suggestions like query, image, movies, music and book etc. are used every day. Various types of data sources are used for the suggestions. If we model the data into various kinds of graphs then we can build a general method for any suggestions. In this paper, we have proposed a general method for query suggestion by combining two graphs: (1) query click graph which captures the relationship between queries frequently clicked on common URLs and (2) query text similarity graph which finds the similarity between two queries using Jaccard similarity. The proposed method provides literally as well as semantically relevant queries for users' need. Simulation results show that the proposed algorithm outperforms heat diffusion method by providing more number of relevant queries. It can be used for recommendation tasks like query, image, and product suggestion.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Diffuse optical tomography (DOT) using near-infrared light is a promising tool for non-invasive imaging of deep tissue. This technique is capable of quantitative reconstruction of absorption (mu(a)) and scattering coefficient (mu(s)) inhomogeneities in the tissue. The rationale for reconstructing the optical property map is that the absorption coefficient variation provides diagnostic information about metabolic and disease states of the tissue. The aim of DOT is to reconstruct the internal tissue cross section with good spatial resolution and contrast from noisy measurements non-invasively. We develop a region-of-interest scanning system based on DOT principles. Modulated light is injected into the phantom/tissue through one of the four light emitting diode sources. The light traversing through the tissue gets partially absorbed and scattered multiple times. The intensity and phase of the exiting light are measured using a set of photodetectors. The light transport through a tissue is diffusive in nature and is modeled using radiative transfer equation. However, a simplified model based on diffusion equation (DE) can be used if the system satisfies following conditions: (a) the optical parameter of the inhomogeneity is close to the optical property of the background, and (b) mu(s) of the medium is much greater than mu(a) (mu(s) >> mu(a)). The light transport through a highly scattering tissue satisfies both of these conditions. A discrete version of DE based on finite element method is used for solving the inverse problem. The depth of probing light inside the tissue depends on the wavelength of light, absorption, and scattering coefficients of the medium and the separation between the source and detector locations. Extensive simulation studies have been carried out and the results are validated using two sets of experimental measurements. The utility of the system can be further improved by using multiple wavelength light sources. In such a scheme, the spectroscopic variation of absorption coefficient in the tissue can be used to arrive at the oxygenation changes in the tissue. (C) 2016 AIP Publishing LLC.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Laminar-flow non-transferred DC plasma jets were generated by a torch with an inter-electrode insert by which the arc column was limited to a length of about 20 mm. Current–voltage characteristics, thermal efficiency and jet length, a parameter which changes greatly with the generating parameters in contrast with the almost unchangeable jet length of the turbulent plasma, were investigated systematically, by using the similarity theory combined with the corresponding experimental examination. Formulae in non-dimensional forms were derived for predicting the characteristics of the laminar plasma jet generation, within the parameter ranges where no transfer to turbulent flow occurs. Mean arc temperature in the torch channel and mean jet-flow temperature at the torch exit were obtained, and the results indicate that the thermal conductivity feature of the working gas seems to be an important factor affecting thermal efficiency of laminar plasma generation.