273 results for Graph API


Relevance:

10.00%

Publisher:

Abstract:

The fidelity with which folding pathways are encoded in the amino acid sequence is challenged in instances where proteins with no sequence homology, performing different functions and showing no apparent evolutionary linkage, adopt a similar fold. Stated otherwise, the problem is that a limited fold space is available to a repertoire of diverse sequences. The key question is what factors lead to the formation of a fold from diverse sequences. Here, with the NAD(P)-binding Rossmann fold domains as a case study and using the concepts of network theory, we unveil the consensus structural features that drive the formation of this fold. We propose a graph theoretic formalism that captures the structural details in terms of the conserved atomic interactions in a global milieu, and hence extracts the essential topological features from diverse sequences. A unified mathematical representation of the different structures, together with a judicious concoction of several network parameters, enabled us to probe the structural features driving the adoption of the NAD(P)-binding Rossmann fold. The atomic interactions at key positions seem to be better conserved in proteins than the residues participating in these interactions. We propose a "spatial motif" and several "fold specific hot spots" that form the signature structural blueprints of the NAD(P)-binding Rossmann fold domain. Excellent agreement of our data with previous experimental and theoretical studies attests to the robustness and validity of the approach. Additionally, comparison of our results with statistical coupling analysis (SCA) provides further support. The methodology proposed here is general and can be applied to similar problems of interest.
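
To make the graph formalism concrete, here is a minimal sketch of the kind of structure network the abstract describes, assuming a generic residue-contact list rather than the paper's actual atomic-interaction definition; the 4.5 distance cutoff and the choice of betweenness centrality are illustrative assumptions, not the authors' parameters.

```python
import networkx as nx

# Hypothetical input: (residue_i, residue_j, distance) triples extracted from
# a structure. Edges join pairs within the distance cutoff.
def build_contact_graph(contacts, cutoff=4.5):
    G = nx.Graph()
    for i, j, d in contacts:
        if d <= cutoff:
            G.add_edge(i, j)
    return G

# Rank positions by betweenness centrality; positions that rank highly across
# many Rossmann-fold structures would be candidate "fold specific hot spots".
def hot_spot_candidates(G, top=10):
    bc = nx.betweenness_centrality(G)
    return sorted(bc, key=bc.get, reverse=True)[:top]
```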

Relevance:

10.00%

Publisher:

Abstract:

Low-density parity-check (LDPC) codes are a class of linear block codes that are decoded by running the belief propagation (BP) algorithm, or its log-likelihood ratio variant (LLR-BP), over the factor graph of the code. One disadvantage of LDPC codes is the onset of an error floor at high signal-to-noise ratios, caused by trapping sets. In this paper, we propose a two-stage decoder to deal with different types of trapping sets: oscillating trapping sets are handled by the first stage of the decoder, and elementary trapping sets by the second. Simulation results on the regular PEG (504,252,3,6) code and the irregular PEG (1024,518,15,8) code show that the proposed two-stage decoder performs significantly better than the standard decoder.
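
For reference, a compact min-sum approximation of the LLR-BP decoder that the two-stage scheme builds on; this sketch covers only the baseline decoder, not the paper's trapping-set handling, and assumes a small dense 0/1 parity-check matrix H.

```python
import numpy as np

def min_sum_decode(H, llr, max_iter=50):
    # llr uses the convention L = log(P(bit=0) / P(bit=1))
    m, n = H.shape
    mask = H == 1
    V = np.where(mask, llr, 0.0)              # variable-to-check messages
    for _ in range(max_iter):
        # check-node update: product of the other signs, min of the other magnitudes
        sign = np.where(mask, np.sign(V) + (V == 0), 1.0)
        row_sign = sign.prod(axis=1, keepdims=True)
        mag = np.where(mask, np.abs(V), np.inf)
        first = mag.min(axis=1, keepdims=True)
        arg = mag.argmin(axis=1)
        mag2 = mag.copy()
        mag2[np.arange(m), arg] = np.inf
        second = mag2.min(axis=1, keepdims=True)
        is_min = np.arange(n)[None, :] == arg[:, None]
        C = np.where(mask, row_sign * sign * np.where(is_min, second, first), 0.0)
        # variable-node update and hard decision
        total = llr + C.sum(axis=0)
        V = np.where(mask, total[None, :] - C, 0.0)
        x = (total < 0).astype(int)
        if not (H @ x % 2).any():             # all parity checks satisfied
            break
    return x
```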

Relevance:

10.00%

Publisher:

Abstract:

In this letter, we characterize the extrinsic information transfer (EXIT) behavior of a factor graph based message passing algorithm for detection in large multiple-input multiple-output (MIMO) systems with tens to hundreds of antennas. The EXIT curves of a joint detection-decoding receiver are obtained for low-density parity-check (LDPC) codes of given degree distributions. From the obtained EXIT curves, an optimization of the LDPC code degree profiles is carried out to design irregular LDPC codes matched to the large-MIMO channel and joint message passing receiver. With low complexity joint detection-decoding, these codes are shown to perform better than off-the-shelf irregular codes in the literature by about 1 to 1.5 dB at a coded BER of 10^(-5) in 16x16, 64x64 and 256x256 MIMO systems.
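
The standard EXIT machinery behind such curves can be sketched as follows for the classical AWGN setting; the letter's detector-side curves come from the MIMO message passing algorithm instead, so the Monte Carlo J-function route below is an illustrative stand-in, not the paper's procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# J(sigma): mutual information between a BPSK bit and its consistent Gaussian
# LLR, L ~ N(sigma^2/2, sigma^2), estimated by Monte Carlo.
def J(sigma, samples=200_000):
    if sigma <= 0:
        return 0.0
    L = rng.normal(sigma**2 / 2, sigma, samples)
    return 1.0 - np.mean(np.log2(1.0 + np.exp(-L)))

def J_inv(I, lo=1e-3, hi=40.0):
    for _ in range(40):                  # bisection; J is monotone in sigma
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if J(mid) < I else (lo, mid)
    return (lo + hi) / 2

# Extrinsic output of a degree-d_v variable node, given a priori information
# I_A and channel/detector information I_ch.
def exit_vnd(I_A, d_v, I_ch):
    return J(np.sqrt((d_v - 1) * J_inv(I_A)**2 + J_inv(I_ch)**2))

print(exit_vnd(0.5, d_v=3, I_ch=0.4))
```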

Relevance:

10.00%

Publisher:

Abstract:

Pervasive use of pointers in large-scale real-world applications continues to make points-to analysis an important optimization enabler. The rapid growth of software systems demands a scalable pointer analysis algorithm. A typical inclusion-based points-to analysis iteratively evaluates constraints and computes a points-to solution until a fixpoint is reached. In each iteration, (i) points-to information is propagated across directed edges in a constraint graph G and (ii) more edges are added by processing the points-to constraints. We observe that prioritizing the order in which the information is processed within each of these two steps can lead to efficient execution of the points-to analysis. While earlier work in the literature focuses only on the propagation order, we argue that the other dimension, prioritizing the constraint processing, can yield even greater improvements in how quickly the fixpoint of the points-to algorithm is reached. This becomes especially important as we prove that finding an optimal sequence for processing the points-to constraints is NP-complete. The prioritization scheme proposed in this paper is general enough to be applied to any of the existing points-to analyses. Using the prioritization framework developed in this paper, we implement prioritized versions of Andersen's analysis, Deep Propagation, Hardekopf and Lin's Lazy Cycle Detection and Bloom Filter based points-to analysis. In each case, we report significant improvements in the analysis times (33%, 47%, 44% and 20%, respectively) as well as in the memory requirements for a large suite of programs, including the SPEC 2000 benchmarks and five large open source programs.
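
A toy version of such a prioritized inclusion-based analysis, with only address-of and copy constraints; the priority key (current points-to set size, so cheap propagations happen early) is an illustrative stand-in for the paper's actual prioritization scheme.

```python
import heapq
from collections import defaultdict

# Constraint forms: ('addr', p, x) means pts(p) includes x;
# ('copy', p, q) means pts(p) is a superset of pts(q).
def solve(constraints):
    pts = defaultdict(set)            # variable -> points-to set
    succ = defaultdict(set)           # copy edges q -> {p}
    work = []
    def push(v):
        heapq.heappush(work, (len(pts[v]), v))
    for kind, a, b in constraints:
        if kind == 'addr':
            pts[a].add(b)
            push(a)
        elif kind == 'copy':
            succ[b].add(a)
            push(b)
    while work:
        _, v = heapq.heappop(work)
        for p in succ[v]:
            before = len(pts[p])
            pts[p] |= pts[v]
            if len(pts[p]) != before:  # re-enqueue only on change
                push(p)
    return pts

# Example: a = &x; b = a; c = b
print(dict(solve([('addr', 'a', 'x'), ('copy', 'b', 'a'), ('copy', 'c', 'b')])))
```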

Relevance:

10.00%

Publisher:

Abstract:

The n-interior-point variant of the Erdős–Szekeres problem is to show the following: for every n ≥ 1, every point set in the plane with a sufficient number of interior points contains a convex polygon containing exactly n interior points. This has been proved only for n ≤ 3. In this paper, we prove it for point sets having at most a logarithmic number of convex layers. We also show that in any point set containing at least n interior points, there exists a 2-convex polygon that contains exactly n interior points.
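
The convex-layer condition is easy to compute: peel convex hulls until the set is exhausted. A small sketch, assuming points in general position (Qhull rejects degenerate, e.g. collinear, inputs):

```python
import numpy as np
from scipy.spatial import ConvexHull

# Peel convex layers: repeatedly take the convex hull and remove its vertices.
def convex_layers(points):
    pts = np.asarray(points, dtype=float)
    layers = []
    while len(pts) >= 3:
        hull = ConvexHull(pts)
        layers.append(pts[hull.vertices])
        pts = np.delete(pts, hull.vertices, axis=0)
    if len(pts):                 # one or two leftover points form a last layer
        layers.append(pts)
    return layers

rng = np.random.default_rng(1)
print(len(convex_layers(rng.random((100, 2)))))   # number of layers
```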

Relevance:

10.00%

Publisher:

Abstract:

In many real-world prediction problems the output is a structured object like a sequence, a tree or a graph. Such problems range from natural language processing to computational biology or computer vision, and have been tackled using algorithms referred to as structured output learning algorithms. We consider the problem of structured classification. In the last few years, large margin classifiers like support vector machines (SVMs) have shown much promise for structured output learning. The related optimization problem is a convex quadratic program (QP) with a large number of constraints, which makes the problem intractable for large data sets. This paper proposes a fast sequential dual method (SDM) for structural SVMs. The method makes repeated passes over the training set and optimizes the dual variables associated with one example at a time. The use of additional heuristics makes the proposed method more efficient. We present an extensive empirical evaluation of the proposed method on several sequence learning problems. Our experiments on large data sets demonstrate that the proposed method is an order of magnitude faster than state-of-the-art methods like the cutting-plane method and the stochastic gradient descent (SGD) method. Further, SDM reaches steady-state generalization performance faster than SGD. The proposed SDM is thus a useful alternative for large scale structured output learning.
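
The flavor of a sequential dual method is easiest to see in its simplest instance, dual coordinate descent for a binary linear SVM with hinge loss, sweeping one example's dual variable at a time; the structural-SVM SDM of the paper performs the analogous per-example update on a far larger dual. A minimal sketch:

```python
import numpy as np

def sdm_binary_svm(X, y, C=1.0, passes=20, seed=0):
    n, d = X.shape
    alpha = np.zeros(n)
    w = np.zeros(d)
    q = (X * X).sum(axis=1)               # diagonal of the Gram matrix
    rng = np.random.default_rng(seed)
    for _ in range(passes):
        for i in rng.permutation(n):
            g = y[i] * X[i].dot(w) - 1.0  # gradient of the dual at alpha_i
            new = min(max(alpha[i] - g / q[i], 0.0), C)
            w += (new - alpha[i]) * y[i] * X[i]   # keep w = sum alpha_i y_i x_i
            alpha[i] = new
    return w

# Toy usage: two Gaussian blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(1, 1, (50, 2))])
y = np.array([-1.0] * 50 + [1.0] * 50)
w = sdm_binary_svm(X, y)
print((np.sign(X @ w) == y).mean())       # training accuracy
```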

Relevance:

10.00%

Publisher:

Abstract:

In this paper, we develop a game theoretic approach for clustering features in a learning problem. Feature clustering can serve as an important preprocessing step in many problems such as feature selection and dimensionality reduction. In this approach, we view features as rational players of a coalitional game who form coalitions (or clusters) among themselves in order to maximize their individual payoffs. We show how the Nash Stable Partition (NSP), a well known concept in coalitional game theory, provides a natural way of clustering features. Through this approach, one can obtain desirable properties of the clusters by choosing appropriate payoff functions. For a small number of features, the NSP based clustering can be found by solving an integer linear program (ILP). However, for a large number of features, the ILP based approach does not scale well, and hence we propose a hierarchical approach. Interestingly, a key result that we prove on the equivalence between a k-size NSP of a coalitional game and a minimum k-cut of an appropriately constructed graph comes in handy for large scale problems. In this paper, we use the feature selection problem (in a classification setting) as a running example to illustrate our approach, and we conduct experiments to demonstrate its efficacy.
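
For k = 2 the NSP/min-cut connection can be illustrated directly: build a weighted graph over features and split it along a global minimum cut. The pairwise scores and graph construction below are placeholder assumptions, not the paper's payoff design.

```python
import networkx as nx

# `weight` is a hypothetical dict {(f_i, f_j): dependence score}, e.g. an
# absolute correlation between features; a minimum cut separates the two
# most weakly coupled groups.
def two_cluster_features(names, weight):
    G = nx.Graph()
    G.add_nodes_from(names)
    for (i, j), w in weight.items():
        G.add_edge(i, j, weight=w)
    _, (part_a, part_b) = nx.stoer_wagner(G)   # global minimum cut
    return part_a, part_b

print(two_cluster_features(
    ['f1', 'f2', 'f3', 'f4'],
    {('f1', 'f2'): 0.9, ('f3', 'f4'): 0.8, ('f2', 'f3'): 0.1}))
```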

Relevance:

10.00%

Publisher:

Abstract:

Boxicity of a graph G(V, E) is the minimum integer k such that G can be represented as the intersection graph of k-dimensional axis-parallel boxes in R^k. Equivalently, it is the minimum number of interval graphs on the vertex set V such that the intersection of their edge sets is E. It is known that boxicity cannot be approximated in polynomial time below an O(n^(0.5-ε)) factor for any ε > 0, even for graph classes like bipartite, co-bipartite and split graphs, unless NP = ZPP. To date, there is no well known graph class of unbounded boxicity for which even an n^ε-factor approximation algorithm for computing boxicity is known, for any ε < 1. In this paper, we study the boxicity problem on circular arc graphs - intersection graphs of arcs of a circle. We give a (2 + 1/k)-factor polynomial time approximation algorithm for computing the boxicity of any circular arc graph, along with a corresponding box representation, where k ≥ 1 is its boxicity. For normal circular arc (NCA) graphs, with an NCA model given, this can be improved to an additive 2-factor approximation algorithm. The time complexity of the algorithms to approximately compute the boxicity is O(mn + n^2) in both cases, and in O(mn + kn^2) time, which is at most O(n^3), we also get the corresponding box representations, where n is the number of vertices of the graph and m is its number of edges. The additive 2-factor algorithm works directly for any proper circular arc graph, since an NCA model for it can be computed in polynomial time.
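
Operationally, a k-box representation assigns each vertex k intervals, with two vertices adjacent iff their boxes meet in every dimension. A small checker, using a hand-built 2-box representation of the 4-cycle (whose boxicity is 2):

```python
# Each box is a list of (lo, hi) intervals, one per dimension.
def boxes_intersect(a, b):
    return all(lo1 <= hi2 and lo2 <= hi1
               for (lo1, hi1), (lo2, hi2) in zip(a, b))

# Verify that a candidate box representation realizes exactly the edge set.
def realizes(boxes, edges):
    verts = sorted(boxes)
    found = {(u, v) for i, u in enumerate(verts) for v in verts[i + 1:]
             if boxes_intersect(boxes[u], boxes[v])}
    return found == {tuple(sorted(e)) for e in edges}

boxes = {1: [(0, 1), (0, 2)], 2: [(1, 2), (0, 1)],
         3: [(2, 3), (0, 2)], 4: [(1, 2), (1.5, 2.5)]}
print(realizes(boxes, [(1, 2), (2, 3), (3, 4), (4, 1)]))   # True
```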

Relevance:

10.00%

Publisher:

Abstract:

A $k$-box $B=(R_1,...,R_k)$, where each $R_i$ is a closed interval on the real line, is defined to be the Cartesian product $R_1\times R_2\times ...\times R_k$. If each $R_i$ is a unit length interval, we call $B$ a $k$-cube. Boxicity of a graph $G$, denoted as $\boxi(G)$, is the minimum integer $k$ such that $G$ is an intersection graph of $k$-boxes. Similarly, the cubicity of $G$, denoted as $\cubi(G)$, is the minimum integer $k$ such that $G$ is an intersection graph of $k$-cubes. It was shown in [L. Sunil Chandran, Mathew C. Francis, and Naveen Sivadasan: Representing graphs as the intersection of axis-parallel cubes. MCDES-2008, IISc Centenary Conference, available at CoRR, abs/cs/0607092, 2006] that, for a graph $G$ with maximum degree $\Delta$, $\cubi(G)\leq \lceil 4(\Delta +1)\log n\rceil$. In this paper, we show that, for a $k$-degenerate graph $G$, $\cubi(G) \leq (k+2) \lceil 2e \log n \rceil$. Since $k$ is at most $\Delta$ and can be much lower, this clearly is a stronger result. This bound is tight. We also give an efficient deterministic algorithm that runs in $O(n^2k)$ time to output an $8k(\lceil 2.42 \log n\rceil + 1)$-dimensional cube representation for $G$. An important consequence of the above result is that if the crossing number of a graph $G$ is $t$, then $\boxi(G)$ is $O(t^{1/4}{\lceil\log t\rceil}^{3/4})$. This bound is tight up to a factor of $O((\log t)^{1/4})$. We also show that, if $G$ has $n$ vertices, then $\cubi(G)$ is $O(\log n + t^{1/4}\log t)$. Using our bound for the cubicity of $k$-degenerate graphs, we show that the cubicity of almost all graphs in the $\mathcal{G}(n,m)$ model is $O(d_{av}\log n)$, where $d_{av}$ denotes the average degree of the graph under consideration.

Relevance:

10.00%

Publisher:

Abstract:

Query focused summarization is the task of producing a compressed text of an original set of documents based on a query. Documents can be viewed as a graph with sentences as nodes, and edges can be added based on sentence similarity. Graph based ranking algorithms which use a 'biased random surfer model', like topic-sensitive LexRank, have been successfully applied to query focused summarization. In these algorithms, the random walk is biased towards sentences which contain query relevant words. Specifically, it is assumed that the random surfer knows the query relevance score of the sentence to which he jumps. However, neighbourhood information of the sentence to which he jumps is completely ignored. In this paper, we propose a look-ahead version of topic-sensitive LexRank. We assume that the random surfer not only knows the query relevance of the sentence to which he jumps, but can also look N steps ahead from that sentence to find the query relevance scores of future sets of sentences. Using this look-ahead information, we identify the sentences which are indirectly related to the query by looking at the number of hops needed to reach a sentence which has query relevant words. We then bias the random walk towards these indirectly query relevant sentences as well, along with the sentences which have query relevant words. Experimental results show a 20.2% increase in ROUGE-2 score compared to topic-sensitive LexRank on the DUC 2007 data set. Further, our system outperforms the best systems in DUC 2006, and the results are comparable to state of the art systems.
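
A sketch of the look-ahead bias on top of a personalized PageRank; the per-hop decay of propagated relevance and the small floor that keeps the walk irreducible are assumptions here, not the paper's parameters.

```python
import networkx as nx

# G is a sentence-similarity graph; `rel` maps sentence ids to query-relevance
# scores. Relevance is spread up to n_hops steps, then used as the bias.
def lookahead_lexrank(G, rel, n_hops=2, decay=0.5, alpha=0.85):
    spread = {v: rel.get(v, 0.0) for v in G}
    frontier = dict(spread)
    for _ in range(n_hops):
        nxt = {}
        for u, r in frontier.items():
            if r <= 0:
                continue
            for v in G.neighbors(u):
                nxt[v] = max(nxt.get(v, 0.0), r * decay)
        for v, r in nxt.items():
            spread[v] = max(spread[v], r)
        frontier = nxt
    pers = {v: spread[v] + 1e-6 for v in G}     # floor keeps all nodes reachable
    return nx.pagerank(G, alpha=alpha, personalization=pers)
```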

Relevance:

10.00%

Publisher:

Abstract:

A path in an edge colored graph is said to be a rainbow path if no two edges on the path have the same color. An edge colored graph is (strongly) rainbow connected if there exists a (geodesic) rainbow path between every pair of vertices. The rainbow connectivity of a graph G, denoted by rc(G) (respectively its strong rainbow connectivity, src(G)), is the smallest number of colors required to edge color the graph such that G is (strongly) rainbow connected. In this paper we study the rainbow connectivity problem and the strong rainbow connectivity problem from a computational point of view. Our main results can be summarised as follows: 1) For every fixed k >= 3, it is NP-complete to decide whether src(G) <= k, even when the graph G is bipartite. 2) For every fixed odd k >= 3, it is NP-complete to decide whether rc(G) <= k. This resolves one of the open problems posed by Chakraborty et al. (J. Comb. Opt., 2011), where they prove the hardness for the even case. 3) The following problem is fixed parameter tractable: given a graph G, determine the maximum number of pairs of vertices that can be rainbow connected using two colors. 4) For a directed graph G, it is NP-complete to decide whether rc(G) <= 2.
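
The decision problems above can be made concrete with a brute-force check on tiny graphs; exponential, as the hardness results predict, but it pins down the definitions.

```python
import itertools
import networkx as nx

# Is there a path from u to v whose edges all have distinct colors?
def has_rainbow_path(G, colors, u, v):
    for path in nx.all_simple_paths(G, u, v):
        cs = [colors[frozenset(e)] for e in zip(path, path[1:])]
        if len(cs) == len(set(cs)):
            return True
    return False

# Decide rc(G) <= k by trying every k-coloring of the edges.
def rc_at_most(G, k):
    edges = [frozenset(e) for e in G.edges()]
    for assign in itertools.product(range(k), repeat=len(edges)):
        colors = dict(zip(edges, assign))
        if all(has_rainbow_path(G, colors, u, v)
               for u, v in itertools.combinations(G, 2)):
            return True
    return False

print(rc_at_most(nx.cycle_graph(4), 2))   # rc(C4) = 2, so this prints True
```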

Relevance:

10.00%

Publisher:

Abstract:

Low-complexity near-optimal detection of large-MIMO signals has attracted recent research attention. Recently, we proposed a local neighborhood search algorithm, namely the reactive tabu search (RTS) algorithm, as well as a factor-graph based belief propagation (BP) algorithm for low-complexity large-MIMO detection. The motivation for the present work arises from the following two observations on these algorithms: i) although RTS achieved close to optimal performance for 4-QAM in large dimensions, significant performance improvement was still possible for higher-order QAM (e.g., 16-, 64-QAM); ii) BP also achieved near-optimal performance for large dimensions, but only for the {±1} alphabet. In this paper, we improve the large-MIMO detection performance of higher-order QAM signals by using a hybrid algorithm that employs RTS and BP. In particular, motivated by the observation that when a detection error occurs at the RTS output, the least significant bits (LSBs) of the symbols are mostly in error, we propose to first reconstruct and cancel the interference due to bits other than the LSBs at the RTS output, and feed the interference-cancelled received signal to the BP algorithm to improve the reliability of the LSBs. The output of the BP is then fed back to RTS for the next iteration. Simulation results show that the proposed algorithm performs better than the RTS algorithm, as well as the semi-definite relaxation (SDR) and Gaussian tree approximation (GTA) algorithms.
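
The cancellation step has a clean algebraic core for 16-QAM: each coordinate in {±1, ±3} decomposes as 2a + b with a, b in {±1}, and under a natural bit mapping b carries the LSBs. A sketch of cancelling the non-LSB part of the RTS estimate before the BP stage (a simplified reading of the step, not the authors' full algorithm):

```python
import numpy as np

# y = H s + n with 16-QAM symbols s = 2a + b, a, b in {±1 ± 1j}.
def cancel_msb_16qam(y, H, s_hat):
    a = np.sign(s_hat.real) + 1j * np.sign(s_hat.imag)   # non-LSB component
    y_lsb = y - H @ (2 * a)        # residual is approximately H @ b + noise
    return y_lsb, a                # BP re-detects b over the ±1 alphabet
```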

Relevance:

10.00%

Publisher:

Abstract:

MATLAB is an array language, initially popular for rapid prototyping, but now increasingly used to develop production code for numerical and scientific applications. Typical MATLAB programs have abundant data parallelism. These programs also have control-flow dominated scalar regions that have an impact on the program's execution time. Today's computer systems have tremendous computing power in the form of traditional CPU cores and throughput-oriented accelerators such as graphics processing units (GPUs). Thus, an approach that maps the control-flow dominated regions to the CPU and the data parallel regions to the GPU can significantly improve program performance. In this paper, we present the design and implementation of MEGHA, a compiler that automatically compiles MATLAB programs to enable synergistic execution on heterogeneous processors. Our solution is fully automated and does not require programmer input for identifying data parallel regions. We propose a set of compiler optimizations tailored for MATLAB. Our compiler identifies data parallel regions of the program and composes them into kernels. The problem of combining statements into kernels is formulated as a constrained graph clustering problem. Heuristics are presented to map identified kernels to either the CPU or the GPU so that kernel execution on the CPU and the GPU happens synergistically, and the amount of data transfer needed is minimized. To ensure the required data movement for dependencies across basic blocks, we propose a data flow analysis and edge splitting strategy. Thus our compiler automatically handles composition of kernels, mapping of kernels to the CPU and GPU, scheduling, and insertion of the required data transfers. The proposed compiler was implemented, and experimental evaluation using a set of MATLAB benchmarks shows that our approach achieves a geometric mean speedup of 19.8X for data parallel benchmarks over native execution of MATLAB.
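
As a generic stand-in for the kind of mapping heuristic described (not MEGHA's actual algorithm), a greedy pass can assign each kernel to the device that minimizes estimated execution cost plus the cost of moving its inputs across devices:

```python
# Each kernel carries estimated CPU/GPU run times and the names of the kernels
# whose outputs it consumes; kernels are assumed to arrive topologically sorted.
def map_kernels(kernels, transfer_cost=1.0):
    placement = {}
    for k in kernels:
        costs = {}
        for dev in ('cpu', 'gpu'):
            moved = sum(1 for dep in k['deps'] if placement[dep] != dev)
            costs[dev] = k[dev] + transfer_cost * moved
        placement[k['name']] = min(costs, key=costs.get)
    return placement

print(map_kernels([
    {'name': 'k1', 'cpu': 5.0, 'gpu': 1.0, 'deps': []},
    {'name': 'k2', 'cpu': 1.0, 'gpu': 4.0, 'deps': ['k1']},
]))
```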

Relevance:

10.00%

Publisher:

Abstract:

Automated synthesis of mechanical designs is an important step towards the development of an intelligent CAD system. Research into methods for supporting conceptual design using automated synthesis has attracted much attention in the past decades. In our research, ten experimental studies were conducted to find out how designers synthesize solution concepts for multi-state mechanical devices. The designers were asked to think aloud while carrying out the synthesis, and these design synthesis processes were video recorded. It was found that modification of kinematic pairs and mechanisms is the major activity carried out by all the designers. This paper presents an analysis of these synthesis processes using configuration space and topology graphs to identify and classify the types of modifications that take place. Understanding these modification processes and the context in which they happen is crucial to developing a system for supporting design synthesis of multi-state mechanical devices that is capable of creating a comprehensive variety of solution alternatives.

Relevance:

10.00%

Publisher:

Abstract:

Text segmentation and localization algorithms are proposed for a born-digital image dataset. Binarization and edge detection are carried out separately on the three colour planes of the image. Connected components (CCs) obtained from the binarized image are thresholded based on their area and aspect ratio, and CCs which contain sufficient edge pixels are retained. A novel approach is presented, in which the text components are represented as nodes of a graph, with nodes corresponding to the centroids of the individual CCs. Long edges are broken from the minimum spanning tree of the graph. Pairwise height ratios are also used to remove likely non-text components. A new minimum spanning tree is created from the remaining nodes. Horizontal grouping is performed on the CCs to generate bounding boxes of text strings. Overlapping bounding boxes are removed using an overlap area threshold. Non-overlapping and minimally overlapping bounding boxes are used for text segmentation, and vertical splitting is applied to generate bounding boxes at the word level. The proposed method is applied to all the images of the test dataset, and values of precision, recall and H-mean are obtained using different approaches.
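
A sketch of the MST-based grouping step, assuming at least two CC centroids as input; the median-based cutoff for "long" edges is an illustrative threshold, not the paper's tuned rule.

```python
import itertools
import networkx as nx

# Connect centroids in a complete graph weighted by Euclidean distance, take
# the minimum spanning tree, and cut edges much longer than the median edge.
# Remaining connected components are candidate text strings.
def group_components(centroids, factor=2.5):
    G = nx.Graph()
    for (i, p), (j, q) in itertools.combinations(enumerate(centroids), 2):
        d = ((p[0] - q[0])**2 + (p[1] - q[1])**2) ** 0.5
        G.add_edge(i, j, weight=d)
    mst = nx.minimum_spanning_tree(G)
    lengths = sorted(d for _, _, d in mst.edges(data='weight'))
    cutoff = factor * lengths[len(lengths) // 2]
    mst.remove_edges_from([(u, v) for u, v, d in mst.edges(data='weight')
                           if d > cutoff])
    return [sorted(c) for c in nx.connected_components(mst)]

# Two clusters of centroids, far apart: expect [[0, 1], [2, 3]].
print(group_components([(0, 0), (1, 0), (50, 0), (51, 0)]))
```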