989 resultados para Weighted adjacency matrix
Resumo:
The increasing resistance of Mycobacterium tuberculosis to the existing drugs has alarmed the worldwide scientific community. In an attempt to overcome this problem, two models for the design and prediction of new antituberculosis agents were obtained. The first used a mixed approach, containing descriptors based on fragments and the topological substructural molecular design approach (TOPS-MODE) descriptors. The other model used a combination of two-dimensional (2D) and three-dimensional (3D) descriptors. A data set of 167 compounds with great structural variability, 72 of them antituberculosis agents and 95 compounds belonging to other pharmaceutical categories, was analyzed. The first model showed sensitivity, specificity, and accuracy values above 80% and the second one showed values higher than 75% for these statistical indices. Subsequently, 12 structures of imidazoles not included in this study were designed, taking into account the two models. In both cases accuracy was 100%, showing that the methodology in silico developed by us is promising for the rational design of antituberculosis drugs.
Resumo:
In this work, a performance analysis of transmission schemes employing turbo trellis coded modulation. In general, the performance analysis of such schemes is guided by evaluating the error probability of these schemes. The exact evaluation of this probability is very complex and inefficient from the computational point of view, a widely used alternative is the use of union bound of error probability, because of its easy implementation and computational produce bounds that converge quickly. Since it is the union bound, it should use to expurge some elements of distance spectrum to obtain a tight bound. The main contribution of this work is that the listing proposal is carried out from the puncturing at the level of symbol rather than bit-level as in most works of literature. The main reason for using the symbol level puncturing lies in the fact that the enummerating function of the turbo scheme is obtained directly from complex sequences of signals through the trellis and not indirectly from the binary sequences that require further binary to complex mapping, as proposed by previous works. Thus, algorithms can be applied through matrix from the adjacency matrix, which is obtained by calculating the distances of the complex sequences of the trellis. This work also presents two matrix algorithms for state reduction and the evaluation of the transfer function of this. The results presented in comparisons of the bounds obtained using the proposed technique with some turbo codes of the literature corroborate the proposition of this paper that the expurgated bounds obtained are quite tight and matrix algorithms are easily implemented in any programming software language
Resumo:
We present a nestedness index that measures the nestedness pattern of bipartite networks, a problem that arises in theoretical ecology. Our measure is derived using the sum of distances of the occupied elements in the adjacency matrix of the network. This index quantifies directly the deviation of a given matrix from the nested pattern. In the most simple case the distance of the matrix element ai,j is di,j = i+j, the Manhattan distance. A generic distance is obtained as di,j = (i¬ + j¬)1/¬. The nestedness índex is defined by = 1 − where is the temperature of the matrix. We construct the temperature index using two benchmarks: the distance of the complete nested matrix that corresponds to zero temperature and the distance of the average random matrix that is defined as temperature one. We discuss an important feature of the problem: matrix occupancy. We address this question using a metric index ¬ that adjusts for matrix occupancy
Resumo:
The occurrence of problems related to the scattering and tangling phenomenon, such as the difficulty to do system maintenance, increasingly frequent. One way to solve this problem is related to the crosscutting concerns identification. To maximize its benefits, the identification must be performed from early stages of development process, but some works have reported that this has not been done in most of cases, making the system development susceptible to the errors incidence and prone to the refactoring later. This situation affects directly to the quality and cost of the system. PL-AOVgraph is a goal-oriented requirements modeling language which offers support to the relationships representation among requirements and provides separation of crosscutting concerns by crosscutting relationships representation. Therefore, this work presents a semi-automatic method to crosscutting concern identification in requirements specifications written in PL-AOVgraph. An adjacency matrix is used to identify the contributions relationships among the elements. The crosscutting concern identification is based in fan-out analysis of contribution relationships from the informations of adjacency matrix. When identified, the crosscutting relationships are created. And also, this method is implemented as a new module of ReqSys-MDD tool
Discriminating Different Classes of Biological Networks by Analyzing the Graphs Spectra Distribution
Resumo:
The brain's structural and functional systems, protein-protein interaction, and gene networks are examples of biological systems that share some features of complex networks, such as highly connected nodes, modularity, and small-world topology. Recent studies indicate that some pathologies present topological network alterations relative to norms seen in the general population. Therefore, methods to discriminate the processes that generate the different classes of networks (e. g., normal and disease) might be crucial for the diagnosis, prognosis, and treatment of the disease. It is known that several topological properties of a network (graph) can be described by the distribution of the spectrum of its adjacency matrix. Moreover, large networks generated by the same random process have the same spectrum distribution, allowing us to use it as a "fingerprint". Based on this relationship, we introduce and propose the entropy of a graph spectrum to measure the "uncertainty" of a random graph and the Kullback-Leibler and Jensen-Shannon divergences between graph spectra to compare networks. We also introduce general methods for model selection and network model parameter estimation, as well as a statistical procedure to test the nullity of divergence between two classes of complex networks. Finally, we demonstrate the usefulness of the proposed methods by applying them to (1) protein-protein interaction networks of different species and (2) on networks derived from children diagnosed with Attention Deficit Hyperactivity Disorder (ADHD) and typically developing children. We conclude that scale-free networks best describe all the protein-protein interactions. Also, we show that our proposed measures succeeded in the identification of topological changes in the network while other commonly used measures (number of edges, clustering coefficient, average path length) failed.
Resumo:
The estimation of modal parameters of a structure from ambient measurements has attracted the attention of many researchers in the last years. The procedure is now well established and the use of state space models, stochastic system identification methods and stabilization diagrams allows to identify the modes of the structure. In this paper the contribution of each identified mode to the measured vibration is discussed. This modal contribution is computed using the Kalman filter and it is an indicator of the importance of the modes. Also the variation of the modal contribution with the order of the model is studied. This analysis suggests selecting the order for the state space model as the order that includes the modes with higher contribution. The order obtained using this method is compared to those obtained using other well known methods, like Akaike criteria for time series or the singular values of the weighted projection matrix in the Stochastic Subspace Identification method. Finally, both simulated and measured vibration data are used to show the practicability of the derived technique. Finally, it is important to remark that the method can be used with any identification method working in the state space model.
Resumo:
We propose a novel measure to assess the presence of meso-scale structures in complex networks. This measure is based on the identi?cation of regular patterns in the adjacency matrix of the network, and on the calculation of the quantity of information lost when pairs of nodes are iteratively merged. We show how this measure is able to quantify several meso-scale structures, like the presence of modularity, bipartite and core-periphery con?gurations, or motifs. Results corresponding to a large set of real networks are used to validate its ability to detect non-trivial topological patterns.
Resumo:
In this paper new results on personalized PageRank are shown. We consider directed graphs that may contain dangling nodes. The main result presented gives an analytical characterization of all the possible values of the personalized PageRank for any node.We use this result to give a theoretical justification of a recent model that uses the personalized PageRank to classify users of Social Networks Sites. We introduce new concepts concerning competitivity and leadership in complex networks. We also present some theoretical techniques to locate leaders and competitors which are valid for any personalization vector and by using only information related to the adjacency matrix of the graph and the distribution of its dangling nodes.
Resumo:
El objetivo del presente trabajo de investigación es explorar nuevas técnicas de implementación, basadas en grafos, para las Redes de Neuronas, con el fin de simplificar y optimizar las arquitecturas y la complejidad computacional de las mismas. Hemos centrado nuestra atención en una clase de Red de Neuronas: las Redes de Neuronas Recursivas (RNR), también conocidas como redes de Hopfield. El problema de obtener la matriz sináptica asociada con una RNR imponiendo un determinado número de vectores como puntos fijos, no está en absoluto resuelto, el número de vectores prototipo que pueden ser almacenados en la red, cuando se utiliza la ley de Hebb, es bastante limitado, la red se satura rápidamente cuando se pretende almacenar nuevos prototipos. La ley de Hebb necesita, por tanto, ser revisada. Algunas aproximaciones dirigidas a solventar dicho problema, han sido ya desarrolladas. Nosotros hemos desarrollado una nueva aproximación en la forma de implementar una RNR en orden a solucionar estos problemas. La matriz sináptica es obtenida mediante la superposición de las componentes de los vectores prototipo, sobre los vértices de un Grafo, lo cual puede ser también interpretado como una coloración de dicho grafo. Cuando el periodo de entrenamiento se termina, la matriz de adyacencia del Grafo Resultante o matriz de pesos, presenta ciertas propiedades por las cuales dichas matrices serán llamadas tetraédricas. La energía asociada a cualquier estado de la red es representado por un punto (a,b) de R2. Cada uno de los puntos de energía asociados a estados que disten lo mismo del vector cero está localizado sobre la misma línea de energía de R2. El espacio de vectores de estado puede, por tanto, clasificarse en n clases correspondientes a cada una de las n diferentes distancias que puede tener cualquier vector al vector cero. La matriz (n x n) de pesos puede reducirse a un n-vector; de esta forma, tanto el tiempo de computación como el espacio de memoria requerido par almacenar los pesos, son simplificados y optimizados. En la etapa de recuperación, es introducido un vector de parámetros R2, éste es utilizado para controlar la capacidad de la red: probaremos que lo mayor es la componente a¡, lo menor es el número de puntos fijos pertenecientes a la línea de energía R¡. Una vez que la capacidad de la red ha sido controlada mediante este parámetro, introducimos otro parámetro, definido como la desviación del vector de pesos relativos, este parámetro sirve para disminuir ostensiblemente el número de parásitos. A lo largo de todo el trabajo, hemos ido desarrollando un ejemplo, el cual nos ha servido para ir corroborando los resultados teóricos, los algoritmos están escritos en un pseudocódigo, aunque a su vez han sido implamentados utilizando el paquete Mathematica 2.2., mostrándolos en un volumen suplementario al texto.---ABSTRACT---The aim of the present research is intended to explore new specifícation techniques of Neural Networks based on Graphs to be used in the optimization and simplification of Network Architectures and Computational Complexhy. We have focused our attention in a, well known, class of Neural Networks: the Recursive Neural Networks, also known as Hopfield's Neural Networks. The general problem of constructing the synaptic matrix associated with a Recursive Neural Network imposing some vectors as fixed points is fer for completery solved, the number of prototype vectors (learning patterns) which can be stored by Hebb's law is rather limited and the memory will thus quickly reach saturation if new prototypes are continuously acquired in the course of time. Hebb's law needs thus to be revised in order to allow new prototypes to be stored at the expense of the older ones. Some approaches related with this problem has been developed. We have developed a new approach of implementing a Recursive Neural Network in order to sob/e these kind of problems, the synaptic matrix is obtained superposing the components of the prototype vectors over the vértices of a Graph which may be interpreted as a coloring of the Graph. When training is finished the adjacency matrix of the Resulting Graph or matrix of weights presents certain properties for which it may be called a tetrahedral matrix The energy associated to any possible state of the net is represented as a point (a,b) in R2. Every one of the energy points associated with state-vectors having the same Hamming distance to the zero vector are located over the same energy Une in R2. The state-vector space may be then classified in n classes according to the n different possible distances firom any of the state-vectors to the zero vector The (n x n) matrix of weights may also be reduced to a n-vector of weights, in this way the computational time and the memory space required for obtaining the weights is optimized and simplified. In the recall stage, a parameter vectora is introduced, this parameter is used for controlling the capacity of the net: it may be proved that the bigger is the r, component of J, the lower is the number of fixed points located in the r¡ energy line. Once the capacity of the net has been controlled by the ex parameter, we introduced other parameter, obtained as the relative weight vector deviation parameter, in order to reduce the number of spurious states. All along the present text, we have also developed an example, which serves as a prove for the theoretical results, the algorithms are shown in a pseudocode language in the text, these algorithm so as the graphics have been developed also using the Mathematica 2.2. mathematical package which are shown in a supplementary volume of the text.
Resumo:
Uma imagem engloba informação que precisa ser organizada para interpretar e compreender seu conteúdo. Existem diversas técnicas computacionais para extrair a principal informação de uma imagem e podem ser divididas em três áreas: análise de cor, textura e forma. Uma das principais delas é a análise de forma, por descrever características de objetos baseadas em seus pontos fronteira. Propomos um método de caracterização de imagens, por meio da análise de forma, baseada nas propriedades espectrais do laplaciano em grafos. O procedimento construiu grafos G baseados nos pontos fronteira do objeto, cujas conexões entre vértices são determinadas por limiares T_l. A partir dos grafos obtêm-se a matriz de adjacência A e a matriz de graus D, as quais definem a matriz Laplaciana L=D -A. A decomposição espectral da matriz Laplaciana (autovalores) é investigada para descrever características das imagens. Duas abordagens são consideradas: a) Análise do vetor característico baseado em limiares e a histogramas, considera dois parâmetros o intervalo de classes IC_l e o limiar T_l; b) Análise do vetor característico baseado em vários limiares para autovalores fixos; os quais representam o segundo e último autovalor da matriz L. As técnicas foram testada em três coleções de imagens: sintéticas (Genéricas), parasitas intestinais (SADPI) e folhas de plantas (CNShape), cada uma destas com suas próprias características e desafios. Na avaliação dos resultados, empregamos o modelo de classificação support vector machine (SVM), o qual avalia nossas abordagens, determinando o índice de separação das categorias. A primeira abordagem obteve um acerto de 90 % com a coleção de imagens Genéricas, 88 % na coleção SADPI, e 72 % na coleção CNShape. Na segunda abordagem, obtém-se uma taxa de acerto de 97 % com a coleção de imagens Genéricas; 83 % para SADPI e 86 % no CNShape. Os resultados mostram que a classificação de imagens a partir do espectro do Laplaciano, consegue categorizá-las satisfatoriamente.
Resumo:
The availability of the sheer volume of online product reviews makes it possible to derive implicit demographic information of product adopters from review documents. This paper proposes a novel approach to the extraction of product adopter mentions from online reviews. The extracted product adopters are the ncategorise into a number of different demographic user groups. The aggregated demographic information of many product adopters can be used to characterize both products and users, which can be incorporated into a recommendation method using weighted regularised matrix factorisation. Our experimental results on over 15 million reviews crawled from JINGDONG, the largest B2C e-commerce website in China, show the feasibility and effectiveness of our proposed frame work for product recommendation.
Resumo:
Subspaces and manifolds are two powerful models for high dimensional signals. Subspaces model linear correlation and are a good fit to signals generated by physical systems, such as frontal images of human faces and multiple sources impinging at an antenna array. Manifolds model sources that are not linearly correlated, but where signals are determined by a small number of parameters. Examples are images of human faces under different poses or expressions, and handwritten digits with varying styles. However, there will always be some degree of model mismatch between the subspace or manifold model and the true statistics of the source. This dissertation exploits subspace and manifold models as prior information in various signal processing and machine learning tasks.
A near-low-rank Gaussian mixture model measures proximity to a union of linear or affine subspaces. This simple model can effectively capture the signal distribution when each class is near a subspace. This dissertation studies how the pairwise geometry between these subspaces affects classification performance. When model mismatch is vanishingly small, the probability of misclassification is determined by the product of the sines of the principal angles between subspaces. When the model mismatch is more significant, the probability of misclassification is determined by the sum of the squares of the sines of the principal angles. Reliability of classification is derived in terms of the distribution of signal energy across principal vectors. Larger principal angles lead to smaller classification error, motivating a linear transform that optimizes principal angles. This linear transformation, termed TRAIT, also preserves some specific features in each class, being complementary to a recently developed Low Rank Transform (LRT). Moreover, when the model mismatch is more significant, TRAIT shows superior performance compared to LRT.
The manifold model enforces a constraint on the freedom of data variation. Learning features that are robust to data variation is very important, especially when the size of the training set is small. A learning machine with large numbers of parameters, e.g., deep neural network, can well describe a very complicated data distribution. However, it is also more likely to be sensitive to small perturbations of the data, and to suffer from suffer from degraded performance when generalizing to unseen (test) data.
From the perspective of complexity of function classes, such a learning machine has a huge capacity (complexity), which tends to overfit. The manifold model provides us with a way of regularizing the learning machine, so as to reduce the generalization error, therefore mitigate overfiting. Two different overfiting-preventing approaches are proposed, one from the perspective of data variation, the other from capacity/complexity control. In the first approach, the learning machine is encouraged to make decisions that vary smoothly for data points in local neighborhoods on the manifold. In the second approach, a graph adjacency matrix is derived for the manifold, and the learned features are encouraged to be aligned with the principal components of this adjacency matrix. Experimental results on benchmark datasets are demonstrated, showing an obvious advantage of the proposed approaches when the training set is small.
Stochastic optimization makes it possible to track a slowly varying subspace underlying streaming data. By approximating local neighborhoods using affine subspaces, a slowly varying manifold can be efficiently tracked as well, even with corrupted and noisy data. The more the local neighborhoods, the better the approximation, but the higher the computational complexity. A multiscale approximation scheme is proposed, where the local approximating subspaces are organized in a tree structure. Splitting and merging of the tree nodes then allows efficient control of the number of neighbourhoods. Deviation (of each datum) from the learned model is estimated, yielding a series of statistics for anomaly detection. This framework extends the classical {\em changepoint detection} technique, which only works for one dimensional signals. Simulations and experiments highlight the robustness and efficacy of the proposed approach in detecting an abrupt change in an otherwise slowly varying low-dimensional manifold.
Resumo:
The paper begins with a new characterization of (k,τ)(k,τ)-regular sets. Then, using this result as well as the theory of star complements, we derive a simplex-like algorithm for determining whether or not a graph contains a (0,τ)(0,τ)-regular set. When τ=1τ=1, this algorithm can be applied to solve the efficient dominating set problem which is known to be NP-complete. If −1−1 is not an eigenvalue of the adjacency matrix of the graph, this particular algorithm runs in polynomial time. However, although it does not work in polynomial time in general, we report on its successful application to a vast set of randomly generated graphs.
Resumo:
Consider two graphs G and H. Let H^k[G] be the lexicographic product of H^k and G, where H^k is the lexicographic product of the graph H by itself k times. In this paper, we determine the spectrum of H^k[G]H and H^k when G and H are regular and the Laplacian spectrum of H^k[G] and H^k for G and H arbitrary. Particular emphasis is given to the least eigenvalue of the adjacency matrix in the case of lexicographic powers of regular graphs, and to the algebraic connectivity and the largest Laplacian eigenvalues in the case of lexicographic powers of arbitrary graphs. This approach allows the determination of the spectrum (in case of regular graphs) and Laplacian spectrum (for arbitrary graphs) of huge graphs. As an example, the spectrum of the lexicographic power of the Petersen graph with the googol number (that is, 10^100 ) of vertices is determined. The paper finishes with the extension of some well known spectral and combinatorial invariant properties of graphs to its lexicographic powers.