48 results for Protein Interaction Mapping

in Deakin Research Online - Australia


Relevance: 100.00%

Abstract:

We introduce a new topological concept, the k-partite protein clique, to study protein-protein interaction (PPI) networks. In particular, we examine the functional coherence of proteins in k-partite protein cliques. A k-partite protein clique is a k-partite maximal clique comprising two or more nonoverlapping protein subsets, between any two of which full interactions are exhibited. To detect k-partite maximal cliques in a PPI network, we propose transforming the network into induced K-partite graphs with proteins as vertices, where edges exist only among the graph's partites. We then present a k-partite maximal clique mining (MaCMik) algorithm to enumerate k-partite maximal cliques from K-partite graphs, and apply it to a yeast PPI network. We observe unusually high functional coherence in k-partite protein cliques: most proteins in a k-partite protein clique, especially those in the same partite, share the same functions. The idea of k-partite protein cliques therefore suggests a novel approach to characterizing PPI networks and may help function prediction for unknown proteins.
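The definition above (nonoverlapping subsets with full interactions between any two of them) can be illustrated with a small membership check. This is only a sketch under stated assumptions: the function name, the toy protein names, and the edge-list representation are hypothetical, and the check does not test maximality or reproduce the MaCMik enumeration algorithm.

```python
from itertools import combinations

def is_k_partite_clique(parts, edges):
    """Return True if `parts` (a list of protein sets) forms a k-partite clique.

    Conditions checked, following the definition in the abstract:
      * at least two subsets, all non-empty and pairwise non-overlapping;
      * every protein in one subset interacts with every protein in each
        other subset (interactions inside a subset are not required).
    Maximality is NOT checked here.
    """
    if len(parts) < 2 or any(len(p) == 0 for p in parts):
        return False
    # Store interactions as unordered pairs so (a, b) and (b, a) match.
    edge_set = {frozenset(e) for e in edges}
    for a, b in combinations(parts, 2):
        if a & b:  # subsets must not share proteins
            return False
        for u in a:
            for v in b:
                if frozenset((u, v)) not in edge_set:
                    return False
    return True

# Toy network; protein names are made up for illustration.
edges = [("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P2", "P4"),
         ("P1", "P5"), ("P2", "P5"), ("P3", "P5")]

two_parts = is_k_partite_clique([{"P1", "P2"}, {"P3", "P4"}], edges)
three_parts = is_k_partite_clique([{"P1", "P2"}, {"P3", "P4"}, {"P5"}], edges)
print(two_parts)    # all four cross interactions are present
print(three_parts)  # the P4-P5 interaction is missing
```

Representing each interaction as a `frozenset` keeps the check independent of the order in which the two proteins are listed.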

Relevance: 100.00%

Abstract:

Protein-protein interaction (PPI) networks constructed by high-throughput technologies provide opportunities for predicting protein functions. Many approaches and algorithms have been applied to PPI networks over recent decades to predict the functions of unannotated proteins. However, most existing algorithms and approaches do not consider unannotated proteins and their corresponding interactions in the prediction process, and those that do make use of unannotated proteins have limited prediction performance. Moreover, current algorithms usually make one-off predictions. In this paper, we propose an iterative approach that utilizes unannotated proteins and their interactions in prediction. We conducted experiments to evaluate the performance and robustness of the proposed approach. It improved prediction performance by as much as 50%-80% when the network contained a high proportion of unannotated neighboring proteins, and it remained robust across various types of protein interaction network. Importantly, our approach is the first to iteratively incorporate the interaction information of unannotated proteins into protein function prediction, and it can be applied to existing prediction algorithms to improve their performance.

Relevance: 100.00%

Abstract:

Neocarzinostatin (NCS) is a potent DNA-damaging, anti-tumor toxin extracted from Streptomyces carzinostaticus that recognizes double-stranded DNA bulges and induces DNA damage. The 2-fluoro (2F) modified EpCAM RNA aptamer is a 23-mer that targets the EpCAM protein, which is expressed on the surface of epithelial tumor cells. Understanding the interaction between NCS and this ligand is important for targeted tumor therapy. In this study, we investigated the biophysical interactions between NCS and the 2F-modified EpCAM RNA aptamer using circular dichroism (CD) and infrared (IR) spectroscopy. The aromatic amino acid residues spanning the β sheets of NCS are found to participate in intermolecular interactions with the 2F-modified EpCAM RNA aptamer. In silico modeling and simulation studies corroborate the CD spectra data and reinforce the involvement of the C and D1 strands of NCS in intermolecular interactions with the EpCAM RNA aptamer. This is the first report on the interactions involved in stabilizing the NCS-EpCAM aptamer complex, and it will aid the development of therapeutic modalities for targeted cancer therapy.

Relevance: 100.00%

Abstract:

This research has expanded the knowledge in Bioinformatics and Data mining. It makes an influential contribution to the future research in this area.

Relevance: 100.00%

Abstract:

Linkage analysis is a successful procedure for associating diseases with specific genomic regions. These regions are often large, containing hundreds of genes, which makes the experimental methods employed to identify the disease gene arduous and expensive. We present two methods to prioritize candidates for further experimental study: Common Pathway Scanning (CPS) and Common Module Profiling (CMP). CPS is based on the assumption that common phenotypes are associated with dysfunction in proteins that participate in the same complex or pathway. CPS applies network data derived from protein-protein interaction (PPI) and pathway databases to identify relationships between genes. CMP identifies likely candidates using a domain-dependent sequence similarity approach, based on the hypothesis that disruption of genes of similar function will lead to the same phenotype. Both algorithms accept two forms of input data: known disease genes or multiple disease loci. When using known disease genes as input, our combined methods have a sensitivity of 0.52 and a specificity of 0.97 and reduce the candidate list 13-fold. Using multiple loci, our methods successfully identify disease genes for all benchmark diseases with a sensitivity of 0.84 and a specificity of 0.63. Our combined approach prioritizes good candidates and will accelerate the disease gene discovery process.
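For readers less familiar with the two figures quoted above, the following sketch shows how sensitivity and specificity are conventionally computed for a candidate-gene prioritization. It uses the standard definitions only; the gene identifiers and the function name are hypothetical, and the abstract does not state exactly how the authors counted candidates.

```python
def sensitivity_specificity(predicted, actual, candidates):
    """Standard sensitivity and specificity over a set of candidate genes.

    predicted  -- genes the method flags as likely disease genes
    actual     -- genes known to be disease genes
    candidates -- every gene in the linkage region(s) under study
    `predicted` and `actual` are assumed to be subsets of `candidates`.
    """
    predicted, actual, candidates = set(predicted), set(actual), set(candidates)
    tp = len(predicted & actual)               # disease genes that were flagged
    fn = len(actual - predicted)               # disease genes that were missed
    fp = len(predicted - actual)               # non-disease genes that were flagged
    tn = len(candidates - predicted - actual)  # non-disease genes correctly ignored
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity

# Toy region with ten made-up gene identifiers.
candidates = {f"G{i}" for i in range(1, 11)}
sens, spec = sensitivity_specificity({"G1", "G3"}, {"G1", "G2"}, candidates)
print(sens, spec)
```

In this toy case one of two disease genes is recovered (sensitivity 0.5) and seven of eight non-disease genes are left unflagged (specificity 0.875).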

Relevance: 90.00%

Abstract:

Background: The past few years have seen rapid development of novel high-throughput technologies that have produced large-scale data on protein-protein interactions (PPIs) across human and most model species. These data are commonly represented as networks, with nodes representing proteins and edges representing the PPIs. A fundamental challenge for bioinformatics is how to interpret this wealth of data to elucidate the interaction patterns and the biological characteristics of the proteins. One significant purpose of this interpretation is to predict unknown protein functions. Although many approaches have been proposed in recent years, it remains a challenge to measure the functional similarities between proteins reasonably and precisely enough to improve prediction effectiveness.

Results: We used a Semantic and Layered Protein Function Prediction (SLPFP) framework to predict unknown protein functions more effectively at different functional levels. The framework relies on a new protein similarity measurement and a clustering-based protein function prediction algorithm. The new protein similarity measurement incorporates the topological structure of the PPI network as well as each protein's semantic information, expressed as known protein functions at different functional layers. Experiments on real PPI datasets were conducted to evaluate the effectiveness of the proposed framework in predicting unknown protein functions.

Conclusion: The proposed framework achieves higher prediction accuracy than other similar approaches. The prediction results are stable even for a large number of proteins. Furthermore, the framework is able to predict unknown functions at different functional layers within the Munich Information Center for Protein Sequence (MIPS) hierarchical functional scheme. The experimental results demonstrated that the new protein similarity measurement reflects the relationships between proteins more reasonably and precisely.

Relevance: 90.00%

Abstract:

As one of the primary substances in a living organism, protein defines the character of each cell by interacting with the cellular environment to promote the cell's growth and function [1]. Previous studies in proteomics indicate that the functions of different proteins can be assigned on the basis of protein structures [2,3]. Knowledge of protein structures gives us an overview of protein fold space and helps in understanding the evolutionary principles behind structure. By observing the architectures and topologies of protein families, biological processes can be investigated more directly, at much higher resolution and in finer detail. For this reason, the analysis of proteins, their structures, and their interactions with other materials is emerging as an important problem in bioinformatics. However, determining protein structures experimentally is expensive and time consuming, so at present scientists largely depend on sequence, rather than more general structure, to infer the function of a protein. Data mining technology has therefore been introduced into this area to provide more efficient data processing and knowledge discovery approaches.

Unlike many data mining applications, which lack available data, the study of protein structure determination and protein interaction can draw on a vast amount of biologically relevant information, such as the Protein Data Bank (PDB) [4], the Structural Classification of Proteins (SCOP) database [5], the CATH database [6], UniProt [7], and others. The difficulty of predicting protein structures, especially 3D structures, and the interactions between proteins, as shown in Figure 6.1, lies in the computational complexity of the data. Although a large number of approaches have been developed to determine protein structures, such as ab initio modelling [8], homology modelling [9] and threading [10], more efficient and reliable methods are still greatly needed.

In this chapter, we will introduce a state-of-the-art data mining technique, graph mining, which is good at defining and discovering interesting structural patterns in graphical data sets, and take advantage of its expressive power to study protein structures, including protein structure prediction and comparison, and protein-protein interaction (PPI). The current graph pattern mining methods will be described, and typical algorithms will be presented, together with their applications in the protein structure analysis.

The rest of the chapter is organized as follows. Section 6.2 gives a brief introduction to the fundamentals of proteins, the publicly accessible protein data resources, and the current research status of protein analysis. Section 6.3 focuses on one of the state-of-the-art data mining methods, graph mining. Section 6.4 surveys existing work from the recent decade on protein structure analysis using advanced graph mining methods. Finally, Section 6.5 concludes the chapter and outlines potential further work.

Relevance: 90.00%

Abstract:

Background: Current approaches to predicting protein functions from a protein-protein interaction (PPI) dataset are based on the assumption that the available functions of proteins with known functions (annotated proteins) determine the functions of proteins whose functions are not yet known (un-annotated proteins). Protein function prediction is therefore treated as a mono-directed, one-off procedure, i.e. from annotated proteins to un-annotated proteins. However, the interactions between proteins are mutual rather than static and mono-directed, even though the functions of some proteins are currently unknown. This means that when we use a similarity-based approach to predict the functions of un-annotated proteins, those proteins, once their functions are predicted, will affect the similarities between proteins, which in turn will affect the prediction results. In other words, function prediction is a dynamic and mutual procedure. This dynamic feature of protein interactions, however, has not been considered in existing prediction algorithms.

Results: In this paper, we propose a new prediction approach that predicts protein functions iteratively. This iterative approach incorporates the dynamic and mutual features of protein interactions, as well as the local and global semantic influence of protein functions, into the prediction. To make iterative prediction possible, we propose a new protein similarity derived from protein functions. We adapt new evaluation metrics to assess the prediction quality of our algorithm and other similar algorithms. Experiments on real PPI datasets were conducted to evaluate the effectiveness of the proposed approach in predicting unknown protein functions.

Conclusions: The iterative approach is more likely to reflect the real biological relationships between proteins when predicting functions. A proper definition of protein similarity based on protein functions is the key to predicting functions iteratively. The evaluation results demonstrated that in most cases the iterative approach outperformed non-iterative ones, with higher prediction quality in terms of precision, recall and F-value.

Relevance: 90.00%

Abstract:

Predicting protein functions computationally from the massive protein-protein interaction (PPI) data generated by high-throughput technology is one of the fundamental challenges of the post-genomic era. Although many approaches have been developed for computationally predicting protein functions, the mutual correlations among proteins in terms of protein functions have not been thoroughly investigated or incorporated into existing prediction methods, especially voting-based ones. In this paper, we propose an innovative method to predict protein functions from PPI data by aggregating the functional correlations among relevant proteins using the Choquet integral from fuzzy theory. This functional aggregation measures the real impact of each relevant protein function on the final prediction results and reduces the impact of repeated functional information on the prediction. Accordingly, a new protein similarity and a new iterative prediction algorithm are proposed. Experimental evaluations on real PPI datasets demonstrate the effectiveness of our method.
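As background for the aggregation step mentioned above, here is a minimal sketch of the standard discrete Choquet integral. It is not the authors' prediction algorithm: the source names, the scores, and the fuzzy measure values are invented for illustration, and the measure is passed in as a plain dictionary.

```python
def choquet_integral(values, measure):
    """Discrete Choquet integral of non-negative scores w.r.t. a fuzzy measure.

    values  -- dict mapping each source to its score
    measure -- dict mapping frozensets of sources to a weight in [0, 1];
               it must contain every coalition formed by dropping the
               lowest-scoring sources one at a time.
    """
    order = sorted(values, key=values.get)  # sources by ascending score
    total = 0.0
    previous = 0.0
    for i, source in enumerate(order):
        coalition = frozenset(order[i:])    # sources scoring at least this much
        total += (values[source] - previous) * measure[coalition]
        previous = values[source]
    return total

# Two hypothetical neighbouring proteins voting for one function.
scores = {"A": 0.4, "B": 0.9}

# Sub-additive measure: A and B carry partly repeated information, so
# together they count for less than the sum of their separate weights.
redundant = {frozenset("AB"): 1.0, frozenset("A"): 0.5, frozenset("B"): 0.7}

# Additive measure: the integral reduces to an ordinary weighted mean.
additive = {frozenset("AB"): 1.0, frozenset("A"): 0.5, frozenset("B"): 0.5}

aggregated = choquet_integral(scores, redundant)
weighted_mean = choquet_integral(scores, additive)
print(aggregated, weighted_mean)
```

Because the measure is defined on sets of sources rather than on single sources, overlapping evidence can be down-weighted, which is the property the abstract appeals to when it speaks of reducing the impact of repeated functional information.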

Relevance: 90.00%

Abstract:

Current similarity-based approaches to predicting protein functions from protein-protein interaction (PPI) data usually use the information available in the PPI network to predict the functions of un-annotated proteins, and the prediction is a one-off procedure. However, the interactions between proteins are more likely to be mutual than static and mono-directed. In other words, the un-annotated proteins, once their functions are predicted, will in turn affect the similarities between proteins. In this paper, we propose an innovative iterative algorithm that incorporates this dynamic feature of protein interaction into protein function prediction, aiming to achieve higher prediction accuracy and more reasonable results. With our algorithm, instead of making one-off predictions, functions are assigned to an un-annotated protein iteratively until the functional similarities between proteins reach a stable state. The experimental results show that our iterative method achieves higher prediction accuracy than one-off prediction methods and is stable for large protein datasets.
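The contrast between one-off and iterative prediction can be sketched as follows. This is a simplified illustration, not the algorithm from the paper: it substitutes plain neighbour voting for the authors' similarity measure, and the function name, the `max_iter` cap, and the toy network are all assumptions.

```python
from collections import Counter

def iterative_predict(neighbors, annotations, max_iter=20):
    """Assign functions to un-annotated proteins by repeated neighbour voting.

    neighbors   -- dict: protein -> list of interacting proteins
    annotations -- dict: annotated protein -> set of known functions
    Predictions from one round feed into the next, and the loop stops
    once a round leaves every prediction unchanged (or at max_iter).
    """
    known = {p: set(f) for p, f in annotations.items()}
    predicted = {p: set() for p in neighbors if p not in known}
    for _ in range(max_iter):
        updated = {}
        for p in predicted:
            votes = Counter()
            for q in neighbors[p]:
                # Use known functions if available, else current predictions.
                votes.update(known.get(q) or predicted.get(q, set()))
            if votes:
                best = max(votes.values())
                updated[p] = {f for f, c in votes.items() if c == best}
            else:
                updated[p] = set()
        if updated == predicted:  # stable state reached
            break
        predicted = updated
    return predicted

# Toy chain A - U1 - U2, where only A is annotated.
neighbors = {"A": ["U1"], "U1": ["A", "U2"], "U2": ["U1"]}
result = iterative_predict(neighbors, {"A": {"kinase"}})
print(result)
```

In the toy chain, a single voting round can label only U1, because U2 has no annotated neighbour; U2 receives a function in the second round, once U1's prediction is available to it.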

Relevance: 90.00%

Abstract:

Predicting the functions of un-annotated proteins is a significant challenge in the post-genomics era. Among existing computational approaches, exploiting interactions between proteins to predict the functions of un-annotated proteins is widely used. However, it remains difficult to extract semantic associations between proteins (i.e. protein associations in terms of protein functionality) from protein interactions and to incorporate the extracted associations so as to predict protein functions more effectively. Furthermore, existing approaches and algorithms regard function prediction as a one-off procedure, ignoring dynamic and mutual associations between proteins. Deriving and exploiting semantic associations between proteins to predict functions dynamically is therefore a promising but challenging route to better prediction results. In this paper, we propose an innovative algorithm that incorporates semantic associations between proteins into a dynamic procedure of protein function prediction. The semantic association between two proteins is measured by their semantic similarity, which is defined by the similarities of the functions the two proteins possess. To achieve better prediction results, function similarities are also incorporated into the prediction procedure. The algorithm predicts functions dynamically by iteratively selecting functions for the un-annotated protein and updating the similarities between that protein and its annotated neighbours, until the selected functions are such that the similarities no longer change. The experimental results on real protein interaction datasets demonstrated that our method outperformed similar, non-dynamic function prediction methods.
Incorporating semantic associations between proteins into a dynamic procedure of function prediction reflects intrinsic relationships among proteins as well as dynamic features of protein interactions, and can therefore significantly improve prediction results.

Relevance: 90.00%

Abstract:

The availability of large amounts of protein-protein interaction (PPI) data makes it feasible to use computational approaches to predict protein functions. Existing computational approaches are based on exploiting the known function information of annotated proteins in the PPI data to predict the functions of un-annotated proteins. However, these approaches treat the prediction domain (i.e. the set of proteins from which the functions are predicted) as unchangeable during the prediction procedure. As a result, valuable information may be overwhelmed by the unavoidable noise in the PPI data, which in turn distorts the prediction results. In this paper, we propose a novel method to predict protein functions dynamically from PPI data. Our method regards function prediction as a dynamic process of finding a suitable prediction domain, from which representative functions are selected to predict the functions of un-annotated proteins. It exploits the topological structure of the PPI network and the semantic relationships between protein functions to measure the relationship between proteins, dynamically select a suitable prediction domain, and predict functions. Evaluation on real PPI datasets demonstrated the effectiveness of the proposed method, which produced better prediction results.

Relevance: 90.00%

Abstract:

In recent years, significant effort has been devoted to predicting protein functions from protein interaction data generated by high-throughput techniques. However, predicting protein functions correctly and reliably remains a challenge. Many computational methods have recently been proposed for predicting protein functions, and among them clustering-based methods are the most promising. The existing methods, however, mainly focus on modeling protein relationships and on prediction algorithms that statically predict functions from the clusters related to the unannotated proteins. In fact, clustering is itself a dynamic process, and function prediction should take this dynamic feature into consideration, yet existing prediction methods ignore it. In this paper, we propose an innovative progressive-clustering-based prediction method that traces the functions of relevant annotated proteins across all clusters generated through the progressive clustering of proteins. A set of prediction criteria is proposed to predict the functions of unannotated proteins from all relevant clusters and traced functions. The method was evaluated on real protein interaction datasets, and the results demonstrated its effectiveness compared with representative existing methods.