2 resultados para Knowledge-based asets
em National Center for Biotechnology Information - NCBI
Resumo:
We introduce a method of functionally classifying genes by using gene expression data from DNA microarray hybridization experiments. The method is based on the theory of support vector machines (SVMs). SVMs are considered a supervised computer learning method because they exploit prior knowledge of gene function to identify unknown genes of similar function from expression data. SVMs avoid several problems associated with unsupervised clustering methods, such as hierarchical clustering and self-organizing maps. SVMs have many mathematical features that make them attractive for gene expression analysis, including their flexibility in choosing a similarity function, sparseness of solution when dealing with large data sets, the ability to handle large feature spaces, and the ability to identify outliers. We test several SVMs that use different similarity metrics, as well as some other supervised learning methods, and find that the SVMs best identify sets of genes with a common function using expression data. Finally, we use SVMs to predict functional roles for uncharacterized yeast ORFs based on their expression data.
Resumo:
Evolutionary selection of sequences is studied with a knowledge-based Hamiltonian to find the design principle for folding to a model protein structure. With sequences selected by naive energy minimization, the model structure tends to be unstable and the folding ability is low. Sequences with high folding ability have only the low-lying energy minimum but also an energy landscape which is similar to that found for the native sequence over a wide region of the conformation space. Though there is a large fluctuation in foldable sequences, the hydrophobicity pattern and the glycine locations are preserved among them. Implications of the design principle for the molecular mechanism of folding are discussed.