4 results for Database accession number
in University of Queensland eSpace - Australia
Abstract:
Limited but significant sequence similarity has been observed between an uncharacterized human protein, SIN1, and the S. pombe SIN1, Dictyostelium RIP3 and S. cerevisiae AVO1 proteins. The human Sin1 gene has been automatically predicted (MAPKAP1; GenBank accession number NM_024117); however, this sequence appears to be incomplete. In this study, we have cloned and characterized the full-length human Sin1 mRNA and identified a highly conserved domain that defines the family of SIN1 orthologues, members of which are widely distributed in the fungal and metazoan kingdoms. We demonstrate that Sin1 transcripts can use alternative polyadenylation signals and describe a number of Sin1 splice variants that potentially encode functionally different isoforms. (C) 2004 Elsevier B.V. All rights reserved.
Abstract:
Background: Protein tertiary structure can be partly characterized via each amino acid's contact number measuring how residues are spatially arranged. The contact number of a residue in a folded protein is a measure of its exposure to the local environment, and is defined as the number of C-beta atoms in other residues within a sphere around the C-beta atom of the residue of interest. Contact number is partly conserved between protein folds and thus is useful for protein fold and structure prediction. In turn, each residue's contact number can be partially predicted from primary amino acid sequence, assisting tertiary fold analysis from sequence data. In this study, we provide a more accurate contact number prediction method from protein primary sequence. Results: We predict contact number from protein sequence using a novel support vector regression algorithm. Using protein local sequences with multiple sequence alignments (PSI-BLAST profiles), we demonstrate a correlation coefficient between predicted and observed contact numbers of 0.70, which outperforms previously achieved accuracies. Including additional information about sequence weight and amino acid composition further improves prediction accuracies significantly with the correlation coefficient reaching 0.73. If residues are classified as being either contacted or non-contacted, the prediction accuracies are all greater than 77%, regardless of the choice of classification thresholds. Conclusion: The successful application of support vector regression to the prediction of protein contact number reported here, together with previous applications of this approach to the prediction of protein accessible surface area and B-factor profile, suggests that a support vector regression approach may be very useful for determining the structure-function relation between primary sequence and higher order consecutive protein structural and functional properties.
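The contact number definition in this abstract can be illustrated with a short sketch. This is not the authors' code: the function name `contact_numbers` and the 12 Å default radius are assumptions made for illustration, since the abstract does not state the sphere radius. It only computes observed contact numbers from C-beta coordinates and does not attempt the support vector regression prediction.

```python
import math


def contact_numbers(cb_coords, radius=12.0):
    """Count, for each residue, the C-beta atoms of other residues
    lying within `radius` of its own C-beta atom.

    cb_coords: list of (x, y, z) C-beta coordinates, one per residue.
    radius: sphere radius in the same units (12.0 is an assumed value).
    """
    counts = []
    for i, centre in enumerate(cb_coords):
        n = 0
        for j, other in enumerate(cb_coords):
            # Skip the residue itself; only other residues count.
            if i != j and math.dist(centre, other) <= radius:
                n += 1
        counts.append(n)
    return counts


# Toy example: three residues placed on a line.
toy = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
print(contact_numbers(toy))
```

In the toy input the first two residues are 5 units apart and see each other, while the third is 15 units from its nearest neighbour and has no contacts at the assumed radius.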
Abstract:
Candida albicans is a pathogen commonly infecting patients who receive immunosuppressive drug therapy or long-term catheterization, or who suffer from acquired immune deficiency syndrome (AIDS). The major factor accountable for the pathogenicity of C. albicans is host immune status. Various virulence molecules, or factors, of C. albicans are also responsible for disease progression. Virulence proteins are published in public databases, but they normally lack detailed functional annotations. We have developed CandiVF, a specialized database of C. albicans virulence factors (http://antigen.i2r.a-star.edu.sg/Templar/DB/CandiVF/), to facilitate efficient extraction and analysis of data aimed to assist research on immune responses, pathogenesis, prevention, and control of candidiasis. CandiVF contains a large number of annotated virulence proteins, including secretory, cell wall-associated, membrane, cytoplasmic, and nuclear proteins. This database has in-built bioinformatics tools including keyword and BLAST search, visualization of 3D-structures, HLA-DR epitope prediction, virulence descriptors, and virulence factors ontology.
Abstract:
With rapid advances in video processing technologies and ever-faster growth in network bandwidth, the popularity of video content publishing and sharing has made similarity search an indispensable operation for retrieving videos of interest to users. Video similarity is usually measured by the percentage of similar frames shared by two video sequences, and each frame is typically represented as a high-dimensional feature vector. Unfortunately, the high complexity of video content poses the following major challenges for fast retrieval: (a) effective and compact video representations, (b) efficient similarity measurements, and (c) efficient indexing on the compact representations. In this paper, we propose a number of methods to achieve fast similarity search for very large video databases. First, each video sequence is summarized into a small number of clusters, each of which contains similar frames and is represented by a novel compact model called Video Triplet (ViTri). ViTri models a cluster as a tightly bounded hypersphere described by its position, radius, and density. The ViTri similarity is measured by the volume of intersection between two hyperspheres multiplied by the minimal density, i.e., the estimated number of similar frames shared by two clusters. The total number of similar frames is then estimated to derive the overall similarity between two video sequences. Hence the time complexity of the video similarity measure can be reduced greatly. To further reduce the number of similarity computations on ViTris, we introduce a new one-dimensional transformation technique which rotates and shifts the original axis system using PCA in such a way that the original inter-distance between two high-dimensional vectors can be maximally retained after mapping. An efficient B+-tree is then built on the transformed one-dimensional values of ViTris' positions. Such a transformation enables the B+-tree to achieve its optimal performance by quickly filtering out a large portion of non-similar ViTris. Our extensive experiments on large real video datasets demonstrate the effectiveness of our proposals, which outperform existing methods significantly.
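The ViTri similarity described in this abstract (intersection volume of two hyperspheres multiplied by the smaller density) can be sketched as follows. This is an illustrative reading, not the paper's implementation: the function names are hypothetical, density is assumed to mean frames per unit volume, and the intersection volume is obtained by numerically integrating spherical caps, which may differ from how the authors compute it.

```python
import math


def unit_ball_volume(n):
    """Volume of the unit ball in n dimensions."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


def cap_volume(r, a, n, steps=20000):
    """Volume of the part of an n-ball of radius r whose first
    coordinate is >= a, by midpoint-rule integration of its
    (n-1)-dimensional cross-sections."""
    a = max(a, -r)
    if a >= r:
        return 0.0
    width = (r - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * width
        # Cross-section at x is an (n-1)-ball of radius sqrt(r^2 - x^2).
        total += (r * r - x * x) ** ((n - 1) / 2)
    return unit_ball_volume(n - 1) * total * width


def intersection_volume(c1, r1, c2, r2):
    """Volume shared by two hyperspheres with centres c1, c2."""
    n = len(c1)
    d = math.dist(c1, c2)
    if d >= r1 + r2:
        return 0.0  # disjoint
    if d <= abs(r1 - r2):
        # One sphere lies entirely inside the other.
        return unit_ball_volume(n) * min(r1, r2) ** n
    # Distance from each centre to the plane through the intersection.
    x1 = (d * d + r1 * r1 - r2 * r2) / (2 * d)
    x2 = d - x1
    return cap_volume(r1, x1, n) + cap_volume(r2, x2, n)


def vitri_similarity(a, b):
    """Estimated number of similar frames shared by two clusters,
    each given as a (position, radius, density) triplet."""
    (p1, r1, d1), (p2, r2, d2) = a, b
    return intersection_volume(p1, r1, p2, r2) * min(d1, d2)


# Two unit spheres in 3-D whose centres are one unit apart.
a = ((0.0, 0.0, 0.0), 1.0, 2.0)
b = ((1.0, 0.0, 0.0), 1.0, 3.0)
print(vitri_similarity(a, b))
```

For the 3-D example, the intersection volume has the closed form 5π/12, so the printed estimate should be close to 2 × 5π/12. Because each cluster is reduced to three values, the comparison cost no longer depends on the number of frames in the cluster, which is the source of the complexity reduction the abstract mentions.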