34 results for evolutionary genomics
Abstract:
Background: In the post-genomic era where sequences are being determined at a rapid rate, we are highly reliant on computational methods for their tentative biochemical characterization. The Pfam database currently contains 3,786 families corresponding to "Domains of Unknown Function" (DUF) or "Uncharacterized Protein Family" (UPF), of which 3,087 families have no reported three-dimensional structure, constituting almost one-fourth of the known protein families in search of both structure and function. Results: We applied a 'computational structural genomics' approach using five state-of-the-art remote similarity detection methods to detect the relationship between uncharacterized DUFs and domain families of known structures. The association with a structural domain family could serve as a starting point in elucidating the function of a DUF. Amongst these five methods, searches in the SCOP-NrichD database have been applied for the first time. Predictions were classified into high, medium and low confidence based on the consensus of results from various approaches and also annotated with enzyme and Gene Ontology terms. 614 uncharacterized DUFs could be associated with a known structural domain, of which high-confidence predictions, involving at least four methods, were made for 54 families. These structure-function relationships for the 614 DUF families can be accessed on-line at http://proline.biochem.iisc.ernet.in/RHD_DUFS/. For potential enzymes in this set, we assessed their compatibility with the associated fold and performed detailed structural and functional annotation by examining alignments and extent of conservation of functional residues. Detailed discussion is provided for interesting assignments for DUF3050, DUF1636, DUF1572, DUF2092 and DUF659. Conclusions: This study provides insights into the structure and potential function for nearly 20% of the DUFs. Use of different computational approaches enables us to reliably recognize distant relationships, especially when they converge to a common assignment, because the methods are often complementary. We observe that while pointers to the structural domain can offer the right clues to the function of a protein, recognition of its precise functional role is still 'non-trivial', with many DUF domains conserving only some of the critical residues. It is not clear whether these are functional vestiges or instances involving alternate substrates and interacting partners. Reviewers: This article was reviewed by Drs Eugene Koonin, Frank Eisenhaber and Srikrishna Subramanian.
Abstract:
We consider a two timescale model of learning by economic agents wherein active or 'ontogenetic' learning by individuals takes place on a fast scale and passive or 'phylogenetic' learning by society as a whole on a slow scale, each affecting the evolution of the other. The former is modelled by the Monte Carlo dynamics of physics, while the latter is modelled by the replicator dynamics of evolutionary biology. Various qualitative aspects of the dynamics are studied in some simple cases, both analytically and numerically, and its role as a useful modelling device is emphasized.
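The slow, society-level part of such a model can be illustrated with a minimal sketch of replicator dynamics. This is not the authors' model: the payoffs are held fixed here (in the abstract they would be shaped by the fast individual learning), and the function names, step size, and numbers are all assumptions chosen for illustration.

```python
def replicator_step(shares, payoffs, dt=0.01):
    """One forward-Euler step of dx_i/dt = x_i * (f_i - mean payoff)."""
    mean_payoff = sum(x * f for x, f in zip(shares, payoffs))
    updated = [x + dt * x * (f - mean_payoff) for x, f in zip(shares, payoffs)]
    total = sum(updated)  # renormalise to guard against floating-point drift
    return [x / total for x in updated]


def simulate(shares, payoffs, steps=5000, dt=0.01):
    """Iterate the replicator step with fixed (hypothetical) payoffs."""
    for _ in range(steps):
        shares = replicator_step(shares, payoffs, dt)
    return shares


# Hypothetical population shares of three strategies and their fixed payoffs.
final_shares = simulate([0.5, 0.3, 0.2], [1.0, 2.0, 1.5])
print(final_shares)
```

With fixed payoffs the strategy with the highest payoff gradually takes over the population; coupling the payoffs to a fast learning process is what would make the two-timescale behaviour non-trivial.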
Abstract:
Background: Phosphorylation by protein kinases is a common event in many cellular processes. Further, many kinases perform specialized roles and are regulated by non-kinase domains tethered to the kinase domain. Perturbation in the regulation of kinases leads to malignancy. We have identified and analysed putative protein kinases encoded in the genome of the chimpanzee, a close evolutionary relative of humans. Results: The shared core biology between chimpanzee and human is characterized by many orthologous protein kinases which are involved in conserved pathways. Domain architectures specific to chimp/human kinases have been observed. Chimp kinases with unique domain architectures are characterized by deletion of one or more non-kinase domains in the human kinases. Interestingly, counterparts of some of the multi-domain human kinases in chimp are characterized by identical domain architectures but with a kinase-like non-kinase domain. Remarkably, for 160 of the 587 chimpanzee kinases, no human orthologue with greater than 95% sequence identity could be identified. Variations in chimpanzee kinases compared to human kinases are brought about also by differences in functions of domains tethered to the catalytic kinase domain. For example, the heterodimer-forming PB1 domain related to the fold of the ubiquitin/Ras-binding domain is seen uniquely tethered to a PKC-like chimpanzee kinase. Conclusion: Though the chimpanzee and human are evolutionarily very close, there are chimpanzee kinases with no close counterpart in the human, suggesting differences in their functions. This analysis provides a direction for experimental analysis of human and chimpanzee protein kinases in order to enhance our understanding of their specific biological roles.
Abstract:
In this paper, a multistage evolutionary scheme is proposed for clustering in a large database, such as speech data. This is achieved by clustering a small subset of the entire sample set in each stage and treating the cluster centroids so obtained as samples, together with another subset of samples not considered previously, as input data to the next stage. This is continued until the whole sample set is exhausted. The clustering is accomplished by constructing a fuzzy similarity matrix and using the fuzzy techniques proposed here. The technique is illustrated by an efficient scheme for voiced-unvoiced-silence classification of speech.
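The staging idea described in this abstract can be sketched as follows. Note the substitution: the paper clusters with a fuzzy similarity matrix, whereas this sketch swaps in a plain k-means step purely as a stand-in, and the function names, batch size, and toy data are assumptions for illustration.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on tuples (a stand-in for the paper's fuzzy step)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
            )
            groups[nearest].append(p)
        # Recompute each centroid; keep the old one if its group is empty.
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
    return centroids


def multistage_cluster(samples, k, batch_size):
    """Cluster one subset per stage, carrying centroids forward as samples."""
    centroids = []
    for start in range(0, len(samples), batch_size):
        stage_input = centroids + samples[start:start + batch_size]
        centroids = kmeans(stage_input, min(k, len(stage_input)))
    return centroids


# Toy one-dimensional data with two well-separated groups (hypothetical).
samples = [(0.0,), (10.0,), (0.2,), (10.2,), (0.4,), (10.4,)]
print(sorted(multistage_cluster(samples, k=2, batch_size=2)))
```

Because each carried-over centroid counts as a single sample in the next stage, later subsets weigh more heavily than earlier ones; a weighted variant would be one way to offset that, though the abstract does not say how the paper handles it.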
Abstract:
In this study, we have identified the possible genetic factors responsible for fowl-adaptation of Salmonella enterica serovar Gallinarum (S. Gallinarum). By comparing the genes related to Salmonella pathogenicity islands (SPI) of S. Gallinarum with those of Salmonella enterica serovar Enteritidis (S. Enteritidis) we have identified twenty-four positively selected genes. Our results suggest that the genes encoding the structural components of SPI-2 encoded type three secretion apparatus (TTSS) and the effector proteins that are secreted via SPI-1 encoded TTSS have evolved under positive selection pressure in these serovars. We propose that these positively selected genes play important roles in conferring different host-specificities to S. Gallinarum and S. Enteritidis.
Abstract:
Background: The members of cupin superfamily exhibit large variations in their sequences, functions, organization of domains, quaternary associations and the nature of bound metal ion, despite having a conserved beta-barrel structural scaffold. Here, an attempt has been made to understand structure-function relationships among the members of this diverse superfamily and identify the principles governing functional diversity. The cupin superfamily also contains proteins for which the structures are available through world-wide structural genomics initiatives but characterized as ``hypothetical''. We have explored the feasibility of obtaining clues to functions of such proteins by means of comparative analysis with cupins of known structure and function. Methodology/Principal Findings: A 3-D structure-based phylogenetic approach was undertaken. Interestingly, a dendrogram generated solely on the basis of structural dissimilarity measure at the level of domain folds was found to cluster functionally similar members. This clustering also reflects an independent evolution of the two domains in bicupins. Close examination of structural superposition of members across various functional clusters reveals structural variations in regions that not only form the active site pocket but are also involved in interaction with another domain in the same polypeptide or in the oligomer. Conclusions/Significance: Structure-based phylogeny of cupins can influence identification of functions of proteins of yet unknown function with cupin fold. This approach can be extended to other proteins with a common fold that show high evolutionary divergence. This approach is expected to have an influence on the function annotation in structural genomics initiatives.
Abstract:
Acyl carrier protein is an integral component of many cellular metabolic processes. A number of studies have reported self-acylation behavior in acyl carrier proteins. Although ACPs exhibit high levels of similarity in their primary and tertiary structures, self-acylation behavior is restricted to only some ACPs that can be classified into two major families based on their function. The first family of ACPs is involved in polyketide biosynthesis, whereas the second family participates in fatty acid synthesis. Facilitated by the growing number of genome sequences available for analyses, large-scale phylogenetic studies were used in these studies to uncover how self-acylation behavior of acyl carrier proteins is linked with the evolution of metabolic pathways in organisms. These studies show that self-acylation behavior in acyl carrier proteins was lost during the course of evolution, with certain organisms and organelles, viz. plastids, retaining it for specified functions. (C) 2009 IUBMB IUBMB Life, 61(8): 853-859, 2009
Abstract:
The pattern of expression of the genes involved in the utilization of aryl beta-glucosides such as arbutin and salicin is different in the genus Shigella compared to Escherichia coli. The results presented here indicate that the homologue of the cryptic bgl operon of E. coli is conserved in Shigella sonnei and is the primary system involved in beta-glucoside utilization in the organism. The organization of the bgl genes in S. sonnei is similar to that of E. coli; however, there are three major differences in terms of their pattern of expression. (i) The bglB gene, encoding phospho-beta-glucosidase B, is insertionally inactivated in S. sonnei. As a result, mutational activation of the silent bgl promoter confers an Arbutin-positive (Arb(+)) phenotype to the cells in a single step; however, acquiring a Salicin-positive (Sal(+)) phenotype requires the reversion or suppression of the bglB mutation in addition. (ii) Unlike in E. coli, a majority of the activating mutations (conferring the Arb(+) phenotype) map within the unlinked hns locus, whereas activation of the E. coli bgl operon under the same conditions is predominantly due to insertions within the bglR locus. (iii) Although the bgl promoter is silent in the wild-type strain of S. sonnei (as in the case of E. coli), transcriptional and functional analyses indicated a higher basal level of transcription of the downstream genes. This was correlated with a 1 bp deletion within the putative Rho-independent terminator present in the leader sequence preceding the homologue of the bglG gene. The possible evolutionary implications of these differences for the maintenance of the genes in the cryptic state are discussed.
Abstract:
The 3' terminal 1255 nt sequence of Physalis mottle virus (PhMV) genomic RNA has been determined from a set of overlapping cDNA clones. The open reading frame (ORF) at the 3' terminus corresponds to the amino acid sequence of the coat protein (CP) determined earlier except for the absence of the dipeptide, Lys-Leu, at position 110-111. In addition, the sequence upstream of the CP gene contains the message coding for 178 amino acid residues of the C-terminus of the putative replicase protein (RP). The sequence downstream of the CP gene contains an untranslated region whose terminal 80 nucleotides can be folded into a characteristic tRNA-like structure. A phylogenetic tree constructed after aligning separately the sequence of the CP, the replicase protein (RP) and the tRNA-like structure determined in this study with the corresponding sequences of other tymoviruses shows that PhMV, wrongly named belladonna mottle virus [BDMV(I)], is a separate tymovirus and not another strain of BDMV(E) as originally envisaged. The phylogenetic tree in all three cases is identical, showing that any subset of genomic sequence of sufficient length can be used for establishing evolutionary relationships among tymoviruses.
Abstract:
The complete amino-acid sequence of sheep liver cytosolic serine hydroxymethyltransferase was determined from an analysis of tryptic, chymotryptic, CNBr and hydroxylamine peptides. Each subunit of sheep liver serine hydroxymethyltransferase consisted of 483 amino-acid residues. A comparison of this sequence with 8 other serine hydroxymethyltransferases revealed that a possible gene duplication event could have occurred after the divergence of animals and fungi. This analysis also showed independent duplication of SHMT genes in Neurospora crassa. At the secondary structural level, all the serine hydroxymethyltransferases belong to the alpha/beta category of proteins. The predicted secondary structure of sheep liver serine hydroxymethyltransferase was similar to that of the observed structure of tryptophan synthase, another pyridoxal 5'-phosphate containing enzyme, suggesting that sheep liver serine hydroxymethyltransferase might have a similar pyridoxal 5'-phosphate binding domain. In addition, a conserved glycine rich region, G L Q G G P, was identified in all the serine hydroxymethyltransferases and could be important in pyridoxal 5'-phosphate binding. A comparison of the cytosolic serine hydroxymethyltransferases from rabbit and sheep liver with other proteins sequenced from both these sources showed that serine hydroxymethyltransferase was a highly conserved protein. It was slightly less conserved than cytochrome c but better conserved than myoglobin, both of which are well known evolutionary markers. C67 and C203 were specifically protected by pyridoxal 5'-phosphate against modification with [C-14]iodoacetic acid, while C247 and C261 were buried in the native serine hydroxymethyltransferase. However, the cysteines are not conserved among the various serine hydroxymethyltransferases. The exact role of the cysteines in the reaction catalyzed by serine hydroxymethyltransferase remains to be elucidated.
Abstract:
This paper presents a clan-based evolutionary approach for solving control problems. Three selected control problems, viz. linear-quadratic, harvest, and push-cart problems, are solved using the proposed approach. Results are compared with those of the evolutionary programming (EP) approach. In most of the cases, the proposed approach is successful in obtaining (near) optimal solutions for these selected problems.
Abstract:
Owing to increased customer demand for make-to-order products and shorter product life-cycles, today's assembly lines are designed to ensure a quick switch-over from one product model to another so that companies can survive in the marketplace. The complexity associated with decisions about the type of training, the number of workers, and their exposure to different tasks, especially in the current era of customized production, is a serious problem that managers and HRD gurus are facing in industry. This paper aims to determine the amount of cross-training and dynamic deployment policy caused by workforce flexibility for a make-to-order assembly. The aforementioned issues have been dealt with by adopting the concept of an evolutionary fuzzy system because of the linguistic nature of the attributes associated with product variety and task complexity. A fuzzy system-based methodology is proposed to determine the amount of cross-training and dynamic deployment policy. The proposed methodology is tested on 10 sample products of varying complexities and the results obtained are in line with the conclusions drawn by previous researchers.
Abstract:
Stirred tank bioreactors, employed in the production of a variety of biologically active chemicals, are often operated in batch, fed-batch, and continuous modes of operation. The optimal design of a bioreactor is dependent on the kinetics of the biological process, as well as the performance criteria (yield, productivity, etc.) under consideration. In this paper, a general framework is proposed for addressing the two key issues related to the optimal design of a bioreactor, namely, (i) choice of the best operating mode and (ii) the corresponding flow rate trajectories. The optimal bioreactor design problem is formulated with initial conditions and inlet and outlet flow rate trajectories as decision variables to maximize more than one performance criterion (yield, productivity, etc.) as objective functions. A computational methodology based on a genetic algorithm approach is developed to solve this challenging multiobjective optimization problem with multiple decision variables. The applicability of the algorithm is illustrated by solving two challenging problems from the bioreactor optimization literature.
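To make the multiobjective aspect concrete, the sketch below shows only the non-dominated (Pareto) filtering that a multiobjective search typically relies on when several criteria are maximized together. It is not the paper's genetic algorithm, and the function names and the (yield, productivity) values are hypothetical.

```python
def dominates(a, b):
    """True if design a is at least as good as b on every objective
    and strictly better on at least one (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Hypothetical (yield, productivity) scores for four candidate designs.
designs = [(0.9, 1.0), (0.8, 2.0), (0.7, 1.5), (0.5, 0.5)]
print(pareto_front(designs))
```

The surviving designs represent trade-offs between the objectives; choosing among them is left to the designer rather than to the optimizer.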