26 results for Computational biology and bioinformatics
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo
Abstract:
In [1], the authors proposed a framework for automated clustering and visualization of biological data sets named AUTO-HDS. This letter is intended to complement that framework by showing that a user-defined parameter can be eliminated, so that the clustering stage can be implemented more accurately and with reduced computational complexity.
Abstract:
Intron splicing is one of the most important steps involved in the maturation process of a pre-mRNA. Although the sequence profiles around the splice sites have been studied extensively, the levels of sequence identity between the exonic sequences preceding the donor sites and the intronic sequences preceding the acceptor sites have not been examined as thoroughly. In this study we investigated identity patterns between the last 15 nucleotides of the exonic sequence preceding the 5' splice site and the intronic sequence preceding the 3' splice site in a set of human protein-coding genes that do not exhibit intron retention. We found that almost 60% of consecutive exons and introns in human protein-coding genes share at least two identical nucleotides at their 3' ends and, on average, the sequence identity length is 2.47 nucleotides. Based on our findings we conclude that the 3' ends of exons and introns tend to have longer identical sequences when taken from the same gene than when taken from different genes. Our results hold even if the pairs are non-consecutive in the transcription order. (C) 2012 Elsevier Ltd. All rights reserved.
Abstract:
Protein databases contain a substantial number of proteins whose structure has been determined but whose function is not annotated. Understanding the relationship between function and structure can be useful to predict function on a large scale. We have analyzed the similarities in global physicochemical parameters for a set of enzymes which were classified according to the four Enzyme Commission (EC) hierarchical levels. Using relevance theory we introduced a distance between proteins in the space of physicochemical characteristics. This was done by minimizing a cost function of the metric tensor built to reflect the EC classification system. Using an unsupervised clustering method on a set of 1025 enzymes, we obtained no relevant clustering formation compatible with EC classification. The distance distributions between enzymes from the same EC group and from different EC groups were compared by histograms. Such analysis was also performed using sequence alignment similarity as a distance. Our results suggest that global structure parameters are not sufficient to segregate enzymes according to EC hierarchy. This indicates that features essential for function are local rather than global. Consequently, methods for predicting function based on global attributes should not be expected to achieve high accuracy in predicting the main EC classes without relying on similarities between enzymes from training and validation datasets. Furthermore, these results are consistent with a substantial number of studies suggesting that function evolves fundamentally by recruitment, i.e., the same protein motif or fold can be used to perform different enzymatic functions and a few specific amino acids (AAs) are actually responsible for enzyme activity. These essential amino acids should belong to active sites and an effective method for predicting function should be able to recognize them. (C) 2012 Elsevier Ltd. All rights reserved.
Abstract:
This study describes the spatio-temporal distribution, population biology, and diet of the puffer fish Lagocephalus laevigatus in Caraguatatuba Bay, south-eastern Brazil. Monthly samples were taken between August 2003 and October 2004 by trawls in two areas, south and north, at depths of 1 to 4 m. The fish were measured and their sex and reproductive stage determined. The abundance of this species was compared between areas and among months, and the items in the diet were identified and quantified. Lagocephalus laevigatus was rare in Caraguatatuba Bay, where only 199 small individuals (4.8 to 15.4 cm) were obtained in the entire study period, suggesting that this species uses the estuary as a nursery. None of the specimens of L. laevigatus captured in Caraguatatuba Bay were sexually mature. Higher densities of L. laevigatus in the bay were recorded in the south area and between October and December 2003, i.e. in the spring, suggesting that spawning may occur from late winter to spring (August through to November). The diet items consumed by L. laevigatus in Caraguatatuba Bay were, as expected from the current literature, crustaceans, mainly amphipods, and fish. However, the most-consumed item was the sea whip Leptogorgia setacea (Cnidaria). This feeding habit may be related to the presence of toxins (tetrodotoxin and saxitoxin) that are frequently found in the skin and viscera of L. laevigatus and which may be sequestered from the sea whip, a possibility that still needs to be specifically evaluated.
Abstract:
This study evaluated the spatio-temporal distribution, population biology and diet of Menticirrhus americanus in Caraguatatuba Bay. Samples were taken monthly between August 2003 and October 2004, by trawling in two previously selected areas. The northern area is more exposed to wave activity and is influenced by a river, functioning as a small estuary. In contrast, the southern area is relatively sheltered from wave energy and influenced to a lesser degree by smaller rivers. The fishes' length was measured, and the sex and gonadal stage macroscopically identified. The abundance of this species was compared between areas and among months. The diet was identified and quantified. M. americanus occurred in equal proportions in the two study areas, being most abundant in April 2004, followed by December 2003 and January 2004. The population was dominated by small immature individuals. The few individuals in maturation or mature that were captured showed no seasonal pattern of distribution. This species had a varied diet, feeding on worms (nemerteans, sipunculans and echiurans), mollusks (bivalves and cephalopods), polychaetes, crustaceans and fish. The presence of intact nematodes in the intestine suggests that these are parasites. The results demonstrated that M. americanus has a homogeneous spatial and temporal distribution in Caraguatatuba Bay, being uniformly distributed between the south and north areas as well as across the months. This species can be considered a carnivorous predator, showing a preference for consuming benthic sandy-beach species such as glycerids and other polychaetes, crustaceans, and bivalve siphons.
Abstract:
The male of Potiicoara brasiliensis is reported for the first time with evidence of sexual dimorphism. Male diagnostic characters are described and compared with the other three species of Spelaeogriphidae. Males present differential morphology on both distal podomere articles of the antennula and antenna, an elongate and curved bare endopod on pleopod 2, a pair of short round penes on the sternum near the base of pereopod 7, and telson with dorsum almost smooth and apex straight. Material is sampled for the first time from karstic areas north of the species type-locality, Gruta Ricardo Franco near Corumba City, and Gruta do Curupira in the Araras Mountains. These new findings expand the distribution of the species over seven hundred kilometers. Comparisons between exemplars of both sexes are presented. A hypothesis on the distributional pattern of P. brasiliensis is introduced based on the geological history of Central-West Brazil.
Abstract:
Apomictic plants are less dependent on pollinator services and able to occupy more diverse habitats than sexual species. However, such assumptions are based on temperate species, and comparable evaluation for species-rich Neotropical taxa is lacking. In this context, the Melastomataceae is a predominantly Neotropical angiosperm family with many apomictic species, which is common in the Campos Rupestres, endemism-rich vegetation on rocky outcrops in central Brazil. In this study, the breeding system of some Campo Rupestre Melastomataceae was evaluated, and breeding system studies for New World species were surveyed to test the hypothesis that apomixis is associated with wide distributions, whilst sexual species have more restricted areas. The breeding systems of 20 Campo Rupestre Melastomataceae were studied using hand pollinations and pollen-tube growth analysis. In addition, breeding system information was compiled for 124 New World species of Melastomataceae with either wide (1000 km) or restricted distributions. Most (80%) of the Campo Rupestre species studied were self-compatible. Self-incompatibility in Microlicia viminalis was associated with pollen-tube arrest in the style, as described for other Melastomataceae, but most self-incompatible species analysed showed pollen-tube growth to the ovary irrespective of pollination treatment. Apomictic species showed lower pollen viability and were less frequent among the Campo Rupestre plants. Among the New World species compiled, 43 were apomictic and 77 sexual (24 self-incompatible and 53 self-compatible). Most apomictic (86%) and self-incompatible species (71%) presented wide distributions, whilst restricted distributions predominate only among the self-compatible ones (53%). Self-compatibility and dependence on biotic pollination were characteristic of Campo Rupestre and narrowly distributed New World Melastomataceae species, whilst apomictics are widely distributed. This is, to a certain extent, similar to the geographical parthenogenesis pattern of temperate apomictics.
Abstract:
Several recent studies in the literature have identified brain morphological alterations associated with Borderline Personality Disorder (BPD). These findings are reported by studies based on voxel-based morphometry analysis of structural MRI data, comparing mean gray-matter concentration between groups of BPD patients and healthy controls. However, mean differences between groups are not informative about the discriminative value of neuroimaging data for predicting the group of individual subjects. In this paper, we go beyond mean-difference analyses and explore to what extent individual BPD patients can be differentiated from controls (25 subjects in each group), using a combination of automated morphometric tools for regional cortical thickness/volumetric estimation and a Support Vector Machine (SVM) classifier. The approach included a feature selection step in order to identify the regions containing the most discriminative information. The accuracy of this classifier was evaluated using the leave-one-subject-out procedure. The brain regions indicated as containing relevant information to discriminate groups were the orbitofrontal, rostral anterior cingulate, posterior cingulate, and middle temporal cortices, among others. These areas, which are distinctively involved in emotional and affect regulation of BPD patients, were the most informative regions to achieve both sensitivity and specificity values of 80% in SVM classification. The findings suggest that this new methodology can add clinical and potential diagnostic value to neuroimaging of psychiatric disorders. (C) 2012 Elsevier Ltd. All rights reserved.
Abstract:
Background: The thymus is a central lymphoid organ, in which bone marrow-derived T cell precursors undergo a complex process of maturation. Developing thymocytes interact with the thymic microenvironment in a defined spatial order. A component of the thymic microenvironment, the thymic epithelial cells (TEC), is crucial for the maturation of T-lymphocytes through cell-cell contact, cell-matrix interactions and secretion of cytokines/chemokines. There is evidence that extracellular matrix molecules play a fundamental role in guiding differentiating thymocytes in both cortical and medullary regions of the thymic lobules. The interaction between the integrin α5β1 (CD49e/CD29; VLA-5) and fibronectin is relevant for thymocyte adhesion and migration within the thymic tissue. Our previous results have shown that adhesion of thymocytes to a cultured TEC line is enhanced in the presence of fibronectin, and can be blocked with anti-VLA-5 antibody. Results: Herein, we studied the role of CD49e expressed by the human thymic epithelium. For this purpose we knocked down CD49e by means of RNA interference. This procedure resulted in the modulation of more than 100 genes, some of them coding for other proteins also involved in adhesion of thymocytes; others related to signaling pathways triggered after integrin activation, or even involved in the control of F-actin stress fiber formation. Functionally, we demonstrated that disruption of VLA-5 in human TEC by CD49e-siRNA-induced gene knockdown decreased the ability of TEC to promote thymocyte adhesion. Such a decrease comprised all CD4/CD8-defined thymocyte subsets. Conclusion: Conceptually, our findings unravel the complexity of gene regulation, as regards key genes involved in the heterocellular cell adhesion between developing thymocytes and the major component of the thymic microenvironment, an interaction that is a mandatory event for proper intrathymic T cell differentiation.
Abstract:
Background: Recent medical and biological technology advances have stimulated the development of new testing systems that have been providing huge, varied amounts of molecular and clinical data. Growing data volumes pose significant challenges for information processing systems in research centers. Additionally, the routines of a genomics laboratory are typically characterized by high parallelism in testing and constant procedure changes. Results: This paper describes a formal approach to address this challenge through the implementation of a genetic testing management system applied to a human genome laboratory. We introduced the Human Genome Research Center Information System (CEGH) in Brazil, a system that is able to support constant changes in human genome testing and can provide patients with updated results based on the most recent and validated genetic knowledge. Our approach uses a common repository for process planning to ensure reusability, specification, instantiation, monitoring, and execution of processes, which are defined using a relational database and rigorous control flow specifications based on process algebra (ACP). The main difference between our approach and related works is that we were able to join two important aspects: 1) process scalability achieved through relational database implementation, and 2) correctness of processes using process algebra. Furthermore, the software allows end users to define genetic testing without requiring any knowledge about business process notation or process algebra. Conclusions: This paper presents the CEGH information system, a Laboratory Information Management System (LIMS) based on a formal framework to support genetic testing management for Mendelian disorder studies. We have proved the feasibility and shown the usability benefits of a rigorous approach that is able to specify, validate, and perform genetic testing using easy end-user interfaces.
Abstract:
Background: The study and analysis of gene expression measurements is the primary focus of functional genomics. Once expression data is available, biologists are faced with the task of extracting (new) knowledge associated with the underlying biological phenomenon. Most often, in order to perform this task, biologists execute a number of analysis activities on the available gene expression dataset rather than a single analysis activity. The integration of heterogeneous tools and data sources to create an integrated analysis environment represents a challenging and error-prone task. Semantic integration enables the assignment of unambiguous meanings to data shared among different applications in an integrated environment, allowing the exchange of data in a semantically consistent and meaningful way. This work aims at developing an ontology-based methodology for the semantic integration of gene expression analysis tools and data sources. The proposed methodology relies on software connectors to support not only the access to heterogeneous data sources but also the definition of transformation rules on exchanged data. Results: We have studied the different challenges involved in the integration of computer systems and the role software connectors play in this task. We have also studied a number of gene expression technologies, analysis tools and related ontologies in order to devise basic integration scenarios and propose a reference ontology for the gene expression domain. Then, we have defined a number of activities and associated guidelines to prescribe how the development of connectors should be carried out. Finally, we have applied the proposed methodology in the construction of three different integration scenarios involving the use of different tools for the analysis of different types of gene expression data. Conclusions: The proposed methodology facilitates the development of connectors capable of semantically integrating different gene expression analysis tools and data sources. The methodology can be used in the development of connectors supporting both simple and nontrivial processing requirements, thus assuring accurate data exchange and information interpretation from exchanged data.
Abstract:
The 'Critically Endangered' Cone-billed Tanager Conothraupis mesoleuca was described in 1939, based on a single specimen collected in the state of Mato Grosso, western Brazil. Not seen again in the wild until 2003, this poorly-known species was rediscovered in Emas National Park, in the Brazilian state of Goias. We describe here the discovery of a new population of Cone-billed Tanager in Chapada dos Parecis, along the upper Juruena River basin, in the state of Mato Grosso. The birds were always detected in (or near) flooded habitats along rivers. At least 40 individuals were found, but the population may be larger since areas of potential habitat are available in the upper Juruena basin and these have not yet been surveyed. We also provide here the first information on the biology and behaviour of the species based on observations in Juruena and Emas, as well as a first description of the female. Historical documents and our records support our suggestion that "Juruena", i.e. the type locality of the Cone-billed Tanager, refers to the Juruena telegraph station (12°50'S, 58°55'W). Considering that the range of the species is being settled, research on different aspects of its biology is urgent.
Abstract:
Background: This paper addresses the prediction of the free energy of binding of a drug candidate with the enzyme InhA associated with Mycobacterium tuberculosis. This problem is found within rational drug design, where interactions between drug candidates and target proteins are verified through molecular docking simulations. In this application, it is important not only to correctly predict the free energy of binding, but also to provide a comprehensible model that could be validated by a domain specialist. Decision-tree induction algorithms have been successfully used in drug-design related applications, especially considering that decision trees are simple to understand, interpret, and validate. There are several decision-tree induction algorithms available for general use, but each one has a bias that makes it more suitable for a particular data distribution. In this article, we propose and investigate the automatic design of decision-tree induction algorithms tailored to particular drug-enzyme binding data sets. We investigate the performance of our new method for evaluating binding conformations of different drug candidates to InhA, and we analyze our findings with respect to decision tree accuracy, comprehensibility, and biological relevance. Results: The empirical analysis indicates that our method is capable of automatically generating decision-tree induction algorithms that significantly outperform the traditional C4.5 algorithm with respect to both accuracy and comprehensibility. In addition, we provide the biological interpretation of the rules generated by our approach, reinforcing the importance of comprehensible predictive models in this particular bioinformatics application. Conclusions: We conclude that automatically designing a decision-tree algorithm tailored to molecular docking data is a promising alternative for the prediction of the free energy from the binding of a drug candidate with a flexible receptor.