20 results for Computational Biology


Relevance: 60.00%

Abstract:

Background: A large number of probabilistic models used in sequence analysis assign non-zero probability values to most input sequences. The most common way to decide whether a given probability is sufficient is Bayesian binary classification, in which the probability assigned by the model characterizing the sequence family of interest is compared to that of an alternative probability model, such as a null model. This is the scoring technique used by sequence analysis tools such as HMMER, SAM and INFERNAL. The most prevalent null models are position-independent residue distributions, including the uniform distribution, the genomic distribution, the family-specific distribution and the target sequence distribution. This paper presents a study evaluating the impact of the choice of null model on the final result of a classification. In particular, we are interested in minimizing the number of false predictions, a crucial issue for reducing the cost of biological validation.

Results: In all tests with random sequences, the target null model yielded the lowest number of false positives. The study was performed on DNA sequences using GC content as the measure of compositional bias, but the results should also hold for protein sequences. To broaden the applicability of the results, the study used randomly generated sequences. Previous studies were performed on amino acid sequences, using only one probabilistic model (HMM) and a specific benchmark, and lacked more general conclusions about the performance of null models. Finally, a benchmark test with P. falciparum confirmed these results.

Conclusions: Of the evaluated models, the best suited for classification are the uniform model and the target model. However, the uniform model presents a GC bias that can cause more false positives for candidate sequences with extreme compositional bias, a characteristic not described in previous studies. In these cases the target model is more dependable for biological validation due to its higher specificity.
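The scoring scheme described above can be made concrete with a short sketch. The following is a minimal illustration (not code from the paper) of Bayesian binary classification by log-odds scoring against the uniform and target-sequence null models; the GC-rich family model is a hypothetical stand-in for the profile models (e.g., HMMs) used by tools such as HMMER.

```python
import math
from collections import Counter

def log_prob_iid(seq, dist):
    """Log probability of a sequence under a position-independent
    residue distribution (the form of null model discussed above)."""
    return sum(math.log(dist[res]) for res in seq)

def uniform_null(alphabet="ACGT"):
    """Uniform null model: every residue equally likely."""
    p = 1.0 / len(alphabet)
    return {res: p for res in alphabet}

def target_null(seq):
    """Target-sequence null model: residue frequencies estimated
    from the candidate sequence itself."""
    counts = Counter(seq)
    n = len(seq)
    return {res: counts[res] / n for res in counts}

def log_odds(seq, family_log_prob, null_dist):
    """Bayesian binary classification score: log-odds of the family
    model against the chosen null model. Positive scores favor
    membership in the family."""
    return family_log_prob(seq) - log_prob_iid(seq, null_dist)

# Toy family model biased toward GC-rich sequences (hypothetical).
gc_rich = {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1}
family = lambda s: log_prob_iid(s, gc_rich)

seq = "GCGCGGCCGCAT"
print(log_odds(seq, family, uniform_null()))    # uniform null: positive
print(log_odds(seq, family, target_null(seq)))  # target null: near zero
```

On a GC-rich candidate like the one above, the uniform null inflates the score (the GC-bias false-positive effect noted in the conclusions), while the target null discounts the sequence's own composition.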

Relevance: 60.00%

Abstract:

Background: The thymus is a central lymphoid organ in which bone marrow-derived T cell precursors undergo a complex process of maturation. Developing thymocytes interact with the thymic microenvironment in a defined spatial order. One component of this microenvironment, the thymic epithelial cells (TEC), is crucial for the maturation of T-lymphocytes through cell-cell contact, cell-matrix interactions and secretion of cytokines/chemokines. There is evidence that extracellular matrix molecules play a fundamental role in guiding differentiating thymocytes in both the cortical and medullary regions of the thymic lobules. The interaction between the integrin α5β1 (CD49e/CD29; VLA-5) and fibronectin is relevant for thymocyte adhesion and migration within the thymic tissue. Our previous results have shown that adhesion of thymocytes to a cultured TEC line is enhanced in the presence of fibronectin and can be blocked with an anti-VLA-5 antibody.

Results: Here, we studied the role of CD49e expressed by the human thymic epithelium. For this purpose, we knocked down CD49e by means of RNA interference. This procedure resulted in the modulation of more than 100 genes: some coding for other proteins also involved in thymocyte adhesion, others related to signaling pathways triggered after integrin activation or involved in the control of F-actin stress fiber formation. Functionally, we demonstrated that disruption of VLA-5 in human TEC by CD49e-siRNA-induced gene knockdown decreased the ability of TEC to promote thymocyte adhesion. This decrease comprised all CD4/CD8-defined thymocyte subsets.

Conclusion: Conceptually, our findings unravel the complexity of gene regulation as regards key genes involved in the heterocellular adhesion between developing thymocytes and the major component of the thymic microenvironment, an interaction that is a mandatory event for proper intrathymic T cell differentiation.
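As a rough illustration of the kind of post-knockdown expression screen summarized above, the sketch below flags genes whose expression changes at least two-fold between control and siRNA-treated cells. The expression values are invented, and the paper's actual platform and statistics are not reproduced here; ITGA5 is the gene encoding CD49e and FN1 encodes fibronectin.

```python
import math

def modulated_genes(control, knockdown, fold=2.0):
    """Flag genes whose expression changes by at least `fold`
    between control and siRNA-treated samples.
    Returns {gene: log2 fold change} for modulated genes."""
    hits = {}
    for gene, ctrl in control.items():
        kd = knockdown.get(gene)
        if kd is None or ctrl == 0 or kd == 0:
            continue
        lfc = math.log2(kd / ctrl)
        if abs(lfc) >= math.log2(fold):
            hits[gene] = lfc
    return hits

# Hypothetical expression values (arbitrary units).
control   = {"ITGA5": 120.0, "FN1": 80.0, "ACTB": 500.0}
knockdown = {"ITGA5": 15.0,  "FN1": 35.0, "ACTB": 490.0}
print(modulated_genes(control, knockdown))
# ITGA5 (CD49e) and FN1 pass the 2-fold cutoff; ACTB does not.
```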

Relevance: 60.00%

Abstract:

Background: Recent advances in medical and biological technology have stimulated the development of new testing systems that provide huge, varied amounts of molecular and clinical data. Growing data volumes pose significant challenges for information processing systems in research centers. Additionally, the routines of a genomics laboratory are typically characterized by high parallelism in testing and constant procedure changes.

Results: This paper describes a formal approach to address this challenge through the implementation of a genetic testing management system applied to a human genome laboratory. We introduce the Human Genome Research Center Information System (CEGH) in Brazil, a system that supports constant changes in human genome testing and can provide patients with updated results based on the most recent, validated genetic knowledge. Our approach uses a common repository for process planning to ensure reusability, specification, instantiation, monitoring, and execution of processes, which are defined using a relational database and rigorous control-flow specifications based on process algebra (ACP); a sketch of such a specification follows below. The main difference between our approach and related work is that we join two important aspects: (1) process scalability, achieved through a relational database implementation, and (2) process correctness, ensured through process algebra. Furthermore, the software allows end users to define genetic tests without requiring any knowledge of business process notation or process algebra.

Conclusions: This paper presents the CEGH information system, a Laboratory Information Management System (LIMS) based on a formal framework to support genetic testing management for Mendelian disorder studies. We have demonstrated the feasibility and the usability benefits of a rigorous approach that can specify, validate, and perform genetic testing through easy-to-use end-user interfaces.
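To make the ACP-based control-flow idea concrete, here is a minimal sketch (ours, not the CEGH implementation) of how testing steps could be composed with ACP's sequential (.), alternative (+) and parallel merge (||) operators; the step names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    name: str
    def __repr__(self):
        return self.name

@dataclass(frozen=True)
class Seq:        # sequential composition: p . q
    left: object
    right: object
    def __repr__(self):
        return f"({self.left} . {self.right})"

@dataclass(frozen=True)
class Alt:        # alternative composition: p + q
    left: object
    right: object
    def __repr__(self):
        return f"({self.left} + {self.right})"

@dataclass(frozen=True)
class Par:        # parallel merge: p || q
    left: object
    right: object
    def __repr__(self):
        return f"({self.left} || {self.right})"

# A hypothetical Mendelian-disorder test: extract DNA, then run
# sequencing and an MLPA assay in parallel, then either report a
# pathogenic variant or a negative result.
test = Seq(Step("extract_dna"),
           Seq(Par(Step("sequence"), Step("mlpa")),
               Alt(Step("report_variant"), Step("report_negative"))))
print(test)
# (extract_dna . ((sequence || mlpa) . (report_variant + report_negative)))
```

Terms like these are what a process repository can store and validate algebraically, while a relational schema holds the concrete step definitions and their execution state.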

Relevance: 60.00%

Abstract:

The enzyme chitinase from Moniliophthora perniciosa, the causative agent of witches' broom disease in Theobroma cacao, was partially purified by ammonium sulfate precipitation and gel filtration on Sephacryl S-200, using sodium phosphate as the extraction buffer. Response surface methodology (RSM) was used to determine the optimum pH and temperature conditions. Four different isoenzymes were obtained: ChitMp I, ChitMp II, ChitMp III and ChitMp IV. ChitMp I had an optimum temperature of 44-73 °C and an optimum pH of 7.0-8.4; ChitMp II, 45-73 °C and pH 7.0-8.4; ChitMp III, 54-67 °C and pH 7.3-8.8; ChitMp IV, 60 °C and pH 7.0. For the computational biology analysis, the primary sequence was determined in silico from the database of the Genome/Proteome Project of M. perniciosa, yielding a 564 bp sequence encoding 188 amino acids that was used for three-dimensional structure prediction by comparative modeling. The generated models were validated using Procheck 3.0 and ANOLEA. The proposed chitinase model was then subjected to a molecular dynamics analysis over a 1 ns interval, resulting in a model with 91.7% of the residues occupying favorable regions of the Ramachandran plot and an RMS deviation of 2.68.
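As an illustration of the RSM step, the sketch below (with invented activity measurements, not the paper's data) fits a second-order polynomial in temperature and pH and solves for the stationary point, which is how RSM locates optimum conditions like those reported above.

```python
import numpy as np

# Hypothetical (temperature °C, pH, relative activity) measurements.
data = np.array([
    [40, 6.0, 0.35], [40, 7.5, 0.55], [40, 9.0, 0.30],
    [55, 6.0, 0.60], [55, 7.5, 0.95], [55, 9.0, 0.50],
    [70, 6.0, 0.40], [70, 7.5, 0.65], [70, 9.0, 0.35],
])
T, pH, y = data[:, 0], data[:, 1], data[:, 2]

# Design matrix for y = b0 + b1*T + b2*pH + b3*T^2 + b4*pH^2 + b5*T*pH
X = np.column_stack([np.ones_like(T), T, pH, T**2, pH**2, T * pH])
b, *_ = np.linalg.lstsq(X, y, rcond=None)

# Stationary point: set the gradient of the fitted quadratic to zero.
H = np.array([[2 * b[3], b[5]], [b[5], 2 * b[4]]])   # Hessian
g = np.array([b[1], b[2]])                           # linear terms
T_opt, pH_opt = np.linalg.solve(H, -g)
print(f"optimum: {T_opt:.1f} °C, pH {pH_opt:.2f}")
```

With these toy data the stationary point lands near 55 °C and pH 7.5; in practice one also checks that the Hessian is negative definite, i.e., that the point is a maximum.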

Relevance: 60.00%

Abstract:

Background: The study and analysis of gene expression measurements is the primary focus of functional genomics. Once expression data are available, biologists face the task of extracting new knowledge about the underlying biological phenomenon. To perform this task, biologists usually execute a number of analysis activities on the available gene expression dataset rather than a single one. Integrating heterogeneous tools and data sources into a unified analysis environment is a challenging and error-prone task. Semantic integration enables the assignment of unambiguous meanings to data shared among different applications in an integrated environment, allowing the exchange of data in a semantically consistent and meaningful way. This work aims at developing an ontology-based methodology for the semantic integration of gene expression analysis tools and data sources. The proposed methodology relies on software connectors to support not only access to heterogeneous data sources but also the definition of transformation rules on exchanged data.

Results: We studied the different challenges involved in the integration of computer systems and the role software connectors play in this task. We also studied a number of gene expression technologies, analysis tools and related ontologies in order to devise basic integration scenarios and propose a reference ontology for the gene expression domain. We then defined a number of activities and associated guidelines prescribing how the development of connectors should be carried out. Finally, we applied the proposed methodology to the construction of three different integration scenarios involving different tools for the analysis of different types of gene expression data.

Conclusions: The proposed methodology facilitates the development of connectors capable of semantically integrating different gene expression analysis tools and data sources. It can be used to develop connectors supporting both simple and nontrivial processing requirements, thus ensuring accurate data exchange and correct interpretation of the exchanged information.
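The connector idea can be sketched briefly. The following toy example (ours, not the paper's methodology or reference ontology) shows a connector that renames a source tool's fields to shared ontology terms and applies a transformation rule to each value before handing the record to a consuming tool; all term and field names are hypothetical.

```python
import math
from typing import Any, Callable, Dict

class Connector:
    """Bridges a data source and an analysis tool by renaming
    fields to shared ontology terms and transforming values."""
    def __init__(self,
                 term_map: Dict[str, str],
                 rules: Dict[str, Callable[[Any], Any]]):
        self.term_map = term_map   # source field -> ontology term
        self.rules = rules         # ontology term -> value transform

    def translate(self, record: Dict[str, Any]) -> Dict[str, Any]:
        out = {}
        for field, value in record.items():
            term = self.term_map.get(field)
            if term is None:
                continue                      # field has no shared meaning
            rule = self.rules.get(term, lambda v: v)
            out[term] = rule(value)
        return out

# Map a microarray tool's output onto shared terms, converting a
# raw intensity ratio into the log2 ratio expected downstream.
connector = Connector(
    term_map={"probe": "gene_product", "ratio": "log2_expression_ratio"},
    rules={"log2_expression_ratio": lambda v: math.log2(v)},
)
print(connector.translate({"probe": "TP53", "ratio": 4.0, "flag": "ok"}))
# {'gene_product': 'TP53', 'log2_expression_ratio': 2.0}
```

Keeping both the term map and the transformation rules explicit in the connector is what lets two independently developed tools exchange records without either one adopting the other's internal vocabulary.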