992 resultados para Computational Identification


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Chondrocyte gene regulation is important for the generation and maintenance of cartilage tissues. Several regulatory factors have been identified that play a role in chondrogenesis, including the positive transacting factors of the SOX family such as SOX9, SOX5, and SOX6, as well as negative transacting factors such as C/EBP and delta EF1. However, a complete understanding of the intricate regulatory network that governs the tissue-specific expression of cartilage genes is not yet available. We have taken a computational approach to identify cis-regulatory, transcription factor (TF) binding motifs in a set of cartilage characteristic genes to better define the transcriptional regulatory networks that regulate chondrogenesis. Our computational methods have identified several TFs, whose binding profiles are available in the TRANSFAC database, as important to chondrogenesis. In addition, a cartilage-specific SOX-binding profile was constructed and used to identify both known, and novel, functional paired SOX-binding motifs in chondrocyte genes. Using DNA pattern-recognition algorithms, we have also identified cis-regulatory elements for unknown TFs. We have validated our computational predictions through mutational analyses in cell transfection experiments. One novel regulatory motif, N1, found at high frequency in the COL2A1 promoter, was found to bind to chondrocyte nuclear proteins. Mutational analyses suggest that this motif binds a repressive factor that regulates basal levels of the COL2A1 promoter.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Abstract Background The sequencing of the D.melanogaster genome revealed an unexpected small number of genes (~ 14,000) indicating that mechanisms acting on generation of transcript diversity must have played a major role in the evolution of complex metazoans. Among the most extensively used mechanisms that accounts for this diversity is alternative splicing. It is estimated that over 40% of Drosophila protein-coding genes contain one or more alternative exons. A recent transcription map of the Drosophila embryogenesis indicates that 30% of the transcribed regions are unannotated, and that 1/3 of this is estimated as missed or alternative exons of previously characterized protein-coding genes. Therefore, the identification of the variety of expressed transcripts depends on experimental data for its final validation and is continuously being performed using different approaches. We applied the Open Reading Frame Expressed Sequence Tags (ORESTES) methodology, which is capable of generating cDNA data from the central portion of rare transcripts, in order to investigate the presence of hitherto unnanotated regions of Drosophila transcriptome. Results Bioinformatic analysis of 1,303 Drosophila ORESTES clusters identified 68 sequences derived from unannotated regions in the current Drosophila genome version (4.3). Of these, a set of 38 was analysed by polyA+ northern blot hybridization, validating 17 (50%) new exons of low abundance transcripts. For one of these ESTs, we obtained the cDNA encompassing the complete coding sequence of a new serine protease, named SP212. The SP212 gene is part of a serine protease gene cluster located in the chromosome region 88A12-B1. This cluster includes the predicted genes CG9631, CG9649 and CG31326, which were previously identified as up-regulated after immune challenges in genomic-scale microarray analysis. In agreement with the proposal that this locus is co-regulated in response to microorganisms infection, we show here that SP212 is also up-regulated upon injury. Conclusion Using the ORESTES methodology we identified 17 novel exons from low abundance Drosophila transcripts, and through a PCR approach the complete CDS of one of these transcripts was defined. Our results show that the computational identification and manual inspection are not sufficient to annotate a genome in the absence of experimentally derived data.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Abstract Background MicroRNAs (miRNAs) are small regulatory RNAs, some of which are conserved in diverse plant genomes. Therefore, computational identification and further experimental validation of miRNAs from non-model organisms is both feasible and instrumental for addressing miRNA-based gene regulation and evolution. Sugarcane (Saccharum spp.) is an important biofuel crop with publicly available expressed sequence tag and genomic survey sequence databases, but little is known about miRNAs and their targets in this highly polyploid species. Results In this study, we have computationally identified 19 distinct sugarcane miRNA precursors, of which several are highly similar with their sorghum homologs at both nucleotide and secondary structure levels. The accumulation pattern of mature miRNAs varies in organs/tissues from the commercial sugarcane hybrid as well as in its corresponding founder species S. officinarum and S. spontaneum. Using sugarcane MIR827 as a query, we found a novel MIR827 precursor in the sorghum genome. Based on our computational tool, a total of 46 potential targets were identified for the 19 sugarcane miRNAs. Several targets for highly conserved miRNAs are transcription factors that play important roles in plant development. Conversely, target genes of lineage-specific miRNAs seem to play roles in diverse physiological processes, such as SsCBP1. SsCBP1 was experimentally confirmed to be a target for the monocot-specific miR528. Our findings support the notion that the regulation of SsCBP1 by miR528 is shared at least within graminaceous monocots, and this miRNA-based post-transcriptional regulation evolved exclusively within the monocots lineage after the divergence from eudicots. Conclusions Using publicly available nucleotide databases, 19 sugarcane miRNA precursors and one new sorghum miRNA precursor were identified and classified into 14 families. Comparative analyses between sugarcane and sorghum suggest that these two species retain homologous miRNAs and targets in their genomes. Such conservation may help to clarify specific aspects of miRNA regulation and evolution in the polyploid sugarcane. Finally, our dataset provides a framework for future studies on sugarcane RNAi-dependent regulatory mechanisms.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Recently, we identified a large number of ultraconserved (uc) sequences in noncoding regions of human, mouse, and rat genomes that appear to be essential for vertebrate and amniote ontogeny. Here, we used similar methods to identify ultraconserved genomic regions between the insect species Drosophila melanogaster and Drosophila pseudoobscura, as well as the more distantly related Anopheles gambiae. As with vertebrates, ultraconserved sequences in insects appear to Occur primarily in intergenic and intronic sequences, and at intron-exon junctions. The sequences are significantly associated with genes encoding developmental regulators and transcription factors, but are less frequent and are smaller in size than in vertebrates. The longest identical, nongapped orthologous match between the three genomes was found within the homothorax (hth) gene. This sequence spans an internal exon-intron junction, with the majority located within the intron, and is predicted to form a highly stable stem-loop RNA structure. Real-time quantitative PCR analysis of different hth splice isoforms and Northern blotting showed that the conserved element is associated with a high incidence of intron retention in hth pre-mRNA, suggesting that the conserved intronic element is critically important in the post-transcriptional regulation of hth expression in Diptera.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Biosignals processing, Biological Nonlinear and time-varying systems identification, Electomyograph signals recognition, Pattern classification, Fuzzy logic and neural networks methods

Relevância:

40.00% 40.00%

Publicador:

Resumo:

System identification deals with the problem of building mathematical models of dynamical systems based on observed data from the system" [1]. In the context of civil engineering, the system refers to a large scale structure such as a building, bridge, or an offshore structure, and identification mostly involves the determination of modal parameters (the natural frequencies, damping ratios, and mode shapes). This paper presents some modal identification results obtained using a state-of-the-art time domain system identification method (data-driven stochastic subspace algorithms [2]) applied to the output-only data measured in a steel arch bridge. First, a three dimensional finite element model was developed for the numerical analysis of the structure using ANSYS. Modal analysis was carried out and modal parameters were extracted in the frequency range of interest, 0-10 Hz. The results obtained from the finite element modal analysis were used to determine the location of the sensors. After that, ambient vibration tests were conducted during April 23-24, 2009. The response of the structure was measured using eight accelerometers. Two stations of three sensors were formed (triaxial stations). These sensors were held stationary for reference during the test. The two remaining sensors were placed at the different measurement points along the bridge deck, in which only vertical and transversal measurements were conducted (biaxial stations). Point estimate and interval estimate have been carried out in the state space model using these ambient vibration measurements. In the case of parametric models (like state space), the dynamic behaviour of a system is described using mathematical models. Then, mathematical relationships can be established between modal parameters and estimated point parameters (thus, it is common to use experimental modal analysis as a synonym for system identification). Stable modal parameters are found using a stabilization diagram. Furthermore, this paper proposes a method for assessing the precision of estimates of the parameters of state-space models (confidence interval). This approach employs the nonparametric bootstrap procedure [3] and is applied to subspace parameter estimation algorithm. Using bootstrap results, a plot similar to a stabilization diagram is developed. These graphics differentiate system modes from spurious noise modes for a given order system. Additionally, using the modal assurance criterion, the experimental modes obtained have been compared with those evaluated from a finite element analysis. A quite good agreement between numerical and experimental results is observed.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Thanks to recent advances in molecular biology, allied to an ever increasing amount of experimental data, the functional state of thousands of genes can now be extracted simultaneously by using methods such as cDNA microarrays and RNA-Seq. Particularly important related investigations are the modeling and identification of gene regulatory networks from expression data sets. Such a knowledge is fundamental for many applications, such as disease treatment, therapeutic intervention strategies and drugs design, as well as for planning high-throughput new experiments. Methods have been developed for gene networks modeling and identification from expression profiles. However, an important open problem regards how to validate such approaches and its results. This work presents an objective approach for validation of gene network modeling and identification which comprises the following three main aspects: (1) Artificial Gene Networks (AGNs) model generation through theoretical models of complex networks, which is used to simulate temporal expression data; (2) a computational method for gene network identification from the simulated data, which is founded on a feature selection approach where a target gene is fixed and the expression profile is observed for all other genes in order to identify a relevant subset of predictors; and (3) validation of the identified AGN-based network through comparison with the original network. The proposed framework allows several types of AGNs to be generated and used in order to simulate temporal expression data. The results of the network identification method can then be compared to the original network in order to estimate its properties and accuracy. Some of the most important theoretical models of complex networks have been assessed: the uniformly-random Erdos-Renyi (ER), the small-world Watts-Strogatz (WS), the scale-free Barabasi-Albert (BA), and geographical networks (GG). The experimental results indicate that the inference method was sensitive to average degree k variation, decreasing its network recovery rate with the increase of k. The signal size was important for the inference method to get better accuracy in the network identification rate, presenting very good results with small expression profiles. However, the adopted inference method was not sensible to recognize distinct structures of interaction among genes, presenting a similar behavior when applied to different network topologies. In summary, the proposed framework, though simple, was adequate for the validation of the inferred networks by identifying some properties of the evaluated method, which can be extended to other inference methods.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Background: There are several studies in the literature depicting measurement error in gene expression data and also, several others about regulatory network models. However, only a little fraction describes a combination of measurement error in mathematical regulatory networks and shows how to identify these networks under different rates of noise. Results: This article investigates the effects of measurement error on the estimation of the parameters in regulatory networks. Simulation studies indicate that, in both time series (dependent) and non-time series (independent) data, the measurement error strongly affects the estimated parameters of the regulatory network models, biasing them as predicted by the theory. Moreover, when testing the parameters of the regulatory network models, p-values computed by ignoring the measurement error are not reliable, since the rate of false positives are not controlled under the null hypothesis. In order to overcome these problems, we present an improved version of the Ordinary Least Square estimator in independent (regression models) and dependent (autoregressive models) data when the variables are subject to noises. Moreover, measurement error estimation procedures for microarrays are also described. Simulation results also show that both corrected methods perform better than the standard ones (i.e., ignoring measurement error). The proposed methodologies are illustrated using microarray data from lung cancer patients and mouse liver time series data. Conclusions: Measurement error dangerously affects the identification of regulatory network models, thus, they must be reduced or taken into account in order to avoid erroneous conclusions. This could be one of the reasons for high biological false positive rates identified in actual regulatory network models.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

An important topic in genomic sequence analysis is the identification of protein coding regions. In this context, several coding DNA model-independent methods based on the occurrence of specific patterns of nucleotides at coding regions have been proposed. Nonetheless, these methods have not been completely suitable due to their dependence on an empirically predefined window length required for a local analysis of a DNA region. We introduce a method based on a modified Gabor-wavelet transform (MGWT) for the identification of protein coding regions. This novel transform is tuned to analyze periodic signal components and presents the advantage of being independent of the window length. We compared the performance of the MGWT with other methods by using eukaryote data sets. The results show that MGWT outperforms all assessed model-independent methods with respect to identification accuracy. These results indicate that the source of at least part of the identification errors produced by the previous methods is the fixed working scale. The new method not only avoids this source of errors but also makes a tool available for detailed exploration of the nucleotide occurrence.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We have used various computational methodologies including molecular dynamics, density functional theory, virtual screening, ADMET predictions and molecular interaction field studies to design and analyze four novel potential inhibitors of farnesyltransferase (FTase). Evaluation of two proposals regarding their drug potential as well as lead compounds have indicated them as novel promising FTase inhibitors, with theoretically interesting pharmacotherapeutic profiles, when Compared to the very active and most cited FTase inhibitors that have activity data reported, which are launched drugs or compounds in clinical tests. One of our two proposals appears to be a more promising drug candidate and FTase inhibitor, but both derivative molecules indicate potentially very good pharmacotherapeutic profiles in comparison with Tipifarnib and Lonafarnib, two reference pharmaceuticals. Two other proposals have been selected with virtual screening approaches and investigated by LIS, which suggest novel and alternatives scaffolds to design future potential FTase inhibitors. Such compounds can be explored as promising molecules to initiate a research protocol in order to discover novel anticancer drug candidates targeting farnesyltransferase, in the fight against cancer. (C) 2009 Elsevier Inc. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Computer models can be combined with laboratory experiments for the efficient determination of (i) peptides that bind MHC molecules and (ii) T-cell epitopes. For maximum benefit, the use of computer models must be treated as experiments analogous to standard laboratory procedures. This requires the definition of standards and experimental protocols for model application. We describe the requirements for validation and assessment of computer models. The utility of combining accurate predictions with a limited number of laboratory experiments is illustrated by practical examples. These include the identification of T-cell epitopes from IDDM-, melanoma- and malaria-related antigens by combining computational and conventional laboratory assays. The success rate in determining antigenic peptides, each in the context of a specific HLA molecule, ranged from 27 to 71%, while the natural prevalence of MHC-binding peptides is 0.1-5%.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The explosive growth in biotechnology combined with major advancesin information technology has the potential to radically transformimmunology in the postgenomics era. Not only do we now have readyaccess to vast quantities of existing data, but new data with relevanceto immunology are being accumulated at an exponential rate. Resourcesfor computational immunology include biological databases and methodsfor data extraction, comparison, analysis and interpretation. Publiclyaccessible biological databases of relevance to immunologists numberin the hundreds and are growing daily. The ability to efficientlyextract and analyse information from these databases is vital forefficient immunology research. Most importantly, a new generationof computational immunology tools enables modelling of peptide transportby the transporter associated with antigen processing (TAP), modellingof antibody binding sites, identification of allergenic motifs andmodelling of T-cell receptor serial triggering.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Allergy is a major cause of morbidity worldwide. The number of characterized allergens and related information is increasing rapidly creating demands for advanced information storage, retrieval and analysis. Bioinformatics provides useful tools for analysing allergens and these are complementary to traditional laboratory techniques for the study of allergens. Specific applications include structural analysis of allergens, identification of B- and T-cell epitopes, assessment of allergenicity and cross-reactivity, and genome analysis. In this paper, the most important bioinformatic tools and methods with relevance to the study of allergy have been reviewed.