101 resultados para Critical Sequence
Resumo:
The amalgamation operation is frequently used to reduce the number of parts of compositional data but it is a non-linear operation in the simplex with the usual geometry,the Aitchison geometry. The concept of balances between groups, a particular coordinate system designed over binary partitions of the parts, could be an alternative to theamalgamation in some cases. In this work we discuss the proper application of bothconcepts using a real data set corresponding to behavioral measures of pregnant sows
Resumo:
When underwater vehicles perform navigation close to the ocean floor, computer vision techniques can be applied to obtain quite accurate motion estimates. The most crucial step in the vision-based estimation of the vehicle motion consists on detecting matchings between image pairs. Here we propose the extensive use of texture analysis as a tool to ameliorate the correspondence problem in underwater images. Once a robust set of correspondences has been found, the three-dimensional motion of the vehicle can be computed with respect to the bed of the sea. Finally, motion estimates allow the construction of a map that could aid to the navigation of the robot
Resumo:
This paper presents an approach to ameliorate the reliability of the correspondence points relating two consecutive images of a sequence. The images are especially difficult to handle, since they have been acquired by a camera looking at the sea floor while carried by an underwater robot. Underwater images are usually difficult to process due to light absorption, changing image radiance and lack of well-defined features. A new approach based on gray-level region matching and selective texture analysis significantly improves the matching reliability
Resumo:
This paper focus on the problem of locating single-phase faults in mixed distribution electric systems, with overhead lines and underground cables, using voltage and current measurements at the sending-end and sequence model of the network. Since calculating series impedance for underground cables is not as simple as in the case of overhead lines, the paper proposes a methodology to obtain an estimation of zero-sequence impedance of underground cables starting from previous single-faults occurred in the system, in which an electric arc occurred at the fault location. For this reason, the signal is previously pretreated to eliminate its peaks voltage and the analysis can be done working with a signal as close as a sinus wave as possible
Resumo:
We study the damage enhanced creep rupture of disordered materials by means of a fiber bundle model. Broken fibers undergo a slow stress relaxation modeled by a Maxwell element whose stress exponent m can vary in a broad range. Under global load sharing we show that due to the strength disorder of fibers, the lifetime ʧ of the bundle has sample-to-sample fluctuations characterized by a log-normal distribution independent of the type of disorder. We determine the Monkman-Grant relation of the model and establish a relation between the rupture life tʄ and the characteristic time tm of the intermediate creep regime of the bundle where the minimum strain rate is reached, making possible reliable estimates of ʧ from short term measurements. Approaching macroscopic failure, the deformation rate has a finite time power law singularity whose exponent is a decreasing function of m. On the microlevel the distribution of waiting times is found to have a power law behavior with m-dependent exponents different below and above the critical load of the bundle. Approaching the critical load from above, the cutoff value of the distributions has a power law divergence whose exponent coincides with the stress exponent of Maxwell elements
Resumo:
Different procedures to obtain atom condensed Fukui functions are described. It is shown how the resulting values may differ depending on the exact approach to atom condensed Fukui functions. The condensed Fukui function can be computed using either the fragment of molecular response approach or the response of molecular fragment approach. The two approaches are nonequivalent; only the latter approach corresponds in general with a population difference expression. The Mulliken approach does not depend on the approach taken but has some computational drawbacks. The different resulting expressions are tested for a wide set of molecules. In practice one must make seemingly arbitrary choices about how to compute condensed Fukui functions, which suggests questioning the role of these indicators in conceptual density-functional theory
Resumo:
The computational approach to the Hirshfeld [Theor. Chim. Acta 44, 129 (1977)] atom in a molecule is critically investigated, and several difficulties are highlighted. It is shown that these difficulties are mitigated by an alternative, iterative version, of the Hirshfeld partitioning procedure. The iterative scheme ensures that the Hirshfeld definition represents a mathematically proper information entropy, allows the Hirshfeld approach to be used for charged molecules, eliminates arbitrariness in the choice of the promolecule, and increases the magnitudes of the charges. The resulting "Hirshfeld-I charges" correlate well with electrostatic potential derived atomic charges
Resumo:
Rho GTPases are conformational switches that control a wide variety of signaling pathways critical for eukaryotic cell development and proliferation. They represent attractive targets for drug design as their aberrant function and deregulated activity is associated with many human diseases including cancer. Extensive high-resolution structures (.100) and recent mutagenesis studies have laid the foundation for the design of new structure-based chemotherapeutic strategies. Although the inhibition of Rho signaling with drug-like compounds is an active area of current research, very little attention has been devoted to directly inhibiting Rho by targeting potential allosteric non-nucleotide binding sites. By avoiding the nucleotide binding site, compounds may minimize the potential for undesirable off-target interactions with other ubiquitous GTP and ATP binding proteins. Here we describe the application of molecular dynamics simulations, principal component analysis, sequence conservation analysis, and ensemble small-molecule fragment mapping to provide an extensive mapping of potential small-molecule binding pockets on Rho family members. Characterized sites include novel pockets in the vicinity of the conformationaly responsive switch regions as well as distal sites that appear to be related to the conformations of the nucleotide binding region. Furthermore the use of accelerated molecular dynamics simulation, an advanced sampling method that extends the accessible time-scale of conventional simulations, is found to enhance the characterization of novel binding sites when conformational changes are important for the protein mechanism.
Resumo:
Conventional methods of gene prediction rely on the recognition of DNA-sequence signals, the coding potential or the comparison of a genomic sequence with a cDNA, EST, or protein database. Reasons for limited accuracy in many circumstances are species-specific training and the incompleteness of reference databases. Lately, comparative genome analysis has attracted increasing attention. Several analysis tools that are based on human/mouse comparisons are already available. Here, we present a program for the prediction of protein-coding genes, termed SGP-1 (Syntenic Gene Prediction), which is based on the similarity of homologous genomic sequences. In contrast to most existing tools, the accuracy of SGP-1 depends little on species-specific properties such as codon usage or the nucleotide distribution. SGP-1 may therefore be applied to nonstandard model organisms in vertebrates as well as in plants, without the need for extensive parameter training. In addition to predicting genes in large-scale genomic sequences, the program may be useful to validate gene structure annotations from databases. To this end, SGP-1 output also contains comparisons between predicted and annotated gene structures in HTML format. The program can be accessed via a Web server at http://soft.ice.mpg.de/sgp-1. The source code, written in ANSI C, is available on request from the authors.
Resumo:
Arising from either retrotransposition or genomic duplication of functional genes, pseudogenes are “genomic fossils” valuable for exploring the dynamics and evolution of genes and genomes. Pseudogene identification is an important problem in computational genomics, and is also critical for obtaining an accurate picture of a genome’s structure and function. However, no consensus computational scheme for defining and detecting pseudogenes has been developed thus far. As part of the ENCyclopedia Of DNA Elements (ENCODE) project, we have compared several distinct pseudogene annotation strategies and found that different approaches and parameters often resulted in rather distinct sets of pseudogenes. We subsequently developed a consensus approach for annotating pseudogenes (derived from protein coding genes) in the ENCODE regions, resulting in 201 pseudogenes, two-thirds of which originated from retrotransposition. A survey of orthologs for these pseudogenes in 28 vertebrate genomes showed that a significant fraction (∼80%) of the processed pseudogenes are primate-specific sequences, highlighting the increasing retrotransposition activity in primates. Analysis of sequence conservation and variation also demonstrated that most pseudogenes evolve neutrally, and processed pseudogenes appear to have lost their coding potential immediately or soon after their emergence. In order to explore the functional implication of pseudogene prevalence, we have extensively examined the transcriptional activity of the ENCODE pseudogenes. We performed systematic series of pseudogene-specific RACE analyses. These, together with complementary evidence derived from tiling microarrays and high throughput sequencing, demonstrated that at least a fifth of the 201 pseudogenes are transcribed in one or more cell lines or tissues.
Resumo:
The goals of the human genome project did not include sequencing of the heterochromatic regions. We describe here an initial sequence of 1.1 Mb of the short arm of human chromosome 21 (HSA21p), estimated to be 10% of 21p. This region contains extensive euchromatic-like sequence and includes on average one transcript every 100 kb. These transcripts show multiple inter- and intrachromosomal copies, and extensive copy number and sequence variability. The sequencing of the "heterochromatic" regions of the human genome is likely to reveal many additional functional elements and provide important evolutionary information.
Resumo:
The construction of metagenomic libraries has permitted the study of microorganisms resistant to isolation and the analysis of 16S rDNA sequences has been used for over two decades to examine bacterial biodiversity. Here, we show that the analysis of random sequence reads (RSRs) instead of 16S is a suitable shortcut to estimate the biodiversity of a bacterial community from metagenomic libraries. We generated 10,010 RSRs from a metagenomic library of microorganisms found in human faecal samples. Then searched them using the program BLASTN against a prokaryotic sequence database to assign a taxon to each RSR. The results were compared with those obtained by screening and analysing the clones containing 16S rDNA sequences in the whole library. We found that the biodiversity observed by RSR analysis is consistent with that obtained by 16S rDNA. We also show that RSRs are suitable to compare the biodiversity between different metagenomic libraries. RSRs can thus provide a good estimate of the biodiversity of a metagenomic library and, as an alternative to 16S, this approach is both faster and cheaper.
Resumo:
A number of experimental methods have been reported for estimating the number of genes in a genome, or the closely related coding density of a genome, defined as the fraction of base pairs in codons. Recently, DNA sequence data representative of the genome as a whole have become available for several organisms, making the problem of estimating coding density amenable to sequence analytic methods. Estimates of coding density for a single genome vary widely, so that methods with characterized error bounds have become increasingly desirable. We present a method to estimate the protein coding density in a corpus of DNA sequence data, in which a ‘coding statistic’ is calculated for a large number of windows of the sequence under study, and the distribution of the statistic is decomposed into two normal distributions, assumed to be the distributions of the coding statistic in the coding and noncoding fractions of the sequence windows. The accuracy of the method is evaluated using known data and application is made to the yeast chromosome III sequence and to C.elegans cosmid sequences. It can also be applied to fragmentary data, for example a collection of short sequences determined in the course of STS mapping.
Resumo:
Background: Single nucleotide polymorphisms (SNPs) are the most frequent type of sequence variation between individuals, and represent a promising tool for finding genetic determinants of complex diseases and understanding the differences in drug response. In this regard, it is of particular interest to study the effect of non-synonymous SNPs in the context of biological networks such as cell signalling pathways. UniProt provides curated information about the functional and phenotypic effects of sequence variation, including SNPs, as well as on mutations of protein sequences. However, no strategy has been developed to integrate this information with biological networks, with the ultimate goal of studying the impact of the functional effect of SNPs in the structure and dynamics of biological networks. Results: First, we identified the different challenges posed by the integration of the phenotypic effect of sequence variants and mutations with biological networks. Second, we developed a strategy for the combination of data extracted from public resources, such as UniProt, NCBI dbSNP, Reactome and BioModels. We generated attribute files containing phenotypic and genotypic annotations to the nodes of biological networks, which can be imported into network visualization tools such as Cytoscape. These resources allow the mapping and visualization of mutations and natural variations of human proteins and their phenotypic effect on biological networks (e.g. signalling pathways, protein-protein interaction networks, dynamic models). Finally, an example on the use of the sequence variation data in the dynamics of a network model is presented. Conclusion: In this paper we present a general strategy for the integration of pathway and sequence variation data for visualization, analysis and modelling purposes, including the study of the functional impact of protein sequence variations on the dynamics of signalling pathways. This is of particular interest when the SNP or mutation is known to be associated to disease. We expect that this approach will help in the study of the functional impact of disease-associated SNPs on the behaviour of cell signalling pathways, which ultimately will lead to a better understanding of the mechanisms underlying complex diseases.
Resumo:
Background: A number of studies have used protein interaction data alone for protein function prediction. Here, we introduce a computational approach for annotation of enzymes, based on the observation that similar protein sequences are more likely to perform the same function if they share similar interacting partners. Results: The method has been tested against the PSI-BLAST program using a set of 3,890 protein sequences from which interaction data was available. For protein sequences that align with at least 40% sequence identity to a known enzyme, the specificity of our method in predicting the first three EC digits increased from 80% to 90% at 80% coverage when compared to PSI-BLAST. Conclusion: Our method can also be used in proteins for which homologous sequences with known interacting partners can be detected. Thus, our method could increase 10% the specificity of genome-wide enzyme predictions based on sequence matching by PSI-BLAST alone.