963 results for Genomic sequence database


Relevance:

20.00%

Abstract:

Repeats are two or more contiguous segments of amino acid residues that are believed to have arisen as a result of intragenic duplication, recombination and mutation events. These repeats can be utilized for protein structure prediction and can provide insights into protein evolution and phylogenetic relationships. Therefore, to aid structural biologists and phylogeneticists in their research, a computing resource (a web server and a database), Repeats in Protein Sequences (RPS), has been created. Using RPS, users can obtain useful information regarding identical, similar and distant repeats (of varying lengths) in protein sequences. In addition, users can check the frequency of occurrence of the repeats in sequence databases such as the Genome Database, PIR and SWISS-PROT and among the protein sequences available in the Protein Data Bank archive. Furthermore, users can view the three-dimensional structure of the repeats using the Java visualization plug-in Jmol. The proposed computing resource can be accessed over the World Wide Web at http://bioserver1.physics.iisc.ernet.in/rps/.

Relevance:

20.00%

Abstract:

Differential organisation of homologous chromosomes is related to both sex determination and genomic imprinting in coccid insects, the mealybugs. We report here the identification of two middle repetitive sequences that are differentially organised between the two sexes and also within the same diploid nucleus. These two sequences form a part of the male-specific nuclease-resistant chromatin (NRC) fraction of the mealybug Planococcus lilacinus. To understand the phenomenon of differential organisation, we have analysed the components of NRC by cloning the DNA sequences present and deciphering their primary sequence, nucleosomal organisation, genomic distribution and cytological localisation. Our observations suggest that the middle repetitive sequences within NRC are functionally significant, and we discuss their probable involvement in male-specific chromatin organisation.

Relevance:

20.00%

Abstract:

In this paper, we report an analysis of the protein sequence length distribution for 13 bacteria, four archaea and one eukaryote whose genomes have been completely sequenced. The frequency distributions of protein sequence length for all 18 organisms are remarkably similar, independent of genome size, and can be described in terms of a lognormal probability distribution function. A simple stochastic model based on multiplicative processes has been proposed to explain the sequence length distribution. The stochastic model supports the random-origin hypothesis of protein sequences in genomes. Distributions of large proteins deviate from the overall lognormal behavior. Their cumulative distribution follows a power law analogous to Pareto's law used to describe the income distribution of the wealthy. The protein sequence length distribution in genomes of organisms has important implications for microbial evolution and applications. (C) 1999 Elsevier Science B.V. All rights reserved.

Relevance:

20.00%

Abstract:

We give a detailed construction of a finite-state transition system for a com-connected Message Sequence Graph (MSG). Though this result is well known in the literature and forms the basis for the solution to several analysis and verification problems concerning MSG specifications, the constructions given in the literature are either not amenable to implementation, or imprecise, or simply incorrect. In contrast, we give a detailed construction along with a proof of its correctness. Our transition system is amenable to implementation, and can also be used for a bounded analysis of general (not necessarily com-connected) MSG specifications.

Relevance:

20.00%

Abstract:

The Walker sequence, GXXXXGKT, present in all six subunits of F-1-ATPase, exists in a folded form known as the phosphate-binding loop (P-loop). Analysis of the Ramachandran angles showed only a small RMS deviation between the nucleotide-bound and nucleotide-free forms. This indicated a good overlap of the backbone loops. The catalytic beta-subunits (chains D, E and F) showed significant changes in the Ramachandran angles and the side chain torsion angles, but the structural alpha-subunits (chains A, B and C) did not. Most striking among these are the changes associated with Val160 and Gly161, corresponding to a flip in the peptide unit between them when a nucleotide is bound (chains D or F compared to nucleotide-free chain E). The conformational analysis further revealed a hitherto unnoticed hydrogen bond between the amide-N of the flipped Gly161 and the terminal phosphate-O of the nucleotide. This assigns a role to this conserved amino acid, otherwise ignored, of making an unusual direct interaction between the peptide backbone of the enzyme protein and the incoming nucleotide substrate. The significance of this interaction is enhanced because it is limited to the catalytic subunits and is also likely to involve a mechanical rotation of bonds of the peptide unit. This interaction may be part of the overall events that link the chemical hydrolysis of ATP with the mechanical rotation of this molecule, now famous as a tiny molecular motor.
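The consensus GXXXXGKT quoted above can be located in a protein sequence with a simple pattern search. The following is only a minimal sketch: the function name `find_walker_a` and the toy sequence are illustrative assumptions, not part of the original study, and the pattern encodes the motif exactly as written in the abstract (X is read as "any residue").

```python
import re

# Walker (P-loop) consensus as written in the abstract:
# G, any four residues, then G-K-T.
WALKER_A = re.compile(r"G.{4}GKT")

def find_walker_a(sequence):
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in WALKER_A.finditer(sequence.upper())]

# Hypothetical toy sequence used only for illustration.
toy = "MSLRAGGAGVGKTVLIAE"
print(find_walker_a(toy))
```

A regular expression is enough here because the motif has fixed length; degenerate variants of the motif would need a broader pattern.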

Relevance:

20.00%

Abstract:

Rv2118c belongs to the class of conserved hypothetical proteins from Mycobacterium tuberculosis H37Rv. The crystal structure of Rv2118c in complex with S-adenosyl-L-methionine (AdoMet) has been determined at 1.98 Å resolution. The crystallographic asymmetric unit consists of a monomer, but symmetry-related subunits interact extensively, leading to a tetrameric structure. The structure of the monomer can be divided functionally into two domains: the larger catalytic C-terminal domain, which binds the cofactor AdoMet and is involved in the transfer of a methyl group from AdoMet to the substrate, and a smaller N-terminal domain. The structure of the catalytic domain is very similar to that of other AdoMet-dependent methyltransferases. The N-terminal domain is primarily a β-structure with a fold not found in other methyltransferases of known structure. Database searches reveal a conserved family of Rv2118c-like proteins from various organisms. Multiple sequence alignments show several regions of high sequence similarity (motifs) in this family of proteins. Structure analysis and homology to yeast Gcd14p suggest that Rv2118c could be an RNA methyltransferase, but further studies are required to establish its functional role conclusively.

Relevance:

20.00%

Abstract:

A fundamental task in bioinformatics involves the transfer of knowledge from one protein molecule onto another by way of recognizing similarities. Such similarities are obtained at different levels: that of sequence, whole fold, or important substructures. Comparison of binding sites is important to understand functional similarities among proteins and also to understand drug cross-reactivities. Current methods in the literature have their own merits and demerits, warranting exploration of newer concepts and algorithms, especially for large-scale comparisons and for obtaining accurate residue-wise mappings. Here, we report the development of a new algorithm, PocketAlign, for obtaining structural superpositions of binding sites. The software is available as a web service at http://proline.physics.iisc.ernet.in/pocketalign/. The algorithm encodes shape descriptors in the form of geometric perspectives, supplemented by chemical group classification. The shape descriptor considers several perspectives with each residue as the focus and captures the relative distribution of residues around it in a given site. Residue-wise pairings are computed by comparing the set of perspectives of the first site with that of the second, followed by a greedy approach that incrementally combines residue pairings into a mapping. The mappings in different frames are then evaluated by different metrics encoding the extent of alignment of individual geometric perspectives. Different initial seed alignments are computed, each subsequently extended by detecting consequential atomic alignments in a three-dimensional grid, and the best 500 are stored in a database. Alignments are then ranked, and the top-scoring alignments are reported and streamed into PyMOL for visualization and analysis. The method is validated for accuracy and sensitivity and benchmarked against existing methods.
An advantage of PocketAlign, as compared to some of the existing tools for binding site comparison in the literature, is that it explores different schemes for identifying an alignment and thus has a better potential to capture similarities in ligand recognition abilities. PocketAlign, by finding a detailed alignment of a pair of sites, provides insights as to why two sites are similar and which set of residues and atoms contribute to the similarity.

Relevance:

20.00%

Abstract:

A simple and convenient tandem methodology for the enantiospecific generation of functionalised bicyclo[3.3.1]nonanes 9, 14-18, via intermolecular alkylation of Michael donors with 10-bromocarvones 7, 10 and 11, followed by intramolecular Michael addition, is achieved. An unsuccessful attempt to extend the methodology to a possible short enantiospecific approach to the AB-ring system 22 of taxanes, via the allyl bromide 21, is also described.

Relevance:

20.00%

Abstract:

Tuberous sclerosis complex (TSC) is an autosomal dominant disorder with loci on chromosome 9q34.12 (TSC1) and chromosome 16p13.3 (TSC2). Genes for both loci have been isolated and characterized. The promoters of both genes have not been characterized so far, and little is known about the regulation of these genes. This study reports the characterization of the human TSC1 promoter region for the first time. We have identified a novel alternative isoform in the 5' untranslated region (UTR) of the TSC1 gene transcript involving exon 1. Alternative isoforms in the 5' UTR of the mouse Tsc1 gene transcript involving exon 1 and exon 2 have also been identified. We have identified three upstream open reading frames (uORFs) in the 5' UTR of the TSC1/Tsc1 gene. A comparative study of the 5' UTR of the TSC1/Tsc1 gene has revealed a high degree of similarity not only in the sequence but also in the splicing pattern of the human and mouse genes. We have used PCR methodology to isolate approximately 1.6 kb of genomic DNA 5' to the TSC1 cDNA. This sequence has directed a high level of expression of luciferase activity in both HeLa and HepG2 cells. Successive 5' and 3' deletion analysis has suggested that a 587-bp region, from position +77 to -510 relative to the transcription start site (TSS), contains the promoter activity. Interestingly, this region contains no consensus TATA box or CAAT box. However, a 521-bp fragment surrounding the TSS exhibits the characteristics of a CpG island, which overlaps with the promoter region. The identification of the TSC1 promoter region will help in designing a suitable strategy to identify mutations in this region in patients who do not show any mutations in the coding regions. It will also help to study the regulation of the TSC1 gene and its role in tumorigenesis. (C) 2003 Elsevier B.V. All rights reserved.
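The abstract does not state which criteria were used to call the 521-bp fragment a CpG island. A commonly used rule of thumb (an assumption here, not taken from the study) looks at GC content and at the ratio of observed to expected CpG dinucleotides. The sketch below computes both quantities; the function name `cpg_stats` and the input strings are hypothetical.

```python
def cpg_stats(seq):
    """Return (GC content, observed/expected CpG ratio) for a DNA string.

    observed/expected = (number of CG dinucleotides * length)
                        / (number of C * number of G)
    """
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CG cannot overlap itself, so count() is exact
    gc_content = (c + g) / n if n else 0.0
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_content, obs_exp

# Toy input for illustration only; a real analysis would use the promoter fragment.
print(cpg_stats("CGCGCG"))
```

Thresholds often quoted for a CpG island are a GC content above 0.5 and an observed/expected ratio above 0.6 over at least 200 bp; whether the authors applied these values is not known from the text.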

Relevance:

20.00%

Abstract:

Two families of low-correlation QAM sequences are presented here. In a CDMA setting, these sequences have the ability to transport a large amount of data as well as enable variable-rate signaling on the reverse link. The first family, $\mathcal{I}^{2}SQ$-$B$, is constructed by interleaving 2 selected QAM sequences. This family is defined over $M^2$-QAM, where $M = 2^m$, $m \geq 2$. Over 16-QAM, the normalized maximum correlation $\bar{\theta}_{\max}$ is bounded above by $\lesssim 1.17\sqrt{N}$, where $N$ is the period of the sequences in the family. This upper bound on $\bar{\theta}_{\max}$ is the lowest among all known sequence families over 16-QAM. The second family, $\mathcal{I}^{4}SQ$, is constructed by interleaving 4 selected QAM sequences. This family is defined over $M^2$-QAM, where $M = 2^m$, $m \geq 3$, i.e., 64-QAM and beyond. The $\bar{\theta}_{\max}$ for sequences in this family over 64-QAM is upper bounded by $\lesssim 1.60\sqrt{N}$. For large $M$, $\bar{\theta}_{\max} \lesssim 1.64\sqrt{N}$. These upper bounds on $\bar{\theta}_{\max}$ are the lowest among all known sequence families over $M^2$-QAM, $M = 2^m$, $m \geq 3$.

Relevance:

20.00%

Abstract:

Conventional hardware implementation techniques for FIR filters require that the filter coefficients be computed in software and stored in memory. This approach is static in the sense that any further fine-tuning of the filter requires computation of new coefficients in software. In this paper, we propose an alternative technique for implementing FIR filters in hardware. We store a considerably large number of impulse response coefficients of the ideal filter (having a box-type frequency response) in memory. We then perform the windowing process on these coefficients in hardware, using integer sequences as window functions. The integer sequences are also generated in hardware. This approach offers flexibility in fine-tuning the filter, such as varying the transition bandwidth around a particular cutoff frequency.
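The windowing idea can be illustrated in software. The sketch below multiplies the truncated impulse response of an ideal low-pass filter by an integer-valued window. The abstract does not say which integer sequences are used, so the triangular sequence 1, 2, ..., 2, 1 is purely an assumption, as are the function names and the final normalisation to unit DC gain.

```python
import math

def ideal_lowpass(num_taps, cutoff):
    """Truncated impulse response of an ideal (box-shaped) low-pass filter.

    cutoff is the normalised cutoff in cycles per sample (0 < cutoff < 0.5).
    """
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        if k == 0:
            taps.append(2 * cutoff)  # limit of sin(2*pi*fc*k)/(pi*k) at k = 0
        else:
            taps.append(math.sin(2 * math.pi * cutoff * k) / (math.pi * k))
    return taps

def triangular_integer_window(num_taps):
    """Integer window 1, 2, ..., peak, ..., 2, 1 (an assumed example sequence)."""
    return [min(n + 1, num_taps - n) for n in range(num_taps)]

def windowed_fir(num_taps, cutoff):
    """Apply the integer window and rescale so the taps sum to 1 (unit DC gain)."""
    h = ideal_lowpass(num_taps, cutoff)
    w = triangular_integer_window(num_taps)
    taps = [hi * wi for hi, wi in zip(h, w)]
    dc_gain = sum(taps)
    return [t / dc_gain for t in taps]

coeffs = windowed_fir(21, 0.1)
print(len(coeffs), round(sum(coeffs), 6))
```

Because the window values are small integers, the multiplication maps naturally onto adders or shift-and-add logic; changing the window sequence changes the transition bandwidth without recomputing the stored ideal response.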

Relevance:

20.00%

Abstract:

In this article, we consider the single-machine scheduling problem with past-sequence-dependent (p-s-d) setup times and a learning effect. The setup times are proportional to the length of the jobs that are already scheduled, i.e. they are p-s-d setup times. The learning effect reduces the actual processing time of a job because the workers perform the same job or activity repeatedly. Hence, the processing time of a job depends on its position in the sequence. In this study, we consider the total absolute difference in completion times (TADC) as the objective function. This problem is denoted as 1/LE, (Spsd)/TADC in Kuo and Yang (2007) ('Single Machine Scheduling with Past-sequence-dependent Setup Times and Learning Effects', Information Processing Letters, 102, 22-26). There are two parameters, a and b, denoting the constant learning index and the normalising index, respectively. A parametric analysis of b on the 1/LE, (Spsd)/TADC problem for a given value of a is carried out in this study. In addition, a computational algorithm is developed to obtain the number of optimal sequences and the range of b in which each of the sequences is optimal, for a given value of a. We derive two bounds: b* for the normalising constant b and a* for the learning index a. We also show that, when a < a* or b > b*, the optimal sequence is obtained by placing the longest job in the first position and the rest of the jobs in shortest processing time (SPT) order.
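To make the objective concrete, the sketch below evaluates TADC for a given job sequence. The abstract does not give the formulas, so the model used here is an assumption: the job in position r has actual processing time p * r^a (with a <= 0 for learning), and the setup time before position r is b times the sum of the actual processing times of the jobs already scheduled. The function names are hypothetical.

```python
def completion_times(jobs, a, b):
    """Completion times for jobs processed in the given order.

    jobs: normal processing times in sequence order.
    a: learning index (assumed model: actual time = p * position**a).
    b: normalising index (assumed model: setup = b * sum of earlier actual times).
    """
    completions = []
    clock = 0.0
    processed = 0.0  # sum of actual processing times of jobs already scheduled
    for position, p in enumerate(jobs, start=1):
        actual = p * position ** a
        setup = b * processed
        clock += setup + actual
        processed += actual
        completions.append(clock)
    return completions

def tadc(jobs, a, b):
    """Total absolute difference in completion times over all job pairs."""
    c = completion_times(jobs, a, b)
    return sum(abs(c[j] - c[i])
               for i in range(len(c))
               for j in range(i + 1, len(c)))

# Longest job first, remaining jobs in shortest-processing-time order.
print(tadc([3, 1, 2], a=0, b=1))
```

Evaluating `tadc` over all permutations of a small job set, for a grid of b values, gives a brute-force way to see how the optimal sequence changes with b under this assumed model.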