984 results for (D)-SEQUENCES
Abstract:
Genome sequence information has generated increasing evidence for the claim that repetitive DNA sequences present within and around genes could play an important role in the regulation of gene expression. Polypurine/polypyrimidine sequences [poly(Pu/Py)] have been observed in the vicinity of promoters and within the transcribed regions of many genes. To understand whether such sequences influence the level of gene expression, we constructed several prokaryotic and eukaryotic expression vectors incorporating poly(Pu/Py) repeats both within and upstream of a reporter gene, lacZ (encoding β-galactosidase), and studied its expression in vivo. We find that, in contrast to the situation in Escherichia coli, the presence of poly(Pu/Py) sequences within the gene does not significantly inhibit gene expression in mammalian cells. On the other hand, the presence of such sequences upstream of lacZ leads to a several-fold reduction of gene expression in mammalian cells. Similar down-regulation was observed when a structural cassette containing poly(Pu/Py) sequences upstream of lacZ was integrated into yeast chromosome V. Sequence analysis of the nine totally sequenced yeast chromosomes shows that a large number of such sequences occur upstream of ORFs. On the basis of our experimental results and DNA sequence analysis, we propose that these sequences can function as cis-acting transcriptional regulators.
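The sequence analysis mentioned above involves locating poly(Pu/Py) tracts in genomic DNA. The abstract does not say how the search was done, so the following is only a minimal sketch: the function name `find_pu_py_tracts` and the `min_len` threshold of 10 bases are assumptions made for illustration, not the authors' method.

```python
import re


def find_pu_py_tracts(seq, min_len=10):
    """Return (start, end, kind) for each purine-only or pyrimidine-only run.

    min_len is a hypothetical threshold; the abstract gives no cutoff.
    Coordinates are 0-based, end-exclusive, on the strand supplied.
    """
    seq = seq.upper()
    tracts = []
    # Purines are A and G; pyrimidines are C and T.
    for kind, base_class in (("Pu", "[AG]"), ("Py", "[CT]")):
        pattern = f"{base_class}{{{min_len},}}"  # e.g. "[AG]{10,}"
        for match in re.finditer(pattern, seq):
            tracts.append((match.start(), match.end(), kind))
    return sorted(tracts)


if __name__ == "__main__":
    # Made-up example sequence, not taken from the study.
    example = "TTT" + "GA" * 5 + "C"
    print(find_pu_py_tracts(example))
```

A purine tract on one strand is a pyrimidine tract on the complementary strand, so scanning a single strand for both classes covers both orientations.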
Abstract:
Plant seeds contain a large number of protease inhibitors of animal, fungal, and bacterial origin. One of the well-studied families of these inhibitors is the Bowman-Birk family (BBI). The BBIs from dicotyledonous seeds are 8K, double-headed proteins. In contrast, the 8K inhibitors from monocotyledonous seeds are single-headed. Monocots also have a 16K, double-headed inhibitor. We have determined the primary structure of a Bowman-Birk inhibitor from a dicot, horsegram, by sequential Edman analysis of the intact protein and peptides derived from enzymatic and chemical cleavage. The 76-residue-long inhibitor is very similar to that of Macrotyloma axillare. An analysis of this inhibitor along with 26 other Bowman-Birk inhibitor domains (MW 8K) available in the SWISSPROT databank revealed that the proteins from monocots and dicots belong to related but distinct families. Inhibitors from monocots show larger variation in sequence. Sequence comparison shows that a crucial disulphide which connects the amino and carboxy termini of the active site loop is lost in monocots. The loss of a reactive site in monocots seems to be correlated with this. However, it appears that this disulphide is not absolutely essential for retention of inhibitory function. Our analysis suggests that gene duplication leading to a 16K inhibitor in monocots has occurred, probably after the divergence of monocots and dicots, and also after the loss of the second reactive site in monocots.
Abstract:
The crystal structure of a hexamer duplex d(CACGTG)(2) has been determined and refined to an R-factor of 18.3% using X-ray data up to 1.2 Å resolution. The sequence crystallizes as a left-handed Z-form double helix with Watson-Crick base pairing. There is one hexamer duplex, a spermine molecule, 71 water molecules, and an unexpected diamine (Z-5, 1,3-propanediamine, C3H10N2) in the asymmetric unit. This is the high-resolution non-disordered structure of a Z-DNA hexamer containing two AT base pairs in the interior of a duplex with no modifications such as bromination or methylation on cytosine bases. This structure does not possess multivalent cations such as cobalt hexaammine that are known to stabilize Z-DNA. The overall duplex structure and its crystal interactions are similar to those of the pure-spermine form of the d(CGCGCG)(2) structure. The spine of hydration in the minor groove is intact except in the vicinity of the T5A8 base pair. The binding of the Z-5 molecule in the minor groove of the d(CACGTG)(2) duplex appears to have a profound effect in conferring stability to a Z-DNA conformation via electrostatic complementarity and hydrogen bonding interactions. The successive base stacking geometry in d(CACGTG)(2) is similar to the corresponding steps in d(CG)(3). These results suggest that specific polyamines such as Z-5 could serve as powerful inducers of Z-type conformation in unmodified DNA sequences with AT base pairs. This structure provides a molecular basis for stabilizing AT base pairs incorporated into an alternating d(CG) sequence.
Abstract:
The crystal and molecular structure of the ammonium salt of deoxycytidylyl-(3'-5')-deoxyguanosine has been determined from 0.85 Å resolution single crystal X-ray diffraction data. The crystals, obtained by the acetone diffusion technique at -20 °C, are orthorhombic, P2(1)2(1)2(1), a = 12.880(2), b = 17.444(2) and c = 27.642(2) Å. The structure was solved by high resolution Patterson and Fourier methods and refined to R = 0.136. There are two d(CpG) molecules in the asymmetric unit forming a mini left-handed Z-DNA helix. This is in contrast to the earlier reported forms of d(CpG) where the molecules form self base paired duplexes. There are two ammonium ions in the asymmetric unit. The major groove NH4+ ion interacts with N7 of guanines through water bridges besides making H-bonded interactions directly with the phosphate oxygen atoms. A second NH4+ ion is found in the minor groove interacting directly with the phosphate oxygen atoms. Symmetry related molecules pack in such a way that the cytosine base stacks on cytosine and guanine base on guanine. Our structure demonstrates that alternating d(CpG) sequences have the ability to adopt the left-handed Z-DNA structure even at the dimer level, i.e., in a sequence which is only two base pairs long.
Abstract:
Oligonucleotides containing alternating purines-pyrimidines with AT base pairs have been shown to exist in the Z-form, preferably in the solid state. We report that oligodeoxyribonucleotides with GG, TG and CA interruptions in their alternating CG sequences can undergo the B to Z transition in solution in the absence of any chemical modification or topological constraint. The sequences d(CGCGCGGCGCGC) and d(CGTGCGCACG) have been synthesised and shown to adopt the Z-conformation in the presence of millimolar concentrations of Ni2+ under low water activity conditions. The significance of GG, TG and CA interruptions in the B to Z transition is discussed.
Abstract:
L-Lysine D-glutamate crystallizes in the monoclinic space group P2(1) with a = 4.902, b = 30.719, c = 9.679 Å, beta = 90 degrees and Z = 4. The crystals of L-lysine D-aspartate monohydrate belong to the orthorhombic space group P2(1)2(1)2(1) with a = 5.458, b = 7.152, c = 36.022 Å and Z = 4. The structures were solved by direct methods and refined to R values of 0.125 and 0.040 respectively for 1412 and 1503 observed reflections. The glutamate complex is highly pseudosymmetric. The lysine molecules in it assume a conformation with the side chain staggered between the alpha-amino and the alpha-carboxylate groups. The interactions of the side chain amino groups of lysine in the two complexes are such that they form infinite sequences containing alternating amino and carboxylate groups. The molecular aggregation in the glutamate complex is very similar to that observed in L-arginine D-aspartate and L-arginine D-glutamate trihydrate, with the formation of double layers consisting of both types of molecules. In contrast to the situation in the other three LD complexes, the unlike molecules in L-lysine D-aspartate monohydrate aggregate into alternating layers as in the case of most LL complexes. The arrangement of molecules in the lysine layer is nearly the same as in L-lysine L-aspartate, with head-to-tail sequences as the central feature. The arrangement of aspartate ions in the layers containing them is, however, somewhat unusual. Thus the comparison between the LL and the LD complexes analyzed so far indicates that the reversal of chirality of one of the components in a complex leads to profound changes in molecular aggregation, but these changes could be of more than one type.
Abstract:
C5H11NO2·C9H11NO2, Mr = 282.3, P1, a = 5.245(1), b = 5.424(1), c = 14.414(2) Å, α = 97.86(1), β = 93.69(2), γ = 70.48(2)°, V = 356 Å³, Z = 1, Dm = 1.32(2), Dx = 1.32 g cm⁻³, λ(Mo Kα) = 0.7107 Å, μ = 5.9 cm⁻¹, F(000) = 158, T = 298 K, R = 0.035 for 1518 observed reflections with I > 2σ(I). The molecules aggregate in double layers, one layer made up of L-phenylalanine molecules and the other of D-valine molecules. Each double layer is stabilized by interactions involving main-chain atoms of both types of molecules. The interactions include hydrogen bonds which give rise to two head-to-tail sequences. The arrangement of molecules in the complex is almost the same as that in the structure of DL-valine (and DL-leucine and DL-isoleucine) except for the change in the side chain of L molecules. The molecules in crystals containing an equal number of L and D hydrophobic amino-acid molecules thus appear to aggregate in a similar fashion, irrespective of the precise details of the side chain.
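As a consistency check on the crystal data above, the calculated density Dx follows from the formula mass, the number of formula units per cell, and the cell volume: Dx = Z·Mr / (N_A·V). The short sketch below uses only the values quoted in the abstract (Mr = 282.3, Z = 1, V = 356 Å³); the function name `calc_density` is a hypothetical helper introduced for illustration.

```python
AVOGADRO = 6.02214076e23  # mol^-1 (exact SI value)
CM3_PER_A3 = 1e-24        # 1 cubic angstrom expressed in cm^3


def calc_density(mr, z, volume_a3):
    """Calculated crystal density in g cm^-3.

    mr        : formula mass in g mol^-1
    z         : formula units per unit cell
    volume_a3 : unit-cell volume in cubic angstroms
    """
    return z * mr / (AVOGADRO * volume_a3 * CM3_PER_A3)


if __name__ == "__main__":
    dx = calc_density(mr=282.3, z=1, volume_a3=356.0)
    print(round(dx, 2))  # 1.32, matching the Dx quoted in the abstract
```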
Abstract:
Sequence repeats constituting the telomeric regions of chromosomes are known to adopt a variety of unusual structures, consisting of a G tetraplex stem and short stretches of thymines or thymines and adenines forming loops over the stem. Detailed model building and molecular mechanics studies have been carried out for these telomeric sequences to elucidate different types of loop orientations and possible conformations of thymines in the loop. The model building studies indicate that a minimum of two thymines have to be interspersed between guanine stretches to form folded-back structures with loops across adjacent strands in a G tetraplex (both over the small as well as large groove), while the minimum number of thymines required to build a loop across the diagonal strands in a G tetraplex is three. For two repeat sequences, these hairpins, resulting from different types of folding, can dimerize in three distinct ways (with loops across adjacent strands and on the same side, with loops across adjacent strands and on opposite sides, and with loops across diagonal strands and on opposite sides) to form hairpin dimer structures. Energy minimization studies indicate that all possible hairpin dimers have very similar total energy values, though different structures are stabilized by different types of interactions. When the two loops are on the same side, in the hairpin dimer structures of d(G(4)T(n)G(4)), the thymines form favorably stacked tetrads in the loop region and there is interloop hydrogen bonding involving two hydrogen bonds for each thymine-thymine pair. Our molecular mechanics calculations on various folded-back as well as parallel tetraplex structures of these telomeric sequences provide a theoretical rationale for the experimentally observed feature that the presence of intervening thymine stretches stabilizes folded-back structures, while isolated stretches of guanines adopt a parallel tetraplex structure.
Abstract:
Physical clustering of genes has been shown in plants; however, little is known about gene clusters that have different functions, particularly those expressed in the tomato fruit. A class I 17.6 small heat shock protein (Sl17.6 shsp) gene was cloned and used as a probe to screen a tomato (Solanum lycopersicum) genomic library. An 8.3-kb genomic fragment was isolated and its DNA sequence determined. Analysis of the genomic fragment identified intronless open reading frames of three class I shsp genes (Sl17.6, Sl20.0, and Sl20.1), the Sl17.6 gene flanked by Sl20.1 and Sl20.0, with complete 5' and 3' UTRs. Upstream of the Sl20.0 shsp, and within the shsp gene cluster, resides a box C/D snoRNA cluster made of SlsnoR12.1 and SlU24a. Characteristic C and D, and C' and D', boxes are conserved in SlsnoR12.1 and SlU24a, while the upstream flanking region of SlsnoR12.1 carries TATA box 1, homol-E and homol-D box-like cis sequences, TM6 promoter, and an uncharacterized tomato EST. Molecular phylogenetic analysis revealed that this particular arrangement of shsps is conserved in the tomato genome but is distinct from other species. The intronless genomic sequence is decorated with cis elements previously shown to be responsive to cues from plant hormones, dehydration, cold, heat, and MYC/MYB and WRKY71 transcription factors. Chromosomal mapping localized the tomato genomic sequence on the short arm of chromosome 6 in the introgression line (IL) 6-3. Quantitative polymerase chain reaction analysis of gene cluster members revealed differential expression during ripening of tomato fruit, and relatively different abundances in other plant parts.
Abstract:
Receive antenna selection (AS) has been shown to maintain the diversity benefits of multiple antennas while potentially reducing hardware costs. However, the promised diversity gains of receive AS depend on the assumptions of perfect channel knowledge at the receiver and slowly time-varying fading. By explicitly accounting for practical constraints imposed by the next-generation wireless standards such as training, packetization and antenna switching time, we propose a single receive AS method for time-varying fading channels. The method exploits the low training overhead and accuracy possible from reduced-rank subspace projection techniques based on discrete prolate spheroidal (DPS) sequences. It only requires knowledge of the Doppler bandwidth, and does not require detailed correlation knowledge. Closed-form expressions for the channel prediction and estimation error as well as symbol error probability (SEP) of M-ary phase-shift keying (MPSK) for symbol-by-symbol receive AS are also derived. It is shown that the proposed AS scheme, after accounting for the practical limitations mentioned above, outperforms both the ideal conventional single-input single-output (SISO) system with perfect CSI and no AS at the receiver, and AS with conventional estimation based on complex exponential basis functions.
Suite of tools for statistical N-gram language modeling for pattern mining in whole genome sequences
Abstract:
Genome sequences contain a number of patterns that have biomedical significance. Repetitive sequences of various kinds are a primary component of most of the genomic sequence patterns. We extended the suffix-array based Biological Language Modeling Toolkit to compute n-gram frequencies as well as n-gram language-model based perplexity in windows over the whole genome sequence to find biologically relevant patterns. We present the suite of tools and their application to the analysis of the whole human genome sequence.
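The idea of scoring windows of a genome by n-gram perplexity can be illustrated with a small sketch. This is not the Biological Language Modeling Toolkit and it does not use a suffix array: it counts n-grams with a plain `Counter`, and the add-one smoothing, the four-letter alphabet, and the names `ngram_counts` and `window_perplexity` are all assumptions made for illustration.

```python
import math
from collections import Counter

ALPHABET_SIZE = 4  # assumption: the sequence uses only A, C, G, T


def ngram_counts(seq, n):
    """Count every overlapping n-gram in seq."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def window_perplexity(seq, n=3, window=50, step=25):
    """Return (start, perplexity) for each window of the sequence.

    Each window is scored with an add-one-smoothed n-gram model whose
    counts are taken from the whole sequence.
    """
    if n < 2 or window < n:
        raise ValueError("need n >= 2 and window >= n")
    grams = ngram_counts(seq, n)
    contexts = Counter()
    for gram, count in grams.items():
        contexts[gram[:-1]] += count  # how often each (n-1)-symbol context occurs

    results = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        log_prob = 0.0
        n_scored = len(chunk) - n + 1
        for i in range(n_scored):
            gram = chunk[i:i + n]
            p = (grams[gram] + 1) / (contexts[gram[:-1]] + ALPHABET_SIZE)
            log_prob += math.log(p)
        results.append((start, math.exp(-log_prob / n_scored)))
    return results


if __name__ == "__main__":
    # Made-up toy sequence; low perplexity marks repetitive windows.
    for start, ppl in window_perplexity("ACGT" * 30, n=3, window=40, step=20):
        print(start, round(ppl, 3))
```

Windows whose perplexity is low relative to the rest of the sequence are the ones the model finds most predictable, which is one way repetitive regions can stand out.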
Abstract:
The crystal structures of nine peptides containing gamma(4)Val and gamma(4)Leu are described. The short sequences Boc-[gamma(4)(R)Val](2)-OMe 1, Boc-[gamma(4)(R)Val](3)-NHMe 2 and Boc-gamma(4)(S)Val-gamma(4)(R)Val-OMe 3 adopt extended apolar, sheet-like structures. The tetrapeptide Boc-[gamma(4)(R)Val](4)-OMe 4 adopts an extended conformation, in contrast to the folded C-14 helical structure determined previously for Boc-[gamma(4)(R)Leu](4)-OMe. The hybrid alpha gamma sequence Boc-[Ala-gamma(4)(R)Leu](2)-OMe 5 adopts an S-shaped structure devoid of intramolecular hydrogen bonds, with both alpha residues adopting local helical conformations. In sharp contrast, the tetrapeptides Boc-[Aib-gamma(4)(S)Leu](2)-OMe 6 and Boc-[Leu-gamma(4)(R)Leu](2)-OMe 7 adopt folded structures stabilized by two successive C-12 hydrogen bonds. gamma(4)Val residues have also been incorporated into the strand segments of a crystalline octapeptide, Boc-Leu-gamma(4)(R)Val-Val-(D)Pro-Gly-Leu-gamma(4)(R)Val-Val-OMe 8. The gamma gamma delta gamma tetrapeptide containing gamma(4)Val and delta(5)Leu residues adopts an extended sheet-like structure. The hydrogen bonding pattern at gamma residues corresponds to an apolar sheet, while a polar sheet is observed at the lone delta residue. The transition between folded and extended structures at gamma residues involves a change of the torsion angle from the gauche to the trans conformation about the C-beta-C-alpha bond.
Abstract:
The recombination-activating gene products, RAG1 and RAG2, initiate V(D)J recombination during lymphocyte development by cleaving DNA adjacent to conserved recombination signal sequences (RSSs). The reaction involves DNA binding, synapsis, and cleavage at two RSSs located on the same DNA molecule and results in the assembly of antigen receptor genes. Since their discovery, full-length RAG1 and RAG2 have been difficult to purify, and core derivatives are shown to be most active when purified from adherent 293-T cells. However, the protein yield from adherent 293-T cells is limited. Here we develop a human suspension cell purification and change the expression vector to boost RAG production 6-fold. We use these purified RAG proteins to investigate V(D)J recombination on a mechanistic single-molecule level. As a result, we are able to measure the binding statistics (dwell times and binding energies) of the initial RAG binding events with or without its co-factor high mobility group box protein 1 (HMGB1), and to characterize synapse formation at the single-molecule level, yielding insights into the distribution of dwell times in the paired complex and the propensity for cleavage upon forming the synapse. We then go on to investigate HMGB1 further by measuring its compaction of single DNA molecules. We observed concentration-dependent DNA compaction and differential DNA compaction depending on the divalent cation type, and found that at a particular HMGB1 concentration the percentage of DNA compacted is conserved across DNA lengths. Lastly, we investigate another HMGB protein called TFAM, which is essential for packaging the mitochondrial genome. We present crystal structures of TFAM bound to the heavy strand promoter 1 (HSP1) and to nonspecific DNA. We show TFAM dimerization is dispensable for DNA bending and transcriptional activation, but is required for mtDNA compaction. We propose that TFAM dimerization enhances mtDNA compaction by promoting looping of mtDNA.
Abstract:
The δD values of nitrated cellulose from a variety of trees covering a wide geographic range have been measured. These measurements have been used to ascertain which factors are likely to cause δD variations in cellulose C-H hydrogen.
It is found that a primary source of tree δD variation is the δD variation of the environmental precipitation. Superimposed on this are isotopic variations caused by the transpiration of the leaf water incorporated by the tree. The magnitude of this transpiration effect appears to be related to relative humidity.
Within a single tree, it is found that the hydrogen isotope variations which occur for a ring sequence in one radial direction may not be exactly the same as those which occur in a different direction. Such heterogeneities appear most likely to occur in trees with asymmetric ring patterns that contain reaction wood. In the absence of reaction wood such heterogeneities do not seem to occur. Thus, hydrogen isotope analyses of tree ring sequences should be performed on trees which do not contain reaction wood.
Comparisons of tree δD variations with variations in local climate are performed on two levels: spatial and temporal. It is found that the δD values of 20 North American trees from a wide geographic range are reasonably well-correlated with the corresponding average annual temperature. The correlation is similar to that observed for a comparison of the δD values of annual precipitation of 11 North American sites with annual temperature. However, it appears that this correlation is significantly disrupted by trees which grew on poorly drained sites such as those in stagnant marshes. Therefore, site selection may be important in choosing trees for climatic interpretation of δD values, although proper sites do not seem to be uncommon.
The measurement of δD values in 5-year samples from the tree ring sequences of 13 trees from 11 North American sites reveals a variety of relationships with local climate. As it was for the spatial δD vs climate comparison, site selection is also apparently important for temporal tree δD vs climate comparisons. Again, it seems that poorly-drained sites are to be avoided. For nine trees from different "well-behaved" sites, it was found that the local climatic variable best related to the δD variations was not the same for all sites.
Two of these trees showed a strong negative correlation with the amount of local summer precipitation. Consideration of factors likely to influence the isotopic composition of summer rain suggests that rainfall intensity may be important. The higher the intensity, the lower the δD value. Such an effect might explain the negative correlation of δD vs summer precipitation amount for these two trees. A third tree also exhibited a strong correlation with summer climate, but in this instance it was a positive correlation of δD with summer temperature.
The remaining six trees exhibited the best correlation between δD values and local annual climate. However, in none of these six cases was it annual temperature that was the most important variable. In fact, annual temperature commonly showed no relationship at all with tree δD values. Instead, it was found that a simple mass balance model incorporating two basic assumptions yielded parameters which produced the best relationships with tree δD values. First, it was assumed that the δD values of these six trees reflected the δD values of annual precipitation incorporated by these trees. Second, it was assumed that the δD value of the annual precipitation was a weighted average of two seasonal isotopic components: summer and winter. Mass balance equations derived from these assumptions yielded combinations of variables that commonly showed a relationship with tree δD values where none had previously been discerned.
It was found for these "well-behaved" trees that not all sample intervals in a δD vs local climate plot fell along a well-defined trend. These departures from the local δD vs climate norm were defined as "anomalous". Some of these anomalous intervals were common to trees from different locales. When such widespread commonality of an anomalous interval occurred, it was observed that the interval corresponded to an interval in which drought had existed in the North American Great Plains.
Consequently, there appears to be a combination of both local and large scale climatic information in the δD variations of tree cellulose C-H hydrogen.
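The mass balance assumption in this abstract, that the isotopic value of annual precipitation is a weighted average of a summer and a winter component, can be written as a two-term mixing equation. The sketch below assumes the weights are the seasonal precipitation amounts, which the abstract does not state; the function name `annual_isotope_value` and the numbers in the usage example are hypothetical.

```python
def annual_isotope_value(p_summer, iso_summer, p_winter, iso_winter):
    """Two-component mass balance for annual precipitation.

    p_summer, p_winter     : seasonal precipitation amounts (same units);
                             using amounts as weights is an assumption
    iso_summer, iso_winter : seasonal hydrogen isotope values (per mil)
    Returns the amount-weighted annual isotope value (per mil).
    """
    total = p_summer + p_winter
    if total <= 0:
        raise ValueError("total precipitation must be positive")
    return (p_summer * iso_summer + p_winter * iso_winter) / total


if __name__ == "__main__":
    # Hypothetical inputs: 300 mm of summer rain at -40 per mil and
    # 100 mm of winter precipitation at -120 per mil.
    print(annual_isotope_value(300.0, -40.0, 100.0, -120.0))  # -60.0
```

In this form, a change in either the seasonal isotope values or the seasonal split of precipitation shifts the annual value, which is one way a tree record could track variables other than annual temperature.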
Abstract:
Hairpin pyrrole-imidazole polyamides are cell-permeable, sequence-programmable oligomers that bind in the minor groove of DNA. This thesis describes studies of Py-Im polyamides targeted to biologically important DNA repeat sequences for the purpose of modulating disease states. Design of a hairpin polyamide that binds the CG dyad, a site of DNA methylation that can become dysregulated in cancer, is described. We report the synthesis of a DNA methylation antagonist, its sequence specificity and affinity informed by Bind-n-Seq and iteratively designed, which improves inhibitory activity in a cell-free assay by 1000-fold to low nanomolar IC50. Additionally, a hairpin polyamide targeted to the telomeric sequence is found to trigger a slow necrotic-type cell death with the release of inflammatory molecules in a model of B cell lymphoma. The effects of the polyamide are unique in this class of oligomers; its effects are characterized and a functional assay of phagocytosis by macrophages is described. Additionally, hairpin polyamides targeted to pathologically expanded CTG•CAG triplet repeat DNA sequences, the molecular cause of myotonic dystrophy type 1, are synthesized and assessed for toxicity. Lastly, ChIP-seq of Hypoxia-Inducible Factor is performed under hypoxia-induced conditions. The study results show that ChIP-seq can be employed to understand the genome-wide perturbation of Hypoxia-Inducible Factor occupancy by a Py-Im polyamide.