897 results for DNA Sequence, Hidden Markov Model, Bayesian Model, Sensitive Analysis, Markov Chain Monte Carlo


Relevance:

100.00% 100.00%

Publisher:

Abstract:

The availability of complete genome sequences and mRNA expression data for all genes creates new opportunities and challenges for identifying DNA sequence motifs that control gene expression. An algorithm, “MobyDick,” is presented that decomposes a set of DNA sequences into the most probable dictionary of motifs or words. This method is applicable to any set of DNA sequences: for example, all upstream regions in a genome or all genes expressed under certain conditions. Identification of words is based on a probabilistic segmentation model in which the significance of longer words is deduced from the frequency of shorter ones of various lengths, eliminating the need for a separate set of reference data to define probabilities. We have built a dictionary with 1,200 words for the 6,000 upstream regulatory regions in the yeast genome; the 500 most significant words (some with as few as 10 copies in all of the upstream regions) match 114 of 443 experimentally determined sites (a significance level of 18 standard deviations). When analyzing all of the genes up-regulated during sporulation as a group, we find many motifs in addition to the few previously identified by analyzing the expression subclusters individually. Applying MobyDick to the genes derepressed when the general repressor Tup1 is deleted, we find known as well as putative binding sites for its regulatory partners.
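As a rough illustration of what a probabilistic segmentation model involves, the sketch below sums the probability of a sequence over every way of cutting it into dictionary words, using a forward dynamic program. It is a minimal sketch under stated assumptions, not the MobyDick implementation: the function name `segmentation_likelihood`, the toy dictionary, and the assumption that words are drawn independently with fixed probabilities are all illustrative.

```python
def segmentation_likelihood(seq, word_probs):
    """Sum, over all segmentations of seq into dictionary words,
    of the product of the word probabilities (assumed independent draws)."""
    if not word_probs:
        raise ValueError("dictionary must not be empty")
    max_len = max(len(w) for w in word_probs)
    # z[i] holds the total probability of all segmentations of seq[:i].
    z = [0.0] * (len(seq) + 1)
    z[0] = 1.0  # the empty prefix has exactly one (empty) segmentation
    for i in range(1, len(seq) + 1):
        for length in range(1, min(max_len, i) + 1):
            word = seq[i - length:i]
            p = word_probs.get(word)
            if p:
                # extend every segmentation of the shorter prefix by this word
                z[i] += z[i - length] * p
    return z[-1]


# Hypothetical toy dictionary: two single letters and one longer word.
toy_dictionary = {"A": 0.5, "B": 0.3, "AB": 0.2}
likelihood = segmentation_likelihood("ABA", toy_dictionary)
```

In this toy case "ABA" can be read as A|B|A or AB|A, so the longer word contributes only when its probability outweighs the product of its parts, which is the intuition behind judging long words against the frequencies of shorter ones.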

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We describe and test a Markov chain model of microsatellite evolution that can explain the different distributions of microsatellite lengths across different organisms and repeat motifs. Two key features of this model are the dependence of mutation rates on microsatellite length and a mutation process that includes both strand slippage and point mutation events. We compute the stationary distribution of allele lengths under this model and use it to fit DNA data for di-, tri-, and tetranucleotide repeats in humans, mice, fruit flies, and yeast. The best fit results lead to slippage rate estimates that are highest in mice, followed by humans, then yeast, and then fruit flies. Within each organism, the estimates are highest in di-, then tri-, and then tetranucleotide repeats. Our estimates are consistent with experimentally determined mutation rates from other studies. The results suggest that the different length distributions among organisms and repeat motifs can be explained by a simple difference in slippage rates and that selective constraints on length need not be imposed.
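The idea of computing a stationary distribution of allele lengths can be sketched for a simplified case. The example below treats repeat length as a birth-death chain in which a slippage event adds or removes one repeat unit, and solves for the stationary distribution by detailed balance. This is an assumption-laden simplification: it omits the point mutation process described in the abstract, and the function name and the rate values are hypothetical.

```python
def stationary_birth_death(up, down):
    """Stationary distribution of a birth-death chain on states 0..n.

    up[i]   : rate of moving from state i to state i + 1
    down[i] : rate of moving from state i + 1 back to state i
    """
    if len(up) != len(down):
        raise ValueError("up and down must have the same length")
    if any(d <= 0 for d in down):
        raise ValueError("contraction rates must be positive")
    # Detailed balance: pi[i + 1] * down[i] = pi[i] * up[i]
    weights = [1.0]
    for u, d in zip(up, down):
        weights.append(weights[-1] * u / d)
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical length-dependent rates: slippage scales with repeat count,
# and contractions are assumed slightly more likely than expansions.
lengths = range(1, 30)
expansion = [0.9 * k for k in lengths]          # k -> k + 1
contraction = [1.0 * (k + 1) for k in lengths]  # k + 1 -> k
length_distribution = stationary_birth_death(expansion, contraction)
```

Fitting such a distribution to observed repeat-length counts would then amount to choosing the rate parameters that maximize the likelihood of the data, which is outside the scope of this sketch.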

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Introduction of exogenous double-stranded RNA (dsRNA) into Caenorhabditis elegans has been shown to specifically and potently disrupt the activity of genes containing homologous sequences. In this study we present evidence that the primary interference effects of dsRNA are post-transcriptional. First, we examined the primary DNA sequence after dsRNA-mediated interference and found no evidence for alterations. Second, we found that dsRNA-mediated interference with the upstream gene in a polar operon had no effect on the activity of the downstream gene; this finding argues against an effect on initiation or elongation of transcription. Third, we observed by in situ hybridization that dsRNA-mediated interference produced a substantial, although not complete, reduction in accumulation of nascent transcripts in the nucleus, while cytoplasmic accumulation of transcripts was virtually eliminated. These results indicate that the endogenous mRNA is the target for interference and suggest a mechanism that degrades the targeted RNA before translation can occur. This mechanism is not dependent on the SMG system, an mRNA surveillance system in C. elegans responsible for targeting and destroying aberrant messages. We suggest a model of how dsRNA might function in a catalytic mechanism to target homologous mRNAs for degradation.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Ligase-mediated gene detection has proven valuable for detection and precise distinction of DNA sequence variants. We have recently shown that T4 DNA ligase can also be used to distinguish single nucleotide variants of RNA sequences. Here we describe parameters that influence RNA-templated DNA ligation by T4 DNA ligase. The reaction proceeds much more slowly, requiring more enzyme, compared to ligation of the same oligonucleotides hybridized to the corresponding DNA sequence. The reaction is inhibited at high concentrations of ATP and NaCl and both magnesium and manganese ions can support the reaction. We define reaction conditions where 80% of RNA target molecules can template a diagnostic ligation reaction. Ligase-mediated RNA detection should provide a useful mechanism for sensitive and accurate detection and distinction of RNA sequence variants.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The Mouse Genome Database (MGD) is the community database resource for the laboratory mouse, a key model organism for interpreting the human genome and for understanding human biology and disease (http://www.informatics.jax.org). MGD provides standard nomenclature and consensus map positions for mouse genes and genetic markers; it provides a curated set of mammalian homology records, user-defined chromosomal maps, experimental data sets and the definitive mouse ‘gene to sequence’ reference set for the research community. The integration and standardization of these data sets facilitates the transition between mouse DNA sequence, gene and phenotype annotations. A recent focus on allele and phenotype representations enhances the ability of MGD to organize and present data for exploring the relationship between genotype and phenotype. This link between the genome and the biology of the mouse is especially important as phenotype information grows from large mutagenesis projects and genotype information grows from large-scale sequencing projects.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We have implemented an approach for the detection of DNA alterations in cancer by means of computerized analysis of end-labeled genomic fragments, separated in two dimensions. Analysis of two-dimensional patterns of neuroblastoma tumors, prepared by first digesting DNA with the methylation-sensitive restriction enzyme Not I, yielded a multicopy fragment which was detected in some tumor patterns but not in normal controls. Cloning and sequencing of the fragment, isolated from two-dimensional gels, yielded a sequence with a strong homology to a subtelomeric sequence in chimpanzees and which was previously reported to be undetectable in humans. Fluorescence in situ hybridization indicated the occurrence of this sequence in normal tissue, for the most part in the satellite regions of acrocentric chromosomes. A product containing this sequence was obtained by telomere-anchored PCR using as a primer an oligonucleotide sequence from the cloned fragment. Our data suggest demethylation of cytosines at the cloned Not I site and in neighboring DNA in some tumors, compared with normal tissue, and suggest a greater similarity between human and chimpanzee subtelomeric sequences than was previously reported.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We have devised a combinatorial method, restriction endonuclease protection selection and amplification (REPSA), to identify consensus ligand binding sequences in DNA. In this technique, cleavage by a type IIS restriction endonuclease (an enzyme that cleaves DNA at a site distal from its recognition sequence) is prevented by a bound ligand while unbound DNA is cleaved. Since the selection step of REPSA is performed in solution under mild conditions, this approach is amenable to the investigation of ligand-DNA complexes that are either insufficiently stable or not readily separable by other methods. Here we report the use of REPSA to identify the consensus duplex DNA sequence recognized by a G/T-rich oligodeoxyribonucleotide under conditions favoring purine-motif triple-helix formation. Analysis of 47 sequences indicated that recognition between 13 bases on the oligonucleotide 3' end and the duplex DNA was sufficient for triplex formation and indicated the possible existence of a new base triplet, G.AT. This information should help identify appropriate target sequences for purine-motif triplex formation and demonstrates the power of REPSA for investigating ligand-DNA interactions.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Kinetochores are DNA-protein structures that assemble on centromeric DNA and attach chromosomes to spindle microtubules. Because of their simplicity, the 125-bp centromeres of Saccharomyces cerevisiae are particularly amenable to molecular analysis. Budding yeast centromeres contain three sequence elements of which centromere DNA sequence element III (CDEIII) appears to be particularly important. cis-acting mutations in CDEIII and trans-acting mutations in genes encoding subunits of the CDEIII-binding complex (CBF3) prevent correct chromosome transmission. Using temperature-sensitive mutations in CBF3 subunits, we show a strong correlation between DNA-binding activity measured in vitro and kinetochore activity in vivo. We extend previous findings by Goh and Kilmartin [Goh, P.-Y. & Kilmartin, J.V. (1993) J. Cell Biol. 121, 503-512] to argue that DNA-bound CBF3 may be involved in the operation of a mitotic checkpoint but that functional CBF3 is not required for the assembly of a bipolar spindle.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Speech recognition involves three processes: extraction of acoustic indices from the speech signal, estimation of the probability that the observed index string was caused by a hypothesized utterance segment, and determination of the recognized utterance via a search among hypothesized alternatives. This paper is not concerned with the first process. Estimation of the probability of an index string involves a model of index production by any given utterance segment (e.g., a word). Hidden Markov models (HMMs) are used for this purpose [Makhoul, J. & Schwartz, R. (1995) Proc. Natl. Acad. Sci. USA 92, 9956-9963]. Their parameters are state transition probabilities and output probability distributions associated with the transitions. The Baum algorithm that obtains the values of these parameters from speech data via their successive reestimation will be described in this paper. The recognizer wishes to find the most probable utterance that could have caused the observed acoustic index string. That probability is the product of two factors: the probability that the utterance will produce the string and the probability that the speaker will wish to produce the utterance (the language model probability). Even if the vocabulary size is moderate, it is impossible to search for the utterance exhaustively. One practical algorithm is described [Viterbi, A. J. (1967) IEEE Trans. Inf. Theory IT-13, 260-267] that, given the index string, has a high likelihood of finding the most probable utterance.
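The search for the most probable hidden state sequence given an observed index string can be sketched with a textbook Viterbi recursion. This is a minimal sketch, not the recognizer described in the abstract: it assumes outputs are emitted by states rather than by transitions, works on a tiny discrete model, and uses hypothetical state names, symbols, and probabilities.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (most probable state path, its joint probability)."""
    if not observations:
        raise ValueError("need at least one observation")
    # best[t][s]: probability of the best path that ends in state s at time t
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            # pick the predecessor that maximizes the path probability
            prev = max(states, key=lambda r: best[t - 1][r] * trans_p[r][s])
            best[t][s] = (best[t - 1][prev] * trans_p[prev][s]
                          * emit_p[s][observations[t]])
            back[t][s] = prev
    # trace the best path backwards from the best final state
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


# Hypothetical two-state model over three output symbols.
states = ("s1", "s2")
start_p = {"s1": 0.6, "s2": 0.4}
trans_p = {"s1": {"s1": 0.7, "s2": 0.3}, "s2": {"s1": 0.4, "s2": 0.6}}
emit_p = {"s1": {"x": 0.5, "y": 0.4, "z": 0.1},
          "s2": {"x": 0.1, "y": 0.3, "z": 0.6}}
path, prob = viterbi(["x", "y", "z"], states, start_p, trans_p, emit_p)
```

For long observation strings the products underflow, so a practical version would typically add log probabilities instead of multiplying raw ones.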

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The Escherichia coli cytosolic homotetrameric protein SecB is known to be involved in protein export across the plasma membrane. A currently prevalent view holds that SecB functions exclusively as a chaperone interacting nonspecifically with unfolded proteins, not necessarily exported proteins, whereas a contrary view holds that SecB functions primarily as a specific signal-recognition factor--i.e., in binding to the signal sequence region of exported proteins. To experimentally resolve these differences we assayed for binding between chemically pure SecB and chemically pure precursor (p) form (containing a signal sequence) and mature (m) form (lacking a signal sequence) of a model secretory protein (maltose binding protein, MBP) that was C-terminally truncated. Because of the C-terminal truncation, neither p nor m was able to fold. We found that SecB bound with 100-fold higher affinity to p (Kd 0.8 nM) than it bound to m (Kd 80 nM). As the presence of the signal sequence in p is the only feature that distinguished p from m, these data strongly suggest that the high-affinity binding of SecB is to the signal sequence region and not the mature region of p. Consistent with this conclusion, we found that a wild-type signal peptide, but not an export-incompetent mutant signal peptide of another exported protein (LamB), competed for binding to p. Moreover, the high-affinity binding of SecB to p was resistant to 1 M salt, whereas the low-affinity binding of SecB to m was not. These qualitative differences suggested that SecB binding to m was primarily by electrostatic interactions, whereas SecB binding to p was primarily via hydrophobic interactions, presumably with the hydrophobic core of the signal sequence. Taken together our data strongly support the notion that SecB is primarily a specific signal-recognition factor.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The bithorax complex (BX-C) of Drosophila, one of two complexes that act as master regulators of the body plan of the fly, has now been entirely sequenced and comprises approximately 315,000 bp, only 1.4% of which codes for protein. Analysis of this sequence reveals significantly overrepresented DNA motifs of unknown, as well as known, functions in the non-protein-coding portion of the sequence. The following types of motifs in that portion are analyzed: (i) concatamers of mono-, di-, and trinucleotides; (ii) tightly clustered hexanucleotides (spaced ≤ 5 bases apart); (iii) direct and reverse repeats longer than 20 bp; and (iv) a number of motifs known from biochemical studies to play a role in the regulation of the BX-C. The hexanucleotide AGATAC is remarkably overrepresented and is surmised to play a role in chromosome pairing. The positions of sites of highly overrepresented motifs are plotted for those that occur at more than five sites in the sequence when fewer than 0.5 occurrences are expected. Expected values are based on a third-order Markov chain, which is the optimal order for representing the BXCALL sequence.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The products of the recB and recC genes are necessary for conjugal recombination and for repair of chromosomal double-chain breaks in Escherichia coli. The recD gene product combines with the RecB and RecC proteins to comprise RecBCD enzyme but is required for neither recombination nor repair. On the contrary, RecBCD enzyme is an exonuclease that inhibits recombination by destroying linear DNA. The RecD ejection model proposes that RecBCD enzyme enters a DNA duplex at a double-chain end and travels destructively until it encounters the recombination hot spot sequence chi. Chi then alters the RecBCD enzyme by weakening the affinity of the RecD subunit for the RecBC heterodimer. With the loss of the RecD subunit, the resulting protein, RecBC(D-), becomes deficient for exonuclease activity and proficient as a recombinagenic helicase. To test the model, genetic crosses between lambda phage were conducted in cells containing chi on a nonhomologous plasmid. Upon delivering a double-chain break to the plasmid, lambda recombined as if the cells had become recD mutants. The ability of chi to alter lambda recombination in trans was reversed by overproducing the RecD subunit. These results indicate that chi can influence a recombination act without directly participating in it.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We have explored the feasibility of using a "double-tagging" assay for assessing which amino acids of a protein are responsible for its binding to another protein. We have chosen the adenovirus E1A-retinoblastoma gene product (pRB) proteins for a model system, and we focused on the high-affinity conserved region 2 of adenovirus E1A (CR2). We used site-specific mutagenesis to generate a mutant E1A gene with a lysine instead of an aspartic acid at position 121 within the CR2 site. We demonstrated that this mutant exhibited little binding to pRB by the double-tagging assay. We also have shown that this lack of binding is not due to any significant decrease in the level of expression of the beta-galactosidase-E1A fusion protein. We then created a "library" of phage expressing beta-galactosidase-E1A fusion proteins with a variety of different mutations within CR2. This library of E1A mutations was used in a double-tagging screening to identify mutant clones that bound to pRB. Three classes of phage were identified: the vast majority of clones were negative and exhibited no binding to pRB. Approximately 1 in 10,000 bound to pRB but not to E1A ("true positives"). A variable number of clones appeared to bind equally well to both pRB and E1A ("false positives"). The DNA sequence of 10 true positive clones yielded the following consensus sequence: DLTCXEX, where X = any amino acid. The recovery of positive clones with only one of several allowed amino acids at each position suggests that most, if not all, of the conserved residues play an important role in binding to pRB. On the other hand, the DNA sequence of the negative clones appeared random. These results are consistent with those obtained from other sources. These data suggest that a double-tagging assay can be employed for determining which amino acids of a protein are important for specifying its interaction with another protein if the complex forms within bacteria. 
This assay is rapid, and up to 1 × 10^6 mutations can be screened at one time.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We present a microcanonical Monte Carlo simulation of the site-diluted Potts model in three dimensions with eight internal states, partly carried out on the citizen supercomputer Ibercivis. Upon dilution, the pure model’s first-order transition becomes of the second order at a tricritical point. We compute accurately the critical exponents at the tricritical point. As expected from the Cardy-Jacobsen conjecture, they are compatible with their random field Ising model counterpart. The conclusion is further reinforced by comparison with older data for the Potts model with four states.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We investigate the critical properties of the four-state commutative random permutation glassy Potts model in three and four dimensions by means of Monte Carlo simulations and a finite-size scaling analysis. By using a field programmable gate array, we have been able to thermalize a large number of samples of systems with large volume. This has allowed us to observe a spin-glass ordered phase in d=4 and to study the critical properties of the transition. In d=3, our results are consistent with the presence of a Kosterlitz-Thouless transition, but also with different scenarios: transient effects due to a value of the lower critical dimension slightly below 3 could be very important.