997 results for SIMPLE SEQUENCES
Abstract:
Genetic diversity and population structure of Plasmodium vivax parasites can predict the origin and spread of novel variants within a population, enabling population-specific malaria control measures. We analyzed the genetic diversity and population structure of 425 P. vivax isolates from Sri Lanka, Myanmar, and Ethiopia using 12 trinucleotide and tetranucleotide microsatellite markers. All three parasite populations were highly polymorphic with 3-44 alleles per locus. Approximately 65% were multiple-clone infections. Mean genetic diversity (H(E)) was 0.7517 in Ethiopia, 0.8450 in Myanmar, and 0.8610 in Sri Lanka. Significant linkage disequilibrium was maintained. Population structure showed two clusters (Asian and African) according to geography and ancestry. Strong clustering of outbreak isolates from Sri Lanka and Ethiopia was observed. Predictive power of ancestry using two-thirds of the isolates as a model identified 78.2% of isolates accurately as being African or Asian. Microsatellite analysis is a useful tool for mapping short-term outbreaks of malaria and for predicting ancestry.
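The abstract reports H(E) per population but does not state how it was computed. As a minimal sketch, assuming the commonly used unbiased estimator H_E = [n/(n-1)] (1 - sum of p_i^2) for a single locus, where n is the number of sampled alleles and p_i the frequency of allele i (the function name and the input format are hypothetical):

```python
from collections import Counter

def expected_heterozygosity(alleles):
    """Unbiased expected heterozygosity at one locus.

    `alleles` is a list of allele calls (e.g. fragment lengths), one per isolate.
    Assumption: the estimator n/(n-1) * (1 - sum(p_i^2)); the abstract does not
    say which estimator the authors used.
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two allele calls")
    counts = Counter(alleles)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)

# Two alleles at equal frequency in four isolates
print(expected_heterozygosity([120, 120, 124, 124]))
```

A mean H(E) such as those quoted above would then be the average of this quantity over the 12 loci.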
Abstract:
A computational system for the prediction of polymorphic loci directly and efficiently from human genomic sequence was developed and verified. A suite of programs, collectively called pompous (polymorphic marker prediction of ubiquitous simple sequences) detects tandem repeats ranging from dinucleotides up to 250 mers, scores them according to predicted level of polymorphism, and designs appropriate flanking primers for PCR amplification. This approach was validated on an approximately 750-kilobase region of human chromosome 3p21.3, involved in lung and breast carcinoma homozygous deletions. Target DNA from 36 paired B lymphoblastoid and lung cancer lines was amplified and allelotyped for 33 loci predicted by pompous to be variable in repeat size. We found that among those 36 predominately Caucasian individuals 22 of the 33 (67%) predicted loci were polymorphic with an average heterozygosity of 0.42. Allele loss in this region was found in 27/36 (75%) of the tumor lines using these markers. pompous provides the genetic researcher with an additional tool for the rapid and efficient identification of polymorphic markers, and through a World Wide Web site, investigators can use pompous to identify polymorphic markers for their research. A catalog of 13,261 potential polymorphic markers and associated primer sets has been created from the analysis of 141,779,504 base pairs of human genomic sequence in GenBank. This data is available on our Web site (pompous.swmed.edu) and will be updated periodically as GenBank is expanded and algorithm accuracy is improved.
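The detection step described above (finding tandem repeats in genomic sequence) can be illustrated with a short regular-expression scan. This is only a sketch of the general idea, not the pompous implementation: the function name, the unit-length limits and the minimum copy number are assumptions, and the scoring and primer-design stages are not attempted.

```python
import re

def find_tandem_repeats(seq, min_unit=2, max_unit=6, min_copies=4):
    """Return (start, unit, copies) for perfect tandem repeats in `seq`.

    Assumed thresholds: repeat units of 2-6 bases with at least 4 copies.
    The lazy quantifier makes the regex prefer the shortest repeat unit.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        copies = len(m.group(0)) // len(unit)
        hits.append((m.start(), unit, copies))
    return hits

print(find_tandem_repeats("TTG" + "AC" * 5 + "GGT"))
```

Only perfect repeats are reported here; a homopolymer run long enough would also be reported with a two-base unit such as "AA".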
Abstract:
The objective of this work was to identify expressed simple sequence repeat (SSR) markers associated with leaf miner resistance in coffee progenies. Identification of SSR markers was accomplished by directed searches on the Brazilian Coffee Expressed Sequence Tags (EST) database. Sequence analysis of 32 selected SSR loci showed that 65% of the repeats are tetra-, 21% tri- and 14% dinucleotides. Also, expressed SSRs are frequently located in the 5'-UTR of gene transcripts. Moreover, most of the genes containing SSRs are associated with defense mechanisms. Polymorphisms were analyzed in progenies segregating for resistance to the leaf miner and corresponding to advanced generations of a Coffea arabica x Coffea racemosa hybrid. Frequency of SSR alleles was 2.1 per locus. However, no polymorphism associated with leaf miner resistance was identified. These results suggest that marker-assisted selection in coffee breeding should be performed on the initial cross, in which genetic variability is still significant.
Abstract:
The availability of chloroplast genome (cpDNA) sequences of Atropa belladonna, Nicotiana sylvestris, N. tabacum, N. tomentosiformis, Solanum bulbocastanum, S. lycopersicum and S. tuberosum, which are Solanaceae species, allowed us to analyze the organization of cpSSRs in their genic and intergenic regions. In general, the number of cpSSRs in cpDNA ranged from 161 in S. tuberosum to 226 in N. tabacum, and the number of intergenic cpSSRs was higher than that of genic cpSSRs. Mononucleotide repeats were the most frequent in the studied species, but we also identified di-, tri-, tetra-, penta- and hexanucleotide repeats. Multiple alignments of all cpSSR sequences from the Solanaceae species made the identification of nucleotide variability possible, and the phylogeny was estimated by maximum parsimony. Our study showed that the plastome database can be exploited for phylogenetic analyses and biotechnological approaches.
Abstract:
This work addresses the question of whether it is possible to define simple pairwise interaction terms to approximate free energies of proteins or polymers. Rather than ask how reliable a potential of mean force is, one can ask how reliable it could possibly be. In a two-dimensional, infinite lattice model system one can calculate exact free energies by exhaustive enumeration. A series of approximations were fitted to exact results to assess the feasibility and utility of pairwise free energy terms. Approximating the true free energy with pairwise interactions gives a poor fit with little transferability between systems of different size. Adding extra artificial terms to the approximation yields better fits, but does not improve the ability to generalize from one system size to another. Furthermore, one cannot distinguish folding from nonfolding sequences via the approximated free energies. Most usefully, the methodology shows how one can assess the utility of various terms in lattice protein/polymer models. (C) 2001 American Institute of Physics.
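To make "exact free energies by exhaustive enumeration" concrete, the sketch below enumerates every self-avoiding walk of a short chain on a square lattice and evaluates F = -kT ln Z. The energy model is an assumption for illustration only (an HP-style chain with -1 per non-bonded H-H contact); the abstract does not specify the authors' model, and the function names are hypothetical.

```python
import math

MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))

def contact_energy(path, seq):
    """Assumed energy: -1 for each pair of H residues that are lattice
    neighbours but not adjacent along the chain."""
    index = {p: i for i, p in enumerate(path)}
    energy = 0
    for i, (x, y) in enumerate(path):
        if seq[i] != "H":
            continue
        # Look only in +x and +y so every contact is counted once.
        for nb in ((x + 1, y), (x, y + 1)):
            j = index.get(nb)
            if j is not None and abs(i - j) > 1 and seq[j] == "H":
                energy -= 1
    return energy

def enumerate_energies(seq):
    """Energies of all self-avoiding conformations (no symmetry reduction)."""
    energies = []
    path, occupied = [(0, 0)], {(0, 0)}

    def grow():
        if len(path) == len(seq):
            energies.append(contact_energy(path, seq))
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:
                occupied.add(nxt)
                path.append(nxt)
                grow()
                path.pop()
                occupied.remove(nxt)

    grow()
    return energies

def free_energy(seq, kT=1.0):
    """Exact free energy F = -kT ln Z from the full enumeration."""
    z = sum(math.exp(-e / kT) for e in enumerate_energies(seq))
    return -kT * math.log(z)

print(free_energy("HPPH"))
```

The number of conformations grows exponentially with chain length, which is why this kind of exact reference is limited to small systems.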
Abstract:
Microsatellites or simple sequence repeats (SSRs) are ubiquitous in eukaryotic genomes. Single-locus SSR markers have been developed for a number of species, although there is a major bottleneck in developing SSR markers whereby flanking sequences must be known to design 5'-anchors for polymerase chain reaction (PCR) primers. Inter SSR (ISSR) fingerprinting was developed such that no sequence knowledge was required. Primers based on a repeat sequence, such as (CA)(n), can be made with a degenerate 3'-anchor, such as (CA)(8)RG or (AGC)(6)TY. The resultant PCR reaction amplifies the sequence between two SSRs, yielding a multilocus marker system useful for fingerprinting, diversity analysis and genome mapping. PCR products are radiolabelled with P-32 or P-33 via end-labelling or PCR incorporation, and separated on a polyacrylamide sequencing gel prior to autoradiographic visualisation. A typical reaction yields 20-100 bands per lane depending on the species and primer. We have used ISSR fingerprinting in a number of plant species, and report here some results on two important tropical species, sorghum and banana. Previous investigators have demonstrated that ISSR analysis usually detects a higher level of polymorphism than that detected with restriction fragment length polymorphism (RFLP) or random amplified polymorphic DNA (RAPD) analyses. Our data indicate that this is not a result of greater polymorphism genetically, but rather technical reasons related to the detection methodology used for ISSR analysis.
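The primer notation used above, such as (CA)(8)RG, combines a repeat core with a degenerate anchor. A small sketch of how such a primer expands into concrete sequences, assuming the standard IUPAC codes R = A/G and Y = C/T (the function name and argument layout are hypothetical):

```python
from itertools import product

# Subset of IUPAC nucleotide codes needed for the anchors in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT"}

def expand_primer(repeat, copies, anchor):
    """List every concrete 5'->3' sequence for a primer written as
    (repeat)(copies) followed by a degenerate 3'-anchor."""
    core = repeat * copies
    options = [IUPAC[base] for base in anchor]
    return [core + "".join(choice) for choice in product(*options)]

print(expand_primer("CA", 8, "RG"))   # (CA)(8)RG
print(expand_primer("AGC", 6, "TY"))  # (AGC)(6)TY
```

Each degenerate position doubles the number of sequences in the primer mix, so both example primers expand to two sequences.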
Abstract:
Many of our everyday tasks require the control of the serial order and the timing of component actions. Using the dynamic neural field (DNF) framework, we address the learning of representations that support the performance of precisely timed action sequences. In continuation of previous modeling work and robotics implementations, we ask specifically how feedback about executed actions might be used by the learning system to fine-tune a joint memory representation of the ordinal and temporal structure that was initially acquired by observation. The perceptual memory is represented by a self-stabilized, multi-bump activity pattern of neurons encoding instances of a sensory event (e.g., color, position or pitch) which guides sequence learning. The strength of the population representation of each event is a function of elapsed time since sequence onset. We propose and test in simulations a simple learning rule that detects a mismatch between the expected and realized timing of events and adapts the activation strengths in order to compensate for the movement time needed to achieve the desired effect. The simulation results show that the effector-specific memory representation can be robustly recalled. We discuss the impact of the fast, activation-based learning that the DNF framework provides for robotics applications.
Abstract:
Simple sequence repeat anchored polymerase chain reaction amplification (SSR-PCR) is a genetic typing technique based on primers anchored at the 5' or 3' ends of microsatellites, at high primer annealing temperatures. This technique has already been used in studies of genetic variability of several organisms, using different primer designs. In order to conduct a detailed study of the SSR-PCR genomic targets, we cloned and sequenced 20 unique amplification products of two commonly used primers, CAA(CT)6 and (CA)8RY, using Biomphalaria glabrata genomic DNA as template. The sequences obtained were novel B. glabrata genomic sequences. It was observed that 15 clones contained microsatellites between priming sites. Out of 40 clones, seven contained complex sequence repetitions. One of the repeats that appeared in six of the amplified fragments generated a single band in Southern analysis, indicating that the sequence was not widespread in the genome. Most of the annealing sites for the CAA(CT)6 primer contained only the six repeats found within the primer sequence. In conclusion, SSR-PCR is a useful genotyping technique. However, the premise of the SSR-PCR technique, verified with the CAA(CT)6 primer, could not be supported since the amplification products did not result necessarily from microsatellite loci amplification.
Abstract:
In the last decade microsatellites have become one of the most useful genetic markers used in a large number of organisms due to their abundance and high level of polymorphism. Microsatellites have been used for individual identification, paternity tests, forensic studies and population genetics. Data on microsatellite abundance comes preferentially from microsatellite-enriched libraries and DNA sequence databases. We have conducted a search in GenBank of more than 16,000 Schistosoma mansoni ESTs and 42,000 BAC sequences. In addition, we obtained 300 sequences from CA and AT microsatellite-enriched genomic libraries. The sequences were searched for simple repeats using the RepeatMasker software. Of 16,022 ESTs, we detected 481 (3%) sequences that contained 622 microsatellites (434 perfect, 164 imperfect and 24 compound). Of the 481 ESTs, 194 were grouped in 63 clusters containing 2 to 15 ESTs per cluster. Polymorphisms were observed in 16 clusters. The 287 remaining ESTs were orphan sequences. Of the 42,017 BAC end sequences, 1,598 (3.8%) contained microsatellites (2,335 perfect, 287 imperfect and 79 compound). Of the 1,598 BAC end sequences, 80 were grouped into 17 clusters containing 3 to 17 BAC end sequences per cluster. Microsatellites were present in 67 out of 300 sequences from microsatellite-enriched libraries (55 perfect, 38 imperfect and 15 compound). From all of the observed loci, 55 were selected for having the longest perfect repeats and flanking regions that allowed the design of primers for PCR amplification. Additionally, we describe two new polymorphic microsatellite loci.
Abstract:
In Arabidopsis thaliana, gene expression level polymorphisms (ELPs) between natural accessions that exhibit simple, single locus inheritance are promising quantitative trait locus (QTL) candidates to explain phenotypic variability. It is assumed that such ELPs overwhelmingly represent regulatory element polymorphisms. However, comprehensive genome-wide analyses linking expression level, regulatory sequence and gene structure variation are missing, preventing definite verification of this assumption. Here, we analyzed ELPs observed between the Eil-0 and Lc-0 accessions. Compared with non-variable controls, 5' regulatory sequence variation in the corresponding genes is indeed increased. However, approximately 42% of all the ELP genes also carry major transcription unit deletions in one parent as revealed by genome tiling arrays, representing a >4-fold enrichment over controls. Within the subset of ELPs with simple inheritance, this proportion is even higher and deletions are generally more severe. Similar results were obtained from analyses of the Bay-0 and Sha accessions, using alternative technical approaches. Collectively, our results suggest that drastic structural changes are a major cause for ELPs with simple inheritance, corroborating experimentally observed indel preponderance in cloned Arabidopsis QTL.
Abstract:
We present a simple randomized procedure for the prediction of a binary sequence. The algorithm uses ideas from recent developments of the theory of the prediction of individual sequences. We show that if the sequence is a realization of a stationary and ergodic random process then the average number of mistakes converges, almost surely, to that of the optimum, given by the Bayes predictor.
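The abstract does not spell out the procedure, so the following is only an illustration of the general family it points to, not the authors' algorithm: a randomized exponentially weighted forecaster that mixes a few fixed-order Markov "experts". The choice of experts, the learning rate `eta`, the maximum order and the function name are all assumptions, and this finite sketch carries none of the convergence guarantees claimed above.

```python
import math
import random
from collections import defaultdict

def count_mistakes(bits, max_order=3, eta=0.5, seed=0):
    """Predict each bit of `bits` (a list of 0/1 ints) before seeing it and
    return the number of mistakes made by the randomized forecaster."""
    rng = random.Random(seed)
    orders = range(max_order + 1)
    # counts[k][context] = [times 0 followed context, times 1 followed context]
    counts = [defaultdict(lambda: [0, 0]) for _ in orders]
    losses = [0] * (max_order + 1)
    mistakes = 0
    for t, bit in enumerate(bits):
        contexts = [tuple(bits[max(0, t - k):t]) for k in orders]
        # Expert k guesses the majority successor of its length-k context.
        guesses = []
        for k in orders:
            zeros, ones = counts[k][contexts[k]]
            guesses.append(1 if ones > zeros else 0)
        # Exponential weights: experts with fewer past mistakes count more.
        weights = [math.exp(-eta * loss) for loss in losses]
        p_one = sum(w for w, g in zip(weights, guesses) if g == 1) / sum(weights)
        guess = 1 if rng.random() < p_one else 0
        mistakes += guess != bit
        # Reveal the bit and update every expert.
        for k in orders:
            losses[k] += guesses[k] != bit
            counts[k][contexts[k]][bit] += 1
    return mistakes

print(count_mistakes([0, 1] * 100))
```

On a strictly alternating sequence the order-1 expert becomes error-free after a few steps, so the mixture's weight shifts to it and few further mistakes are expected.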
Abstract:
The bacterial insertion sequence IS21 contains two genes, istA and istB, which are organized as an operon. IS21 spontaneously forms tandem repeats designated (IS21)2. Plasmids carrying (IS21)2 react efficiently with other replicons, producing cointegrates via a cut-and-paste mechanism. Here we show that transposition of a single IS21 element (simple insertion) and cointegrate formation involving (IS21)2 result from two distinct non-replicative pathways, which are essentially due to two differentiated IstA proteins, transposase and cointegrase. In Escherichia coli, transposase was characterized as the full-length, 46 kDa product of the istA gene, whereas the 45 kDa cointegrase was expressed, in-frame, from a natural internal translation start of istA. The istB gene, which could be experimentally disconnected from istA, provided a helper protein that strongly stimulated the transposase and cointegrase-driven reactions. Site-directed mutagenesis was used to express either cointegrase or transposase from the istA gene. Cointegrase promoted replicon fusion at high frequencies by acting on IS21 ends which were linked by 2, 3, or 4 bp junction sequences in (IS21)2. By contrast, cointegrase poorly catalyzed simple insertion of IS21 elements. Transposase had intermediate, uniform activity in both pathways. The ability of transposase to synapse two widely spaced IS21 ends may reside in the eight N-terminal amino acid residues which are absent from cointegrase. Given the 2 or 3 bp spacing in naturally occurring IS21 tandems and the specialization of cointegrase, the fulminant spread of IS21 via cointegration can now be understood.
Abstract:
Searching for matches between large collections of short (14-30 nucleotides) words and sequence databases comprising full genomes or transcriptomes is a common task in biological sequence analysis. We investigated the performance of simple indexing strategies for handling such tasks and developed two programs, fetchGWI and tagger, that index either the database or the query set. Either strategy outperforms megablast for searches with more than 10,000 probes. FetchGWI is shown to be a versatile tool for rapidly searching multiple genomes, whose performance is limited in most cases by the speed of access to the filesystem. We have made publicly available a Web interface for searching the human, mouse, and several other genomes and transcriptomes with oligonucleotide queries.
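One of the two strategies mentioned, indexing the query set and then streaming the database past it, can be sketched with a plain hash table. This is not the fetchGWI or tagger code: the function names are hypothetical, all query words are assumed to share one length, and only exact matches on the given strand are reported.

```python
from collections import defaultdict

def index_queries(queries):
    """Map each query word to the names of the probes that carry it.
    `queries` is a dict of probe name -> word."""
    index = defaultdict(list)
    for name, word in queries.items():
        index[word.upper()].append(name)
    return index

def scan(sequence, index, word_len):
    """Slide a window over the database sequence and look each word up
    in the query index; returns (probe name, position) pairs."""
    seq = sequence.upper()
    hits = []
    for pos in range(len(seq) - word_len + 1):
        for name in index.get(seq[pos:pos + word_len], ()):
            hits.append((name, pos))
    return hits

probes = {"q1": "ACGT", "q2": "GTTA"}
print(scan("TTACGTTACG", index_queries(probes), word_len=4))
```

With this layout the cost of a search is one pass over the database regardless of how many probes are loaded, which is the intuition behind indexing paying off for large probe sets.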