958 results for Complete Dna-sequence


Relevance:

90.00% 90.00%

Publisher:

Abstract:

Digital signal processing (DSP) techniques for biological sequence analysis continue to grow in popularity due to the inherent digital nature of these sequences. DSP methods have demonstrated early success for detection of coding regions in a gene. Recently, these methods are being used to establish DNA gene similarity. We present the inter-coefficient difference (ICD) transformation, a novel extension of the discrete Fourier transformation, which can be applied to any DNA sequence. The ICD method is a mathematical, alignment-free DNA comparison method that generates a genetic signature for any DNA sequence that is used to generate relative measures of similarity among DNA sequences. We demonstrate our method on a set of insulin genes obtained from an evolutionarily wide range of species, and on a set of avian influenza viral sequences, which represents a set of highly similar sequences. We compare phylogenetic trees generated using our technique against trees generated using traditional alignment techniques for similarity and demonstrate that the ICD method produces a highly accurate tree without requiring an alignment prior to establishing sequence similarity.
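The abstract does not give the ICD formula, so the following is only a minimal sketch of the general idea: a Fourier-based, alignment-free signature for a DNA sequence. It assumes a common encoding (one binary indicator sequence per base), sums the DFT power of the four indicators, and takes differences between adjacent coefficients as the signature. The function names, the choice of 16 coefficients, and the Euclidean distance are illustrative assumptions, not the authors' method.

```python
import cmath
import math


def indicator_power_spectrum(seq, n_coeffs=16):
    """Summed DFT power of the four base-indicator sequences (assumed encoding)."""
    seq = seq.upper()
    n = len(seq)
    spectrum = []
    for k in range(1, n_coeffs + 1):
        power = 0.0
        for base in "ACGT":
            # DFT coefficient k of the 0/1 indicator sequence for this base
            coeff = sum(
                cmath.exp(-2j * math.pi * k * t / n)
                for t, symbol in enumerate(seq)
                if symbol == base
            )
            power += abs(coeff) ** 2
        spectrum.append(power / n)
    return spectrum


def coefficient_differences(spectrum):
    """Differences between adjacent spectral coefficients (hypothetical signature)."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


def signature_distance(seq_a, seq_b, n_coeffs=16):
    """Euclidean distance between two signatures; no alignment is needed."""
    sig_a = coefficient_differences(indicator_power_spectrum(seq_a, n_coeffs))
    sig_b = coefficient_differences(indicator_power_spectrum(seq_b, n_coeffs))
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(sig_a, sig_b)))
```

A matrix of such pairwise distances could then be passed to a distance-based tree builder. Note that coefficient k corresponds to frequency k/n, so sequences of very different lengths are not directly comparable in this simplified form.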

Relevance:

90.00% 90.00%

Publisher:

Abstract:

The apicomplexan parasites Theileria annulata and Theileria parva cause severe lymphoproliferative disorders in cattle. Disease pathogenesis is linked to the ability of the parasite to transform the infected host cell (leukocyte) and induce uncontrolled proliferation. It is known that transformation involves parasite dependent perturbation of leukocyte signal transduction pathways that regulate apoptosis, division and gene expression, and there is evidence for the translocation of Theileria DNA binding proteins to the host cell nucleus. However, the parasite factors responsible for the inhibition of host cell apoptosis, or induction of host cell proliferation are unknown. The recent derivation of the complete genome sequence for both T. annulata and T. parva has provided a wealth of information that can be searched to identify molecules with the potential to subvert host cell regulatory pathways. This review summarizes current knowledge of the mechanisms used by Theileria parasites to transform the host cell, and highlights recent work that has mined the Theileria genomes to identify candidate manipulators of host cell phenotype.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

In cattle, at least 39 variants of the 4 casein proteins (α(S1)-, β-, α(S2)- and κ-casein) have been described to date. Many of these variants are known to affect milk-production traits, cheese-processing properties, and the nutritive value of milk. They also provide valuable information for phylogenetic studies. So far, the majority of studies exploring the genetic variability of bovine caseins considered European taurine cattle breeds and were carried out at the protein level by electrophoretic techniques. This only allows the identification of variants that, due to amino acid exchanges, differ in their electric charge, molecular weight, or isoelectric point. In this study, the open reading frames of the casein genes CSN1S1, CSN2, CSN1S2, and CSN3 of 356 animals belonging to 14 taurine and 3 indicine cattle breeds were sequenced. With this approach, we identified 23 alleles, including 5 new DNA sequence variants, with a predicted effect on the protein sequence. The new variants were only found in indicine breeds and in one local Iranian breed, which has been phenotypically classified as a taurine breed. A multidimensional scaling approach based on available SNP chip data, however, revealed an admixture of taurine and indicine populations in this breed as well as in the local Iranian breed Golpayegani. Specific indicine casein alleles were also identified in a few European taurine breeds, indicating the introgression of indicine breeds into these populations. This study shows the existence of substantial undiscovered genetic variability of bovine casein loci, especially in indicine cattle breeds. The identification of new variants is a valuable tool for phylogenetic studies and investigations into the evolution of the milk protein genes.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

The LIM domain-binding protein Ldb1 is an essential cofactor of LIM-homeodomain (LIM-HD) and LIM-only (LMO) proteins in development. The stoichiometry of Ldb1, LIM-HD, and LMO proteins is tightly controlled in the cell and is likely a critical determinant of their biological actions. Single-stranded DNA-binding proteins (SSBPs) were recently shown to interact with Ldb1 and are also important in developmental programs. We establish here that two mammalian SSBPs, SSBP2 and SSBP3, contribute to an erythroid DNA-binding complex that contains the transcription factors Tal1 and GATA-1, the LIM domain protein Lmo2, and Ldb1 and binds a bipartite E-box-GATA DNA sequence motif. In addition, SSBP2 was found to augment transcription of the Protein 4.2 (P4.2) gene, a direct target of the E-box-GATA-binding complex, in an Ldb1-dependent manner and to increase endogenous Ldb1 and Lmo2 protein levels, E-box-GATA DNA-binding activity, and P4.2 and beta-globin expression in erythroid progenitors. Finally, SSBP2 was demonstrated to inhibit Ldb1 and Lmo2 interaction with the E3 ubiquitin ligase RLIM, prevent RLIM-mediated Ldb1 ubiquitination, and protect Ldb1 and Lmo2 from proteasomal degradation. These results define a novel biochemical function for SSBPs in regulating the abundance of LIM domain and LIM domain-binding proteins.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

Lyme disease Borrelia can infect humans and animals for months to years, despite the presence of an active host immune response. The vls antigenic variation system, which expresses the surface-exposed lipoprotein VlsE, plays a major role in B. burgdorferi immune evasion. Gene conversion between vls silent cassettes and the vlsE expression site occurs at high frequency during mammalian infection, resulting in sequence variation in the VlsE product. In this study, we examined vlsE sequence variation in B. burgdorferi B31 during mouse infection by analyzing 1,399 clones isolated from bladder, heart, joint, ear, and skin tissues of mice infected for 4 to 365 days. The median number of codon changes increased progressively in C3H/HeN mice from 4 to 28 days post infection, and no clones retained the parental vlsE sequence at 28 days. In contrast, the decrease in the number of clones with the parental vlsE sequence and the increase in the number of sequence changes occurred more gradually in severe combined immunodeficiency (SCID) mice. Clones containing a stop codon were isolated, indicating that continuous expression of full-length VlsE is not required for survival in vivo; also, these clones continued to undergo vlsE recombination. Analysis of clones with apparent single recombination events indicated that recombinations into vlsE are nonselective with regard to the silent cassette utilized, as well as the length and location of the recombination event. Sequence changes as small as one base pair were common. Fifteen percent of recovered vlsE variants contained "template-independent" sequence changes, which clustered in the variable regions of vlsE. We hypothesize that the increased frequency and complexity of vlsE sequence changes observed in clones recovered from immunocompetent mice (as compared with SCID mice) is due to rapid clearance of relatively invariant clones by variable region-specific anti-VlsE antibody responses.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

Aniridia (AN) is a congenital, panocular disorder of the eye characterized by the complete or partial absence of the iris. The disease can occur in both sporadic and familial forms; the familial form is inherited as an autosomal dominant trait with high penetrance. The objective of this study was to isolate and characterize the genes involved in AN and Sey, and thereby to gain a better understanding of the molecular basis of the two disorders. Using a positional cloning strategy, I have approached and cloned from the AN locus in human chromosomal band 11p13 a cDNA that is deleted in two patients with AN. The deletions in these patients overlap by about 70 kb and encompass the 3′ end of the cDNA. This cDNA detects a 2.7 kb mRNA encoded by a transcription unit estimated to span approximately 50 kb of genomic DNA. The message is specifically expressed in all tissues affected in all forms of AN, namely within the presumptive iris, lens, neuroretina, the superficial layers of the cornea, the olfactory bulbs, and the cerebellum. Sequence analysis of the AN cDNA revealed a number of motifs characteristic of certain transcription factors. Chief among these are the paired domain, the homeodomain, and a carboxy-terminal domain rich in serine, threonine and proline residues. The overall structure shows high homology to the Drosophila segmentation gene paired and members of the murine Pax family of developmental control genes. Utilizing a conserved human genomic DNA sequence as probe, I was able to isolate an embryonic murine cDNA which is over 92% homologous in nucleotide sequence and virtually identical at the amino acid level to the human AN cDNA. The expression pattern of the murine gene is the same as that in man, supporting the conclusion that it probably corresponds to the Sey gene. Its specific expression in the neuroectodermal component of the eye, in glioblastomas, but not in the neural crest-derived PC12 pheochromocytoma cell line, suggests that a defect in neuroectodermal rather than mesodermal development might be the common etiological factor underlying AN and Sey.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

Complete NotI, SfiI, XbaI and BlnI cleavage maps of Escherichia coli K-12 strain MG1655 were constructed. Techniques used included: CHEF pulsed-field gel electrophoresis; transposon mutagenesis; fragment hybridization to the ordered λ library of Kohara et al.; fragment and cosmid hybridization to Southern blots; correlation of fragments and cleavage sites with EcoMap, a sequence-modified version of the genomic restriction map of Kohara et al.; and correlation of cleavage sites with DNA sequence databases. In all, 105 restriction sites were mapped and correlated with the EcoMap coordinate system. NotI, SfiI, XbaI and BlnI restriction patterns of five commonly used E. coli K-12 strains were compared to those of MG1655. The variability between strains, some of which are separated by numerous steps of mutagenic treatment, is readily detectable by pulsed-field gel electrophoresis. A model is presented to account for the differences between the strains on the basis of simple insertions, deletions, and in one case an inversion. Insertions and deletions ranged in size from 1 kb to 86 kb. Several of the larger features have previously been characterized, and some of the smaller rearrangements can potentially account for previously reported genetic features of these strains. Some aspects of the frequency and distribution of NotI, SfiI, XbaI and BlnI cleavage sites were analyzed using a method based on Markov chain theory. Overlaps of Dam and Dcm methylase sites with XbaI and SfiI cleavage sites were examined. The one XbaI-Dam overlap in the database is in accord with the expected frequency of this overlap. Certain types of SfiI-Dcm overlaps are overrepresented. Of the four subtypes of SfiI-Dcm overlap, only one has a partial inhibitory effect on the activity of SfiI. Recognition sites for all four enzymes are rarer than expected based on oligonucleotide frequency data, with this effect being much stronger for XbaI and BlnI than for NotI and SfiI. NotI and SfiI sites are rare mainly due to apparent negative selection against GGCC (both) and CGGCCG (NotI). XbaI and BlnI sites are rare mainly due to effects of the VSP repair system on certain di-, tri-, and tetranucleotides, most notably CTAG. Models are proposed to explain several of the anomalies of oligonucleotide distribution in E. coli, and the biological significance of the systems that produce these anomalies is discussed.
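The abstract does not spell out its Markov chain method. As a hedged illustration of how an oligonucleotide can be judged "rarer than expected", the sketch below uses a widely used maximal-order Markov estimator: the expected count of a word is the count of its prefix times the count of its suffix, divided by the count of the shared core. The function names are hypothetical, and this estimator is an assumption rather than the one used in the study.

```python
def count_overlapping(seq, word):
    """Count occurrences of word in seq, allowing overlaps."""
    width = len(word)
    return sum(1 for i in range(len(seq) - width + 1) if seq[i:i + width] == word)


def markov_expected_count(seq, word):
    """Expected count of word from its (k-1)-mers and (k-2)-mer core.

    E[w1..wk] = N(w1..wk-1) * N(w2..wk) / N(w2..wk-1)
    """
    if len(word) < 3:
        raise ValueError("word must have at least 3 letters")
    prefix = count_overlapping(seq, word[:-1])
    suffix = count_overlapping(seq, word[1:])
    core = count_overlapping(seq, word[1:-1])
    if core == 0:
        return 0.0
    return prefix * suffix / core


def observed_over_expected(seq, word):
    """Ratio below 1 suggests the word is under-represented."""
    expected = markov_expected_count(seq, word)
    if expected == 0:
        return float("nan")
    return count_overlapping(seq, word) / expected
```

Applied to a genome sequence, `observed_over_expected(genome, "CTAG")` well below 1 would be consistent with the under-representation of CTAG described above; the toy inputs in any quick check are far too short to say anything biological.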

Relevance:

90.00% 90.00%

Publisher:

Abstract:

A Tn916-like transposon (TnFO1) was found in the multiple antibiotic resistant Enterococcus faecalis strain FO1 isolated from a raw milk cheese. In this strain, the tetracycline determinant was localized by DNA-DNA hybridization with a tetM nucleotide probe on the chromosome and on a 30-kb plasmid. The transposon TnFO1 was identified and characterized by DNA-DNA hybridization experiments with the five internal HincII fragments of Tn916. The tetracycline resistance determinant was identified by its complete nucleotide sequence as TetM. Transposon TnFO1 was also detected in its circular form by DNA-DNA hybridization and PCR amplification. Both ends including the joining region of the closed circular transposon TnFO1 were sequenced. TnFO1 could be transferred by conjugation from Enterococcus faecalis into Enterococcus faecalis, Lactococcus lactis subsp. lactis biovar. diacetylactis, Listeria innocua, Leuconostoc mesenteroides and Staphylococcus aureus, and from Lactococcus lactis subsp. lactis biovar. diacetylactis into Listeria innocua. Pulsed-field electrophoresis of genomic DNA from E. faecalis FO1 transconjugants showed that transposon TnFO1 integrated at different sites.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

Puumala virus (PUUV) is one of the predominant hantavirus species in Europe causing mild to moderate cases of haemorrhagic fever with renal syndrome. Parts of Lower Saxony in north-western Germany are endemic for PUUV infections. In this study, the complete PUUV genome sequence of a bank vole-derived tissue sample from the 2007 outbreak was determined by a combined primer-walking and RNA ligation strategy. The S, M and L genome segments were 1,828, 3,680 and 6,550 nucleotides in length, respectively. Sliding-window analyses of the nucleotide sequences of all available complete PUUV genomes indicated a non-homogenous distribution of variability with hypervariable regions located at the 3′-ends of the S and M segments. The overall similarity of the coding genome regions to the other PUUV strains ranged between 80.1 and 84.7 % at the level of the nucleotide sequence and between 89.5 and 98.1 % for the deduced amino acid sequences. In comparison to the phylogenetic trees of the complete coding sequences, trees based on partial segments revealed a general drop in phylogenetic support and a lower resolution. The Astrup strain S and M segment sequences showed the highest similarity to sequences of strains from geographically close sites in the Osnabrück Hills region. In conclusion, a primer-walking-mediated strategy resulted in the determination of the first complete nucleotide sequence of a PUUV strain from Central Europe. Different levels of variability along the genome provide the opportunity to choose regions for analyses according to the particular research question, e.g., large-scale phylogenetics or within-host evolution.
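The sliding-window analysis mentioned in this abstract can be sketched as follows. This is a minimal illustration that assumes the genomes are already aligned to equal length and measures variability as the fraction of alignment columns in each window that are not identical across all sequences. The window and step sizes and the variability measure are assumptions, not the parameters used in the study.

```python
def sliding_window_variability(aligned, window=100, step=50):
    """Return (window start, fraction of variable columns) for each window.

    `aligned` is a list of equal-length sequences from a multiple alignment.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must have the same aligned length")

    # A column is variable if the sequences do not all share one character.
    variable = [len(set(column)) > 1 for column in zip(*aligned)]

    profile = []
    for start in range(0, length - window + 1, step):
        fraction = sum(variable[start:start + window]) / window
        profile.append((start, fraction))
    return profile
```

Windows with a high fraction would mark candidate hypervariable regions, such as those reported at the 3′-ends of the S and M segments.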

Relevance:

90.00% 90.00%

Publisher:

Abstract:

The focus of this thesis lies in the development of a sensitive method for the analysis of protein primary structure which can be easily used to confirm the DNA sequence of a protein's gene and determine the modifications which are made after translation. This technique involves the use of dipeptidyl aminopeptidase (DAP) and dipeptidyl carboxypeptidase (DCP) to hydrolyze the protein and the mass spectrometric analysis of the dipeptide products. Dipeptidyl carboxypeptidase was purified from human lung tissue and characterized with respect to its proteolytic activity. The results showed that the enzyme has a relatively unrestricted specificity, making it useful for the analysis of the C-terminus of proteins. Most of the dipeptide products were identified using gas chromatography/mass spectrometry (GC/MS). In order to analyze the peptides not hydrolyzed by DCP and DAP, as well as the dipeptides not identified by GC/MS, a FAB ion source was installed on a quadrupole mass spectrometer and its performance evaluated with a variety of compounds. Using these techniques, the sequences of the N-terminal and C-terminal regions and seven fragments of bacteriophage P22 tail protein have been verified. All of the dipeptides identified in these analyses were in the same DNA reading frame, thus ruling out the possibility of a single base being inserted or deleted from the DNA sequence. The verification of small sequences throughout the protein sequence also indicates that no large portions of the protein have been removed after translation.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

Academic and industrial research in the late 90s has brought about an exponential explosion of DNA sequence data. Automated expert systems are being created to help biologists extract patterns, trends and links from this ever-deepening ocean of information. Two such systems, aimed at retrieving and subsequently utilizing phylogenetically relevant information, have been developed in this dissertation, the major objective of which was to automate the often difficult and confusing phylogenetic reconstruction process. Popular phylogenetic reconstruction methods, such as distance-based methods, attempt to find an optimal tree topology (that reflects the relationships among related sequences and their evolutionary history) by searching through the topology space. Various compromises between the fast (but incomplete) and exhaustive (but computationally prohibitive) search heuristics have been suggested. An intelligent compromise algorithm that relies on a flexible “beam” search principle from the Artificial Intelligence domain and uses the pre-computed local topology reliability information to adjust the beam search space continuously is described in the second chapter of this dissertation. However, sometimes even a (virtually) complete distance-based method is inferior to the significantly more elaborate (and computationally expensive) maximum likelihood (ML) method. In fact, depending on the nature of the sequence data in question, either method might prove to be superior. Therefore, it is difficult (even for an expert) to tell a priori which phylogenetic reconstruction method, whether distance-based, ML, or perhaps maximum parsimony (MP), should be chosen for any particular data set. A number of factors, often hidden, influence the performance of a method. For example, it is generally understood that for a phylogenetically “difficult” data set more sophisticated methods (e.g., ML) tend to be more effective and thus should be chosen. However, it is the interplay of many factors that one needs to consider in order to avoid choosing an inferior method (potentially a costly mistake, both in terms of computational expense and in terms of reconstruction accuracy). Chapter III of this dissertation details a phylogenetic reconstruction expert system that automatically selects the most suitable method. It uses a classifier (a Decision Tree-inducing algorithm) to map a new data set to the proper phylogenetic reconstruction method.
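The beam search principle referred to in this abstract can be illustrated with a generic sketch. It keeps only the best few candidates at each step, which sits between a greedy search (beam width 1) and an exhaustive one. The dissertation adjusts the beam using topology reliability information; that part is not described in the abstract, so this sketch assumes a fixed beam width, and the `expand` and `score` callables are hypothetical placeholders for tree rearrangements and a tree-quality criterion.

```python
def beam_search(initial, expand, score, beam_width=3, max_steps=10):
    """Generic beam search; lower score is better.

    expand(state) -> iterable of successor states
    score(state)  -> number used to rank states
    """
    beam = [initial]
    best = initial
    for _ in range(max_steps):
        # Generate all successors of the states currently kept in the beam.
        candidates = [nxt for state in beam for nxt in expand(state)]
        if not candidates:
            break
        # Keep only the beam_width best candidates for the next round.
        candidates.sort(key=score)
        beam = candidates[:beam_width]
        if score(beam[0]) < score(best):
            best = beam[0]
    return best
```

As a toy usage, searching over integers with `expand = lambda x: [x + 1, x * 2]` and `score = lambda x: abs(10 - x)` from the start state 1 reaches 10. In a phylogenetic setting the states would be tree topologies, and widening the beam trades run time for search completeness.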

Relevance:

90.00% 90.00%

Publisher:

Abstract:

DNA sequence variation is currently a major source of data for studying human origins, evolution, and demographic history, and for detecting linkage association of complex diseases. In this dissertation, I investigated DNA variation in worldwide populations from two ∼10 kb autosomal regions on 22q11.2 (noncoding) and 1q24 (introns). A total of 75 variant sites were found among 128 human sequences in the 22q11.2 region, yielding an estimate of 0.088% for nucleotide diversity (π), and a total of 52 variant sites were found among 122 human sequences in the 1q24 region with an estimated π value of 0.057%. The data from these two regions and a 10 kb noncoding region on Xq13.3 all show a strong excess of low-frequency variants in comparison to that expected from an equilibrium population, indicating a relatively recent population expansion. The effective population sizes estimated from the three regions were 11,000, 12,700, and 8,600, respectively, which are close to the commonly used value of 10,000. In each of the two autosomal regions, the age of the most recent common ancestor (MRCA) was estimated to be older than 1 million years among all the sequences and ∼600,000 years among non-African sequences, providing the first evidence from autosomal noncoding or intronic regions for a genetic history of humans much more ancient than the emergence of modern humans. The ancient genetic history of humans indicates no severe bottleneck during the evolution of humans in the last half million years; otherwise, much of the ancient genetic history would have been lost during a severe bottleneck. This study strongly suggests that both the “out of Africa” and the multiregional models are too simple for explaining the evolution of modern humans. A compilation of genome-wide data revealed that nucleotide diversity is highest in autosomal regions, intermediate in X-linked regions, and lowest in Y-linked regions. The data suggest the existence of background selection or selective sweep on Y-linked loci. In general, the nucleotide diversity in humans is low compared to that in chimpanzee and Drosophila populations.
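The two standard diversity estimators behind figures like these can be sketched briefly. Nucleotide diversity π is the mean number of pairwise differences per site, and Watterson's θ rescales the number of variant sites by the harmonic number of the sample size. The code below is a minimal illustration with hypothetical function names; it assumes aligned, equal-length sequences with no gaps or missing data.

```python
from itertools import combinations


def nucleotide_diversity(aligned):
    """Mean pairwise differences per site (pi) for aligned sequences."""
    length = len(aligned[0])
    pairs = list(combinations(aligned, 2))
    total_diffs = sum(
        sum(a != b for a, b in zip(seq_x, seq_y)) for seq_x, seq_y in pairs
    )
    return total_diffs / (len(pairs) * length)


def watterson_theta(variant_sites, n_sequences, length):
    """Watterson's estimator per site: S / (a_n * L), a_n = sum_{i<n} 1/i."""
    a_n = sum(1.0 / i for i in range(1, n_sequences))
    return variant_sites / (a_n * length)
```

Taking the region length as exactly 10,000 bp (the abstract says only ∼10 kb, so this is an assumption), `watterson_theta(75, 128, 10_000)` comes to roughly 0.14%, above the reported π of 0.088%. A π lower than Watterson's θ is the pattern expected when low-frequency variants are in excess, which matches the interpretation given in the abstract.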

Relevance:

90.00% 90.00%

Publisher:

Abstract:

Background: Zooplankton play an important role in our oceans, both in biogeochemical cycling and as a food source for commercially important fish larvae. However, difficulties in correctly identifying zooplankton hinder our understanding of their roles in marine ecosystem functioning, and can prevent detection of long-term changes in their community structure. The advent of massively parallel Next Generation Sequencing technology allows DNA sequence data to be recovered directly from whole community samples. Here we assess the ability of such sequencing to quantify the richness and diversity of a mixed zooplankton assemblage from a productive monitoring site in the Western English Channel. Methodology/Principal Findings: Plankton WP2 replicate net hauls (200 µm) were taken at the Western Channel Observatory long-term monitoring station L4 in September 2010 and January 2011. These samples were analysed by microscopy and by metagenetic analysis of the 18S nuclear small subunit ribosomal RNA gene using the 454 pyrosequencing platform. Following quality control, a total of 419,042 sequences were obtained for all samples. The sequences clustered into 205 operational taxonomic units (OTUs) using a 97% similarity cut-off. Allocation of taxonomy by comparison with the National Centre for Biotechnology Information database identified 138 OTUs to species level, 11 to genus level and 1 to order; <2.5% of sequences were classified as unknowns. By comparison, a skilled microscopic analyst was able to routinely enumerate only 75 taxonomic groups. Conclusions: The percentages of OTUs assigned to major eukaryotic taxonomic groups broadly align between the metagenetic and morphological analyses and are dominated by Copepoda. However, the metagenetics reveals a previously hidden taxonomic richness, especially for Copepoda and meroplankton such as Bivalvia, Gastropoda and Polychaeta. It also reveals rare species and parasites. We conclude that Next Generation Sequencing of 18S amplicons is a powerful tool for estimating diversity and species richness of zooplankton communities.
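Richness and diversity can be computed from a table of per-OTU sequence counts once clustering is done. The sketch below shows two common summary statistics, observed richness and the Shannon index. The abstract does not say which indices the authors used, so both the choice of index and the function names are assumptions.

```python
import math


def observed_richness(otu_counts):
    """Number of OTUs with at least one sequence in the sample."""
    return sum(1 for count in otu_counts if count > 0)


def shannon_index(otu_counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over OTUs present."""
    total = sum(otu_counts)
    if total == 0:
        return 0.0
    return -sum(
        (count / total) * math.log(count / total)
        for count in otu_counts
        if count > 0
    )
```

For a sample split evenly between two OTUs, `shannon_index([10, 10])` equals ln 2; a sample dominated by one taxon scores lower even when its richness is the same.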

Relevance:

90.00% 90.00%

Publisher:

Abstract:

In Azotobacter vinelandii, deletion of the fdxA gene that encodes a well characterized seven-iron ferredoxin (FdI) is known to lead to overexpression of the FdI redox partner, NADPH:ferredoxin reductase (FPR). Previous studies have established that this is an oxidative stress response in which the fpr gene is transcriptionally activated to the same extent in response to either addition of the superoxide propagator paraquat to the cells or to fdxA deletion. In both cases, the activation occurs through a specific DNA sequence located upstream of the fpr gene. Here, we report the identification of the A. vinelandii protein that binds specifically to the paraquat activatable fpr promoter region as the E1 subunit of the pyruvate dehydrogenase complex (PDHE1), a central enzyme in aerobic respiration. Sequence analysis shows that PDHE1, which was not previously suspected to be a DNA-binding protein, has a helix–turn–helix motif. The data presented here further show that FdI binds specifically to the DNA-bound PDHE1.

Relevance:

90.00% 90.00%

Publisher:

Abstract:

Although polyomavirus JC (JCV) is the proven pathogen of progressive multifocal leukoencephalopathy, a fatal demyelinating disease, this virus is ubiquitous as a usually harmless symbiont among human beings. JCV propagates in the adult kidney and excretes its progeny in urine, from which JCV DNA can readily be recovered. The main mode of transmission of JCV is from parents to children through long cohabitation. In this study, we collected a substantial number of urine samples from native inhabitants of 34 countries in Europe, Africa, and Asia. A 610-bp segment of JCV DNA was amplified from each urine sample, and its DNA sequence was determined. A worldwide phylogenetic tree subsequently constructed revealed the presence of nine subtypes, including minor ones. Five subtypes (EU, Af2, B1, SC, and CY) occupied rather large territories that overlapped with each other at their boundaries. All of Europe, northern Africa, and western Asia were the domain of EU, whereas the domain of Af2 included nearly all of Africa and southwestern Asia all the way to the northeastern edge of India. Partially overlapping domains in Asia were occupied by subtypes B1, SC, and CY. Of particular interest was the recovery of JCV subtypes in a pocket or pockets that were separated by great geographic distances from the main domains of those subtypes. Certain of these pockets can readily be explained by recent migrations of human populations carrying these subtypes. Overall, it appears that JCV genotyping promises to reveal previously unknown human migration routes, ancient as well as recent.