993 results for conserved noncoding sequence


Relevance:

30.00%

Publisher:

Abstract:

Despite the presence of over 3 million transposons separated on average by approximately 500 bp, the human and mouse genomes each contain almost 1000 transposon-free regions (TFRs) over 10 kb in length. The majority of human TFRs correlate with orthologous TFRs in the mouse, despite the fact that most transposons are lineage specific. Many human TFRs also overlap with orthologous TFRs in the marsupial opossum, indicating that these regions have remained refractory to transposon insertion for long evolutionary periods. Over 90% of the bases covered by TFRs are noncoding, much of which is not highly conserved. Most TFRs are not associated with unusual nucleotide composition, but are significantly associated with genes encoding developmental regulators, suggesting that they represent extended regions of regulatory information that are largely unable to tolerate insertions, a conclusion difficult to reconcile with current conceptions of gene regulation.
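The definition of a TFR used in this abstract (a stretch longer than 10 kb with no annotated transposon) can be illustrated with a small interval scan. This is a minimal sketch, not the authors' pipeline: the function name, the half-open `(start, end)` coordinate convention, and the toy coordinates are assumptions made for illustration.

```python
def transposon_free_regions(transposons, chrom_length, min_length=10_000):
    """Return half-open (start, end) gaps longer than min_length bp
    that are not covered by any transposon interval.

    transposons: iterable of (start, end) half-open intervals (assumed format).
    """
    regions = []
    cursor = 0  # end of the right-most transposon seen so far
    for start, end in sorted(transposons):
        # A gap exists only if this transposon starts past the covered part.
        if start - cursor > min_length:
            regions.append((cursor, start))
        # Overlapping or nested annotations must not move the cursor backwards.
        cursor = max(cursor, end)
    # Trailing gap between the last transposon and the chromosome end.
    if chrom_length - cursor > min_length:
        regions.append((cursor, chrom_length))
    return regions


# Toy annotation (invented coordinates, for illustration only).
toy = [(500, 800), (12_000, 12_300), (12_200, 12_900), (30_000, 30_400)]
print(transposon_free_regions(toy, chrom_length=35_000))
```

Sorting once and tracking the furthest covered position keeps the scan linear after the sort and handles overlapping annotations without merging them first.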

Relevance:

30.00%

Publisher:

Abstract:

Cyclotides are a fascinating family of plant-derived peptides characterized by their head-to-tail cyclized backbone and knotted arrangement of three disulfide bonds. This conserved structural architecture, termed the CCK (cyclic cystine knot), is responsible for their exceptional resistance to thermal, chemical and enzymatic degradation. Cyclotides have a variety of biological activities, but their insecticidal activities suggest that their primary function is in plant defence. In the present study, we determined the cyclotide content of the sweet violet Viola odorata, a member of the Violaceae family. We identified 30 cyclotides from the aerial parts and roots of this plant, 13 of which are novel sequences. The new sequences provide information about the natural diversity of cyclotides and the role of particular residues in defining structure and function. As many of the biological activities of cyclotides appear to be associated with membrane interactions, we used haemolytic activity as a marker of bioactivity for a selection of the new cyclotides. The new cyclotides were tested for their ability to resist proteolysis by a range of enzymes and, in common with other cyclotides, were completely resistant to trypsin, pepsin and thermolysin. The results show that while biological activity varies with the sequence, the proteolytic stability of the framework does not, and appears to be an inherent feature of the cyclotide framework. The structure of one of the new cyclotides, cycloviolacin O14, was determined and shown to contain the CCK motif. This study confirms that cyclotides may be regarded as a natural combinatorial template that displays a variety of peptide epitopes most likely targeted to a range of plant pests and pathogens.

Relevance:

30.00%

Publisher:

Abstract:

Prediction of peroxisomal matrix proteins generally depends on the presence of one of two distinct motifs at the end of the amino acid sequence. PTS1 peroxisomal proteins have a well-conserved tripeptide at the C-terminal end. However, the preceding residues in the sequence arguably play a crucial role in targeting the protein to the peroxisome. Previous work in applying machine learning to the prediction of peroxisomal matrix proteins has failed to capitalize on the full extent of these dependencies. We benchmark a range of machine learning algorithms, and show that a classifier based on the Support Vector Machine produces more accurate results when dependencies between the conserved motif and the preceding section are exploited. We publish an updated and rigorously curated data set that results in increased prediction accuracy of most tested models.

Relevance:

30.00%

Publisher:

Abstract:

Randomisation of DNA using conventional methodology requires an excess of genes to be cloned, since with randomised codons NNN or NNG/T, 64 or 32 genes, respectively, must be cloned to encode the 20 amino acids. Thus, as the number of randomised codons increases, the number of genes required to encode a full set of proteins increases exponentially. Various methods, ranging from chemical to biological, have been developed to address the excess of genes that arises from the degeneracy of the genetic code. These all involve the replacement, insertion or deletion of codon(s) rather than individual nucleotides. The biological methods are, however, limited to random insertion/deletion or replacement. Recent work by Hughes et al. (2003) has randomised three binding residues of a zinc finger gene. The drawback of that approach is that consecutive codons cannot undergo saturation mutagenesis. This thesis describes the development of a method of saturation mutagenesis that can be used to randomise any number of consecutive codons in a DNA strand. The method makes use of “MAX” oligonucleotides coding for each of the 20 amino acids that are ligated to a conserved sequence of DNA using T4 DNA ligase. The “MAX” oligonucleotides were synthesised with an MlyI restriction site positioned so that restriction of the oligonucleotides occurred after the three nucleotides coding for the amino acid. This use of the MlyI site and the restrict, purify, ligate and amplify method allows the insertion of “MAX” codons at any position in the DNA. This methodology reduces the number of clones that are required to produce a representative library and has been demonstrated to be effective to 7 amino acid positions.
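The exponential growth described in this abstract follows directly from the number of codons allowed per randomised position: 64 for NNN, 32 for NNG/T, and, on the reading that each “MAX” position carries exactly one codon per amino acid, 20 for the MAX approach. A minimal sketch of that arithmetic follows; the function and dictionary names are illustrative and do not come from the thesis.

```python
# Codons available at each randomised position under each scheme.
# The value 20 for "MAX" assumes one codon per amino acid, as the abstract implies.
CODONS_PER_POSITION = {"NNN": 64, "NNG/T": 32, "MAX": 20}


def library_size(codons_per_position, positions):
    """Number of distinct gene sequences for `positions` randomised codons."""
    return codons_per_position ** positions


def compare_schemes(positions):
    """Library size per scheme for a given number of randomised codons."""
    return {
        name: library_size(count, positions)
        for name, count in CODONS_PER_POSITION.items()
    }


for n in (1, 3, 7):
    print(n, compare_schemes(n))
```

With seven randomised positions the MAX library still contains 20^7 distinct sequences, but that is a few thousand times fewer than 64^7 for NNN, which is the reduction in required clones the abstract refers to.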

Relevance:

30.00%

Publisher:

Abstract:

Formal grammars can be used for describing complex repeatable structures such as DNA sequences. In this paper, we describe the structural composition of DNA sequences using a context-free stochastic L-grammar. L-grammars are a special class of parallel grammars that can model the growth of living organisms, e.g. plant development, and model the morphology of a variety of organisms. We believe that parallel grammars can also be used for modeling genetic mechanisms and sequences such as promoters. Promoters are short regulatory DNA sequences located upstream of a gene. Detection of promoters in DNA sequences is important for successful gene prediction. Promoters can be recognized by certain patterns that are conserved within a species, but there are many exceptions, which makes promoter recognition a complex problem. We replace the problem of promoter recognition by induction of context-free stochastic L-grammar rules, which are later used for the structural analysis of promoter sequences. L-grammar rules are derived automatically from the Drosophila and vertebrate promoter datasets using a genetic programming technique and their fitness is evaluated using a Support Vector Machine (SVM) classifier. The artificial promoter sequences generated using the derived L-grammar rules are analyzed and compared with natural promoter sequences.
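To make the idea of a context-free stochastic L-grammar concrete, the sketch below rewrites every symbol of a DNA string in parallel, choosing each replacement at random according to rule weights. The rules shown are invented toy rules, not rules derived in the paper, and the genetic-programming search and SVM fitness evaluation are not reproduced here.

```python
import random


def rewrite(sequence, rules, rng):
    """Apply one parallel rewriting step: every symbol is replaced at once.

    rules maps a symbol to a list of (replacement, weight) pairs; symbols
    without a rule are copied unchanged (context-free: only the symbol
    itself decides which rules apply).
    """
    out = []
    for symbol in sequence:
        options = rules.get(symbol)
        if not options:
            out.append(symbol)
            continue
        replacements = [replacement for replacement, _ in options]
        weights = [weight for _, weight in options]
        out.append(rng.choices(replacements, weights=weights, k=1)[0])
    return "".join(out)


def derive(axiom, rules, steps, seed=0):
    """Derive a sequence from `axiom` by `steps` parallel rewriting steps."""
    rng = random.Random(seed)  # seeded so a derivation is reproducible
    sequence = axiom
    for _ in range(steps):
        sequence = rewrite(sequence, rules, rng)
    return sequence


# Hypothetical toy rules over the DNA alphabet (illustration only).
TOY_RULES = {
    "T": [("TA", 0.6), ("T", 0.4)],
    "A": [("AT", 0.5), ("A", 0.3), ("AA", 0.2)],
    "G": [("GC", 0.5), ("G", 0.5)],
}

print(derive("TATAG", TOY_RULES, steps=3, seed=1))
```

The parallel step is what distinguishes an L-grammar from an ordinary sequential grammar: all symbols are rewritten in the same generation, which is why such grammars were first used to model growth.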

Relevance:

30.00%

Publisher:

Abstract:

The hepatitis C virus (HCV) is able to persist as a chronic infection, which can lead to cirrhosis and liver cancer. There is evidence that clearance of HCV is linked to strong responses by CD8 cytotoxic T lymphocytes (CTLs), suggesting that eliciting CTL responses against HCV through an epitope-based vaccine could prove an effective means of immunization. However, HCV genomic plasticity as well as the polymorphisms of HLA I molecules restricting CD8 T-cell responses challenges the selection of epitopes for a widely protective vaccine. Here, we devised an approach to overcome these limitations. From available databases, we first collected a set of 245 HCV-specific CD8 T-cell epitopes, all known to be targeted in the course of a natural infection in humans. After a sequence variability analysis, we next identified 17 highly invariant epitopes. Subsequently, we predicted the epitope HLA I binding profiles that determine their potential presentation and recognition. Finally, using the relevant HLA I-genetic frequencies, we identified various epitope subsets encompassing 6 conserved HCV-specific CTL epitopes each predicted to elicit an effective T-cell response in any individual regardless of their HLA I background. We implemented this epitope selection approach for free public use at the EPISOPT web server. © 2013 Magdalena Molero-Abraham et al.

Relevance:

30.00%

Publisher:

Abstract:

Mammalian C3 is a pivotal complement protein, encoded by a single gene. In some vertebrate species multiple C3 isoforms are products of different C3 genes. The goal of this study was to determine whether multiple genes encode shark C3. A protocol was developed for the isolation of mRNA from shark blood for the isolation of C3 cDNA clones. RT-PCR amplification of mRNA, using sense (GCGEQNM) and antisense (TWLTAYV) primers encoding conserved regions of human C3, yielded 21 clones. The C3-like clones isolated shared 97% similarity with each other and 40% similarity to human C3. RACE-PCR amplification of shark liver RNA, using gene-specific primers, yielded products ranging from 1800 bp to 3000 bp. A deduced amino acid sequence, corresponding to 408 bp of the 1800 bp fragment, was obtained which showed 51% similarity to human C3. These results suggest that nurse shark C3 might be encoded by more than one gene.

Relevance:

30.00%

Publisher:

Abstract:

Background: During alternative splicing, the inclusion of an exon in the final mRNA molecule is determined by nuclear proteins that bind cis-regulatory sequences in a target pre-mRNA molecule. A recent study suggested that the regulatory codes of individual RNA-binding proteins may be nearly immutable between very diverse species such as mammals and insects. The model system Drosophila melanogaster therefore presents an excellent opportunity for the study of alternative splicing due to the availability of quality EST annotations in FlyBase. Methods: In this paper, we describe an in silico analysis pipeline to extract putative exonic splicing regulatory sequences from a multiple alignment of 15 species of insects. Our method, ESTs-to-ESRs (E2E), uses graph analysis of EST splicing graphs to identify mutually exclusive (ME) exons and combines phylogenetic measures, a sliding window approach along the multiple alignment and the Welch’s t statistic to extract conserved ESR motifs. Results: The most frequent 100% conserved word of length 5 bp in different insect exons was “ATGGA”. We identified 799 statistically significant “spike” hexamers, 218 motifs with either a left or right FDR corrected spike magnitude p-value < 0.05 and 83 with both left and right uncorrected p < 0.01. 11 genes were identified with highly significant motifs in one ME exon but not in the other, suggesting regulation of ME exon splicing through these highly conserved hexamers. The majority of these genes have been shown to have regulated spatiotemporal expression. 10 elements were found to match three mammalian splicing regulator databases. A putative ESR motif, GATGCAG, was identified in the ME-13b but not in the ME-13a of Drosophila N-Cadherin, a gene that has been shown to have a distinct spatiotemporal expression pattern of spliced isoforms in a recent study. 
Conclusions: Analysis of phylogenetic relationships and variability of sequence conservation as implemented in the E2E spikes method may lead to improved identification of ESRs. We found that approximately half of the putative ESRs in common between insects and mammals have a high statistical support (p < 0.01). Several Drosophila genes with spatiotemporal expression patterns were identified to contain putative ESRs located in one exon of the ME exon pairs but not in the other.
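The notion of a “100% conserved word” in a multiple alignment, as reported in the Results above, can be sketched as a sliding window over fully conserved, gap-free columns. This is a deliberately simplified illustration: it omits the splicing-graph analysis, phylogenetic measures and Welch's t statistic of the E2E method, and the function name and toy alignment are assumptions.

```python
from collections import Counter


def conserved_words(alignment, k=5):
    """Count k-mers that are identical and gap-free in every aligned row.

    alignment: list of equal-length strings, with '-' marking gaps.
    """
    if not alignment:
        return Counter()
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned rows must have the same length")

    # A column is conserved if every row carries the same non-gap character.
    conserved = [
        len(set(column)) == 1 and column[0] != "-"
        for column in zip(*alignment)
    ]

    counts = Counter()
    for i in range(length - k + 1):
        # The window qualifies only if all k of its columns are conserved.
        if all(conserved[i:i + k]):
            counts[alignment[0][i:i + k]] += 1
    return counts


# Toy alignment of three rows (invented sequences, for illustration only).
toy_alignment = ["ATGGACCT", "ATGGACGT", "ATGGAC-T"]
print(conserved_words(toy_alignment, k=5).most_common())
```

Summing such counters over many exon alignments would give a frequency ranking of conserved words comparable in spirit to the one the abstract summarises.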

Relevance:

30.00%

Publisher:

Abstract:

Post-transcriptional regulation of cytoplasmic mRNAs is an efficient mechanism of regulating the amounts of active protein within a eukaryotic cell. RNA sequence elements located in the untranslated regions of mRNAs can influence transcript degradation or translation through associations with RNA-binding proteins. Tristetraprolin (TTP) is the best known member of a family of CCCH zinc finger proteins that targets adenosine-uridine rich element (ARE) binding sites in the 3’ untranslated regions (UTRs) of mRNAs, promoting transcript deadenylation through the recruitment of deadenylases. More specifically, TTP has been shown to bind AREs located in the 3’-UTRs of transcripts with known roles in the inflammatory response. The mRNA-binding region of the protein is the highly conserved CCCH tandem zinc finger (TZF) domain. The synthetic TTP TZF domain has been shown to bind with high affinity to the 13-mer sequence of UUUUAUUUAUUUU. However, the binding affinities of full-length TTP family members to the same sequence and its variants are unknown. Furthermore, the distance needed between two overlapping or neighboring UUAUUUAUU 9-mers for tandem binding events of a full-length TTP family member to a target transcript has not been explored. To address these questions, we recombinantly expressed and purified the full-length C. albicans TTP family member Zfs1. Using full-length Zfs1, tagged at the N-terminus with maltose binding protein (MBP), we determined the binding affinities of the protein to the optimal TTP binding sequence, UUAUUUAUU. Fluorescence anisotropy experiments determined that the binding affinities of MBP-Zfs1 to non-canonical AREs were influenced by ionic buffer strength, suggesting that transcript selectivity may be affected by intracellular conditions. Furthermore, electrophoretic mobility shift assays (EMSAs) revealed that separation of two core AUUUA sequences by two uridines is sufficient for tandem binding of MBP-Zfs1. 
Finally, we found evidence for tandem binding of MBP-Zfs1 to a 27-base RNA oligonucleotide containing only a single ARE-binding site, and showed that this was concentration and RNA length dependent; this phenomenon had not been seen previously. These data suggest that the association of the TTP TZF domain and the TZF domains of other species, to ARE-binding sites is highly conserved. Domains outside of the TZF domain may mediate transcript selectivity in changing cellular conditions, and promote protein-RNA interactions not associated with the ARE-binding TZF domain.

In summary, the evidence presented here suggests that Zfs1-mediated decay of mRNA targets may require interactions beyond ARE-TZF domain associations to promote transcript destabilization and degradation. These studies further our understanding of post-transcriptional steps in gene regulation.
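The sequence features discussed in this abstract (the UUAUUUAUU 9-mer, the AUUUA core, and the number of bases separating two cores) can be located in an RNA string with an overlapping motif search. The sketch below is only a string-level illustration under assumed names; it says nothing about binding affinity, and the example oligonucleotide is invented.

```python
import re


def find_overlapping(motif, rna):
    """Return start indices of every occurrence of motif, overlaps included."""
    # A zero-width lookahead lets matches that share bases be reported too.
    return [m.start() for m in re.finditer(f"(?={re.escape(motif)})", rna)]


def core_spacings(rna, core="AUUUA"):
    """Bases between consecutive core motifs.

    A negative value means the two cores overlap (e.g. -1 when they share
    a single adenosine, as in AUUUAUUUA).
    """
    starts = find_overlapping(core, rna)
    return [b - (a + len(core)) for a, b in zip(starts, starts[1:])]


# Invented test oligo: two AUUUA cores separated by two uridines.
oligo = "GGAUUUAUUAUUUAGG"
print(find_overlapping("AUUUA", oligo), core_spacings(oligo))
```

Reporting overlaps matters here because the 13-mer UUUUAUUUAUUUU contains two AUUUA cores that share an adenosine, so a non-overlapping search would find only one of them.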

Relevance:

30.00%

Publisher:

Abstract:

Coats plus is a highly pleiotropic disorder particularly affecting the eye, brain, bone and gastrointestinal tract. Here, we show that Coats plus results from mutations in CTC1, encoding conserved telomere maintenance component 1, a member of the mammalian homolog of the yeast heterotrimeric CST telomeric capping complex. Consistent with the observation of shortened telomeres in an Arabidopsis CTC1 mutant and the phenotypic overlap of Coats plus with the telomeric maintenance disorders comprising dyskeratosis congenita, we observed shortened telomeres in three individuals with Coats plus and an increase in spontaneous γH2AX-positive cells in cell lines derived from two affected individuals. CTC1 is also a subunit of the α-accessory factor (AAF) complex, stimulating the activity of DNA polymerase-α primase, the only enzyme known to initiate DNA replication in eukaryotic cells. Thus, CTC1 may have a function in DNA metabolism that is necessary for but not specific to telomeric integrity.

Relevance:

30.00%

Publisher:

Abstract:

Antimicrobial peptides and proteins (AMPs) are widespread in the living kingdom. They are key effectors of defense reactions and mediators of competition between organisms. They are often cationic and amphiphilic, which favors their interactions with the anionic membranes of microorganisms. Several AMP families do not directly alter membrane integrity but rather target conserved components of the bacterial membranes in a process that provides them with potent and specific antimicrobial activities. Thus, lipopolysaccharides (LPS), lipoteichoic acids (LTA) or the peptidoglycan precursor Lipid II are targeted by a broad series of AMPs. Studying the functional diversity of immune effectors tells us about the essential residues involved in the AMP mechanism of action. Marine invertebrates have been found to produce a remarkable diversity of AMPs. Molluscan defensins and crustacean anti-LPS factors (ALF) are diverse in terms of amino acid sequence and show contrasting phenotypes in terms of antimicrobial activity. Their activity is directed essentially against Gram-positive or Gram-negative bacteria due to their specific interactions with Lipid II or Lipid A, respectively. Through those examples, we discuss here how sequence diversity generated throughout evolution informs us on residues required for essential molecular interactions at the bacterial membranes and subsequent antibacterial activity. Through the analysis of molecular variants having lost antibacterial activity or shaped novel functions, we also discuss the molecular bases of functional divergence in AMPs.

Relevance:

30.00%

Publisher:

Abstract:

Insertion sequence IS900 is used as a target for the identification of Mycobacterium avium subsp. paratuberculosis. Previous reports have revealed single nucleotide polymorphisms within IS900. This study, which analyzed the IS900 sequences of a panel of isolates representing M. avium subsp. paratuberculosis strain types I, II, and III, revealed conserved type-specific polymorphisms that could be utilized as a tool for diagnostic and epidemiological purposes.

Relevance:

20.00%

Publisher:

Abstract:

The actinobacterium Streptomyces wadayamensis A23 is an endophyte of Citrus reticulata that produces the antimycin and mannopeptimycin antibiotics, among others. The strain has the capability to inhibit Xylella fastidiosa growth. The draft genome of S. wadayamensis A23 has ~7.0 Mb and 6,006 protein-coding sequences, with a 73.5% G+C content.

Relevance:

20.00%

Publisher:

Abstract:

Bacillus safensis is a microorganism recognized for its biotechnological and industrial potential due to its interesting enzymatic portfolio. Here, as a means of gathering information about the importance of this species in oil biodegradation, we report a draft genome sequence of a strain isolated from petroleum.

Relevance:

20.00%

Publisher:

Abstract:

Avian pathogenic Escherichia coli (APEC) strains belong to a category that is associated with colibacillosis, a serious illness in the poultry industry worldwide. Additionally, some APEC groups have recently been described as potential zoonotic agents. In this work, we compared APEC strains with extraintestinal pathogenic E. coli (ExPEC) strains isolated from clinical cases of humans with extraintestinal diseases such as urinary tract infections (UTI) and bacteremia. PCR results showed that genes usually found in the ColV plasmid (tsh, iucA, iss, and hlyF) were associated with APEC strains while fyuA, irp-2, fepC, sitDchrom, fimH, crl, csgA, afa, iha, sat, hlyA, hra, cnf1, kpsMTII, clpVSakai and malX were associated with human ExPEC. Both categories shared nine serogroups (O2, O6, O7, O8, O11, O19, O25, O73 and O153) and seven sequence types (ST10, ST88, ST93, ST117, ST131, ST155, ST359, ST648 and ST1011). Interestingly, ST95, which is associated with the zoonotic potential of APEC and is widespread in avian E. coli of North America and Europe, was not detected among 76 APEC strains. When the strains were clustered based on the presence of virulence genes, most ExPEC strains (71.7%) were contained in one cluster while most APEC strains (63.2%) segregated to another. In general, the strains showed distinct genetic and fingerprint patterns, but avian and human strains of ST359, or the ST23 clonal complex (CC), presented more than 70% similarity by PFGE. The results demonstrate that some zoonotic-related STs (ST117, ST131, ST10CC, ST23CC) are present in Brazil. Also, the presence of moderate fingerprint similarities between ST359 E. coli of avian and human origin indicates that strains of this ST are candidates for having zoonotic potential.