984 results for Molecular Sequence Annotation
Abstract:
Shrews of the genus Sorex are characterized by a Holarctic distribution, and relationships among extant taxa have never been fully resolved. Phylogenies have been proposed based on morphological, karyological, and biochemical comparisons, but these analyses often produced controversial and contradictory results. Phylogenetic analyses of partial mitochondrial cytochrome b gene sequences (1011 bp) were used to examine the relationships among 27 Sorex species. The molecular data suggest that Sorex comprises two major monophyletic lineages, one restricted mostly to the New World and one with a primarily Palearctic distribution. Furthermore, several sister-species relationships are revealed by the analysis. Based on the split between the Soricinae and Crocidurinae subfamilies, we used a 95% confidence interval for both the calibration of a molecular clock and the subsequent calculation of major diversification events within the genus Sorex. Our analysis does not support an unambiguous acceleration of the molecular clock in shrews, the estimated rate being similar to other estimates of mammalian mitochondrial clocks. In addition, the data presented here indicate that estimates from the fossil record greatly underestimate divergence dates among Sorex taxa.
Abstract:
We present here a draft genome sequence of the red jungle fowl, Gallus gallus. Because the chicken is a modern descendant of the dinosaurs and the first non-mammalian amniote to have its genome sequenced, the draft sequence of its genome--composed of approximately one billion base pairs of sequence and an estimated 20,000-23,000 genes--provides a new perspective on vertebrate genome evolution, while also improving the annotation of mammalian genomes. For example, the evolutionary distance between chicken and human provides high specificity in detecting functional elements, both non-coding and coding. Notably, many conserved non-coding sequences are far from genes and cannot be assigned to defined functional classes. In coding regions the evolutionary dynamics of protein domains and orthologous groups illustrate processes that distinguish the lineages leading to birds and mammals. The distinctive properties of avian microchromosomes, together with the inferred patterns of conserved synteny, provide additional insights into vertebrate chromosome architecture.
Abstract:
Arising from either retrotransposition or genomic duplication of functional genes, pseudogenes are "genomic fossils" valuable for exploring the dynamics and evolution of genes and genomes. Pseudogene identification is an important problem in computational genomics, and is also critical for obtaining an accurate picture of a genome's structure and function. However, no consensus computational scheme for defining and detecting pseudogenes has been developed thus far. As part of the ENCyclopedia Of DNA Elements (ENCODE) project, we have compared several distinct pseudogene annotation strategies and found that different approaches and parameters often resulted in rather distinct sets of pseudogenes. We subsequently developed a consensus approach for annotating pseudogenes (derived from protein coding genes) in the ENCODE regions, resulting in 201 pseudogenes, two-thirds of which originated from retrotransposition. A survey of orthologs for these pseudogenes in 28 vertebrate genomes showed that a significant fraction (approximately 80%) of the processed pseudogenes are primate-specific sequences, highlighting the increasing retrotransposition activity in primates. Analysis of sequence conservation and variation also demonstrated that most pseudogenes evolve neutrally, and processed pseudogenes appear to have lost their coding potential immediately or soon after their emergence. In order to explore the functional implication of pseudogene prevalence, we have extensively examined the transcriptional activity of the ENCODE pseudogenes. We performed a systematic series of pseudogene-specific RACE analyses. These, together with complementary evidence derived from tiling microarrays and high throughput sequencing, demonstrated that at least a fifth of the 201 pseudogenes are transcribed in one or more cell lines or tissues.
Abstract:
The objective of this work was to standardize a semiautomated method for genotyping soybean, based on universal tail sequence primers (UTSP), and to compare it with the conventional genotyping method that uses electrophoresis in polyacrylamide gels. Thirty soybean cultivars were genotypically characterized by both methods, using 13 microsatellite loci. For the UTSP method, the number of alleles (NA) was 50 (2-7 per marker) and the polymorphic information content (PIC) ranged from 0.40 to 0.74. For the conventional method, the NA was 38 (2-5 per marker) and the PIC varied from 0.39 to 0.67. The genetic dissimilarity matrices obtained by the two methods were highly correlated with each other (0.8026), and the formed groups were coherent with the phenotypic data used for varietal registration. The 13 markers allowed the distinction of all analyzed cultivars. The low cost of the UTSP method, associated with its high accuracy, makes it ideal for the characterization of soybean cultivars and for the determination of genetic purity.
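The polymorphic information content reported above can be illustrated with a short calculation. The abstract does not state which formula was used; the sketch below assumes the commonly cited definition, PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2, where p_i is the frequency of allele i at a locus. The function name and the example frequencies are hypothetical and are not taken from the study.

```python
from itertools import combinations


def polymorphic_information_content(allele_freqs):
    """PIC for one locus, assuming the commonly cited formula:
    1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    # Probability that two randomly drawn alleles are identical.
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction for matings whose offspring are uninformative.
    uninformative = sum(
        2 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2)
    )
    return 1.0 - homozygosity - uninformative


# Hypothetical frequencies for a marker with four alleles.
example_freqs = [0.4, 0.3, 0.2, 0.1]
print(round(polymorphic_information_content(example_freqs), 4))
```

Under this definition, a marker with more alleles at more even frequencies yields a higher PIC, which is consistent with the UTSP method showing both more alleles and a higher PIC range.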
Abstract:
Despite the development of novel typing methods based on whole genome sequencing, most laboratories still rely on classical molecular methods for outbreak investigation or surveillance. Reference methods for Clostridium difficile include ribotyping and pulsed-field gel electrophoresis, which are band-comparing methods often difficult to establish and which require reference strain collections. Here, we present the double locus sequence typing (DLST) scheme as a tool to analyse C. difficile isolates. Using a collection of clinical C. difficile isolates recovered during a 1-year period, we evaluated the performance of DLST and compared the results to multilocus sequence typing (MLST), a sequence-based method that has been used to study the structure of bacterial populations and highlight major clones. DLST had a higher discriminatory power compared to MLST (Simpson's index of diversity of 0.979 versus 0.965) and successfully identified all isolates of the study (100% typeability). Previous studies showed that the discriminatory power of ribotyping was comparable to that of MLST; thus, DLST might be more discriminatory than ribotyping. DLST is easy to establish and provides several advantages, including absence of DNA extraction [polymerase chain reaction (PCR) is performed on colonies], no specific instrumentation, low cost and unambiguous definition of types. Moreover, the implementation of a DLST typing scheme on an Internet database, such as that previously done for Staphylococcus aureus and Pseudomonas aeruginosa (http://www.dlst.org), will allow users to easily obtain the DLST type by directly submitting sequencing files and will avoid problems associated with multiple databases.
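The discriminatory power quoted above can be illustrated with a short calculation. The abstract does not give the exact formula; the sketch below assumes the form of Simpson's index of diversity commonly used for typing methods, D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1)), where N is the number of isolates and n_j the number assigned to type j. The function name and the example type labels are hypothetical.

```python
from collections import Counter


def simpson_diversity(type_labels):
    """Probability that two isolates drawn without replacement
    belong to different types (assumed formula, see lead-in).
    """
    counts = Counter(type_labels)
    n_isolates = sum(counts.values())
    if n_isolates < 2:
        raise ValueError("need at least two isolates")
    # Number of ordered pairs of isolates that share a type.
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n_isolates * (n_isolates - 1))


# Hypothetical typing result for seven isolates.
example_types = ["t1", "t1", "t2", "t3", "t3", "t3", "t4"]
print(round(simpson_diversity(example_types), 3))
```

A value closer to 1 means that two randomly chosen isolates are more likely to receive different types, which is the sense in which 0.979 (DLST) indicates higher discrimination than 0.965 (MLST).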
Abstract:
Background: The tight junction (TJ) is one of the most important structures established during merozoite invasion of host cells and a large number of proteins stored in Toxoplasma and Plasmodium parasites’ apical organelles are involved in forming the TJ. Plasmodium falciparum and Toxoplasma gondii apical membrane antigen 1 (AMA-1) and rhoptry neck proteins (RONs) are the two main TJ components. It has been shown that RON4 plays an essential role during merozoite and sporozoite invasion of target cells. This study has focused on characterizing a novel Plasmodium vivax rhoptry protein, RON4, which is homologous to PfRON4 and PkRON4. Methods: The ron4 gene was re-annotated in the P. vivax genome using various bioinformatics tools and taking PfRON4 and PkRON4 amino acid sequences as templates. Gene synteny, as well as identity and similarity values between open reading frames (ORFs) belonging to the three species, were assessed. The gene transcription of pvron4, and the expression and localization of the encoded protein, were also determined in the VCG-1 strain by molecular and immunological studies. Nucleotide and amino acid sequences obtained for pvron4 in VCG-1 were compared to those from strains coming from different geographical areas. Results: PvRON4 is a 733 amino acid long protein, which is encoded by three exons, having similar transcription and translation patterns to those reported for its homologue, PfRON4. Sequencing PvRON4 from the VCG-1 strain and comparing it to P. vivax strains from different geographical locations has shown two conserved regions separated by a low complexity variable region, possibly acting as a “smokescreen”. PvRON4 contains a predicted signal sequence, a coiled-coil α-helical motif, two tandem repeats and six conserved cysteines towards the carboxy terminus and is a soluble protein lacking predicted transmembrane domains or a GPI anchor.
Indirect immunofluorescence assays have shown that PvRON4 is expressed at the apical end of schizonts and co-localizes at the rhoptry neck with PvRON2.
Abstract:
Specific monomer sequences in aromatic copolyimides are recognized through their π-stacking and hydrogen-bonding interactions with a sterically and electronically complementary molecular tweezer. These interactions enable the tweezer molecule to read monomer sequences comprising up to 27 aromatic rings by multiple adjacent binding to neighboring sites on the polymer chain.
Abstract:
Sequence-specific binding is demonstrated between pyrene-based tweezer molecules and soluble, high molar mass copolyimides. The binding involves complementary π-π stacking interactions, polymer chain-folding, and hydrogen bonding and is extremely sensitive to the steric environment around the pyromellitimide binding-site. A detailed picture of the intermolecular interactions involved has been obtained through single-crystal X-ray studies of tweezer complexes with model diimides. Ring-current magnetic shielding of polyimide protons by the pyrene "arms" of the tweezer molecule induces large complexation shifts of the corresponding ¹H NMR resonances, enabling specific triplet sequences to be identified by their complexation shifts. Extended comonomer sequences (triplets of triplets in which the monomer residues differ only by the presence or absence of a methyl group) can be "read" by a mechanism which involves multiple binding of tweezer molecules to adjacent diimide residues within the copolymer chain. The adjacent-binding model for sequence recognition has been validated by two conceptually different sets of tweezer binding experiments. One approach compares sequence-recognition events for copolyimides having either restricted or unrestricted triple-triplet sequences, and the other makes use of copolymers containing both strongly binding and completely nonbinding diimide residues. In all cases the nature and relative proportions of triple-triplet sequences predicted by the adjacent-binding model are fully consistent with the observed ¹H NMR data.
Abstract:
Antimicrobial drug resistance is a global challenge for the 21st century with the emergence of resistant bacterial strains worldwide. Transferable resistance to beta-lactam antimicrobial drugs, mediated by production of extended-spectrum beta-lactamases (ESBLs), is of particular concern. In 2004, an ESBL-carrying IncK plasmid (pCT) was isolated from cattle in the United Kingdom. The sequence was a 93,629-bp plasmid encoding a single antimicrobial drug resistance gene, bla(CTX-M-14). From this information, PCRs identifying novel features of pCT were designed and applied to isolates from several countries, showing that the plasmid has disseminated worldwide in bacteria from humans and animals. Complete DNA sequences can be used as a platform to develop rapid epidemiologic tools to identify and trace the spread of plasmids in clinically relevant pathogens, thus facilitating a better understanding of their distribution and ability to transfer between bacteria of humans and animals.
Abstract:
The cellular and molecular characteristics of a cell line (BME26) derived from embryos of the cattle tick Rhipicephalus (Boophilus) microplus were studied. The cells contained glycogen inclusions, numerous mitochondria, and vesicles with heterogeneous electron densities dispersed throughout the cytoplasm. Vesicles contained lipids and sequestered palladium meso-porphyrin (Pd-mP) and rhodamine-hemoglobin, suggesting their involvement in the autophagic and endocytic pathways. The cells phagocytosed yeast and expressed genes encoding the antimicrobial peptides microplusin and defensin. A cDNA library was made and 898 unique mRNA sequences were obtained. Among them, 556 sequences were not significantly similar to any sequence found in public databases. Annotation using Gene Ontology revealed transcripts related to several different functional classes. We identified transcripts involved in immune response such as ferritin, serine proteases, protease inhibitors, antimicrobial peptides, heat shock protein, glutathione S-transferase, peroxidase, and NADPH oxidase. BME26 cells transfected with a plasmid carrying a red fluorescent protein reporter gene (DsRed2) transiently expressed DsRed2 for up to 5 weeks. We conclude that BME26 can be used to experimentally analyze diverse biological processes that occur in R. (B.) microplus such as the innate immune response to tick-borne pathogens. (C) 2008 Elsevier Ltd. All rights reserved.
Abstract:
To contribute to our understanding of the genome complexity of sugarcane, we undertook a large-scale expressed sequence tag (EST) program. More than 260,000 cDNA clones were partially sequenced from 26 standard cDNA libraries generated from different sugarcane tissues. After the processing of the sequences, 237,954 high-quality ESTs were identified. These ESTs were assembled into 43,141 putative transcripts. Of the assembled sequences, 35.6% presented no matches with existing sequences in public databases. A global analysis of the whole SUCEST data set indicated that 14,409 assembled sequences (33% of the total) contained at least one cDNA clone with a full-length insert. Annotation of the 43,141 assembled sequences associated almost 50% of the putative identified sugarcane genes with protein metabolism, cellular communication/signal transduction, bioenergetics, and stress responses. Inspection of the translated assembled sequences for conserved protein domains revealed 40,821 amino acid sequences with 1415 Pfam domains. Reassembling the consensus sequences of the 43,141 transcripts revealed a 22% redundancy in the first assembly. This indicated that possibly 33,620 unique genes had been identified and that >90% of the sugarcane expressed genes were tagged.
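The counts quoted in this abstract can be cross-checked with simple arithmetic. The sketch below uses only figures stated in the abstract; the variable names are illustrative, and the observation that the 22% figure is a rounded value is an inference from the numbers, not a statement from the source.

```python
# Figures taken from the abstract.
assembled_transcripts = 43_141
full_length_assemblies = 14_409
estimated_unique_genes = 33_620

# Share of assembled sequences containing a full-length insert.
full_length_share = full_length_assemblies / assembled_transcripts

# Redundancy implied by the estimated number of unique genes.
implied_redundancy = 1 - estimated_unique_genes / assembled_transcripts

print(f"full-length share:  {full_length_share:.1%}")
print(f"implied redundancy: {implied_redundancy:.1%}")
```

Both values agree with the reported 33% and 22% once rounded to whole percentages.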
Abstract:
The complete nucleotide sequence of a nerve growth factor precursor from the snake Bothrops jararacussu (Bj-NGF) was determined by DNA sequencing of a clone from a cDNA library prepared from the poly(A)+ RNA of the venom gland of B. jararacussu. The cDNA encoding the Bj-NGF precursor was 723 bp in length and encoded a prepro-NGF molecule of 241 amino acid residues. The mature Bj-NGF molecule was composed of 118 amino acid residues with a theoretical pI and molecular weight of 8.31 and 13,537, respectively. Its amino acid sequence showed 97%, 96%, 93%, 86%, 78%, 74%, 76%, 76% and 55% sequence similarity with NGFs from Crotalus durissus terrificus, Agkistrodon halys pallas, Daboia (Vipera) russelli russelli, Bungarus multicinctus, Naja sp., mouse, human, bovine and cat, respectively. Phylogenetic analyses based on the amino acid sequences of 15 NGFs separate the Elapidae family (Naja and Bungarus) from the Crotalidae snakes (Bothrops, Crotalus and Agkistrodon). The three-dimensional structure of mature Bj-NGF was modeled based on the crystal structure of human NGF. The model reveals that the core of NGF, formed by a pair of β-sheets, is highly conserved and that the major mutations are at the three β-hairpin loops and at the reverse turn. (C) 2002 Société française de biochimie et biologie moléculaire/Éditions scientifiques et médicales Elsevier SAS. All rights reserved.