684 results for Molecular Sequence Data
Abstract:
Switzerland has a complex human immunodeficiency virus (HIV) epidemic involving several populations. We examined transmission of HIV type 1 (HIV-1) in a national cohort study. Latent class analysis was used to identify socioeconomic and behavioral groups among 6,027 patients enrolled in the Swiss HIV Cohort Study between 2000 and 2011. Phylogenetic analysis of sequence data, available for 4,013 patients, was used to identify transmission clusters. Concordance between sociobehavioral groups and transmission clusters was assessed in correlation and multiple correspondence analyses. A total of 2,696 patients were infected with subtype B, 203 with subtype C, 196 with subtype A, and 733 with recombinant subtypes (mainly CRF02_AG and CRF01_AE). Latent class analysis identified 8 patient groups. Most transmission clusters of subtype B were shared between groups of gay men (groups 1-3) or between the heterosexual groups "heterosexual people of lower socioeconomic position" (group 4) and "injection drug users" (group 8). Clusters linking homosexual and heterosexual groups were associated with "older heterosexual and gay people on welfare" (group 5). "Migrant women in heterosexual partnerships" (group 6) and "heterosexual migrants on welfare" (group 7) shared non-B clusters with groups 4 and 5. Combining approaches from social and molecular epidemiology can provide insights into HIV-1 transmission and inform the design of prevention strategies.
Abstract:
Many species contain genetic lineages that are phylogenetically intermixed with those of other species. In the Sorex araneus group, previous results based on mtDNA and Y chromosome sequence data showed an incongruent position of Sorex granarius within this group. In this study, we explored the relationship between species within the S. araneus group, aiming to resolve the particular position of S. granarius. In this context, we sequenced a total of 2447 base pairs (bp) of X-linked and nuclear genes from 47 individuals of the S. araneus group. The same taxa were also analyzed within a Bayesian framework with nine autosomal microsatellites. These analyses revealed that all markers apart from mtDNA showed similar patterns, suggesting that the problematic position of S. granarius is best explained by an incongruent behavior by mtDNA. Given their close phylogenetic relationship and their close geographic distribution, the most likely explanation for this pattern is past mtDNA introgression from S. araneus race Carlit to S. granarius.
Abstract:
One major methodological problem in analysis of sequence data is the determination of costs from which distances between sequences are derived. Although this problem is currently not optimally dealt with in the social sciences, it has some similarity with problems that have been solved in bioinformatics for three decades. In this article, the authors propose an optimization of substitution and deletion/insertion costs based on computational methods. The authors provide an empirical way of determining costs for cases, frequent in the social sciences, in which theory does not clearly promote one cost scheme over another. Using three distinct data sets, the authors tested the distances and cluster solutions produced by the new cost scheme in comparison with solutions based on cost schemes associated with other research strategies. The proposed method performs well compared with other cost-setting strategies, while it alleviates the justification problem of cost schemes.
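To make the role of the costs concrete, the following is a minimal sketch of a sequence distance computed by dynamic programming from a substitution-cost function and a single insertion/deletion cost. It is not the optimization procedure proposed in the article; the names `sequence_distance` and `constant_sub_cost` and the default cost values are assumptions chosen for illustration.

```python
from typing import Callable, Sequence


def sequence_distance(
    a: Sequence[str],
    b: Sequence[str],
    sub_cost: Callable[[str, str], float],
    indel_cost: float = 1.0,
) -> float:
    """Minimum total cost of turning sequence a into sequence b.

    Allowed operations: substitute one state for another (cost given by
    sub_cost) and insert or delete one state (cost indel_cost).
    """
    n, m = len(a), len(b)
    # prev[j] holds the distance between the first i-1 states of a
    # and the first j states of b.
    prev = [j * indel_cost for j in range(m + 1)]
    for i in range(1, n + 1):
        curr = [i * indel_cost] + [0.0] * m
        for j in range(1, m + 1):
            curr[j] = min(
                prev[j] + indel_cost,                        # delete a[i-1]
                curr[j - 1] + indel_cost,                    # insert b[j-1]
                prev[j - 1] + sub_cost(a[i - 1], b[j - 1]),  # substitute
            )
        prev = curr
    return prev[m]


def constant_sub_cost(x: str, y: str) -> float:
    """Hypothetical cost scheme: every mismatch costs 2, a match costs 0."""
    return 0.0 if x == y else 2.0
```

Changing `sub_cost` or `indel_cost` changes the resulting distances and therefore any cluster solution built on them, which is why the choice of cost scheme needs justification.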
Abstract:
OBJECTIVE: To assess the molecular epidemiology and risk factors of predominant clones and sporadic strains of methicillin-resistant Staphylococcus aureus (MRSA) in Swiss hospitals and to compare them with European strains of epidemic clones. MATERIAL AND METHODS: One-year national survey of MRSA cases. Analysis of epidemiological and molecular typing data (PFGE) of MRSA strains. RESULTS: In 1997, 385 cases of MRSA were recorded in the five Swiss university hospitals and in 47 community hospitals. Half of the cases were found in Geneva hospitals where MRSA was already known to be endemic. Molecular typing of 288 isolates (one per case) showed that 186 (65%) belong to four predominant clones, three of which were mostly present in Geneva hospitals. In contrast, the fourth clone (85 cases) was found in 23 hospitals (in one to 16 cases per hospital). The remaining 35% of the strains were clustered into 62 pulsed field gel electrophoresis types. They accounted for one to five patients per hospital and were defined as sporadic. Multivariate analysis revealed no independent risk factors for harboring a predominant versus a sporadic strain, except that transfer from a foreign hospital increases the risk of harboring a sporadic strain (OR, 42; 95% CI, 5-360). CONCLUSION: While cases with predominant clones were due to the local spread of these clones, most sporadic cases appear to be due to the continuous introduction of new strains into the country. With the exception of a transfer from a hospital outside Switzerland, no difference in the clinical or epidemiological characteristics was observed between patients harboring a predominant clone and those with a sporadic strain.
Abstract:
Plasmodium falciparum is the parasite responsible for the most acute form of malaria in humans. Recently, the serine repeat antigen (SERA) in P. falciparum has attracted attention as a potential vaccine and drug target, and it has been shown to be a member of a large gene family. To clarify the relationships among the numerous P. falciparum SERAs and to identify orthologs to SERA5 and SERA6 in Plasmodium species affecting rodents, gene trees were inferred from nucleotide and amino acid sequence data for 33 putative SERA homologs in seven different species. (A distance method for nucleotide sequences that is specifically designed to accommodate differing GC content yielded results that were largely compatible with the amino acid tree. Standard-distance and maximum-likelihood methods for nucleotide sequences, on the other hand, yielded gene trees that differed in important respects.) To infer the pattern of duplication, speciation, and gene loss events in the SERA gene family history, the resulting gene trees were then "reconciled" with two competing Plasmodium species tree topologies that have been identified by previous phylogenetic studies. Parsimony of reconciliation was used as a criterion for selecting a gene tree/species tree pair and provided (1) support for one of the two species trees and for the core topology of the amino acid-derived gene tree, (2) a basis for critiquing fine detail in a poorly resolved region of the gene tree, (3) a set of predicted "missing genes" in some species, (4) clarification of the relationship among the P. falciparum SERAs, and (5) some information about SERA5 and SERA6 orthologs in the rodent malaria parasites. Parsimony of reconciliation and a second criterion (the implied mutational pattern at two key active sites in the SERA proteins) were also seen to be useful supplements to standard "bootstrap" analysis for inferred topologies.
Abstract:
The international Functional Annotation Of the Mammalian Genomes 4 (FANTOM4) research collaboration set out to better understand the transcriptional network that regulates macrophage differentiation and to uncover novel components of the transcriptome employing a series of high-throughput experiments. The primary and unique technique is cap analysis of gene expression (CAGE), sequencing mRNA 5'-ends with a second-generation sequencer to quantify promoter activities even in the absence of gene annotation. Additional genome-wide experiments complement the setup, including short RNA sequencing, microarray gene expression profiling on large-scale perturbation experiments and ChIP-chip for epigenetic marks and transcription factors. All the experiments are performed in a differentiation time course of the THP-1 human leukemic cell line. Furthermore, we performed a large-scale mammalian two-hybrid (M2H) assay between transcription factors and monitored their expression profile across human and mouse tissues with qRT-PCR to address combinatorial effects of regulation by transcription factors. These interdependent data have been analyzed individually and in combination with each other and are published in related but distinct papers. We provide all data together with systematic annotation in an integrated view as a resource for the scientific community (http://fantom.gsc.riken.jp/4/). Additionally, we assembled a rich set of derived analysis results including published predicted and validated regulatory interactions. Here we introduce the resource and its update after the initial release.
Abstract:
The amount of sequence data available today greatly facilitates access to genes from many gene families. Primers amplifying the desired genes over a range of species are readily obtained by aligning conserved gene regions, and laborious gene isolation procedures can often be replaced by quicker PCR-based approaches. However, in the case of multigene families, PCR-based approaches bear the often ignored risk of incomplete isolation of family members. This problem is most prominent in gene families with a highly variable and thus unpredictable number of gene copies among species, such as the major histocompatibility complex (MHC). In this study, we (i) report new primers for the isolation of the MHC class IIB (MHCIIB) gene family in birds and (ii) share our experience with isolating MHCIIB genes from an unprecedented number of avian species from all over the avian phylogeny. We report important and usually underappreciated problems encountered during PCR-based multigene family isolation and provide a collection of measures to help significantly improve the chance of successfully isolating complete multigene families using PCR-based approaches.
Abstract:
Of all Pacific salmonids, Chinook salmon Oncorhynchus tshawytscha display the greatest variability in return times to freshwater. The molecular mechanisms of these differential return times have not been well described. Current methods, such as long serial analysis of gene expression (LongSAGE) and microarrays, allow gene expression to be analyzed for thousands of genes simultaneously. To investigate whether differential gene expression is observed between fall- and spring-run Chinook salmon from California's Central Valley, LongSAGE libraries were constructed. Three libraries containing between 25,512 and 29,372 sequenced tags (21 base pairs/tag) were generated using messenger RNA from the brains of adult Chinook salmon returning in fall and spring and from one ocean-caught Chinook salmon. Tags were annotated to genes using complementary DNA libraries from Atlantic salmon Salmo salar and rainbow trout O. mykiss. Differentially expressed genes, as estimated by differences in the number of sequence tags, were found in all pairwise comparisons of libraries (freshwater versus saltwater = 40 genes; fall versus spring = 11 genes; and spawning versus nonspawning = 51 genes). The gene for ependymin, an extracellular glycoprotein involved in behavioral plasticity in fish, exhibited the most differential expression among the three groupings. Reverse transcription polymerase chain reaction analysis verified the differential expression of ependymin between the fall- and spring-run samples. These LongSAGE libraries, the first reported for Chinook salmon, provide a window on the transcriptional changes during Chinook salmon return migration to freshwater and spawning and increase the amount of expressed sequence data.
Abstract:
The fatty acids from cocoa butters of different origins, varieties, and suppliers and a number of cocoa butter equivalents (Illexao 30-61, Illexao 30-71, Illexao 30-96, Choclin, Coberine, Chocosine-Illipe, Chocosine-Shea, Shokao, Akomax, Akonord, and Ertina) were investigated by bulk stable carbon isotope analysis and compound-specific isotope analysis. The interpretation is based on principal component analysis combining the fatty acid concentrations and the bulk and molecular isotopic data. The scatterplot of the first two principal components allowed detection of the addition of vegetable fats to cocoa butters. Enrichment in the heavy carbon isotope (C-13) of the bulk cocoa butter and of the individual fatty acids is related to mixing with other vegetable fats and possibly to thermally or oxidatively induced degradation during processing (e.g., drying and roasting of the cocoa beans or deodorization of the pressed fat) or storage. The feasibility of the analytical approach for authenticity assessment is discussed.
Abstract:
Calcium-dependent exocytosis of synaptic vesicles mediates the release of neurotransmitters. Important proteins in this process have been identified such as the SNAREs, synaptotagmins, complexins, Munc18 and Munc13. Structural and functional studies have yielded a wealth of information about the physiological role of these proteins. However, it has been surprisingly difficult to arrive at a unified picture of the molecular sequence of events from vesicle docking to calcium-triggered membrane fusion. Using mainly a biochemical and biophysical perspective, we briefly survey the molecular mechanisms in an attempt to functionally integrate the key proteins into the emerging picture of the neuronal fusion machine.
Abstract:
The molecular diversity of viruses complicates the interpretation of viral genomic and proteomic data. To make sense of viral gene functions, investigators must be familiar with the virus host range, replication cycle and virion structure. Our aim is to provide a comprehensive resource bridging textbook knowledge and genomic and proteomic sequences. The ViralZone web resource (www.expasy.org/viralzone/) provides fact sheets on all known virus families/genera with easy access to sequence data. A selection of reference strains (RefStrain) provides annotated standards to circumvent the exponential increase of virus sequences. Moreover, ViralZone offers a complete set of detailed and accurate virion pictures.
Abstract:
BACKGROUND: Molecular interaction information is a key resource in modern biomedical research. Publicly available data have previously been provided in a broad array of diverse formats, making access to them very difficult. The publication and wide implementation of the Human Proteome Organisation Proteomics Standards Initiative Molecular Interactions (HUPO PSI-MI) format in 2004 was a major step towards the establishment of a single, unified format by which molecular interactions should be presented, but focused purely on protein-protein interactions. RESULTS: The HUPO-PSI has further developed the PSI-MI XML schema to enable the description of interactions between a wider range of molecular types, for example nucleic acids, chemical entities, and molecular complexes. Extensive details about each supported molecular interaction can now be captured, including the biological role of each molecule within that interaction, detailed description of interacting domains, and the kinetic parameters of the interaction. The format is supported by data management and analysis tools and has been adopted by major interaction data providers. Additionally, a simpler, tab-delimited format MITAB2.5 has been developed for the benefit of users who require only minimal information in an easy to access configuration. CONCLUSION: The PSI-MI XML2.5 and MITAB2.5 formats have been jointly developed by interaction data producers and providers from both the academic and commercial sector, and are already widely implemented and well supported by an active development community. PSI-MI XML2.5 enables the description of highly detailed molecular interaction data and facilitates data exchange between databases and users without loss of information. MITAB2.5 is a simpler format appropriate for fast Perl parsing or loading into Microsoft Excel.
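As an illustration of how lightweight a tab-delimited format is to read, here is a minimal sketch of a MITAB2.5 line parser. It assumes the usual layout of 15 tab-separated columns, with "-" marking an empty field and "|" separating multiple values within a field. The short column names in `MITAB25_COLUMNS` and the function name are labels chosen here, not part of the format, and the naive split on "|" does not handle pipes inside quoted values.

```python
from typing import Dict, List

# Short labels for the 15 MITAB2.5 columns (names are this sketch's own).
MITAB25_COLUMNS = [
    "id_a", "id_b",
    "alt_ids_a", "alt_ids_b",
    "aliases_a", "aliases_b",
    "detection_methods",
    "first_authors",
    "publication_ids",
    "taxid_a", "taxid_b",
    "interaction_types",
    "source_databases",
    "interaction_ids",
    "confidence_values",
]


def parse_mitab25_line(line: str) -> Dict[str, List[str]]:
    """Split one MITAB2.5 data line into a dict of value lists per column."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < len(MITAB25_COLUMNS):
        raise ValueError(
            f"expected at least {len(MITAB25_COLUMNS)} columns, got {len(fields)}"
        )
    record: Dict[str, List[str]] = {}
    # Extra trailing columns, if any, are ignored by zip().
    for name, raw in zip(MITAB25_COLUMNS, fields):
        # "-" denotes an empty field; "|" separates multiple values.
        record[name] = [] if raw == "-" else raw.split("|")
    return record
```

Each value keeps its raw `database:identifier` form, so a caller can split on the first colon when only the identifier is needed.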
Abstract:
The GO annotation dataset provided by the UniProt Consortium (GOA: http://www.ebi.ac.uk/GOA) is a comprehensive set of evidence-based associations between terms from the Gene Ontology resource and UniProtKB proteins. Currently supplying over 100 million annotations to 11 million proteins in more than 360,000 taxa, this resource has increased 2-fold over the last 2 years and has benefited from a wealth of checks to improve annotation correctness and consistency as well as now supplying a greater information content enabled by GO Consortium annotation format developments. Detailed, manual GO annotations obtained from the curation of peer-reviewed papers are directly contributed by all UniProt curators and supplemented with manual and electronic annotations from 36 model organism and domain-focused scientific resources. The inclusion of high-quality, automatic annotation predictions ensures the UniProt GO annotation dataset supplies functional information to a wide range of proteins, including those from poorly characterized, non-model organism species. UniProt GO annotations are freely available in a range of formats accessible by both file downloads and web-based views. In addition, the introduction of a new, normalized file format in 2010 has made for easier handling of the complete UniProt-GOA data set.
Abstract:
BACKGROUND: Fourmidable is an infrastructure to curate and share the emerging genetic, molecular, and functional genomic data and protocols for ants. DESCRIPTION: The Fourmidable assembly pipeline groups nucleotide sequences into clusters before independently assembling each cluster. Subsequently, assembled sequences are annotated via Interproscan and BLAST against general and insect-specific databases. Gene-specific information can be retrieved using gene identifiers, searching for similar sequences or browsing through inferred Gene Ontology annotations. The database will readily scale as ultra-high throughput sequence data and sequences from additional species become available. CONCLUSION: Fourmidable currently houses EST data from two ant species and microarray gene expression data for one of these. Fourmidable is publicly available at http://fourmidable.unil.ch.
Abstract:
Sequence data from regions of five vertebrate vitellogenin genes were used to examine the frequency, distribution, and mutability of the dinucleotide CpG, the preferred modification site for eukaryotic DNA methyltransferases. The observed level of the CpG dinucleotide in all five genes was markedly lower than that expected from the known mononucleotide frequencies. CpG suppression was greater in introns than in exons. CpG-containing codons were found to be avoided in the vitellogenin genes, but not completely despite the redundancy of the genetic code. Frequency and distribution patterns of this dinucleotide varied dramatically among these otherwise closely related genes. Dense clusters of CpG dinucleotides tended to appear in regions of either functional or structural interest (e.g., in the transposon-like Vi-element of Xenopus) and these clusters contained 5-methylcytosine (5mC). 5mC is known to undergo deamination to form thymidine, but the extent to which this transition occurs in the heavily methylated genomes of vertebrates and its contribution to CpG suppression are still unclear. Sequence comparison of the methylated vitellogenin gene regions identified C→T and G→A substitutions that were found to occur at relatively high frequencies. The predicted products of CpG deamination, TpG and CpA, were elevated. These findings are consistent with the view that CpG distribution and methylation are interdependent and that deamination of 5mC plays an important role in promoting evolutionary change at the nucleotide sequence level.
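The comparison of observed and expected CpG levels can be sketched as a simple ratio. The example below uses one common convention, (number of CpG × sequence length) / (number of C × number of G), which takes the expected count from the mononucleotide frequencies; whether the study used exactly this formula is an assumption, and the function name is hypothetical.

```python
def cpg_observed_expected(seq: str) -> float:
    """Observed/expected CpG ratio based on mononucleotide frequencies.

    A value well below 1 indicates CpG suppression; a value near 1 means
    CpG occurs about as often as the C and G content alone would predict.
    """
    s = seq.upper()
    n = len(s)
    c_count = s.count("C")
    g_count = s.count("G")
    # "CG" cannot overlap with itself, so str.count finds every occurrence.
    cpg_count = s.count("CG")
    if c_count == 0 or g_count == 0:
        return 0.0  # no CpG is possible without both C and G
    return cpg_count * n / (c_count * g_count)
```

Applying the same ratio to TpG and CpA would give a comparable measure for the predicted deamination products mentioned in the abstract.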