28 results for Protein Sequence Analysis
in BORIS: Bern Open Repository and Information System - Bern - Switzerland
Abstract:
A porcine BAC clone harboring the tightly linked IFNAR1 and IFNGR2 genes was identified by comparative analysis of the publicly available porcine BAC end sequences. The complete 168,835 bp insert sequence of this clone was determined. Sequence comparisons of the genomic sequence with EST sequences from public databases were performed and allowed a detailed annotation of the IFNAR1 and IFNGR2 genes. The analyzed genes showed a conserved genomic organization with their known mammalian orthologs; however, the sequence conservation of these genes across species was relatively low. In addition to the IFNAR1 and IFNGR2 genes, which were completely sequenced, the analyzed BAC clone also contained parts of an orphan gene encoding a putative transmembrane protein (TMEM50B). In contrast to the IFNAR1 and IFNGR2 genes, the sequence conservation of the TMEM50B gene across different mammalian species was extremely high.
Abstract:
Mycobacterium abscessus, Mycobacterium bolletii, and Mycobacterium massiliense (Mycobacterium abscessus sensu lato) are closely related species that currently are identified by the sequencing of the rpoB gene. However, recent studies show that rpoB sequencing alone is insufficient to discriminate between these species, and some authors have questioned their current taxonomic classification. We studied here a large collection of M. abscessus (sensu lato) strains by partial rpoB sequencing (752 bp) and multilocus sequence analysis (MLSA). The final MLSA scheme developed was based on the partial sequences of eight housekeeping genes: argH, cya, glpK, gnd, murC, pgm, pta, and purH. The strains studied included the three type strains (M. abscessus CIP 104536(T), M. massiliense CIP 108297(T), and M. bolletii CIP 108541(T)) and 120 isolates recovered between 1997 and 2007 in France, Germany, Switzerland, and Brazil. The rpoB phylogenetic tree confirmed the existence of three main clusters, each comprising the type strain of one species. However, divergence values between the M. massiliense and M. bolletii clusters all were below 3% and between the M. abscessus and M. massiliense clusters were from 2.66 to 3.59%. The tree produced using the concatenated MLSA gene sequences (4,071 bp) also showed three main clusters, each comprising the type strain of one species. The M. abscessus cluster had a bootstrap value of 100% and was mostly compact. Bootstrap values for the M. massiliense and M. bolletii branches were much lower (71 and 61%, respectively), with the M. massiliense cluster having a fuzzy aspect. Mean (range) divergence values were 2.17% (1.13 to 2.58%) between the M. abscessus and M. massiliense clusters, 2.37% (1.5 to 2.85%) between the M. abscessus and M. bolletii clusters, and 2.28% (0.86 to 2.68%) between the M. massiliense and M. bolletii clusters. 
Adding the rpoB sequence to the MLSA-concatenated sequence (total sequence, 4,823 bp) had little effect on the clustering of strains. We found 10/120 (8.3%) isolates for which the concatenated MLSA gene sequence and rpoB sequence were discordant (e.g., M. massiliense MLSA sequence and M. abscessus rpoB sequence), suggesting the intergroup lateral transfers of rpoB. In conclusion, our study strongly supports the recent proposal that M. abscessus, M. massiliense, and M. bolletii should constitute a single species. Our findings also indicate that there has been a horizontal transfer of rpoB sequences between these subgroups, precluding the use of rpoB sequencing alone for the accurate identification of the two proposed M. abscessus subspecies.
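The divergence values in this abstract are percentages of differing positions between aligned sequences. A minimal sketch of such a calculation follows, assuming pre-aligned sequences of equal length and an uncorrected p-distance; the abstract does not state which distance measure the authors used. The function names and the toy locus data are hypothetical.

```python
def concatenate(loci: dict, order: list) -> str:
    """Join per-locus aligned sequences in a fixed locus order (MLSA-style)."""
    return "".join(loci[name] for name in order)


def p_distance(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions that differ.

    Columns containing a gap or an ambiguous base in either sequence
    are skipped (an assumption; other conventions exist).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = "ACGT"
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in valid or b not in valid:
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


# Hypothetical toy data: two strains, two loci each.
order = ["argH", "cya"]
strain_1 = {"argH": "ACGT", "cya": "ACGT"}
strain_2 = {"argH": "ACGT", "cya": "ACGA"}
divergence = p_distance(concatenate(strain_1, order), concatenate(strain_2, order))
```

Concatenating in a fixed locus order keeps positions comparable across strains, which is why `order` is passed explicitly rather than relying on dictionary ordering.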
Abstract:
Cytochrome P450 enzymes (CYP450s) represent a superfamily of haem-thiolate proteins. CYP450s are most abundant in the liver, a major site of drug metabolism, and play key roles in the metabolism of a variety of substrates, including drugs and environmental contaminants. Interaction of two or more different drugs with the same enzyme can account for adverse effects and failure of therapy. Human CYP3A4 metabolizes about 50% of all known drugs, but little is known about the orthologous CYP450s in horses. We report here the genomic organization of the equine CYP3A gene cluster as well as a comparative analysis with the human CYP3A gene cluster. The equine CYP450 genes of the 3A family are located on ECA 13 between 6.97-7.53 Mb, in a region syntenic to HSA 7 99.05-99.35 Mb. Seven potential, closely linked equine CYP3A genes were found, in contrast to only four genes in the human genome. RNA was isolated from an equine liver sample, and the approximately 1.5-kb coding sequence of six CYP3A genes could be amplified by RT-PCR. Sequencing of the RT-PCR products revealed numerous hitherto unknown single nucleotide polymorphisms (SNPs) in these six CYP3A genes, and one 6-bp deletion compared to the reference sequence (EquCab2.0). The presence of the variants was confirmed in a sample of genomic DNA from the same horse. In conclusion, orthologous genes for the CYP3A family exist in horses, but their number differs from those of the human CYP3A gene family. CYP450 genes of the same family show high homology within and between mammalian species, but can be highly polymorphic.
Abstract:
Defensins are a family of evolutionarily ancient antimicrobial peptides consisting of three sub-families: alpha-, beta- and theta-defensins. This investigation was focused on the genomic characterization of equine beta-defensins and the investigation of the potential clustering of beta-defensin genes in the equine genome. Six genomic BAC clones were isolated from the CHORI-241 library and one of these was mapped by FISH to ECA 27q17. This location was confirmed by RH-mapping. The contiguous 212 kb sequence of this clone was determined. Sequence analysis identified ten pseudogenes and nine genes, six of which were highly homologous to human beta-defensin DEFB4. Clustering of the beta-defensin genes was confirmed and the order of the genes on the analyzed BAC was related to the corresponding defensin cluster on HSA 8. The knowledge about the sequence and the genomic structure of the equine beta-defensin genes will improve the classification of different paralogous defensin genes and is a prerequisite for subsequent functional studies. Additionally, the first alpha-defensin-like sequence outside the groups of primates, lagomorphs and rodents (glires) was identified.
Abstract:
Genome predictions based on selected genes would be a very welcome approach for taxonomic studies, including DNA-DNA similarity, G+C content and representative phylogeny of bacteria. At present, DNA-DNA hybridizations are still considered the gold standard in species descriptions. However, this method is time-consuming and troublesome, and datasets can vary significantly between experiments as well as between laboratories. For the same reasons, full matrix hybridizations are rarely performed, weakening the significance of the results obtained. The authors established a universal sequencing approach for the three genes recN, rpoA and thdF for the Pasteurellaceae, and determined if the sequences could be used for predicting DNA-DNA relatedness within the family. The sequence-based similarity values calculated using a previously published formula proved most useful for species and genus separation, indicating that this method provides better resolution and no experimental variation compared to hybridization. By this method, cross-comparisons within the family over species and genus borders easily become possible. The three genes also serve as an indicator of the genome G+C content of a species. A mean divergence of around 1 % was observed from the classical method, which in itself has poor reproducibility. Finally, the three genes can be used alone or in combination with already-established 16S rRNA, rpoB and infB gene-sequencing strategies in a multisequence-based phylogeny for the family Pasteurellaceae. It is proposed to use the three sequences as a taxonomic tool, replacing DNA-DNA hybridization.
Abstract:
We improved, evaluated, and used Sanger sequencing for quantification of single nucleotide polymorphism (SNP) variants in transcripts and gDNA samples. This improved assay resulted in highly reproducible relative allele frequencies (e.g., for a heterozygous gDNA 50.0+/-1.4%, and for a missense mutation-bearing transcript 46.9+/-3.7%) with a lower detection limit of 3-9%. It provided excellent accuracy and linear correlation between expected and observed relative allele frequencies. This sequencing assay, which can also be used for the quantification of copy number variations (CNVs), methylations, mosaicisms, and DNA pools, enabled us to analyze transcripts of the FBN1 gene in fibroblasts and blood samples of patients with suspected Marfan syndrome not only qualitatively but also quantitatively. We report a total of 18 novel and 19 known FBN1 sequence variants leading to a premature termination codon (PTC), 26 of which we analyzed by quantitative sequencing both at gDNA and cDNA levels. The relative amounts of PTC-containing FBN1 transcripts in fresh and PAXgene-stabilized blood samples were significantly higher (33.0+/-3.9% to 80.0+/-7.2%) than those detected in affected fibroblasts with inhibition of nonsense-mediated mRNA decay (NMD) (11.0+/-2.1% to 25.0+/-1.8%), whereas in fibroblasts without NMD inhibition no mutant alleles could be detected. These results provide evidence for incomplete NMD in leukocytes and have particular importance for RNA-based analyses not only in FBN1 but also in other genes.
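A relative allele frequency of the kind reported here can be illustrated with a small calculation. The sketch below assumes the frequency is taken as the variant peak height divided by the sum of variant and reference peak heights at the SNP position, averaged over replicate reads. That is a common simplification, not necessarily the authors' improved assay, and all names and numbers are hypothetical.

```python
from statistics import mean, stdev


def relative_allele_frequency(variant_peak: float, reference_peak: float) -> float:
    """Variant share (in percent) of the combined peak signal at one position."""
    total = variant_peak + reference_peak
    if total <= 0:
        raise ValueError("peak heights must sum to a positive value")
    return 100.0 * variant_peak / total


def summarize(replicates: list) -> tuple:
    """Mean and sample standard deviation over (variant, reference) peak pairs."""
    freqs = [relative_allele_frequency(v, r) for v, r in replicates]
    return mean(freqs), stdev(freqs)


# Hypothetical peak heights from three replicate reads of a heterozygous sample.
avg, sd = summarize([(480.0, 520.0), (500.0, 500.0), (520.0, 480.0)])
```

`stdev` needs at least two replicates, which matches the mean +/- deviation style of the values quoted in the abstract.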
Abstract:
Multilocus sequence analysis (MLSA) based on recN, rpoA and thdF genes was done on more than 30 species of the family Enterobacteriaceae with a focus on Cronobacter and the related genus Enterobacter. The sequences provide valuable data for phylogenetic, taxonomic and diagnostic purposes. Phylogenetic analysis showed that the genus Cronobacter forms a homogenous cluster related to recently described species of Enterobacter, but distant to other species of this genus. Combining sequence information on all three genes is highly representative for the species' %GC-content used as taxonomic marker. Sequence similarity of the three genes and even of recN alone can be used to extrapolate genetic similarities between species of Enterobacteriaceae. Finally, the rpoA gene sequence, which is the easiest one to determine, provides a powerful diagnostic tool to identify and differentiate species of this family. The comparative analysis gives important insights into the phylogeny and genetic relatedness of the family Enterobacteriaceae and will serve as a basis for further studies and clarifications on the taxonomy of this large and heterogeneous family.
Abstract:
Cloud computing provides a promising solution to the genomics data deluge problem resulting from the advent of next-generation sequencing (NGS) technology. Based on the concepts of “resources-on-demand” and “pay-as-you-go”, scientists with no or limited infrastructure can have access to scalable and cost-effective computational resources. However, the large size of NGS data causes a significant data transfer latency from the client’s site to the cloud, which presents a bottleneck for using cloud computing services. In this paper, we provide a streaming-based scheme to overcome this problem, where the NGS data is processed while being transferred to the cloud. Our scheme targets the wide class of NGS data analysis tasks, where the NGS sequences can be processed independently from one another. We also provide the elastream package that supports the use of this scheme with individual analysis programs or with workflow systems. Experiments presented in this paper show that our solution mitigates the effect of data transfer latency and saves both time and cost of computation.
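The idea of processing reads while they are still arriving can be sketched with a generator that consumes a text stream one record at a time. This is an illustration only and does not use or reproduce the elastream package, whose interface is not described in the abstract. It assumes four-line FASTQ records, and the per-read task (GC fraction) is a hypothetical stand-in for any analysis that treats reads independently.

```python
import io
from typing import Iterator, TextIO, Tuple


def fastq_records(stream: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (name, sequence, quality) as soon as each 4-line record is read."""
    while True:
        header = stream.readline()
        if not header:
            return  # end of stream
        seq = stream.readline().rstrip("\n")
        stream.readline()  # '+' separator line, ignored
        qual = stream.readline().rstrip("\n")
        yield header.rstrip("\n")[1:], seq, qual


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in one read (0.0 for an empty read)."""
    if not seq:
        return 0.0
    return sum(1 for base in seq.upper() if base in "GC") / len(seq)


def process_stream(stream: TextIO) -> Iterator[Tuple[str, float]]:
    """Process each read independently, without waiting for the whole file."""
    for name, seq, _qual in fastq_records(stream):
        yield name, gc_fraction(seq)


# Usage with an in-memory stream standing in for a network transfer.
demo = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nGGCC\n+\nIIII\n")
results = list(process_stream(demo))
```

Because nothing is buffered beyond the current record, the same loop works on a socket or pipe wrapped as a text stream, so computation overlaps with transfer.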
Abstract:
Sequence analysis and optimal matching are useful heuristic tools for the descriptive analysis of heterogeneous individual pathways such as educational careers, job sequences or patterns of family formation. However, to date it remains unclear how to handle the inevitable problems caused by missing values with regard to such analysis. Multiple Imputation (MI) offers a possible solution for this problem but it has not been tested in the context of sequence analysis. Against this background, we contribute to the literature by assessing the potential of MI in the context of sequence analyses using an empirical example. Methodologically, we draw upon the work of Brendan Halpin and extend it to additional types of missing value patterns. Our empirical case is a sequence analysis of panel data with substantial attrition that examines the typical patterns and the persistence of sex segregation in school-to-work transitions in Switzerland. The preliminary results indicate that MI is a valuable methodology for handling missing values due to panel mortality in the context of sequence analysis. MI is especially useful in facilitating a sound interpretation of the resulting sequence types.
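Optimal matching compares two state sequences by the minimum total cost of insertions, deletions and substitutions needed to turn one into the other. A minimal sketch follows, assuming constant costs (indel 1, substitution 2), which is one common convention and not necessarily the one used in the study. The state codes are hypothetical.

```python
def om_distance(seq_a, seq_b, indel: float = 1.0, sub: float = 2.0) -> float:
    """Optimal-matching (edit) distance with constant indel and substitution costs."""
    n, m = len(seq_a), len(seq_b)
    # prev[j] holds the cost of aligning the first i-1 states of seq_a
    # with the first j states of seq_b.
    prev = [j * indel for j in range(m + 1)]
    for i in range(1, n + 1):
        curr = [i * indel] + [0.0] * m
        for j in range(1, m + 1):
            cost = 0.0 if seq_a[i - 1] == seq_b[j - 1] else sub
            curr[j] = min(
                prev[j] + indel,      # delete a state from seq_a
                curr[j - 1] + indel,  # insert a state from seq_b
                prev[j - 1] + cost,   # match or substitute
            )
        prev = curr
    return prev[m]


# Hypothetical yearly states: E = education, J = job, U = unemployed.
d_same = om_distance("EEJJ", "EEJJ")
d_one_change = om_distance("EEJJ", "EEUJ")
```

A missing year shortens or breaks a sequence and therefore changes these distances, which is why the handling of missing values matters before pathways are clustered.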
Abstract:
Aeromonas salmonicida subsp. salmonicida contains a functional type III secretion system that is responsible for the secretion of the ADP-ribosylating toxin AexT. In this study, the authors identified AopP as a second effector protein secreted by this system. The aopP gene was detected in both typical and atypical A. salmonicida isolates and was found to be encoded on a small plasmid of approximately 6.4 kb. Sequence analysis indicates that AopP is a member of the YopJ family of effector proteins, a group of proteins that interfere with mitogen-activated protein kinase (MAPK) and/or nuclear factor kappa B (NF-kappaB) signalling pathways. AopP inhibits the NF-kappaB pathway downstream of IkappaB kinase (IKK) activation, while a catalytically inactivated mutant, AopPC177A, does not possess this inhibitory effect. Unlike other effectors of the YopJ family, such as YopJ and VopA, AopP does not inhibit the MAPK signalling pathway.
Abstract:
A total of 167 sheep belonging to the Estonian whiteheaded mutton, Estonian blackheaded mutton, Lithuanian coarsewool native, Lithuanian blackface and Latvian darkheaded mutton breeds, and a population of sheep kept isolated on the Estonian island of Ruhnu, were sequence-analysed for polymorphisms in the prion protein (PrP) gene, to determine their genotype and the allele frequencies of polymorphisms in PrP known to confer resistance to scrapie. A 939 base pair fragment of exon 3 from the PrP gene was amplified by PCR and analysed by direct sequencing. For animals showing polymorphism at two nucleotide positions, both haplotypes of these double-heterozygous genotypes were further verified by PCR cloning and sequence analysis. Known polymorphisms were observed at codons 136, 154 and 171, and six different haplotypes (ARR, AHQ, ARH, AHR, ARQ and VRQ) were determined. On the basis of these polymorphisms, the six populations of sheep possessed the resistant ARR haplotype at different frequencies. The high-risk ARQ haplotype occurred in high frequencies in all six populations, but VRQ, the haplotype carrying the highest risk, occurred at low frequencies and in only three of the populations.
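Haplotype frequencies of this kind are obtained by counting both haplotypes of every animal and dividing by the total number of haplotypes (twice the number of animals). The sketch below uses invented genotypes for illustration only; they are not data from the study, and the function name is hypothetical.

```python
from collections import Counter


def haplotype_frequencies(genotypes: list) -> dict:
    """Frequency of each haplotype, counting two haplotypes per animal."""
    counts = Counter()
    for hap_1, hap_2 in genotypes:
        counts[hap_1] += 1
        counts[hap_2] += 1
    total = sum(counts.values())
    return {hap: n / total for hap, n in counts.items()}


# Invented genotypes for four animals (codons 136, 154 and 171).
flock = [("ARR", "ARQ"), ("ARQ", "ARQ"), ("ARR", "VRQ"), ("ARQ", "AHQ")]
freqs = haplotype_frequencies(flock)
```

Running the function once per breed gives the per-population frequencies that such a survey compares.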
Abstract:
Echicetin, a heterodimeric protein from the venom of Echis carinatus, binds to platelet glycoprotein Ib (GPIb) and so inhibits platelet aggregation or agglutination induced by various platelet agonists acting via GPIb. The amino acid sequence of the beta subunit of echicetin has been reported and found to belong to the recently identified snake venom subclass of the C-type lectin protein family. Echicetin alpha and beta subunits were purified. N-terminal sequence analysis provided direct evidence that the protein purified was echicetin. The paper presents the complete amino acid sequence of the alpha subunit and computer models of the alpha and beta subunits. The sequence of alpha echicetin is highly similar to the alpha and beta chains of various heterodimeric and homodimeric C-type lectins. Neither of the fully reduced and alkylated alpha or beta subunits of echicetin inhibited the platelet agglutination induced by von Willebrand factor-ristocetin or alpha-thrombin. Earlier reports about the inhibitory activity of reduced and alkylated echicetin beta subunit might have been due to partial reduction of the protein.
Abstract:
The corpus luteum (CL) is a temporary organ involved in the maintenance of pregnancy. In the course of its life-cycle, the CL undergoes two distinct and consecutive processes for its inevitable removal through apoptosis: functional and structural luteolysis. We isolated a gene encoding a novel rat zinc finger protein (ZFP), named rat ZFP96 (rZFP96), from an ovarian lambda cDNA library. Sequence analysis revealed close sequence and structural similarity to mouse ZFP96 and human zinc finger protein 305 (ZNF305). Quantitative reverse transcription-polymerase chain reaction analysis revealed a positive correlation with the end of pregnancy, that is, the onset of structural luteolysis of the CL. Messenger RNA levels increased 3-fold (P < 0.01) between days 13 and 22 of pregnancy and 8-fold (P < 0.01) between day 13 of pregnancy and day 1 post-partum. In addition, we detected rZFP96 expression in mammary, placenta, heart, kidney and skeletal muscle. Sequence analysis predicted that rZFP96 has a high probability of localizing to the nuclear compartment. The presence of both a perfect consensus TGEKP linker sequence between zinc fingers 2 and 3 as well as several similar sequences between the other zinc fingers suggests physical interaction with DNA. Speculatively, rZFP96 may therefore function as a transcription factor, switching off pro-survival genes and/or upregulating pro-apoptotic genes and thereby contributing to the demise of the CL.
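Locating a perfect TGEKP linker and near-matches in a protein sequence can be sketched as a sliding-window comparison against the consensus. The example assumes that "similar" means at most one mismatching residue, which is an illustrative threshold rather than the criterion used in the paper. The test sequence is invented and is not rZFP96.

```python
CONSENSUS = "TGEKP"


def linker_hits(protein: str, consensus: str = CONSENSUS, max_mismatches: int = 1) -> list:
    """Return (0-based start, window, mismatch count) for each near-consensus window."""
    hits = []
    k = len(consensus)
    for i in range(len(protein) - k + 1):
        window = protein[i:i + k]
        mismatches = sum(1 for a, b in zip(window, consensus) if a != b)
        if mismatches <= max_mismatches:
            hits.append((i, window, mismatches))
    return hits


# Invented sequence with one perfect linker and one single-mismatch variant.
hits = linker_hits("AAATGEKPCCCTGERPDDD")
```

Setting `max_mismatches=0` restricts the search to perfect consensus linkers.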
Abstract:
A 14-kDa outer membrane protein (OMP) was purified from Actinobacillus pleuropneumoniae serotype 2. The protein strongly reacts with sera from pigs experimentally or naturally infected with any of the 12 serotypes of A. pleuropneumoniae. The gene encoding this protein was isolated from a gene library of A. pleuropneumoniae serotype 2 reference strain by immunoscreening. Expression of the cloned gene in Escherichia coli revealed that the protein is also located in the outer membrane fraction of the recombinant host. DNA sequence analysis of the gene reveals high similarity of the protein's amino acid sequence to that of the E. coli peptidoglycan-associated lipoprotein PAL, to the Haemophilus influenzae OMP P6 and to related proteins of several other Gram-negative bacteria. We have therefore named the 14-kDa protein PalA, and its corresponding gene, palA. The 20 amino-terminal amino acid residues of PalA constitute a signal sequence characteristic of membrane lipoproteins of prokaryotes with a recognition site for the signal sequence peptidase II and a sorting signal for the final localization of the mature protein in the outer membrane. The DNA sequence upstream of palA contains an open reading frame which is highly similar to the E. coli tolB gene, indicating a gene cluster in A. pleuropneumoniae which is very similar to the E. coli tol locus. The palA gene is conserved and expressed in all A. pleuropneumoniae serotypes and in A. lignieresii. A very similar palA gene is present in A. suis and A. equuli.