939 results for Molecular Sequence Annotation
Abstract:
The first part of this work investigates the molecular epidemiology of a human enterovirus (HEV), echovirus 30 (E-30). This project is part of a series of studies performed in our research team analyzing the molecular epidemiology of HEV-B viruses. A total of 129 virus strains had been isolated in different parts of Europe. The sequence analysis was performed in three different genomic regions: 420 nucleotides (nt) in the VP4/VP2 capsid protein coding region, the entire VP1 capsid protein coding gene of 876 nt, and 150 nt in the VP1/2A junction region. The analysis revealed a succession of dominant sublineages within a major genotype. The temporally earlier genotypes had been replaced by a genetically homogenous lineage that has been circulating in Europe since the late 1970s. The same genotype was found by other research groups in North America and Australia. Globally, other cocirculating genetic lineages also exist. The prevalence of a dominant genotype makes E-30 different from other previously studied HEVs, such as polioviruses and coxsackieviruses B4 and B5, for which several coexisting genetic lineages have been reported. The second part of this work deals with molecular epidemiology of human rhinoviruses (HRVs). A total of 61 field isolates were studied in the 420-nt stretch in the capsid coding region of VP4/VP2. The isolates were collected from children under two years of age in Tampere, Finland. Sequences from the clinical isolates clustered in the two previously known phylogenetic clades. Seasonal clustering was found. Also, several distinct serotype-like clusters were found to co-circulate during the same epidemic season. Reappearance of a cluster after disappearing for a season was observed. The molecular epidemiology of the analyzed strains turned out to be complex, and we decided to continue our studies of HRV. Only five previously published complete genome sequences of HRV prototype strains were available for analysis. 
Therefore, all designated HRV prototype strains (n=102) were sequenced in the VP4/VP2 region, and the possibility of genetic typing of HRV was evaluated. Seventy-six of the 102 prototype strains clustered in HRV genetic group A (HRV-A) and 25 in group B (HRV-B). Serotype 87 clustered separately from other HRVs with HEV species D. The field strains of HRV represented as many as 19 different genotypes, as judged with an approximate demarcation of a 20% nt difference in the VP4/VP2 region. The interserotypic differences of HRV were generally similar to those reported between different HEV serotypes (i.e. about 20%), but smaller differences, less than 10%, were also observed. Because some HRV serotypes are genetically so closely related, we suggest that the genetic typing be performed using the criterion "the closest prototype strain". This study is the first systematic genetic characterization of all known HRV prototype strains, providing a further taxonomic proposal for classification of HRV. We proposed to divide the genus Human rhinoviruses into HRV-A and HRV-B. The final part of the work comprises a phylogenetic analysis of a subset (48) of HRV prototype strains and field isolates (12) in the nonstructural part of the genome coding for the RNA-dependent RNA polymerase (3D). The proposed division of the HRV strains in the species HRV-A and HRV-B was also supported by 3D region. HRV-B clustered closer to HEV species B, C, and also to polioviruses than to HRV-A. Intraspecies variation within both HRV-A and HRV-B was greater in the 3D coding region than in the VP4/VP2 coding region, in contrast to HEV. Moreover, the diversity of HRV in 3D exceeded that of HEV. One group of HRV-A, designated HRV-A', formed a separate cluster outside other HRV-A in the 3D region. It formed a cluster also in the capsid region, but located within HRV-A. This may reflect a different evolutionary history of distinct genomic regions among HRV-A. 
Furthermore, the tree topology within HRV-A in the 3D region differed from that in the VP4/VP2, suggesting possible recombination events in the evolution of the strains. No conflicting phylogenies were observed in any of the 12 field isolates. Possible recombination was further studied using the Similarity and Bootscanning analyses of the complete genome sequences of HRV available in public databases. Evidence for recombination among HRV-A was found, as HRV2 and HRV39 showed higher similarity in the nonstructural part of the genome. Whether HRV2 and HRV39 strains - and perhaps also some other HRV-A strains not yet completely sequenced - are recombinants remains to be determined.
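The "closest prototype strain" criterion described above can be illustrated with a minimal sketch. It assumes the sequences are already aligned to equal length and uses a plain proportion of differing nucleotides (p-distance); the function names and the toy sequences are hypothetical and are not taken from the study, which does not state its distance method in the abstract.

```python
def p_distance(a: str, b: str) -> float:
    """Fraction of differing positions between two aligned sequences.

    Columns where either sequence has a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable positions")
    return sum(x != y for x, y in pairs) / len(pairs)


def closest_prototype(query: str, prototypes: dict[str, str]) -> tuple[str, float]:
    """Return the name of the nearest prototype and its p-distance to the query."""
    name = min(prototypes, key=lambda n: p_distance(query, prototypes[n]))
    return name, p_distance(query, prototypes[name])


# Toy, made-up sequences (not real HRV data).
prototypes = {"protoA": "ACGTACGTAC", "protoB": "ACGTTTTTAC"}
name, dist = closest_prototype("ACGTACGTAA", prototypes)
print(name, dist)
```

A field strain would then be assigned to the type of the returned prototype; the approximate 20% nucleotide difference mentioned in the abstract could serve as a sanity check on the returned distance.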
Abstract:
The software packages NUPARM and NUCGEN are described. They can be used to understand sequence-directed structural variations in nucleic acids through the analysis and generation of non-uniform structures. A set of local inter-basepair parameters (viz. tilt, roll, twist, shift, slide and rise) has been defined; these use the geometry and coordinates of two successive basepairs only and can be used to generate polymeric structures with varying geometries for each of the 16 possible dinucleotide steps. The intra-basepair parameters propeller, buckle, opening and the C6...C8 distance can also be varied, if required, while the sugar-phosphate backbone atoms are fixed in some standard conformation in each of the nucleotides. NUPARM can be used to analyse both DNA and RNA structures, with single- as well as double-stranded helices. The NUCGEN software generates double-helical models with the backbone fixed in B-form DNA, but with appropriate modifications to the input data it can also generate A-form DNA and RNA duplex structures.
Abstract:
Transfer RNAs of Azospirillum lipoferum were separated by two-dimensional gel electrophoresis and identified by aminoacylation. Thirty-six tRNA spots were resolved by this technique and twenty-six tRNA species have been identified. There are five tRNAs for Leu, four for Val, three for Pro, two each for Arg, Ile, Lys and Tyr, and one each for Ala, Asp, His, Phe, Ser and Thr. The tRNA(Asn) (QUU) was purified and its nucleotide sequence was determined. The A. lipoferum tRNA(Asn) (QUU) is 92% similar to the B. subtilis tRNA(Asn) gene, and two hypermodified nucleosides, queuosine (Q) and N-[(9-beta-D-ribofuranosylpurin-6-yl)carbamoyl]threonine (t(6)A), are present in this tRNA.
Abstract:
The study of the evolution of species or organisms is essential for various biological applications. Evolution is typically studied at the molecular level by analyzing the mutations of DNA sequences of organisms. Techniques have been developed for building phylogenetic or evolutionary trees for a set of sequences. Though phylogenetic trees capture the overall evolutionary relationships among the sequences, they do not reveal fine-level details of the evolution. In this work, we attempt to resolve various fine-level sequence transformation details associated with a phylogenetic tree using cellular automata. In particular, our work tries to determine the cellular automata rules for neighbor-dependent mutations of segments of DNA sequences. We also determine the number of time steps needed for evolution of a progeny from an ancestor and the unknown segments of the intermediate sequences in the phylogenetic tree. Because of the vast number of cellular automata rules, we have developed a grid system that performs parallel guided explorations of the rules on grid resources. We demonstrate our techniques by conducting experiments on a grid comprising machines in three countries and obtaining potentially useful statistics regarding evolution in three HIV sequences. In particular, our work is able to verify the phenomenon of neighbor-dependent mutations and finds that certain combinations of neighbor-dependent mutations, defined by a cellular automata rule, occur with greater than 90% probability. We also find the average number of time steps for mutations for some branches of the phylogenetic tree over a large number of possible transformations, with standard deviations of less than 2.
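The idea of a neighbor-dependent mutation rule and of counting the time steps from an ancestor to a progeny can be sketched as a one-dimensional cellular automaton. This is only an illustration under stated assumptions: the rule encoding (a lookup on the left neighbor, the base itself and the right neighbor), the fixed end bases, and the toy rule and sequences are all hypothetical and are not the authors' rule format or data.

```python
# A rule maps (left neighbor, base, right neighbor) to the new base.
# Triplets that are absent from the rule leave the base unchanged (assumption).
Rule = dict[tuple[str, str, str], str]


def step(seq: str, rule: Rule) -> str:
    """Apply one synchronous update; the two end bases are kept fixed."""
    if len(seq) < 3:
        return seq
    middle = [
        rule.get((seq[i - 1], seq[i], seq[i + 1]), seq[i])
        for i in range(1, len(seq) - 1)
    ]
    return seq[0] + "".join(middle) + seq[-1]


def steps_to_reach(ancestor: str, progeny: str, rule: Rule, max_steps: int = 100):
    """Return the number of steps after which ancestor becomes progeny, or None."""
    current = ancestor
    for t in range(max_steps + 1):
        if current == progeny:
            return t
        current = step(current, rule)
    return None


# Toy rule: G flanked by C and A mutates to A; C flanked by A and A mutates to T.
toy_rule: Rule = {("C", "G", "A"): "A", ("A", "C", "A"): "T"}
print(step("ACGAT", toy_rule))
print(steps_to_reach("ACGAT", "ATAAT", toy_rule))
```

Searching over candidate rules, as the abstract describes, would amount to repeating `steps_to_reach` for many rules, which is the part that parallelizes naturally across machines.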
Abstract:
The prevalence of obesity is increasing at an alarming rate in all age groups worldwide. Obesity is a serious health problem due to an increased risk of morbidity and mortality. Although environmental factors play a major role in the development of obesity, the identification of rare monogenic defects in human genes has confirmed that obesity has a strong genetic component. Mutations have been identified in genes encoding proteins of the leptin-melanocortin signaling system, which has an important role in the regulation of appetite and energy balance. The present study aimed at identifying mutations and genetic variations in the melanocortin receptors 2-5 and other genes active on the same signaling pathway accounting for severe early-onset obesity in children and morbid obesity in adults. The main achievement of this thesis was the identification of melanocortin-4 receptor (MC4R) mutations in Finnish patients. Six pathogenic MC4R mutations (308delT, P299H, two S127L and two -439delGC mutations) were identified, corresponding to a prevalence of 3% in severe early-onset obesity. No obesity-causing MC4R mutations were found among patients with adult-onset morbid obesity. The MC4R 308delT deletion is predicted to result in a grossly truncated nonfunctional receptor of only 107 amino acids. The C-terminal residues, which are important in MC4R cell surface targeting, are totally absent from the mutant 308delT receptor. In vitro functional studies supported a pathogenic role for the S127L mutation since agonist-induced signaling of the receptor was impaired. Cell membrane localization of the S127L receptor did not differ from that of the wild-type receptor, confirming that impaired function of the S127L receptor was due to reduced signaling properties. The P299H mutation leads to intracellular retention of the receptor. The -439delGC deletion is situated at a potential nescient helix-loop-helix 2 (NHLH2) binding site in the MC4R promoter.
It was demonstrated that the transcription factor NHLH2 binds to the consensus sequence at the -439delGC site in vitro, possibly resulting in altered promoter activity. Several genetic variants were identified in the melanocortin-3 receptor (MC3R) and pro-opiomelanocortin (POMC) genes. These polymorphisms do not explain morbid obesity, but the results indicate that some of these genetic variations may be modifying factors in obesity, resulting in subtle changes in obesity-related traits. A risk haplotype for obesity was identified in the ectonucleotide pyrophosphatase phosphodiesterase 1 (ENPP1) gene through a candidate gene single nucleotide polymorphism (SNP) genotyping approach. An ENPP1 haplotype, composed of SNPs rs1800949 and rs943003, was shown to be significantly associated with morbid obesity in adults. Accordingly, the MC3R, POMC and ENPP1 genes represent examples of susceptibility genes in which genetic variants predispose to obesity. In conclusion, pathogenic mutations in the MC4R gene were shown to account for 3% of cases with severe early-onset obesity in Finland. This is in line with results from other populations demonstrating that mutations in the MC4R gene underlie 1-6% of morbid obesity worldwide. MC4R deficiency thus represents the most common monogenic defect causing human obesity reported so far. The severity of the MC4-receptor defect appears to be associated with time of onset and the degree of obesity. Classification of MC4R mutations may provide a useful tool when predicting the outcome of the disease. In addition, several other genetic variants conferring susceptibility to obesity were detected in the MC3R, MC4R, POMC and ENPP1 genes.
Abstract:
In the current era of high-throughput sequencing and structure determination, functional annotation has become a bottleneck in biomedical science. Here, we show that automated inference of molecular function using functional linkages among genes increases the accuracy of functional assignments by >= 8% and enriches functional descriptions in >= 34% of top assignments. Furthermore, biochemical literature supports >80% of automated inferences for previously unannotated proteins. These results emphasize the benefit of incorporating functional linkages in protein annotation.
Abstract:
We have recently implicated heat shock protein 90 from Plasmodium falciparum (PfHsp90) as a potential drug target against malaria. Using inhibitors specific to the nucleotide-binding domain of Hsp90, we have shown potent growth-inhibitory effects on the development of the malarial parasite in human erythrocytes. To gain a better understanding of the vital role played by PfHsp90 in parasite growth, we have modeled its three-dimensional structure using the recently described full-length structure of yeast Hsp90. The sequence similarity found between PfHsp90 and yeast Hsp90 allowed us to model the core structure with high confidence. Superimposition of the predicted structure on the template yeast Hsp90 structure reveals an RMSD of 3.31 Å. The N-terminal and middle domains showed the lowest RMSD (1.76 Å), while the more divergent C-terminus showed a greater RMSD (2.84 Å) with respect to the template. The structure shows overall conservation of the domains involved in nucleotide binding, ATPase activity, co-chaperone binding and inter-subunit interactions. Important co-chaperones known to modulate Hsp90 function in other eukaryotes are conserved in the malarial parasite as well. An acidic stretch of amino acids found in the linker region, which is uniquely extended in PfHsp90, could not be modeled in this structure, suggesting a flexible conformation. Our results provide a basis to compare the overall structure and functional pathways dependent on PfHsp90 in the malarial parasite. Further analysis of the differences found between human and parasite Hsp90 may make it possible to design inhibitors targeted specifically against malaria.
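The RMSD values quoted above measure the root-mean-square distance between equivalent atoms of two structures after superimposition. A minimal sketch of that final step follows. It assumes the two coordinate lists are already superimposed and paired atom by atom; it does not perform the optimal superposition itself, and the function name and toy coordinates are hypothetical.

```python
import math

Point = tuple[float, float, float]


def rmsd(coords_a: list[Point], coords_b: list[Point]) -> float:
    """Root-mean-square deviation between paired, already superimposed atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    # Sum of squared Euclidean distances between equivalent atoms.
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Toy coordinates in angstrom: every atom is displaced by exactly 1 along z.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
template = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(model, template))
```

Computing the value separately for the atoms of each domain gives per-domain figures of the kind reported for the N-terminal, middle and C-terminal regions.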
Abstract:
New stars form in dense interstellar clouds of gas and dust called molecular clouds. The actual sites where the process of star formation takes place are the dense clumps and cores deeply embedded in molecular clouds. The details of the star formation process are complex and not completely understood. Thus, determining the physical and chemical properties of molecular cloud cores is necessary for a better understanding of how stars are formed. Some of the main features of the origin of low-mass stars, like the Sun, are already relatively well-known, though many details of the process are still under debate. The mechanism through which high-mass stars form, on the other hand, is poorly understood. Although it is likely that the formation of high-mass stars shares many properties similar to those of low-mass stars, the very first steps of the evolutionary sequence are unclear. Observational studies of star formation are carried out particularly at infrared, submillimetre, millimetre, and radio wavelengths. Much of our knowledge about the early stages of star formation in our Milky Way galaxy is obtained through molecular spectral line and dust continuum observations. The continuum emission of cold dust is one of the best tracers of the column density of molecular hydrogen, the main constituent of molecular clouds. Consequently, dust continuum observations provide a powerful tool to map large portions across molecular clouds, and to identify the dense star-forming sites within them. Molecular line observations, on the other hand, provide information on the gas kinematics and temperature. Together, these two observational tools provide an efficient way to study the dense interstellar gas and the associated dust that form new stars. The properties of highly obscured young stars can be further examined through radio continuum observations at centimetre wavelengths. 
For example, radio continuum emission carries useful information on conditions in the protostar+disk interaction region where protostellar jets are launched. In this PhD thesis, we study the physical and chemical properties of dense clumps and cores in both low- and high-mass star-forming regions. The sources are mainly studied in a statistical sense, but also in more detail. In this way, we are able to examine the general characteristics of the early stages of star formation, cloud properties on large scales (such as fragmentation), and some of the initial conditions of the collapse process that leads to the formation of a star. The studies presented in this thesis are mainly based on molecular line and dust continuum observations. These are combined with archival observations at infrared wavelengths in order to study the protostellar content of the cloud cores. In addition, centimetre radio continuum emission from young stellar objects (YSOs; i.e., protostars and pre-main sequence stars) is studied in this thesis to determine their evolutionary stages. 
The main results of this thesis are as follows: i) filamentary and sheet-like molecular cloud structures, such as infrared dark clouds (IRDCs), are likely to be caused by supersonic turbulence but their fragmentation at the scale of cores could be due to gravo-thermal instability; ii) the core evolution in the Orion B9 star-forming region appears to be dynamic and the role played by slow ambipolar diffusion in the formation and collapse of the cores may not be significant; iii) the study of the R CrA star-forming region suggests that the centimetre radio emission properties of a YSO are likely to change with its evolutionary stage; iv) the IRDC G304.74+01.32 contains candidate high-mass starless cores which may represent the very first steps of high-mass star and star cluster formation; v) SiO outflow signatures are seen in several high-mass star-forming regions which suggest that high-mass stars form in a similar way as their low-mass counterparts, i.e., via disk accretion. The results presented in this thesis provide constraints on the initial conditions and early stages of both low- and high-mass star formation. In particular, this thesis presents several observational results on the early stages of clustered star formation, which is the dominant mode of star formation in our Galaxy.
Abstract:
The complete amino acid sequence of winged bean basic agglutinin (WBA I) was obtained by a combination of manual and gas-phase sequencing methods. Peptide fragments for sequence analyses were obtained by enzymatic cleavages using trypsin and Staphylococcus aureus V8 endoproteinase and by chemical cleavages using iodosobenzoic acid, hydroxylamine, and formic acid. COOH-terminal sequence analysis of WBA I and other peptides was performed using carboxypeptidase Y. The primary structure of WBA I was homologous to those of other legume lectins, most closely to that of the Erythrina corallodendron lectin. Interestingly, the sequence shows remarkable identities in the regions involved in the association of the two monomers of E. corallodendron lectin. Other conserved regions are the double metal-binding site and the residues contributing to the formation of the hydrophobic cavity and the carbohydrate-binding site. Chemical modification studies, both in the presence and in the absence of N-acetylgalactosamine, together with sequence analyses of tryptophan-containing tryptic peptides, demonstrate that tryptophan 133 is involved in the binding of carbohydrate ligands by the lectin. The location of tryptophan 133 at the active center of WBA I explains, for the first time, a role for one of the most conserved residues in legume lectins.
Abstract:
The discovery of GH (Glycoside Hydrolase) 19 chitinases in Streptomyces sp. raises the possibility that these proteins are present in other bacterial species, since they were initially thought to be confined to higher plants. The present study concentrates mainly on the phylogenetic distribution and homology conservation of GH19 family chitinases. Extensive database searches were performed to identify GH19 family chitinases in the three major superkingdoms of life. Multiple sequence alignment of all the identified GH19 chitinase family members resulted in the identification of globally conserved residues. We further identified conserved sequence motifs across the major subgroups within the family. The evolutionary distance between the various bacterial and plant chitinases was estimated to better understand the pattern of evolution. Our study also supports the horizontal gene transfer theory, which states that GH19 chitinase genes were transferred from higher plants to bacteria. Further, the present study sheds light on the phylogenetic distribution and identifies unique sequence signatures that define the GH19 chitinase family of proteins. The identified motifs could be used as markers to delineate uncharacterized GH19 family chitinases. The estimated evolutionary distances show that the chitinases of flowering plants are more closely related to those of actinobacteria than to those identified in purple bacteria. We propose a model to elucidate the natural history of GH19 family chitinases.
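Identifying globally conserved residues from a multiple sequence alignment comes down to finding the columns in which every sequence carries the same residue. The sketch below shows that step only, assuming an alignment is already available as equal-length strings with '-' for gaps; the function name and the toy alignment are hypothetical and are not GH19 data.

```python
def conserved_columns(alignment: list[str]) -> list[tuple[int, str]]:
    """Return (1-based column, residue) for columns identical in all sequences.

    Columns containing a gap ('-') in any sequence are not reported.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    result = []
    for i in range(length):
        column = {seq[i] for seq in alignment}
        if len(column) == 1 and "-" not in column:
            result.append((i + 1, alignment[0][i]))
    return result


# Toy alignment of three made-up peptide fragments.
toy_alignment = ["MKTAW", "MRTAW", "MKT-W"]
print(conserved_columns(toy_alignment))
```

Running the same function on the sequences of one subgroup at a time would give subgroup-specific conserved positions, in the spirit of the motifs described in the abstract.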
Abstract:
The role of spermine in inducing the A-DNA conformation in deoxyoligonucleotides has been studied using CCGG and GGCC as model sequences. It has been found that while CCGG adopts an alternating B-DNA conformation in low-salt solution at low temperature, addition of spermine to this medium induces a B → A transition. In contrast, the A-DNA-like structure of GGCC in low-salt solution at low temperature does not change under the influence of spermine. This suggests a sequence-dependent behaviour of spermine. Further, these results suggest that the A-DNA conformation observed in the crystals of d(iCCGG) and d(GGCC)2 might have been due to the presence of spermine in the crystallization cocktail.
Abstract:
The crystal and molecular structure of the ammonium salt of deoxycytidylyl-(3'-5')-deoxyguanosine has been determined from 0.85 Å resolution single-crystal X-ray diffraction data. The crystals, obtained by the acetone diffusion technique at -20 degrees C, are orthorhombic, space group P212121, with a = 12.880(2), b = 17.444(2) and c = 27.642(2) Å. The structure was solved by high-resolution Patterson and Fourier methods and refined to R = 0.136. There are two d(CpG) molecules in the asymmetric unit, forming a mini left-handed Z-DNA helix. This is in contrast to the earlier reported forms of d(CpG), in which the molecules form self-base-paired duplexes. There are two ammonium ions in the asymmetric unit. The major-groove NH4+ ion interacts with N7 of the guanines through water bridges, besides making H-bonded interactions directly with the phosphate oxygen atoms. A second NH4+ ion is found in the minor groove, interacting directly with the phosphate oxygen atoms. Symmetry-related molecules pack in such a way that the cytosine base stacks on cytosine and the guanine base on guanine. Our structure demonstrates that alternating d(CpG) sequences have the ability to adopt the left-handed Z-DNA structure even at the dimer level, i.e., in a sequence that is only two base pairs long.
Abstract:
The dynamics of low-density flows is governed by the Boltzmann equation of the kinetic theory of gases. This is a nonlinear integro-differential equation and, in general, numerical methods must be used to obtain its solution. The present paper, after a brief review of the Direct Simulation Monte Carlo (DSMC) methods due to Bird, and to Belotserkovskii and Yanitskii, studies the details of the DSMC method of Deshpande for mono- as well as multicomponent gases. The present method is a statistical particle-in-cell method and is based upon the Kac-Prigogine master equation, which reduces to the Boltzmann equation under the hypothesis of molecular chaos. The proposed Markoff model simulating the collisions uses a Poisson distribution for the number of collisions allowed in the cells into which the physical space is divided. The model is then extended to a binary mixture of gases, and it is shown that it is necessary to perform the collisions in a certain sequence to obtain an unbiased simulation.
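Drawing a Poisson-distributed number of collisions for each cell can be sketched as follows. This shows only the sampling step, not the DSMC method of the paper: the expected number of collisions per cell is taken as a given input (an assumption), the sampler is Knuth's multiplication method, and all names are hypothetical.

```python
import math
import random


def sample_poisson(mean: float, rng: random.Random) -> int:
    """Draw one Poisson variate with the given mean (Knuth's method).

    Suitable for small means; the running product underflows for large ones.
    """
    if mean < 0:
        raise ValueError("mean must be non-negative")
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def collisions_per_cell(expected: list[float], seed: int = 0) -> list[int]:
    """Sample the number of collisions to perform in each cell for one time step."""
    rng = random.Random(seed)
    return [sample_poisson(mean, rng) for mean in expected]


# Hypothetical expected collision counts for four cells.
counts = collisions_per_cell([0.0, 0.5, 2.5, 4.0], seed=42)
print(counts)
```

In a full simulation each sampled count would be followed by selecting that many particle pairs in the cell and updating their velocities, which is where the ordering issue for binary mixtures noted in the abstract would arise.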
Abstract:
The sequence-specific requirement for the B→Z transition in solution was examined in d(CGTGCGCACG), d(CGTACGTACG) and d(ACGTACGT) in the presence of various Z-inducing factors. Conformational studies show that, in spite of the alternating nature of purines and pyrimidines, the aforementioned sequences do not undergo the B→Z transition under the influence of NaCl, hexamine cobalt chloride and ethanol. A comparison with the crystal structures of an assorted array of purine and pyrimidine sequences shows that the sequence requirement for the B→Z transition is much more stringent in solution than in the solid state. The disruptive influence of AT base pairs on the B to Z transition is discussed.
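The two sequence properties discussed above, strict purine-pyrimidine alternation and the presence of AT base pairs, can be checked with a short sketch. The function names are hypothetical; the three sequences are the ones listed in the abstract.

```python
PURINES = set("AG")
PYRIMIDINES = set("CT")


def is_alternating_pur_pyr(seq: str) -> bool:
    """True if every pair of neighboring bases is one purine and one pyrimidine."""
    seq = seq.upper()
    if any(base not in PURINES | PYRIMIDINES for base in seq):
        raise ValueError("sequence must contain only A, C, G and T")
    return all((a in PURINES) != (b in PURINES) for a, b in zip(seq, seq[1:]))


def at_fraction(seq: str) -> float:
    """Fraction of positions occupied by A or T (AT base pairs in the duplex)."""
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq)


for s in ("CGTGCGCACG", "CGTACGTACG", "ACGTACGT"):
    print(s, is_alternating_pur_pyr(s), at_fraction(s))
```

All three sequences pass the alternation check while differing in AT content, which is consistent with the abstract's point that alternation alone does not suffice for the transition in solution.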