903 results for SEQUENCE DATA


Relevance: 100.00%

Abstract:

Molecular and fragment ion data of intact 8- to 43-kDa proteins from electrospray Fourier-transform tandem mass spectrometry are matched against the corresponding data in sequence data bases. Extending the sequence tag concept of Mann and Wilm for matching peptides, a partial amino acid sequence in the unknown is first identified from the mass differences of a series of fragment ions, and the mass position of this sequence is defined from molecular weight and the fragment ion masses. For three studied proteins, a single sequence tag retrieved only the correct protein from the data base; a fourth protein required the input of two sequence tags. However, three of the data base proteins differed by having an extra methionine or by missing an acetyl or heme substitution. The positions of these modifications in the protein examined were greatly restricted by the mass differences of its molecular and fragment ions versus those of the data base. To characterize the primary structure of an unknown represented in the data base, this method is fast and specific and does not require prior enzymatic or chemical degradation.
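
The sequence-tag step described above can be illustrated with a minimal sketch. It assumes an idealized, noise-free ladder of fragment ion masses; the function names, the tolerance, and the toy database are hypothetical and are not taken from the paper. The residue masses are standard monoisotopic values (a subset only, with "L" standing for the isobaric pair leucine/isoleucine). The sketch ignores the mass position of the tag, which the abstract says is also used to constrain the match.

```python
# Monoisotopic residue masses in Da (standard values; subset for illustration).
# "L" stands for both leucine and isoleucine, which have identical mass.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "N": 114.04293,
    "D": 115.02694, "E": 129.04259, "F": 147.06841,
}


def sequence_tag(fragment_masses, tol=0.01):
    """Read a partial sequence from mass differences of consecutive fragment ions."""
    ladder = sorted(fragment_masses)
    tag = []
    for lighter, heavier in zip(ladder, ladder[1:]):
        diff = heavier - lighter
        matches = [aa for aa, m in RESIDUE_MASS.items() if abs(diff - m) <= tol]
        # "?" marks a mass difference that matches no residue in the table.
        tag.append(matches[0] if matches else "?")
    return "".join(tag)


def proteins_with_tag(tag, database):
    """Return names of database entries containing the tag in either direction."""
    return [name for name, seq in database.items()
            if tag in seq or tag[::-1] in seq]


# Hypothetical ladder built by adding G, A, V to an arbitrary starting mass.
ladder = [1000.0, 1057.02146, 1128.05857, 1227.12698]
toy_db = {"protein_1": "MKGAVLT", "protein_2": "MKTTTSE"}
tag = sequence_tag(ladder)
hits = proteins_with_tag(tag, toy_db)
```

The tag is searched in both directions because a fragment ladder read from the opposite terminus yields the reversed sequence.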

Relevance: 100.00%

Abstract:

Of the approximately 380 families of angiosperms, representatives of only 10 are known to form symbiotic associations with nitrogen-fixing bacteria in root nodules. The morphologically based classification schemes proposed by taxonomists suggest that many of these 10 families of plants are only distantly related, engendering the hypothesis that the capacity to fix nitrogen evolved independently several, if not many, times. This has in turn influenced attitudes toward the likelihood of transferring genes responsible for symbiotic nitrogen fixation to crop species lacking this ability. Phylogenetic analysis of DNA sequences for the chloroplast gene rbcL indicates, however, that representatives of all 10 families with nitrogen-fixing symbioses occur together, with several families lacking this association, in a single clade. This study therefore indicates that only one lineage of closely related taxa achieved the underlying genetic architecture necessary for symbiotic nitrogen fixation in root nodules.

Relevance: 100.00%

Abstract:

The under-reporting of cases of infectious diseases is a substantial impediment to the control and management of infectious diseases in both epidemic and endemic contexts. Information about infectious disease dynamics can be recovered from sequence data using time-varying coalescent approaches, and phylodynamic models have been developed in order to reconstruct demographic changes of the numbers of infected hosts through time. In this study I have demonstrated the general concordance between empirically observed epidemiological incidence data and viral demography inferred through analysis of foot-and-mouth disease virus VP1 coding sequences belonging to the CATHAY topotype over large temporal and spatial scales. However, a more precise and robust relationship between the effective population size (

Relevance: 80.00%

Abstract:

Mark Pagel, Andrew Meade (2004). A phylogenetic mixture model for detecting pattern-heterogeneity in gene sequence or character-state data. Systematic Biology, 53(4), 571-581.

Relevance: 80.00%

Abstract:

We describe a general likelihood-based 'mixture model' for inferring phylogenetic trees from gene-sequence or other character-state data. The model accommodates cases in which different sites in the alignment evolve in qualitatively distinct ways, but does not require prior knowledge of these patterns or partitioning of the data. We call this qualitative variability in the pattern of evolution across sites "pattern-heterogeneity" to distinguish it from both a homogeneous process of evolution and from one characterized principally by differences in rates of evolution. We present studies to show that the model correctly retrieves the signals of pattern-heterogeneity from simulated gene-sequence data, and we apply the method to protein-coding genes and to a ribosomal 12S data set. The mixture model outperforms conventional partitioning in both these data sets. We implement the mixture model such that it can simultaneously detect rate- and pattern-heterogeneity. The model simplifies to a homogeneous model or a rate-variability model as special cases, and therefore always performs at least as well as these two approaches, and often considerably improves upon them. We make the model available within a Bayesian Markov-chain Monte Carlo framework for phylogenetic inference, as an easy-to-use computer program.
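
As a rough illustration of how a site-wise mixture likelihood is usually assembled, the sketch below assumes that the likelihood of each alignment site under each component model has already been computed on a fixed tree (the hard part, not shown here). The function name and the numbers are hypothetical; this is the generic weighted-sum form of a mixture, not the authors' program.

```python
import math


def mixture_log_likelihood(site_likelihoods, weights):
    """Log-likelihood of an alignment under a mixture of evolutionary models.

    site_likelihoods[i][j] is the (precomputed) likelihood of site i under
    component model j; weights[j] is the mixture weight of component j.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    total = 0.0
    for per_component in site_likelihoods:
        # Each site's likelihood is the weighted sum over components, so no
        # site has to be assigned to a partition in advance.
        site_lik = sum(w * lik for w, lik in zip(weights, per_component))
        total += math.log(site_lik)
    return total


# Hypothetical per-site likelihoods for two sites under two components.
site_liks = [[0.10, 0.20], [0.30, 0.05]]
mixed = mixture_log_likelihood(site_liks, [0.5, 0.5])
# Putting all weight on one component recovers the homogeneous model.
homogeneous = mixture_log_likelihood(site_liks, [1.0, 0.0])
```

Because the homogeneous model is the special case with a single nonzero weight, the maximized mixture likelihood cannot be lower than the maximized homogeneous likelihood, which matches the abstract's claim about special cases.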

Relevance: 80.00%

Abstract:

Population subdivision complicates analysis of molecular variation. Even if neutrality is assumed, three evolutionary forces need to be considered: migration, mutation, and drift. Simplification can be achieved by assuming that the process of migration among and drift within subpopulations is occurring fast compared to mutation and drift in the entire population. This allows a two-step approach in the analysis: (i) analysis of population subdivision and (ii) analysis of molecular variation in the migrant pool. We model population subdivision using an infinite island model, where we allow the migration/drift parameter Theta to vary among populations. Thus, central and peripheral populations can be differentiated. For inference of Theta, we use a coalescence approach, implemented via a Markov chain Monte Carlo (MCMC) integration method that allows estimation of allele frequencies in the migrant pool. The second step of this approach (analysis of molecular variation in the migrant pool) uses the estimated allele frequencies in the migrant pool for the study of molecular variation. We apply this method to a Drosophila ananassae sequence data set. We find little indication of isolation by distance, but large differences in the migration parameter among populations. The population as a whole seems to be expanding. A population from Bogor (Java, Indonesia) shows the highest variation and seems closest to the species center.

Relevance: 70.00%

Abstract:

Background: Bactrocera dorsalis s.s. is a pestiferous tephritid fruit fly distributed from Pakistan to the Pacific, with the Thai/Malay peninsula its southern limit. Sister pest taxa, B. papayae and B. philippinensis, occur in the southeast Asian archipelago and the Philippines, respectively. The relationship among these species is unclear due to their high molecular and morphological similarity. This study analysed population structure of these three species within a southeast Asian biogeographical context to assess potential dispersal patterns and the validity of their current taxonomic status. Results: Geometric morphometric results generated from 15 landmarks for wings of 169 flies revealed significant differences in wing shape between almost all sites following canonical variate analysis. For the combined data set there was a greater isolation-by-distance (IBD) effect under a ‘non-Euclidean’ scenario which used geographical distances within a biogeographical ‘Sundaland context’ (r² = 0.772, P < 0.0001) as compared to a ‘Euclidean’ scenario for which direct geographic distances between sample sites were used (r² = 0.217, P < 0.01). COI sequence data were obtained for 156 individuals and yielded 83 unique haplotypes with no correlation to current taxonomic designations via a minimum spanning network. BEAST analysis provided a root age and location of 540 kya in northern Thailand, with migration of B. dorsalis s.l. into Malaysia 470 kya and Sumatra 270 kya. Two migration events into the Philippines are inferred. Sequence data revealed a weak but significant IBD effect under the ‘non-Euclidean’ scenario (r² = 0.110, P < 0.05), with no historical migration evident between Taiwan and the Philippines. Results are consistent with those expected at the intra-specific level. Conclusions: Bactrocera dorsalis s.s., B. papayae and B. philippinensis likely represent one species structured around the South China Sea, having migrated from northern Thailand into the southeast Asian archipelago and across into the Philippines. No migration is apparent between the Philippines and Taiwan. This information has implications for quarantine, trade and pest management.
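
The isolation-by-distance comparison above rests on how strongly pairwise genetic (or morphometric) distance tracks pairwise geographic distance. A minimal sketch of that statistic follows, assuming the two sets of pairwise distances are already available as equal-length lists. It computes a plain squared Pearson correlation; the significance testing reported in the abstract (typically done by permutation) is not shown, and the function name and data are hypothetical rather than the authors' procedure.

```python
def r_squared(geo_dist, gen_dist):
    """Squared Pearson correlation between two lists of pairwise distances."""
    n = len(geo_dist)
    mean_x = sum(geo_dist) / n
    mean_y = sum(gen_dist) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(geo_dist, gen_dist))
    sxx = sum((x - mean_x) ** 2 for x in geo_dist)
    syy = sum((y - mean_y) ** 2 for y in gen_dist)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical pairwise distances for the same site pairs under two scenarios:
# straight-line ("Euclidean") versus a path-constrained ("non-Euclidean") route.
genetic = [0.010, 0.020, 0.030, 0.040]
euclidean_km = [100.0, 400.0, 200.0, 300.0]
constrained_km = [150.0, 300.0, 450.0, 600.0]

r2_euclidean = r_squared(euclidean_km, genetic)
r2_constrained = r_squared(constrained_km, genetic)
```

Comparing the two values for the same genetic distances mirrors the abstract's comparison of scenarios: the distance definition that better reflects actual dispersal routes should give the higher r².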

Relevance: 70.00%

Abstract:

Tobacco plants were transformed with a chimeric transgene comprising sequences encoding β-glucuronidase (GUS) and the satellite RNA (satRNA) of cereal yellow dwarf luteovirus. When transgenic plants were infected with potato leafroll luteovirus (PLRV), which replicated the transgene-derived satRNA to a high level, the satellite sequence of the GUS:Sat transgene became densely methylated. Within the satellite region, all 86 cytosines in the upper strand and 73 of the 75 cytosines in the lower strand were either partially or fully methylated. In contrast, very low levels of DNA methylation were detected in the satellite sequence of the transgene in uninfected plants and in the flanking nonsatellite sequences in both infected and uninfected plants. Substantial amounts of truncated GUS:Sat RNA accumulated in the satRNA-replicating plants, and most of the molecules terminated at nucleotides within the first 60 bp of the satellite sequence. Whereas this RNA truncation was associated with high levels of satRNA replication, it appeared to be independent of the levels of DNA methylation in the satellite sequence, suggesting that it is not caused by methylation. All the sequenced GUS:Sat DNA molecules were hypermethylated in plants with replicating satRNA despite the phloem restriction of the helper PLRV. Also, small, sense and antisense ∼22 nt RNAs, derived from the satRNA, were associated with the replicating satellite. These results suggest that the sequence-specific DNA methylation spread into cells in which no satRNA replication occurred and that this was mediated by the spread of unamplified satRNA and/or its associated 22 nt RNA molecules.

Relevance: 70.00%

Abstract:

The nucleotide sequence of the genomic RNA of barley yellow dwarf virus, PAV serotype, was determined except for the 5′-terminal base, and its genome organization was deduced. The 5,677 nucleotide genome contains five large open reading frames (ORFs). The genes for the coat protein (1) and the putative viral RNA-dependent RNA polymerase were identified. The latter shows a striking degree of similarity to that of carnation mottle virus (CarMV). By comparison with corona- and retrovirus RNAs, it is proposed that a translational frameshift is involved in expression of the polymerase. An ORF encoding an Mr 49,797 protein (50K ORF) may be translated by in-frame readthrough of the coat protein stop codon. The coat protein, an overlapping 17K ORF, and a 3′ 6.7K ORF are likely to be expressed via subgenomic mRNAs. © 1988 IRL Press Limited.
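
Deducing genome organization from a sequence starts with scanning each reading frame for open reading frames. The sketch below is a simplified illustration under stated assumptions: it scans only the three forward frames, treats ATG as the only start codon, and uses the standard stop codons. The function name, the `min_codons` parameter, and the toy sequence are hypothetical; readthrough and frameshift events like those proposed in the abstract are not modelled.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return (start, end, frame) for each ATG-to-stop ORF on the forward strand.

    start is the 0-based index of the A in ATG; end is one past the stop codon;
    min_codons is the minimum number of codons before the stop codon.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA input as well
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # resume searching for the next start codon
    return orfs


# Toy sequence with a single short ORF (ATG AAA TAG) in frame 2.
orfs = find_orfs("CCAUGAAAUAGGG")
```

Overlapping genes such as the 17K ORF inside the coat protein gene would show up here as ORFs in different frames whose coordinate ranges overlap.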

Relevance: 70.00%

Abstract:

The complete nucleotide sequence of Subterranean clover mottle virus (SCMoV) genomic RNA has been determined. The SCMoV genome is 4,258 nucleotides in length. It shares most nucleotide and amino acid sequence identity with the genome of Lucerne transient streak virus (LTSV). SCMoV RNA encodes four overlapping open reading frames and has a genome organisation similar to that of Cocksfoot mottle virus (CfMV). ORF1 and ORF4 are predicted to encode single proteins. ORF2 is predicted to encode two proteins that are derived from a -1 translational frameshift between two overlapping reading frames (ORF2a and ORF2b). A search of amino acid databases did not find a significant match for ORF1 and the function of this protein remains unclear. ORF2a contains a motif typical of chymotrypsin-like serine proteases and ORF2b has motifs characteristically present in positive-stranded RNA-dependent RNA polymerases. ORF4 is likely to be expressed from a subgenomic RNA and encodes the viral coat protein. The ORF2a/ORF2b overlapping gene expression strategy used by SCMoV and CfMV is similar to that of the poleroviruses and differs from that of other published sobemoviruses. These results suggest that the sobemoviruses could now be divided into two distinct subgroups based on those that express the RNA-dependent RNA polymerase from a single, in-frame polyprotein, and those that express it via a -1 translational frameshifting mechanism.

Relevance: 70.00%

Abstract:

The nucleotide sequence of the coat protein gene of barley yellow dwarf virus (BYDV, PAV serotype) was determined, and the amino acid sequence was deduced. The open reading frame, encoding a protein of relative molecular mass (Mr) 22,047, was confirmed as the coat protein gene by comparison with amino acid sequences of tryptic peptides derived from dissociated virions. In addition, a fragment of this gene expressed in Escherichia coli produced a product which was recognized by antibodies prepared against purified BYDV virions. An overlapping reading frame encoding an Mr 17,147 protein is contained completely within the coat protein gene. © 1988.

Relevance: 70.00%

Abstract:

The number of genetic factors associated with common human traits and disease is increasing rapidly, and the general public is utilizing affordable, direct-to-consumer genetic tests. The results of these tests are often in the public domain. A combination of factors has increased the potential for the indirect estimation of an individual's risk for a particular trait. Here we explain the basic principles underlying risk estimation which allowed us to test the ability to make an indirect risk estimation from genetic data by imputing Dr. James Watson's redacted apolipoprotein E gene (APOE) information. The principles underlying risk prediction from genetic data have been well known and applied for many decades; however, the recent increase in genomic knowledge, and advances in mathematical and statistical techniques and computational power, make it relatively easy to make an accurate but indirect estimation of risk. There is a current hazard for indirect risk estimation that is relevant not only to the subject but also to individuals related to the subject; this risk will likely increase as more detailed genomic data and better computational tools become available.