887 results for SEQUENCE ALIGNMENT
Abstract:
Intergenic spacers of chloroplast DNA (cpDNA) are very useful in phylogenetic and population genetic studies of plant species, which motivates studying their potential integration in phylogenetic analysis. The non-coding trnE-trnT intergenic spacer of cpDNA was analyzed to assess nucleotide sequence polymorphism in 16 Solanaceae species and to estimate its ability to contribute to the resolution of phylogenetic studies of this group. Multiple alignments of trnE-trnT intergenic spacer DNA sequences made it possible to identify nucleotide variability in this region, and the phylogeny was estimated by maximum parsimony and rooted with Ipomoea batatas (Convolvulaceae, the most closely related family). In addition, this intergenic spacer was tested for its ability to differentiate taxonomic levels. For this purpose, species from four other families were analyzed and compared with the Solanaceae species. Results confirmed polymorphism in the trnE-trnT region at different taxonomic levels.
Abstract:
In this study, 222 genome survey sequences were generated for Trypanosoma rangeli strain P07 isolated from an opossum (Didelphis albiventris) in Minas Gerais State, Brazil. T. rangeli sequences were compared by BLASTX (Basic Local Alignment Search Tool X) analysis with the assembled contigs of Leishmania braziliensis, Leishmania infantum, Leishmania major, Trypanosoma brucei, and Trypanosoma cruzi. Results revealed that 82% (182/222) of the sequences were associated with predicted proteins described, whereas 18% (40/222) of the sequences did not show significant identity with sequences deposited in databases, suggesting that they may represent T. rangeli-specific sequences. Among the 182 predicted sequences, 179 (80.6%) had the highest similarity with T. cruzi, 2 (0.9%) with T. brucei, and 1 (0.5%) with L. braziliensis. Computer analysis permitted the identification of members of various gene families described for trypanosomatids in the genome of T. rangeli, such as trans-sialidases, mucin-associated surface proteins, and major surface proteases (MSP or gp63). This is the first report identifying sequences of the MSP family in T. rangeli. Multiple sequence alignments showed that the predicted MSP of T. rangeli presented the typical characteristics of metalloproteases, such as the presence of the HEXXH motif, which corresponds to a region previously associated with the catalytic site of the enzyme, and various cysteine and proline residues, which are conserved among MSPs of different trypanosomatid species. Reverse transcriptase-polymerase chain reaction analysis revealed the presence of MSP transcripts in epimastigote forms of T. rangeli.
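The HEXXH motif mentioned above (histidine, glutamate, any two residues, histidine) can be located in a protein sequence with a simple pattern search. The following is a minimal sketch, not the authors' pipeline: the function name `find_hexxh` and the toy sequence are hypothetical, and "X" is assumed to mean any single uppercase residue letter.

```python
import re

# HEXXH: His, Glu, any two residues, His. A lookahead is used so that
# overlapping occurrences are also reported.
_HEXXH = re.compile(r"(?=(HE[A-Z]{2}H))")

def find_hexxh(protein):
    """Return (1-based position, matched motif) for every HEXXH occurrence."""
    return [(m.start() + 1, m.group(1)) for m in _HEXXH.finditer(protein.upper())]

if __name__ == "__main__":
    toy = "MKVHEAMHALGFSC"  # hypothetical sequence, for illustration only
    print(find_hexxh(toy))
```

A hit only shows that the pattern is present. Whether it corresponds to a functional catalytic site still depends on the alignment context described in the abstract.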
Abstract:
Liver samples from rabbits killed by RHDV, collected from five States in Australia in 1996 and 1997, were analysed by RT-PCR. A 398 bp fragment of the capsid protein (VP60) gene was amplified by PCR and directly sequenced. Alignment of the nucleotide and amino acid sequences and their comparison with the original strain of the virus released in Australia indicated that genetic changes after two years have been small, with 98.2% to 100% identity. The constructed phylogenetic tree suggests slight differences in nucleotide substitutions among States, but there is no clear evidence of clustering of sequences according to their geographic origin. In practical terms, sequencing of viral RNA provides a means of testing the efficacy of further releases and subsequent spread of the virus, if such a strategy is employed to enhance RHD as a biological control of the wild rabbit in Australia.
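Percent identity figures such as those above are computed from aligned sequences. The sketch below shows one common convention, assumed here and not stated in the abstract: columns containing a gap in either sequence are skipped, and identity is the share of remaining columns that match. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the gap-free columns of two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence has a gap (assumed convention).
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != gap and y != gap]
    if not columns:
        raise ValueError("no gap-free columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)

if __name__ == "__main__":
    # Illustration only: 7 mismatches over 398 aligned positions.
    ref = "A" * 398
    var = "A" * 391 + "C" * 7
    print(round(percent_identity(ref, var), 1))
```

Under this convention, about seven substitutions in a 398 bp fragment would correspond to the lower end of the reported range. The actual number of differences per sample is not given in the abstract.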
Abstract:
Given the dynamic nature of cardiac function, correct temporal alignment of pre-operative models and intra-operative images is crucial for augmented reality in cardiac image-guided interventions. As such, the current study focuses on the development of an image-based strategy for temporal alignment of multimodal cardiac imaging sequences, such as cine Magnetic Resonance Imaging (MRI) or 3D Ultrasound (US). First, we derive a robust, modality-independent signal from the image sequences, estimated by computing the normalized cross-correlation between each frame in the temporal sequence and the end-diastolic frame. This signal is a surrogate for the left-ventricle (LV) volume curve over time, whose variation indicates different temporal landmarks of the cardiac cycle. We then perform the temporal alignment of these surrogate signals derived from MRI and US sequences of the same patient through Dynamic Time Warping (DTW), which allows both sequences to be synchronized. The proposed framework was evaluated in 98 patients who had undergone both 3D+t MRI and US scans. The end-systolic frame could be accurately estimated as the minimum of the image-derived surrogate signal, presenting a relative error of 1.6 ± 1.9% and 4.0 ± 4.2% for the MRI and US sequences, respectively, thus supporting its association with key temporal instants of the cardiac cycle. The use of DTW reduces the desynchronization of the cardiac events in MRI and US sequences, allowing multimodal cardiac imaging sequences to be temporally aligned. Overall, a generic, fast and accurate method for temporal synchronization of MRI and US sequences of the same patient was introduced. This approach could be straightforwardly used for the correct temporal alignment of pre-operative MRI information and intra-operative US images.
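The two steps described above (a per-frame normalized cross-correlation signal, then Dynamic Time Warping between two such signals) can be sketched in plain Python. This is an illustrative sketch under simplifying assumptions, not the authors' implementation: frames are flattened lists of intensities, the reference frame is assumed to be at index 0, the local DTW cost is the absolute difference, and all function names are hypothetical.

```python
import math

def ncc(a, b):
    """Normalized cross-correlation of two equal-length intensity lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0

def surrogate_signal(frames, ref_index=0):
    """NCC of every frame against the reference (assumed end-diastolic) frame."""
    ref = frames[ref_index]
    return [ncc(frame, ref) for frame in frames]

def end_systole_index(signal):
    """Index of the signal minimum, used as the end-systolic estimate."""
    return min(range(len(signal)), key=signal.__getitem__)

def dtw(s, t):
    """Classic DTW: returns (total cost, warping path as index pairs)."""
    n, m = len(s), len(t)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(s[i - 1] - t[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1],
                                 cost[i - 1][j - 1])
    # Backtrack from the end to recover which frames are matched.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path

if __name__ == "__main__":
    mri = [1.0, 0.5, 0.2, 0.6, 1.0]        # toy surrogate signals
    us = [1.0, 1.0, 0.5, 0.2, 0.6, 1.0]
    print(dtw(mri, us))
```

Each pair in the returned path matches a frame of the first sequence to a frame of the second, which is what lets the two acquisitions be resampled onto a common time axis.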
Abstract:
The attached document is the post-print version (the version corrected by the publisher).
Abstract:
BACKGROUND: A growing number of patients with chronic hepatitis B is being treated for extended periods with nucleoside and/or nucleotide analogs. In this context, antiviral resistance represents an increasingly common and complex issue. METHODS: Mutations in the hepatitis B virus (HBV) reverse transcriptase (rt) gene and viral genotypes were determined by direct sequencing of PCR products and alignment with reference sequences deposited in GenBank. RESULTS: Plasma samples from 60 patients with chronic hepatitis B were analyzed since March 2009. The predominant mutation pattern identified in patients with virological breakthrough was rtM204V/I ± different compensatory mutations, conferring resistance to L-nucleosides (lamivudine, telbivudine, emtricitabine) and predisposing to entecavir resistance (n = 18). Complex mutation patterns with a potential for multidrug resistance were identified in 2 patients. Selection of a fully entecavir resistant strain was observed in a patient exposed to lamivudine alone. Novel mutations were identified in 1 patient. Wild-type HBV was identified in 9 patients with suspected virological breakthrough, raising concerns about treatment adherence. No preexisting resistance mutations were identified in treatment-naïve patients (n = 13). Viral genome amplification and sequencing failed in 16 patients, of which only 2 had a documented HBV DNA > 1000 IU/ml. HBV genotypes were D in 28, A in 6, B in 4, C in 3 and E in 3 patients. Results will be updated in August 2010 and therapeutic implications discussed. CONCLUSIONS: With expanding treatment options and a growing number of patients exposed to nucleoside and/or nucleotide analogs, sequence-based HBV antiviral resistance testing is expected to become a cornerstone in the management of chronic hepatitis B.
Abstract:
The number of sequences generated by genome projects has increased exponentially, but gene characterization has not followed at the same rate. Sequencing and analysis of full-length cDNAs is an important step in gene characterization that has been used nowadays by several research groups. In this work, we have selected Schistosoma mansoni clones for full-length sequencing, using an algorithm that investigates the presence of the initial methionine in the parasite sequence based on the positions of alignment start between two sequences. BLAST searches to produce such alignments have been performed using parasite expressed sequence tags produced by Minas Gerais Genome Network against sequences from the database Eukaryotic Cluster of Orthologous Groups (KOG). This procedure has allowed the selection of clones representing 398 proteins which have not been deposited as S. mansoni complete CDS in any public database. Dedicated sequencing of 96 of such clones with reads from both 5' and 3' ends has been performed. These reads have been assembled using PHRAP, resulting in the production of 33 full-length sequences that represent novel S. mansoni proteins. These results shall contribute to construct a more complete view of the biology of this important parasite.
Abstract:
We address the problem of comparing and characterizing the promoter regions of genes with similar expression patterns. This remains a challenging problem in sequence analysis, because often the promoter regions of co-expressed genes do not show discernible sequence conservation. In our approach, thus, we have not directly compared the nucleotide sequence of promoters. Instead, we have obtained predictions of transcription factor binding sites, annotated the predicted sites with the labels of the corresponding binding factors, and aligned the resulting sequences of labels—to which we refer here as transcription factor maps (TF-maps). To obtain the global pairwise alignment of two TF-maps, we have adapted an algorithm initially developed to align restriction enzyme maps. We have optimized the parameters of the algorithm in a small, but well-curated, collection of human–mouse orthologous gene pairs. Results in this dataset, as well as in an independent much larger dataset from the CISRED database, indicate that TF-map alignments are able to uncover conserved regulatory elements, which cannot be detected by the typical sequence alignments.
Abstract:
One major methodological problem in analysis of sequence data is the determination of costs from which distances between sequences are derived. Although this problem is currently not optimally dealt with in the social sciences, it has some similarity with problems that have been solved in bioinformatics for three decades. In this article, the authors propose an optimization of substitution and deletion/insertion costs based on computational methods. The authors provide an empirical way of determining costs for cases, frequent in the social sciences, in which theory does not clearly promote one cost scheme over another. Using three distinct data sets, the authors tested the distances and cluster solutions produced by the new cost scheme in comparison with solutions based on cost schemes associated with other research strategies. The proposed method performs well compared with other cost-setting strategies, while it alleviates the justification problem of cost schemes.
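To make the role of substitution and insertion/deletion costs concrete, the sketch below computes an edit-style distance between two state sequences under a given cost scheme. It does not implement the cost optimization proposed in the article. It only shows how the distance depends on the costs. The function name `om_distance`, the state codes and the cost values are hypothetical.

```python
def om_distance(a, b, sub_cost, indel=1.0):
    """Minimum total cost of turning sequence a into b.

    sub_cost(x, y) gives the cost of substituting state x by state y;
    indel is the cost of one insertion or deletion.
    """
    n, m = len(a), len(b)
    prev = [j * indel for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [i * indel] + [0.0] * m
        for j in range(1, m + 1):
            s = 0.0 if a[i - 1] == b[j - 1] else sub_cost(a[i - 1], b[j - 1])
            cur[j] = min(prev[j] + indel,      # delete a[i-1]
                         cur[j - 1] + indel,   # insert b[j-1]
                         prev[j - 1] + s)      # match or substitute
        prev = cur
    return prev[m]

if __name__ == "__main__":
    # Hypothetical states: S = school, E = employed.
    x, y = "SSEE", "SEEE"
    print(om_distance(x, y, lambda p, q: 2.0))   # expensive substitutions
    print(om_distance(x, y, lambda p, q: 0.5))   # cheap substitutions
```

The same pair of sequences gets a different distance under each scheme, which is why the choice of costs needs justification before the distances are clustered.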
Abstract:
Our objective was to clone, express and characterize adult Dermatophagoides farinae group 1 (Der f 1) allergens, in order to produce recombinant allergens for future clinical applications and to eliminate side reactions caused by crude mite extracts. Based on GenBank data, we designed primers and amplified the cDNA fragment coding for Der f 1 by nested PCR. After purification and recovery, the cDNA fragment was cloned into the pMD19-T vector. The fragment was then sequenced, subcloned into the plasmid pET28a(+), expressed in Escherichia coli BL21 and identified by Western blotting. The cDNA coding for Der f 1 was cloned, sequenced and expressed successfully. Sequence analysis showed the presence of an open reading frame of 966 bp that encodes a protein of 321 amino acids. Interestingly, homology analysis showed that Der p 1 shared more than 87% amino acid sequence identity with Eur m 1 but only 80% with Der f 1. Furthermore, phylogenetic analyses suggested that D. pteronyssinus was evolutionarily closer to Euroglyphus maynei than to D. farinae, even though D. pteronyssinus and D. farinae belong to the same genus, Dermatophagoides. A total of three cysteine peptidase active sites were found in the predicted amino acid sequence, at positions 127-138 (QGGCGSCWAFSG), 267-277 (NYHAVNIVGYG) and 284-303 (YWIVRNSWDTTWGDSGYGYF). Moreover, secondary structure analysis revealed that Der f 1 contained α-helix (33.96%), extended strand (17.13%), β-turn (5.61%) and random coil (43.30%). A simple three-dimensional model of this protein was constructed using the SWISS-MODEL server. The cDNA coding for Der f 1 was cloned, sequenced and expressed successfully. Alignment and phylogenetic analysis suggest that D. pteronyssinus is evolutionarily more similar to E. maynei than to D. farinae.
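The relation between the 966 bp open reading frame and the 321-residue protein can be checked with a short calculation. The sketch assumes that the reported ORF length includes the stop codon, which the abstract does not state explicitly. The function name `orf_protein_length` is hypothetical.

```python
def orf_protein_length(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of orf_bp nucleotides."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # The stop codon is read but does not add a residue.
    return codons - 1 if includes_stop else codons

if __name__ == "__main__":
    print(orf_protein_length(966))
```

Under that assumption, 966 bp corresponds to 322 codons, one of which is the stop codon, which is consistent with the 321 amino acids reported.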
Abstract:
A new information-theoretic approach is presented for finding the pose of an object in an image. The technique does not require information about the surface properties of the object, besides its shape, and is robust with respect to variations of illumination. In our derivation, few assumptions are made about the nature of the imaging process. As a result the algorithms are quite general and can foreseeably be used in a wide variety of imaging situations. Experiments are presented that demonstrate the approach registering magnetic resonance (MR) images with computed tomography (CT) images, aligning a complex 3D object model to real scenes including clutter and occlusion, tracking a human head in a video sequence and aligning a view-based 2D object model to real images. The method is based on a formulation of the mutual information between the model and the image called EMMA. As applied here the technique is intensity-based, rather than feature-based. It works well in domains where edge or gradient-magnitude based methods have difficulty, yet it is more robust than traditional correlation. Additionally, it has an efficient implementation that is based on stochastic approximation. Finally, we will describe a number of additional real-world applications that can be solved efficiently and reliably using EMMA. EMMA can be used in machine learning to find maximally informative projections of high-dimensional data. EMMA can also be used to detect and correct corruption in magnetic resonance images (MRI).
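To illustrate why mutual information is an attractive intensity-based alignment score, the sketch below estimates it from a joint histogram of two equally sized intensity lists. This is a deliberately simpler estimator than EMMA, whose details and stochastic approximation scheme are not given in the abstract. The function name `mutual_information` and the bin count are assumptions made for illustration.

```python
import math
from collections import Counter

def _quantize(values, bins):
    """Map intensities to integer bin indices in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant input
    return [min(int((v - lo) / width), bins - 1) for v in values]

def mutual_information(x, y, bins=8):
    """Histogram estimate of the mutual information (in nats) of x and y."""
    qx, qy = _quantize(x, bins), _quantize(y, bins)
    n = len(qx)
    joint = Counter(zip(qx, qy))
    px, py = Counter(qx), Counter(qy)
    return sum((c / n) * math.log(c * n / (px[i] * py[j]))
               for (i, j), c in joint.items())

if __name__ == "__main__":
    image = list(range(16))
    inverted = [15 - v for v in image]   # same structure, reversed contrast
    print(mutual_information(image, image, bins=4))
    print(mutual_information(image, inverted, bins=4))
```

The score is the same for the image and its contrast-inverted copy, whereas plain correlation would change sign. This is the kind of robustness to intensity remapping that makes the measure useful for registering MR with CT.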
Abstract:
We describe a general likelihood-based 'mixture model' for inferring phylogenetic trees from gene-sequence or other character-state data. The model accommodates cases in which different sites in the alignment evolve in qualitatively distinct ways, but does not require prior knowledge of these patterns or partitioning of the data. We call this qualitative variability in the pattern of evolution across sites "pattern-heterogeneity" to distinguish it from both a homogeneous process of evolution and from one characterized principally by differences in rates of evolution. We present studies to show that the model correctly retrieves the signals of pattern-heterogeneity from simulated gene-sequence data, and we apply the method to protein-coding genes and to a ribosomal 12S data set. The mixture model outperforms conventional partitioning in both these data sets. We implement the mixture model such that it can simultaneously detect rate- and pattern-heterogeneity. The model simplifies to a homogeneous model or a rate-variability model as special cases, and therefore always performs at least as well as these two approaches, and often considerably improves upon them. We make the model available within a Bayesian Markov-chain Monte Carlo framework for phylogenetic inference, as an easy-to-use computer program.
Abstract:
The alignment of model amyloid peptide YYKLVFFC is investigated in bulk and at a solid surface using a range of spectroscopic methods employing polarized radiation. The peptide is based on a core sequence of the amyloid beta (A beta) peptide, KLVFF. The attached tyrosine and cysteine units are exploited to yield information on alignment and possible formation of disulfide or dityrosine links. Polarized Raman spectroscopy on aligned stalks provides information on tyrosine orientation, which complements data from linear dichroism (LD) on aqueous solutions subjected to shear in a Couette cell. LD provides a detailed picture of alignment of peptide strands and aromatic residues and was also used to probe the kinetics of self-assembly. This suggests initial association of phenylalanine residues, followed by subsequent registry of strands and orientation of tyrosine residues. X-ray diffraction (XRD) data from aligned stalks is used to extract orientational order parameters from the 0.48 nm reflection in the cross-beta pattern, from which an orientational distribution function is obtained. X-ray diffraction on solutions subject to capillary flow confirmed orientation in situ at the level of the cross-beta pattern. The information on fibril and tyrosine orientation from polarized Raman spectroscopy is compared with results from NEXAFS experiments on samples prepared as films on silicon. This indicates fibrils are aligned parallel to the surface, with phenyl ring normals perpendicular to the surface. Possible disulfide bridging leading to peptide dimer formation was excluded by Raman spectroscopy, whereas dityrosine formation was probed by fluorescence experiments and was found not to occur except under alkaline conditions. Congo red binding was found not to influence the cross-beta XRD pattern.
Abstract:
The self-assembly and hydrogelation properties of two Fmoc-tripeptides [Fmoc = N-(fluorenyl-9-methoxycarbonyl)] are investigated in borate buffer and other basic solutions. A remarkable difference in self-assembly properties is observed when comparing Fmoc-VLK(Boc) with Fmoc-K(Boc)LV, both containing K protected by Nε-tert-butyloxycarbonate (Boc). In borate buffer, the former peptide forms highly anisotropic fibrils which show local alignment, and the hydrogels show flow-aligning properties. In contrast, Fmoc-K(Boc)LV forms highly branched fibrils that produce isotropic hydrogels with a much higher modulus (G' > 10^4 Pa) and a lower concentration for hydrogel formation. The distinct self-assembled structures are ascribed to conformational differences, as revealed by secondary structure probes (CD, FTIR, Raman spectroscopy) and X-ray diffraction. Fmoc-VLK(Boc) forms well-defined beta-sheets with a cross-beta X-ray diffraction pattern, whereas Fmoc-K(Boc)LV forms unoriented assemblies with multiple stacked sheets. Interchange of the K and V residues when inverting the tripeptide sequence thus leads to substantial differences in self-assembled structures, suggesting a promising approach to control hydrogel properties.
Abstract:
The elucidation of the domain content of a given protein sequence in the absence of determined structure or significant sequence homology to known domains is an important problem in structural biology. Here we address how successfully the delineation of continuous domains can be accomplished in the absence of sequence homology using simple baseline methods, an existing prediction algorithm (Domain Guess by Size), and a newly developed method (DomSSEA). The study was undertaken with a view to measuring the usefulness of these prediction methods in terms of their application to fully automatic domain assignment. Thus, the sensitivity of each domain assignment method was measured by calculating the number of correctly assigned top scoring predictions. We have implemented a new continuous domain identification method using the alignment of predicted secondary structures of target sequences against observed secondary structures of chains with known domain boundaries as assigned by Class Architecture Topology Homology (CATH). Taking top predictions only, the success rate of the method in correctly assigning domain number to the representative chain set is 73.3%. The top prediction for domain number and location of domain boundaries was correct for 24% of the multidomain set (±20 residues). These results have been put into context in relation to the results obtained from the other prediction methods assessed