14 results for SEQUENCE ALIGNMENT

in Deakin Research Online - Australia


Relevance:

100.00%

Publisher:

Abstract:

The challenge of comparing two or more genomes that have undergone recombination and substantial amounts of segmental loss and gain has recently been addressed for small numbers of genomes. However, datasets of hundreds of genomes are now common and their sizes will only increase in the future. Multiple sequence alignment of hundreds of genomes remains an intractable problem due to quadratic increases in compute time and memory footprint. To date, most alignment algorithms are designed for commodity clusters without parallelism. Hence, we propose the design of a multiple sequence alignment algorithm on massively parallel, distributed memory supercomputers to enable research into comparative genomics on large data sets. Following the methodology of the sequential progressiveMauve algorithm, we design data structures including sequences and sorted k-mer lists on the IBM Blue Gene/P supercomputer (BG/P). Preliminary results show that we can reduce the memory footprint so that we can potentially align over 250 bacterial genomes on a single BG/P compute node. We verify our results on a dataset of E. coli, Shigella and S. pneumoniae genomes. Our implementation returns results matching those of the original algorithm but in half the time and with a quarter of the memory footprint for scaffold building. In this study, we have laid the basis for multiple sequence alignment of large-scale datasets on a massively parallel, distributed memory supercomputer, thus enabling comparison of hundreds instead of a few genome sequences within reasonable time.
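The sorted k-mer lists mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' BG/P implementation and not progressiveMauve's seeding scheme. It assumes plain contiguous k-mers held in memory, and the function names `sorted_kmers` and `unique_shared_seeds` are hypothetical.

```python
from collections import Counter


def sorted_kmers(seq, k):
    """Return (k-mer, start position) pairs sorted lexicographically by k-mer."""
    return sorted((seq[i:i + k], i) for i in range(len(seq) - k + 1))


def unique_shared_seeds(a, b, k):
    """Return (k-mer, pos_in_a, pos_in_b) for k-mers occurring exactly once in each sequence.

    Simplifying assumption: exact contiguous k-mers stand in for the seeds
    a real genome aligner would use.
    """
    list_a, list_b = sorted_kmers(a, k), sorted_kmers(b, k)
    count_a = Counter(kmer for kmer, _ in list_a)
    count_b = Counter(kmer for kmer, _ in list_b)
    # Positions of k-mers that are unique within b.
    pos_b = {kmer: pos for kmer, pos in list_b if count_b[kmer] == 1}
    return [
        (kmer, pos, pos_b[kmer])
        for kmer, pos in list_a
        if count_a[kmer] == 1 and kmer in pos_b
    ]


if __name__ == "__main__":
    print(unique_shared_seeds("ACGTAC", "TTACGT", 3))
```

Sorting makes identical k-mers adjacent, so matches between genomes can be found by scanning lists rather than comparing every pair of positions.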

Relevance:

60.00%

Publisher:

Abstract:

Angiotensin (Ang) I-converting enzyme (ACE) is a member of the gluzincin family of zinc metalloproteinases that contains two homologous catalytic domains. Both the N- and C-terminal domains are peptidyl-dipeptidases that catalyze Ang II formation and bradykinin degradation. Multiple sequence alignment was used to predict His1089 as the catalytic residue in human ACE C-domain that, by analogy with the prototypical gluzincin, thermolysin, stabilizes the scissile carbonyl bond through a hydrogen bond during transition state binding. Site-directed mutagenesis was used to change His1089 to Ala or Leu. At pH 7.5, with Ang I as substrate, kcat/Km values for these Ala and Leu mutants were 430 and 4,000-fold lower, respectively, compared with wild-type enzyme and were mainly due to a decrease in catalytic rate (kcat) with minor effects on ground state substrate binding (Km). A 120,000-fold decrease in the binding of lisinopril, a proposed transition state mimic, was also observed with the His1089 → Ala mutation. ACE C-domain-dependent cleavage of AcAFAA showed a pH optimum of 8.2. H1089A has a pH optimum of 5.5 with no pH dependence of its catalytic activity in the range 6.5-10.5, indicating that the His1089 side chain allows ACE to function as an alkaline peptidyl-dipeptidase. Since transition state mutants of other gluzincins show pH optima shifts toward the alkaline, this effect of His1089 on the ACE pH optimum and its ability to influence transition state binding of the sulfhydryl inhibitor captopril indicate that the catalytic mechanism of ACE is distinct from that of other gluzincins.

Relevance:

60.00%

Publisher:

Abstract:

Non-invasive spatial activity recognition is a difficult task, complicated by variation in how the same activities are conducted and furthermore by noise introduced by video tracking procedures. In this paper we propose an algorithm based on dynamic time warping (DTW) as a viable method with which to quantify segmented spatial activity sequences from a video tracking system. DTW is a widely used technique for optimally aligning or warping temporal sequences through minimisation of the distance between their components. The proposed algorithm, threshold DTW (TDTW), is capable of accurate spatial sequence distance quantification and is shown, using a three-class spatial data set, to be more robust and accurate than DTW and the discrete hidden Markov model (HMM). We also evaluate the application of a band dynamic programming (DP) constraint to TDTW in order to reduce extraneous warping between sequences and to reduce the computational complexity of the approach. Results show that application of a band DP constraint to TDTW improves runtime performance significantly, whilst still maintaining high precision and recall.
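A minimal sketch of plain DTW with a band constraint may help make the idea concrete. It does not reproduce the paper's thresholding step (TDTW), whose details are not given in the abstract. It assumes sequences of 2-D points, Euclidean point distance, and a band expressed as a maximum index offset, and the name `dtw_distance` is hypothetical.

```python
import math


def dtw_distance(a, b, band=None):
    """Dynamic time warping cost between two sequences of points.

    band: optional maximum |i - j| allowed on the warping path
    (a Sakoe-Chiba style band); None means unconstrained.
    """
    n, m = len(a), len(b)
    if band is not None:
        # The band must be at least the length difference,
        # otherwise the end cell is unreachable.
        band = max(band, abs(n - m))
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        lo = 1 if band is None else max(1, i - band)
        hi = m if band is None else min(m, i + band)
        for j in range(lo, hi + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


if __name__ == "__main__":
    walk = [(0, 0), (1, 0), (2, 0)]
    slow_walk = [(0, 0), (1, 0), (1, 0), (2, 0)]
    print(dtw_distance(walk, slow_walk, band=1))
```

Restricting the inner loop to the band is what cuts the work from every cell of the table to a diagonal strip, which is the source of the runtime improvement a band constraint offers.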

Relevance:

60.00%

Publisher:

Abstract:

Environmental context: Soils contaminated with metals can pose both environmental and human health risks. This study showed that a common crop vegetable grown in the presence of cadmium and zinc readily accumulated these metals, and thus could be a source of toxicity when eaten. The work highlights potential health risks from consuming crops grown on contaminated soils. Abstract: Ingestion of plants grown in heavy metal contaminated soils can cause toxicity because of metal accumulation. We compared Cd and Zn levels in Brassica rapa, a widely grown crop vegetable, with those of the hyperaccumulator Solanum nigrum L. Solanum nigrum contained 4 times more Zn and 12 times more Cd than B. rapa, relative to dry mass. In S. nigrum, Cd and Zn preferentially accumulated in the roots, whereas in B. rapa, Cd and Zn were concentrated more in the shoots than in the roots. The different distribution of Cd and Zn in B. rapa and S. nigrum suggests the presence of distinct metal uptake mechanisms. We correlated plant metal content with the expression of a conserved putative natural resistance-associated macrophage protein (NRAMP) metal transporter in both plants. Treatment of both plants with either Cd or Zn increased expression of the NRAMP, with expression levels being higher in the roots than in the shoots. These findings provide insights into the molecular mechanisms of heavy metal processing by S. nigrum L. and the crop vegetable B. rapa that could assist in application of these plants for phytoremediation. These investigations also highlight potential health risks associated with the consumption of crops grown on contaminated soils.

Relevance:

60.00%

Publisher:

Abstract:

BACKGROUND: Aloe vera supports a substantial global trade yet its wild origins, and explanations for its popularity over 500 related Aloe species in one of the world's largest succulent groups, have remained uncertain. We developed an explicit phylogenetic framework to explore links between the rich traditions of medicinal use and leaf succulence in aloes. RESULTS: The phylogenetic hypothesis clarifies the origins of Aloe vera to the Arabian Peninsula at the northernmost limits of the range for aloes. The genus Aloe originated in southern Africa ~16 million years ago and underwent two major radiations driven by different speciation processes, giving rise to the extraordinary diversity known today. Large, succulent leaves typical of medicinal aloes arose during the most recent diversification ~10 million years ago and are strongly correlated to the phylogeny and to the likelihood of a species being used for medicine. A significant, albeit weak, phylogenetic signal is evident in the medicinal uses of aloes, suggesting that the properties for which they are valued do not occur randomly across the branches of the phylogenetic tree. CONCLUSIONS: Phylogenetic investigation of plant use and leaf succulence among aloes has yielded new explanations for the extraordinary market dominance of Aloe vera. The industry preference for Aloe vera appears to be due to its proximity to important historic trade routes, and early introduction to trade and cultivation. Well-developed succulent leaf mesophyll tissue, an adaptive feature that likely contributed to the ecological success of the genus Aloe, is the main predictor for medicinal use among Aloe species, whereas evolutionary loss of succulence tends to be associated with losses of medicinal use. Phylogenetic analyses of plant use offer potential to understand patterns in the value of global plant diversity.

Relevance:

60.00%

Publisher:

Abstract:

AIMS: To characterize genes involved in maintaining homeostatic levels of zinc in the cyanobacterium Nostoc punctiforme. METHODS AND RESULTS: Metal efflux transporters play a central role in maintaining homeostatic levels of trace elements such as zinc. Sequence analyses of the N. punctiforme genome identified two potential cation diffusion facilitator (CDF) metal efflux transporters, Npun_F0707 (Cdf31) and Npun_F1794 (Cdf33). Deletion of either Cdf31 or Cdf33 resulted in increased zinc retention over 3 h. Interestingly, Cdf31(-) and Cdf33(-) mutants showed no change in sensitivity to zinc exposure in comparison with the wild type, suggesting some compensatory capacity for the loss of each other. Using qRT-PCR, a possible interaction was observed between the two cdf genes, where the Cdf31(-) mutant had a more profound effect on cdf33 expression than Cdf33(-) did on cdf31. Over-expression of Cdf31 and Cdf33 in ZntA(-)- and ZitB(-)-deficient Escherichia coli revealed functional similarities between the ZntA and ZitB of E. coli and the cyanobacterial transporters. CONCLUSIONS: The data presented shed light on the function of two important transporters that regulate zinc homeostasis in N. punctiforme. SIGNIFICANCE AND IMPACT OF THE STUDY: This study shows for the first time the functional characterization of two cyanobacterial zinc efflux proteins belonging to the CDF family.

Relevance:

60.00%

Publisher:

Abstract:

An amino acid consensus sequence for the seven serotypes of foot-and-mouth disease virus (FMDV) nonstructural protein 3B, including all three contiguous repeats, and its use in the development of a pan-serotype diagnostic test for all seven FMDV serotypes are described. The amino acid consensus sequence of the 3B protein was determined from a multiple-sequence alignment of 125 sequences of 3B. The consensus 3B (c3B) protein was expressed as a soluble recombinant fusion protein with maltose-binding protein (MBP) using a bacterial expression system and was affinity purified using amylose resin. The MBP-c3B protein was used as the antigen in the development of a competition enzyme-linked immunosorbent assay (cELISA) for detection of anti-3B antibodies in bovine sera. The comparative diagnostic sensitivity and specificity at 47% inhibition were estimated to be 87.22% and 93.15%, respectively. Reactivity of c3B with bovine sera representing the seven FMDV serotypes demonstrated the pan-serotype diagnostic capability of this bioreagent. The consensus antigen and competition ELISA are described here as candidates for a pan-serotype diagnostic test for FMDV infection.
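Deriving a consensus from a multiple-sequence alignment can be sketched as a per-column majority vote. The abstract does not state which rule the authors applied, so the majority rule, the gap handling, and the function name `consensus` are assumptions, and the input rows below are made-up toy strings rather than real 3B sequences.

```python
from collections import Counter


def consensus(alignment, gap="-"):
    """Return the most frequent non-gap residue in each alignment column.

    Ties are resolved in favour of the residue seen first in the column;
    a column containing only gaps yields a gap.
    """
    if not alignment:
        return ""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned rows must have the same length")
    result = []
    for column in zip(*alignment):
        counts = Counter(residue for residue in column if residue != gap)
        result.append(counts.most_common(1)[0][0] if counts else gap)
    return "".join(result)


if __name__ == "__main__":
    toy_alignment = ["MKTALV", "MKTALV", "MKTSLV", "MKT-LV"]
    print(consensus(toy_alignment))
```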

Relevance:

60.00%

Publisher:

Abstract:

Large enterprise software systems make many complex interactions with other services in their environment. Developing and testing for production-like conditions is therefore a very challenging task. Current approaches include emulation of dependent services using either explicit modelling or record-and-replay approaches. Models require deep knowledge of the target services, while record-and-replay is limited in accuracy. Both face developmental and scaling issues. We present a new technique that improves the accuracy of record-and-replay approaches, without requiring prior knowledge of the service protocols. The approach uses Multiple Sequence Alignment to derive message prototypes from recorded system interactions and a scheme to match incoming request messages against prototypes to generate response messages. We use a modified Needleman-Wunsch algorithm for distance calculation during message matching. Our approach has shown greater than 99% accuracy for four evaluated enterprise system messaging protocols. The approach has been successfully integrated into the CA Service Virtualization commercial product to complement its existing techniques.
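The matching step can be illustrated with the standard Needleman-Wunsch global alignment score. The authors use a modified variant that the abstract does not describe, so this is only the textbook algorithm. The scoring values, the function names `needleman_wunsch_score` and `closest_prototype`, and the example messages are all hypothetical.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of strings a and b."""
    # Row 0: aligning the empty prefix of a against prefixes of b costs only gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def closest_prototype(request, prototypes):
    """Pick the prototype with the highest alignment score against the request."""
    return max(prototypes, key=lambda p: needleman_wunsch_score(request, p))


if __name__ == "__main__":
    prototypes = ["GET /user/7", "POST /order"]
    print(closest_prototype("GET /user/42", prototypes))
```

Only two rows of the table are kept because the score alone is needed here. Recovering the alignment itself would require the full table or a traceback structure.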

Relevance:

40.00%

Publisher:

Relevance:

40.00%

Publisher:

Abstract:

Large-scale sequence assembly and alignment are fundamental parts of biological computing. However, most large-scale sequence assembly and alignment tasks require intensive computing power and normally take a very long time to complete. To speed up the assembly and alignment process, this paper parallelizes the Euler sequence assembly and pair-wise/multiple sequence assembly, two important sequence assembly methods, and takes advantage of the Computing Grid, which has a colossal computing capacity, to meet the large-scale biological computing demand.

Relevance:

30.00%

Publisher:

Abstract:

In this paper we address the spatial activity recognition problem with an algorithm based on Smith-Waterman (SW) local alignment. The proposed SW approach utilises dynamic programming with two-dimensional spatial data to quantify sequence similarity. SW is well suited for spatial activity recognition as the approach is robust to noise and can accommodate gaps resulting from tracking system errors. Unlike other approaches, SW is able to locate and quantify activities embedded within extraneous spatial data. Through experimentation with a three-class data set, we show that the proposed SW algorithm is capable of recognising both accurately and inaccurately segmented spatial sequences. To benchmark the technique's classification performance, we compare it to the discrete hidden Markov model (HMM). Results show that SW exhibits higher accuracy than the HMM, and also maintains higher classification accuracy with smaller training set sizes. We also confirm the robust property of the SW approach via evaluation with sequences containing artificially introduced noise.
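A minimal sketch of Smith-Waterman local alignment over sequences of 2-D points follows. The abstract does not give the paper's similarity function or scoring parameters, so treating two points as a match when they lie within a distance tolerance is an assumption, as are the parameter values and the name `smith_waterman_score`.

```python
import math


def smith_waterman_score(a, b, tol=0.5, match=2.0, mismatch=-1.0, gap=-1.0):
    """Return the best local alignment score between two point sequences.

    Two points count as a match when their Euclidean distance is <= tol
    (an illustrative similarity rule, not taken from the paper).
    """
    best = 0.0
    prev = [0.0] * (len(b) + 1)
    for p in a:
        curr = [0.0]
        for j, q in enumerate(b, start=1):
            s = match if math.dist(p, q) <= tol else mismatch
            # Clamping at zero lets an alignment restart anywhere,
            # which is what makes the alignment local.
            cell = max(0.0, prev[j - 1] + s, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


if __name__ == "__main__":
    activity = [(0, 0), (1, 0), (2, 0)]
    embedded = [(9, 9), (0, 0), (1, 0), (2, 0), (9, 9)]
    noisy = [(0, 0), (1, 0), (5, 5), (2, 0)]
    print(smith_waterman_score(activity, embedded))
    print(smith_waterman_score(activity, noisy))
```

The embedded example scores the same as an exact copy of the activity because surrounding points are ignored, while the noisy example loses only one gap penalty for the spurious tracking point.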