991 resultados para sequence database
Resumo:
In this article, we consider the single-machine scheduling problem with past-sequence-dependent (p-s-d) setup times and a learning effect. The setup times are proportional to the length of jobs that are already scheduled; i.e. p-s-d setup times. The learning effect reduces the actual processing time of a job because the workers are involved in doing the same job or activity repeatedly. Hence, the processing time of a job depends on its position in the sequence. In this study, we consider the total absolute difference in completion times (TADC) as the objective function. This problem is denoted as 1/LE, (Spsd)/TADC in Kuo and Yang (2007) ('Single Machine Scheduling with Past-sequence-dependent Setup Times and Learning Effects', Information Processing Letters, 102, 22-26). There are two parameters a and b denoting constant learning index and normalising index, respectively. A parametric analysis of b on the 1/LE, (Spsd)/TADC problem for a given value of a is applied in this study. In addition, a computational algorithm is also developed to obtain the number of optimal sequences and the range of b in which each of the sequences is optimal, for a given value of a. We derive two bounds b* for the normalising constant b and a* for the learning index a. We also show that, when a < a* or b > b*, the optimal sequence is obtained by arranging the longest job in the first position and the rest of the jobs in short processing time order.
Resumo:
The standard quantum search algorithm lacks a feature, enjoyed by many classical algorithms, of having a fixed-point, i.e. a monotonic convergence towards the solution. Here we present two variations of the quantum search algorithm, which get around this limitation. The first replaces selective inversions in the algorithm by selective phase shifts of $\frac{\pi}{3}$. The second controls the selective inversion operations using two ancilla qubits, and irreversible measurement operations on the ancilla qubits drive the starting state towards the target state. Using $q$ oracle queries, these variations reduce the probability of finding a non-target state from $\epsilon$ to $\epsilon^{2q+1}$, which is asymptotically optimal. Similar ideas can lead to robust quantum algorithms, and provide conceptually new schemes for error correction.
Resumo:
This paper describes the efforts at MILE lab, IISc, to create a 100,000-word database each in Kannada and Tamil for the design and development of Online Handwritten Recognition. It has been collected from over 600 users in order to capture the variations in writing style. We describe features of the scripts and how the number of symbols were reduced to be able to effectively train the data for recognition. The list of words include all the characters, Kannada and Indo-Arabic numerals, punctuations and other symbols. A semi-automated tool for the annotation of data from stroke to word level is used. It segments each word into stroke groups and also acts as a validation mechanism for segmentation. The tool displays the stroke, stroke groups and aksharas of a word and hence can be used to study the various styles of writing, delayed strokes and for assigning quality tags to the words. The tool is currently being used for annotating Tamil and Kannada data. The output is stored in a standard XML format.
Resumo:
During V(D)J recombination, RAG (recombination-activating gene) complex cleaves DNA based on sequence specificity. Besides its physiological function, RAG has been shown to act as a structure-specific nuclease. Recently, we showed that the presence of cytosine within the single-stranded region of heteroduplex DNA is important when RAGs cleave on DNA structures. In the present study, we report that heteroduplex DNA containing a bubble region can be cleaved efficiently when present along with a recombination signal sequence (RSS) in cis or trans configuration. The sequence of the bubble region influences RAG cleavage at RSS when present in cis. We also find that the kinetics of RAG cleavage differs between RSS and bubble, wherein RSS cleavage reaches maximum efficiency faster than bubble cleavage. In addition, unlike RSS, RAG cleavage at bubbles does not lead to cleavage complex formation. Finally, we show that the ``nonamer binding region,'' which regulates RAG cleavage on RSS, is not important during RAG activity in non-B DNA structures. Therefore, in the current study, we identify the possible mechanism by which RAG cleavage is regulated when it acts as a structure-specific nuclease. (C) 2011 Elsevier Ltd. All rights reserved.
Resumo:
This paper presents the preliminary analysis of Kannada WordNet and the set of relevant computational tools. Although the design has been inspired by the famous English WordNet, and to certain extent, by the Hindi WordNet, the unique features of Kannada WordNet are graded antonyms and meronymy relationships, nominal as well as verbal compoundings, complex verb constructions and efficient underlying database design (designed to handle storage and display of Kannada unicode characters). Kannada WordNet would not only add to the sparse collection of machine-readable Kannada dictionaries, but also will give new insights into the Kannada vocabulary. It provides sufficient interface for applications involved in Kannada machine translation, spell checker and semantic analyser.
Resumo:
Discovering patterns in temporal data is an important task in Data Mining. A successful method for this was proposed by Mannila et al. [1] in 1997. In their framework, mining for temporal patterns in a database of sequences of events is done by discovering the so called frequent episodes. These episodes characterize interesting collections of events occurring relatively close to each other in some partial order. However, in this framework(and in many others for finding patterns in event sequences), the ordering of events in an event sequence is the only allowed temporal information. But there are many applications where the events are not instantaneous; they have time durations. Interesting episodesthat we want to discover may need to contain information regarding event durations etc. In this paper we extend Mannila et al.’s framework to tackle such issues. In our generalized formulation, episodes are defined so that much more temporal information about events can be incorporated into the structure of an episode. This significantly enhances the expressive capability of the rules that can be discovered in the frequent episode framework. We also present algorithms for discovering such generalized frequent episodes.
Resumo:
The rapidly growing structure databases enhance the probability of finding identical sequences sharing structural similarity. Structure prediction methods are being used extensively to abridge the gap between known protein sequences and the solved structures which is essential to understand its specific biochemical and cellular functions. In this work, we plan to study the ambiguity between sequence-structure relationships and examine if sequentially identical peptide fragments adopt similar three-dimensional structures. Fragments of varying lengths (five to ten residues) were used to observe the behavior of sequence and its three-dimensional structures. The STAMP program was used to superpose the three-dimensional structures and the two parameters (Sequence Structure Similarity Score (Sc) and Root Mean Square Deviation value) were employed to classify them into three categories: similar, intermediate and dissimilar structures. Furthermore, the same approach was carried out on all the three-dimensional protein structures solved in the two organisms, Mycobacterium tuberculosis and Plasmodium falciparum to validate our results.
Resumo:
The regulation of phospholipid biosynthesis in Saccharomyces cerevisiae through cis-acting upstream activating sequence inositol (UAS(ino)) and trans-acting elements, such as the INO2-INO4 complex and OPI1 by inositol supplementation in growth is thoroughly studied. In this study, we provide evidence for the regulation of lipid biosynthesis by phosphatidylinositol-specific phospholipase C (PLC) through UAS(ino) and the trans-acting elements. Gene expression analysis and radiolabelling experiments demonstrated that the overexpression of rice PLC in yeast cells altered phospholipid biosynthesis at the levels of transcriptional and enzyme activity. This is the first report implicating PLC in the direct regulation of lipid biosynthesis. (C) 2012 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.
Resumo:
We report the draft genome sequence of an ST772 Staphylococcus aureus disease isolate carrying staphylococcal cassette chromosome mec (SCCmec) type V from a pyomyositis patient. Our de novo short read assembly is similar to 2.8 Mb and encodes a unique Panton-Valentine leukocidin (PVL) phage with structural genes similar to those of phi 7247PVL and novel lysogenic genes at the N termini.
Resumo:
A computational pipeline PocketAnnotate for functional annotation of proteins at the level of binding sites has been proposed in this study. The pipeline integrates three in-house algorithms for site-based function annotation: PocketDepth, for prediction of binding sites in protein structures; PocketMatch, for rapid comparison of binding sites and PocketAlign, to obtain detailed alignment between pair of binding sites. A novel scheme has been developed to rapidly generate a database of non-redundant binding sites. For a given input protein structure, putative ligand-binding sites are identified, matched in real time against the database and the query substructure aligned with the promising hits, to obtain a set of possible ligands that the given protein could bind to. The input can be either whole protein structures or merely the substructures corresponding to possible binding sites. Structure-based function annotation at the level of binding sites thus achieved could prove very useful for cases where no obvious functional inference can be obtained based purely on sequence or fold-level analyses. An attempt has also been made to analyse proteins of no known function from Protein Data Bank. PocketAnnotate would be a valuable tool for the scientific community and contribute towards structure-based functional inference. The web server can be freely accessed at http://proline.biochem.iisc.ernet.in/pocketannotate/.
Resumo:
Comparison of multiple protein structures has a broad range of applications in the analysis of protein structure, function and evolution. Multiple structure alignment tools (MSTAs) are necessary to obtain a simultaneous comparison of a family of related folds. In this study, we have developed a method for multiple structure comparison largely based on sequence alignment techniques. A widely used Structural Alphabet named Protein Blocks (PBs) was used to transform the information on 3D protein backbone conformation as a ID sequence string. A progressive alignment strategy similar to CLUSTALW was adopted for multiple PB sequence alignment (mulPBA). Highly similar stretches identified by the pairwise alignments are given higher weights during the alignment. The residue equivalences from PB based alignments are used to obtain a three dimensional fit of the structures followed by an iterative refinement of the structural superposition. Systematic comparisons using benchmark datasets of MSTAs underlines that the alignment quality is better than MULTIPROT, MUSTANG and the alignments in HOMSTRAD, in more than 85% of the cases. Comparison with other rigid-body and flexible MSTAs also indicate that mulPBA alignments are superior to most of the rigid-body MSTAs and highly comparable to the flexible alignment methods. (C) 2012 Elsevier Masson SAS. All rights reserved.
Resumo:
Chromosomal aberration is considered to be one of the major characteristic features in many cancers. Chromosomal translocation, one type of genomic abnormality, can lead to deregulation of critical genes involved in regulating important physiological functions such as cell proliferation and DNA repair. Although chromosomal translocations were thought to be random events, recent findings suggest that certain regions in the human genome are more susceptible to breakage than others. The possibility of deviation from the usual B-DNA conformation in such fragile regions has been an active area of investigation. This review summarizes the factors that contribute towards the fragility of these regions in the chromosomes, such as DNA sequences and the role of different forms of DNA structures. Proteins responsible for chromosomal fragility, and their mechanism of action are also discussed. The effect of positioning of chromosomes within the nucleus favoring chromosomal translocations and the role of repair mechanisms are also addressed.
Resumo:
Heat shock protein information resource (HSPIR) is a concerted database of six major heat shock proteins (HSPs), namely, Hsp70, Hsp40, Hsp60, Hsp90, Hsp100 and small HSP. The HSPs are essential for the survival of all living organisms, as they protect the conformations of proteins on exposure to various stress conditions. They are a highly conserved group of proteins involved in diverse physiological functions, including de novo folding, disaggregation and protein trafficking. Moreover, their critical role in the control of disease progression made them a prime target of research. Presently, limited information is available on HSPs in reference to their identification and structural classification across genera. To that extent, HSPIR provides manually curated information on sequence, structure, classification, ontology, domain organization, localization and possible biological functions extracted from UniProt, GenBank, Protein Data Bank and the literature. The database offers interactive search with incorporated tools, which enhances the analysis. HSPIR is a reliable resource for researchers exploring structure, function and evolution of HSPs.
Resumo:
We report the draft genome sequence of methicillin-resistant Staphylococcus aureus (MRSA) strain ST672, an emerging disease clone in India, from a septicemia patient. The genome size is about 2.82 Mb with 2,485 open reading frames (ORFs). The staphylococcal cassette chromosome mec (SCCmec) element (type V) and immune evasion cluster appear to be different from those of strain ST772 on preliminary examination.
Resumo:
Resistance to therapy limits the effectiveness of drug treatment in many diseases. Drug resistance can be considered as a successful outcome of the bacterial struggle to survive in the hostile environment of a drug-exposed cell. An important mechanism by which bacteria acquire drug resistance is through mutations in the drug target. Drug resistant strains (multi-drug resistant and extensively drug resistant) of Mycobacterium tuberculosis are being identified at alarming rates, increasing the global burden of tuberculosis. An understanding of the nature of mutations in different drug targets and how they achieve resistance is therefore important. An objective of this study is to first decipher sequence as well as structural bases for the observed resistance in known drug resistant mutants and then to predict positions in each target that are more prone to acquiring drug resistant mutations. A curated database containing hundreds of mutations in the 38 drug targets of nine major clinical drugs, associated with resistance is studied here. Mutations have been classified into those that occur in the binding site itself, those that occur in residues interacting with the binding site and those that occur in outer zones. Structural models of the wild type and mutant forms of the target proteins have been analysed to seek explanations for reduction in drug binding. Stability analysis of an entire array of 19 mutations at each of the residues for each target has been computed using structural models. Conservation indices of individual residues, binding sites and whole proteins are computed based on sequence conservation analysis of the target proteins. The analyses lead to insights about which positions in the polypeptide chain have a higher propensity to acquire drug resistant mutations. Thus critical insights can be obtained about the effect of mutations on drug binding, in terms of which amino acid positions and therefore which interactions should not be heavily relied upon, which in turn can be translated into guidelines for modifying the existing drugs as well as for designing new drugs. The methodology can serve as a general framework to study drug resistant mutants in other micro-organisms as well.