953 results for Annotation Tag
Abstract:
Pangasianodon hypophthalmus is a commercially important freshwater fish used in inland aquaculture in the Mekong Delta, Vietnam. The current study used Ion Torrent technology to generate EST resources from the kidney of Tra catfish reared at a salinity level of 9 ppt. After trimming and processing, we obtained 2,623,929 reads with an average length of 104 bp. De novo assemblies were generated using CLC Genomic Workbench, Trinity and Velvet/Oases, with the best overall contig performance resulting from the CLC assembly. De novo assembly using CLC yielded 29,940 contigs, allowing identification of 5,710 putative genes when compared with the NCBI non-redundant database. A large number of single nucleotide polymorphisms (SNPs) were also detected. The sequence collection generated in our study represents the most comprehensive transcriptomic resource for P. hypophthalmus available to date.
Abstract:
The rapid increase in genome sequence information has necessitated the annotation of their functional elements, particularly those occurring in the non-coding regions, in the genomic context. The promoter region is the key regulatory region, which enables the gene to be transcribed or repressed, but it is difficult to determine experimentally. Hence an in silico identification of promoters is crucial in order to guide experimental work and to pinpoint the key region that controls the transcription initiation of a gene. In this analysis, we demonstrate that while the promoter regions are in general less stable than the flanking regions, their average free energy varies depending on the GC composition of the flanking genomic sequence. We have therefore obtained a set of free energy threshold values for genomic DNA with varying GC content and used them as generic criteria for predicting promoter regions in several microbial genomes, using an in-house developed tool, 'PromPredict'. On applying it to predict promoter regions corresponding to the 1144 and 612 experimentally validated TSSs in E. coli (50.8% GC) and B. subtilis (43.5% GC), sensitivities of 99% and 95% and precision values of 58% and 60%, respectively, were achieved. For the limited data set of 81 TSSs available for M. tuberculosis (65.6% GC), a sensitivity of 100% and a precision of 49% were obtained.
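The sensitivity and precision figures quoted above follow the usual definitions. A minimal Python sketch of the arithmetic is given here for orientation; the counts in the usage lines are hypothetical and are not taken from the study.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of known sites (e.g. validated TSSs) that received a prediction."""
    return true_positives / (true_positives + false_negatives)


def precision(true_positives, false_positives):
    """Fraction of predictions that correspond to a known site."""
    return true_positives / (true_positives + false_positives)


# Hypothetical counts, for illustration only.
example_sensitivity = sensitivity(true_positives=90, false_negatives=10)
example_precision = precision(true_positives=90, false_positives=60)
```

A predictor that flags many regions tends to score high on sensitivity and lower on precision, which is the pattern reported in the abstract.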
Abstract:
We discuss a dynamic pricing model that will aid an automobile manufacturer in choosing the right price for each customer segment. Though the market structure is an oligopoly, customers get "locked" into a particular technology or company, which makes the situation virtually akin to a monopoly. There are associated network externalities and positive feedback. The key idea in monopoly pricing lies in extracting the customer surplus by exploiting the respective elasticities of demand. We present a Walrasian general equilibrium approach to determine the segment price. We compare the prices obtained from the optimization model with those from Walrasian dynamics. The results are encouraging and can serve as a critical factor in Customer Relationship Management (CRM), thereby helping to manage the lock-in effectively.
Abstract:
Motivation: The number of bacterial genomes being sequenced is increasing very rapidly and hence, it is crucial to have procedures for rapid and reliable annotation of their functional elements such as promoter regions, which control the expression of each gene or each transcription unit of the genome. The present work addresses this requirement and presents a generic method applicable across organisms. Results: Relative stability of the DNA double helical sequences has been used to discriminate promoter regions from non-promoter regions. Based on the difference in stability between neighboring regions, an algorithm has been implemented to predict promoter regions on a large scale over 913 microbial genome sequences. The average free energy values for the promoter regions as well as their downstream regions are found to differ, depending on their GC content. Threshold values to identify promoter regions have been derived using sequences flanking a subset of translation start sites from all microbial genomes and then used to predict promoters over the complete genome sequences. An average recall value of 72% (which indicates the percentage of protein and RNA coding genes with predicted promoter regions assigned to them) and a precision of 56% are achieved over the 913 microbial genome dataset.
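The idea of comparing the stability of neighboring regions can be sketched in a few lines of Python. This is only an illustration under stated assumptions and is not the published algorithm: the dinucleotide energies are toy values chosen so that GC-rich steps are more stable, the window size and cutoff are arbitrary, and all function names are hypothetical.

```python
def dinucleotide_energy(pair):
    """Toy stacking energy (NOT published parameters): more G/C means more negative."""
    gc_count = sum(base in "GC" for base in pair)
    return (-1.0, -1.5, -2.0)[gc_count]


def average_free_energy(seq, start, window):
    """Mean toy energy over the dinucleotide steps inside seq[start:start + window]."""
    steps = [seq[i:i + 2] for i in range(start, start + window - 1)]
    return sum(dinucleotide_energy(step) for step in steps) / len(steps)


def less_stable_upstream_positions(seq, window, min_difference):
    """Positions whose upstream window is less stable (higher energy) than the
    downstream window by at least `min_difference`."""
    hits = []
    for pos in range(window, len(seq) - window + 1):
        upstream = average_free_energy(seq, pos - window, window)
        downstream = average_free_energy(seq, pos, window)
        if upstream - downstream >= min_difference:
            hits.append(pos)
    return hits


# AT-rich stretch followed by a GC-rich stretch: the boundary is flagged.
candidates = less_stable_upstream_positions("ATATATATAT" + "GCGCGCGCGC", 10, 0.5)
```

In the abstract the cutoff is not a single constant but depends on GC content; a fuller sketch would look up `min_difference` from the GC fraction of the surrounding sequence.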
Abstract:
Of the ~4000 ORFs identified through the genome sequence of Mycobacterium tuberculosis (TB) H37Rv, experimentally determined structures are available for 312. Since knowledge of protein structures is essential to obtain a high-resolution understanding of the underlying biology, we seek to obtain a structural annotation for the genome, using computational methods. Structural models were obtained and validated for ~2877 ORFs, covering ~70% of the genome. Functional annotation of each protein was based on fold-based functional assignments and a novel binding site based ligand association. New algorithms for binding site detection and genome scale binding site comparison at the structural level, recently reported from the laboratory, were utilized. Besides these, the annotation covers detection of various sequence and sub-structural motifs and quaternary structure predictions based on the corresponding templates. The study provides an opportunity to obtain a global perspective of the fold distribution in the genome. The annotation indicates that cellular metabolism can be achieved with only 219 folds. New insights about the folds that predominate in the genome, as well as the fold-combinations that make up multi-domain proteins, are also obtained. 1728 binding pockets have been associated with ligands through binding site identification and sub-structure similarity analyses. The resource (http://proline.physics.iisc.ernet.in/Tbstructuralannotation), being one of the first to be based on structure-derived functional annotations at a genome scale, is expected to be useful for better understanding of TB and for application in drug discovery. The reported annotation pipeline is fairly generic and can be applied to other genomes as well.
Abstract:
A computational pipeline, PocketAnnotate, for functional annotation of proteins at the level of binding sites is proposed in this study. The pipeline integrates three in-house algorithms for site-based function annotation: PocketDepth, for prediction of binding sites in protein structures; PocketMatch, for rapid comparison of binding sites; and PocketAlign, to obtain a detailed alignment between a pair of binding sites. A novel scheme has been developed to rapidly generate a database of non-redundant binding sites. For a given input protein structure, putative ligand-binding sites are identified and matched in real time against the database, and the query substructure is aligned with the promising hits to obtain a set of possible ligands that the given protein could bind to. The input can be either whole protein structures or merely the substructures corresponding to possible binding sites. Structure-based function annotation at the level of binding sites thus achieved could prove very useful for cases where no obvious functional inference can be obtained based purely on sequence or fold-level analyses. An attempt has also been made to analyse proteins of no known function from the Protein Data Bank. PocketAnnotate would be a valuable tool for the scientific community and contribute towards structure-based functional inference. The web server can be freely accessed at http://proline.biochem.iisc.ernet.in/pocketannotate/.
Abstract:
This paper describes a semi-automatic tool for annotation of multi-script text from natural scene images. To our knowledge, this is the first tool that deals with multi-script text or arbitrary orientation. The procedure involves manual seed selection followed by a region growing process to segment each word present in the image. The threshold for region growing can be varied by the user so as to ensure pixel-accurate character segmentation. The text present in the image is tagged word-by-word. A virtual keyboard interface has also been designed for entering the ground truth in ten Indic scripts, besides English. The keyboard interface can easily be generated for any script, thereby expanding the scope of the toolkit. Optionally, each segmented word can further be labeled into its constituent characters/symbols. Polygonal masks are used to split or merge the segmented words into valid characters/symbols. The ground truth is represented by a pixel-level segmented image and a '.txt' file that contains information about the number of words in the image, word bounding boxes, script and ground truth Unicode. The toolkit, developed using MATLAB, can be used to generate ground truth and annotation for any generic document image. Thus, it is useful for researchers in the document image processing community for evaluating the performance of document analysis and recognition techniques. The multi-script annotation toolkit (MAST) is available for free download.
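Seeded region growing with a user-adjustable threshold can be illustrated with a short Python sketch. It is a generic version of the technique, not the toolkit's MATLAB implementation; the 4-connected neighbourhood, the comparison against the seed intensity, and the function name are assumptions made for the example.

```python
from collections import deque


def region_grow(image, seed, threshold):
    """Return the set of (row, col) pixels connected to `seed` through
    4-neighbours whose intensity is within `threshold` of the seed intensity."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in region:
                if abs(image[nr][nc] - seed_value) <= threshold:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region


# Small grayscale grid: a dark 2x2 block in the top-left corner.
grid = [
    [10, 12, 200],
    [11, 13, 210],
    [220, 230, 12],
]
word_pixels = region_grow(grid, seed=(0, 0), threshold=5)
```

Raising the threshold lets the region absorb more dissimilar pixels, while lowering it keeps the segment tight; the bottom-right pixel above is similar in intensity but stays outside the region because it is not connected to the seed.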
Abstract:
The aim of this work is to enable seamless transformation of product concepts to CAD models. This necessitates availability of 3D product sketches. The present work concerns intuitive generation of 3D strokes and intrinsic support for space sharing and articulation for the components of the product being sketched. Direct creation of 3D strokes in air lacks precision, stability and control. The inadequacy of proprioceptive feedback for the task is complemented in this work with stereo vision and haptics. Three novel methods based on a pencil-paper interaction analogy for haptic rendering of strokes have been investigated. The pen-tilt based rendering is simpler and was found to be more effective. For spatial conformity, two modes of constraints for the stylus movements, corresponding to motions on a control surface and in a control volume, have been studied using novel reactive and field based haptic rendering schemes. The field based haptics, which in effect creates an attractive force field near a surface, though non-realistic, provided highly effective support for the control-surface constraints. The efficacy of the reactive haptic rendering scheme for the constrained environments has been demonstrated using scribble strokes. This can enable distributed collaborative 3D concept development. The notion of motion constraints, defined through sketch strokes, enables intuitive generation of articulated 3D sketches and direct exploration of motion annotations found in most product concepts. The work thus establishes that modeling of the constraints is a central issue in 3D sketching.
Abstract:
Detection of trace amounts of explosive materials is important for security and pollution control. Four multicomponent metal organic frameworks (MOFs-12, 13, 23, and 123) have been synthesized by employing ligands embedded with fluorescent tags. The multicomponent assembly of the ligands was utilized to acquire a diverse electronic behavior of the MOFs, and the fluorescent tags were strategically chosen to enhance the electron density in the MOFs. The phase purity of the MOFs was established by PXRD, NMR spectroscopy, and finally by single-crystal XRD. Single-crystal structures of the MOFs-12 and 13 showed the formation of three-dimensional porous networks with the aromatic tags projecting inwardly into the pores. These electron-rich MOFs were utilized for detection of explosive nitroaromatic compounds (NACs) through fluorescence quenching with high selectivity and sensitivity. The rate of fluorescence quenching for all the MOFs follows the order of electron deficiency of the NACs. We also showed that the detection of picric acid (PA) by luminescent MOFs is not always reliable and can be misleading. This prompted us to explore these MOFs for sensing picryl chloride (PC), which is as explosive as picric acid and used widely to prepare more stable explosives like 2,4,6-trinitroaniline from PA. Moreover, the recyclability and sensitivity studies indicated that these MOFs can be reused several times with parts per billion (ppb) levels of sensitivity towards PC and 2,4,6-trinitrotoluene (TNT).
Abstract:
Cis-peptide embedded segments are rare in proteins but often highlight their important role in molecular function when they do occur. The high evolutionary conservation of these segments illustrates this observation almost universally, although no attempt has been made to systematically use this information for the purpose of function annotation. In the present study, we demonstrate how geometric clustering and level-specific Gene Ontology molecular-function terms (also known as annotations) can be used in a statistically significant manner to identify cis-embedded segments in a protein linked to its molecular function. The present study identifies novel cis-peptide fragments, which are subsequently used for fragment-based function annotation. Annotation recall benchmarks interpreted using the receiver-operator characteristic plot returned an area-under-curve >0.9, corroborating the utility of the annotation method. In addition, we identified cis-peptide fragments occurring in conjunction with functionally important trans-peptide fragments, providing additional insights into molecular function. We further illustrate the applicability of our method in function annotation where homology-based annotation transfer is not possible. The findings of the present study add to the repertoire of function annotation approaches and also facilitate engineering, design and allied studies around the cis-peptide neighborhood of proteins.
Abstract:
The availability of the genome sequence of Mycobacterium tuberculosis H37Rv has encouraged determination of large numbers of protein structures and detailed definition of the biological information encoded therein; yet, the functions of many proteins in M. tuberculosis remain unknown. The emergence of multidrug resistant strains makes it a priority to exploit recent advances in homology recognition and structure prediction to re-analyse its gene products. Here we report the structural and functional characterization of gene products encoded in the M. tuberculosis genome, with the help of sensitive profile-based remote homology search and fold recognition algorithms, resulting in an enhanced annotation of the proteome where 95% of the M. tuberculosis proteins were identified wholly or partly with information on structure or function. New information includes association of 244 proteins with 205 domain families and a separate set of new association of folds to 64 proteins. Extending structural information across uncharacterized protein families represented in the M. tuberculosis proteome, by determining superfamily relationships between families of known and unknown structures, has contributed to an enhancement in the knowledge of structural content. In retrospect, such superfamily relationships have facilitated recognition of probable structure and/or function for several uncharacterized protein families, eventually aiding recognition of probable functions for homologous proteins corresponding to such families. There are 183 gene products unique to mycobacteria for which no functions could be identified. Of these, 18 were determined to be M. tuberculosis specific. Such pathogen-specific proteins are speculated to harbour virulence factors required for pathogenesis. A re-annotated proteome of M. tuberculosis, with greater completeness of annotated proteins and domain assigned regions, provides a valuable basis for experimental endeavours designed to obtain a better understanding of pathogenesis and to accelerate the process of drug target discovery. (C) 2014 Elsevier Ltd. All rights reserved.