952 results for BIOINFORMATICS DATABASES
Abstract:
Mapping-based visualisations of image databases are well suited to users wanting to survey the overall content of a collection. Given the large amount of image data contained within such visualisations, however, this approach has yet to be applied to large image databases stored remotely. In this technical demonstration, we showcase our Web-Based Images Browser (WBIB). Our novel system makes use of image pyramids so that users can interactively explore mapping-based visualisations of large remote image databases. © 2012 Authors.
Abstract:
Vaccine design is highly suited to the application of in silico techniques, for both the discovery and development of new and existing vaccines. Here, we discuss computational contributions to epitope mapping and reverse vaccinology, two techniques central to the new discipline of immunomics. Also discussed are methods to improve the efficiency of vaccination, such as codon optimization and adjuvant discovery in addition to the identification of allergenic proteins. We also review current software developed to facilitate vaccine design.
Abstract:
G protein-coupled receptors (GPCRs) are amongst the best studied and most functionally diverse types of cell-surface protein. The importance of GPCRs as mediators of cell function and organismal development underlies their involvement in key physiological roles and their prominence as targets for pharmacological therapeutics. In this review, we highlight the requirement for integrated protocols which underline the different perspectives offered by different sequence analysis methods. BLAST and FastA offer broad brush strokes. Motif-based search methods add the fine detail. Structural modelling offers another perspective which allows us to elucidate the physicochemical properties that underlie ligand binding. Together, these different views provide a more informative and more detailed picture of GPCR structure and function. Many GPCRs remain orphan receptors with no identified ligand, yet as computer-driven functional genomics starts to elaborate their functions, a new understanding of their roles in cell and developmental biology will follow.
Abstract:
Methods and software have been developed for integrating databases (DBs) on the properties of inorganic materials and substances. The integration of the information systems is based on a combination of known approaches: EII (Enterprise Information Integration) and EAI (Enterprise Application Integration). The kernel of the integrated system is the metabase, a special database that stores data on the contents of the integrated DBs. The proposed methods have been applied to create an integrated system of DBs in the field of inorganic chemistry and materials science. An important feature of the integrated system is its ability to include DBs created with different DBMSs on substantially different computer platforms, Sun (DB "Diagram") and Intel (other DBs), and on diverse operating systems, Sun Solaris (DB "Diagram") and Microsoft Windows Server (other DBs).
Abstract:
Analysing the molecular polymorphism and interactions of DNA, RNA and proteins is of fundamental importance in biology. Predicting functions of polymorphic molecules is important in order to design more effective medicines. Analysing major histocompatibility complex (MHC) polymorphism is important for mate choice, epitope-based vaccine design, transplantation rejection, etc. Most of the existing exploratory approaches cannot analyse these datasets because of the large number of molecules with a high number of descriptors per molecule. This thesis develops novel methods for data projection in order to explore high-dimensional biological datasets by visualising them in a low-dimensional space. With increasing dimensionality, some existing data visualisation methods such as generative topographic mapping (GTM) become computationally intractable. We propose variants of these methods, where we use log-transformations at certain steps of the expectation maximisation (EM) based parameter learning process, to make them tractable for high-dimensional datasets. We demonstrate these proposed variants on both a synthetic dataset and an electrostatic potential dataset of MHC class-I. We also propose to extend a latent trait model (LTM), suitable for visualising high-dimensional discrete data, to simultaneously estimate feature saliency as an integrated part of the parameter learning process of a visualisation model. This LTM variant not only gives better visualisation by modifying the projection map based on feature relevance, but also helps users to assess the significance of each feature. Another problem which is not addressed much in the literature is the visualisation of mixed-type data. We propose to combine GTM and LTM in a principled way where appropriate noise models are used for each type of data in order to visualise mixed-type data in a single plot. We call this model a generalised GTM (GGTM).
We also propose to extend the GGTM model to estimate feature saliencies while training a visualisation model, and this is called GGTM with feature saliency (GGTM-FS). We demonstrate the effectiveness of these proposed models on both synthetic and real datasets. We evaluate visualisation quality using quality metrics such as the distance distortion measure and rank-based measures: trustworthiness, continuity, and mean relative rank errors with respect to data space and latent space. In cases where the labels are known, we also use the quality metrics of KL divergence and nearest neighbour classification error in order to determine the separation between classes. We demonstrate the efficacy of these proposed models on both synthetic and real biological datasets, with a main focus on the MHC class-I dataset.
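The abstract does not say at which EM steps the log-transformations are applied. One place where such a transformation is commonly needed is the E-step, where component responsibilities underflow to zero in high dimensions. The following is only a minimal sketch of that general idea, not the thesis's method: it assumes equal-weight isotropic Gaussian components with a shared inverse variance `beta`, and the function names and toy data are hypothetical.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))

def log_responsibilities(x, centres, beta):
    """Log posterior over components for one data point x.

    Assumption: equal-weight isotropic Gaussians with inverse variance beta,
    so the shared normalising constant cancels and can be omitted.
    """
    log_p = [
        -0.5 * beta * sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        for c in centres
    ]
    norm = log_sum_exp(log_p)
    return [lp - norm for lp in log_p]

# Hypothetical toy data: one 2000-dimensional point and two component centres.
D = 2000
x = [0.0] * D
centres = [[1.0] * D, [1.1] * D]

log_r = log_responsibilities(x, centres, beta=1.0)
resp = [math.exp(v) for v in log_r]

# Computed directly, each likelihood would be exp(-1000) or smaller,
# which underflows to 0.0 and makes the normalisation 0/0.
naive_likelihood = math.exp(-0.5 * D)
```

Working with log-probabilities keeps the normalisation well defined even though every individual likelihood underflows in double precision.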
Abstract:
The current state of Russian databases on the properties of substances and materials is considered. A brief review of methods for integrating these information systems is given, and an approach to integrating distributed databases based on a metabase is proposed. Implementation details of the proposed approach are given for the integration of databases on electronics materials. An operating pilot version of the integrated information system, implemented at IMET RAS, is described.
Abstract:
The principles of organization of a distributed system of databases on the properties of inorganic substances and materials, based on the use of a special reference database, are considered. The latter includes not only information on the location of data about a given substance in other databases but also brief information on the most widespread properties of inorganic substances. The proposed principles were successfully applied in creating the distributed system of databases on properties of inorganic compounds developed by the A.A. Baikov Institute of Metallurgy and Materials Science of the Russian Academy of Sciences.
Abstract:
Visual information is becoming increasingly important and tools to manage repositories of media collections are highly sought after. In this paper, we focus on image databases and on how to effectively and efficiently access these. In particular, we present effective image browsing systems that are operated on a large multi-touch environment for truly interactive exploration. Not only do image browsers pose a useful alternative to retrieval-based systems, they also provide a visualisation of the whole image collection and let users explore particular parts of the collection. Our systems are based on the idea that visually similar images are located close to each other in the visualisation, that image thumbnails are arranged on a regular lattice (either a regular grid projected on a sphere or a hexagonal lattice), and that large image datasets can be accessed through a hierarchical tree structure. © 2014 International Information Institute.
Abstract:
This paper surveys research in the field of data mining, which is related to discovering the dependencies between attributes in databases. We consider a number of approaches to finding the distribution intervals of association rules, to discovering branching dependencies between a given set of attributes and a given attribute in a database relation, to finding fractional dependencies between a given set of attributes and a given attribute in a database relation, and to collaborative filtering.
Abstract:
Image database visualisations, in particular mapping-based visualisations, provide an interesting approach to accessing image repositories as they are able to overcome some of the drawbacks associated with retrieval-based approaches. However, making a mapping-based approach work efficiently on large remote image databases has yet to be explored. In this paper, we present Web-Based Images Browser (WBIB), a novel system that efficiently employs image pyramids to reduce bandwidth requirements so that users can interactively explore large remote image databases. © 2013 Authors.
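The abstract gives no implementation details for WBIB's image pyramids. As a rough illustration of why a pyramid reduces bandwidth, the sketch below builds a generic tiled pyramid and lists the tiles a client would need for a given viewport. The 256-pixel tile size, the halving between levels, and both function names are assumptions made for this example.

```python
import math

def pyramid_levels(width, height, tile=256):
    """Return (width, height) per level, from full resolution down to one tile."""
    levels = [(width, height)]
    while width > tile or height > tile:
        # Each coarser level halves the resolution (rounding up).
        width = max(1, math.ceil(width / 2))
        height = max(1, math.ceil(height / 2))
        levels.append((width, height))
    return levels

def tiles_for_viewport(level_size, viewport, tile=256):
    """List the (col, row) tiles of one level that overlap viewport (x, y, w, h)."""
    lw, lh = level_size
    x, y, w, h = viewport
    # Clip the viewport to the bounds of this level.
    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(lw, x + w), min(lh, y + h)
    if x0 >= x1 or y0 >= y1:
        return []
    cols = range(x0 // tile, (x1 - 1) // tile + 1)
    rows = range(y0 // tile, (y1 - 1) // tile + 1)
    return [(c, r) for r in rows for c in cols]

# Hypothetical visualisation rendered at 16384 x 8192 pixels.
levels = pyramid_levels(16384, 8192)

# Zoomed out: the coarsest level covers the whole map with a single tile.
overview_tiles = tiles_for_viewport(levels[-1], (0, 0, 1024, 768))

# Zoomed in: only the tiles under the 1024 x 768 viewport are requested.
detail_tiles = tiles_for_viewport(levels[0], (4000, 2000, 1024, 768))
```

In this toy setup the full-resolution level holds 64 × 32 = 2048 tiles, yet the zoomed-in view needs only 20 of them and the overview needs one.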
Abstract:
Background: Allergy is a form of hypersensitivity to normally innocuous substances, such as dust, pollen, foods or drugs. Allergens are small antigens that commonly provoke an IgE antibody response. There are two types of bioinformatics-based allergen prediction. The first approach follows FAO/WHO Codex Alimentarius guidelines and searches for sequence similarity. The second approach is based on identifying conserved allergenicity-related linear motifs. Both approaches assume that allergenicity is a linearly coded property. In the present study, we applied ACC pre-processing to sets of known allergens, developing alignment-independent models for allergen recognition based on the main chemical properties of amino acid sequences. Results: A set of 684 food, 1,156 inhalant and 555 toxin allergens was collected from several databases. A set of non-allergens from the same species was selected to mirror the allergen set. The amino acids in the protein sequences were described by three z-descriptors (z1, z2 and z3) and converted into uniform vectors by auto- and cross-covariance (ACC) transformation. Each protein was represented as a vector of 45 variables. Five machine learning methods for classification were applied in the study to derive models for allergen prediction. The methods were: discriminant analysis by partial least squares (DA-PLS), logistic regression (LR), decision tree (DT), naïve Bayes (NB) and k nearest neighbours (kNN). The best performing model was derived by kNN at k = 3. It was optimized, cross-validated and implemented in a server named AllerTOP, freely accessible at http://www.pharmfac.net/allertop. AllerTOP also predicts the most probable route of exposure. In comparison to other servers for allergen prediction, AllerTOP outperforms them with 94% sensitivity. Conclusions: AllerTOP is the first alignment-free server for in silico prediction of allergens based on the main physicochemical properties of proteins. Significantly, as well as allergenicity, AllerTOP is able to predict the route of allergen exposure: food, inhalant or toxin. © 2013 Dimitrov et al.; licensee BioMed Central Ltd.
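To illustrate how an ACC transformation turns sequences of different lengths into uniform vectors, here is a minimal sketch using a commonly cited form of the transform, ACC_jk(lag) = Σ z_j(i) · z_k(i+lag) / (n − lag). The maximum lag of 5 is an assumption; it is chosen because 3 descriptors × 3 descriptors × 5 lags gives the 45 variables the abstract mentions. The descriptor table holds placeholder numbers, not the published z-scale values, and the function names are hypothetical.

```python
def acc_transform(z_rows, max_lag=5):
    """Auto- and cross-covariance vector for one sequence.

    z_rows: list of per-residue descriptor tuples, e.g. (z1, z2, z3).
    Returns d * d * max_lag values, regardless of sequence length.
    """
    n = len(z_rows)
    d = len(z_rows[0])
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    features = []
    for lag in range(1, max_lag + 1):
        for j in range(d):
            for k in range(d):
                # j == k gives auto-covariance, j != k gives cross-covariance.
                total = sum(
                    z_rows[i][j] * z_rows[i + lag][k] for i in range(n - lag)
                )
                features.append(total / (n - lag))
    return features

# Placeholder descriptors for four residues (NOT the published z-scale values).
TOY_Z = {
    "A": (0.1, -1.0, 0.5),
    "C": (0.8, 0.2, -0.3),
    "D": (-1.2, 0.4, 0.9),
    "E": (0.3, 0.3, -0.7),
}

def encode(sequence):
    """Map each residue letter to its (z1, z2, z3) tuple."""
    return [TOY_Z[residue] for residue in sequence]

short_vec = acc_transform(encode("ACDEAC"))
long_vec = acc_transform(encode("ACDEACDEEDCAACDE"))
```

Both toy sequences map to vectors of the same length, which is what allows a fixed-input classifier such as kNN to be applied without aligning the sequences.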
Abstract:
The IUPHAR database (IUPHAR-DB) integrates peer-reviewed pharmacological, chemical, genetic, functional and anatomical information on the 354 nonsensory G protein-coupled receptors (GPCRs), 71 ligand-gated ion channel subunits and 141 voltage-gated-like ion channel subunits encoded by the human, rat and mouse genomes. These genes represent the targets of approximately one-third of currently approved drugs and are a major focus of drug discovery and development programs in the pharmaceutical industry. IUPHAR-DB provides a comprehensive description of the genes and their functions, with information on protein structure and interactions, ligands, expression patterns, signaling mechanisms, functional assays and biologically important receptor variants (e.g. single nucleotide polymorphisms and splice variants). In addition, the phenotypes resulting from altered gene expression (e.g. in genetically altered animals or in human genetic disorders) are described. The content of the database is peer reviewed by members of the International Union of Basic and Clinical Pharmacology Committee on Receptor Nomenclature and Drug Classification (NC-IUPHAR); the data are provided through manual curation of the primary literature by a network of over 60 subcommittees of NC-IUPHAR. Links to other bioinformatics resources, such as NCBI, Uniprot, HGNC and the rat and mouse genome databases are provided. IUPHAR-DB is freely available at http://www.iuphar-db.org. © 2008 The Author(s).
Abstract:
In this chapter we provide a comprehensive overview of the emerging field of visualising and browsing image databases. We start with a brief introduction to content-based image retrieval and the traditional query-by-example search paradigm that many retrieval systems employ. We specify the problems associated with this type of interface, such as users not being able to formulate a query due to not having a target image or concept in mind. The idea of browsing systems is then introduced as a means to combat these issues, harnessing the cognitive power of the human mind in order to speed up image retrieval. We detail common methods in which the often high-dimensional feature data extracted from images can be used to visualise image databases in an intuitive way. Systems using dimensionality reduction techniques, such as multi-dimensional scaling, are reviewed along with those that cluster images using either divisive or agglomerative techniques as well as graph-based visualisations. While visualisation of an image collection is useful for providing an overview of the contained images, it forms only part of an image database navigation system. We therefore also present various methods provided by these systems to allow for interactive browsing of these datasets. A further area we explore is user studies of systems and visualisations, where we look at the different evaluations undertaken in order to test usability and compare systems, and highlight the key findings from these studies. We conclude the chapter with several recommendations for future work in this area. © 2011 Springer-Verlag Berlin Heidelberg.
Abstract:
Image collections are ever growing and hence efficient and effective tools to manage these repositories are highly sought after. In this paper, we present effective image browsing systems that are operated on a large multi-touch environment for truly interactive exploration. Not only do image browsers pose a useful alternative to retrieval-based systems, they also provide a visualisation of the whole image collection and allow users to interactively explore particular parts of the collection. Our systems are based on the idea that visually similar images are located close to each other in the visualisation, that image thumbnails are arranged on a regular lattice (either a regular grid projected onto a sphere or a hexagonal lattice), and that large image datasets can be accessed through a hierarchical tree structure. A pilot study has shown that the presented systems do indeed work well and are preferred compared to conventional image browsers. © 2011 IEEE.