952 resultados para BIOINFORMATICS DATABASES


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Evolutionary search algorithms have become an essential asset in the algorithmic toolbox for solving high-dimensional optimization problems in across a broad range of bioinformatics problems. Genetic algorithms, the most well-known and representative evolutionary search technique, have been the subject of the major part of such applications. Estimation of distribution algorithms (EDAs) offer a novel evolutionary paradigm that constitutes a natural and attractive alternative to genetic algorithms. They make use of a probabilistic model, learnt from the promising solutions, to guide the search process. In this paper, we set out a basic taxonomy of EDA techniques, underlining the nature and complexity of the probabilistic model of each EDA variant. We review a set of innovative works that make use of EDA techniques to solve challenging bioinformatics problems, emphasizing the EDA paradigm's potential for further research in this domain.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Biomedical ontologies are key elements for building up the Life Sciences Semantic Web. Reusing and building biomedical ontologies requires flexible and versatile tools to manipulate them efficiently, in particular for enriching their axiomatic content. The Ontology Pre Processor Language (OPPL) is an OWL-based language for automating the changes to be performed in an ontology. OPPL augments the ontologists’ toolbox by providing a more efficient, and less error-prone, mechanism for enriching a biomedical ontology than that obtained by a manual treatment. Results We present OPPL-Galaxy, a wrapper for using OPPL within Galaxy. The functionality delivered by OPPL (i.e. automated ontology manipulation) can be combined with the tools and workflows devised within the Galaxy framework, resulting in an enhancement of OPPL. Use cases are provided in order to demonstrate OPPL-Galaxy’s capability for enriching, modifying and querying biomedical ontologies. Conclusions Coupling OPPL-Galaxy with other bioinformatics tools of the Galaxy framework results in a system that is more than the sum of its parts. OPPL-Galaxy opens a new dimension of analyses and exploitation of biomedical ontologies, including automated reasoning, paving the way towards advanced biological data analyses.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Over the last few years, the Pennsylvania State University (PSU) under the sponsorship of the US Nuclear Regulatory Commission (NRC) has prepared, organized, conducted, and summarized two international benchmarks based on the NUPEC data—the OECD/NRC Full-Size Fine-Mesh Bundle Test (BFBT) Benchmark and the OECD/NRC PWR Sub-Channel and Bundle Test (PSBT) Benchmark. The benchmarks’ activities have been conducted in cooperation with the Nuclear Energy Agency/Organization for Economic Co-operation and Development (NEA/OECD) and the Japan Nuclear Energy Safety (JNES) Organization. This paper presents an application of the joint Penn State University/Technical University of Madrid (UPM) version of the well-known sub-channel code COBRA-TF (Coolant Boiling in Rod Array-Two Fluid), namely, CTF, to the steady state critical power and departure from nucleate boiling (DNB) exercises of the OECD/NRC BFBT and PSBT benchmarks. The goal is two-fold: firstly, to assess these models and to examine their strengths and weaknesses; and secondly, to identify the areas for improvement.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

El objetivo principal de este proyecto ha sido introducir aprendizaje automático en la aplicación FleSe. FleSe es una aplicación web que permite realizar consultas borrosas sobre bases de datos nítidos. Para llevar a cabo esta función la aplicación utiliza unos criterios para definir los conceptos borrosos usados para llevar a cabo las consultas. FleSe además permite que el usuario cambie estas personalizaciones. Es aquí donde introduciremos el aprendizaje automático, de tal manera que los criterios por defecto cambien y aprendan en función de las personalizaciones que van realizando los usuarios. Los objetivos secundarios han sido familiarizarse con el desarrollo y diseño web, al igual que recordar y ampliar el conocimiento sobre lógica borrosa y el lenguaje de programación lógica Ciao-Prolog. A lo largo de la realización del proyecto y sobre todo después del estudio de los resultados se demuestra que la agrupación de los usuarios marca la diferencia con la última versión de la aplicación. Esto se basa en la siguiente idea, podemos usar un algoritmo de aprendizaje automático sobre las personalizaciones de los criterios de todos los usuarios, pero la gran diversidad de opiniones de los usuarios puede llevar al algoritmo a concluir criterios erróneos o no representativos. Para solucionar este problema agrupamos a los usuarios intentando que cada grupo tengan la misma opinión o mismo criterio sobre el concepto. Y después de haber realizado las agrupaciones usar el algoritmo de aprendizaje automático para precisar el criterio por defecto de cada grupo de usuarios. Como posibles mejoras para futuras versiones de la aplicación FleSe sería un mejor control y manejo del ejecutable plserver. Este archivo se encarga de permitir a la aplicación web usar el lenguaje de programación lógica Ciao-Prolog para llevar a cabo la lógica borrosa relacionada con las consultas. Uno de los problemas más importantes que ofrece plserver es que bloquea el hilo de ejecución al intentar cargar un archivo con errores y en caso de ocurrir repetidas veces bloquea todas las peticiones siguientes bloqueando la aplicación. Pensando en los usuarios y posibles clientes, sería también importante permitir que FleSe trabajase con bases de datos de SQL en vez de almacenar la base de datos en los archivos de Prolog. Otra posible mejora basarse en distintas características a la hora de agrupar los usuarios dependiendo de los conceptos borrosos que se van ha utilizar en las consultas. Con esto se conseguiría que para cada concepto borroso, se generasen distintos grupos de usuarios, los cuales tendrían opiniones distintas sobre el concepto en cuestión. Así se generarían criterios por defecto más precisos para cada usuario y cada concepto borroso.---ABSTRACT---The main objective of this project has been to introduce machine learning in the application FleSe. FleSe is a web application that makes fuzzy queries over databases with precise information, using defined criteria to define the fuzzy concepts used by the queries. The application allows the users to change and custom these criteria. On this point is where the machine learning would be introduced, so FleSe learn from every new user customization of the criteria in order to generate a new default value of it. The secondary objectives of this project were get familiar with web development and web design in order to understand the how the application works, as well as refresh and improve the knowledge about fuzzy logic and logic programing. During the realization of the project and after the study of the results, I realized that clustering the users in different groups makes the difference between this new version of the application and the previous. This conclusion follows the next idea, we can use an algorithm to introduce machine learning over the criteria that people have, but the problem is the diversity of opinions and judgements that exists, making impossible to generate a unique correct criteria for all the users. In order to solve this problem, before using the machine learning methods, we cluster the users in order to make groups that have the same opinion, and afterwards, use the machine learning methods to precise the default criteria of each users group. The future improvements that could be important for the next versions of FleSe will be to control better the behaviour of the plserver file, that cost many troubles at the beginning of this project and it also generate important errors in the previous version. The file plserver allows the web application to use Ciao-Prolog, a logic programming language that control and manage all the fuzzy logic. One of the main problems with plserver is that when the user uploads a file with errors, it will block the thread and when this happens multiple times it will start blocking all the requests. Oriented to the customer, would be important as well to allow FleSe to manage and work with SQL databases instead of store the data in the Prolog files. Another possible improvement would that the cluster algorithm would be based on different criteria depending on the fuzzy concepts that the selected Prolog file have. This will generate more meaningful clusters, and therefore, the default criteria offered to the users will be more precise.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Expressed sequence tags (ESTs) are randomly sequenced cDNA clones. Currently, nearly 3 million human and 2 million mouse ESTs provide valuable resources that enable researchers to investigate the products of gene expression. The EST databases have proven to be useful tools for detecting homologous genes, for exon mapping, revealing differential splicing, etc. With the increasing availability of large amounts of poorly characterised eukaryotic (notably human) genomic sequence, ESTs have now become a vital tool for gene identification, sometimes yielding the only unambiguous evidence for the existence of a gene expression product. However, BLAST-based Web servers available to the general user have not kept pace with these developments and do not provide appropriate tools for querying EST databases with large highly spliced genes, often spanning 50 000–100 000 bases or more. Here we describe Gene2EST (http://woody.embl-heidelberg.de/gene2est/), a server that brings together a set of tools enabling efficient retrieval of ESTs matching large DNA queries and their subsequent analysis. RepeatMasker is used to mask dispersed repetitive sequences (such as Alu elements) in the query, BLAST2 for searching EST databases and Artemis for graphical display of the findings. Gene2EST combines these components into a Web resource targeted at the researcher who wishes to study one or a few genes to a high level of detail.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/) is maintained at the European Bioinformatics Institute (EBI) in an international collaboration with the DNA Data Bank of Japan (DDBJ) and GenBank at the NCBI (USA). Data is exchanged amongst the collaborating databases on a daily basis. The major contributors to the EMBL database are individual authors and genome project groups. Webin is the preferred web-based submission system for individual submitters, whilst automatic procedures allow incorporation of sequence data from large-scale genome sequencing centres and from the European Patent Office (EPO). Database releases are produced quarterly. Network services allow free access to the most up-to-date data collection via ftp, email and World Wide Web interfaces. EBI’s Sequence Retrieval System (SRS), a network browser for databanks in molecular biology, integrates and links the main nucleotide and protein databases plus many specialized databases. For sequence similarity searching a variety of tools (e.g. Blitz, Fasta, BLAST) are available which allow external users to compare their own sequences against the latest data in the EMBL Nucleotide Sequence Database and SWISS-PROT.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The ARKdb genome databases provide comprehensive public repositories for genome mapping data from farmed species and other animals (http://www.thearkdb.org) providing a resource similar in function to that offered by GDB or MGD for human or mouse genome mapping data, respectively. Because we have attempted to build a generic mapping database, the system has wide utility, particularly for those species for which development of a specific resource would be prohibitive. The ARKdb genome database model has been implemented for 10 species to date. These are pig, chicken, sheep, cattle, horse, deer, tilapia, cat, turkey and salmon. Access to the ARKdb databases is effected via the World Wide Web using the ARKdb browser and Anubis map viewer. The information stored includes details of loci, maps, experimental methods and the source references. Links to other information sources such as PubMed and EMBL/GenBank are provided. Responsibility for data entry and curation is shared amongst scientists active in genome research in the species of interest. Mirror sites in the United States are maintained in addition to the central genome server at Roslin.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The RESID Database is a comprehensive collection of annotations and structures for protein post-translational modifications including N-terminal, C-terminal and peptide chain cross-link modifications. The RESID Database includes systematic and frequently observed alternate names, Chemical Abstracts Service registry numbers, atomic formulas and weights, enzyme activities, taxonomic range, keywords, literature citations with database cross-references, structural diagrams and molecular models. The NRL-3D Sequence–Structure Database is derived from the three-dimensional structure of proteins deposited with the Research Collaboratory for Structural Bioinformatics Protein Data Bank. The NRL-3D Database includes standardized and frequently observed alternate names, sources, keywords, literature citations, experimental conditions and searchable sequences from model coordinates. These databases are freely accessible through the National Cancer Institute–Frederick Advanced Biomedical Computing Center at these web sites: http://www.ncifcrf.gov/RESID, http://www.ncifcrf.gov/ NRL-3D; or at these National Biomedical Research Foundation Protein Information Resource web sites: http://pir.georgetown.edu/pirwww/dbinfo/resid.html, http://pir.georgetown.edu/pirwww/dbinfo/nrl3d.html

Relevância:

20.00% 20.00%

Publicador:

Resumo:

There is no control over the information provided with sequences when they are deposited in the sequence databases. Consequently mistakes can seed the incorrect annotation of other sequences. Grouping genes into families and applying controlled annotation overcomes the problems of incorrect annotation associated with individual sequences. Two databases (http://www.mendel.ac.uk) were created to apply controlled annotation to plant genes and plant ESTs: Mendel-GFDb is a database of plant protein (gene) families based on gapped-BLAST analysis of all sequences in the SWISS-PROT family of databases. Sequences are aligned (ClustalW) and identical and similar residues shaded. The families are visually curated to ensure that one or more criteria, for example overall relatedness and/or domain similarity relate all sequences within a family. Sequence families are assigned a ‘Gene Family Number’ and a unified description is developed which best describes the family and its members. If authority exists the gene family is assigned a ‘Gene Family Name’. This information is placed in Mendel-GFDb. Mendel-ESTS is primarily a database of plant ESTs, which have been compared to Mendel-GFDb, completely sequenced genomes and domain databases. This approach associated ESTs with individual sequences and the controlled annotation of gene families and protein domains; the information being placed in Mendel-ESTS. The controlled annotation applied to genes and ESTs provides a basis from which a plant transcription database can be developed.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Support for molecular biology researchers has been limited to traditional library resources and services in most academic health sciences libraries. The University of Washington Health Sciences Libraries have been providing specialized services to this user community since 1995. The library recruited a Ph.D. biologist to assess the molecular biological information needs of researchers and design strategies to enhance library resources and services. A survey of laboratory research groups identified areas of greatest need and led to the development of a three-pronged program: consultation, education, and resource development. Outcomes of this program include bioinformatics consultation services, library-based and graduate level courses, networking of sequence analysis tools, and a biological research Web site. Bioinformatics clients are drawn from diverse departments and include clinical researchers in need of tools that are not readily available outside of basic sciences laboratories. Evaluation and usage statistics indicate that researchers, regardless of departmental affiliation or position, require support to access molecular biology and genetics resources. Centralizing such services in the library is a natural synergy of interests and enhances the provision of traditional library resources. Successful implementation of a library-based bioinformatics program requires both subject-specific and library and information technology expertise.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The Internet has created new opportunities for librarians to present literature search results to clinicians. In order to take full advantage of these opportunities, libraries need to create locally maintained bibliographic databases. A simple method of creating a local bibliographic database and publishing it on the Web is described. The method uses off-the-shelf software and requires minimal programming. A hedge search strategy for outcome studies of clinical process interventions is created, and Ovid is used to search MEDLINE. The search results are saved and imported into EndNote libraries. The citations are modified, exported to a Microsoft Access database, and published on the Web. Clinicians can use a Web browser to search the database. The bibliographic database contains 13,803 MEDLINE citations of outcome studies. Most searches take between four and ten seconds and retrieve between ten and 100 citations. The entire cost of the software is under $900. Locally maintained bibliographic databases can be created easily and inexpensively. They significantly extend the evidence-based health care services that libraries can offer to clinicians.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Natural Language Interfaces to Query Databases (NLIDBs) have been an active research field since the 1960s. However, they have not been widely adopted. This article explores some of the biggest challenges and approaches for building NLIDBs and proposes techniques to reduce implementation and adoption costs. The article describes {AskMe*}, a new system that leverages some of these approaches and adds an innovative feature: query-authoring services, which lower the entry barrier for end users. Advantages of these approaches are proven with experimentation. Results confirm that, even when {AskMe*} is automatically reconfigurable against multiple domains, its accuracy is comparable to domain-specific NLIDBs.