52 resultados para Database management -- Computer programs

em Université de Lausanne, Switzerland


Relevância:

100.00% 100.00%

Publicador:

Resumo:

The Eukaryotic Promoter Database (EPD) is an annotated non-redundant collection of eukaryotic POL II promoters for which the transcription start site has been determined experimentally. Access to promoter sequences is provided by pointers to positions in nucleotide sequence entries. The annotation part of an entry includes a description of the initiation site mapping data, exhaustive cross-references to the EMBL nucleotide sequence database, SWISS-PROT, TRANSFAC and other databases, as well as bibliographic references. EPD is structured in a way that facilitates dynamic extraction of biologically meaningful promoter subsets for comparative sequence analysis. WWW-based interfaces have been developed that enable the user to view EPD entries in different formats, to select and extract promoter sequences according to a variety of criteria, and to navigate to related databases exploiting different cross-references. The EPD web site also features yearly updated base frequency matrices for major eukaryotic promoter elements. EPD can be accessed at http://www.epd.isb-sib.ch

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Therapeutic drug monitoring (TDM) aims to optimize treatments by individualizing dosage regimens based on the measurement of blood concentrations. Dosage individualization to maintain concentrations within a target range requires pharmacokinetic and clinical capabilities. Bayesian calculations currently represent the gold standard TDM approach but require computation assistance. In recent decades computer programs have been developed to assist clinicians in this assignment. The aim of this survey was to assess and compare computer tools designed to support TDM clinical activities. The literature and the Internet were searched to identify software. All programs were tested on personal computers. Each program was scored against a standardized grid covering pharmacokinetic relevance, user friendliness, computing aspects, interfacing and storage. A weighting factor was applied to each criterion of the grid to account for its relative importance. To assess the robustness of the software, six representative clinical vignettes were processed through each of them. Altogether, 12 software tools were identified, tested and ranked, representing a comprehensive review of the available software. Numbers of drugs handled by the software vary widely (from two to 180), and eight programs offer users the possibility of adding new drug models based on population pharmacokinetic analyses. Bayesian computation to predict dosage adaptation from blood concentration (a posteriori adjustment) is performed by ten tools, while nine are also able to propose a priori dosage regimens, based only on individual patient covariates such as age, sex and bodyweight. Among those applying Bayesian calculation, MM-USC*PACK© uses the non-parametric approach. The top two programs emerging from this benchmark were MwPharm© and TCIWorks. Most other programs evaluated had good potential while being less sophisticated or less user friendly. Programs vary in complexity and might not fit all healthcare settings. Each software tool must therefore be regarded with respect to the individual needs of hospitals or clinicians. Programs should be easy and fast for routine activities, including for non-experienced users. Computer-assisted TDM is gaining growing interest and should further improve, especially in terms of information system interfacing, user friendliness, data storage capability and report generation.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

BACKGROUND: DNA sequence integrity, mRNA concentrations and protein-DNA interactions have been subject to genome-wide analyses based on microarrays with ever increasing efficiency and reliability over the past fifteen years. However, very recently novel technologies for Ultra High-Throughput DNA Sequencing (UHTS) have been harnessed to study these phenomena with unprecedented precision. As a consequence, the extensive bioinformatics environment available for array data management, analysis, interpretation and publication must be extended to include these novel sequencing data types. DESCRIPTION: MIMAS was originally conceived as a simple, convenient and local Microarray Information Management and Annotation System focused on GeneChips for expression profiling studies. MIMAS 3.0 enables users to manage data from high-density oligonucleotide SNP Chips, expression arrays (both 3'UTR and tiling) and promoter arrays, BeadArrays as well as UHTS data using MIAME-compliant standardized vocabulary. Importantly, researchers can export data in MAGE-TAB format and upload them to the EBI's ArrayExpress certified data repository using a one-step procedure. CONCLUSION: We have vastly extended the capability of the system such that it processes the data output of six types of GeneChips (Affymetrix), two different BeadArrays for mRNA and miRNA (Illumina) and the Genome Analyzer (a popular Ultra-High Throughput DNA Sequencer, Illumina), without compromising on its flexibility and user-friendliness. MIMAS, appropriately renamed into Multiomics Information Management and Annotation System, is currently used by scientists working in approximately 50 academic laboratories and genomics platforms in Switzerland and France. MIMAS 3.0 is freely available via http://multiomics.sourceforge.net/.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

BACKGROUND: Molecular interaction Information is a key resource in modern biomedical research. Publicly available data have previously been provided in a broad array of diverse formats, making access to this very difficult. The publication and wide implementation of the Human Proteome Organisation Proteomics Standards Initiative Molecular Interactions (HUPO PSI-MI) format in 2004 was a major step towards the establishment of a single, unified format by which molecular interactions should be presented, but focused purely on protein-protein interactions. RESULTS: The HUPO-PSI has further developed the PSI-MI XML schema to enable the description of interactions between a wider range of molecular types, for example nucleic acids, chemical entities, and molecular complexes. Extensive details about each supported molecular interaction can now be captured, including the biological role of each molecule within that interaction, detailed description of interacting domains, and the kinetic parameters of the interaction. The format is supported by data management and analysis tools and has been adopted by major interaction data providers. Additionally, a simpler, tab-delimited format MITAB2.5 has been developed for the benefit of users who require only minimal information in an easy to access configuration. CONCLUSION: The PSI-MI XML2.5 and MITAB2.5 formats have been jointly developed by interaction data producers and providers from both the academic and commercial sector, and are already widely implemented and well supported by an active development community. PSI-MI XML2.5 enables the description of highly detailed molecular interaction data and facilitates data exchange between databases and users without loss of information. MITAB2.5 is a simpler format appropriate for fast Perl parsing or loading into Microsoft Excel.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Introduction: Therapeutic drug monitoring (TDM) aims at optimizing treatment by individualizing dosage regimen based on measurement of blood concentrations. Maintaining concentrations within a target range requires pharmacokinetic and clinical capabilities. Bayesian calculation represents a gold standard in TDM approach but requires computing assistance. In the last decades computer programs have been developed to assist clinicians in this assignment. The aim of this benchmarking was to assess and compare computer tools designed to support TDM clinical activities.¦Method: Literature and Internet search was performed to identify software. All programs were tested on common personal computer. Each program was scored against a standardized grid covering pharmacokinetic relevance, user-friendliness, computing aspects, interfacing, and storage. A weighting factor was applied to each criterion of the grid to consider its relative importance. To assess the robustness of the software, six representative clinical vignettes were also processed through all of them.¦Results: 12 software tools were identified, tested and ranked. It represents a comprehensive review of the available software's characteristics. Numbers of drugs handled vary widely and 8 programs offer the ability to the user to add its own drug model. 10 computer programs are able to compute Bayesian dosage adaptation based on a blood concentration (a posteriori adjustment) while 9 are also able to suggest a priori dosage regimen (prior to any blood concentration measurement), based on individual patient covariates, such as age, gender, weight. Among those applying Bayesian analysis, one uses the non-parametric approach. The top 2 software emerging from this benchmark are MwPharm and TCIWorks. Other programs evaluated have also a good potential but are less sophisticated (e.g. in terms of storage or report generation) or less user-friendly.¦Conclusion: Whereas 2 integrated programs are at the top of the ranked listed, such complex tools would possibly not fit all institutions, and each software tool must be regarded with respect to individual needs of hospitals or clinicians. Interest in computing tool to support therapeutic monitoring is still growing. Although developers put efforts into it the last years, there is still room for improvement, especially in terms of institutional information system interfacing, user-friendliness, capacity of data storage and report generation.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Integration of biological data of various types and the development of adapted bioinformatics tools represent critical objectives to enable research at the systems level. The European Network of Excellence ENFIN is engaged in developing an adapted infrastructure to connect databases, and platforms to enable both the generation of new bioinformatics tools and the experimental validation of computational predictions. With the aim of bridging the gap existing between standard wet laboratories and bioinformatics, the ENFIN Network runs integrative research projects to bring the latest computational techniques to bear directly on questions dedicated to systems biology in the wet laboratory environment. The Network maintains internally close collaboration between experimental and computational research, enabling a permanent cycling of experimental validation and improvement of computational prediction methods. The computational work includes the development of a database infrastructure (EnCORE), bioinformatics analysis methods and a novel platform for protein function analysis FuncNet.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Curated databases are an integral part of the tool set that researchers use on a daily basis for their work. For most users, however, how databases are maintained, and by whom, is rather obscure. The International Society for Biocuration (ISB) represents biocurators, software engineers, developers and researchers with an interest in biocuration. Its goals include fostering communication between biocurators, promoting and describing their work, and highlighting the added value of biocuration to the world. The ISB recently conducted a survey of biocurators to better understand their educational and scientific backgrounds, their motivations for choosing a curatorial job and their career goals. The results are reported here. From the responses received, it is evident that biocuration is performed by highly trained scientists and perceived to be a stimulating career, offering both intellectual challenges and the satisfaction of performing work essential to the modern scientific community. It is also apparent that the ISB has at least a dual role to play to facilitate biocurators' work: (i) to promote biocuration as a career within the greater scientific community; (ii) to aid the development of resources for biomedical research through promotion of nomenclature and data-sharing standards that will allow interconnection of biological databases and better exploit the pivotal contributions that biocurators are making. DATABASE URL: http://biocurator.org.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The DNA microarray technology has arguably caught the attention of the worldwide life science community and is now systematically supporting major discoveries in many fields of study. The majority of the initial technical challenges of conducting experiments are being resolved, only to be replaced with new informatics hurdles, including statistical analysis, data visualization, interpretation, and storage. Two systems of databases, one containing expression data and one containing annotation data are quickly becoming essential knowledge repositories of the research community. This present paper surveys several databases, which are considered "pillars" of research and important nodes in the network. This paper focuses on a generalized workflow scheme typical for microarray experiments using two examples related to cancer research. The workflow is used to reference appropriate databases and tools for each step in the process of array experimentation. Additionally, benefits and drawbacks of current array databases are addressed, and suggestions are made for their improvement.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

BACKGROUND: The Complete Arabidopsis Transcript MicroArray (CATMA) initiative combines the efforts of laboratories in eight European countries 1 to deliver gene-specific sequence tags (GSTs) for the Arabidopsis research community. The CATMA initiative offers the power and flexibility to regularly update the GST collection according to evolving knowledge about the gene repertoire. These GST amplicons can easily be reamplified and shared, subsets can be picked at will to print dedicated arrays, and the GSTs can be cloned and used for other functional studies. This ongoing initiative has already produced approximately 24,000 GSTs that have been made publicly available for spotted microarray printing and RNA interference. RESULTS: GSTs from the CATMA version 2 repertoire (CATMAv2, created in 2002) were mapped onto the gene models from two independent Arabidopsis nuclear genome annotation efforts, TIGR5 and PSB-EuGène, to consolidate a list of genes that were targeted by previously designed CATMA tags. A total of 9,027 gene models were not tagged by any amplified CATMAv2 GST, and 2,533 amplified GSTs were no longer predicted to tag an updated gene model. To validate the efficacy of GST mapping criteria and design rules, the predicted and experimentally observed hybridization characteristics associated to GST features were correlated in transcript profiling datasets obtained with the CATMAv2 microarray, confirming the reliability of this platform. To complete the CATMA repertoire, all 9,027 gene models for which no GST had yet been designed were processed with an adjusted version of the Specific Primer and Amplicon Design Software (SPADS). A total of 5,756 novel GSTs were designed and amplified by PCR from genomic DNA. Together with the pre-existing GST collection, this new addition constitutes the CATMAv3 repertoire. It comprises 30,343 unique amplified sequences that tag 24,202 and 23,009 protein-encoding nuclear gene models in the TAIR6 and EuGène genome annotations, respectively. To cover the remaining untagged genes, we identified 543 additional GSTs using less stringent design criteria and designed 990 sequence tags matching multiple members of gene families (Gene Family Tags or GFTs) to cover any remaining untagged genes. These latter 1,533 features constitute the CATMAv4 addition. CONCLUSION: To update the CATMA GST repertoire, we designed 7,289 additional sequence tags, bringing the total number of tagged TAIR6-annotated Arabidopsis nuclear protein-coding genes to 26,173. This resource is used both for the production of spotted microarrays and the large-scale cloning of hairpin RNA silencing vectors. All information about the resulting updated CATMA repertoire is available through the CATMA database http://www.catma.org.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Background: The variety of DNA microarray formats and datasets presently available offers an unprecedented opportunity to perform insightful comparisons of heterogeneous data. Cross-species studies, in particular, have the power of identifying conserved, functionally important molecular processes. Validation of discoveries can now often be performed in readily available public data which frequently requires cross-platform studies.Cross-platform and cross-species analyses require matching probes on different microarray formats. This can be achieved using the information in microarray annotations and additional molecular biology databases, such as orthology databases. Although annotations and other biological information are stored using modern database models ( e. g. relational), they are very often distributed and shared as tables in text files, i.e. flat file databases. This common flat database format thus provides a simple and robust solution to flexibly integrate various sources of information and a basis for the combined analysis of heterogeneous gene expression profiles.Results: We provide annotationTools, a Bioconductor-compliant R package to annotate microarray experiments and integrate heterogeneous gene expression profiles using annotation and other molecular biology information available as flat file databases. First, annotationTools contains a specialized set of functions for mining this widely used database format in a systematic manner. It thus offers a straightforward solution for annotating microarray experiments. Second, building on these basic functions and relying on the combination of information from several databases, it provides tools to easily perform cross-species analyses of gene expression data.Here, we present two example applications of annotationTools that are of direct relevance for the analysis of heterogeneous gene expression profiles, namely a cross-platform mapping of probes and a cross-species mapping of orthologous probes using different orthology databases. We also show how to perform an explorative comparison of disease-related transcriptional changes in human patients and in a genetic mouse model.Conclusion: The R package annotationTools provides a simple solution to handle microarray annotation and orthology tables, as well as other flat molecular biology databases. Thereby, it allows easy integration and analysis of heterogeneous microarray experiments across different technological platforms or species.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Little is known about the relation between the genome organization and gene expression in Leishmania. Bioinformatic analysis can be used to predict genes and find homologies with known proteins. A model was proposed, in which genes are organized into large clusters and transcribed from only one strand, in the form of large polycistronic primary transcripts. To verify the validity of this model, we studied gene expression at the transcriptional, post-transcriptional and translational levels in a unique locus of 34kb located on chr27 and represented by cosmid L979. Sequence analysis revealed 115 ORFs on either DNA strand. Using computer programs developed for Leishmania genes, only nine of these ORFs, localized on the same strand, were predicted to code for proteins, some of which show homologies with known proteins. Additionally, one pseudogene, was identified. We verified the biological relevance of these predictions. mRNAs from nine predicted genes and proteins from seven were detected. Nuclear run-on analyses confirmed that the top strand is transcribed by RNA polymerase II and suggested that there is no polymerase entry site. Low levels of transcription were detected in regions of the bottom strand and stable transcripts were identified for four ORFs on this strand not predicted to be protein-coding. In conclusion, the transcriptional organization of the Leishmania genome is complex, raising the possibility that computer predictions may not be comprehensive.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The manipulation of DNA is routine practice in botanical research and has made a huge impact on plant breeding, biotechnology and biodiversity evaluation. DNA is easy to extract from most plant tissues and can be stored for long periods in DNA banks. Curation methods are well developed for other botanical resources such as herbaria, seed banks and botanic gardens, but procedures for the establishment and maintenance of DNA banks have not been well documented. This paper reviews the curation of DNA banks for the characterisation and utilisation of biodiversity and provides guidelines for DNA bank management. It surveys existing DNA banks and outlines their operation. It includes a review of plant DNA collection, preservation, isolation, storage, database management and exchange procedures. We stress that DNA banks require full integration with existing collections such as botanic gardens, herbaria and seed banks, and information retrieval systems that link such facilities, bioinformatic resources and other DNA banks. They also require efficient and well-regulated sample exchange procedures. Only with appropriate curation will maximum utilisation of DNA collections be achieved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Data mining can be defined as the extraction of previously unknown and potentially useful information from large datasets. The main principle is to devise computer programs that run through databases and automatically seek deterministic patterns. It is applied in different fields of application, e.g., remote sensing, biometry, speech recognition, but has seldom been applied to forensic case data. The intrinsic difficulty related to the use of such data lies in its heterogeneity, which comes from the many different sources of information. The aim of this study is to highlight potential uses of pattern recognition that would provide relevant results from a criminal intelligence point of view. The role of data mining within a global crime analysis methodology is to detect all types of structures in a dataset. Once filtered and interpreted, those structures can point to previously unseen criminal activities. The interpretation of patterns for intelligence purposes is the final stage of the process. It allows the researcher to validate the whole methodology and to refine each step if necessary. An application to cutting agents found in illicit drug seizures was performed. A combinatorial approach was done, using the presence and the absence of products. Methods coming from the graph theory field were used to extract patterns in data constituted by links between products and place and date of seizure. A data mining process completed using graphing techniques is called ``graph mining''. Patterns were detected that had to be interpreted and compared with preliminary knowledge to establish their relevancy. The illicit drug profiling process is actually an intelligence process that uses preliminary illicit drug classes to classify new samples. Methods proposed in this study could be used \textit{a priori} to compare structures from preliminary and post-detection patterns. This new knowledge of a repeated structure may provide valuable complementary information to profiling and become a source of intelligence.