65 results for Database query languages
Abstract:
Mountain ranges are biodiversity hotspots worldwide and provide refuge to many organisms under contemporary climate change. Gathering field information on mountain biodiversity over time is of primary importance to understand the response of biotic communities to climate changes. For plants, several long-term observation sites and networks of mountain biodiversity are emerging worldwide to gather field data and monitor altitudinal range shifts and community composition changes under contemporary climate change. Most of these monitoring sites, however, focus on alpine ecosystems and mountain summits, such as the global observation research initiative in alpine environments (GLORIA). Here we describe the Alps Vegetation Database, a comprehensive community level archive (GIVD ID EU-00-014) which aims at compiling all available geo-referenced vegetation plots from lowland forests to alpine grasslands across the greatest mountain range in Europe: the Alps. This research initiative was funded between 2008 and 2011 by the Danish Council for Independent Research and was part of a larger project to compare cross-scale plant community structure between the Alps and the Scandes. The Alps Vegetation Database currently harbours 35,731 geo-referenced vegetation plots and 5,023 valid taxa across Mediterranean, temperate and alpine environments. The data are mainly used by the main contributors of the Alps Vegetation Database in an ecoinformatics approach to test hypotheses related to plant macroecology and biogeography, but external proposals for joint collaborations are welcome.
Abstract:
The InterPro database (http://www.ebi.ac.uk/interpro/) is a freely available resource that can be used to classify sequences into protein families and to predict the presence of important domains and sites. Central to the InterPro database are predictive models, known as signatures, from a range of different protein family databases that have different biological focuses and use different methodological approaches to classify protein families and domains. InterPro integrates these signatures, capitalizing on the respective strengths of the individual databases, to produce a powerful protein classification resource. Here, we report on the status of InterPro as it enters its 15th year of operation, and give an overview of new developments with the database and its associated Web interfaces and software. In particular, the new domain architecture search tool is described and the process of mapping of Gene Ontology terms to InterPro is outlined. We also discuss the challenges faced by the resource given the explosive growth in sequence data in recent years. InterPro (version 48.0) contains 36 766 member database signatures integrated into 26 238 InterPro entries, an increase of over 3993 entries (5081 signatures), since 2012.
Abstract:
The MyHits web site (http://myhits.isb-sib.ch) is an integrated service dedicated to the analysis of protein sequences. Since its first description in 2004, both the user interface and the back end of the server have been improved. A number of tools (e.g. MAFFT, Jacop, Dotlet, Jalview, ESTScan) were added or updated to improve the usability of the service. The MySQL schema and its associated API were revamped and the database engine (HitKeeper) was separated from the web interface. This paper summarizes the current status of the server, with an emphasis on the new services.
Abstract:
HTPSELEX is a public database providing access to primary and derived data from high-throughput SELEX experiments aimed at characterizing the binding specificity of transcription factors. The resource is primarily intended to serve computational biologists interested in building models of transcription factor binding sites from large sets of binding sequences. The guiding principle is to make available all information that is relevant for this purpose. For each experiment, we try to provide accurate information about the protein material used, details of the wet lab protocol, an archive of sequencing trace files, assembled clone sequences (concatemers) and complete sets of in vitro selected protein-binding tags. In addition, we offer in-house derived binding sites models. HTPSELEX also offers reasonably large SELEX libraries obtained with conventional low-throughput protocols. The FTP site contains the trace archives and database flatfiles. The web server offers user-friendly interfaces for viewing individual entries and quality-controlled download of SELEX sequence libraries according to a user-defined sequencing quality threshold. HTPSELEX is available from ftp://ftp.isrec.isb-sib.ch/pub/databases/htpselex/ and http://www.isrec.isb-sib.ch/htpselex.
Abstract:
This paper analyses and discusses arguments that emerge from a recent discussion about the proper assessment of the evidential value of correspondences observed between the characteristics of a crime stain and those of a sample from a suspect when (i) this latter individual is found as a result of a database search and (ii) remaining database members are excluded as potential sources (because of different analytical characteristics). Using a graphical probability approach (i.e., Bayesian networks), the paper here intends to clarify that there is no need to (i) introduce a correction factor equal to the size of the searched database (i.e., to reduce a likelihood ratio), nor to (ii) adopt a propositional level not directly related to the suspect matching the crime stain (i.e., a proposition of the kind 'some person in (outside) the database is the source of the crime stain' rather than 'the suspect (some other person) is the source of the crime stain'). The present research thus confirms existing literature on the topic that has repeatedly demonstrated that the latter two requirements (i) and (ii) should not be a cause of concern.
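The argument against a database-size correction can be illustrated numerically. The sketch below is not the Bayesian network from the paper; it is a simplified calculation under assumptions introduced here for illustration: a closed population of `population_size` potential sources with equal prior probabilities, a random match probability `match_probability` that applies independently to each non-source, and no laboratory error. All function and parameter names are hypothetical.

```python
def posterior_suspect_is_source(population_size, database_size, match_probability):
    """Posterior probability that the single matching database member is the source.

    Assumes uniform priors over the population, independent profiles,
    and that the true source always matches the crime stain.
    """
    gamma = match_probability
    # Probability that the other (database_size - 1) members, none of whom
    # is the source, are all observed not to match.
    others_excluded = (1.0 - gamma) ** (database_size - 1)

    # Likelihood of the search outcome under each hypothesis about the source.
    like_suspect = 1.0 * others_excluded        # the suspect is the source
    like_excluded_member = 0.0                  # an excluded member cannot be the source
    like_outsider = gamma * others_excluded     # suspect matches by chance

    prior = 1.0 / population_size
    numerator = prior * like_suspect
    denominator = (
        numerator
        + (database_size - 1) * prior * like_excluded_member
        + (population_size - database_size) * prior * like_outsider
    )
    return numerator / denominator


def likelihood_ratio(match_probability):
    """LR for 'the suspect is the source' versus 'some other, untested person is'."""
    return 1.0 / match_probability


if __name__ == "__main__":
    gamma = 0.001
    single = posterior_suspect_is_source(1000, 1, gamma)      # suspect tested alone
    searched = posterior_suspect_is_source(1000, 100, gamma)  # found via database search
    print(f"LR = {likelihood_ratio(gamma):.0f}")
    print(f"posterior, no search:        {single:.4f}")
    print(f"posterior, 100-member search: {searched:.4f}")
```

Under these assumptions the likelihood ratio depends only on the match probability, and excluding the remaining database members removes them as alternative sources, so the posterior for the suspect does not decrease with the size of the searched database.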
Abstract:
Since the advent of high-throughput DNA sequencing technologies, the ever-increasing rate at which genomes have been published has generated new challenges, notably at the level of genome annotation. Even if gene predictors and annotation software are more and more efficient, the ultimate validation is still the observation of the predicted gene product(s). Mass spectrometry-based proteomics offers the necessary high-throughput technology to provide evidence of protein presence and, from the identified sequences, to confirm or invalidate predicted annotations. We review here different strategies used to perform an MS-based proteogenomics experiment with a bottom-up approach. We start from the strengths and weaknesses of the different database construction strategies, based on different genomic information (whole genome, ORF, cDNA, EST or RNA-Seq data), which are then used for matching mass spectra to peptides and proteins. We also review the important points to be considered for a correct statistical assessment of the peptide identifications. Finally, we provide references for tools used to map and visualize the peptide identifications back to the original genomic information.
Abstract:
This article documents the addition of 229 microsatellite marker loci to the Molecular Ecology Resources Database. Loci were developed for the following species: Acacia auriculiformis x Acacia mangium hybrid, Alabama argillacea, Anoplopoma fimbria, Aplochiton zebra, Brevicoryne brassicae, Bruguiera gymnorhiza, Bucorvus leadbeateri, Delphacodes detecta, Tumidagena minuta, Dictyostelium giganteum, Echinogammarus berilloni, Epimedium sagittatum, Fraxinus excelsior, Labeo chrysophekadion, Oncorhynchus clarki lewisi, Paratrechina longicornis, Phaeocystis antarctica, Pinus roxburghii and Potamilus capax. These loci were cross-tested on the following species: Acacia peregrinalis, Acacia crassicarpa, Bruguiera cylindrica, Delphacodes detecta, Tumidagena minuta, Dictyostelium macrocephalum, Dictyostelium discoideum, Dictyostelium purpureum, Dictyostelium mucoroides, Dictyostelium rosarium, Polysphondylium pallidum, Epimedium brevicornum, Epimedium koreanum, Epimedium pubescens, Epimedium wushanese and Fraxinus angustifolia.
Abstract:
AIM: The specific natural history of superficial soft tissue sarcomas (S-STS) has rarely been considered. We describe the clinical characteristics of a large series of S-STS (N=367) from the French Sarcoma Group (GSF-GETO) database and analyse the prognostic factors affecting outcome. METHODS: We performed univariate and multivariate analyses for overall survival (OS), metastasis-free survival (MFS) and local recurrence-free survival (LRFS). RESULTS: The median age was 59 years. Fifty-eight percent of patients were female. Tumour locations were as follows: extremities, 55%; trunk wall, 35.4%; head and neck, 8% and unknown, 1.6%. Median tumour size was 3.0 cm. The most frequent tumour types were unclassified sarcoma (24.3%) and leiomyosarcoma (22.3%). Thirty-three percent of cases were grade 3. Median follow-up was 6.18 years. The 5-year OS, MFS and LRFS rates were 80.9%, 80.7% and 74.7%, respectively. Multivariate analysis retained histological type and wide resection for predicting LRFS and histological type and grade as prognostic factors of MFS. The factors influencing OS were age, histological type, grade and wide resection. STS with early invasion into but not through the underlying fascia had a significantly poorer MFS than strict S-STS. CONCLUSION: S-STS represent a separate category characterised by a better outcome. Adequate surgery, i.e. wide resection, is essential in the management of S-STS.
Abstract:
OBJECTIVES: To conduct a national survey on adolescent health and lifestyles in Georgia and thus to set up a database on adolescents. METHODS: A two-stage cluster sample of around 8000-10000 in-school adolescents aged 15-18 years is being reached through a random selection of classes in Georgia. The sample has been stratified by age, region, type of school and language. A self-administered questionnaire of 87 questions has been developed and translated into the four main languages used in Georgia. RESULTS: Up to June 2004, the researchers have reached 511 classes (9306 pupils). In total, 8039 questionnaires have been considered valid. The main concerns encountered for this survey are linked with acceptance of the survey, cross-cultural issues, political and strategic problems as well as inadequate physical environmental support. CONCLUSION: Despite Georgia's unfavourable economic and political situation, it has been possible to run a national survey on the health of adolescents, according to the usual standards used in the field. This survey should allow for (1) the identification of priorities in the field of health care and health promotion and (2) the monitoring of adolescent health in the future.
Abstract:
BACKGROUND: Soft tissue sarcomas of the trunk wall (STS-TW) are usually studied together with soft tissue sarcomas of other locations. We report a study on STS-TW forming part of the French Sarcoma Group database. PATIENTS AND METHODS: Three hundred and forty-three adults were included. We carried out univariate and multivariate analyses for overall survival (OS), metastasis-free survival (MFS) and local recurrence-free survival (LRFS). RESULTS: Tumor locations were as follows: thoracic wall, 82.5%; abdominal wall, 12.3% and pelvic wall, 5.2%. Median tumor size was 6.0 cm. The most frequent tumor types were unclassified sarcoma (27.7%) and myogenic sarcoma (19.2%). A total of 44.6% of cases were grade 3. In all, 21.9% of patients had a previous medical history of radiotherapy (PHR). Median follow-up was 7.6 years. The 5-year OS, MFS and LRFS rates were 60.4%, 68.9% and 58.4%, respectively. Multivariate analysis retained PHR and grade for predicting LRFS and PHR, size and grade as prognostic factors of MFS. Factors influencing OS were age, size, PHR, depth, grade and surgical margins. The predictive factors of incomplete response were PHR, size and T3. CONCLUSIONS: Our results suggest classical prognostic factors similar to those for sarcomas of other locations. However, a separate analysis of STS-TW revealed a subgroup of patients with PHR whose prognosis is significantly poorer.
Abstract:
The vast majority of eukaryotic organisms reproduce sexually, yet the nature of the sexual system and the mechanism of sex determination often vary remarkably, even among closely related species. Some species of animals and plants change sex across their lifespan, some contain hermaphrodites as well as males and females, some determine sex with highly differentiated chromosomes, while others determine sex according to their environment. Testing evolutionary hypotheses regarding the causes and consequences of this diversity requires interspecific data placed in a phylogenetic context. Such comparative studies have been hampered by the lack of accessible data listing sexual systems and sex determination mechanisms across the eukaryotic tree of life. Here, we describe a database developed to facilitate access to sexual system and sex chromosome information, with data on sexual systems from 11,038 plant, 705 fish, 173 amphibian, 593 non-avian reptilian, 195 avian, 479 mammalian, and 11,556 invertebrate species.
Abstract:
Although research on influenza has been under way for more than 100 years, it remains one of the most prominent diseases, causing half a million human deaths every year. With the recent observation of new highly pathogenic H5N1 and H7N7 strains, and the appearance of the influenza pandemic caused by the H1N1 swine-like lineage, a collaborative effort to share observations on the evolution of this virus in both animals and humans has been established. The OpenFlu database (OpenFluDB) is a part of this collaborative effort. It contains genomic and protein sequences, as well as epidemiological data from more than 27,000 isolates. The isolate annotations include virus type, host, geographical location and experimentally tested antiviral resistance. Putative enhanced pathogenicity as well as human adaptation propensity are computed from protein sequences. Each virus isolate can be associated with the laboratories that collected, sequenced and submitted it. Several analysis tools including multiple sequence alignment, phylogenetic analysis and sequence similarity maps enable rapid and efficient mining. The contents of OpenFluDB are supplied by direct user submission, as well as by a daily automatic procedure importing data from public repositories. Additionally, a simple mechanism facilitates the export of OpenFluDB records to GenBank. This resource has been successfully used to rapidly and widely distribute the sequences collected during the recent human swine flu outbreak and also as an exchange platform during the vaccine selection procedure. Database URL: http://openflu.vital-it.ch.
Abstract:
The GO annotation dataset provided by the UniProt Consortium (GOA: http://www.ebi.ac.uk/GOA) is a comprehensive set of evidence-based associations between terms from the Gene Ontology resource and UniProtKB proteins. Currently supplying over 100 million annotations to 11 million proteins in more than 360,000 taxa, this resource has increased 2-fold over the last 2 years and has benefited from a wealth of checks to improve annotation correctness and consistency. It also now supplies greater information content, enabled by GO Consortium annotation format developments. Detailed, manual GO annotations obtained from the curation of peer-reviewed papers are directly contributed by all UniProt curators and supplemented with manual and electronic annotations from 36 model organism and domain-focused scientific resources. The inclusion of high-quality, automatic annotation predictions ensures the UniProt GO annotation dataset supplies functional information to a wide range of proteins, including those from poorly characterized, non-model organism species. UniProt GO annotations are freely available in a range of formats accessible by both file downloads and web-based views. In addition, the introduction of a new, normalized file format in 2010 has made for easier handling of the complete UniProt-GOA data set.