971 results for FORESTs Genome Project database


Relevance:

30.00%

Publisher:

Abstract:

Annotation of protein-coding genes is a key goal of genome sequencing projects. In spite of tremendous recent advances in computational gene finding, comprehensive annotation remains a challenge. Peptide mass spectrometry is a powerful tool for researching the dynamic proteome and suggests an attractive approach to discover and validate protein-coding genes. We present algorithms to construct and efficiently search spectra against a genomic database, with no prior knowledge of encoded proteins. By searching a corpus of 18.5 million tandem mass spectra (MS/MS) from human proteomic samples, we validate 39,000 exons and 11,000 introns at the level of translation. We present translation-level evidence for novel or extended exons in 16 genes, confirm translation of 224 hypothetical proteins, and discover or confirm over 40 alternative splicing events. Polymorphisms are efficiently encoded in our database, allowing us to observe variant alleles for 308 coding SNPs. Finally, we demonstrate the use of mass spectrometry to improve automated gene prediction, adding 800 correct exons to our predictions using a simple rescoring strategy. Our results demonstrate that proteomic profiling should play a role in any genome sequencing project.
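As a rough illustration of searching spectra against a genome "with no prior knowledge of encoded proteins", the sketch below builds candidate peptides by six-frame translation followed by an in-silico tryptic digest. This is a naive stand-in, not the algorithms the abstract refers to. The function names and the cleavage rule (cut after K or R unless the next residue is P) are assumptions chosen for illustration.

```python
# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate DNA codon by codon; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translate(dna):
    """Translate all three forward and three reverse reading frames."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def tryptic_peptides(protein, min_len=1):
    """Split at stop codons, then cleave after K/R unless followed by P."""
    peptides = []
    for orf in protein.split("*"):
        start = 0
        for i, aa in enumerate(orf):
            is_last = i == len(orf) - 1
            if is_last or (aa in "KR" and orf[i + 1] != "P"):
                peptide = orf[start:i + 1]
                if len(peptide) >= min_len:
                    peptides.append(peptide)
                start = i + 1
    return peptides
```

A real search database would also need to handle peptides that span splice junctions and would index peptides by mass. Neither is attempted here.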

Relevance:

30.00%

Publisher:

Abstract:

Selenoproteins are a diverse group of proteins usually misidentified and misannotated in sequence databases. The presence of an in-frame UGA (stop) codon in the coding sequence of selenoprotein genes precludes their identification and correct annotation. The in-frame UGA codons are recoded to cotranslationally incorporate selenocysteine, a rare selenium-containing amino acid. The development of ad hoc experimental and, more recently, computational approaches has allowed the efficient identification and characterization of the selenoproteomes of a growing number of species. Today, dozens of selenoprotein families have been described and more are being discovered in recently sequenced species, but the correct genomic annotation is not available for the majority of these genes. SelenoDB is a long-term project that aims to provide, through the collaborative effort of experimental and computational researchers, automatic and manually curated annotations of selenoprotein genes, proteins and SECIS elements. Version 1.0 of the database includes an initial set of eukaryotic genomic annotations, with special emphasis on the human selenoproteome, for immediate inspection by selenium researchers or incorporation into more general databases. SelenoDB is freely available at http://www.selenodb.org.
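The annotation problem described above can be made concrete with a small sketch that flags in-frame UGA codons inside a coding sequence. It is only a first-pass filter under stated assumptions: the function name is hypothetical, the input is assumed to start in frame, and the final codon is skipped because it may be a genuine terminator. Confirming a selenoprotein would also require evidence such as a SECIS element, which this sketch does not check.

```python
def in_frame_uga_positions(cds):
    """Return 0-based codon indices of internal in-frame UGA codons.

    `cds` is a DNA or RNA coding sequence assumed to begin in frame.
    The last codon is excluded because it may be the true stop codon.
    """
    rna = cds.upper().replace("T", "U")
    codons = [rna[i:i + 3] for i in range(0, len(rna) - 2, 3)]
    return [idx for idx, codon in enumerate(codons[:-1]) if codon == "UGA"]
```

For example, `in_frame_uga_positions("ATGTGAGCCTAA")` reports codon index 1 as a candidate selenocysteine site, where a conventional gene finder would instead truncate the gene at that position.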

Relevance:

30.00%

Publisher:

Abstract:

A new multimodal biometric database designed and acquired within the framework of the European BioSecure Network of Excellence is presented. It comprises more than 600 individuals acquired simultaneously in three scenarios: 1) over the Internet, 2) in an office environment with desktop PC, and 3) in indoor/outdoor environments with mobile portable hardware. The three scenarios include a common part of audio/video data. Also, signature and fingerprint data have been acquired both with desktop PC and mobile portable hardware. Additionally, hand and iris data were acquired in the second scenario using desktop PC. Acquisition has been conducted by 11 European institutions. Additional features of the BioSecure Multimodal Database (BMDB) are: two acquisition sessions, several sensors in certain modalities, balanced gender and age distributions, multimodal realistic scenarios with simple and quick tasks per modality, cross-European diversity, availability of demographic data, and compatibility with other multimodal databases. The novel acquisition conditions of the BMDB allow us to perform new challenging research and evaluation of either monomodal or multimodal biometric systems, as in the recent BioSecure Multimodal Evaluation campaign. A description of this campaign including baseline results of individual modalities from the new database is also given. The database is expected to be available for research purposes through the BioSecure Association during 2008.

Relevance:

30.00%

Publisher:

Abstract:

The aim of this study was to describe the clinical and PSG characteristics of narcolepsy with cataplexy and their genetic predisposition by using the retrospective patient database of the European Narcolepsy Network (EU-NN). We have analysed retrospective data of 1099 patients with narcolepsy diagnosed according to International Classification of Sleep Disorders-2. Demographic and clinical characteristics, polysomnography and multiple sleep latency test data, hypocretin-1 levels, and genome-wide genotypes were available. We found a significantly lower age at sleepiness onset (men versus women: 23.74 ± 12.43 versus 21.49 ± 11.83, P = 0.003) and longer diagnostic delay in women (men versus women: 13.82 ± 13.79 versus 15.62 ± 14.94, P = 0.044). The mean diagnostic delay was 14.63 ± 14.31 years, and longer delay was associated with higher body mass index. The best predictors of short diagnostic delay were young age at diagnosis, cataplexy as the first symptom and higher frequency of cataplexy attacks. The mean multiple sleep latency negatively correlated with Epworth Sleepiness Scale (ESS) and with the number of sleep-onset rapid eye movement periods (SOREMPs), but none of the polysomnographic variables was associated with subjective or objective measures of sleepiness. Variant rs2859998 in UBXN2B gene showed a strong association (P = 1.28E-07) with the age at onset of excessive daytime sleepiness, and rs12425451 near the transcription factor TEAD4 (P = 1.97E-07) with the age at onset of cataplexy. Altogether, our results indicate that the diagnostic delay remains extremely long, age and gender substantially affect symptoms, and that a genetic predisposition affects the age at onset of symptoms.

Relevance:

30.00%

Publisher:

Abstract:

The mission of the Encyclopedia of DNA Elements (ENCODE) Project is to enable the scientific and medical communities to interpret the human genome sequence and apply it to understand human biology and improve health. The ENCODE Consortium is integrating multiple technologies and approaches in a collective effort to discover and define the functional elements encoded in the human genome, including genes, transcripts, and transcriptional regulatory regions, together with their attendant chromatin states and DNA methylation patterns. In the process, standards to ensure high-quality data have been implemented, and novel algorithms have been developed to facilitate analysis. Data and derived results are made available through a freely accessible database. Here we provide an overview of the project and the resources it is generating and illustrate the application of ENCODE data to interpret the human genome.

Relevance:

30.00%

Publisher:

Abstract:

The Microbe browser is a web server providing comparative microbial genomics data. It offers comprehensive, integrated data from GenBank, RefSeq, UniProt, InterPro, Gene Ontology and the Orthologs Matrix Project (OMA) database, displayed along with gene predictions from five software packages. The Microbe browser is updated daily from the source databases and includes all completely sequenced bacterial and archaeal genomes. The data are displayed in an easy-to-use, interactive website based on Ensembl software. The Microbe browser is available at http://microbe.vital-it.ch/. Programmatic access is available through the OMA application programming interface (API) at http://microbe.vital-it.ch/api.

Relevance:

30.00%

Publisher:

Abstract:

Since the advent of high-throughput DNA sequencing technologies, the ever-increasing rate at which genomes have been published has generated new challenges, notably at the level of genome annotation. Even if gene predictors and annotation software are increasingly efficient, the ultimate validation is still the observation of the predicted gene product(s). Mass-spectrometry-based proteomics provides the necessary high-throughput technology to show evidence of protein presence and, from the identified sequences, to confirm or invalidate predicted annotations. We review here different strategies used to perform an MS-based proteogenomics experiment with a bottom-up approach. We start from the strengths and weaknesses of the different database construction strategies, based on different genomic information (whole genome, ORF, cDNA, EST or RNA-Seq data), which are then used for matching mass spectra to peptides and proteins. We also review the important points to be considered for a correct statistical assessment of the peptide identifications. Finally, we provide references for tools used to map and visualize the peptide identifications back to the original genomic information.

Relevance:

30.00%

Publisher:

Abstract:

Human papillomavirus type 6 (HPV6) is the major etiological agent of anogenital warts and laryngeal papillomas and has been included in both the quadrivalent and nonavalent prophylactic HPV vaccines. This study investigated the global genomic diversity of HPV6, using 724 isolates and 190 complete genomes from six continents, and the association of HPV6 genomic variants with geographical location, anatomical site of infection/disease, and gender. Initially, a 2,800-bp E5a-E5b-L1-LCR fragment was sequenced from 492/530 (92.8%) HPV6-positive samples collected for this study. Among them, 130 exhibited at least one single nucleotide polymorphism (SNP), indel, or amino acid change in the E5a-E5b-L1-LCR fragment and were sequenced in full. A global alignment and maximum likelihood tree of 190 complete HPV6 genomes (130 fully sequenced in this study and 60 obtained from sequence repositories) revealed two variant lineages, A and B, and five B sublineages: B1, B2, B3, B4, and B5. HPV6 (sub)lineage-specific SNPs and a 960-bp representative region for whole-genome-based phylogenetic clustering within the L2 open reading frame were identified. Multivariate logistic regression analysis revealed that lineage B predominated globally. Sublineage B3 was more common in Africa and North and South America, and lineage A was more common in Asia. Sublineages B1 and B3 were associated with anogenital infections, indicating a potential lesion-specific predilection of some HPV6 sublineages. Females had higher odds for infection with sublineage B3 than males. In conclusion, a global HPV6 phylogenetic analysis revealed the existence of two variant lineages and five sublineages, showing some degree of ethnogeographic, gender, and/or disease predilection in their distribution. IMPORTANCE: This study established the largest database of globally circulating HPV6 genomic variants and contributed a total of 130 new, complete HPV6 genome sequences to available sequence repositories. 
Two HPV6 variant lineages and five sublineages were identified and showed some degree of association with geographical location, anatomical site of infection/disease, and/or gender. We additionally identified several HPV6 lineage- and sublineage-specific SNPs to facilitate the identification of HPV6 variants and determined a representative region within the L2 gene that is suitable for HPV6 whole-genome-based phylogenetic analysis. This study complements and significantly expands the current knowledge of HPV6 genetic diversity and forms a comprehensive basis for future epidemiological, evolutionary, functional, pathogenicity, vaccination, and molecular assay development studies.

Relevance:

30.00%

Publisher:

Abstract:

BACKGROUND AND PURPOSE: Beyond the Framingham Stroke Risk Score, prediction of future stroke may improve with a genetic risk score (GRS) based on single-nucleotide polymorphisms associated with stroke and its risk factors. METHODS: The study includes 4 population-based cohorts with 2047 first incident strokes from 22,720 initially stroke-free European origin participants aged ≥55 years, who were followed for up to 20 years. GRSs were constructed with 324 single-nucleotide polymorphisms implicated in stroke and 9 risk factors. The association of the GRS to first incident stroke was tested using Cox regression; the GRS predictive properties were assessed with area under the curve statistics comparing the GRS with age and sex, Framingham Stroke Risk Score models, and reclassification statistics. These analyses were performed per cohort and in a meta-analysis of pooled data. Replication was sought in a case-control study of ischemic stroke. RESULTS: In the meta-analysis, adding the GRS to the Framingham Stroke Risk Score, age and sex model resulted in a significant improvement in discrimination (all stroke: Δjoint area under the curve = 0.016, P = 2.3×10⁻⁶; ischemic stroke: Δjoint area under the curve = 0.021, P = 3.7×10⁻⁷), although the overall area under the curve remained low. In all the studies, there was a highly significantly improved net reclassification index (P < 10⁻⁴). CONCLUSIONS: The single-nucleotide polymorphisms associated with stroke and its risk factors result only in a small improvement in prediction of future stroke compared with the classical epidemiological risk factors for stroke.
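The abstract does not say how the GRS was computed. One common construction, shown here purely as an assumption, is a weighted sum of risk-allele dosages, with each weight taken as the per-allele effect size (for example a log odds ratio) reported for that SNP. The function name, SNP identifiers and weights below are hypothetical.

```python
def genetic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages for one participant.

    dosages: SNP id -> number of risk alleles carried (0, 1 or 2,
             or a fractional value if imputed).
    weights: SNP id -> per-allele effect size (assumed, e.g. log odds ratio).
    Raises KeyError if a weighted SNP has no dosage for this participant.
    """
    missing = sorted(set(weights) - set(dosages))
    if missing:
        raise KeyError(f"no dosage for SNPs: {missing}")
    return sum(weights[snp] * dosages[snp] for snp in weights)


# Hypothetical two-SNP example; the study itself used 324 SNPs.
example_weights = {"snp_a": 0.10, "snp_b": 0.25}
example_dosages = {"snp_a": 2, "snp_b": 1}
example_score = genetic_risk_score(example_dosages, example_weights)
```

A score of this kind would then enter the Cox regression as a covariate alongside the clinical predictors.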

Relevance:

30.00%

Publisher:

Abstract:

One of the aims of the MEDEX project is to improve the knowledge of high-impact weather events in the Mediterranean. Following the guidelines of this project, a pilot study was carried out in two regions of Spain (the Balearic Islands and Catalonia) by the Social Impact Research group of MEDEX. The main goal is to suggest general and suitable criteria for analysing requests received by Meteorological Services arising from the damage caused by weather events. Thus, all the requests received between 2000 and 2002 at the Servei Meteorològic de Catalunya as well as at the Division of AEMET in the Balearic Islands were analysed. Firstly, the criteria proposed for building the database are defined and discussed. Secondly, the temporal distribution of the requests for damage claims is analysed. On average, almost half of them were received during the first month after the event happened. During the first six months, the percentage increases by 90%. Thirdly, various factors are taken into account to determine the impact of specific events on society. It is remarkable that the greatest number of requests is for those episodes with simultaneous heavy rain and strong wind, and finally, those that are linked to high population density.

Relevance:

30.00%

Publisher:

Abstract:

Although research on influenza has lasted for more than 100 years, it is still one of the most prominent diseases, causing half a million human deaths every year. With the recent observation of new highly pathogenic H5N1 and H7N7 strains, and the appearance of the influenza pandemic caused by the H1N1 swine-like lineage, a collaborative effort to share observations on the evolution of this virus in both animals and humans has been established. The OpenFlu database (OpenFluDB) is a part of this collaborative effort. It contains genomic and protein sequences, as well as epidemiological data from more than 27,000 isolates. The isolate annotations include virus type, host, geographical location and experimentally tested antiviral resistance. Putative enhanced pathogenicity as well as human adaptation propensity are computed from protein sequences. Each virus isolate can be associated with the laboratories that collected, sequenced and submitted it. Several analysis tools including multiple sequence alignment, phylogenetic analysis and sequence similarity maps enable rapid and efficient mining. The contents of OpenFluDB are supplied by direct user submission, as well as by a daily automatic procedure importing data from public repositories. Additionally, a simple mechanism facilitates the export of OpenFluDB records to GenBank. This resource has been successfully used to rapidly and widely distribute the sequences collected during the recent human swine flu outbreak and also as an exchange platform during the vaccine selection procedure. Database URL: http://openflu.vital-it.ch.

Relevance:

30.00%

Publisher:

Abstract:

Evolution of proteins after whole-genome duplication

Gene and genome duplication are considered major mechanisms in the creation of new functions in genomes, or in the refinement of networks by the division of function among more genes. In animals, the best demonstrated whole genome duplication occurred at the origin of Teleost fishes. This makes fishes an ideal model to study the consequences of genome duplication, particularly since we have a good sampling of genome sequences, abundant functional information, and a very well studied outgroup: the tetrapods (including human). More specifically, I studied the consequences of duplication on proteins using evolutionary models to infer adaptive events. I analysed the influence of positive selection in vertebrate genes, by contrasting singleton genes and duplicated genes. The conclusion of the analyses was threefold: (i) positive selection affects diverse phylogenetic branches and diverse gene categories during vertebrate evolution; (ii) it concerns only a small proportion of sites (1%-5%); and (iii) whole genome duplication had no detectable impact on the prevalence of this positive selection. I also studied evolution at the amino acid level with different methods to detect functional shifts (covarion process and constant-but-different process). As in my previous research, I found similar numbers of functional shifts between duplicates and between orthologs. The accepted framework for studies of molecular evolution is that orthologs share the same function, whereas the function of paralogs diverges. This framework gives a special place to gene duplication in evolution, as the main mechanism for generating novelty. With my previous results showing that duplication and speciation are not so different, we investigated the literature to question the evidence for similar or divergent evolution of gene function after duplication relative to speciation genes. This led us to propose a more rigorous design of future studies of gene duplication. Finally, based on my automated protocol, we built a database of positive selection in vertebrates' genes, Selectome. This database is freely available on the web and will help future evolutionary as well as biochemical studies.

Relevance:

30.00%

Publisher:

Abstract:

The Complete Arabidopsis Transcriptome Micro Array (CATMA) database contains gene sequence tag (GST) and gene model sequences for over 70% of the predicted genes in the Arabidopsis thaliana genome as well as primer sequences for GST amplification and a wide range of supplementary information. All CATMA GST sequences are specific to the gene for which they were designed, and all gene models were predicted from a complete reannotation of the genome using uniform parameters. The database is searchable by sequence name, sequence homology or direct SQL query, and is available through the CATMA website at http://www.catma.org/.

Relevance:

30.00%

Publisher:

Abstract:

The aim of the Permanent.Plot.ch project is the conservation of historical data about permanent plots in Switzerland and the monitoring of vegetation in a context of environmental changes (mainly climate and land use). Permanent plots are currently being recognized as valuable tools to monitor long-term effects of environmental changes on vegetation. Often used in short studies (3 to 5 years), they are generally abandoned at the end of projects. However, their full potential might only be revealed after 10 or more years, by which time the location has often been lost. For instance, some of the oldest permanent plots in Switzerland (first half of the 20th century) were nearly lost, although they are now very valuable data. The Permanent.Plot.ch national database (GIVD ID EU-CH-001), by storing historical and recent data, will ensure future access to data from permanent vegetation plots. As the database contains some private data, it is not directly available on the Internet, but an overview of the data can be downloaded (http://www.unil.ch/ppch) and precise data are available on request.

Relevance:

30.00%

Publisher:

Abstract:

90Y-labelled radiopharmaceuticals offer promising prospects for radionuclide therapies of tumours, e.g. radioimmunotherapies (RIT), (EANM, 2007), peptide receptor radiotherapies (PRRT), (Otte et al., 1998), and selective internal radiotherapies (SIRT), (Salem and Thurston, 2006). 90Y, an almost pure high-energy beta radiation emitter (Eβ,max = 2.28 MeV), is a favourable radionuclide for therapeutic purposes. However, when preparing and performing these therapies, high activities of 90Y (>1 GBq) are to be manipulated and technicians, physicians and nurses may receive high skin exposures to the hands. If radiation protection standards are low, the exposure of staff can exceed the annual skin dose limit of 500 mSv. Within a particular work package (WP4) of the ORAMED project, comprehensive measurements in nuclear medicine departments of several hospitals in 6 European countries were carried out. The study focussed on 90Y-labelled substances such as Zevalin® and DOTATOC to achieve a representative database on staff exposure. This paper summarises the most important results and conclusions for individual monitoring of skin exposure of staff.