48 results for FORESTs Genome Project database


Relevance:

30.00% 30.00%

Publisher:

Abstract:

The Microbe browser is a web server providing comparative microbial genomics data. It offers comprehensive, integrated data from GenBank, RefSeq, UniProt, InterPro, Gene Ontology and the Orthologous Matrix (OMA) project database, displayed along with gene predictions from five software packages. The Microbe browser is updated daily from the source databases and includes all completely sequenced bacterial and archaeal genomes. The data are displayed in an easy-to-use, interactive website based on Ensembl software. The Microbe browser is available at http://microbe.vital-it.ch/. Programmatic access is available through the OMA application programming interface (API) at http://microbe.vital-it.ch/api.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Since the advent of high-throughput DNA sequencing technologies, the ever-increasing rate at which genomes are published has generated new challenges, notably at the level of genome annotation. Even though gene predictors and annotation software are increasingly efficient, the ultimate validation still lies in the observation of the predicted gene product(s). Mass spectrometry (MS)-based proteomics provides the necessary high-throughput technology to show evidence of protein presence and, from the identified sequences, to confirm or invalidate predicted annotations. We review here different strategies used to perform an MS-based proteogenomics experiment with a bottom-up approach. We start from the strengths and weaknesses of the different database construction strategies, based on different genomic information (whole genome, ORF, cDNA, EST or RNA-Seq data), which are then used for matching mass spectra to peptides and proteins. We also review the important points to be considered for a correct statistical assessment of the peptide identifications. Finally, we provide references for tools used to map and visualize the peptide identifications back to the original genomic information.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Human papillomavirus type 6 (HPV6) is the major etiological agent of anogenital warts and laryngeal papillomas and has been included in both the quadrivalent and nonavalent prophylactic HPV vaccines. This study investigated the global genomic diversity of HPV6, using 724 isolates and 190 complete genomes from six continents, and the association of HPV6 genomic variants with geographical location, anatomical site of infection/disease, and gender. Initially, a 2,800-bp E5a-E5b-L1-LCR fragment was sequenced from 492/530 (92.8%) HPV6-positive samples collected for this study. Among them, 130 exhibited at least one single nucleotide polymorphism (SNP), indel, or amino acid change in the E5a-E5b-L1-LCR fragment and were sequenced in full. A global alignment and maximum likelihood tree of 190 complete HPV6 genomes (130 fully sequenced in this study and 60 obtained from sequence repositories) revealed two variant lineages, A and B, and five B sublineages: B1, B2, B3, B4, and B5. HPV6 (sub)lineage-specific SNPs and a 960-bp representative region for whole-genome-based phylogenetic clustering within the L2 open reading frame were identified. Multivariate logistic regression analysis revealed that lineage B predominated globally. Sublineage B3 was more common in Africa and North and South America, and lineage A was more common in Asia. Sublineages B1 and B3 were associated with anogenital infections, indicating a potential lesion-specific predilection of some HPV6 sublineages. Females had higher odds for infection with sublineage B3 than males. In conclusion, a global HPV6 phylogenetic analysis revealed the existence of two variant lineages and five sublineages, showing some degree of ethnogeographic, gender, and/or disease predilection in their distribution. IMPORTANCE: This study established the largest database of globally circulating HPV6 genomic variants and contributed a total of 130 new, complete HPV6 genome sequences to available sequence repositories. 
Two HPV6 variant lineages and five sublineages were identified and showed some degree of association with geographical location, anatomical site of infection/disease, and/or gender. We additionally identified several HPV6 lineage- and sublineage-specific SNPs to facilitate the identification of HPV6 variants and determined a representative region within the L2 gene that is suitable for HPV6 whole-genome-based phylogenetic analysis. This study complements and significantly expands the current knowledge of HPV6 genetic diversity and forms a comprehensive basis for future epidemiological, evolutionary, functional, pathogenicity, vaccination, and molecular assay development studies.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

BACKGROUND AND PURPOSE: Beyond the Framingham Stroke Risk Score, prediction of future stroke may improve with a genetic risk score (GRS) based on single-nucleotide polymorphisms associated with stroke and its risk factors. METHODS: The study includes 4 population-based cohorts with 2,047 first incident strokes from 22,720 initially stroke-free European origin participants aged ≥55 years, who were followed for up to 20 years. GRSs were constructed with 324 single-nucleotide polymorphisms implicated in stroke and 9 risk factors. The association of the GRS with first incident stroke was tested using Cox regression; the GRS predictive properties were assessed with area under the curve statistics comparing the GRS with age and sex, Framingham Stroke Risk Score models, and reclassification statistics. These analyses were performed per cohort and in a meta-analysis of pooled data. Replication was sought in a case-control study of ischemic stroke. RESULTS: In the meta-analysis, adding the GRS to the Framingham Stroke Risk Score, age and sex model resulted in a significant improvement in discrimination (all stroke: Δjoint area under the curve=0.016, P=2.3×10^-6; ischemic stroke: Δjoint area under the curve=0.021, P=3.7×10^-7), although the overall area under the curve remained low. In all the studies, there was a highly significant improvement in the net reclassification index (P<10^-4). CONCLUSIONS: The single-nucleotide polymorphisms associated with stroke and its risk factors result only in a small improvement in prediction of future stroke compared with the classical epidemiological risk factors for stroke.
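
Illustrative note: the abstract does not say how the 324 single-nucleotide polymorphisms were combined into a GRS. A common convention, assumed here and not taken from the study, is a weighted sum of risk-allele dosages (0, 1 or 2 copies per variant), with one weight per variant such as a published effect size. The function name and the numbers are hypothetical.

```python
def genetic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages for one participant.

    dosages: risk-allele counts per SNP (0, 1 or 2; imputed values may be fractional)
    weights: one weight per SNP, e.g. a log odds ratio (assumed, not from the study)
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Made-up example with three SNPs.
score = genetic_risk_score([0, 1, 2], [0.5, 0.25, 0.125])
print(score)
```

Setting every weight to 1 gives the unweighted variant, a plain count of risk alleles.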

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Although research on influenza has lasted for more than 100 years, it is still one of the most prominent diseases, causing half a million human deaths every year. With the recent observation of new highly pathogenic H5N1 and H7N7 strains, and the appearance of the influenza pandemic caused by the H1N1 swine-like lineage, a collaborative effort to share observations on the evolution of this virus in both animals and humans has been established. The OpenFlu database (OpenFluDB) is a part of this collaborative effort. It contains genomic and protein sequences, as well as epidemiological data from more than 27,000 isolates. The isolate annotations include virus type, host, geographical location and experimentally tested antiviral resistance. Putative enhanced pathogenicity as well as human adaptation propensity are computed from protein sequences. Each virus isolate can be associated with the laboratories that collected, sequenced and submitted it. Several analysis tools including multiple sequence alignment, phylogenetic analysis and sequence similarity maps enable rapid and efficient mining. The contents of OpenFluDB are supplied by direct user submission, as well as by a daily automatic procedure importing data from public repositories. Additionally, a simple mechanism facilitates the export of OpenFluDB records to GenBank. This resource has been successfully used to rapidly and widely distribute the sequences collected during the recent human swine flu outbreak and also as an exchange platform during the vaccine selection procedure. Database URL: http://openflu.vital-it.ch.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Evolution of proteins after whole-genome duplication

Gene and genome duplication are considered major mechanisms in the creation of new functions in genomes, or in the refinement of networks by the division of function among more genes. In animals, the best demonstrated whole genome duplication occurred at the origin of Teleost fishes. This makes fishes an ideal model to study the consequences of genome duplication, particularly since we have a good sampling of genome sequences, abundant functional information, and a very well studied outgroup: the tetrapods (including human). More specifically, I studied the consequences of duplication on proteins using evolutionary models to infer adaptive events. I analysed the influence of positive selection in vertebrate genes, by contrasting singleton genes and duplicated genes. The conclusion of the analyses was threefold: (i) positive selection affects diverse phylogenetic branches and diverse gene categories during vertebrate evolution; (ii) it concerns only a small proportion of sites (1%-5%); and (iii) whole genome duplication had no detectable impact on the prevalence of this positive selection.

I also studied evolution at the amino acid level with different methods to detect functional shifts (covarion process and constant-but-different process). As in my previous research, I found similar numbers of functional shifts between duplicates and between orthologs.

The accepted framework for studies of molecular evolution is that orthologs share the same function, whereas the function of paralogs diverges. This framework gives a special place to gene duplication in evolution, as the main mechanism for generating novelty. With my previous results showing that duplication and speciation are not so different, we investigated the literature to question the evidence for similar or divergent evolution of gene function after duplication relative to speciation genes. This led us to propose a more rigorous design of future studies of gene duplication.

Finally, based on my automated protocol, we built a database of positive selection in vertebrates' genes, Selectome. This database is freely available on the web and will help future evolutionary as well as biochemical studies.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The Complete Arabidopsis Transcriptome Micro Array (CATMA) database contains gene sequence tag (GST) and gene model sequences for over 70% of the predicted genes in the Arabidopsis thaliana genome as well as primer sequences for GST amplification and a wide range of supplementary information. All CATMA GST sequences are specific to the gene for which they were designed, and all gene models were predicted from a complete reannotation of the genome using uniform parameters. The database is searchable by sequence name, sequence homology or direct SQL query, and is available through the CATMA website at http://www.catma.org/.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The aim of the Permanent.Plot.ch project is the conservation of historical data about permanent plots in Switzerland and the monitoring of vegetation in a context of environmental changes (mainly climate and land use). Permanent plots are increasingly recognized as valuable tools for monitoring long-term effects of environmental change on vegetation. Often used in short studies (3 to 5 years), they are generally abandoned at the end of projects. However, their full potential might only be revealed after 10 or more years, by which time the location has often been lost. For instance, some of the oldest permanent plots in Switzerland (first half of the 20th century) were nearly lost, although they now provide very valuable data. The Permanent.Plot.ch national database (GIVD ID EU-CH-001), by storing historical and recent data, will ensure future access to data from permanent vegetation plots. As the database contains some private data, it is not directly available on the internet, but an overview of the data can be downloaded (http://www.unil.ch/ppch) and precise data are available on request.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

90Y-labelled radiopharmaceuticals offer promising prospects for radionuclide therapies of tumours, e.g. radioimmunotherapies (RIT), (EANM, 2007), peptide receptor radiotherapies (PRRT), (Otte et al., 1998), and selective internal radiotherapies (SIRT), (Salem and Thurston, 2006). 90Y, an almost pure high-energy beta radiation emitter (Eβ,max = 2.28 MeV), is a favourable radionuclide for therapeutic purposes. However, when preparing and performing these therapies, high activities of 90Y (>1 GBq) are to be manipulated and technicians, physicians and nurses may receive high skin exposures to the hands. If radiation protection standards are low, the exposure of staff can exceed the annual skin dose limit of 500 mSv. Within a particular work package (WP4) of the ORAMED project, comprehensive measurements in nuclear medicine departments of several hospitals in 6 European countries were carried out. The study focussed on 90Y-labelled substances such as Zevalin® and DOTATOC to achieve a representative database on staff exposure. This paper summarises the most important results and conclusions for individual monitoring of skin exposure of staff.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Body fat distribution, particularly centralized obesity, is associated with metabolic risk above and beyond total adiposity. We performed genome-wide association of abdominal adipose depots quantified using computed tomography (CT) to uncover novel loci for body fat distribution among participants of European ancestry. Subcutaneous and visceral fat were quantified in 5,560 women and 4,997 men from 4 population-based studies. Genome-wide genotyping was performed using standard arrays and imputed to ~2.5 million HapMap SNPs. Each study performed a genome-wide association analysis of subcutaneous adipose tissue (SAT), visceral adipose tissue (VAT), VAT adjusted for body mass index, and VAT/SAT ratio (a metric of the propensity to store fat viscerally as compared to subcutaneously) in the overall sample and in women and men separately. A weighted z-score meta-analysis was conducted. For the VAT/SAT ratio, our most significant association was for rs11118316 at the LYPLAL1 gene (p = 3.1 × 10^-9), previously identified in association with waist-hip ratio. For SAT, the most significant SNP was in the FTO gene (p = 5.9 × 10^-8). Given the known gender differences in body fat distribution, we performed sex-specific analyses. Our most significant finding was for VAT in women, rs1659258 near THNSL2 (p = 1.6 × 10^-8), but not men (p = 0.75). Validation of this SNP in the GIANT consortium data demonstrated a similar sex-specific pattern, with observed significance in women (p = 0.006) but not men (p = 0.24) for BMI and waist circumference (p = 0.04 [women], p = 0.49 [men]). Finally, we interrogated our data for the 14 recently published loci for body fat distribution (measured by waist-hip ratio adjusted for BMI); associations were observed at 7 of these loci. In contrast, we observed associations at only 7/32 loci previously identified in association with BMI; the majority of overlap was observed with SAT.
Genome-wide association for visceral and subcutaneous fat revealed a SNP for VAT in women. More refined phenotypes for body composition and fat distribution can detect new loci not previously uncovered in large-scale GWAS of anthropometric traits.
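
Illustrative note: the abstract names a weighted z-score meta-analysis without giving the weights. The sketch below assumes the widely used convention of weighting each study's z-score by the square root of its sample size; the function names and all input values are hypothetical and are not results from the study.

```python
import math


def weighted_z_meta(z_scores, sample_sizes):
    """Combine per-study z-scores, weighting each by sqrt(sample size)."""
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


def two_sided_p(z):
    """Two-sided p-value of a z-score under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Made-up example: four equally sized studies, each with z = 2.0.
z_meta = weighted_z_meta([2.0, 2.0, 2.0, 2.0], [100, 100, 100, 100])
print(z_meta, two_sided_p(z_meta))
```

With equal sample sizes the combined z-score grows with the square root of the number of studies, which is why four studies at z = 2.0 combine to z = 4.0.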

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall, the project provides new insights into the organization and regulation of our genes and genome, and is an expansive resource of functional annotations for biomedical research.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The main goal of CleanEx is to provide access to public gene expression data via unique gene names. A second objective is to represent heterogeneous expression data produced by different technologies in a way that facilitates joint analysis and cross-data set comparisons. A consistent and up-to-date gene nomenclature is achieved by associating each single experiment with a permanent target identifier consisting of a physical description of the targeted RNA population or the hybridization reagent used. These targets are then mapped at regular intervals to the growing and evolving catalogues of human genes and genes from model organisms. The completely automatic mapping procedure relies partly on external genome information resources such as UniGene and RefSeq. The central part of CleanEx is a weekly built gene index containing cross-references to all public expression data already incorporated into the system. In addition, the expression target database of CleanEx provides gene mapping and quality control information for various types of experimental resource, such as cDNA clones or Affymetrix probe sets. The web-based query interfaces offer access to individual entries via text string searches or quantitative expression criteria. CleanEx is accessible at: http://www.cleanex.isb-sib.ch/.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Searching for matches between large collections of short (14-30 nucleotides) words and sequence databases comprising full genomes or transcriptomes is a common task in biological sequence analysis. We investigated the performance of simple indexing strategies for handling such tasks and developed two programs, fetchGWI and tagger, that index either the database or the query set. Either strategy outperforms megablast for searches with more than 10,000 probes. FetchGWI is shown to be a versatile tool for rapidly searching multiple genomes, whose performance is limited in most cases by the speed of access to the filesystem. We have made publicly available a Web interface for searching the human, mouse, and several other genomes and transcriptomes with oligonucleotide queries.
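
Illustrative note: the abstract describes two strategies, indexing the database or indexing the query set. The sketch below shows the second idea in its simplest form: a hash table of probe sequences that is consulted while sliding a window over the target sequence. It is not the fetchGWI or tagger implementation; it handles exact matches on one strand only, and all names and sequences are hypothetical.

```python
from collections import defaultdict


def index_queries(probes):
    """Map each probe sequence to the names of the probes that carry it."""
    index = defaultdict(list)
    for name, seq in probes.items():
        index[seq.upper()].append(name)
    return index


def scan(sequence, index):
    """Yield (position, probe_name) for every exact probe match in sequence."""
    sequence = sequence.upper()
    # One pass over the sequence per distinct probe length.
    for length in sorted({len(seq) for seq in index}):
        for pos in range(len(sequence) - length + 1):
            for name in index.get(sequence[pos:pos + length], ()):
                yield pos, name


# Made-up example with two 14-nucleotide probes embedded in a short sequence.
probes = {"p1": "ACGTACGTACGTAC", "p2": "TTTTGGGGCCCCAA"}
genome = "GG" + probes["p1"] + probes["p2"] + "G"
print(sorted(scan(genome, index_queries(probes))))
```

Each window lookup takes constant expected time, so the cost of a scan depends on the sequence length and the number of distinct probe lengths rather than on the number of probes.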

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Genetic variants influence the risk of developing certain diseases or give rise to differences in drug response. Recent progress in cost-effective, high-throughput genome-wide techniques, such as microarrays measuring Single Nucleotide Polymorphisms (SNPs), has facilitated genotyping of large clinical and population cohorts. Combining the massive genotypic data with measurements of phenotypic traits allows for the determination of genetic differences that explain, at least in part, the phenotypic variations within a population. So far, models combining the most significant variants can only explain a small fraction of the variance, indicating the limitations of current models. In particular, researchers have only begun to address the possibility of interactions between genotypes and the environment. Elucidating the contributions of such interactions is a difficult task because of the large number of genetic as well as possible environmental factors. In this thesis, I worked on several projects within this context. My first and main project was the identification of possible SNP-environment interactions, where the phenotypes were serum lipid levels of patients from the Swiss HIV Cohort Study (SHCS) treated with antiretroviral therapy. Here the genotypes consisted of a limited set of SNPs in candidate genes relevant for lipid transport and metabolism. The environmental variables were the specific combinations of drugs given to each patient over the treatment period. My work explored bioinformatic and statistical approaches to relate patients' lipid responses to these SNPs, drugs and, importantly, their interactions. The goal of this project was to improve our understanding and to explore the possibility of predicting dyslipidemia, a well-known adverse drug reaction of antiretroviral therapy.
Specifically, I quantified how much of the variance in lipid profiles could be explained by the host genetic variants, the administered drugs and SNP-drug interactions and assessed the predictive power of these features on lipid responses. Using cross-validation stratified by patients, we could not validate our hypothesis that models that select a subset of SNP-drug interactions in a principled way have better predictive power than the control models using "random" subsets. Nevertheless, all models tested containing SNP and/or drug terms exhibited significant predictive power (as compared to a random predictor) and explained a sizable proportion of variance, in the patient-stratified cross-validation context. Importantly, the model containing stepwise selected SNP terms showed higher capacity to predict triglyceride levels than a model containing randomly selected SNPs. Dyslipidemia is a complex trait for which many factors remain to be discovered; these are missing from the data, which possibly explains the limitations of our analysis. In particular, the interactions of drugs with SNPs selected from the set of candidate genes likely have small effect sizes which we were unable to detect in a sample of the present size (<800 patients). In the second part of my thesis, I performed genome-wide association studies within the Cohorte Lausannoise (CoLaus). I have been involved in several international projects to identify SNPs that are associated with various traits, such as serum calcium, body mass index, two-hour glucose levels, as well as metabolic syndrome and its components. These phenotypes are all related to major human health issues, such as cardiovascular disease. I applied statistical methods to detect new variants associated with these phenotypes, contributing to the identification of new genetic loci that may lead to new insights into the genetic basis of these traits.
This kind of research will lead to a better understanding of the mechanisms underlying these pathologies, a better evaluation of disease risk, the identification of new therapeutic leads and may ultimately lead to the realization of "personalized" medicine.
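
Illustrative note: one reading of "cross-validation stratified by patients" is that all measurements from the same patient are kept in the same fold, so that a model is never tested on a patient it was trained on. The sketch below implements that reading with the standard library only; it is an assumption about the design, not the thesis code, and the function names and patient identifiers are hypothetical.

```python
import random
from collections import defaultdict


def patient_folds(patient_ids, n_folds=5, seed=0):
    """Return row indices per fold so that no patient spans two folds."""
    patients = sorted(set(patient_ids))
    random.Random(seed).shuffle(patients)
    # Deal the shuffled patients round-robin into folds.
    fold_of = {p: i % n_folds for i, p in enumerate(patients)}
    folds = defaultdict(list)
    for row, patient in enumerate(patient_ids):
        folds[fold_of[patient]].append(row)
    return [folds[k] for k in range(n_folds)]


def train_test_splits(patient_ids, n_folds=5, seed=0):
    """Yield (train_rows, test_rows), holding out one fold at a time."""
    folds = patient_folds(patient_ids, n_folds, seed)
    for k, test_rows in enumerate(folds):
        train_rows = [r for j, fold in enumerate(folds) if j != k for r in fold]
        yield train_rows, test_rows


# Made-up example: eight measurements from five patients.
ids = ["a", "a", "b", "c", "c", "c", "d", "e"]
for train_rows, test_rows in train_test_splits(ids, n_folds=3):
    print(sorted({ids[r] for r in test_rows}), len(train_rows))
```

Splitting on rows instead of patients would let repeated measurements of one patient appear on both sides of a split, which tends to inflate the apparent predictive power.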

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The R package EasyStrata facilitates the evaluation and visualization of stratified genome-wide association meta-analysis (GWAMA) results. It provides (i) statistical methods to test and account for between-strata differences as a means to tackle gene-strata interaction effects and (ii) extended graphical features tailored for stratified GWAMA results. The software provides further features also suitable for general GWAMAs including functions to annotate, exclude or highlight specific loci in plots or to extract independent subsets of loci from genome-wide datasets. It is freely available and includes a user-friendly scripting interface that simplifies data handling and allows for combining statistical and graphical functions in a flexible fashion. AVAILABILITY: EasyStrata is available for free (under the GNU General Public License v3) from our Web site www.genepi-regensburg.de/easystrata and from the CRAN R package repository cran.r-project.org/web/packages/EasyStrata/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
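
Illustrative note: a simple test for a between-strata difference compares the effect estimates of two strata (for example women and men) with a z-statistic. The sketch below assumes the two strata are independent, so no covariance term appears in the denominator. It is written in Python for illustration and does not reproduce EasyStrata's R interface; the function names and numbers are hypothetical.

```python
import math


def strata_difference_z(beta1, se1, beta2, se2):
    """Z-statistic for the difference between two independent effect estimates."""
    return (beta1 - beta2) / math.sqrt(se1 ** 2 + se2 ** 2)


def two_sided_p(z):
    """Two-sided p-value of a z-score under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Made-up example: effect 0.10 (SE 0.03) in one stratum, 0.02 (SE 0.04) in the other.
z_diff = strata_difference_z(0.10, 0.03, 0.02, 0.04)
print(z_diff, two_sided_p(z_diff))
```

If the strata share participants or are otherwise correlated, the denominator needs an additional covariance term, which this sketch omits.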