711 results for Annotation informatisée
Abstract:
del Sig:re Sebastiano Nasolini :
Abstract:
Master's thesis digitized by the Direction des bibliothèques de l'Université de Montréal.
Molecular protein function prediction using sequence similarity-based and similarity-free approaches
Abstract:
Thesis digitized by the Direction des bibliothèques de l'Université de Montréal.
Abstract:
Efficient and effective approaches to dealing with the vast amount of visual information available nowadays are highly sought after. This is particularly the case for image collections, both personal and commercial. Due to the magnitude of these ever-expanding image repositories, annotation of all images is infeasible, and search in such an image collection therefore becomes inherently difficult. Although content-based image retrieval techniques have shown much potential, such approaches also suffer from various problems that make them difficult to adopt in practice. In this paper, we follow a different approach, namely that of browsing image databases for image retrieval. In our Honeycomb Image Browser, large image databases are visualised on a hexagonal lattice with image thumbnails occupying hexagons. Arranged in a space-filling manner, visually similar images are located close together, enabling large image datasets to be navigated in a hierarchical manner. Various browsing tools are incorporated to allow for interactive exploration of the database. Experimental results confirm that our approach affords efficient image retrieval. © 2010 IEEE.
Abstract:
Some Eubacterium and Roseburia species are among the most prevalent motile bacteria present in the intestinal microbiota of healthy adults. These flagellate species contribute "cell motility" category genes to the intestinal microbiome and flagellin proteins to the intestinal proteome. We reviewed and revised the annotation of motility genes in the genomes of six Eubacterium and Roseburia species that occur in the human intestinal microbiota and examined their respective locus organization by comparative genomics. Motility gene order was generally conserved across these loci. Five of these species harbored multiple genes for predicted flagellins. Flagellin proteins were isolated from R. inulinivorans strain A2-194 and from E. rectale strains A1-86 and M104/1. The amino-terminal sequences of the R. inulinivorans and E. rectale A1-86 proteins were almost identical. These protein preparations stimulated secretion of interleukin-8 (IL-8) from human intestinal epithelial cell lines, suggesting that these flagellins were pro-inflammatory. Flagellins from the other four species were predicted to be pro-inflammatory on the basis of alignment to the consensus sequence of pro-inflammatory flagellins from the beta- and gamma-proteobacteria. Many fliC genes were deduced to be under the control of sigma(28). The relative abundance of the target Eubacterium and Roseburia species varied across shotgun metagenomes from 27 elderly individuals. Genes involved in the flagellum biogenesis pathways of these species were variably abundant in these metagenomes, suggesting that the current depth of coverage used for metagenomic sequencing (3.13-4.79 Gb total sequence in our study) insufficiently captures the functional diversity of genomes present at low (≤ 1%) relative abundance. E. rectale and R. inulinivorans thus appear to synthesize complex flagella composed of flagellin proteins that stimulate IL-8 production.
A greater depth of sequencing, improved evenness of sequencing and improved metagenome assembly from short reads will be required to facilitate in silico analyses of complete complex biochemical pathways for low-abundance target species from shotgun metagenomes.
Abstract:
BACKGROUND: Multiple recent genome-wide association studies (GWAS) have identified a single nucleotide polymorphism (SNP), rs10771399, at 12p11 that is associated with breast cancer risk. METHOD: We performed a fine-scale mapping study of a 700 kb region including 441 genotyped and more than 1300 imputed genetic variants in 48,155 cases and 43,612 controls of European descent, 6269 cases and 6624 controls of East Asian descent and 1116 cases and 932 controls of African descent in the Breast Cancer Association Consortium (BCAC; http://bcac.ccge.medschl.cam.ac.uk/), and in 15,252 BRCA1 mutation carriers in the Consortium of Investigators of Modifiers of BRCA1/2 (CIMBA). Stepwise regression analyses were performed to identify independent association signals. Data from the Encyclopedia of DNA Elements project (ENCODE) and the Cancer Genome Atlas (TCGA) were used for functional annotation. RESULTS: Analysis of data from European descendants found evidence for four independent association signals at 12p11, represented by rs7297051 (odds ratio (OR) = 1.09, 95% confidence interval (CI) = 1.06-1.12; P = 3 × 10^-9), rs805510 (OR = 1.08, 95% CI = 1.04-1.12; P = 2 × 10^-5), and rs1871152 (OR = 1.04, 95% CI = 1.02-1.06; P = 2 × 10^-4) identified in the general populations, and rs113824616 (P = 7 × 10^-5) identified in the meta-analysis of BCAC ER-negative cases and BRCA1 mutation carriers. SNPs rs7297051, rs805510 and rs113824616 were also associated with breast cancer risk at P < 0.05 in East Asians, but none of the associations were statistically significant in African descendants. Multiple candidate functional variants are located in putative enhancer sequences. Chromatin interaction data suggested that PTHLH was the likely target gene of these enhancers. Of the six variants with the strongest evidence of potential functionality, rs11049453 was statistically significantly associated with the expression of PTHLH and its nearby gene CCDC91 at P < 0.05.
CONCLUSION: This study identified four independent association signals at 12p11 and revealed potentially functional variants, providing additional insights into the underlying biological mechanism(s) for the association observed between variants at 12p11 and breast cancer risk.
Abstract:
Contrary to proposals that interpret the Periplus of Hanno as a literary work, we believe it records an authentic voyage, one that reached only Cape Juby and some of the Canary Islands. The Carthaginian refoundations all took place in fertile Mauretania, during the first 7 days of the expedition. From the islet of Kérne onward, the expedition gave priority to an initial reconnaissance, which indicates that it involved barely 2 or 3 ships with limited crews that avoided clashes with the local population. The Lixítai interpreters appear to have known all the places explored: the river Chrétes, the Ethiopians of the coastal High Atlas, the great hot gulf that ended at the Hespérou Kéras, the volcano Theôn Óchema, and the wild people they called Goríllai. The greatest surprise was probably finding an active volcano emitting lava, which may have been the ultimate reason for writing this periplus. The lack of water, food and game as the reason for ending the exploratory expedition is understandable only for a short route that reached as far as the beginning of the Sahara Desert. The same applies to the absence of major rivers south of the river Chrétes, clear proof that equatorial latitudes were not reached and that the ships gradually moved away from the North African coast.
Abstract:
This paper studies how se structures are represented in 20 verb entries across nine dictionaries of Spanish. These structures are numerous and are problematic for both native and non-native speakers. The verbs analysed are of medium-to-high frequency and, in most cases, highly polysemous, which makes it possible to observe the interconnections between the different se structures and the different meanings of each verb. The data from the lexicographic analysis are cross-checked against a corpus analysis of the same units. The results show wide variation, both between and within dictionaries, in the data each dictionary offers and in how they are presented. The reasons range from the overall theoretical approach of each project to its practical execution. We conclude that the current dictionary model needs further development in order to present lexico-grammatical phenomena such as se verbs accurately, clearly and exhaustively.
Abstract:
Background: Esophageal adenocarcinoma (EA) is one of the fastest rising cancers in western countries. Barrett’s Esophagus (BE) is the premalignant precursor of EA. However, only a subset of BE patients develop EA, which complicates the clinical management in the absence of valid predictors. Genetic risk factors for BE and EA are incompletely understood. This study aimed to identify novel genetic risk factors for BE and EA. Methods: Within an international consortium of groups involved in the genetics of BE/EA, we performed the first meta-analysis of all genome-wide association studies (GWAS) available, involving 6,167 BE patients, 4,112 EA patients, and 17,159 representative controls, all of European ancestry, genotyped on Illumina high-density SNP-arrays, collected from four separate studies within North America, Europe, and Australia. Meta-analysis was conducted using the fixed-effects inverse variance-weighting approach. We used the standard genome-wide significance threshold of 5×10^-8 for this study. We also conducted an association analysis following reweighting of loci using an approach that investigates annotation enrichment among the genome-wide significant loci. The entire GWAS data set was also analyzed using bioinformatics approaches including functional annotation databases as well as gene-based and pathway-based methods in order to identify pathophysiologically relevant cellular pathways. Findings: We identified eight new associated risk loci for BE and EA, within or near the CFTR (rs17451754, P=4·8×10^-10), MSRA (rs17749155, P=5·2×10^-10), BLK (rs10108511, P=2·1×10^-9), KHDRBS2 (rs62423175, P=3·0×10^-9), TPPP/CEP72 (rs9918259, P=3·2×10^-9), TMOD1 (rs7852462, P=1·5×10^-8), SATB2 (rs139606545, P=2·0×10^-8), and HTR3C/ABCC5 genes (rs9823696, P=1·6×10^-8). A further novel risk locus at LPA (rs12207195, posterior probability=0·925) was identified after re-weighting using significantly enriched annotations. This study thereby doubled the number of known risk loci.
The strongest disease pathways identified (P<10^-6) belong to muscle cell differentiation and to mesenchyme development/differentiation, which fit with current pathophysiological BE/EA concepts. To our knowledge, this study identified for the first time an EA-specific association (rs9823696, P=1·6×10^-8) near HTR3C/ABCC5 which is independent of BE development (P=0·45). Interpretation: The identified disease loci and pathways reveal new insights into the etiology of BE and EA. Furthermore, the EA-specific association at HTR3C/ABCC5 may constitute a novel genetic marker for the prediction of transition from BE to EA. Mutations in CFTR, one of the new risk loci identified in this study, cause cystic fibrosis (CF), the most common recessive disorder in Europeans. Gastroesophageal reflux (GER) belongs to the phenotypic CF-spectrum and represents the main risk factor for BE/EA. Thus, the CFTR locus may trigger a common GER-mediated pathophysiology.
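The fixed-effects inverse variance-weighting approach named in the Methods can be illustrated with a minimal sketch. It is not the consortium's pipeline: the function name, the input layout (one effect estimate and standard error per study for a single SNP) and the normal approximation for the P value are assumptions made here for illustration.

```python
import math

# Genome-wide significance threshold quoted in the abstract.
GENOME_WIDE_P = 5e-8


def fixed_effects_meta(betas, ses):
    """Pool per-study effect estimates for one SNP (hypothetical helper).

    betas: per-study effect sizes (e.g. log odds ratios).
    ses:   matching standard errors.
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and equally long")
    # Each study is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    z = pooled_beta / pooled_se
    # Two-sided P value under a standard normal approximation.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


if __name__ == "__main__":
    # Made-up numbers, not results from the study.
    beta, se, z, p = fixed_effects_meta([0.10, 0.20], [0.10, 0.10])
    print(beta, se, z, p, p < GENOME_WIDE_P)
```

Studies with smaller standard errors dominate the pooled estimate, which is why large cohorts drive the combined signal in a fixed-effects analysis.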
Abstract:
MOTIVATION: Data from RNA-seq experiments provide us with many new possibilities to gain insights into biological and disease mechanisms of cellular functioning. However, the reproducibility and robustness of RNA-seq data analysis results are often unclear. This is in part attributed to the two counteracting goals of (a) a cost-efficient and (b) an optimal experimental design, leading to a compromise, e.g., in the sequencing depth of experiments.
RESULTS: We introduce an R package called samExploreR that allows the subsampling (m out of n bootstrapping) of short reads based on SAM files, facilitating the investigation of sequencing depth related questions for the experimental design. Overall, this provides a systematic way for exploring the reproducibility and robustness of general RNA-seq studies. We exemplify the usage of samExploreR by studying the influence of the sequencing depth and the annotation on the identification of differentially expressed genes.
AVAILABILITY: samExploreR is available as an R package from Bioconductor (after acceptance of the paper, download link: http://www.bio-complexity.com/samExploreR_1.0.0.tar.gz).
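The idea of subsampling m out of n reads from a SAM file can be sketched in a few lines of Python. This does not reproduce the samExploreR API: the function name, its parameters and the choice to keep header lines (those starting with "@") untouched are assumptions made here for illustration only.

```python
import random


def subsample_sam_lines(lines, fraction, seed=None, replace=False):
    """Keep roughly `fraction` of the alignment lines (hypothetical helper).

    lines:    SAM file content as a list of strings.
    fraction: share of reads to keep, between 0 and 1.
    replace:  draw with replacement if True; by default draw m distinct
              reads out of n and preserve their original order.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    rng = random.Random(seed)
    # SAM header lines start with '@' and are passed through unchanged.
    header = [ln for ln in lines if ln.startswith("@")]
    reads = [ln for ln in lines if not ln.startswith("@")]
    m = round(fraction * len(reads))
    if replace:
        picked = [rng.choice(reads) for _ in range(m)] if reads else []
    else:
        keep = sorted(rng.sample(range(len(reads)), m))
        picked = [reads[i] for i in keep]
    return header + picked


if __name__ == "__main__":
    # Toy input with made-up read names, not real alignments.
    toy = ["@HD\tVN:1.0"] + [f"read{i}\t0\tchr1\t{i + 1}" for i in range(10)]
    for line in subsample_sam_lines(toy, 0.5, seed=1):
        print(line)
```

Repeating the draw with different seeds and fractions, then rerunning the downstream analysis on each subsample, gives a simple picture of how results change with sequencing depth.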
Abstract:
The annotation of Business Dynamics models with parameters and equations, to simulate the system under study and further evaluate its simulation output, typically involves a lot of manual work. In this paper we present an approach for automated equation formulation of a given Causal Loop Diagram (CLD) and a set of associated time series with the help of neural network evolution (NEvo). NEvo enables the automated retrieval of surrogate equations for each quantity in the given CLD; hence it produces a fully annotated CLD that can be used for later simulations to predict future KPI development. At the end of the paper, we provide a detailed evaluation of NEvo on a business use case to demonstrate its single-step prediction capabilities.
Abstract:
We analyzed genome-wide association studies (GWASs), including data from 71,638 individuals from four ancestries, for estimated glomerular filtration rate (eGFR), a measure of kidney function used to define chronic kidney disease (CKD). We identified 20 loci attaining genome-wide-significant evidence of association (p < 5 × 10^-8) with kidney function and highlighted that allelic effects on eGFR at lead SNPs are homogeneous across ancestries. We leveraged differences in the pattern of linkage disequilibrium between diverse populations to fine-map the 20 loci through construction of "credible sets" of variants driving eGFR association signals. Credible variants at the 20 eGFR loci were enriched for DNase I hypersensitivity sites (DHSs) in human kidney cells. DHS credible variants were expression quantitative trait loci for NFATC1 and RGS14 (at the SLC34A1 locus) in multiple tissues. Loss-of-function mutations in ancestral orthologs of both genes in Drosophila melanogaster were associated with altered sensitivity to salt stress. Renal mRNA expression of Nfatc1 and Rgs14 in a salt-sensitive mouse model was also reduced after exposure to a high-salt diet or induced CKD. Our study (1) demonstrates the utility of trans-ethnic fine mapping through integration of GWASs involving diverse populations with genomic annotation from relevant tissues to define molecular mechanisms by which association signals exert their effect and (2) suggests that salt sensitivity might be an important marker for biological processes that affect kidney function and CKD in humans.
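The abstract does not say how its "credible sets" were built. A common construction, sketched below under stated assumptions, ranks variants at a locus by posterior probability and keeps the smallest set whose cumulative probability reaches a target coverage. The single-causal-variant assumption, the use of log Bayes factors as input, the 0.99 default coverage and all names are illustrative choices, not details from the study.

```python
import math


def credible_set(log_bayes_factors, coverage=0.99):
    """Smallest variant set reaching `coverage` (hypothetical helper).

    log_bayes_factors: dict mapping variant ID to natural-log Bayes factor.
    Assumes exactly one causal variant per locus and equal priors, so the
    posterior probability of each variant is proportional to its Bayes factor.
    Returns (selected_variant_ids, posterior_probabilities).
    """
    if not log_bayes_factors:
        raise ValueError("at least one variant is required")
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_bayes_factors.values())
    weights = {v: math.exp(lbf - top) for v, lbf in log_bayes_factors.items()}
    total = sum(weights.values())
    posterior = {v: w / total for v, w in weights.items()}
    # Add variants in decreasing posterior order until coverage is reached.
    ranked = sorted(posterior.items(), key=lambda kv: kv[1], reverse=True)
    selected, cumulative = [], 0.0
    for variant, prob in ranked:
        selected.append(variant)
        cumulative += prob
        if cumulative >= coverage:
            break
    return selected, posterior


if __name__ == "__main__":
    # Made-up variant IDs and Bayes factors, not data from the study.
    chosen, post = credible_set({"var1": 5.0, "var2": 1.0, "var3": 0.0})
    print(chosen, post)
```

Smaller credible sets indicate better fine-mapping resolution, which is the benefit the abstract attributes to combining populations with different linkage disequilibrium patterns.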