926 results for Genomic data integration
Abstract:
A data warehouse is a data repository which collects and maintains a large amount of data from multiple distributed, autonomous and possibly heterogeneous data sources. Often the data is stored in the form of materialized views in order to provide fast access to the integrated data. One of the most important decisions in designing a data warehouse is the selection of views for materialization. The objective is to select an appropriate set of views that minimizes the total query response time with the constraint that the total maintenance time for these materialized views is within a given bound. This view selection problem is totally different from the view selection problem under the disk space constraint. In this paper the view selection problem under the maintenance time constraint is investigated. Two efficient, heuristic algorithms for the problem are proposed. The key to devising the proposed algorithms is to define good heuristic functions and to reduce the problem to some well-solved optimization problems. As a result, an approximate solution of the known optimization problem will give a feasible solution of the original problem. (C) 2001 Elsevier Science B.V. All rights reserved.
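The selection problem described in this abstract can be illustrated with a minimal sketch. It is not one of the paper's two algorithms: it is a generic greedy, knapsack-style heuristic, and it assumes each candidate view has an independent, precomputed benefit (the reduction in query response time) and a positive maintenance time. The `View` class, the `select_views` function and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class View:
    name: str
    benefit: float      # hypothetical: reduction in total query response time
    maintenance: float  # hypothetical: time to keep the view up to date (> 0)


def select_views(candidates, max_maintenance):
    """Greedy heuristic: take views in decreasing benefit-per-maintenance
    order while the total maintenance time stays within the bound."""
    chosen = []
    used = 0.0
    ranked = sorted(candidates, key=lambda v: v.benefit / v.maintenance, reverse=True)
    for view in ranked:
        if used + view.maintenance <= max_maintenance:
            chosen.append(view)
            used += view.maintenance
    return chosen, used


# Made-up candidate views, for illustration only.
views = [
    View("sales_by_region", benefit=120.0, maintenance=30.0),
    View("sales_by_product", benefit=90.0, maintenance=45.0),
    View("sales_by_day", benefit=40.0, maintenance=5.0),
]
selected, total = select_views(views, max_maintenance=40.0)
print([v.name for v in selected], total)
```

In a real warehouse the benefit of a view usually depends on which other views are already materialized, so a faithful heuristic would recompute benefits after each selection; the sketch ignores that interaction for brevity.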
Abstract:
This work proposes a solution for real-time data integration in the context of public transport. As the alternatives offered to public transport users increase, it is important that users know all of them, based on real-time information, so that they can make the choice that best fits their needs. Public transport operators, in turn, should be able to provide all the required information with minimal effort and minimal changes to the systems they have in place. This work uses tools that provide a homogeneous view of the various heterogeneous data sources, with that homogeneity serving as the integration point between all the data sources and the client applications.
Abstract:
E-Learning frameworks are conceptual tools to organize networks of e-learning services. Most frameworks cover areas that go beyond the scope of e-learning, from course to financial management, and neglect the typical everyday activities of teachers and students at schools, such as the creation, delivery, resolution and evaluation of assignments. This paper presents the Ensemble framework, an e-learning framework exclusively focused on the teaching-learning process through the coordination of pedagogical services. The framework presents an abstract data, integration and evaluation model based on content and communication specifications. These specifications should underpin the implementation of networks in specialized domains with complex evaluations. In this paper we specialize the framework for two domains with complex evaluation: computer programming and computer-aided design (CAD). For each domain we highlight two Ensemble hotspots: data and evaluation procedures. In the former we formally describe the exercise and present possible extensions. In the latter, we describe the automatic evaluation procedures.
Abstract:
Thesis submitted to Faculdade de Ciências e Tecnologia of the Universidade Nova de Lisboa, in partial fulfillment of the requirements for the degree of Master in Computer Science
Abstract:
FEMS Yeast Research, Vol. 9, nº 4
Abstract:
Eukaryotic Cell, Vol.7, Nº6
Abstract:
The present work aims to achieve and further develop a hydrogeomechanical approach in the Caldas da Cavaca hydromineral system rock mass (Aguiar da Beira, NW Portugal), and to contribute to a better understanding of the hydrogeological conceptual site model. A set of data covering geology, hydrogeology, rock and soil geotechnics, borehole hydraulics and hydrogeomechanics was retrieved from three rock slopes (Lagoa, Amores and Cancela). To accomplish a comprehensive analysis and rock engineering conceptualisation of the site, a multi-technical approach was used, including field and laboratory techniques, hydrogeotechnical mapping, hydrogeomechanical zoning, and hydrogeomechanical classification schemes and indexes. In addition, hydrogeomechanical data analysis and assessment techniques, such as the Hydro-Potential (HP)-Value technique, the JW Joint Water Reduction index and the Hydraulic Classification (HC) System, were applied to the rock slopes. The hydrogeomechanical zone HGMZ 1 of Lagoa slope showed higher hydraulic conductivities with poorer rock mass quality results, followed by the hydrogeomechanical zone HGMZ 2 of Lagoa slope, with poor to fair rock mass quality and lower hydraulic parameters. In addition, Amores slope had a fair to good rock mass quality and the lowest hydraulic conductivity. The hydrogeomechanical zone HGMZ 3 of Lagoa slope, and the hydrogeomechanical zones HGMZ 1 and HGMZ 2 of Cancela slope had a fair to poor rock mass quality but were completely dry. Geographical Information Systems (GIS) mapping technologies were used for the overall integration of hydrogeological and hydrogeomechanical data in order to improve the hydrogeological conceptual site model.
Abstract:
This paper presents a mobile information system called the Vehicle-to-Anything Application (V2Anything App) and explains its conceptual aspects. The application aims to give relevant information to Full Electric Vehicle (FEV) drivers by supporting the integration of several sources of data in a mobile application, thus contributing to the deployment of the electric mobility process. The V2Anything App provides drivers with recommendations about the FEV range autonomy, the location of battery charging stations and the electricity market, as well as a route planner that takes into account public transportation and car or bike sharing systems. The main contributions of this application relate to the creation of an Information and Communication Technology (ICT) platform, recommender systems, data integration systems, the driver profile, and personalized range prediction. Thus, it is possible to deliver relevant information to FEV drivers about the electric mobility process, the electricity market, public transportation, and FEV performance.
Abstract:
Developing and implementing data-oriented workflows for data migration processes are complex tasks involving several problems related to the integration of data coming from different schemas. Usually, they involve very specific requirements: every process is almost unique. Having a way to abstract their representation helps us to better understand and validate them with business users, which is a crucial step for requirements validation. In this demo we present an approach for incrementally enriching conceptual models in order to support the automatic production of their corresponding physical implementation. We show how the B2K (Business to Kettle) system transforms BPMN 2.0 conceptual models into executable Kettle data-integration processes, covering the most relevant aspects of model design and enrichment, model-to-system transformation, and system execution.
Abstract:
Under the framework of constraint-based modeling, genome-scale metabolic models (GSMMs) have been used for several tasks, such as metabolic engineering and phenotype prediction. More recently, their application in health-related research has spanned drug discovery, biomarker identification and host-pathogen interactions, targeting diseases such as cancer, Alzheimer's disease, obesity or diabetes. In recent years, the development of novel techniques for genome sequencing and other high-throughput methods, together with advances in Bioinformatics, has allowed the reconstruction of GSMMs for human cells. Considering the diversity of cell types and tissues present in the human body, it is imperative to develop tissue-specific metabolic models. Methods to automatically generate these models, based on generic human metabolic models and a plethora of omics data, have been proposed. However, their results have not yet been adequately and critically evaluated and compared. This work presents a survey of the most important tissue or cell type specific metabolic model reconstruction methods, which use literature, transcriptomics, proteomics and metabolomics data, together with a global template model. As a case study, we analyzed the consistency between several omics data sources and reconstructed distinct metabolic models of hepatocytes using different methods and data sources as inputs. The results show that omics data sources overlap poorly and, in some cases, are even contradictory. Additionally, the hepatocyte metabolic models generated are in many cases unable to perform metabolic functions known to be present in liver tissue. We conclude that reliable methods for a priori omics data integration are required to support the reconstruction of complex models of human cells.
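One simple way to quantify how well omics data sources overlap is a pairwise Jaccard index over the sets of genes each source reports as present. The sketch below is only an illustration of that idea and is not the consistency analysis used in the study; the `jaccard` and `pairwise_overlap` helpers and the gene identifiers are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index |A & B| / |A | B|; defined as 1.0 for two empty sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise_overlap(sources):
    """Return the Jaccard index for every pair of named gene sets."""
    return {
        (x, y): jaccard(sources[x], sources[y])
        for x, y in combinations(sorted(sources), 2)
    }


# Made-up gene identifiers, for illustration only.
sources = {
    "transcriptomics": {"gene_a", "gene_b", "gene_c", "gene_d"},
    "proteomics": {"gene_b", "gene_c", "gene_e"},
    "literature": {"gene_c", "gene_f"},
}
overlap = pairwise_overlap(sources)
for pair, score in overlap.items():
    print(pair, round(score, 2))
```

A low index for every pair would correspond to the poor overlap the abstract reports; detecting outright contradictions would additionally require each source to state which genes it considers absent.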
Abstract:
This report details the steps and processes carried out to build an application that enables the cross-referencing of genetic data based on information contained in remote databases. It presents an in-depth study of the content and structure of the remote NCBI and KEGG databases, documenting a data mining effort aimed at extracting from them the information needed to develop the genetic data cross-referencing application. Finally, it describes the programs, scripts and graphical environments implemented to build and subsequently deploy the application that provides the cross-referencing functionality that is the subject of this final-year project.
Abstract:
The Chlamydiales order is an important bacterial phylum that comprises some of the most successful human pathogens such as Chlamydia trachomatis, the leading infectious cause of blindness worldwide. In recent years, several new bacteria related to Chlamydia have been discovered in clinical or environmental samples and might represent emerging pathogens. The genome sequencing of classical Chlamydia has brought invaluable information on these obligate intracellular bacteria, which are otherwise difficult to study due to the lack of tools to perform basic genetic manipulation. The recent emergence of high-throughput sequencing technologies yielding millions of reads in a short time has lowered the costs of genome sequencing and thus represents a unique opportunity to study Chlamydia-related bacteria. Based on the sequencing and analysis of Chlamydiales genomes, this thesis provides significant insights into the genetic determinants of the intracellular lifestyle, the pathogenicity, the metabolism and the evolution of Chlamydia-related bacteria. A first approach showed the efficacy of rapid sequencing coupled to proteomics to identify immunogenic proteins. This method, particularly useful for an emerging pathogen such as Parachlamydia acanthamoebae, enabled us to discover good candidates for the development of diagnostic tools that would make it possible to evaluate on a larger scale the role of this bacterium in disease. Second, the complete genome of Waddlia chondrophila, a potential agent of miscarriage, encodes numerous virulence factors to manipulate its host cell and resist environmental stresses. The reconstruction of metabolic pathways showed that the bacterium possesses extensive capabilities compared to related organisms. However, it is still incapable of synthesizing some essential components and thus has to import them from its host. Third, the genome comparison of Protochlamydia naegleriophila to its closest known relative Protochlamydia amoebophila revealed a particular evolutionary dynamic with the occurrence of an unexpected genome rearrangement. Fourth, a phylogenetic analysis of P. acanthamoebae and Legionella drancourtii identified several genes probably exchanged by horizontal gene transfer with other intracellular bacteria, an exchange that might occur within their amoebal host. These genes often encode mechanisms for resistance to metal or toxic compounds. As a whole, the analysis of the different genomes enabled us to highlight a large diversity in size, GC percentage, repeat content as well as plasmid organization. The abundant genomic data obtained during this thesis have a wide impact since they provide the necessary bases for detailed investigations on countless aspects of the biology and the evolution of Chlamydia-related bacteria, whether in the wet lab or by bioinformatic analyses.
Summary:
The Chlamydiales order is an important bacterial phylum that includes many species pathogenic to humans and animals, among them Chlamydia trachomatis, the agent of trachoma, the leading infectious cause of blindness worldwide. In recent decades, many bacteria related to Chlamydia have been discovered in environmental or clinical samples, but their possible pathogenic role in the development of disease remains poorly understood. These bacteria are obligate intracellular organisms: they need a host cell to multiply, which makes them particularly difficult to study. The development of new technologies that sequence an organism's genome quickly and at lower cost, together with the rise of the associated analysis methods, represents an exceptional opportunity to study these organisms. In this context, this thesis demonstrates the usefulness of genomics for developing new diagnostic tools and for studying the metabolism of these bacteria, their virulence factors and their evolution. A first approach illustrated the usefulness of rapid sequencing for obtaining the information needed to identify proteins recognized by human or animal antibodies. This method, particularly useful for an emerging pathogen such as Parachlamydia acanthamoebae, led to the discovery of good candidates for the development of a diagnostic tool that would make it possible to evaluate on a larger scale the role of this bacterium, notably in pneumonia. Analysis of the gene content of Waddlia chondrophila, another organism that may be involved in abortions and miscarriages, further revealed the presence of numerous known factors that allow it to manipulate its host. This bacterium has greater metabolic capabilities than other Chlamydia, but it is unable to synthesize certain components and must therefore import them from its host to meet its needs. Comparison of the genome of Protochlamydia naegleriophila with that of its closest relative, Protochlamydia amoebophila, revealed a particular evolutionary dynamic, with the occurrence of an unexpected major rearrangement after the separation of these two species. In addition, these studies showed the occurrence of several gene transfers with other, more distant organisms, notably other intracellular bacteria of amoebae, often for the acquisition of resistance mechanisms against toxic compounds. The genomic data acquired during this work lay the foundations for many analyses that will progressively lead to a better understanding of many aspects of these fascinating bacteria.
Abstract:
Phagocytosis, whether of food particles in protozoa or bacteria and cell remnants in the metazoan immune system, is a conserved process. The particles are taken up into phagosomes, which then undergo complex remodeling of their components, called maturation. By using two-dimensional gel electrophoresis and mass spectrometry combined with genomic data, we identified 179 phagosomal proteins in the amoeba Dictyostelium, including components of signal transduction, membrane traffic, and the cytoskeleton. By carrying out this proteomics analysis over the course of maturation, we obtained time profiles for 1,388 spots and thus generated a dynamic record of phagosomal protein composition. Clustering of the time profiles revealed five clusters and 24 functional groups that were mapped onto a flow chart of maturation. Two heterotrimeric G protein subunits, Galpha4 and Gbeta, appeared at the earliest times. We showed that mutations in the genes encoding these two proteins produce a phagocytic uptake defect in Dictyostelium. This analysis of phagosome protein dynamics provides a reference point for future genetic and functional investigations.
Abstract:
Understanding the drivers of population divergence, speciation and species persistence is of great interest to molecular ecology, especially for species-rich radiations inhabiting the world's biodiversity hotspots. The toolbox of population genomics holds great promise for addressing these key issues, especially if genomic data are analysed within a spatially and ecologically explicit context. We have studied the earliest stages of the divergence continuum in the Restionaceae, a species-rich and ecologically important plant family of the Cape Floristic Region (CFR) of South Africa, using the widespread CFR endemic Restio capensis (L.) H.P. Linder & C.R. Hardy as an example. We studied diverging populations of this morphotaxon for plastid DNA sequences and >14 400 nuclear DNA polymorphisms from Restriction site Associated DNA (RAD) sequencing and analysed the results jointly with spatial, climatic and phytogeographic data, using a Bayesian generalized linear mixed modelling (GLMM) approach. The results indicate that population divergence across the extreme environmental mosaic of the CFR is mostly driven by isolation by environment (IBE) rather than isolation by distance (IBD) for both neutral and non-neutral markers, consistent with genome hitchhiking or coupling effects during early stages of divergence. Mixed modelling of plastid DNA and single divergent outlier loci from a Bayesian genome scan confirmed the predominant role of climate and pointed to additional drivers of divergence, such as drift and ecological agents of selection captured by phytogeographic zones. Our study demonstrates the usefulness of population genomics for disentangling the effects of IBD and IBE along the divergence continuum often found in species radiations across heterogeneous ecological landscapes.
Abstract:
We analyze here the relation between alternative splicing and gene duplication in light of recent genomic data, with a focus on the human genome. We show that the previously reported negative correlation between level of alternative splicing and family size no longer holds true. We clarify this pattern and show that it is sufficiently explained by two factors. First, genes progressively gain new splice variants with time. The gain is consistent with a selectively relaxed regime, until purifying selection slows it down as aging genes accumulate a large number of variants. Second, we show that duplication does not lead to a loss of splice forms, but rather that genes with low levels of alternative splicing tend to duplicate more frequently. This leads us to reconsider the role of alternative splicing in duplicate retention.