7 resultados para DATA INTEGRATION

em Cambridge University Engineering Department Publications Database


Relevância:

100.00% 100.00%

Publicador:

Resumo:

MOTIVATION: We present a method for directly inferring transcriptional modules (TMs) by integrating gene expression and transcription factor binding (ChIP-chip) data. Our model extends a hierarchical Dirichlet process mixture model to allow data fusion on a gene-by-gene basis. This encodes the intuition that co-expression and co-regulation are not necessarily equivalent and hence we do not expect all genes to group similarly in both datasets. In particular, it allows us to identify the subset of genes that share the same structure of transcriptional modules in both datasets. RESULTS: We find that by working on a gene-by-gene basis, our model is able to extract clusters with greater functional coherence than existing methods. By combining gene expression and transcription factor binding (ChIP-chip) data in this way, we are better able to determine the groups of genes that are most likely to represent underlying TMs. AVAILABILITY: If interested in the code for the work presented in this article, please contact the authors. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

The Architecture, Engineering, Construction and Facilities Management (AEC/FM) industry is rapidly becoming a multidisciplinary, multinational and multi-billion dollar economy, involving large numbers of actors working concurrently at different locations and using heterogeneous software and hardware technologies. Since the beginning of the last decade, a great deal of effort has been spent within the field of construction IT in order to integrate data and information from most computer tools used to carry out engineering projects. For this purpose, a number of integration models have been developed, like web-centric systems and construction project modeling, a useful approach in representing construction projects and integrating data from various civil engineering applications. In the modern, distributed and dynamic construction environment it is important to retrieve and exchange information from different sources and in different data formats in order to improve the processes supported by these systems. Previous research demonstrated that a major hurdle in AEC/FM data integration in such systems is caused by its variety of data types and that a significant part of the data is stored in semi-structured or unstructured formats. Therefore, new integrative approaches are needed to handle non-structured data types like images and text files. This research is focused on the integration of construction site images. These images are a significant part of the construction documentation with thousands stored in site photographs logs of large scale projects. However, locating and identifying such data needed for the important decision making processes is a very hard and time-consuming task, while so far, there are no automated methods for associating them with other related objects. Therefore, automated methods for the integration of construction images are important for construction information management. During this research, processes for retrieval, classification, and integration of construction images in AEC/FM model based systems have been explored. Specifically, a combination of techniques from the areas of image and video processing, computer vision, information retrieval, statistics and content-based image and video retrieval have been deployed in order to develop a methodology for the retrieval of related construction site image data from components of a project model. This method has been tested on available construction site images from a variety of sources like past and current building construction and transportation projects and is able to automatically classify, store, integrate and retrieve image data files in inter-organizational systems so as to allow their usage in project management related tasks.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

MOTIVATION: The integration of multiple datasets remains a key challenge in systems biology and genomic medicine. Modern high-throughput technologies generate a broad array of different data types, providing distinct-but often complementary-information. We present a Bayesian method for the unsupervised integrative modelling of multiple datasets, which we refer to as MDI (Multiple Dataset Integration). MDI can integrate information from a wide range of different datasets and data types simultaneously (including the ability to model time series data explicitly using Gaussian processes). Each dataset is modelled using a Dirichlet-multinomial allocation (DMA) mixture model, with dependencies between these models captured through parameters that describe the agreement among the datasets. RESULTS: Using a set of six artificially constructed time series datasets, we show that MDI is able to integrate a significant number of datasets simultaneously, and that it successfully captures the underlying structural similarity between the datasets. We also analyse a variety of real Saccharomyces cerevisiae datasets. In the two-dataset case, we show that MDI's performance is comparable with the present state-of-the-art. We then move beyond the capabilities of current approaches and integrate gene expression, chromatin immunoprecipitation-chip and protein-protein interaction data, to identify a set of protein complexes for which genes are co-regulated during the cell cycle. Comparisons to other unsupervised data integration techniques-as well as to non-integrative approaches-demonstrate that MDI is competitive, while also providing information that would be difficult or impossible to extract using other methods.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We propose an algorithm to perform multitask learning where each task has potentially distinct label sets and label correspondences are not readily available. This is in contrast with existing methods which either assume that the label sets shared by different tasks are the same or that there exists a label mapping oracle. Our method directly maximizes the mutual information among the labels, and we show that the resulting objective function can be efficiently optimized using existing algorithms. Our proposed approach has a direct application for data integration with different label spaces, such as integrating Yahoo! and DMOZ web directories.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This book explores the processes for retrieval, classification, and integration of construction images in AEC/FM model based systems. The author describes a combination of techniques from the areas of image and video processing, computer vision, information retrieval, statistics and content-based image and video retrieval that have been integrated into a novel method for the retrieval of related construction site image data from components of a project model. This method has been tested on available construction site images from a variety of sources like past and current building construction and transportation projects and is able to automatically classify, store, integrate and retrieve image data files in inter-organizational systems so as to allow their usage in project management related tasks. objects. Therefore, automated methods for the integration of construction images are important for construction information management. During this research, processes for retrieval, classification, and integration of construction images in AEC/FM model based systems have been explored. Specifically, a combination of techniques from the areas of image and video processing, computer vision, information retrieval, statistics and content-based image and video retrieval have been deployed in order to develop a methodology for the retrieval of related construction site image data from components of a project model. This method has been tested on available construction site images from a variety of sources like past and current building construction and transportation projects and is able to automatically classify, store, integrate and retrieve image data files in inter-organizational systems so as to allow their usage in project management related tasks.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A novel integration method for the production of cost-effective optoelectronic printed circuit boards (OE PCBs) is presented. The proposed integration method allows fabrication of OE PCBs with manufacturing processes common to the electronics industry while enabling direct attachment of electronic components onto the board with solder reflow processes as well as board assembly with automated pick-and-place tools. The OE PCB design is based on the use of polymer multimode waveguides, end-fired optical coupling schemes, and simple electro-optic connectors, eliminating the need for additional optical components in the optical layer, such as micro-mirrors and micro-lenses. A proof-of-concept low-cost optical transceiver produced with the proposed integration method is presented. This transceiver is fabricated on a low-cost FR4 substrate, comprises a polymer Y-splitter together with the electronic circuitry of the transmitter and receiver modules and achieves error-free 10-Gb/s bidirectional data transmission. Theoretical studies on the optical coupling efficiencies and alignment tolerances achieved with the employed end-fired coupling schemes are presented while experimental results on the optical transmission characteristics, frequency response, and data transmission performance of the integrated optical links are reported. The demonstrated optoelectronic unit can be used as a front-end optical network unit in short-reach datacommunication links. © 2011-2012 IEEE.