983 results for Biological networks


Relevance:

60.00%

Publisher:

Abstract:

The overarching goal of the Pathway Semantics Algorithm (PSA) is to improve the in silico identification of clinically useful hypotheses about molecular patterns in disease progression. By framing biomedical questions within a variety of matrix representations, PSA has the flexibility to analyze combined quantitative and qualitative data over a wide range of stratifications. The resulting hypothetical answers can then move to in vitro and in vivo verification, research assay optimization, clinical validation, and commercialization. Herein PSA is shown to generate novel hypotheses about the significant biological pathways in two disease domains: shock / trauma and hemophilia A, and validated experimentally in the latter. The PSA matrix algebra approach identified differential molecular patterns in biological networks over time and outcome that would not be easily found through direct assays, literature or database searches. In this dissertation, Chapter 1 provides a broad overview of the background and motivation for the study, followed by Chapter 2 with a literature review of relevant computational methods. Chapters 3 and 4 describe PSA for node and edge analysis respectively, and apply the method to disease progression in shock / trauma. Chapter 5 demonstrates the application of PSA to hemophilia A and the validation with experimental results. The work is summarized in Chapter 6, followed by extensive references and an Appendix with additional material.

Relevance:

60.00%

Publisher:

Abstract:

We have performed a systematic temporal and spatial expression profiling of the developing mouse kidney using Compugen long-oligonucleotide microarrays. The activity of 18,000 genes was monitored at 24-h intervals from 10.5-day-postcoitum (dpc) metanephric mesenchyme (MM) through to neonatal kidney, and a cohort of 3,600 dynamically expressed genes was identified. Early metanephric development was further surveyed by directly comparing RNA from 10.5 vs. 11.5 vs. 13.5dpc kidneys. These data showed high concordance with the previously published dynamic profile of rat kidney development (Stuart RO, Bush KT, and Nigam SK. Proc Natl Acad Sci USA 98: 5649-5654, 2001) and our own temporal data. Cluster analyses were used to identify gene ontological terms, functional annotations, and pathways associated with temporal expression profiles. Genetic network analysis was also used to identify biological networks that have maximal transcriptional activity during early metanephric development, highlighting the involvement of proliferation and differentiation. Differential gene expression was validated using whole mount and section in situ hybridization of staged embryonic kidneys. Two spatial profiling experiments were also undertaken. MM (10.5dpc) was compared with adjacent intermediate mesenchyme to further define metanephric commitment. To define the genes involved in branching and in the induction of nephrogenesis, expression profiling was performed on ureteric bud (GFP+) FACS sorted from HoxB7-GFP transgenic mice at 15.5dpc vs. the GFP- mesenchymal derivatives. Comparisons between temporal and spatial data enhanced the ability to predict function for genes and networks. This study provides the most comprehensive temporal and spatial survey of kidney development to date, and the compilation of these transcriptional surveys provides important insights into metanephric development that can now be functionally tested.

Relevance:

60.00%

Publisher:

Abstract:

In today’s big data world, data is being produced in massive volumes, at great velocity and from a variety of different sources such as mobile devices, sensors, a plethora of small devices hooked to the internet (Internet of Things), social networks, communication networks and many others. Interactive querying and large-scale analytics are being increasingly used to derive value out of this big data. A large portion of this data is being stored and processed in the Cloud due to the several advantages provided by the Cloud such as scalability, elasticity, availability, low cost of ownership and the overall economies of scale. There is thus a growing need for large-scale cloud-based data management systems that can support real-time ingest, storage and processing of large volumes of heterogeneous data. However, in the pay-as-you-go Cloud environment, the cost of analytics can grow linearly with the time and resources required. Reducing the cost of data analytics in the Cloud thus remains a primary challenge. In my dissertation research, I have focused on building efficient and cost-effective cloud-based data management systems for different application domains that are predominant in cloud computing environments. In the first part of my dissertation, I address the problem of reducing the cost of transactional workloads on relational databases to support database-as-a-service in the Cloud. The primary challenges in supporting such workloads include choosing how to partition the data across a large number of machines, minimizing the number of distributed transactions, providing high data availability, and tolerating failures gracefully.
I have designed, built and evaluated SWORD, an end-to-end scalable online transaction processing system that uses workload-aware data placement and replication to minimize the number of distributed transactions, and that incorporates a suite of novel techniques to significantly reduce the overheads incurred both during the initial placement of data and during query execution at runtime. In the second part of my dissertation, I focus on sampling-based progressive analytics as a means to reduce the cost of data analytics in the relational domain. Sampling has traditionally been used by data scientists to get progressive answers to complex analytical tasks over large volumes of data. Typically, this involves manually extracting samples of increasing data size (progressive samples) for exploratory querying. This provides the data scientists with user control, repeatable semantics, and result provenance. However, such solutions result in tedious workflows that preclude the reuse of work across samples. On the other hand, existing approximate query processing systems report early results, but do not offer the above benefits for complex ad-hoc queries. I propose a new progressive data-parallel computation framework, NOW!, that provides support for progressive analytics over big data. In particular, NOW! enables progressive relational (SQL) query support in the Cloud using unique progress semantics that allow efficient and deterministic query processing over samples, providing meaningful early results and provenance to data scientists. NOW! enables the provision of early results using significantly fewer resources, thereby enabling a substantial reduction in the cost incurred during such analytics. Finally, I propose NSCALE, a system for efficient and cost-effective complex analytics on large-scale graph-structured data in the Cloud.
The system is based on the key observation that a wide range of complex analysis tasks over graph data require processing and reasoning about a large number of multi-hop neighborhoods or subgraphs in the graph; examples include ego network analysis, motif counting in biological networks, finding social circles in social networks, personalized recommendations, link prediction, etc. These tasks are not well served by existing vertex-centric graph processing frameworks, whose computation and execution models limit the user program to directly accessing the state of a single vertex, resulting in high execution overheads. Further, the lack of support for extracting the relevant portions of the graph that are of interest to an analysis task and loading them onto distributed memory leads to poor scalability. NSCALE allows users to write programs at the level of neighborhoods or subgraphs rather than at the level of vertices, and to declaratively specify the subgraphs of interest. It enables the efficient distributed execution of these neighborhood-centric complex analysis tasks over large-scale graphs, while minimizing resource consumption and communication cost, thereby substantially reducing the overall cost of graph data analytics in the Cloud. The results of our extensive experimental evaluation of these prototypes with several real-world data sets and applications validate the effectiveness of our techniques, which provide orders-of-magnitude reductions in the overheads of distributed data querying and analysis in the Cloud.
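To make the neighborhood-centric idea concrete, the following is a minimal single-machine sketch in Python. It is not NSCALE's API, which the abstract does not describe: the function names (`k_hop_neighborhood`, `induced_subgraph`, `count_triangles`) and the toy graph are assumptions made for illustration. The sketch extracts the k-hop neighborhood of a vertex, builds the induced subgraph, and runs a small motif count (triangles) on that subgraph rather than on individual vertices.

```python
from collections import deque


def k_hop_neighborhood(adj, center, k):
    """Return the set of vertices within k hops of `center` (breadth-first search)."""
    seen = {center}
    frontier = deque([(center, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue  # do not expand beyond k hops
        for nbr in adj.get(node, ()):
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, depth + 1))
    return seen


def induced_subgraph(adj, nodes):
    """Keep only the edges whose endpoints both lie in `nodes`."""
    return {u: {v for v in adj.get(u, ()) if v in nodes} for u in nodes}


def count_triangles(sub):
    """Count triangles in an undirected graph, each once, via the ordering u < v < w."""
    total = 0
    for u in sub:
        for v in sub[u]:
            if v > u:
                total += sum(1 for w in sub[u] & sub[v] if w > v)
    return total


# Hypothetical toy graph (undirected adjacency sets).
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b", "e"},
    "e": {"d"},
}

# The user program sees the whole 1-hop ego network of "a", not a single vertex.
ego = induced_subgraph(adj, k_hop_neighborhood(adj, "a", 1))
triangles_in_ego = count_triangles(ego)
```

In a vertex-centric framework the same triangle count would need several rounds of message passing between vertices; here the analysis function receives the whole subgraph at once, which is the programming model the abstract argues for.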

Relevance:

60.00%

Publisher:

Abstract:

Graphs are powerful tools to describe social, technological and biological networks, with nodes representing agents (people, websites, genes, etc.) and edges (or links) representing relations (or interactions) between agents. Examples of real-world networks include social networks, the World Wide Web, collaboration networks, protein networks, etc. Researchers often model these networks as random graphs. In this dissertation, we study a recently introduced social network model, named the Multiplicative Attribute Graph model (MAG), which takes into account the randomness of nodal attributes in the process of link formation (i.e., the probability of a link existing between two nodes depends on their attributes). Kim and Leskovec, who defined the model, have claimed that this model exhibits some of the properties a real-world social network is expected to have. Focusing on a homogeneous version of this model, we investigate the existence of zero-one laws for graph properties, e.g., the absence of isolated nodes, graph connectivity and the emergence of triangles. We obtain conditions on the parameters of the model so that these properties occur with high or vanishingly small probability as the number of nodes becomes unboundedly large. In that regime, we also investigate the property of triadic closure and the nodal degree distribution.
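The attribute-dependent link formation can be illustrated with a short Python sketch. It assumes the commonly described form of the model, in which each node carries L binary attributes and the link probability is the product, over attributes, of entries of a 2x2 affinity matrix; the homogeneous version uses the same matrix and the same attribute distribution for every attribute. The names `theta`, `mu`, `sample_mag_graph` and `isolated_nodes`, and the convention that `mu` is the probability of an attribute being 1, are assumptions of this sketch and are not taken from the dissertation.

```python
import random


def sample_attributes(n, n_attrs, mu, rng):
    """Draw n_attrs i.i.d. binary attributes per node; mu = P(attribute == 1) (sketch convention)."""
    return [[1 if rng.random() < mu else 0 for _ in range(n_attrs)] for _ in range(n)]


def link_probability(a_u, a_v, theta):
    """Multiply the affinity entries selected by each pair of attribute values."""
    p = 1.0
    for x, y in zip(a_u, a_v):
        p *= theta[x][y]
    return p


def sample_mag_graph(n, n_attrs, mu, theta, seed=0):
    """Sample node attributes, then each undirected edge independently given the attributes."""
    rng = random.Random(seed)
    attrs = sample_attributes(n, n_attrs, mu, rng)
    edges = set()
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < link_probability(attrs[u], attrs[v], theta):
                edges.add((u, v))
    return attrs, edges


def isolated_nodes(n, edges):
    """Nodes with no incident edge, one of the graph properties studied via zero-one laws."""
    touched = {x for edge in edges for x in edge}
    return [u for u in range(n) if u not in touched]


# Hypothetical symmetric affinity matrix: theta[x][y] for attribute values x, y in {0, 1}.
theta = [[0.2, 0.5],
         [0.5, 0.9]]
attrs, edges = sample_mag_graph(n=50, n_attrs=3, mu=0.5, theta=theta, seed=1)
lonely = isolated_nodes(50, edges)
```

Repeating the sampling for growing `n` under a chosen scaling of `n_attrs` gives an empirical view of how the fraction of runs without isolated nodes moves toward 0 or 1, which is the kind of behavior a zero-one law makes precise.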

Relevance:

60.00%

Publisher:

Abstract:

Human mesenchymal stem cells (MSC) are powerful sources for cell therapy in regenerative medicine. Long-term cultivation can result in replicative senescence or can be related to the emergence of chromosomal alterations responsible for the acquisition of tumorigenic features in vitro. In this study, for the first time, the expression profile of MSC with a paracentric chromosomal inversion (MSC/inv) was compared to that of MSC with a normal karyotype (MSC/n) in early and late passages. Furthermore, we compared the transcriptome of each MSC in early passages with that in late passages. The MSC used in this study were obtained from the umbilical vein of three donors, two MSC/n and one MSC/inv. After cryopreservation, they were expanded in vitro until they reached senescence. Total RNA was extracted using the RNeasy mini kit (Qiagen) and labeled with the GeneChip® 3' IVT Express Kit (Affymetrix Inc.). Subsequently, the fragmented aRNA was hybridized on Affymetrix Human Genome U133 Plus 2.0 microarrays (Affymetrix Inc.). The statistical analysis of differential gene expression between MSC groups was performed with the Partek Genomics Suite software, version 6.4 (Partek Inc.). Differences in expression were considered statistically significant at a Bonferroni-corrected p-value < 0.01. Only signals with fold change > 3.0 were included in the list of differentially expressed genes. Differences in gene expression obtained from the microarrays were confirmed by real-time RT-PCR. For the biological interpretation of the expression data, the following tools were used: IPA (Ingenuity Systems) for functional enrichment analysis; STRING 9.0 for construction of interaction networks; and Cytoscape 2.8 for network visualization and bottleneck analysis, with the aid of the GraphPad Prism 5.0 software. The BiNGO Cytoscape plugin was used to assess overrepresentation of Gene Ontology categories in Biological Networks.
The comparison between senescent and young cells within each MSC group showed a difference in the expression pattern, which was larger in the MSC/inv group. The results also showed differences in expression profiles between MSC/inv and MSC/n, which were greater in senescent cells. New networks were identified for genes related to the response of the two MSC groups over cultivation time. Genes that can coordinate the functional categories overrepresented in the networks were also identified, such as CXCL12, SFRP1, EGF, SPP1, MMP1 and THBS1. The biological interpretation of these data suggests that the population of MSC/inv has different constitutional characteristics, related to their potential for differentiation, proliferation and response to stimuli, responsible for a distinct process of replicative senescence in MSC/inv compared to MSC/n. The genes identified in this study are candidates for biomarkers of cellular senescence in MSC, but their functional relevance in this process should be evaluated in additional in vitro and/or in vivo assays.

Relevance:

40.00%

Publisher:

Abstract:

Background: The analysis and usage of biological data is hindered by the spread of information across multiple repositories and the difficulties posed by different nomenclature systems and storage formats. In particular, there is an important need for data unification in the study and use of protein-protein interactions. Without good integration strategies, it is difficult to analyze the whole set of available data and its properties. Results: We introduce BIANA (Biologic Interactions and Network Analysis), a tool for biological information integration and network management. BIANA is a Python framework designed to achieve two major goals: i) the integration of multiple sources of biological information, including biological entities and their relationships, and ii) the management of biological information as a network where entities are nodes and relationships are edges. Moreover, BIANA uses properties of proteins and genes to infer latent biomolecular relationships by transferring edges to entities sharing similar properties. BIANA is also provided as a plugin for Cytoscape, which allows users to visualize and interactively manage the data. A web interface to BIANA providing basic functionalities is also available. The software can be downloaded under the GNU GPL license from http://sbi.imim.es/web/BIANA.php. Conclusions: BIANA's approach to data unification solves many of the nomenclature issues common to systems dealing with biological data. BIANA can easily be extended to handle new specific data repositories and new specific data types. The unification protocol allows BIANA to be a flexible tool suitable for different user requirements: non-expert users can use a suggested unification protocol while expert users can define their own specific unification rules.

Relevance:

40.00%

Publisher:

Abstract:

The aim of this work was to design a novel strategy to detect new targets for anticancer treatments. The rationale was to build Biological Association Networks from differentially expressed genes in drug-resistant cells to identify important nodes within the Networks. These nodes may represent putative targets to attack in cancer therapy, as a way to destabilize the gene network developed by the resistant cells to escape from the drug pressure. As a model we used cells resistant to methotrexate (MTX), an inhibitor of DHFR. Selected node-genes were analyzed at the transcriptional level and from a genotypic point of view. In colon cancer cells, DHFR, the AKR1 family, PKCα, S100A4, DKK1, and CAV1 were overexpressed while E-cadherin was lost. In breast cancer cells, the UGT1A family was overexpressed, whereas EEF1A1 was overexpressed in pancreatic cells. Interference RNAs directed against these targets sensitized cells towards MTX.

Relevance:

40.00%

Publisher:

Abstract:

Developing high-quality scientific research will be most effective if research communities with diverse skills and interests are able to share information and knowledge, are aware of the major challenges across disciplines, and can exploit economies of scale to provide robust answers and better inform policy. We evaluate opportunities and challenges facing the development of a more interactive research environment by developing an interdisciplinary synthesis of research on a single geographic region. We focus on the Amazon as it is of enormous regional and global environmental importance and faces a highly uncertain future. To take stock of existing knowledge and provide a framework for analysis we present a set of mini-reviews from fourteen different areas of research, encompassing taxonomy, biodiversity, biogeography, vegetation dynamics, landscape ecology, earth-atmosphere interactions, ecosystem processes, fire, deforestation dynamics, hydrology, hunting, conservation planning, livelihoods, and payments for ecosystem services. Each review highlights the current state of knowledge and identifies research priorities, including major challenges and opportunities. We show that while substantial progress is being made across many areas of scientific research, our understanding of specific issues is often dependent on knowledge from other disciplines. Accelerating the acquisition of reliable and contextualized knowledge about the fate of complex pristine and modified ecosystems is partly dependent on our ability to exploit economies of scale in shared resources and technical expertise, recognise and make explicit interconnections and feedbacks among sub-disciplines, increase the temporal and spatial scale of existing studies, and improve the dissemination of scientific findings to policy makers and society at large. 
Enhancing interaction among research efforts is vital if we are to make the most of limited funds and overcome the challenges posed by addressing large-scale interdisciplinary questions. Bringing together a diverse scientific community with a single geographic focus can help increase awareness of research questions both within and among disciplines, and reveal the opportunities that may exist for advancing acquisition of reliable knowledge. This approach could be useful for a variety of globally important scientific questions.

Relevance:

40.00%

Publisher:

Abstract:

We introduce jump processes in R^k, called density-profile processes, to model biological signaling networks. Our modeling setup describes the macroscopic evolution of a finite-size spin-flip model with k types of spins, each with an arbitrary number of internal states, interacting through a non-reversible stochastic dynamics. We are mostly interested in the multi-dimensional empirical-magnetization vector in the thermodynamic limit, and prove that, within arbitrary finite time-intervals, its path converges almost surely to a deterministic trajectory determined by a first-order (non-linear) differential equation, with explicit bounds on the distance between the stochastic and deterministic trajectories. As parameters of the spin-flip dynamics change, the associated dynamical system may go through bifurcations, associated with phase transitions in the statistical mechanical setting. We present a simple example of a spin-flip stochastic model, associated with a synthetic biology model known as the repressilator, which leads to a dynamical system with Hopf and pitchfork bifurcations. Depending on the parameter values, the magnetization random path can either converge to a unique stable fixed point, converge to one of a pair of stable fixed points, or asymptotically evolve close to a deterministic orbit in R^k. We also discuss a simple signaling pathway related to cancer research, called the p53 module.

Relevance:

40.00%

Publisher:

Abstract:

Background: A current challenge in gene annotation is to define gene function in the context of the network of relationships rather than for single genes in isolation. The inference of gene networks (GNs) has emerged as an approach to better understand the biology of the system and to study how the components of this network interact with each other and keep their functions stable. However, in general there are not sufficient data to accurately recover GNs from expression levels, which leads to the curse of dimensionality: the number of variables is higher than the number of samples. One way to mitigate this problem is to integrate biological data instead of using only the expression profiles in the inference process. The use of several kinds of biological information in inference methods has increased significantly in recent years, in order to better recover the connections between genes and reduce false positives. What makes this strategy so interesting is the possibility of confirming known connections through the included biological data, and the possibility of discovering new relationships between genes when the expression data are observed. Although several works on data integration have increased the performance of network inference methods, the real contribution of each type of biological information to the obtained improvement is not clear. Methods: We propose a methodology to include biological information in an inference algorithm in order to assess the prediction gain from using biological information and expression profiles together. We also evaluated and compared the gain from adding four types of biological information: (a) protein-protein interaction, (b) Rosetta stone fusion proteins, (c) KEGG and (d) KEGG+GO. Results and conclusions: This work presents a first comparison of the gain from using prior biological information in the inference of GNs for a eukaryotic organism (P. falciparum).
Our results indicate that information based on direct interaction can produce a higher gain than data about less specific relationships such as GO or KEGG. Also, as expected, the results show that the use of biological information is a very important approach for improving the inference. We also compared the gain in the inference of the global network and of the hubs only. The results indicate that the use of biological information can improve the identification of the most connected proteins.

Relevance:

40.00%

Publisher:

Abstract:

In this paper we propose the use of Networks of Bio-inspired Processors (NBP) to model some biological phenomena within a computational framework. In particular, we propose the use of an extension of NBP named Network Evolutionary Processors Transducers to simulate chemical transformations of substances. Within a biological process, chemical transformations of substances are basic operations in the change of the state of the cell. Previously, it has been proved that NBP are computationally complete and that they are able to solve NP-complete problems in linear time, using massively parallel computations. In addition, we propose a multilayer architecture that will allow us to design models of biological processes related to cellular communication as well as their implications in the metabolic pathways. Subsequently, these models can be applied not only to biological-cellular instances but, possibly, also to configure instances of interactive processes in many other fields like population interactions, ecological trophic networks, industrial ecosystems, etc.

Relevance:

40.00%

Publisher:

Abstract:

Selection of machine learning techniques requires a certain sensitivity to the requirements of the problem. In particular, the problem can be made more tractable by deliberately using algorithms that are biased toward solutions of the requisite kind. In this paper, we argue that recurrent neural networks have a natural bias toward a problem domain of which biological sequence analysis tasks are a subset. We use experiments with synthetic data to illustrate this bias. We then demonstrate that this bias can be exploited, using a data set of protein sequences containing several classes of subcellular localization targeting peptides. The results show that, compared with feed-forward networks, recurrent neural networks will generally perform better on sequence analysis tasks. Furthermore, as the patterns within the sequence become more ambiguous, the choice of specific recurrent architecture becomes more critical.

Relevance:

30.00%

Publisher:

Abstract:

Biological neuronal networks constitute a special class of dynamical systems, as they are formed by individual geometrical components, namely the neurons. In the existing literature, relatively little attention has been given to the influence of neuron shape on the overall connectivity and dynamics of the emerging networks. The current work addresses this issue by considering simplified neuronal shapes consisting of circular regions (soma/axons) with spokes (dendrites). Networks are grown by placing these patterns randomly in the two-dimensional (2D) plane and establishing connections whenever a piece of dendrite falls inside an axon. Several topological and dynamical properties of the resulting graph are measured, including the degree distribution, clustering coefficients, symmetry of connections, size of the largest connected component, as well as three hierarchical measurements of the local topology. By varying the number of processes of the individual basic patterns, we can quantify relationships between the individual neuronal shape and the topological and dynamical features of the networks. Integrate-and-fire dynamics on these networks is also investigated with respect to transient activation from a source node, indicating that long-range connections play an important role in the propagation of avalanches.
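The growth rule described in this abstract (connect two neurons whenever a piece of dendrite falls inside an axon) can be sketched in a few lines of Python. The sketch rests on assumptions the abstract does not fix: dendrites are straight spokes starting at the neuron's center and spaced at equal angles, every neuron has the same spoke length and axon radius, and the edge is directed from the neuron owning the axon to the neuron owning the dendrite. All function and parameter names are illustrative.

```python
import math


def dendrite_tips(center, n_spokes, length, rotation):
    """End points of n_spokes equally spaced straight dendrites starting at `center`."""
    cx, cy = center
    return [
        (cx + length * math.cos(rotation + 2 * math.pi * i / n_spokes),
         cy + length * math.sin(rotation + 2 * math.pi * i / n_spokes))
        for i in range(n_spokes)
    ]


def segment_hits_disk(p, q, c, r):
    """True if some point of segment pq lies within distance r of point c."""
    (px, py), (qx, qy), (cx, cy) = p, q, c
    dx, dy = qx - px, qy - py
    seg2 = dx * dx + dy * dy
    # Parameter of the point on the segment closest to c, clamped to [0, 1].
    t = 0.0 if seg2 == 0 else max(0.0, min(1.0, ((cx - px) * dx + (cy - py) * dy) / seg2))
    return math.hypot(px + t * dx - cx, py + t * dy - cy) <= r


def grow_network(centers, rotations, n_spokes, length, axon_radius):
    """Directed edges (j, i): a dendrite of neuron i enters the axon disk of neuron j."""
    edges = set()
    for i, ci in enumerate(centers):
        for tip in dendrite_tips(ci, n_spokes, length, rotations[i]):
            for j, cj in enumerate(centers):
                if i != j and segment_hits_disk(ci, tip, cj, axon_radius):
                    edges.add((j, i))
    return edges


# Hypothetical placement: neurons 0 and 1 are close, neuron 2 is far away.
centers = [(0.0, 0.0), (3.0, 0.0), (0.0, 10.0)]
rotations = [0.0, 0.0, 0.0]
edges = grow_network(centers, rotations, n_spokes=4, length=2.5, axon_radius=1.0)
```

Placing the centers and rotations at random and varying `n_spokes` reproduces the kind of experiment the abstract describes, after which the degree distribution or the largest connected component can be measured on `edges`.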

Relevance:

30.00%

Publisher:

Abstract:

Background: Several studies in the literature describe measurement error in gene expression data, and several others describe regulatory network models. However, only a small fraction combines measurement error with mathematical regulatory networks and shows how to identify these networks under different rates of noise. Results: This article investigates the effects of measurement error on the estimation of the parameters in regulatory networks. Simulation studies indicate that, in both time series (dependent) and non-time series (independent) data, the measurement error strongly affects the estimated parameters of the regulatory network models, biasing them as predicted by the theory. Moreover, when testing the parameters of the regulatory network models, p-values computed by ignoring the measurement error are not reliable, since the rate of false positives is not controlled under the null hypothesis. To overcome these problems, we present an improved version of the Ordinary Least Squares estimator for independent (regression models) and dependent (autoregressive models) data when the variables are subject to noise. Moreover, measurement error estimation procedures for microarrays are also described. Simulation results also show that both corrected methods perform better than the standard ones (i.e., ignoring measurement error). The proposed methodologies are illustrated using microarray data from lung cancer patients and mouse liver time series data. Conclusions: Measurement error dangerously affects the identification of regulatory network models; thus, it must be reduced or taken into account in order to avoid erroneous conclusions. This could be one of the reasons for the high biological false positive rates identified in actual regulatory network models.
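The bias mentioned in this abstract can be reproduced with a small simulation. The sketch below is a textbook errors-in-variables illustration for a single regressor and is not claimed to be the article's estimator: it assumes the measurement-noise variance is known, and corrects the ordinary least squares slope by subtracting that variance from the observed regressor variance. The function names and simulation settings are illustrative assumptions.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def covariance(xs, ys):
    """Unbiased sample covariance; covariance(xs, xs) is the sample variance."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def naive_slope(w, y):
    """Ordinary least squares slope, ignoring that w is a noisy measurement."""
    return covariance(w, y) / covariance(w, w)


def corrected_slope(w, y, noise_var):
    """Method-of-moments correction: remove the known noise variance from var(w)."""
    return covariance(w, y) / (covariance(w, w) - noise_var)


def simulate(n=50000, beta=2.0, noise_sd=1.0, seed=0):
    """True regressor x ~ N(0, 1); we only observe w = x + measurement noise."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [beta * xi + rng.gauss(0.0, 0.5) for xi in x]
    w = [xi + rng.gauss(0.0, noise_sd) for xi in x]
    return w, y


w, y = simulate()
beta_naive = naive_slope(w, y)                  # attenuated toward zero
beta_fixed = corrected_slope(w, y, noise_var=1.0)  # close to the true slope
```

With unit variance for both the true regressor and the noise, the naive slope is attenuated by the factor var(x) / (var(x) + var(noise)) = 1/2, so it lands near 1 instead of the true value 2, while the corrected slope recovers a value near 2. In practice the noise variance has to be estimated, for example from technical replicates.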