2 results for Image Processing in Molecular Biology Research
in DRUM (Digital Repository at the University of Maryland)
Abstract:
bbd18 is a differentially expressed Borrelia burgdorferi gene that is transcribed at almost undetectable levels in spirochetes grown in vitro but dramatically upregulated during tick infection. The gene also displays low yet detectable expression at various times in tissues of murine hosts. As the gene product bears no homology to known proteins, its biological significance remains enigmatic. To understand the gene's function, we created isogenic bbd18-deletion mutants as well as genetically complemented isolates from an infectious wild-type B. burgdorferi strain. Compared to parental isolates, bbd18 mutants, but not complemented spirochetes, displayed slower in vitro growth. The bbd18 mutants also showed a significantly reduced ability to persist, or were undetectable, in both immunocompetent and SCID mice, yet were able to survive in ticks. This suggests that BBD18 function is essential in mammalian hosts but redundant in the arthropod vector. Notably, although bbd18 expression was restored and the in vitro growth defect was corrected in the complemented isolates, their phenotype was similar to that of the mutants: they were unable to persist in mice but able to survive in ticks. Despite the low expression of bbd18 in cultured wild-type B. burgdorferi, its deletion downregulated several genes. Interestingly, expression of some, including ospD and bbi39, could be complemented, while that of others could not be restored via bbd18 re-expression. Correspondingly, bbd18 mutants displayed altered production of several proteins and, as with RNA levels, some were restored in the bbd18 complement while others were not. To understand how bbd18 deletion results in apparently permanent and noncomplementable phenotypic defects, we sought to genetically disturb the DNA topology surrounding the bbd18 locus without deleting the gene.
Spirochetes with an antibiotic cassette inserted downstream of the gene, between bbd17 and bbd18, were significantly attenuated in mice, while a similar upstream insertion, between bbd18 and bbd19, did not affect infectivity, suggesting that an unidentified cis element downstream of bbd18 may encode a virulence-associated factor critical for infection.
Abstract:
Edge-labeled graphs have proliferated rapidly over the last decade due to the increased popularity of social networks and the Semantic Web. In social networks, relationships between people are represented by edges and each edge is labeled with a semantic annotation. Hence, a huge single graph can express many different relationships between entities. The Semantic Web represents each single fragment of knowledge as a triple (subject, predicate, object), which is conceptually identical to an edge from subject to object labeled with the predicate. A set of triples constitutes an edge-labeled graph on which knowledge inference is performed. Subgraph matching has been extensively used as a query language for patterns in the context of edge-labeled graphs. For example, in social networks, users can specify a subgraph matching query to find all people that have certain neighborhood relationships. Heavily used fragments of the SPARQL query language for the Semantic Web and graph queries of other graph DBMSs can also be viewed as subgraph matching over large graphs. Though subgraph matching has been extensively studied as a query paradigm in the Semantic Web and in social networks, a user can get a large number of answers in response to a query. These answers can be shown to the user in accordance with an importance ranking. In this thesis proposal, we present four different scoring models along with scalable algorithms to find the top-k answers via a suite of intelligent pruning techniques. The suggested models consist of a practically important subset of the SPARQL query language augmented with some additional useful features. The first model, called Substitution Importance Query (SIQ), identifies the top-k answers whose scores are calculated from matched vertices' properties in each answer in accordance with a user-specified notion of importance.
The second model, called Vertex Importance Query (VIQ), identifies important vertices in accordance with a user-defined scoring method that builds on top of various subgraphs articulated by the user. Approximate Importance Query (AIQ), our third model, allows partial and inexact matchings and returns the top-k of them according to user-specified approximation terms and scoring functions. In the fourth model, called Probabilistic Importance Query (PIQ), a query consists of several sub-blocks: one mandatory block that must be mapped and other blocks that can be opportunistically mapped. The probability is calculated from various aspects of answers, such as the number of mapped blocks and the vertices' properties in each block, and the top-k most probable answers are returned. An important distinguishing feature of our work is that we allow the user a huge amount of freedom in specifying: (i) what patterns and approximations they consider important, (ii) how to score answers, irrespective of whether they are vertices or substitutions, and (iii) how to combine and aggregate scores generated by multiple patterns and/or multiple substitutions. Because so much power is given to the user, indexing is more challenging than in situations where additional restrictions are imposed on the queries the user can ask. The proposed algorithms for the first model can also be used for answering SPARQL queries with ORDER BY and LIMIT, and the method for the second model also works for SPARQL queries with GROUP BY, ORDER BY and LIMIT. We test our algorithms on multiple real-world graph databases, showing that our algorithms are far more efficient than popular triple stores.
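As a rough illustration of the idea behind the first model in the abstract above (rank subgraph-matching answers by a score computed from the matched vertices, then keep the top k), here is a minimal Python sketch. It is not the algorithm from the thesis: it enumerates every match by naive backtracking and uses no pruning or indexing. The toy triples, the `IMPORTANCE` table, and the function names `match` and `top_k` are all hypothetical, introduced only for this example. As in SPARQL, two different variables are allowed to bind to the same vertex.

```python
import heapq

# Hypothetical toy graph: (subject, predicate, object) triples,
# i.e. edges from subject to object labeled with the predicate.
TRIPLES = [
    ("alice", "friend", "bob"),
    ("bob", "friend", "carol"),
    ("alice", "friend", "carol"),
    ("carol", "colleague", "dave"),
    ("bob", "colleague", "dave"),
]

# Hypothetical per-vertex property used as the notion of importance.
IMPORTANCE = {"alice": 3.0, "bob": 1.0, "carol": 2.0, "dave": 5.0}


def match(pattern, triples):
    """Yield every substitution (variable -> vertex) that satisfies the pattern.

    `pattern` is a list of (subject, predicate, object) edges in which terms
    starting with "?" are variables. Naive backtracking, no pruning.
    """
    by_label = {}
    for s, p, o in triples:
        by_label.setdefault(p, []).append((s, o))

    def extend(i, binding):
        if i == len(pattern):
            yield dict(binding)
            return
        s, p, o = pattern[i]
        for cand_s, cand_o in by_label.get(p, []):
            new = dict(binding)
            ok = True
            for term, value in ((s, cand_s), (o, cand_o)):
                if term.startswith("?"):
                    # Bind the variable, or check it against its earlier binding.
                    if new.setdefault(term, value) != value:
                        ok = False
                        break
                elif term != value:
                    # Constant term must equal the vertex exactly.
                    ok = False
                    break
            if ok:
                yield from extend(i + 1, new)

    yield from extend(0, {})


def top_k(pattern, triples, score, k):
    """Return the k highest-scoring substitutions, best first."""
    return heapq.nlargest(k, match(pattern, triples), key=score)


# Query: ?x --friend--> ?y --colleague--> ?z
pattern = [("?x", "friend", "?y"), ("?y", "colleague", "?z")]


def score(binding):
    # User-specified scoring: sum of the matched vertices' importance.
    return sum(IMPORTANCE[v] for v in binding.values())


results = top_k(pattern, TRIPLES, score, 2)
for binding in results:
    print(score(binding), binding)
```

In SPARQL terms, this corresponds loosely to a basic graph pattern followed by ORDER BY on a computed score and LIMIT k. The sketch materializes every answer before ranking, which is exactly the cost that the pruning techniques mentioned in the abstract aim to avoid.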