820 results for literature based discovery
Abstract:
This paper evaluates the efficiency of a number of popular corpus-based distributional models in performing discovery on very large document sets, including online collections. Literature-based discovery is the process of identifying previously unknown connections from text, often published literature, that could lead to the development of new techniques or technologies. Literature-based discovery has attracted growing research interest ever since Swanson's serendipitous discovery of the therapeutic effects of fish oil on Raynaud's disease in 1986. The successful application of distributional models in automating the identification of the indirect associations underpinning literature-based discovery has been repeatedly demonstrated in the medical domain. However, we wish to investigate the computational complexity of distributional models for literature-based discovery on much larger document collections, as they may provide computationally tractable solutions to tasks such as predicting future disruptive innovations. In this paper we perform a computational complexity analysis of four successful corpus-based distributional models to evaluate their fit for such tasks. Our results indicate that corpus-based distributional models that store their representations in fixed dimensions provide superior efficiency on literature-based discovery tasks.
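The efficiency property highlighted above can be sketched in a few lines. The terms, dimensionality, and random vectors below are purely illustrative (not the models evaluated in the paper); the point is only that similarity in a fixed d-dimensional space costs O(d) per pair, independent of vocabulary or corpus size:

```python
import numpy as np

def cosine(u, v):
    # Similarity in a fixed d-dimensional space: O(d) per pair,
    # regardless of how large the underlying corpus is.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

rng = np.random.default_rng(0)
d = 512                      # fixed dimensionality (illustrative choice)
vocab = ["fish_oil", "blood_viscosity", "raynauds_disease"]
vectors = {w: rng.standard_normal(d) for w in vocab}  # stand-in embeddings

# An indirect (A-B-C) association would be scored via such pairwise
# similarities; here we just compute one pair.
sim = cosine(vectors["fish_oil"], vectors["blood_viscosity"])
```

Growing the document collection changes how the vectors are trained, but not the cost of each similarity computation, which is the source of the efficiency advantage the paper reports for fixed-dimension models.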
Abstract:
In this paper we discuss our current efforts to develop and implement an exploratory, discovery-mode assessment item into the total learning and assessment profile for a target group of about 100 second-level engineering mathematics students. The assessment item under development is composed of two parts, namely, a set of "pre-lab" homework problems (which focus on relevant prior mathematical knowledge, concepts and skills), and complementary computing laboratory exercises which are undertaken within a fixed (1 hour) time frame. In particular, the computing exercises exploit the algebraic manipulation and visualisation capabilities of the symbolic algebra package MAPLE, with the aim of promoting understanding of certain mathematical concepts and skills via visual and intuitive reasoning, rather than a formal or rigorous approach. The assessment task we are developing is aimed at providing students with a significant learning experience, in addition to providing feedback on their individual knowledge and skills. To this end, a noteworthy feature of the scheme is that marks awarded for the laboratory work are primarily based on the extent to which reflective, critical thinking is demonstrated, rather than the number of CBE-style tasks completed by the student within the allowed time. With regard to student learning outcomes, a novel and potentially critical feature of our scheme is that the assessment task is designed to be intimately linked to the overall course content, in that it aims to introduce important concepts and skills (via individual student exploration) which will be revisited somewhat later in the pedagogically more restrictive formal lecture component of the course (typically a large-group plenary format).
Furthermore, the time delay involved, or "incubation period", is also a deliberate design feature: it is intended to allow students the opportunity to undergo potentially important internal re-adjustments in their understanding, before being exposed to lectures on related course content which are invariably delivered in a more condensed, formal and mathematically rigorous manner. In our presentation, we will discuss in more detail our motivation and rationale for trialling such a scheme for the targeted student group. Some of the advantages and disadvantages of our approach (as we perceived them at the initial stages) will also be enumerated. In a companion paper, the theoretical framework for our approach will be more fully elaborated, and measures of student learning outcomes (obtained from, e.g., student-provided feedback) will be discussed.
Abstract:
This paper derives from research in progress intended to yield both Design Research (DR) and Design Science (DS) outputs: the former a management decision tool based in IS-Impact (Gable et al. 2008) kernel theory; the latter methodological learnings derived from synthesis of the literature and reflection on the DR ‘case study’ experience. The paper introduces a generic, detailed and pragmatic DS ‘Research Roadmap’, or methodology, deriving at this stage primarily from the synthesis and harmonization of relevant concepts identified through systematic archival analysis of the related literature. The scope of the Roadmap has also been influenced by the parallel aim of undertaking DR that applies and further evolves the Roadmap. The Roadmap is presented in response to the dearth of detailed guidance available to novice researchers in Design Science Research (DSR), and though preliminary, it is expected to evolve and gradually be substantiated through experience of its application. A key distinction of the Roadmap from other DSR methods is its breadth of coverage of published DSR concepts and activities, i.e., its detail and scope. It represents a useful synthesis and integration of otherwise highly disparate DSR-related concepts.
Abstract:
This paper addresses the issue of analogical inference, and its potential role as the mediator of new therapeutic discoveries, by using disjunction operators based on quantum connectives to combine many potential reasoning pathways into a single search expression. In it, we extend our previous work in which we developed an approach to analogical retrieval using the Predication-based Semantic Indexing (PSI) model, which encodes both concepts and the relationships between them in high-dimensional vector space. As in our previous work, we leverage the ability of PSI to infer predicate pathways connecting two example concepts, in this case comprising known therapeutic relationships. For example, given that drug x TREATS disease z, we might infer the predicate pathway drug x INTERACTS WITH gene y ASSOCIATED WITH disease z, and use this pathway to search for drugs related to another disease in similar ways. As biological systems tend to be characterized by networks of relationships, we evaluate the ability of quantum-inspired operators to mediate inference and retrieval across multiple relations, by testing the ability of different approaches to recover known therapeutic relationships. In addition, we introduce a novel complex-vector implementation of PSI, based on Plate’s Circular Holographic Reduced Representations, which we utilize for all experiments in addition to the binary-vector approach we have applied in our previous research.
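A minimal sketch of the binding operations underlying a circular (complex-vector) representation in the spirit of Plate's Circular Holographic Reduced Representations may help: with unit-magnitude complex phases, binding is elementwise multiplication and unbinding is multiplication by the conjugate. The vectors, dimensionality, and the drug/disease names below are illustrative assumptions, not the PSI implementation itself:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 1024  # illustrative dimensionality

def random_phasor(d, rng):
    # Circular vector: unit-magnitude complex phases (Plate-style CHRR).
    return np.exp(1j * rng.uniform(-np.pi, np.pi, d))

def bind(a, b):
    # Binding = elementwise complex multiplication (phases add).
    return a * b

def unbind(bound, a):
    # Approximate inverse of binding: multiply by the conjugate.
    return bound * np.conj(a)

def sim(a, b):
    # Mean cosine of the phase differences; 1.0 for identical vectors,
    # near 0.0 for unrelated random phasors.
    return float(np.mean(np.cos(np.angle(a) - np.angle(b))))

TREATS = random_phasor(d, rng)
drug_x = random_phasor(d, rng)
disease_z = random_phasor(d, rng)

fact = bind(TREATS, disease_z)     # encode the relation TREATS disease_z
recovered = unbind(fact, TREATS)   # release disease_z from the binding
match = sim(recovered, disease_z)  # high similarity to the bound filler
```

Because the phasors have unit magnitude, unbinding here recovers the bound concept essentially exactly, which is the property that lets predicate pathways such as INTERACTS WITH ... ASSOCIATED WITH be composed and queried in vector space.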
Abstract:
Retinal angiogenesis is tightly regulated to meet oxygenation and nutritional requirements. In diseases such as proliferative diabetic retinopathy and neovascular age-related macular degeneration, uncontrolled angiogenesis can lead to blindness. Our goal is to better understand the molecular processes controlling retinal angiogenesis and discover novel drugs that inhibit retinal neovascularization. Phenotype-based chemical screens were performed using the ChemBridge Diverset™ library and inhibition of hyaloid vessel angiogenesis in Tg(fli1:EGFP) zebrafish. 2-[(E)-2-(Quinolin-2-yl)vinyl]phenol (quininib) robustly inhibits developmental angiogenesis at 4–10 μM in zebrafish and significantly inhibits angiogenic tubule formation in HMEC-1 cells, angiogenic sprouting in aortic ring explants, and retinal revascularization in oxygen-induced retinopathy mice. Quininib is well tolerated in zebrafish, human cell lines, and murine eyes. Profiling screens of 153 angiogenic and inflammatory targets revealed that quininib does not directly target VEGF receptors but antagonizes cysteinyl leukotriene receptors 1 and 2 (CysLT1–2) at micromolar IC50 values. In summary, quininib is a novel anti-angiogenic small-molecule CysLT receptor antagonist. Quininib inhibits angiogenesis in a range of cell and tissue systems, revealing novel physiological roles for CysLT signaling. Quininib has potential as a novel therapeutic agent to treat ocular neovascular pathologies and may complement current anti-VEGF biological agents.
Abstract:
Epothilones are macrocyclic bacterial natural products with potent microtubule-stabilizing and antiproliferative activity. They have served as successful lead structures for the development of several clinical candidates for anticancer therapy. However, the structural diversity of this group of clinical compounds is rather limited, as their structures show little divergence from the original natural product leads. Our own research has explored the question of whether epothilones can serve as a basis for the development of new structural scaffolds, or chemotypes, for microtubule stabilization that might enable the discovery of new generations of anticancer drugs. We have elaborated a series of epothilone-derived macrolactones whose overall structural features significantly deviate from those of the natural epothilone scaffold and thus define new structural families of microtubule-stabilizing agents. Key elements of our hypermodification strategy are the change of the natural epoxide geometry from cis to trans, the incorporation of a conformationally constrained side chain, the removal of the C3-hydroxyl group, and the replacement of C12 with nitrogen. So far, this approach has yielded analogs 30 and 40, the most advanced and most rigorously modified structures, both of which are potent antiproliferative agents with low nanomolar activity against several human cancer cell lines in vitro. The synthesis was achieved through a macrolactone-based strategy or a high-yielding RCM reaction. The 12-aza-epothilone ("azathilone" 40) may be considered a "non-natural" natural product that still retains most of the overall structural characteristics of a true natural product but is structurally unique, because it lies outside of the general scope of Nature's biosynthetic machinery for polyketide synthesis. Like natural epothilones, both 30 and 40 promote tubulin polymerization in vitro and, at the cellular level, induce cell cycle arrest in mitosis.
These facts indicate that cancer cell growth inhibition by these compounds is based on the same mechanistic underpinnings as those for natural epothilones. Interestingly, the 9,10-dehydro analog of 40 is significantly less active than the saturated parent compound, which is contrary to observations for natural epothilones B or D. This may point to differences in the bioactive conformations of N-acyl-12-aza-epothilones like 40 and natural epothilones. In light of their distinct structural features, combined with an epothilone-like (and taxol-like) in vitro biological profile, 30 and 40 can be considered as representative examples of new chemotypes for microtubule stabilization. As such, they may offer the same potential for pharmacological differentiation from the original epothilone leads as various newly discovered microtubule-stabilizing natural products with macrolactone structures, such as laulimalide, peloruside, or dictyostatin.
Abstract:
Advances in neural network language models have demonstrated that these models can effectively learn representations of word meaning. In this paper, we explore a variation of neural language models that can learn on concepts taken from structured ontologies and extracted from free text, rather than directly from terms in free text. This model is employed for the task of measuring semantic similarity between medical concepts, a task that is central to a number of techniques in medical informatics and information retrieval. The model is built with two medical corpora (journal abstracts and patient records) and empirically validated on two ground-truth datasets of human-judged concept pairs assessed by medical professionals. Empirically, our approach correlates closely with expert human assessors (≈ 0.9) and outperforms a number of state-of-the-art benchmarks for medical semantic similarity. The demonstrated superiority of this model for providing an effective semantic similarity measure is promising in that it may translate into effectiveness gains for techniques in medical information retrieval and medical informatics (e.g., query expansion and literature-based discovery).
Abstract:
Building and maintaining software are not easy tasks. However, thanks to advances in web technologies, a new paradigm is emerging in software development. The Service Oriented Architecture (SOA) is a relatively new approach that helps bridge the gap between business and IT and also helps systems remain flexible. However, there are still several challenges with SOA. As the number of available services grows, developers are faced with the problem of discovering the services they need. Public service repositories such as Programmable Web provide only limited search capabilities. Several mechanisms have been proposed to improve web service discovery by using semantics. However, most of these require manually tagging the services with concepts in an ontology. Adding semantic annotations is a non-trivial process that requires a certain skill-set from the annotator and also the availability of domain ontologies that include the concepts related to the topics of the service. These issues have prevented these mechanisms from becoming widespread. This thesis focuses on two main problems. First, to avoid the overhead of manually adding semantics to web services, several automatic methods to include semantics in the discovery process are explored. Although experimentation with some of these strategies has been conducted in the past, the results reported in the literature are mixed. Second, Wikipedia is explored as a general-purpose ontology. The benefit of using it as an ontology is assessed by comparing these semantics-based methods to classic term-based information retrieval approaches. The contribution of this research is significant because, to the best of our knowledge, a comprehensive analysis of the impact of using Wikipedia as a source of semantics in web service discovery does not exist.
The main output of this research is a web service discovery engine that implements these methods and a comprehensive analysis of the benefits and trade-offs of these semantics-based discovery approaches.
Abstract:
Due to the huge number of available web services, finding an appropriate Web service according to the requirements of a service consumer is still a challenge. Moreover, sometimes a single Web service is unable to fully satisfy the requirements of the service consumer. In such cases, combinations of multiple inter-related Web services can be utilised. This paper proposes a method that first utilises a semantic kernel model to find related services and then models these related Web services as nodes of a graph. An all-pairs shortest-path algorithm is applied to find the best compositions of Web services that are semantically related to the service consumer's requirement. Finally, both individual Web services and composite Web service compositions are recommended for a service request. Empirical evaluation confirms that the proposed method significantly improves the accuracy of service discovery in comparison to traditional keyword-based discovery methods.
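The graph-based composition step can be illustrated with a toy all-pairs shortest-path computation (Floyd–Warshall). The service names and edge weights below are invented for illustration; in the paper's setting an edge weight would come from the semantic kernel model, with lower weight meaning a better semantic match between one service's output and the next service's input:

```python
import math

# Hypothetical service graph; weights are illustrative semantic distances.
services = ["SearchFlights", "BookFlight", "EmailReceipt"]
INF = math.inf
dist = {u: {v: (0.0 if u == v else INF) for v in services} for u in services}
nxt = {u: {v: None for v in services} for u in services}

edges = [("SearchFlights", "BookFlight", 0.2),
         ("BookFlight", "EmailReceipt", 0.4),
         ("SearchFlights", "EmailReceipt", 0.9)]  # weak direct match
for u, v, w in edges:
    dist[u][v] = w
    nxt[u][v] = v

# Floyd-Warshall: all-pairs shortest paths over the service graph.
for k in services:
    for i in services:
        for j in services:
            if dist[i][k] + dist[k][j] < dist[i][j]:
                dist[i][j] = dist[i][k] + dist[k][j]
                nxt[i][j] = nxt[i][k]

def composition(u, v):
    # Reconstruct the cheapest service composition from u to v.
    if nxt[u][v] is None and u != v:
        return None
    path = [u]
    while u != v:
        u = nxt[u][v]
        path.append(u)
    return path

best = composition("SearchFlights", "EmailReceipt")
```

In this toy graph the two-service composition (cost 0.2 + 0.4 = 0.6) beats the weak direct edge (0.9), so the recommended result is the composite path rather than a single service.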
Abstract:
Frequent episode discovery is a popular framework in temporal data mining with many applications. Over the years, many different notions of the frequency of an episode have been proposed, along with different algorithms for episode discovery. In this paper, we present a unified view of all the apriori-based discovery methods for serial episodes under these different notions of frequency. Specifically, we present a unified view of the various frequency counting algorithms. We propose a generic counting algorithm of which all current algorithms are special cases. This unified view allows one to gain insights into the different frequencies, and we present quantitative relationships among them. Our unified view also helps in obtaining correctness proofs for the various counting algorithms, as we show here. It also aids in understanding and obtaining the anti-monotonicity properties satisfied by the various frequencies, the properties exploited by the candidate generation step of any apriori-based method. We also point out how our unified view of counting helps in generalizing the algorithm to count episodes with general partial orders.
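One of the frequency notions covered by such unified views, the non-overlapped occurrence count of a serial episode, can be sketched with a greedy single-pass counter. This is only an illustration of one special case under assumed definitions, not the paper's generic algorithm:

```python
def count_nonoverlapped(sequence, episode):
    # Count non-overlapped occurrences of a serial episode A -> B -> ...
    # in an event sequence, via a single left-to-right greedy scan.
    count, i = 0, 0
    for event in sequence:
        if event == episode[i]:
            i += 1                 # next event type of the episode seen
            if i == len(episode):
                count += 1         # one complete occurrence found
                i = 0              # restart: occurrences share no events
    return count

seq = list("ABXABYAB")
occurrences = count_nonoverlapped(seq, list("AB"))  # 3
```

Non-overlapped frequency is one of the notions with the anti-monotonicity property mentioned above: any subepisode occurs at least as often as the episode itself, which is what apriori-style candidate generation relies on.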
Abstract:
In this review piece, we survey the literature on the cost of equity capital implications of corporate disclosure and conservative accounting policy choice decisions, with the principal objective of providing insights into the design and methodological issues which underlie the empirical investigations. We begin with a review of the analytical studies most typically cited in the empirical research as providing a theoretical foundation. We then turn to consider literature that offers insights into the selection of proxies for each of our constructs of interest: cost of equity capital, disclosure quality, and accounting conservatism. As a final step, we review selected empirical studies to illustrate the relevant evidence found within the literature. Based on our review, we interpret the literature as providing the researcher with only limited direct guidance on the appropriate choice of measure for each of the constructs of interest. Further, we view the literature as raising questions about both the interpretation of empirical findings in the face of measurement concerns and the suitability of certain theoretical arguments to the research setting. Overall, perhaps the clearest message is that one of the most controversial and fundamental issues underlying the literature is the diversifiability or nondiversifiability of information effects.