882 results for knowlede discovery
Abstract:
Building and maintaining software are not easy tasks. However, thanks to advances in web technologies, a new paradigm is emerging in software development. The Service Oriented Architecture (SOA) is a relatively new approach that helps bridge the gap between business and IT and also helps systems remain flexible. However, there are still several challenges with SOA. As the number of available services grows, developers are faced with the problem of discovering the services they need. Public service repositories such as Programmable Web provide only limited search capabilities. Several mechanisms have been proposed to improve web service discovery by using semantics. However, most of these require manually tagging the services with concepts in an ontology. Adding semantic annotations is a non-trivial process that requires a certain skill set from the annotator and also the availability of domain ontologies that include the concepts related to the topics of the service. These issues have prevented these mechanisms from becoming widespread. This thesis focuses on two main problems. First, to avoid the overhead of manually adding semantics to web services, several automatic methods to include semantics in the discovery process are explored. Although experimentation with some of these strategies has been conducted in the past, the results reported in the literature are mixed. Second, Wikipedia is explored as a general-purpose ontology. The benefit of using it as an ontology is assessed by comparing these semantics-based methods to classic term-based information retrieval approaches. The contribution of this research is significant because, to the best of our knowledge, a comprehensive analysis of the impact of using Wikipedia as a source of semantics in web service discovery does not exist.
The main output of this research is a web service discovery engine that implements these methods and a comprehensive analysis of the benefits and trade-offs of these semantics-based discovery approaches.
Abstract:
This study investigates whether and how a firm’s ownership and corporate governance affect its timeliness of price discovery, that is, the speed with which value-relevant information is incorporated into the stock price. Using panel data comprising 1,138 Australian firm-year observations from 2001 to 2008, we predict and find a non-linear relationship between ownership concentration and the timeliness of price discovery. We test the identity of the largest shareholder and find that only firms with a family as the largest shareholder exhibit faster price discovery. There is no evidence that the presence of a second largest shareholder materially affects the timeliness of price discovery. Although we find a positive association between corporate governance quality and the timeliness of price discovery, as expected, there is no interaction effect between the largest shareholding and corporate governance in relation to the timeliness of price discovery. Further tests show no evidence of severe endogeneity problems in our study.
Abstract:
This article outlines the impact that a conspiracy of silence and denial of difference has had on some adopted and donor-conceived persons who have been lied to or misled about their origins. Factors discussed include deceit, expressed as a central secret which undermines the fabric of a family and, through distortion, mystifies communication processes; the shock of discovery, which is often accidental, and the associated sense of betrayal when this occurs; and a series of losses, for example of kinship, medical history, culture and agency, which result in having to rebuild personal identity. By providing those affected with a voice, validation and vindication, healing can begin. Any feelings of disregard, of betrayal of trust, of anger, frustration, sorrow or loss need to be regarded as real, expected and, above all, a valid reaction to what has occurred. The author is a 'late discoverer' of her adoption and draws on information from her doctoral research on the same topic, which was completed in 2012.
Abstract:
Some children adopted under the now discredited period of closed adoption were never told of their adoptive status until it was revealed to them in adulthood. Yet to date, this ‘late-discovery’ experience has received little research attention. Now a new generation of ‘late discoverers’ is emerging as a result of (heterosexual couple) donor insemination (DI) practices. This study of 25 late-discovery participants of either adoptive or (heterosexual couple) DI offspring status reveals ethical concerns particular to the lateness of discovery. Most of the participants were Australian, with the remainder from the UK, USA and Canada. All were asked to give an ‘open’ account of their experience, with four themes or suggestions provided on request. These accounts were added to those available in relevant publications. The analysis employed a hermeneutic phenomenological methodology and all accounts were analysed using an ethical perspective developed by Walker (2006, 2007). The main themes that emerged were: disrupted personal autonomy, betrayal of deep levels of trust and feelings of injustice and diminished self-worth. The lack of recognition of concerns particular to late discovery has resulted in late discoverers (i) feeling unable to regain a sense of personal control, (ii) significantly disrupted relationships with those closest to them and others, including community and institutions, and (iii) feelings of diminished value and self-worth.
Abstract:
Cross-Lingual Link Discovery (CLLD) is a new problem in Information Retrieval. The aim is to automatically identify meaningful and relevant hypertext links between documents in different languages. This is particularly helpful in knowledge discovery if a multi-lingual knowledge base is sparse in one language or another, or the topical coverage in each language is different; such is the case with Wikipedia. Techniques for identifying new and topically relevant cross-lingual links are a current topic of interest at NTCIR where the CrossLink task has been running since the 2011 NTCIR-9. This paper presents the evaluation framework for benchmarking algorithms for cross-lingual link discovery evaluated in the context of NTCIR-9. This framework includes topics, document collections, assessments, metrics, and a toolkit for pooling, assessment, and evaluation. The assessments are further divided into two separate sets: manual assessments performed by human assessors; and automatic assessments based on links extracted from Wikipedia itself. Using this framework we show that manual assessment is more robust than automatic assessment in the context of cross-lingual link discovery.
Abstract:
This paper presents an overview of the NTCIR-10 Cross-lingual Link Discovery (CrossLink-2) task. For the task, we continued using the evaluation framework developed for the NTCIR-9 CrossLink-1 task. Overall, recommended links were evaluated at two levels (file-to-file and anchor-to-file), and system performance was evaluated with three metrics: LMAP, R-Prec and P@N.
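The three metrics named above follow standard information retrieval definitions, and a minimal sketch may make them concrete. It assumes a ranked list of recommended link targets per topic and a set of relevant targets; the function names and the toy data are illustrative only and are not taken from the CrossLink evaluation toolkit. LMAP is treated here as the mean of per-topic average precision.

```python
def precision_at_n(ranked, relevant, n):
    """P@N: fraction of the top-n recommended links that are relevant."""
    return sum(1 for link in ranked[:n] if link in relevant) / n


def r_precision(ranked, relevant):
    """R-Prec: precision at rank R, where R is the number of relevant links."""
    r = len(relevant)
    return precision_at_n(ranked, relevant, r) if r else 0.0


def average_precision(ranked, relevant):
    """Average of the precision values at each rank where a relevant link appears."""
    hits, total = 0, 0.0
    for rank, link in enumerate(ranked, start=1):
        if link in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """Mean of average precision over (ranked, relevant) pairs, one per topic."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Toy topic (hypothetical data): four recommended targets, three relevant ones.
ranked = ["a", "b", "c", "d"]
relevant = {"a", "c", "e"}
print(precision_at_n(ranked, relevant, 2))
print(r_precision(ranked, relevant))
print(average_precision(ranked, relevant))
```

The same functions apply at either evaluation level; only what counts as a "link" (a file pair or an anchor-to-file pair) changes.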
Abstract:
To enhance the therapeutic efficacy and reduce the adverse effects of traditional Chinese medicine, practitioners often prescribe combinations of plant species and/or minerals, called formulae. Unfortunately, the working mechanisms of most of these compounds are difficult to determine and thus remain unknown. In an attempt to address the benefits of formulae based on current biomedical approaches, we analyzed the components of Yinchenhao Tang, a classical formula that has been shown to be clinically effective for treating hepatic injury syndrome. The three principal components of Yinchenhao Tang are Artemisia annua L., Gardenia jasminoides Ellis, and Rheum palmatum L., whose major active ingredients are 6,7-dimethylesculetin (D), geniposide (G), and rhein (R), respectively. To determine the mechanisms underlying the efficacy of this formula, we conducted a systematic analysis of the therapeutic effects of the DGR compound using immunohistochemistry, biochemistry, metabolomics, and proteomics. Here, we report that the DGR combination exerts a more robust therapeutic effect than any one or two of the three individual compounds by hitting multiple targets in a rat model of hepatic injury. Thus, DGR synergistically causes intensified dynamic changes in metabolic biomarkers, regulates molecular networks through target proteins, has a synergistic/additive effect, and activates both intrinsic and extrinsic pathways.
Abstract:
This thesis is a study of the automatic discovery of text features for describing user information needs. It presents an innovative data-mining approach that discovers useful knowledge from both relevance and non-relevance feedback information. The proposed approach can largely reduce noise in discovered patterns and significantly improve the performance of text mining systems. This study provides a promising method for the study of Data Mining and Web Intelligence.
Abstract:
Organisations presently engage in what are termed Global Business Transformation Projects (GBTPs) to consolidate, innovate, transform and restructure their processes and business strategies while undergoing fundamental change. Culture plays an important role in global business transformation projects, as these involve people of different cultural backgrounds and span countries, industries and disciplinary boundaries. Nevertheless, there is scant empirical research on how culture is conceptualised beyond national and organisational cultures, and on how culture is to be taken into account and dealt with within global business transformation projects. This research is situated in a business context and discovers a theory that aids in describing and dealing with culture. It draws on the lived experiences of thirty-two senior management practitioners, reporting on more than sixty-one global business transformation projects in which they were actively involved. The research method is qualitative and interpretive and applies a grounded theory approach, with rich data generated through interviews. In addition, vignettes were developed to illustrate the derived theoretical models. The findings from this study contribute to knowledge in multiple ways. First, the study provides a holistic account of global business transformation projects that describes the construct of culture through the elements of culture types, cultural differences and cultural diversity. A typology of culture types has been developed which enlarges the view of culture beyond national and organisational culture to include an industry culture, a professional service firm culture and a 'theme' culture. The amalgamation of the culture types instantiated in a global business transformation project comprises its project culture.
Second, the empirically grounded process for managing culture in global business transformation projects integrates the stages of recognition, understanding and management, as well as enablement, providing a roadmap for dealing with culture in global business transformation projects. Third, this study identified contextual variables of global business transformation projects, which provide the means of describing the environment in which global business transformation projects are situated, influence the construct of culture and inform the process for managing culture. Fourth, the contribution to the research method is the positioning of interview research as a strategy for data generation and the detailed documentation of applying grounded theory to discover theory.
Abstract:
This paper evaluates the efficiency of a number of popular corpus-based distributional models in performing discovery on very large document sets, including online collections. Literature-based discovery is the process of identifying previously unknown connections from text, often published literature, that could lead to the development of new techniques or technologies. Literature-based discovery has attracted growing research interest ever since Swanson's serendipitous discovery of the therapeutic effects of fish oil on Raynaud's disease in 1986. The successful application of distributional models in automating the identification of indirect associations underpinning literature-based discovery has been extensively demonstrated in the medical domain. However, we wish to investigate the computational complexity of distributional models for literature-based discovery on much larger document collections, as they may provide computationally tractable solutions to tasks including predicting future disruptive innovations. In this paper we perform a computational complexity analysis on four successful corpus-based distributional models to evaluate their fit for such tasks. Our results indicate that corpus-based distributional models that store their representations in fixed dimensions provide superior efficiency on literature-based discovery tasks.
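To illustrate what "storing representations in fixed dimensions" can mean, the sketch below uses random indexing, a well-known fixed-dimension distributional technique. The abstract does not name the four models it analyses, so this choice is an assumption made purely for illustration, as are the constants DIM and NONZERO and the toy corpus. Each term receives a sparse random index vector; a term's context vector is the sum of the index vectors of the terms it co-occurs with, so its size does not grow with the vocabulary.

```python
import math
import random
from collections import defaultdict

DIM = 512      # fixed vector size, independent of vocabulary size (assumed value)
NONZERO = 8    # number of +1/-1 entries per index vector (assumed value)


def index_vector(rng):
    """Sparse ternary random vector assigned once to each term."""
    vec = [0.0] * DIM
    for pos in rng.sample(range(DIM), NONZERO):
        vec[pos] = rng.choice((-1.0, 1.0))
    return vec


def build_context_vectors(docs, seed=0):
    """Sum, for each term, the index vectors of the terms it co-occurs with."""
    rng = random.Random(seed)
    index = {}
    for doc in docs:
        for term in doc:
            if term not in index:
                index[term] = index_vector(rng)
    context = defaultdict(lambda: [0.0] * DIM)
    for doc in docs:
        for term in doc:
            for other in doc:
                if other != term:
                    ctx = context[term]
                    for i, value in enumerate(index[other]):
                        ctx[i] += value
    return context


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy corpus: "fish_oil" and "raynaud" never co-occur directly,
# but both co-occur with "blood_viscosity" (an indirect association).
docs = [
    ["fish_oil", "blood_viscosity"],
    ["blood_viscosity", "raynaud"],
    ["solar", "panel"],
]
vectors = build_context_vectors(docs)
print(cosine(vectors["fish_oil"], vectors["raynaud"]))
print(cosine(vectors["fish_oil"], vectors["solar"]))
```

In this toy corpus the two terms that share an intermediate term end up with near-identical context vectors, while the unrelated term does not. Memory per term stays at DIM values however many documents are added, which is the property the efficiency argument relies on.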
Abstract:
In vivo small molecules are necessary intermediates in numerous critical metabolic pathways and biological processes associated with many essential biological functions and events. There is growing evidence that MS-based metabolomics is emerging as a powerful tool to facilitate the discovery of functional small molecules that can improve our understanding of development, infection, nutrition, disease, toxicity, drug therapeutics, gene modifications and host-pathogen interaction from metabolic perspectives. However, further progress must still be made in MS-based metabolomics because of shortcomings in current technologies and knowledge. This technique-driven review aims to explore the discovery of in vivo functional small molecules facilitated by MS-based metabolomics and to highlight the analytic capabilities and promising applications of this discovery strategy. Moreover, the biological significance of the discovery of in vivo functional small molecules in different biological contexts is also examined from a metabolic perspective.
Abstract:
Guaranteeing the quality of extracted features that describe relevant knowledge to users or topics is a challenge because of the large number of extracted features. Most popular existing term-based feature selection methods suffer from extracting noisy features, that is, features irrelevant to the user's needs. One popular method is to extract phrases or n-grams to describe the relevant knowledge. However, extracted n-grams and phrases usually contain a lot of noise. This paper proposes a method for reducing the noise in n-grams. The method first extracts more specific features (terms) to remove noisy features. The method then uses an extended random set to accurately weight n-grams based on their distribution in the documents and the distribution of their terms within n-grams. The proposed approach not only reduces the number of extracted n-grams but also improves the performance. The experimental results on the Reuters Corpus Volume 1 (RCV1) data collection and TREC topics show that the proposed method significantly outperforms state-of-the-art methods underpinned by Okapi BM25, tf*idf and Rocchio.
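As a point of reference for the baselines mentioned above, the sketch below extracts n-grams and weights them with a plain tf*idf scheme. It is not the extended random set weighting proposed in the paper, whose details the abstract does not give; whitespace tokenisation, the idf form log(N/df), the function names and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-token sequences in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf_ngrams(docs, n=2):
    """Per-document weights: term frequency times log(N / document frequency)."""
    per_doc = [Counter(ngrams(doc.lower().split(), n)) for doc in docs]
    doc_freq = Counter()
    for grams in per_doc:
        doc_freq.update(grams.keys())
    total_docs = len(docs)
    return [
        {gram: tf * math.log(total_docs / doc_freq[gram]) for gram, tf in grams.items()}
        for grams in per_doc
    ]


# Toy documents (hypothetical data).
docs = [
    "data mining finds patterns",
    "text mining finds patterns",
    "data mining scales",
]
for weights in tfidf_ngrams(docs):
    print(sorted(weights.items(), key=lambda item: -item[1]))
```

An n-gram that occurs in every document receives weight zero under this scheme, and one confined to a single document receives the highest weight, which shows why frequency-based weighting alone can promote rare but uninformative n-grams.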
Abstract:
Event report on the Open Access and Research 2013 conference, which focused on recent developments and the strategic advantages they bring to the research sector.
Abstract:
Automated process discovery techniques aim at extracting process models from information system logs. Existing techniques in this space are effective when applied to relatively small or regular logs, but generate spaghetti-like and sometimes inaccurate models when confronted with logs with high variability. In previous work, trace clustering has been applied in an attempt to reduce the size and complexity of automatically discovered process models. The idea is to split the log into clusters and to discover one model per cluster. This leads to a collection of process models, each one representing a variant of the business process, as opposed to an all-encompassing model. Still, models produced in this way may exhibit unacceptably high complexity and low fitness. In this setting, this paper presents a two-way divide-and-conquer process discovery technique, wherein the discovered process models are split on the one hand by variants and on the other hand hierarchically using subprocess extraction. Splitting is performed in a controlled manner in order to achieve user-defined complexity or fitness thresholds. Experiments on real-life logs show that the technique produces collections of models substantially smaller than those extracted by applying existing trace clustering techniques, while allowing the user to control the fitness of the resulting models.
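The basic trace clustering idea described above (split the log, then discover one model per cluster) can be sketched in a few lines. The sketch makes strong simplifying assumptions that are not from the paper: traces are grouped by the set of activities they contain, a directly-follows graph stands in for a discovered process model, and the number of edges stands in for model complexity. It does not cover the paper's subprocess extraction or its threshold-driven splitting.

```python
from collections import defaultdict


def directly_follows(traces):
    """Set of (a, b) pairs where activity b immediately follows a in some trace."""
    edges = set()
    for trace in traces:
        edges.update(zip(trace, trace[1:]))
    return edges


def cluster_by_activity_set(traces):
    """Group traces that contain exactly the same set of activities."""
    clusters = defaultdict(list)
    for trace in traces:
        clusters[frozenset(trace)].append(trace)
    return list(clusters.values())


# Toy event log (hypothetical data): each trace is one case's activity sequence.
log = [
    ["register", "check", "approve", "pay"],
    ["register", "approve", "check", "pay"],
    ["register", "check", "reject"],
    ["register", "check", "reject"],
]

print("whole log:", len(directly_follows(log)), "edges")
for cluster in cluster_by_activity_set(log):
    print(len(cluster), "traces:", len(directly_follows(cluster)), "edges")
```

On this toy log, each per-cluster graph has fewer edges than the graph built from the whole log, which is the effect trace clustering aims for. A controlled variant would keep splitting only those clusters whose graph still exceeds a chosen size limit.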