34 resultados para Knowledge Discovery Tools
Resumo:
Automatic keyword or keyphrase extraction is concerned with assigning keyphrases to documents based on words from within the document. Previous studies have shown that in a significant number of cases author-supplied keywords are not appropriate for the document to which they are attached. This can either be because they represent what the author believes the paper is about not what it actually is, or because they include keyphrases which are more classificatory than explanatory e.g., “University of Poppleton” instead of “Knowledge Discovery in Databases”. Thus, there is a need for a system that can generate appropriate and diverse range of keyphrases that reflect the document. This paper proposes a solution that examines the synonyms of words and phrases in the document to find the underlying themes, and presents these as appropriate keyphrases. The primary method explores taking n-grams of the source document phrases, and examining the synonyms of these, while the secondary considers grouping outputs by their synonyms. The experiments undertaken show the primary method produces good results and that the secondary method produces both good results and potential for future work.
Resumo:
Automatic keyword or keyphrase extraction is concerned with assigning keyphrases to documents based on words from within the document. Previous studies have shown that in a significant number of cases author-supplied keywords are not appropriate for the document to which they are attached. This can either be because they represent what the author believes a paper is about not what it actually is, or because they include keyphrases which are more classificatory than explanatory e.g., “University of Poppleton” instead of “Knowledge Discovery in Databases”. Thus, there is a need for a system that can generate an appropriate and diverse range of keyphrases that reflect the document. This paper proposes two possible solutions that examine the synonyms of words and phrases in the document to find the underlying themes, and presents these as appropriate keyphrases. Using three different freely available thesauri, the work undertaken examines two different methods of producing keywords and compares the outcomes across multiple strands in the timeline. The primary method explores taking n-grams of the source document phrases, and examining the synonyms of these, while the secondary considers grouping outputs by their synonyms. The experiments undertaken show the primary method produces good results and that the secondary method produces both good results and potential for future work. In addition, the different qualities of the thesauri are examined and it is concluded that the more entries in a thesaurus, the better it is likely to perform. The age of the thesaurus or the size of each entry does not correlate to performance.
Resumo:
In the recent years, the area of data mining has been experiencing considerable demand for technologies that extract knowledge from large and complex data sources. There has been substantial commercial interest as well as active research in the area that aim to develop new and improved approaches for extracting information, relationships, and patterns from large datasets. Artificial neural networks (NNs) are popular biologically-inspired intelligent methodologies, whose classification, prediction, and pattern recognition capabilities have been utilized successfully in many areas, including science, engineering, medicine, business, banking, telecommunication, and many other fields. This paper highlights from a data mining perspective the implementation of NN, using supervised and unsupervised learning, for pattern recognition, classification, prediction, and cluster analysis, and focuses the discussion on their usage in bioinformatics and financial data analysis tasks. © 2012 Wiley Periodicals, Inc.
Resumo:
Information systems integration becomes critical in enhancing organisational competitiveness through effective use of information resource provided by the whole host of information systems. Information systems integration in its nature is a process of bringing about the capability of communication and information exchange between systems; while interoperability, often as the result of systems integration, is such a capability. However currently there is a lack of theoretical foundation for representation and measure of the interoperability in organisations. Organisational semiotics provides a theoretical foundation for systems interoperability. A notion of ‘semiotic interoperability’ is proposed in this paper as a paradigm, guiding systems integration and measuring degree of interoperability, covering aspects from physical properties, transmission structure of signs, placing emphasis on communicating meaning, intention to social consequence of information.
Resumo:
In this article, we review the state-of-the-art techniques in mining data streams for mobile and ubiquitous environments. We start the review with a concise background of data stream processing, presenting the building blocks for mining data streams. In a wide range of applications, data streams are required to be processed on small ubiquitous devices like smartphones and sensor devices. Mobile and ubiquitous data mining target these applications with tailored techniques and approaches addressing scarcity of resources and mobility issues. Two categories can be identified for mobile and ubiquitous mining of streaming data: single-node and distributed. This survey will cover both categories. Mining mobile and ubiquitous data require algorithms with the ability to monitor and adapt the working conditions to the available computational resources. We identify the key characteristics of these algorithms and present illustrative applications. Distributed data stream mining in the mobile environment is then discussed, presenting the Pocket Data Mining framework. Mobility of users stimulates the adoption of context-awareness in this area of research. Context-awareness and collaboration are discussed in the Collaborative Data Stream Mining, where agents share knowledge to learn adaptive accurate models.
Resumo:
Twitter is both a micro-blogging service and a platform for public conversation. Direct conversation is facilitated in Twitter through the use of @’s (mentions) and replies. While the conversational element of Twitter is of particular interest to the marketing sector, relatively few data-mining studies have focused on this area. We analyse conversations associated with reciprocated mentions that take place in a data-set consisting of approximately 4 million tweets collected over a period of 28 days that contain at least one mention. We ignore tweet content and instead use the mention network structure and its dynamical properties to identify and characterise Twitter conversations between pairs of users and within larger groups. We consider conversational balance, meaning the fraction of content contributed by each party. The goal of this work is to draw out some of the mechanisms driving conversation in Twitter, with the potential aim of developing conversational models.
Resumo:
Flexibility of information systems (IS) have been studied to improve the adaption in support of the business agility as the set of capabilities to compete more effectively and adapt to rapid changes in market conditions (Glossary of business agility terms, 2003). However, most of work on IS flexibility has been limited to systems architecture, ignoring the analysis of interoperability as a part of flexibility from the requirements. This paper reports a PhD project, which proposes an approach to develop IS with flexibility features, considering some challenges of flexibility in small and medium enterprises (SMEs) such as the lack of interoperability and the agility of their business. The motivation of this research are the high prices of IS in developing countries and the usefulness of organizational semiotics to support the analysis of requirements for IS. (Liu, 2005).
Resumo:
More data will be produced in the next five years than in the entire history of human kind, a digital deluge that marks the beginning of the Century of Information. Through a year-long consultation with UK researchers, a coherent strategy has been developed, which will nurture Century-of-Information Research (CIR); it crystallises the ideas developed by the e-Science Directors' Forum Strategy Working Group. This paper is an abridged version of their latest report which can be found at: http://wikis.nesc.ac.uk/escienvoy/Century_of_Information_Research_Strategy which also records the consultation process and the affiliations of the authors. This document is derived from a paper presented at the Oxford e-Research Conference 2008 and takes into account suggestions made in the ensuing panel discussion. The goals of the CIR Strategy are to facilitate the growth of UK research and innovation that is data and computationally intensive and to develop a new culture of 'digital-systems judgement' that will equip research communities, businesses, government and society as a whole, with the skills essential to compete and prosper in the Century of Information. The CIR Strategy identifies a national requirement for a balanced programme of coordination, research, infrastructure, translational investment and education to empower UK researchers, industry, government and society. The Strategy is designed to deliver an environment which meets the needs of UK researchers so that they can respond agilely to challenges, can create knowledge and skills, and can lead new kinds of research. It is a call to action for those engaged in research, those providing data and computational facilities, those governing research and those shaping education policies. The ultimate aim is to help researchers strengthen the international competitiveness of the UK research base and increase its contribution to the economy. The objectives of the Strategy are to better enable UK researchers across all disciplines to contribute world-leading fundamental research; to accelerate the translation of research into practice; and to develop improved capabilities, facilities and context for research and innovation. It envisages a culture that is better able to grasp the opportunities provided by the growing wealth of digital information. Computing has, of course, already become a fundamental tool in all research disciplines. The UK e-Science programme (2001-06)—since emulated internationally—pioneered the invention and use of new research methods, and a new wave of innovations in digital-information technologies which have enabled them. The Strategy argues that the UK must now harness and leverage its own, plus the now global, investment in digital-information technology in order to spread the benefits as widely as possible in research, education, industry and government. Implementing the Strategy would deliver the computational infrastructure and its benefits as envisaged in the Science & Innovation Investment Framework 2004-2014 (July 2004), and in the reports developing those proposals. To achieve this, the Strategy proposes the following actions: support the continuous innovation of digital-information research methods; provide easily used, pervasive and sustained e-Infrastructure for all research; enlarge the productive research community which exploits the new methods efficiently; generate capacity, propagate knowledge and develop skills via new curricula; and develop coordination mechanisms to improve the opportunities for interdisciplinary research and to make digital-infrastructure provision more cost effective. To gain the best value for money strategic coordination is required across a broad spectrum of stakeholders. A coherent strategy is essential in order to establish and sustain the UK as an international leader of well-curated national data assets and computational infrastructure, which is expertly used to shape policy, support decisions, empower researchers and to roll out the results to the wider benefit of society. The value of data as a foundation for wellbeing and a sustainable society must be appreciated; national resources must be more wisely directed to the collection, curation, discovery, widening access, analysis and exploitation of these data. Every researcher must be able to draw on skills, tools and computational resources to develop insights, test hypotheses and translate inventions into productive use, or to extract knowledge in support of governmental decision making. This foundation plus the skills developed will launch significant advances in research, in business, in professional practice and in government with many consequent benefits for UK citizens. The Strategy presented here addresses these complex and interlocking requirements.
Resumo:
This paper aims to introduce a knowledge-based managemental prototype entitled Eþ for environmental-conscious construction relied on an integration of current environmental management tools in construction area. The overall objective of developing the Eþ prototype is to facilitate selectively reusing the retrievable knowledge in construction engineering and management areas assembled from previous projects for the best practice in environmental-conscious construction. The methodologies adopted in previous and ongoing research related to the development of the Eþ belong to the operations research area and the information technology area, including literature review, questionnaire survey and interview, statistical analysis, system analysis and development, experimental research and simulation, and so on. The content presented in this paper includes an advanced Eþ prototype, a comprehensive review of environmental management tools integrated to the Eþ prototype, and an experimental case study of the implementation of the Eþ prototype. It is expected that the adoption and implementation of the Eþ prototype can effectively facilitate contractors to improve their environmental performance in the lifecycle of projectbased construction and to reduce adverse environmental impacts due to the deployment of various engineering and management processes at each construction stage.
Resumo:
This article considers how visual practices are used to manage knowledge in project-based work. It compares project-based work in a capital goods manufacturer and an architectural firm. Visual representations are used extensively in both cases, but the nature of visual practice differs significantly between the two. The research explores the kinds of knowledge that are (and aren't) developed and made visible in strategizing and planning activities. For example, whereas the emphasis of project-based work in the former firm is on exploitation of knowledge and it visualizes its project context largely in commercial and processual terms, the emphasis in the latter is on exploration and it uses a wide range of visual materials to understand physical interdependencies across the project boundary. We contend particular kinds of visual tools can help project teams step between exploration and exploitation within a project, and articulate the types of representations, foci of attention and patterns of interaction involved. The findings suggest that business managers can make more deliberate choices about how knowledge is made visible, and can change visual practice to align the project with exploring and exploiting opportunities. It raises the question: What don't you see within your organization? The work contributes to academic debates about managing through projects, strategising and organizing, while the focus on visual representation disrupts the tacit-codified dichotomy in the broad debate on knowledge and learning, and highlights the craft skills central to strategizing and organizing.
Resumo:
There has been a clear lack of common data exchange semantics for inter-organisational workflow management systems where the research has mainly focused on technical issues rather than language constructs. This paper presents the neutral data exchanges semantics required for the workflow integration within the AXAEDIS framework and presents the mechanism for object discovery from the object repository where little or no knowledge about the object is available. The paper also presents workflow independent integration architecture with the AXAEDIS Framework.
Resumo:
Routine computer tasks are often difficult for older adult computer users to learn and remember. People tend to learn new tasks by relating new concepts to existing knowledge. However, even for 'basic' computer tasks there is little, if any, existing knowledge on which older adults can base their learning. This paper investigates a custom file management interface that was designed to aid discovery and learnability by providing interface objects that are familiar to the user. A study was conducted which examined the differences between older and younger computer users when undertaking routine file management tasks using the standard Windows desktop as compared with the custom interface. Results showed that older adult computer users requested help more than ten times as often as younger users when using a standard windows/mouse configuration, made more mistakes and also required significantly more confirmations than younger users. The custom interface showed improvements over standard Windows/mouse, with fewer confirmations and less help being required. Hence, there is potential for an interface that closely mimics the real world to improve computer accessibility for older adults, aiding self-discovery and learnability.
Resumo:
There are a number of challenges associated with managing knowledge and information in construction organizations delivering major capital assets. These include the ever-increasing volumes of information, losing people because of retirement or competitors, the continuously changing nature of information, lack of methods on eliciting useful knowledge, development of new information technologies and changes in management and innovation practices. Existing tools and methodologies for valuing intangible assets in fields such as engineering, project management and financial, accounting, do not address fully the issues associated with the valuation of information and knowledge. Information is rarely recorded in a way that a document can be valued, when either produced or subsequently retrieved and re-used. In addition there is a wealth of tacit personal knowledge which, if codified into documentary information, may prove to be very valuable to operators of the finished asset or future designers. This paper addresses the problem of information overload and identifies the differences between data, information and knowledge. An exploratory study was conducted with a leading construction consultant examining three perspectives (business, project management and document management) by structured interviews and specifically how to value information in practical terms. Major challenges in information management are identified. An through-life Information Evaluation methodology (IEM) is presented to reduce information overload and to make the information more valuable in the future.
Resumo:
This paper presents a hierarchical clustering method for semantic Web service discovery. This method aims to improve the accuracy and efficiency of the traditional service discovery using vector space model. The Web service is converted into a standard vector format through the Web service description document. With the help of WordNet, a semantic analysis is conducted to reduce the dimension of the term vector and to make semantic expansion to meet the user’s service request. The process and algorithm of hierarchical clustering based semantic Web service discovery is discussed. Validation is carried out on the dataset.