846 results for Open Information Extraction
Abstract:
Introduction In a connected world, youth are participating in digital content-creating communities. This paper describes teens' information practices in digital content creating and sharing communities. Method The research design was a constructivist grounded theory methodology. Seventeen interviews with eleven teens were collected, and their digital communities were observed over a two-year period. Analysis The data were analysed iteratively, through open and then focused coding, to describe teens' interactions with information. Emergent categories were shared with participants to confirm the conceptual categories. Focused coding provided connections between conceptual categories, resulting in the theory, which was also shared with participants for feedback. Results The paper posits a substantive theory of teens' information practices as they create and share content. It highlights that teens engage in the information actions of accessing, evaluating, and using information. They experienced information in five ways: participation, information, collaboration, process, and artefact. The intersection of enacting information actions and experiences of information resulted in five information practices: learning community, negotiating aesthetic, negotiating control, negotiating capacity, and representing knowledge. Conclusion This study contributes to our understanding of youth information actions, experiences, and practices. Further research into these communities might indicate which information practices are foundational to digital communities.
Abstract:
Term-based approaches can extract many features from text documents, but most of them include noise. Many popular text-mining strategies have been adapted to reduce noisy information in the extracted features; however, text-mining techniques still suffer from the low-frequency problem. The key issue is how to discover relevance features in text documents that fulfil user information needs. To address this issue, we propose a new method to extract specific features from user relevance feedback. The proposed approach includes two stages. The first stage extracts topics (or patterns) from text documents to focus on interesting topics. In the second stage, topics are deployed to lower-level terms to address the low-frequency problem and find specific terms. The specific terms are determined based on their appearances in relevance feedback and their distribution over topics or high-level patterns. We test the proposed method with extensive experiments on the Reuters Corpus Volume 1 dataset and TREC topics. Results show that the proposed approach significantly outperforms state-of-the-art models.
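A minimal Python sketch of the two-stage idea described above; the pattern mining (simple co-occurring term pairs), the deployment weighting, and all names are illustrative assumptions rather than the paper's exact formulation:

```python
from collections import Counter, defaultdict

# Toy sketch of the two-stage idea: mine frequent term sets ("topics")
# from relevant documents, then deploy them to lower-level terms.
# The weighting scheme is a hypothetical stand-in, not the paper's.

def extract_topics(docs, min_support=2):
    """Stage 1: treat frequently co-occurring term pairs as topics."""
    counts = Counter()
    for doc in docs:
        terms = sorted(set(doc))
        for i, a in enumerate(terms):
            for b in terms[i + 1:]:
                counts[(a, b)] += 1
    return {pat: c for pat, c in counts.items() if c >= min_support}

def deploy_to_terms(topics):
    """Stage 2: weight each term by its distribution over the topics
    that contain it, so specific low-frequency terms gain support."""
    weights = defaultdict(float)
    for pattern, support in topics.items():
        for term in pattern:
            weights[term] += support / len(pattern)
    return dict(weights)

relevant_docs = [
    ["mining", "pattern", "text"],
    ["pattern", "text", "noise"],
    ["mining", "text", "feature"],
]
topics = extract_topics(relevant_docs)
print(sorted(deploy_to_terms(topics).items(), key=lambda kv: -kv[1]))
```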
Abstract:
Many emerging economies are dangling the patent system to stimulate biotechnological innovation, with the ultimate premise that such innovation will improve their economic and social growth. The patent system mandates full disclosure of the patented invention in exchange for a temporary exclusive patent right. Recently, however, patent offices have fallen short of complying with this mandate, especially for genetic inventions. Most patent offices provide only static information about disclosed patent sequences, and some do not even keep track of sequence listing data in their own databases. The successful partnership of the QUT Library and Cambia exemplifies advocacy in open access, open innovation and user participation. The library extends its services to various departments within the university, and builds and encourages research networks to complement the skills needed to make a contribution in the real world.
Abstract:
Event report on the Open Access and Research 2013 conference, which focused on recent developments and the strategic advantages they bring to the research sector.
Abstract:
Enterprises, both public and private, have rapidly begun to exploit the benefits of enterprise resource planning (ERP) combined with business analytics and “open data sets”, which are often outside the control of the enterprise, to gain further efficiencies, build new service operations and increase business activity. In many cases, these business activities are based around relevant software systems hosted in a “cloud computing” environment. “Garbage in, garbage out”, or “GIGO”, is a term long used to describe problems of unqualified dependency on information systems, dating from the 1960s. A more pertinent variation arose sometime later, namely “garbage in, gospel out”, signifying that with large-scale information systems, such as ERP and open datasets used in a cloud environment, verifying the authenticity of the data sets used may be almost impossible, resulting in dependence upon questionable results. Illicit data set “impersonation” becomes a reality. At the same time, the ability to audit such results may be an important requirement, particularly in the public sector. This paper discusses the need for enhanced identity, reliability, authenticity and audit services, including naming and addressing services, in this emerging environment, and analyses some current technologies on offer that may be appropriate. However, severe limitations in addressing these requirements have been identified, and the paper proposes further research work in the area.
Abstract:
Enterprise resource planning (ERP) systems are rapidly being combined with “big data” analytics processes and publicly available “open data sets”, which usually lie outside the arena of the enterprise, to expand activity through better service to current clients and to identify new opportunities. Moreover, these activities are now largely based around relevant software systems hosted in a “cloud computing” environment. The over-50-year-old phrase expressing mistrust in computer systems, “garbage in, garbage out” or “GIGO”, describes the problems of unqualified and unquestioning dependency on information systems. A more relevant GIGO interpretation arose sometime later, namely “garbage in, gospel out”, signifying that with large-scale information systems based around ERP, open datasets and “big data” analytics, particularly in a cloud environment, verifying the authenticity and integrity of the data sets used may be almost impossible. In turn, this may easily result in decision making based upon questionable, unverifiable results. Illicit “impersonation” of, and modifications to, legitimate data sets may become a reality, while at the same time the ability to audit any derived results of analysis may be an important requirement, particularly in the public sector. The pressing need for enhanced identity, reliability, authenticity and audit services, including naming and addressing services, in this emerging environment is discussed in this paper, and some appropriate technologies currently on offer are examined. However, severe limitations in addressing the problems identified are found, and the paper proposes further necessary research work for the area. (Note: This paper is based on an earlier unpublished paper/presentation, “Identity, Addressing, Authenticity and Audit Requirements for Trust in ERP, Analytics and Big/Open Data in a ‘Cloud’ Computing Environment: A Review and Proposal”, presented to the Department of Accounting and IT, College of Management, National Chung Cheng University, 20 November 2013.)
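One concrete building block for the authenticity and audit services these two abstracts call for is integrity checking of a fetched open data set against a digest published by a trusted registry or naming service. A minimal Python sketch, assuming a hypothetical file path and a placeholder digest (neither is from the papers):

```python
import hashlib

# Placeholder digest, as would be published by a trusted registry
# (this value is the SHA-256 of an empty file, used only as a stand-in).
EXPECTED_SHA256 = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

def sha256_of(path, chunk_size=1 << 20):
    """Stream the file in chunks so large open data sets fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify(path, expected=EXPECTED_SHA256):
    """Reject a data set whose digest differs from the published one,
    flagging possible 'impersonation' of the legitimate data."""
    actual = sha256_of(path)
    if actual != expected:
        raise ValueError(f"{path}: digest mismatch - possible impersonation")
    return True
```

This only addresses integrity against a known digest; the identity, naming and audit services the papers discuss would additionally need to establish who published that digest and to log its use.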
Abstract:
In this article, Petia Wohed, Arthur H. M. ter Hofstede, Nick Russell, Birger Andersson, and Wil M. P. van der Aalst present the results of their examination of existing open source BPM systems. Their conclusions are illuminating for both open source developers and the user community. Read their article for the details of the study.
Abstract:
We identify relation completion (RC) as a recurring problem that is central to the success of novel big data applications such as Entity Reconstruction and Data Enrichment. Given a semantic relation, RC attempts to link entity pairs between two entity lists under the relation. To accomplish the RC goals, we propose to formulate search queries for each query entity α based on some auxiliary information, so as to detect its target entity β from the set of retrieved documents. For instance, a pattern-based method (PaRE) uses extracted patterns as the auxiliary information in formulating search queries. However, reliance on high-quality patterns may decrease the probability of finding suitable target entities. As an alternative, we propose the CoRE method, which uses context terms learned from the surroundings of a relation's expression as the auxiliary information in formulating queries. Experimental results on several real-world web data collections demonstrate that CoRE achieves much higher accuracy than PaRE for the purpose of RC.
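To make the contrast concrete, here is a toy Python sketch of context-term-based query formulation in the spirit of CoRE; the tokenisation, window size, stopword list and frequency ranking are illustrative assumptions, not the paper's algorithm:

```python
from collections import Counter

# Toy sketch: learn context terms surrounding a relation's expression,
# then use them as auxiliary information in a search query.

STOPWORDS = {"is", "the", "a", "an", "of", "in"}

def learn_context_terms(snippets, relation_word, window=3, top_k=1):
    """Count terms appearing within a window of the relation word and
    keep the most frequent ones as auxiliary query terms."""
    counts = Counter()
    for snippet in snippets:
        tokens = snippet.lower().split()
        for i, tok in enumerate(tokens):
            if tok == relation_word:
                context = tokens[max(0, i - window):i + window + 1]
                counts.update(t for t in context
                              if t != relation_word and t not in STOPWORDS)
    return [term for term, _ in counts.most_common(top_k)]

def formulate_query(query_entity, context_terms):
    """Combine the query entity with the learned context terms."""
    return f'"{query_entity}" ' + " ".join(context_terms)

snippets = [
    "Google officially acquired YouTube in 2006",
    "Facebook officially acquired Instagram in 2012",
]
terms = learn_context_terms(snippets, "acquired")
print(formulate_query("Microsoft", terms))  # "Microsoft" officially
```

A PaRE-style variant would instead attach a full extracted pattern (e.g. the exact phrase linking α and β) to the query, which is more precise but, as the abstract notes, can make suitable target entities harder to retrieve.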
Abstract:
The purpose of this book is to open a conversation on the idea of information experience, which we understand to be a complex, multidimensional engagement with information. In developing the book, we invited colleagues to propose a chapter on any aspect of information experience, for example conceptual, methodological or empirical. We invited them to express their interpretation of information experience and to contribute to the development of this concept. The book has thus become a vehicle for interested researchers and practitioners to explore their thinking around information experience, including relationships between information experience, learning experience, user experience and similar constructs. It represents a collective awareness of information experience in contemporary research and practice. Through this sharing of multiple perspectives, our insights into possible ways of interpreting information experience, and its relationship to other concepts in information research and practice, are enhanced. In this chapter, we introduce the idea of information experience. We also outline the book and its chapters, and bring together some emerging alternative views of and approaches to this important idea.
Abstract:
Organic compounds in Australian coal seam gas produced water (CSG water) are poorly understood despite their potential for environmental contamination. In this study, the presence of some organic substances is identified from government-held CSG water-quality data from the Bowen and Surat Basins, Queensland. These records revealed the presence of polycyclic aromatic hydrocarbons (PAHs) in 27% of CSG water samples from the Walloon Coal Measures at concentrations <1 µg/L, and these compounds most likely leached from in situ coals. PAHs identified from wells include naphthalene, phenanthrene, chrysene and dibenz[a,h]anthracene. In addition, the likelihood of coal-derived organic compounds leaching into groundwater is assessed through toxicity leaching experiments using coal rank and water chemistry as variables. These tests suggest that higher molecular weight PAHs (including benzo[a]pyrene) leach from higher-rank coals, whereas lower molecular weight PAHs leach at greater concentrations from lower-rank coals. Some of the identified organic compounds have carcinogenic or health-risk potential, but they are unlikely to be acutely toxic at the observed, almost negligible concentrations (largely due to the hydrophobicity of such compounds). Hence, this study will be useful to practitioners assessing CSG water-related environmental and health risks.
Abstract:
Text is the main method of communicating information in the digital age. Messages, blogs, news articles, reviews, and other opinionated information abound on the Internet. People commonly purchase products online and post their opinions about the purchased items. This feedback is displayed publicly to assist others with their purchasing decisions, creating the need for a mechanism to extract and summarise useful information and so enhance the decision-making process. Our contribution is to improve the accuracy of extraction by combining techniques from three major areas, namely data mining, natural language processing, and ontologies. The proposed framework sequentially mines product aspects and users' opinions, groups representative aspects by similarity, and generates an output summary. This paper focuses on the task of extracting product aspects and users' opinions, extracting all possible aspects and opinions from reviews using natural language processing, an ontology, and frequent “tag” sets. When compared with an existing baseline model, the proposed framework yielded promising results.
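A toy Python sketch of the aspect/opinion extraction and summarisation steps described above; a tiny opinion lexicon and an adjacency heuristic stand in for the framework's NLP, ontology and frequent "tag"-set machinery, so everything here is an illustrative assumption:

```python
import re
from collections import defaultdict

# Tiny opinion lexicon standing in for a sentiment resource.
OPINION_WORDS = {"great", "poor", "sharp", "slow", "excellent", "weak"}

def extract_pairs(review):
    """Pair each opinion word with the noun that follows it, a crude
    stand-in for adjective-noun patterns over parsed review text."""
    tokens = re.findall(r"[a-z]+", review.lower())
    pairs = []
    for i, tok in enumerate(tokens[:-1]):
        if tok in OPINION_WORDS:
            pairs.append((tokens[i + 1], tok))  # (aspect, opinion)
    return pairs

def summarise(reviews):
    """Group opinions under each mined aspect to form the summary."""
    summary = defaultdict(list)
    for review in reviews:
        for aspect, opinion in extract_pairs(review):
            summary[aspect].append(opinion)
    return dict(summary)

reviews = ["Great battery but slow screen.", "Excellent battery, poor speaker."]
print(summarise(reviews))
# {'battery': ['great', 'excellent'], 'screen': ['slow'], 'speaker': ['poor']}
```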
Abstract:
Media architecture’s combination of the digital and the physical can trigger, enhance, and amplify urban experiences. In this paper, we examine how to bring about and foster more open and participatory approaches to engaging communities through media architecture, by identifying novel ways to put some of the creative process into the hands of laypeople. We review technical, spatial, and social aspects of DIY phenomena with a view to better understanding maker cultures, communities, and practices. We synthesise our findings and ask if and how media architects, as a community of practice, can encourage the ‘open-sourcing’ of information and tools, allowing laypeople not only to participate but to become active instigators of change in their own right. We argue that enabling true DIY practices in media architecture may increase citizen control. Seeking design strategies that foster DIY approaches, we propose five areas for further work and investigation. The paper raises many questions, indicating ample room for further research into DIY Media Architecture.
Abstract:
It is well established that the traditional taxonomy and nomenclature of Chironomidae rely on adult males, whose usually characteristic genitalia provide evidence of species distinction. In the early days, some names were based on female adults of variable distinctiveness, but females are difficult to identify (Ekrem et al. 2010) and many of these names remain dubious. In Russia especially, a system based on larval morphology grew in parallel with the conventional adult-based system. The systems became reconciled through the studies that underlay the production of the Holarctic generic keys to Chironomidae, commencing notably with the larval volume (Wiederholm, 1983). Ever since Thienemann’s pioneering studies, it has been evident that the pupa, notably the cast skins (exuviae), provides a wealth of features that can aid identification (e.g. Wiederholm, 1986). Furthermore, pupae can readily be associated with name-bearing adults when a pharate (‘cloaked’) adult stage is visible within the pupa. Association of larvae with the name-bearing later stages has been much more difficult, time-consuming and fraught with risk of failure. Yet it is identification of the larval stage that most applied researchers need, owing to the value of the family’s immature stages in aquatic monitoring for water quality, although the pupal stage also has advocates (reviewed by Sinclair & Gresens, 2008). Few use the adult stage for such purposes, as adults’ provenance and association with the water body can be verified only by emergence trapping, and sampling of adults lies outside regular aquatic monitoring protocols.
Abstract:
A critical stage in open-pit mining is determining the optimal extraction sequence of blocks, which has a significant impact on mining profitability. In this paper, a more comprehensive block sequencing optimisation model is developed for open-pit mines. In the model, the material characteristics of blocks, grade control, excavators and block sequencing are investigated and integrated to maximise the short-term benefit of mining. Several case studies are modelled and solved with the CPLEX MIP and CP engines. Numerical investigations are presented to illustrate and validate the proposed methodology.
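For a flavour of this kind of model, here is a tiny precedence-constrained block-selection MIP in Python. It uses the open-source PuLP/CBC toolchain rather than the paper's CPLEX engines, and the blocks, profits, capacity and precedence data are invented for illustration:

```python
from pulp import LpProblem, LpMaximize, LpVariable, LpBinary, lpSum

# Invented toy data: block profits, pit-slope precedence (a block can
# only be mined if its predecessor is), and a per-period capacity.
blocks = {"b1": 10, "b2": 40, "b3": 25}      # block -> profit
precedence = [("b2", "b1"), ("b3", "b1")]    # (block, required predecessor)
capacity = 2                                  # max blocks this period

# Binary decision: mine block b in this period or not.
x = LpVariable.dicts("mine", blocks, cat=LpBinary)

prob = LpProblem("block_sequencing", LpMaximize)
prob += lpSum(blocks[b] * x[b] for b in blocks)   # maximise short-term benefit
prob += lpSum(x[b] for b in blocks) <= capacity    # excavator capacity limit
for b, pred in precedence:                         # precedence constraints
    prob += x[b] <= x[pred]

prob.solve()
print({b: int(x[b].value()) for b in blocks})      # e.g. {'b1': 1, 'b2': 1, 'b3': 0}
```

The paper's model additionally handles material characteristics, grade control and excavator assignment over multiple periods; this sketch only shows the core profit-maximising selection under precedence and capacity.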