951 resultados para N-Gram Mutual Information
Resumo:
This chapter presents the preliminary findings of a qualitative study exploring people’s information experiences during the 2012 Queensland State election in Australia. Six residents of South East Queensland who were eligible to vote in the state election participated in a semi-structured interview. The interviews revealed five themes that depict participants’ information experience during the election: information sources, information flow, personal politics, party politics and sense making. Together these themes represent what is experienced as information, how information is experienced, as well as contextual aspects that were unique to voting in an election. The study outlined here is one in an emerging area of enquiry that has explored information experience as a research object. This study has revealed that people’s information experiences are rich, complex and dynamic, and that information experience as a construct of scholarly inquiry provides deep insights into the ways in which people relate to their information worlds. More studies exploring information experience within different contexts are needed to help develop our theoretical understanding of this important and emerging construct.
Resumo:
This thesis is an investigation of the media's representation of children and ICT. The study draws on moral panic theory and Queensland newspaper media, to identify the impact of newspaper reporting on the public's perceptions of young people and ICT.
Resumo:
This thesis considers how an information privacy system can and should develop in Libya. Currently, no information privacy system exists in Libya to protect individuals when their data is processed. This research reviews the main features of privacy law in several key jurisdictions in light of Libya's social, cultural, and economic context. The thesis identifies the basic principles that a Libyan privacy law must consider, including issues of scope, exceptions, principles, remedies, penalties, and the establishment of a legitimate data protection authority. This thesis concludes that Libya should adopt a strong information privacy law framework and highlights some of the considerations that will be relevant for the Libyan legislature.
Resumo:
Topic modelling, such as Latent Dirichlet Allocation (LDA), was proposed to generate statistical models to represent multiple topics in a collection of documents, which has been widely utilized in the fields of machine learning and information retrieval, etc. But its effectiveness in information filtering is rarely known. Patterns are always thought to be more representative than single terms for representing documents. In this paper, a novel information filtering model, Pattern-based Topic Model(PBTM) , is proposed to represent the text documents not only using the topic distributions at general level but also using semantic pattern representations at detailed specific level, both of which contribute to the accurate document representation and document relevance ranking. Extensive experiments are conducted to evaluate the effectiveness of PBTM by using the TREC data collection Reuters Corpus Volume 1. The results show that the proposed model achieves outstanding performance.
Resumo:
The need for native Information Systems (IS) theories has been discussed by several prominent scholars. Contributing to their conjectural discussion, this research moves towards theorizing IS success as a native theory for the discipline. Despite being one of the most cited scholarly works to-date, IS success of DeLone and McLean (1992) has been criticized by some for lacking focus on the theoretical approach. Following theory development frameworks, this study improves the theoretical standing of IS success by minimizing interaction and inconsistency. The empirical investigation of theorizing IS success includes 1396 respondents, gathered through six surveys and a case study. The respondents represent 70 organisations, multiple Information Systems, and both private and public sector organizations.
Resumo:
The Control Theory has provided a useful theoretical foundation for Information Systems development outsourcing (ISD-outsourcing) to examine the co-ordination between the client and the vendor. Recent research identified two control mechanisms: structural (structure of the control mode) and process (the process through which the control mode is enacted). Yet, the Control Theory research to-date does not describe the ways in which the two control mechanisms can be combined to ensure project success. Grounded in case study data of eight ISD-outsourcing projects, we derive three ‘control configurations’; i) aligned, ii) negotiated, and 3) self-managed, which describe the combinative patterns of structural and process control mechanisms within and across control modes.
Resumo:
This study explored the creation, dissemination and exchange of electronic word of mouth, in the form of product reviews and ratings of digital technology products. Based on 43 in-depth interviews and 500 responses to an online survey, it reveals a new communication model describing consumers' info-active and info-passive information search styles. The study delivers an in-depth understanding of consumers' attitudes towards current advertising tools and user-generated content, and points to new marketing techniques emerging in the online environment.
Resumo:
Most recommender systems attempt to use collaborative filtering, content-based filtering or hybrid approach to recommend items to new users. Collaborative filtering recommends items to new users based on their similar neighbours, and content-based filtering approach tries to recommend items that are similar to new users' profiles. The fundamental issues include how to profile new users, and how to deal with the over-specialization in content-based recommender systems. Indeed, the terms used to describe items can be formed as a concept hierarchy. Therefore, we aim to describe user profiles or information needs by using concepts vectors. This paper presents a new method to acquire user information needs, which allows new users to describe their preferences on a concept hierarchy rather than rating items. It also develops a new ranking function to recommend items to new users based on their information needs. The proposed approach is evaluated on Amazon book datasets. The experimental results demonstrate that the proposed approach can largely improve the effectiveness of recommender systems.
Resumo:
Big Data presents many challenges related to volume, whether one is interested in studying past datasets or, even more problematically, attempting to work with live streams of data. The most obvious challenge, in a ‘noisy’ environment such as contemporary social media, is to collect the pertinent information; be that information for a specific study, tweets which can inform emergency services or other responders to an ongoing crisis, or give an advantage to those involved in prediction markets. Often, such a process is iterative, with keywords and hashtags changing with the passage of time, and both collection and analytic methodologies need to be continually adapted to respond to this changing information. While many of the data sets collected and analyzed are preformed, that is they are built around a particular keyword, hashtag, or set of authors, they still contain a large volume of information, much of which is unnecessary for the current purpose and/or potentially useful for future projects. Accordingly, this panel considers methods for separating and combining data to optimize big data research and report findings to stakeholders. The first paper considers possible coding mechanisms for incoming tweets during a crisis, taking a large stream of incoming tweets and selecting which of those need to be immediately placed in front of responders, for manual filtering and possible action. The paper suggests two solutions for this, content analysis and user profiling. In the former case, aspects of the tweet are assigned a score to assess its likely relationship to the topic at hand, and the urgency of the information, whilst the latter attempts to identify those users who are either serving as amplifiers of information or are known as an authoritative source. Through these techniques, the information contained in a large dataset could be filtered down to match the expected capacity of emergency responders, and knowledge as to the core keywords or hashtags relating to the current event is constantly refined for future data collection. The second paper is also concerned with identifying significant tweets, but in this case tweets relevant to particular prediction market; tennis betting. As increasing numbers of professional sports men and women create Twitter accounts to communicate with their fans, information is being shared regarding injuries, form and emotions which have the potential to impact on future results. As has already been demonstrated with leading US sports, such information is extremely valuable. Tennis, as with American Football (NFL) and Baseball (MLB) has paid subscription services which manually filter incoming news sources, including tweets, for information valuable to gamblers, gambling operators, and fantasy sports players. However, whilst such services are still niche operations, much of the value of information is lost by the time it reaches one of these services. The paper thus considers how information could be filtered from twitter user lists and hash tag or keyword monitoring, assessing the value of the source, information, and the prediction markets to which it may relate. The third paper examines methods for collecting Twitter data and following changes in an ongoing, dynamic social movement, such as the Occupy Wall Street movement. It involves the development of technical infrastructure to collect and make the tweets available for exploration and analysis. A strategy to respond to changes in the social movement is also required or the resulting tweets will only reflect the discussions and strategies the movement used at the time the keyword list is created — in a way, keyword creation is part strategy and part art. In this paper we describe strategies for the creation of a social media archive, specifically tweets related to the Occupy Wall Street movement, and methods for continuing to adapt data collection strategies as the movement’s presence in Twitter changes over time. We also discuss the opportunities and methods to extract data smaller slices of data from an archive of social media data to support a multitude of research projects in multiple fields of study. The common theme amongst these papers is that of constructing a data set, filtering it for a specific purpose, and then using the resulting information to aid in future data collection. The intention is that through the papers presented, and subsequent discussion, the panel will inform the wider research community not only on the objectives and limitations of data collection, live analytics, and filtering, but also on current and in-development methodologies that could be adopted by those working with such datasets, and how such approaches could be customized depending on the project stakeholders.
Resumo:
Guaranteeing the quality of extracted features that describe relevant knowledge to users or topics is a challenge because of the large number of extracted features. Most popular existing term-based feature selection methods suffer from noisy feature extraction, which is irrelevant to the user needs (noisy). One popular method is to extract phrases or n-grams to describe the relevant knowledge. However, extracted n-grams and phrases usually contain a lot of noise. This paper proposes a method for reducing the noise in n-grams. The method first extracts more specific features (terms) to remove noisy features. The method then uses an extended random set to accurately weight n-grams based on their distribution in the documents and their terms distribution in n-grams. The proposed approach not only reduces the number of extracted n-grams but also improves the performance. The experimental results on Reuters Corpus Volume 1 (RCV1) data collection and TREC topics show that the proposed method significantly outperforms the state-of-art methods underpinned by Okapi BM25, tf*idf and Rocchio.
Resumo:
Disagreement within the global science community about the certainty and causes of climate change has led the general public to question what to believe and who to trust on matters related to this issue. This paper reports on qualitative research undertaken with Australian residents from two rural areas to explore their perceptions of climate change and trust in information providers. While overall, residents tended to agree that climate change is a reality, perceptions varied in terms of its causes and how best to address it. Politicians, government, and the media were described as untrustworthy sources of information about climate change, with independent scientists being the most trusted. The vested interests of information providers appeared to be a key reason for their distrust. The findings highlight the importance of improved transparency and consultation with the public when communicating information about climate change and related policies.
Resumo:
The overall aim of this research project was to provide a broader range of value propositions (beyond upfront traditional construction costs) that could transform both the demand side and supply side of the housing industry. The project involved gathering information about how building information is created, used and communicated and classifying building information, leading to the formation of an Information Flow Chart and Stakeholder Relationship Map. These were then tested via broad housing industry focus groups and surveys. The project revealed four key relationships that appear to operate in isolation to the whole housing sector and may have significant impact on the sustainability outcomes and life cycle costs of dwellings over their life cycle. It also found that although a lot of information about individual dwellings does already exist, this information is not coordinated or inventoried in any systematic manner and that national building information files of building passports would present value to a wide range of stakeholders.
Resumo:
Many mature term-based or pattern-based approaches have been used in the field of information filtering to generate users’ information needs from a collection of documents. A fundamental assumption for these approaches is that the documents in the collection are all about one topic. However, in reality users’ interests can be diverse and the documents in the collection often involve multiple topics. Topic modelling, such as Latent Dirichlet Allocation (LDA), was proposed to generate statistical models to represent multiple topics in a collection of documents, and this has been widely utilized in the fields of machine learning and information retrieval, etc. But its effectiveness in information filtering has not been so well explored. Patterns are always thought to be more discriminative than single terms for describing documents. However, the enormous amount of discovered patterns hinder them from being effectively and efficiently used in real applications, therefore, selection of the most discriminative and representative patterns from the huge amount of discovered patterns becomes crucial. To deal with the above mentioned limitations and problems, in this paper, a novel information filtering model, Maximum matched Pattern-based Topic Model (MPBTM), is proposed. The main distinctive features of the proposed model include: (1) user information needs are generated in terms of multiple topics; (2) each topic is represented by patterns; (3) patterns are generated from topic models and are organized in terms of their statistical and taxonomic features, and; (4) the most discriminative and representative patterns, called Maximum Matched Patterns, are proposed to estimate the document relevance to the user’s information needs in order to filter out irrelevant documents. Extensive experiments are conducted to evaluate the effectiveness of the proposed model by using the TREC data collection Reuters Corpus Volume 1. The results show that the proposed model significantly outperforms both state-of-the-art term-based models and pattern-based models
Resumo:
Much of what is written about digital technologies in preschool contexts focuses on young children’s acquisition of skills rather than their meaning-making during use of technologies. In this paper, we consider how the viewing of a YouTube video was used by a teacher and children to produce shared understandings about it. Conversation analysis of talk and interaction during the viewing of the video establishes some of the ways that individual accounts of events were produced for others and then endorsed as shared understandings. The analysis establishes how adults and children made use of verbal and embodied actions during interactions to produce shared understandings of the YouTube video, the events it recorded and written commentary about those events
Resumo:
Facilitated discussion with early childhood staff working with children and families affected by natural disasters in Queensland, Australia, raises issues regarding educational communication in emergencies. This paper reports on these discussions as ‘reflections on talk’. It examines discrepancies between the literature and staff talk, gaps in the literature, and the inaccessible style of some literature-demanded collaborative debate and information re-interpretation. Reframing of the discourse style was used to support staff de-briefing, mutual encouragement, and sharing of insights on promoting resilience in children and families. Formal investigation is required regarding effective emergency-situation talk between staff, as well as with children and families.