314 resultados para databases and data mining


Relevância:

100.00% 100.00%

Publicador:

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Several authors stress the importance of data’s crucial foundation for operational, tactical and strategic decisions (e.g., Redman 1998, Tee et al. 2007). Data provides the basis for decision making as data collection and processing is typically associated with reducing uncertainty in order to make more effective decisions (Daft and Lengel 1986). While the first series of investments of Information Systems/Information Technology (IS/IT) into organizations improved data collection, restricted computational capacity and limited processing power created challenges (Simon 1960). Fifty years on, capacity and processing problems are increasingly less relevant; in fact, the opposite exists. Determining data relevance and usefulness is complicated by increased data capture and storage capacity, as well as continual improvements in information processing capability. As the IT landscape changes, businesses are inundated with ever-increasing volumes of data from both internal and external sources available on both an ad-hoc and real-time basis. More data, however, does not necessarily translate into more effective and efficient organizations, nor does it increase the likelihood of better or timelier decisions. This raises questions about what data managers require to assist their decision making processes.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Discovering proper search intents is a vi- tal process to return desired results. It is constantly a hot research topic regarding information retrieval in recent years. Existing methods are mainly limited by utilizing context-based mining, query expansion, and user profiling techniques, which are still suffering from the issue of ambiguity in search queries. In this pa- per, we introduce a novel ontology-based approach in terms of a world knowledge base in order to construct personalized ontologies for identifying adequate con- cept levels for matching user search intents. An iter- ative mining algorithm is designed for evaluating po- tential intents level by level until meeting the best re- sult. The propose-to-attempt approach is evaluated in a large volume RCV1 data set, and experimental results indicate a distinct improvement on top precision after compared with baseline models.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In the last few years we have observed a proliferation of approaches for clustering XML docu- ments and schemas based on their structure and content. The presence of such a huge amount of approaches is due to the different applications requiring the XML data to be clustered. These applications need data in the form of similar contents, tags, paths, structures and semantics. In this paper, we first outline the application contexts in which clustering is useful, then we survey approaches so far proposed relying on the abstract representation of data (instances or schema), on the identified similarity measure, and on the clustering algorithm. This presentation leads to draw a taxonomy in which the current approaches can be classified and compared. We aim at introducing an integrated view that is useful when comparing XML data clustering approaches, when developing a new clustering algorithm, and when implementing an XML clustering compo- nent. Finally, the paper moves into the description of future trends and research issues that still need to be faced.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Nowadays, everyone can effortlessly access a range of information on the World Wide Web (WWW). As information resources on the web continue to grow tremendously, it becomes progressively more difficult to meet high expectations of users and find relevant information. Although existing search engine technologies can find valuable information, however, they suffer from the problems of information overload and information mismatch. This paper presents a hybrid Web Information Retrieval approach allowing personalised search using ontology, user profile and collaborative filtering. This approach finds the context of user query with least user’s involvement, using ontology. Simultaneously, this approach uses time-based automatic user profile updating with user’s changing behaviour. Subsequently, this approach uses recommendations from similar users using collaborative filtering technique. The proposed method is evaluated with the FIRE 2010 dataset and manually generated dataset. Empirical analysis reveals that Precision, Recall and F-Score of most of the queries for many users are improved with proposed method.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The existing Collaborative Filtering (CF) technique that has been widely applied by e-commerce sites requires a large amount of ratings data to make meaningful recommendations. It is not directly applicable for recommending products that are not frequently purchased by users, such as cars and houses, as it is difficult to collect rating data for such products from the users. Many of the e-commerce sites for infrequently purchased products are still using basic search-based techniques whereby the products that match with the attributes given in the target user's query are retrieved and recommended to the user. However, search-based recommenders cannot provide personalized recommendations. For different users, the recommendations will be the same if they provide the same query regardless of any difference in their online navigation behaviour. This paper proposes to integrate collaborative filtering and search-based techniques to provide personalized recommendations for infrequently purchased products. Two different techniques are proposed, namely CFRRobin and CFAg Query. Instead of using the target user's query to search for products as normal search based systems do, the CFRRobin technique uses the products in which the target user's neighbours have shown interest as queries to retrieve relevant products, and then recommends to the target user a list of products by merging and ranking the returned products using the Round Robin method. The CFAg Query technique uses the products that the user's neighbours have shown interest in to derive an aggregated query, which is then used to retrieve products to recommend to the target user. Experiments conducted on a real e-commerce dataset show that both the proposed techniques CFRRobin and CFAg Query perform better than the standard Collaborative Filtering (CF) and the Basic Search (BS) approaches, which are widely applied by the current e-commerce applications. The CFRRobin and CFAg Query approaches also outperform the e- isting query expansion (QE) technique that was proposed for recommending infrequently purchased products.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

It is a big challenge to clearly identify the boundary between positive and negative streams. Several attempts have used negative feedback to solve this challenge; however, there are two issues for using negative relevance feedback to improve the effectiveness of information filtering. The first one is how to select constructive negative samples in order to reduce the space of negative documents. The second issue is how to decide noisy extracted features that should be updated based on the selected negative samples. This paper proposes a pattern mining based approach to select some offenders from the negative documents, where an offender can be used to reduce the side effects of noisy features. It also classifies extracted features (i.e., terms) into three categories: positive specific terms, general terms, and negative specific terms. In this way, multiple revising strategies can be used to update extracted features. An iterative learning algorithm is also proposed to implement this approach on RCV1, and substantial experiments show that the proposed approach achieves encouraging performance.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Data quality has become a major concern for organisations. The rapid growth in the size and technology of a databases and data warehouses has brought significant advantages in accessing, storing, and retrieving information. At the same time, great challenges arise with rapid data throughput and heterogeneous accesses in terms of maintaining high data quality. Yet, despite the importance of data quality, literature has usually condensed data quality into detecting and correcting poor data such as outliers, incomplete or inaccurate values. As a result, organisations are unable to efficiently and effectively assess data quality. Having an accurate and proper data quality assessment method will enable users to benchmark their systems and monitor their improvement. This paper introduces a granules mining for measuring the random degree of error data which will enable decision makers to conduct accurate quality assessment and allocate the most severe data, thereby providing an accurate estimation of human and financial resources for conducting quality improvement tasks.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The sky is falling because the much-vaunted mining ‘boom’ is heading for ‘bust’. The fear-mongering by politicians, the industry and the media has begun in earnest. On ABC TV's 7:30 program on 22 August 2012, Federal Opposition Leader Tony Abbott blamed the Minerals Resource Rent Tax and the Carbon Tax for making ‘a bad investment environment much, much worse’ for the mining industry. The following day, Australia's Resources and Energy Minister Martin Ferguson told us on ABC radio that ‘the resources boom is over’. This must be true because, remember, we were warned to ‘Get ready for the end of the boom’ (David Uren, Economics Editor for The Australian 19 May 2012) due to the ‘Australian resource boom losing steam’ (David Winning & Robb M. Stewart, Wall Street Journal 21 August 2012). Besides, there is ‘unarguable evidence’ that Australia's production costs are ‘too expensive’ and ‘too uncompetitive’: mining magnate Gina Rinehart said so in a YouTube video placed on the Sydney Mining Club's website on 5 September 2012. Can this really be so? What is happening to the mining boom and to the people who depend upon it?

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The interactive art system +-now is modelled on the openness of the natural world. Emergent shapes constitute a novel method for facilitating this openness. With the art system as an example, the relationship between openness and emergence is discussed. Lastly, artist reflections from the creation of the work are presented. These describe the nature of open systems and how they may be created.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Glass Pond is an interactive artwork designed to engender exploration and reflection through an intuitive, tangible interface and a simulation agent. It is being developed using iterative methods. A study has been conducted with the aim of illuminating user experience, interface, design, and performance issues.The paper describes the study methodology and process of data analysis including coding schemes for cognitive states and movements. Analysis reveals that exploration and reflection occurred as well as composing behaviours (unexpected). Results also show that participants interacted to varying degrees. Design discussion includes the artwork's (novel) interface and configuration.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Emergence is discussed in the context of a practice-based study of interactive art and a new taxonomy of emergence is proposed. The interactive art system ‘plus minus now’ is described and its relationship to emergence is discussed. ‘Plus minus now’ uses a novel method for instantiating emergent shapes. A preliminary investigation of this art system has been conducted and reveals the creation of temporal compositions by a participant. These temporal compositions and the emergent shapes are described using the taxonomy of emergence. Characteristics of emergent interactions and the implications of designing for them are discussed.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Sound tagging has been studied for years. Among all sound types, music, speech, and environmental sound are three hottest research areas. This survey aims to provide an overview about the state-of-the-art development in these areas.We discuss about the meaning of tagging in different sound areas at the beginning of the journey. Some examples of sound tagging applications are introduced in order to illustrate the significance of this research. Typical tagging techniques include manual, automatic, and semi-automatic approaches.After reviewing work in music, speech and environmental sound tagging, we compare them and state the research progress to date. Research gaps are identified for each research area and the common features and discriminations between three areas are discovered as well. Published datasets, tools used by researchers, and evaluation measures frequently applied in the analysis are listed. In the end, we summarise the worldwide distribution of countries dedicated to sound tagging research for years.