Biblioteca Digital

An information filtering (IF) system monitors an incoming document stream to find the documents that match the information needs specified by the user profiles. To learn to use the user profiles effectively is one of the most challenging tasks when developing an IF system. With the document selection criteria better defined based on the users’ needs, filtering large streams of information can be more efficient and effective. To learn the user profiles, term-based approaches have been widely used in the IF community because of their simplicity and directness. Term-based approaches are relatively well established. However, these approaches have problems when dealing with polysemy and synonymy, which often lead to an information overload problem. Recently, pattern-based approaches (or Pattern Taxonomy Models (PTM) [160]) have been proposed for IF by the data mining community. These approaches are better at capturing sematic information and have shown encouraging results for improving the effectiveness of the IF system. On the other hand, pattern discovery from large data streams is not computationally efficient. Also, these approaches had to deal with low frequency pattern issues. The measures used by the data mining technique (for example, “support” and “confidences”) to learn the profile have turned out to be not suitable for filtering. They can lead to a mismatch problem. This thesis uses the rough set-based reasoning (term-based) and pattern mining approach as a unified framework for information filtering to overcome the aforementioned problems. This system consists of two stages - topic filtering and pattern mining stages. The topic filtering stage is intended to minimize information overloading by filtering out the most likely irrelevant information based on the user profiles. A novel user-profiles learning method and a theoretical model of the threshold setting have been developed by using rough set decision theory. The second stage (pattern mining) aims at solving the problem of the information mismatch. This stage is precision-oriented. A new document-ranking function has been derived by exploiting the patterns in the pattern taxonomy. The most likely relevant documents were assigned higher scores by the ranking function. Because there is a relatively small amount of documents left after the first stage, the computational cost is markedly reduced; at the same time, pattern discoveries yield more accurate results. The overall performance of the system was improved significantly. The new two-stage information filtering model has been evaluated by extensive experiments. Tests were based on the well-known IR bench-marking processes, using the latest version of the Reuters dataset, namely, the Reuters Corpus Volume 1 (RCV1). The performance of the new two-stage model was compared with both the term-based and data mining-based IF models. The results demonstrate that the proposed information filtering system outperforms significantly the other IF systems, such as the traditional Rocchio IF model, the state-of-the-art term-based models, including the BM25, Support Vector Machines (SVM), and Pattern Taxonomy Model (PTM).

Veja mais

In the Rough : In-Field evaluation of an autonomous vehicle for golf course maintenance

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Maintenance is a time consuming and expensive task for any golf course or driving range manager. For a golf course the primary tasks are grass mowing and maintenance (fertilizer and herbicide spreading), while for a driving range mowing, maintenance and ball collection are required. All these tasks require an operator to drive a vehicle along paths which are generally predefined. This paper presents some preliminary in-field tsting results for an automated tractor vehicle performing golf ball collection on an actual driving range, and mowing on difficult unstructured terrain.

Veja mais

A Two-stage information filtering based on rough decision rule and pattern mining

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Information Overload and Mismatch are two fundamental problems affecting the effectiveness of information filtering systems. Even though both term-based and patternbased approaches have been proposed to address the problems of overload and mismatch, neither of these approaches alone can provide a satisfactory solution to address these problems. This paper presents a novel two-stage information filtering model which combines the merits of term-based and pattern-based approaches to effectively filter sheer volume of information. In particular, the first filtering stage is supported by a novel rough analysis model which efficiently removes a large number of irrelevant documents, thereby addressing the overload problem. The second filtering stage is empowered by a semantically rich pattern taxonomy mining model which effectively fetches incoming documents according to the specific information needs of a user, thereby addressing the mismatch problem. The experimental results based on the RCV1 corpus show that the proposed twostage filtering model significantly outperforms the both termbased and pattern-based information filtering models.

Veja mais

Rough sets based reasoning and pattern mining for a two-stage information filtering system

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents a novel two-stage information filtering model which combines the merits of term-based and pattern- based approaches to effectively filter sheer volume of information. In particular, the first filtering stage is supported by a novel rough analysis model which efficiently removes a large number of irrelevant documents, thereby addressing the overload problem. The second filtering stage is empowered by a semantically rich pattern taxonomy mining model which effectively fetches incoming documents according to the specific information needs of a user, thereby addressing the mismatch problem. The experiments have been conducted to compare the proposed two-stage filtering (T-SM) model with other possible "term-based + pattern-based" or "term-based + term-based" IF models. The results based on the RCV1 corpus show that the T-SM model significantly outperforms other types of "two-stage" IF models.

Veja mais

Multiview point cloud kernels for semisupervised learning [Lecture Notes]

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In semisupervised learning (SSL), a predictive model is learn from a collection of labeled data and a typically much larger collection of unlabeled data. These paper presented a framework called multi-view point cloud regularization (MVPCR), which unifies and generalizes several semisupervised kernel methods that are based on data-dependent regularization in reproducing kernel Hilbert spaces (RKHSs). Special cases of MVPCR include coregularized least squares (CoRLS), manifold regularization (MR), and graph-based SSL. An accompanying theorem shows how to reduce any MVPCR problem to standard supervised learning with a new multi-view kernel.

Veja mais

Rough fuzzy control of SVC for power system stability enhancement

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents a new approach to the design of a rough fuzzy controller for the control loop of the SVC (static VAR system) in a two area power system for stability enhancement with particular emphasis on providing effective damping for oscillatory instabilities. The performances of the rough fuzzy and the conventional fuzzy controller are compared with that of the conventional PI controller for a variety of transient disturbances, highlighting the effectiveness of the rough fuzzy controller in damping the inter-area oscillations. The effect of the rough fuzzy controller in improving the CCT (critical clearing time) of the two area system is elaborated in this paper as well.

Veja mais

Rough Game

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Short Story About football Rugby League and City Country Clashes. Written to explore, examine and attempt to subvert generic conventions identified in PhD research. Good football fiction exists and it is plentiful.

Veja mais

Calculation of dose kernels for radiotherapy applications using the GEANT 4 Monte Carlo toolkit

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Dose kernels may be used to calculate dose distributions in radiotherapy (as described by Ahnesjo et al., 1999). Their calculation requires use of Monte Carlo methods, usually by forcing interactions to occur at a point. The Geant4 Monte Carlo toolkit provides a capability to force interactions to occur in a particular volume. We have modified this capability and created a Geant4 application to calculate dose kernels in cartesian, cylindrical, and spherical scoring systems. The simulation considers monoenergetic photons incident at the origin of a 3 m x 3 x 9 3 m water volume. Photons interact via compton, photo-electric, pair production, and rayleigh scattering. By default, Geant4 models photon interactions by sampling a physical interaction length (PIL) for each process. The process returning the smallest PIL is then considered to occur. In order to force the interaction to occur within a given length, L_FIL, we scale each PIL according to the formula: PIL_forced = L_FIL 9 (1 - exp(-PIL/PILo)) where PILo is a constant. This ensures that the process occurs within L_FIL, whilst correctly modelling the relative probability of each process. Dose kernels were produced for an incident photon energy of 0.1, 1.0, and 10.0 MeV. In order to benchmark the code, dose kernels were also calculated using the EGSnrc Edknrc user code. Identical scoring systems were used; namely, the collapsed cone approach of the Edknrc code. Relative dose difference images were then produced. Preliminary results demonstrate the ability of the Geant4 application to reproduce the shape of the dose kernels; median relative dose differences of 12.6, 5.75, and 12.6 % were found for an incident photon energy of 0.1, 1.0, and 10.0 MeV respectively.

Veja mais

Rough set based approach to text classification

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Textual document set has become an important and rapidly growing information source in the web. Text classification is one of the crucial technologies for information organisation and management. Text classification has become more and more important and attracted wide attention of researchers from different research fields. In this paper, many feature selection methods, the implement algorithms and applications of text classification are introduced firstly. However, because there are much noise in the knowledge extracted by current data-mining techniques for text classification, it leads to much uncertainty in the process of text classification which is produced from both the knowledge extraction and knowledge usage, therefore, more innovative techniques and methods are needed to improve the performance of text classification. It has been a critical step with great challenge to further improve the process of knowledge extraction and effectively utilization of the extracted knowledge. Rough Set decision making approach is proposed to use Rough Set decision techniques to more precisely classify the textual documents which are difficult to separate by the classic text classification methods. The purpose of this paper is to give an overview of existing text classification technologies, to demonstrate the Rough Set concepts and the decision making approach based on Rough Set theory for building more reliable and effective text classification framework with higher precision, to set up an innovative evaluation metric named CEI which is very effective for the performance assessment of the similar research, and to propose a promising research direction for addressing the challenging problems in text classification, text mining and other relative fields.

Veja mais

Supplementary information : weighted tree kernels for sequence analysis

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Genomic sequences are fundamentally text documents, admitting various representations according to need and tokenization. Gene expression depends crucially on binding of enzymes to the DNA sequence at small, poorly conserved binding sites, limiting the utility of standard pattern search. However, one may exploit the regular syntactic structure of the enzyme's component proteins and the corresponding binding sites, framing the problem as one of detecting grammatically correct genomic phrases. In this paper we propose new kernels based on weighted tree structures, traversing the paths within them to capture the features which underpin the task. Experimentally, we and that these kernels provide performance comparable with state of the art approaches for this problem, while offering significant computational advantages over earlier methods. The methods proposed may be applied to a broad range of sequence or tree-structured data in molecular biology and other domains.

Veja mais

Rough Seas

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Favel Parrett’s second novel, When the Night Comes, opens with its teenage protagonist Isla lying awake in her bunk on a night ferry to Tasmania in the mid-1980s, ‘waiting for the rough seas’. Her younger brother sleeps beside her, and her distracted, emotionally distant mother – the kind of woman who is ‘always sitting places by herself in the night’ – is smoking on deck. Together, the three are weathering the roiling overnight passage in order to escape a violent past and make a new life in Hobart. The rough seas the novel goes on to navigate are, as one might expect, both literal and metaphorical...

Veja mais

926 resultados para Rough Kernels

Filtro por publicador