843 resultados para Spam email filtering
Resumo:
Information Overload and Mismatch are two fundamental problems affecting the effectiveness of information filtering systems. Even though both term-based and patternbased approaches have been proposed to address the problems of overload and mismatch, neither of these approaches alone can provide a satisfactory solution to address these problems. This paper presents a novel two-stage information filtering model which combines the merits of term-based and pattern-based approaches to effectively filter sheer volume of information. In particular, the first filtering stage is supported by a novel rough analysis model which efficiently removes a large number of irrelevant documents, thereby addressing the overload problem. The second filtering stage is empowered by a semantically rich pattern taxonomy mining model which effectively fetches incoming documents according to the specific information needs of a user, thereby addressing the mismatch problem. The experimental results based on the RCV1 corpus show that the proposed twostage filtering model significantly outperforms the both termbased and pattern-based information filtering models.
Resumo:
The following report considers a number of key challenges the Australian Federal Government faces in designing the regulatory framework and the reach of its planned mandatory internet filter. Previous reports on the mandatory filtering scheme have concentrated on the filtering technologies, their efficacy, their cost and their likely impact on the broadband environment. This report focuses on the scope and the nature of content that is likely to be caught by the proposed filter and on identifying associated public policy implications.
Resumo:
This paper presents a framework for performing real-time recursive estimation of landmarks’ visual appearance. Imaging data in its original high dimensional space is probabilistically mapped to a compressed low dimensional space through the definition of likelihood functions. The likelihoods are subsequently fused with prior information using a Bayesian update. This process produces a probabilistic estimate of the low dimensional representation of the landmark visual appearance. The overall filtering provides information complementary to the conventional position estimates which is used to enhance data association. In addition to robotics observations, the filter integrates human observations in the appearance estimates. The appearance tracks as computed by the filter allow landmark classification. The set of labels involved in the classification task is thought of as an observation space where human observations are made by selecting a label. The low dimensional appearance estimates returned by the filter allow for low cost communication in low bandwidth sensor networks. Deployment of the filter in such a network is demonstrated in an outdoor mapping application involving a human operator, a ground and an air vehicle.
Resumo:
Record 8 of 29
Resumo:
This paper presents a modified approach to evaluate access control policy similarity and dissimilarity based on the proposal by Lin et al. (2007). Lin et al.'s policy similarity approach is intended as a filter stage which identifies similar XACML policies that can be analysed further using more computationally demanding techniques based on model checking or logical reasoning. This paper improves the approach of computing similarity of Lin et al. and also proposes a mechanism to calculate a dissimilarity score by identifying related policies that are likely to produce different access decisions. Departing from the original algorithm, the modifications take into account the policy obligation, rule or policy combining algorithm and the operators between attribute name and value. The algorithms are useful in activities involving parties from multiple security domains such as secured collaboration or secured task distribution. The algorithms allow various comparison options for evaluating policies while retaining control over the restriction level via a number of thresholds and weight factors.
Resumo:
It is a big challenge to clearly identify the boundary between positive and negative streams for information filtering systems. Several attempts have used negative feedback to solve this challenge; however, there are two issues for using negative relevance feedback to improve the effectiveness of information filtering. The first one is how to select constructive negative samples in order to reduce the space of negative documents. The second issue is how to decide noisy extracted features that should be updated based on the selected negative samples. This paper proposes a pattern mining based approach to select some offenders from the negative documents, where an offender can be used to reduce the side effects of noisy features. It also classifies extracted features (i.e., terms) into three categories: positive specific terms, general terms, and negative specific terms. In this way, multiple revising strategies can be used to update extracted features. An iterative learning algorithm is also proposed to implement this approach on the RCV1 data collection, and substantial experiments show that the proposed approach achieves encouraging performance and the performance is also consistent for adaptive filtering as well.
Resumo:
This paper presents a novel two-stage information filtering model which combines the merits of term-based and pattern- based approaches to effectively filter sheer volume of information. In particular, the first filtering stage is supported by a novel rough analysis model which efficiently removes a large number of irrelevant documents, thereby addressing the overload problem. The second filtering stage is empowered by a semantically rich pattern taxonomy mining model which effectively fetches incoming documents according to the specific information needs of a user, thereby addressing the mismatch problem. The experiments have been conducted to compare the proposed two-stage filtering (T-SM) model with other possible "term-based + pattern-based" or "term-based + term-based" IF models. The results based on the RCV1 corpus show that the T-SM model significantly outperforms other types of "two-stage" IF models.
Resumo:
Ethernet is a key component of the standards used for digital process buses in transmission substations, namely IEC 61850 and IEEE Std 1588-2008 (PTPv2). These standards use multicast Ethernet frames that can be processed by more than one device. This presents some significant engineering challenges when implementing a sampled value process bus due to the large amount of network traffic. A system of network traffic segregation using a combination of Virtual LAN (VLAN) and multicast address filtering using managed Ethernet switches is presented. This includes VLAN prioritisation of traffic classes such as the IEC 61850 protocols GOOSE, MMS and sampled values (SV), and other protocols like PTPv2. Multicast address filtering is used to limit SV/GOOSE traffic to defined subsets of subscribers. A method to map substation plant reference designations to multicast address ranges is proposed that enables engineers to determine the type of traffic and location of the source by inspecting the destination address. This method and the proposed filtering strategy simplifies future changes to the prioritisation of network traffic, and is applicable to both process bus and station bus applications.
Resumo:
Despite many incidents about fake online consumer reviews have been reported, very few studies have been conducted to date to examine the trustworthiness of online consumer reviews. One of the reasons is the lack of an effective computational method to separate the untruthful reviews (i.e., spam) from the legitimate ones (i.e., ham) given the fact that prominent spam features are often missing in online reviews. The main contribution of our research work is the development of a novel review spam detection method which is underpinned by an unsupervised inferential language modeling framework. Another contribution of this work is the development of a high-order concept association mining method which provides the essential term association knowledge to bootstrap the performance for untruthful review detection. Our experimental results confirm that the proposed inferential language model equipped with high-order concept association knowledge is effective in untruthful review detection when compared with other baseline methods.