979 resultados para Filtering systems


Relevância:

30.00% 30.00%

Publicador:

Resumo:

The explosive growth of the World-Wide-Web and the emergence of ecommerce are the major two factors that have led to the development of recommender systems (Resnick and Varian, 1997). The main task of recommender systems is to learn from users and recommend items (e.g. information, products or books) that match the users’ personal preferences. Recommender systems have been an active research area for more than a decade. Many different techniques and systems with distinct strengths have been developed to generate better quality recommendations. One of the main factors that affect recommenders’ recommendation quality is the amount of information resources that are available to the recommenders. The main feature of the recommender systems is their ability to make personalised recommendations for different individuals. However, for many ecommerce sites, it is difficult for them to obtain sufficient knowledge about their users. Hence, the recommendations they provided to their users are often poor and not personalised. This information insufficiency problem is commonly referred to as the cold-start problem. Most existing research on recommender systems focus on developing techniques to better utilise the available information resources to achieve better recommendation quality. However, while the amount of available data and information remains insufficient, these techniques can only provide limited improvements to the overall recommendation quality. In this thesis, a novel and intuitive approach towards improving recommendation quality and alleviating the cold-start problem is attempted. This approach is enriching the information resources. It can be easily observed that when there is sufficient information and knowledge base to support recommendation making, even the simplest recommender systems can outperform the sophisticated ones with limited information resources. Two possible strategies are suggested in this thesis to achieve the proposed information enrichment for recommenders: • The first strategy suggests that information resources can be enriched by considering other information or data facets. Specifically, a taxonomy-based recommender, Hybrid Taxonomy Recommender (HTR), is presented in this thesis. HTR exploits the relationship between users’ taxonomic preferences and item preferences from the combination of the widely available product taxonomic information and the existing user rating data, and it then utilises this taxonomic preference to item preference relation to generate high quality recommendations. • The second strategy suggests that information resources can be enriched simply by obtaining information resources from other parties. In this thesis, a distributed recommender framework, Ecommerce-oriented Distributed Recommender System (EDRS), is proposed. The proposed EDRS allows multiple recommenders from different parties (i.e. organisations or ecommerce sites) to share recommendations and information resources with each other in order to improve their recommendation quality. Based on the results obtained from the experiments conducted in this thesis, the proposed systems and techniques have achieved great improvement in both making quality recommendations and alleviating the cold-start problem.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

An information filtering (IF) system monitors an incoming document stream to find the documents that match the information needs specified by the user profiles. To learn to use the user profiles effectively is one of the most challenging tasks when developing an IF system. With the document selection criteria better defined based on the users’ needs, filtering large streams of information can be more efficient and effective. To learn the user profiles, term-based approaches have been widely used in the IF community because of their simplicity and directness. Term-based approaches are relatively well established. However, these approaches have problems when dealing with polysemy and synonymy, which often lead to an information overload problem. Recently, pattern-based approaches (or Pattern Taxonomy Models (PTM) [160]) have been proposed for IF by the data mining community. These approaches are better at capturing sematic information and have shown encouraging results for improving the effectiveness of the IF system. On the other hand, pattern discovery from large data streams is not computationally efficient. Also, these approaches had to deal with low frequency pattern issues. The measures used by the data mining technique (for example, “support” and “confidences”) to learn the profile have turned out to be not suitable for filtering. They can lead to a mismatch problem. This thesis uses the rough set-based reasoning (term-based) and pattern mining approach as a unified framework for information filtering to overcome the aforementioned problems. This system consists of two stages - topic filtering and pattern mining stages. The topic filtering stage is intended to minimize information overloading by filtering out the most likely irrelevant information based on the user profiles. A novel user-profiles learning method and a theoretical model of the threshold setting have been developed by using rough set decision theory. The second stage (pattern mining) aims at solving the problem of the information mismatch. This stage is precision-oriented. A new document-ranking function has been derived by exploiting the patterns in the pattern taxonomy. The most likely relevant documents were assigned higher scores by the ranking function. Because there is a relatively small amount of documents left after the first stage, the computational cost is markedly reduced; at the same time, pattern discoveries yield more accurate results. The overall performance of the system was improved significantly. The new two-stage information filtering model has been evaluated by extensive experiments. Tests were based on the well-known IR bench-marking processes, using the latest version of the Reuters dataset, namely, the Reuters Corpus Volume 1 (RCV1). The performance of the new two-stage model was compared with both the term-based and data mining-based IF models. The results demonstrate that the proposed information filtering system outperforms significantly the other IF systems, such as the traditional Rocchio IF model, the state-of-the-art term-based models, including the BM25, Support Vector Machines (SVM), and Pattern Taxonomy Model (PTM).

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Road features extraction from remote sensed imagery has been a long-term topic of great interest within the photogrammetry and remote sensing communities for over three decades. The majority of the early work only focused on linear feature detection approaches, with restrictive assumption on image resolution and road appearance. The widely available of high resolution digital aerial images makes it possible to extract sub-road features, e.g. road pavement markings. In this paper, we will focus on the automatic extraction of road lane markings, which are required by various lane-based vehicle applications, such as, autonomous vehicle navigation, and lane departure warning. The proposed approach consists of three phases: i) road centerline extraction from low resolution image, ii) road surface detection in the original image, and iii) pavement marking extraction on the generated road surface. The proposed method was tested on the aerial imagery dataset of the Bruce Highway, Queensland, and the results demonstrate the efficiency of our approach.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

World economies increasingly demand reliable and economical power supply and distribution. To achieve this aim the majority of power systems are becoming interconnected, with several power utilities supplying the one large network. One problem that occurs in a large interconnected power system is the regular occurrence of system disturbances which can result in the creation of intra-area oscillating modes. These modes can be regarded as the transient responses of the power system to excitation, which are generally characterised as decaying sinusoids. For a power system operating ideally these transient responses would ideally would have a “ring-down” time of 10-15 seconds. Sometimes equipment failures disturb the ideal operation of power systems and oscillating modes with ring-down times greater than 15 seconds arise. The larger settling times associated with such “poorly damped” modes cause substantial power flows between generation nodes, resulting in significant physical stresses on the power distribution system. If these modes are not just poorly damped but “negatively damped”, catastrophic failures of the system can occur. To ensure system stability and security of large power systems, the potentially dangerous oscillating modes generated from disturbances (such as equipment failure) must be quickly identified. The power utility must then apply appropriate damping control strategies. In power system monitoring there exist two facets of critical interest. The first is the estimation of modal parameters for a power system in normal, stable, operation. The second is the rapid detection of any substantial changes to this normal, stable operation (because of equipment breakdown for example). Most work to date has concentrated on the first of these two facets, i.e. on modal parameter estimation. Numerous modal parameter estimation techniques have been proposed and implemented, but all have limitations [1-13]. One of the key limitations of all existing parameter estimation methods is the fact that they require very long data records to provide accurate parameter estimates. This is a particularly significant problem after a sudden detrimental change in damping. One simply cannot afford to wait long enough to collect the large amounts of data required for existing parameter estimators. Motivated by this gap in the current body of knowledge and practice, the research reported in this thesis focuses heavily on rapid detection of changes (i.e. on the second facet mentioned above). This thesis reports on a number of new algorithms which can rapidly flag whether or not there has been a detrimental change to a stable operating system. It will be seen that the new algorithms enable sudden modal changes to be detected within quite short time frames (typically about 1 minute), using data from power systems in normal operation. The new methods reported in this thesis are summarised below. The Energy Based Detector (EBD): The rationale for this method is that the modal disturbance energy is greater for lightly damped modes than it is for heavily damped modes (because the latter decay more rapidly). Sudden changes in modal energy, then, imply sudden changes in modal damping. Because the method relies on data from power systems in normal operation, the modal disturbances are random. Accordingly, the disturbance energy is modelled as a random process (with the parameters of the model being determined from the power system under consideration). A threshold is then set based on the statistical model. The energy method is very simple to implement and is computationally efficient. It is, however, only able to determine whether or not a sudden modal deterioration has occurred; it cannot identify which mode has deteriorated. For this reason the method is particularly well suited to smaller interconnected power systems that involve only a single mode. Optimal Individual Mode Detector (OIMD): As discussed in the previous paragraph, the energy detector can only determine whether or not a change has occurred; it cannot flag which mode is responsible for the deterioration. The OIMD seeks to address this shortcoming. It uses optimal detection theory to test for sudden changes in individual modes. In practice, one can have an OIMD operating for all modes within a system, so that changes in any of the modes can be detected. Like the energy detector, the OIMD is based on a statistical model and a subsequently derived threshold test. The Kalman Innovation Detector (KID): This detector is an alternative to the OIMD. Unlike the OIMD, however, it does not explicitly monitor individual modes. Rather it relies on a key property of a Kalman filter, namely that the Kalman innovation (the difference between the estimated and observed outputs) is white as long as the Kalman filter model is valid. A Kalman filter model is set to represent a particular power system. If some event in the power system (such as equipment failure) causes a sudden change to the power system, the Kalman model will no longer be valid and the innovation will no longer be white. Furthermore, if there is a detrimental system change, the innovation spectrum will display strong peaks in the spectrum at frequency locations associated with changes. Hence the innovation spectrum can be monitored to both set-off an “alarm” when a change occurs and to identify which modal frequency has given rise to the change. The threshold for alarming is based on the simple Chi-Squared PDF for a normalised white noise spectrum [14, 15]. While the method can identify the mode which has deteriorated, it does not necessarily indicate whether there has been a frequency or damping change. The PPM discussed next can monitor frequency changes and so can provide some discrimination in this regard. The Polynomial Phase Method (PPM): In [16] the cubic phase (CP) function was introduced as a tool for revealing frequency related spectral changes. This thesis extends the cubic phase function to a generalised class of polynomial phase functions which can reveal frequency related spectral changes in power systems. A statistical analysis of the technique is performed. When applied to power system analysis, the PPM can provide knowledge of sudden shifts in frequency through both the new frequency estimate and the polynomial phase coefficient information. This knowledge can be then cross-referenced with other detection methods to provide improved detection benchmarks.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents a framework for performing real-time recursive estimation of landmarks’ visual appearance. Imaging data in its original high dimensional space is probabilistically mapped to a compressed low dimensional space through the definition of likelihood functions. The likelihoods are subsequently fused with prior information using a Bayesian update. This process produces a probabilistic estimate of the low dimensional representation of the landmark visual appearance. The overall filtering provides information complementary to the conventional position estimates which is used to enhance data association. In addition to robotics observations, the filter integrates human observations in the appearance estimates. The appearance tracks as computed by the filter allow landmark classification. The set of labels involved in the classification task is thought of as an observation space where human observations are made by selecting a label. The low dimensional appearance estimates returned by the filter allow for low cost communication in low bandwidth sensor networks. Deployment of the filter in such a network is demonstrated in an outdoor mapping application involving a human operator, a ground and an air vehicle.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Information overload has become a serious issue for web users. Personalisation can provide effective solutions to overcome this problem. Recommender systems are one popular personalisation tool to help users deal with this issue. As the base of personalisation, the accuracy and efficiency of web user profiling affects the performances of recommender systems and other personalisation systems greatly. In Web 2.0, the emerging user information provides new possible solutions to profile users. Folksonomy or tag information is a kind of typical Web 2.0 information. Folksonomy implies the users‘ topic interests and opinion information. It becomes another source of important user information to profile users and to make recommendations. However, since tags are arbitrary words given by users, folksonomy contains a lot of noise such as tag synonyms, semantic ambiguities and personal tags. Such noise makes it difficult to profile users accurately or to make quality recommendations. This thesis investigates the distinctive features and multiple relationships of folksonomy and explores novel approaches to solve the tag quality problem and profile users accurately. Harvesting the wisdom of crowds and experts, three new user profiling approaches are proposed: folksonomy based user profiling approach, taxonomy based user profiling approach, hybrid user profiling approach based on folksonomy and taxonomy. The proposed user profiling approaches are applied to recommender systems to improve their performances. Based on the generated user profiles, the user and item based collaborative filtering approaches, combined with the content filtering methods, are proposed to make recommendations. The proposed new user profiling and recommendation approaches have been evaluated through extensive experiments. The effectiveness evaluation experiments were conducted on two real world datasets collected from Amazon.com and CiteULike websites. The experimental results demonstrate that the proposed user profiling and recommendation approaches outperform those related state-of-the-art approaches. In addition, this thesis proposes a parallel, scalable user profiling implementation approach based on advanced cloud computing techniques such as Hadoop, MapReduce and Cascading. The scalability evaluation experiments were conducted on a large scaled dataset collected from Del.icio.us website. This thesis contributes to effectively use the wisdom of crowds and expert to help users solve information overload issues through providing more accurate, effective and efficient user profiling and recommendation approaches. It also contributes to better usages of taxonomy information given by experts and folksonomy information contributed by users in Web 2.0.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Social tags in web 2.0 are becoming another important information source to describe the content of items as well as to profile users’ topic preferences. However, as arbitrary words given by users, tags contains a lot of noise such as tag synonym and semantic ambiguity a large number personal tags that only used by one user, which brings challenges to effectively use tags to make item recommendations. To solve these problems, this paper proposes to use a set of related tags along with their weights to represent semantic meaning of each tag for each user individually. A hybrid recommendation generation approaches that based on the weighted tags are proposed. We have conducted experiments using the real world dataset obtained from Amazon.com. The experimental results show that the proposed approaches outperform the other state of the art approaches.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents a novel two-stage information filtering model which combines the merits of term-based and pattern- based approaches to effectively filter sheer volume of information. In particular, the first filtering stage is supported by a novel rough analysis model which efficiently removes a large number of irrelevant documents, thereby addressing the overload problem. The second filtering stage is empowered by a semantically rich pattern taxonomy mining model which effectively fetches incoming documents according to the specific information needs of a user, thereby addressing the mismatch problem. The experiments have been conducted to compare the proposed two-stage filtering (T-SM) model with other possible "term-based + pattern-based" or "term-based + term-based" IF models. The results based on the RCV1 corpus show that the T-SM model significantly outperforms other types of "two-stage" IF models.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Ethernet is a key component of the standards used for digital process buses in transmission substations, namely IEC 61850 and IEEE Std 1588-2008 (PTPv2). These standards use multicast Ethernet frames that can be processed by more than one device. This presents some significant engineering challenges when implementing a sampled value process bus due to the large amount of network traffic. A system of network traffic segregation using a combination of Virtual LAN (VLAN) and multicast address filtering using managed Ethernet switches is presented. This includes VLAN prioritisation of traffic classes such as the IEC 61850 protocols GOOSE, MMS and sampled values (SV), and other protocols like PTPv2. Multicast address filtering is used to limit SV/GOOSE traffic to defined subsets of subscribers. A method to map substation plant reference designations to multicast address ranges is proposed that enables engineers to determine the type of traffic and location of the source by inspecting the destination address. This method and the proposed filtering strategy simplifies future changes to the prioritisation of network traffic, and is applicable to both process bus and station bus applications.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Approximately 20 years have passed now since the NTSB issued its original recommendation to expedite development, certification and production of low-cost proximity warning and conflict detection systems for general aviation [1]. While some systems are in place (TCAS [2]), ¡¨see-and-avoid¡¨ remains the primary means of separation between light aircrafts sharing the national airspace. The requirement for a collision avoidance or sense-and-avoid capability onboard unmanned aircraft has been identified by leading government, industry and regulatory bodies as one of the most significant challenges facing the routine operation of unmanned aerial systems (UAS) in the national airspace system (NAS) [3, 4]. In this thesis, we propose and develop a novel image-based collision avoidance system to detect and avoid an upcoming conflict scenario (with an intruder) without first estimating or filtering range. The proposed collision avoidance system (CAS) uses relative bearing ƒÛ and angular-area subtended ƒê , estimated from an image, to form a test statistic AS C . This test statistic is used in a thresholding technique to decide if a conflict scenario is imminent. If deemed necessary, the system will command the aircraft to perform a manoeuvre based on ƒÛ and constrained by the CAS sensor field-of-view. Through the use of a simulation environment where the UAS is mathematically modelled and a flight controller developed, we show that using Monte Carlo simulations a probability of a Mid Air Collision (MAC) MAC RR or a Near Mid Air Collision (NMAC) RiskRatio can be estimated. We also show the performance gain this system has over a simplified version (bearings-only ƒÛ ). This performance gain is demonstrated in the form of a standard operating characteristic curve. Finally, it is shown that the proposed CAS performs at a level comparable to current manned aviations equivalent level of safety (ELOS) expectations for Class E airspace. In some cases, the CAS may be oversensitive in manoeuvring the owncraft when not necessary, but this constitutes a more conservative and therefore safer, flying procedures in most instances.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The existing Collaborative Filtering (CF) technique that has been widely applied by e-commerce sites requires a large amount of ratings data to make meaningful recommendations. It is not directly applicable for recommending products that are not frequently purchased by users, such as cars and houses, as it is difficult to collect rating data for such products from the users. Many of the e-commerce sites for infrequently purchased products are still using basic search-based techniques whereby the products that match with the attributes given in the target user's query are retrieved and recommended to the user. However, search-based recommenders cannot provide personalized recommendations. For different users, the recommendations will be the same if they provide the same query regardless of any difference in their online navigation behaviour. This paper proposes to integrate collaborative filtering and search-based techniques to provide personalized recommendations for infrequently purchased products. Two different techniques are proposed, namely CFRRobin and CFAg Query. Instead of using the target user's query to search for products as normal search based systems do, the CFRRobin technique uses the products in which the target user's neighbours have shown interest as queries to retrieve relevant products, and then recommends to the target user a list of products by merging and ranking the returned products using the Round Robin method. The CFAg Query technique uses the products that the user's neighbours have shown interest in to derive an aggregated query, which is then used to retrieve products to recommend to the target user. Experiments conducted on a real e-commerce dataset show that both the proposed techniques CFRRobin and CFAg Query perform better than the standard Collaborative Filtering (CF) and the Basic Search (BS) approaches, which are widely applied by the current e-commerce applications. The CFRRobin and CFAg Query approaches also outperform the e- isting query expansion (QE) technique that was proposed for recommending infrequently purchased products.