995 resultados para Prototype Selection
Resumo:
In this paper we consider the task of prototype selection whose primary goal is to reduce the storage and computational requirements of the Nearest Neighbor classifier while achieving better classification accuracies. We propose a solution to the prototype selection problem using techniques from cooperative game theory and show its efficacy experimentally.
Resumo:
Since streaming data keeps coming continuously as an ordered sequence, massive amounts of data is created. A big challenge in handling data streams is the limitation of time and space. Prototype selection on streaming data requires the prototypes to be updated in an incremental manner as new data comes in. We propose an incremental algorithm for prototype selection. This algorithm can also be used to handle very large datasets. Results have been presented on a number of large datasets and our method is compared to an existing algorithm for streaming data. Our algorithm saves time and the prototypes selected gives good classification accuracy.
Resumo:
Prototype Selection (PS) algorithms allow a faster Nearest Neighbor classification by keeping only the most profitable prototypes of the training set. In turn, these schemes typically lower the performance accuracy. In this work a new strategy for multi-label classifications tasks is proposed to solve this accuracy drop without the need of using all the training set. For that, given a new instance, the PS algorithm is used as a fast recommender system which retrieves the most likely classes. Then, the actual classification is performed only considering the prototypes from the initial training set belonging to the suggested classes. Results show that this strategy provides a large set of trade-off solutions which fills the gap between PS-based classification efficiency and conventional kNN accuracy. Furthermore, this scheme is not only able to, at best, reach the performance of conventional kNN with barely a third of distances computed, but it does also outperform the latter in noisy scenarios, proving to be a much more robust approach.
Resumo:
In the current Information Age, data production and processing demands are ever increasing. This has motivated the appearance of large-scale distributed information. This phenomenon also applies to Pattern Recognition so that classic and common algorithms, such as the k-Nearest Neighbour, are unable to be used. To improve the efficiency of this classifier, Prototype Selection (PS) strategies can be used. Nevertheless, current PS algorithms were not designed to deal with distributed data, and their performance is therefore unknown under these conditions. This work is devoted to carrying out an experimental study on a simulated framework in which PS strategies can be compared under classical conditions as well as those expected in distributed scenarios. Our results report a general behaviour that is degraded as conditions approach to more realistic scenarios. However, our experiments also show that some methods are able to achieve a fairly similar performance to that of the non-distributed scenario. Thus, although there is a clear need for developing specific PS methodologies and algorithms for tackling these situations, those that reported a higher robustness against such conditions may be good candidates from which to start.
Resumo:
In this paper we propose a prototype size selection method for a set of sample graphs. Our first contribution is to show how approximate set coding can be extended from the vector to graph domain. With this framework to hand we show how prototype selection can be posed as optimizing the mutual information between two partitioned sets of sample graphs. We show how the resulting method can be used for prototype graph size selection. In our experiments, we apply our method to a real-world dataset and investigate its performance on prototype size selection tasks. © 2012 Springer-Verlag Berlin Heidelberg.
Resumo:
In this paper, we study different methods for prototype selection for recognizing handwritten characters of Tamil script. In the first method, cumulative pairwise- distances of the training samples of a given class are used to select prototypes. In the second method, cumulative distance to allographs of different orientation is used as a criterion to decide if the sample is representative of the group. The latter method is presumed to offset the possible orientation effect. This method still uses fixed number of prototypes for each of the classes. Finally, a prototype set growing algorithm is proposed, with a view to better model the differences in complexity of different character classes. The proposed algorithms are tested and compared for both writer independent and writer adaptation scenarios.
Resumo:
An approach towards shape description, based on prototype modification and generalized cylinders, has been developed and applied to the object domains pottery and polyhedra: (1) A program describes and identifies pottery from vase outlines entered as lists of points. The descriptions have been modeled after descriptions by archeologists, with the result that identifications made by the program are remarkably consisten with those of the archeologists. It has been possible to quantify their shape descriptors, which are everyday terms in our language applied to many sorts of objects besides pottery, so that the resulting descriptions seem very natural. (2) New parsing strategies for polyhedra overcome some limitations of previous work. A special feature is that the processes of parsing and identification are carried out simultaneously.
Resumo:
A foundational issue underlying many overlay network applications ranging from routing to P2P file sharing is that of connectivity management, i.e., folding new arrivals into the existing mesh and re-wiring to cope with changing network conditions. Previous work has considered the problem from two perspectives: devising practical heuristics for specific applications designed to work well in real deployments, and providing abstractions for the underlying problem that are tractable to address via theoretical analyses, especially game-theoretic analysis. Our work unifies these two thrusts first by distilling insights gleaned from clean theoretical models, notably that under natural resource constraints, selfish players can select neighbors so as to efficiently reach near-equilibria that also provide high global performance. Using Egoist, a prototype overlay routing system we implemented on PlanetLab, we demonstrate that our neighbor selection primitives significantly outperform existing heuristics on a variety of performance metrics; that Egoist is competitive with an optimal, but unscalable full-mesh approach; and that it remains highly effective under significant churn. We also describe variants of Egoist's current design that would enable it to scale to overlays of much larger scale and allow it to cater effectively to applications, such as P2P file sharing in unstructured overlays, based on the use of primitives such as scoped-flooding rather than routing.
Resumo:
A foundational issue underlying many overlay network applications ranging from routing to P2P file sharing is that of connectivity management, i.e., folding new arrivals into an existing overlay, and re-wiring to cope with changing network conditions. Previous work has considered the problem from two perspectives: devising practical heuristics for specific applications designed to work well in real deployments, and providing abstractions for the underlying problem that are analytically tractable, especially via game-theoretic analysis. In this paper, we unify these two thrusts by using insights gleaned from novel, realistic theoretic models in the design of Egoist – a prototype overlay routing system that we implemented, deployed, and evaluated on PlanetLab. Using measurements on PlanetLab and trace-based simulations, we demonstrate that Egoist's neighbor selection primitives significantly outperform existing heuristics on a variety of performance metrics, including delay, available bandwidth, and node utilization. Moreover, we demonstrate that Egoist is competitive with an optimal, but unscalable full-mesh approach, remains highly effective under significant churn, is robust to cheating, and incurs minimal overhead. Finally, we discuss some of the potential benefits Egoist may offer to applications.
Resumo:
The process of resources systems selection takes an important part in Distributed/Agile/Virtual Enterprises (D/A/V Es) integration. However, the resources systems selection is still a difficult matter to solve in a D/A/VE, as it is pointed out in this paper. Globally, we can say that the selection problem has been equated from different aspects, originating different kinds of models/algorithms to solve it. In order to assist the development of a web prototype tool (broker tool), intelligent and flexible, that integrates all the selection model activities and tools, and with the capacity to adequate to each D/A/V E project or instance (this is the major goal of our final project), we intend in this paper to show: a formulation of a kind of resources selection problem and the limitations of the algorithms proposed to solve it. We formulate a particular case of the problem as an integer programming, which is solved using simplex and branch and bound algorithms, and identify their performance limitations (in terms of processing time) based on simulation results. These limitations depend on the number of processing tasks and on the number of pre-selected resources per processing tasks, defining the domain of applicability of the algorithms for the problem studied. The limitations detected open the necessity of the application of other kind of algorithms (approximate solution algorithms) outside the domain of applicability founded for the algorithms simulated. However, for a broker tool it is very important the knowledge of algorithms limitations, in order to, based on problem features, develop and select the most suitable algorithm that guarantees a good performance.
Resumo:
The survival of organisations, especially SMEs, depends, to the greatest extent, on those who supply them with the required material input. This is because if the supplier fails to deliver the right materials at the right time and place, and at the right price, then the recipient organisation is bound to fail in its obligations to satisfy the needs of its customers, and to stay in business. Hence, the task of choosing a supplier(s) from a list of vendors, that an organisation will trust with its very existence, is not an easy one. This project investigated how purchasing personnel in organisations solve the problem of vendor selection. The investigation went further to ascertain whether an Expert Systems model could be developed and used as a plausible solution to the problem. An extensive literature review indicated that very scanty research has been conducted in the area of Expert Systems for Vendor Selection, whereas many research theories in expert systems and in purchasing and supply management chain, respectively, had been reported. A survey questionnaire was designed and circulated to people in the industries who actually perform the vendor selection tasks. Analysis of the collected data confirmed the various factors which are considered during the selection process, and established the order in which those factors are ranked. Five of the factors, namely, Production Methods Used, Vendors Financial Background, Manufacturing Capacity, Size of Vendor Organisations, and Suppliers Position in the Industry; appeared to have similar patterns in the way organisations ranked them. These patterns suggested that the bigger the organisation, the more importantly they regarded the above factors. Further investigations revealed that respondents agreed that the most important factors were: Product Quality, Product Price and Delivery Date. The most apparent pattern was observed for the Vendors Financial Background. This generated curiosity which led to the design and development of a prototype expert system for assessing the financial profile of a potential supplier(s). This prototype was called ESfNS. It determines whether a prospective supplier(s) has good financial background or not. ESNS was tested by the potential users who then confirmed that expert systems have great prospects and commercial viability in the domain for solving vendor selection problems.