187 results for Similarity measure

in Deakin Research Online - Australia


Relevance:

100.00% 100.00%

Publisher:

Abstract:

In many business situations, products or user profile data are so complex that they need to be described using tree structures. Evaluating the similarity between tree-structured data is essential in many applications, such as recommender systems. To evaluate the similarity between two trees, conceptually corresponding nodes should be identified by constructing an edit distance mapping between them. Sometimes, the intension of one concept includes the intensions of several other concepts. In that situation, a one-to-many mapping should be constructed from the point of view of structures. This paper proposes a tree similarity measure model that can construct this kind of mapping. The similarity measure model takes into account all the information on nodes’ concepts, weights, and values. The conceptual similarity and the value similarity between two trees are evaluated based on the constructed mapping, and the final similarity measure is assessed as a weighted sum of their conceptual and value similarities. The effectiveness of the proposed similarity measure model is shown by an illustrative example and is also demonstrated by applying it to a recommender system.
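The final step described in this abstract, a weighted sum of a conceptual similarity and a value similarity, can be sketched as follows. This is only an illustration: the function name, the single weight `alpha`, and the assumption that both inputs already lie in [0, 1] are hypothetical and are not taken from the paper, which also derives the two inputs from a tree mapping that is not reproduced here.

```python
def combined_similarity(conceptual_sim: float, value_sim: float,
                        alpha: float = 0.5) -> float:
    """Weighted sum of two similarity scores (hypothetical helper).

    alpha is the weight given to the conceptual similarity; the value
    similarity receives the remaining weight 1 - alpha.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * conceptual_sim + (1.0 - alpha) * value_sim


# Two trees judged 0.8 similar in concept and 0.6 similar in value,
# with the conceptual part weighted more heavily.
score = combined_similarity(0.8, 0.6, alpha=0.7)  # approximately 0.74
```

Because the weights sum to 1, the combined score stays within the range of its two inputs.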

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents a new spectral clustering method called correlation preserving indexing (CPI), which is performed in the correlation similarity measure space. In this framework, the documents are projected into a low-dimensional semantic space in which the correlations between the documents in the local patches are maximized while the correlations between the documents outside these patches are minimized simultaneously. Since the intrinsic geometrical structure of the document space is often embedded in the similarities between the documents, correlation as a similarity measure is more suitable for detecting the intrinsic geometrical structure of the document space than Euclidean distance. Consequently, the proposed CPI method can effectively discover the intrinsic structures embedded in high-dimensional document space. The effectiveness of the new method is demonstrated by extensive experiments conducted on various data sets and by comparison with existing document clustering methods.
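The contrast drawn here between correlation and Euclidean distance can be illustrated with a small sketch. It assumes Pearson correlation over term-count vectors; the abstract does not state the exact correlation definition used by CPI, and the vectors below are made-up examples.

```python
import math


def pearson_correlation(x, y):
    """Pearson correlation of two equal-length numeric vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def euclidean_distance(x, y):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Hypothetical term counts: the second document uses the same terms in
# the same proportions but is ten times longer.
short_doc = [1.0, 2.0, 3.0]
long_doc = [10.0, 20.0, 30.0]

corr = pearson_correlation(short_doc, long_doc)
dist = euclidean_distance(short_doc, long_doc)
```

In this example the correlation is at its maximum while the Euclidean distance is large, which shows how the two measures can disagree about documents that differ mainly in length.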

Relevance:

70.00% 70.00%

Publisher:

Abstract:

We present a new approach for defining similarity measures for Atanassov's intuitionistic fuzzy sets (AIFS), in which a similarity measure has two components indicating the similarity and hesitancy aspects. We justify that there are at least two facets of uncertainty of an AIFS, one of which is related to fuzziness while the other is related to lack of knowledge or non-specificity. We propose a set of axioms and build families of similarity measures that avoid counterintuitive examples that are used to justify one similarity measure over another. We also investigate a relation to entropies of AIFS, and outline possible applications of our method in decision making and image segmentation. © 2014 Elsevier Inc. All rights reserved.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

One of the content-based image retrieval techniques is the shape-based technique, which allows users to ask for objects similar in shape to a query object. Sajjanhar and Lu proposed a method for shape representation and similarity measure called the grid-based method [1]. They have shown that the method is effective for the retrieval of segmented objects based on shape. In this paper, we describe a system which uses the grid-based method for retrieval of images with multiple objects. We perform experiments on the prototype system to compare the performance of the grid-based method with the Fourier descriptors method [2]. Preliminary results have been presented.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

This thesis provides a unified and comprehensive treatment of fuzzy neural networks as intelligent controllers. This work has been motivated by a need to develop solid control methodologies capable of coping with the complexity, the nonlinearity, the interactions, and the time variance of the processes under control. In addition, the dynamic behavior of such processes is strongly influenced by disturbances and noise, and such processes are characterized by a large degree of uncertainty. Therefore, it is important to integrate an intelligent component to increase the control system's ability to extract the functional relationships from the process and to change such relationships to improve the control precision, that is, to display learning and reasoning abilities. The objective of this thesis was to develop a self-organizing learning controller for the above processes by using a combination of fuzzy logic and neural networks. An on-line, direct fuzzy neural controller using the process input-output measurement data and the reference model with both structural and parameter tuning has been developed to fulfill the above objective. A number of practical issues were considered. These include the dynamic construction of the controller in order to alleviate the bias/variance dilemma, the universal approximation property, and the requirements of locality and linearity in the parameters. Several important issues in intelligent control were also considered, such as the overall control scheme, the requirement of persistency of excitation, and the bounded learning rates of the controller for overall closed-loop stability. Other important issues considered in this thesis include the dependence of the generalization ability and the optimization methods on the data distribution, and the requirements for on-line learning and the feedback structure of the controller.
Issues specific to fuzzy inference, such as the influence of the choice of the defuzzification method, the T-norm operator, and the membership function on the overall performance of the controller, were also discussed. In addition, the ε-completeness requirement and the use of the fuzzy similarity measure were investigated. The main emphasis of the thesis has been on applications to real-world problems such as industrial process control. The applicability of the proposed method has been demonstrated through empirical studies on several real-world control problems of industrial complexity. These include the temperature and the number-average molecular weight control in the continuous stirred tank polymerization reactor, and the torsional vibration, the eccentricity, the hardness, and the thickness control in cold rolling mills. Compared to the traditional linear controllers and the dynamically constructed neural network, the proposed fuzzy neural controller shows the highest promise as an effective approach to such nonlinear multi-variable control problems with a strong influence of disturbances and noise on the dynamic process behavior. In addition, the applicability of the proposed method beyond the strictly control area has also been investigated, in particular to data mining and knowledge elicitation. When compared to the decision tree method and the pruned neural network method for data mining, the proposed fuzzy neural network is able to achieve a comparable accuracy with a more compact set of rules. In addition, the performance of the proposed fuzzy neural network is much better than that of the decision tree method for classes with low occurrences in the data set. Thus, the proposed fuzzy neural network may be very useful in situations where the important information is contained in a small fraction of the available data.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Conventional relevance feedback schemes may not be suitable for all practical applications of content-based image retrieval (CBIR), since most ordinary users would like to complete their search in a single interaction, especially in web search. In this paper, we explore a new approach to improve the retrieval performance based on a new concept, bag of images, rather than relevance feedback. We consider that an image collection comprises image bags instead of independent individual images. Each image bag includes some relevant images with the same perceptual meaning. A theoretical case study demonstrates that image retrieval can benefit from the new concept. A number of experimental results show that the CBIR scheme based on bag of images can improve the retrieval performance dramatically.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Stereo matching tries to find correspondences between locations in a pair of displaced images of the same scene in order to extract the underlying depth information. Pixel correspondence estimation suffers from occlusions, noise, or bias. In this work, we introduce a novel approach to represent images by means of interval-valued fuzzy sets to overcome the uncertainty due to the above-mentioned problems. Our aim is to take advantage of this representation in the stereo matching algorithm. The image interval-valued fuzzification process that we propose is based on image segmentation, used differently from the common use of segmentation in stereo vision. We introduce interval-valued fuzzy similarities to compare windows whose pixels are represented by intervals. In the experimental analysis, we show the suitability of this representation for the stereo matching problem. The new representation, together with the new similarity measure that we introduce, shows a better overall behavior than other very well-known methods.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

It is important to derive priority weights from interval-valued fuzzy preferences when a pairwise comparative mechanism is used. By focusing on the significance of consistency in the pairwise comparison matrix, two numerical-valued consistent comparison matrices are extracted from an interval fuzzy judgement matrix. Both consistent matrices are derived by solving the linear or nonlinear programming models with the aid of assessments from Decision Makers (DMs). An interval priority weight vector from the extracted consistent matrices is generated. In order to retain more information hidden in the intervals, a new probability-based method for comparison of the interval priority weights is introduced. An algorithm for deriving the final priority interval weights for both consistent and inconsistent interval matrices is proposed. The algorithm is also generalized to handle the pairwise comparison matrix with fuzzy numbers. The comparative results from the five examples reveal that the proposed method, as compared with eight existing methods, exhibits a smaller degree of uncertainty pertaining to the priority weights, and is also more reliable based on the similarity measure. © 2014 Elsevier Inc. All rights reserved.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Failure mode and effect analysis (FMEA) is a popular safety and reliability analysis tool for examining potential failures of products, processes, designs, or services in a wide range of industries. While FMEA is a popular tool, the limitations of the traditional Risk Priority Number (RPN) model in FMEA have been highlighted in the literature. Even though many alternatives to the traditional RPN model have been proposed, there are not many investigations on the use of clustering techniques in FMEA. The main aim of this paper was to examine the use of a new Euclidean distance-based similarity measure and an incremental-learning clustering model, i.e., fuzzy adaptive resonance theory neural network, for similarity analysis and clustering of failure modes in FMEA, thereby allowing the failure modes to be analyzed, visualized, and clustered. In this paper, the concept of a risk interval encompassing a group of failure modes is investigated. Besides that, a new approach to analyze risk ordering of different failure groups is introduced. These proposed methods are evaluated using a case study related to the edible bird nest industry in Sarawak, Malaysia. In short, the contributions of this paper are threefold: (1) a new Euclidean distance-based similarity measure, (2) a new risk interval measure for a group of failure modes, and (3) a new analysis of risk ordering of different failure groups. © 2014 The Natural Computing Applications Forum.
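To make the idea of a Euclidean distance-based similarity between failure modes concrete, here is a minimal sketch. It is not the measure proposed in the paper, whose definition is not given in the abstract. It assumes each failure mode is a (severity, occurrence, detection) triple, the three factors of the traditional RPN, and it uses the generic transform 1 / (1 + d) to turn a distance into a similarity. The scores are made up.

```python
import math


def euclidean_similarity(a, b):
    """Map Euclidean distance d to a similarity in (0, 1] via 1 / (1 + d).

    A generic transform chosen for illustration only.
    """
    return 1.0 / (1.0 + math.dist(a, b))


# Hypothetical (severity, occurrence, detection) scores on a 1-10 scale.
mode_a = (7, 4, 3)
mode_b = (7, 5, 3)
mode_c = (2, 9, 6)

sim_ab = euclidean_similarity(mode_a, mode_b)  # distance 1, so 0.5
sim_ac = euclidean_similarity(mode_a, mode_c)  # larger distance, lower score
```

Identical failure modes score 1, and the score decreases toward 0 as the modes move apart, so it can feed directly into a clustering step.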

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Virtualization has brought immense change to modern technology, especially computer networks, over the last decade. The enormity of big data has caused massive graphs to grow exponentially in size in recent years, so that conventional tools and algorithms struggle to process them. Reducing the size of massive graphs is a major challenge, and extracting useful information from huge graphs is also problematic. In this paper, we present a concept for designing a virtual graph, vGraph, in a virtual plane above the original plane containing the original massive graph, and propose a novel cumulative similarity measure for vGraph. Using vGraph in place of the massive graph saves space and time. Our proposed algorithm has two main parts. In the first part, virtual nodes are designed from the original nodes based on the calculation of cumulative similarity among them. In the second part, virtual edges are designed to link the virtual nodes based on the calculation of the similarity measure among the original edges of the original massive graph. The algorithm is tested on synthetic and real-world datasets, and the results show the efficiency of the proposed algorithm.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

The spectrum nature and heterogeneity within autism spectrum disorders (ASD) pose a challenge for treatment. Personalisation of the syllabus for children with ASD can improve the efficacy of learning by adjusting the number of opportunities and deciding the course of the syllabus. We investigate a data-motivated approach in an attempt to disentangle this heterogeneity for personalisation of the syllabus. With the help of technology and a structured syllabus, collecting data while a child with ASD masters the skills is made possible. The performance data collected are, however, growing and contain missing elements based on the pace and the course each child takes while navigating through the syllabus. Bayesian nonparametric methods are known for automatically discovering the number of latent components and their parameters when the model involves higher complexity. We propose a nonparametric Bayesian matrix factorisation model that discovers learning patterns and the way participants associate with them. Our model is built upon the linear Poisson gamma model (LPGM) with an Indian buffet process prior and extended to incorporate data with missing elements. In this paper, for the first time, we present learning patterns deduced automatically from data mining and machine learning methods using intervention data recorded for over 500 children with ASD. We compare the results with non-negative matrix factorisation and K-means, which, being parametric, not only require us to specify the number of learning patterns in advance, but also do not have a principled approach to deal with missing data. The F1 score observed over varying degrees of the similarity measure (Jaccard index) suggests that LPGM yields the best outcome. By observing these patterns with additional knowledge regarding the syllabus, it may be possible to observe the progress and dynamically modify the syllabus for improved learning.
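The Jaccard index mentioned as the similarity measure is the size of the intersection of two sets divided by the size of their union. A minimal sketch follows; the skill identifiers are made up, and treating two empty sets as fully similar is a convention chosen here, not something stated in the abstract.

```python
def jaccard_index(a: set, b: set) -> float:
    """Jaccard index |a & b| / |a | b| of two sets."""
    if not a and not b:
        return 1.0  # convention for two empty sets (an assumption)
    return len(a & b) / len(a | b)


# Hypothetical sets of skills associated with two learning patterns.
pattern_1 = {"skill_1", "skill_2", "skill_3"}
pattern_2 = {"skill_2", "skill_3", "skill_4"}

overlap = jaccard_index(pattern_1, pattern_2)  # 2 shared of 4 total: 0.5
```

A threshold on this value is one way to decide whether a discovered pattern counts as matching a reference pattern when an F1 score is computed.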

Relevance:

30.00% 30.00%

Publisher:

Abstract:

While the largest common subgraph (LCSG) between a query and a database of models can provide an elegant and intuitive measure of similarity for many applications, it is computationally expensive to compute. Recently developed algorithms for subgraph isomorphism detection take advantage of prior knowledge of a database of models to improve the speed of on-line matching. This paper presents a new algorithm based on similar principles to solve the largest common subgraph problem. The new algorithm significantly reduces the computational complexity of detection of the LCSG between a known database of models, and a query given on-line.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In the context of group decision making with fuzzy preferences, consensus measures are employed to provide feedback and help guide automatic or semi-automatic decision reaching processes. These measures attempt to capture the intuitive notion of how much inputs, individuals or groups agree with one another. Meanwhile, in ecological studies there has been an ongoing research effort to define measures of community evenness based on how evenly the proportional abundances of species are distributed. The question hence arises as to whether there can be any cross-fertilization from developments in these fields given their intuitive similarity. Here we investigate some of the models used in ecology toward their potential use in measuring consensus. We found that although many consensus characteristics are exhibited by evenness indices, lack of reciprocity and a tendency towards a minimum when a single input is non-zero would make them undesirable for inputs expressed on an interval scale. On the other hand, we note that some of the general frameworks could still be useful for other types of inputs like ranking profiles and that in the opposite direction consensus measures have the potential to provide new insights in ecology.
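The behaviour noted here, that evenness indices tend towards a minimum when a single input is non-zero, can be checked with a small sketch. It assumes Pielou's evenness (Shannon entropy divided by the logarithm of the number of categories) as a representative index; the abstract does not say which indices the paper examines, and the abundance values are made up.

```python
import math


def pielou_evenness(abundances):
    """Pielou's evenness J = H / ln(S) for S >= 2 non-negative abundances.

    H is the Shannon entropy of the proportional abundances; zero
    entries contribute nothing to H.
    """
    total = sum(abundances)
    count = len(abundances)
    if count < 2 or total <= 0:
        raise ValueError("need at least two categories and a positive total")
    proportions = [x / total for x in abundances if x > 0]
    shannon = sum(-p * math.log(p) for p in proportions)
    return shannon / math.log(count)


even = pielou_evenness([5, 5, 5, 5])     # perfectly even distribution
single = pielou_evenness([10, 0, 0, 0])  # only one non-zero input
```

The perfectly even case reaches the maximum of 1, while the single non-zero case falls to the minimum of 0.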

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper focuses on the issue of comparing social groups or collectivities using measures derived from individual-level multivariate data. In this case, groups need to be differentiated such that: (a) between-group differences are maximised; (b) within-group differences are minimised; and (c) 'differences' are calibrated to a scale that reflects a set of indicators or observed variables. This paper demonstrates empirically how correspondence analysis can achieve this. It presents a scale of 'workplace morale' derived from the responses of employees in a large sample of workplaces to questions concerning satisfaction with various facets of their job and their workplace. The scale derived through correspondence analysis is shown to achieve the three criteria described above.