23 results for Positive Matrix Factorization

in Deakin Research Online - Australia


Relevance:

100.00%

Publisher:

Abstract:

Nonnegative matrix factorization based methods provide one of the simplest and most effective approaches to text mining. However, their applicability is mainly limited to analyzing a single data source. In this paper, we propose a novel joint matrix factorization framework which can jointly analyze multiple data sources by exploiting their shared and individual structures. The proposed framework is flexible enough to handle the arbitrary sharing configurations encountered in real-world data. We derive an efficient algorithm for learning the factorization and show that its convergence is theoretically guaranteed. We demonstrate the utility and effectiveness of the proposed framework in two real-world applications: improving social media retrieval using auxiliary sources, and cross-social media retrieval. Representing each social media source by its textual tags, we show for both applications that retrieval performance exceeds that of existing state-of-the-art techniques. The proposed solution provides a generic framework and is applicable to a wider context in data mining wherever one needs to exploit the mutual and individual knowledge present across multiple data sources.
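
A minimal sketch of the shared/individual factorization idea (not the authors' algorithm; the variable names, dimensions, and Lee-Seung-style multiplicative updates below are illustrative assumptions): two nonnegative data matrices are factorized jointly so that part of the basis is shared while the remaining basis vectors are source-specific.

```python
import numpy as np

def joint_nmf(X1, X2, k_shared, k1, k2, n_iter=200, eps=1e-9):
    """Jointly factorize X1 ~ [Ws, W1] @ H1 and X2 ~ [Ws, W2] @ H2.

    X1, X2 are nonnegative data matrices with the same number of rows
    (e.g., tag-by-document matrices from two sources). Ws holds the basis
    vectors shared by both sources; W1 and W2 hold the source-specific
    ones. All factors stay nonnegative via Lee-Seung-style multiplicative
    updates. Illustrative sketch only, not the paper's exact algorithm.
    """
    rng = np.random.default_rng(0)
    d, n1 = X1.shape
    _, n2 = X2.shape
    Ws = rng.random((d, k_shared))
    W1, W2 = rng.random((d, k1)), rng.random((d, k2))
    H1 = rng.random((k_shared + k1, n1))
    H2 = rng.random((k_shared + k2, n2))

    for _ in range(n_iter):
        F1, F2 = np.hstack([Ws, W1]), np.hstack([Ws, W2])
        # Update coefficients for each source.
        H1 *= (F1.T @ X1) / (F1.T @ F1 @ H1 + eps)
        H2 *= (F2.T @ X2) / (F2.T @ F2 @ H2 + eps)
        R1, R2 = F1 @ H1, F2 @ H2          # current reconstructions
        Hs1, Hp1 = H1[:k_shared], H1[k_shared:]
        Hs2, Hp2 = H2[:k_shared], H2[k_shared:]
        # The shared basis pools gradient information from both sources.
        Ws *= (X1 @ Hs1.T + X2 @ Hs2.T) / (R1 @ Hs1.T + R2 @ Hs2.T + eps)
        # Individual bases are updated from their own source only.
        R1 = np.hstack([Ws, W1]) @ H1
        R2 = np.hstack([Ws, W2]) @ H2
        W1 *= (X1 @ Hp1.T) / (R1 @ Hp1.T + eps)
        W2 *= (X2 @ Hp2.T) / (R2 @ Hp2.T + eps)
    return Ws, W1, W2, H1, H2
```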

Relevance:

100.00%

Publisher:

Abstract:

Nonnegative matrix factorization (NMF) has recently attracted increasing attention because of its promise for a wide range of applications. A remaining problem, however, is that the resulting factors may not be realistically interpretable. Constraints are usually added to standard NMF to obtain such interpretable results. In this paper, a minimum-volume-constrained NMF is proposed and an efficient multiplicative update algorithm is developed based on natural-gradient optimization. The proposed method can be applied to the blind source separation (BSS) problem, a hot topic with many potential applications, especially when the sources are mutually dependent. Simulation results of BSS for images show the superiority of the proposed method.
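
As a rough illustration of how a volume penalty can be attached to NMF (the exact constraint and the natural-gradient multiplicative updates are specified in the paper; the det-based penalty, projected-gradient step, and parameter names below are assumptions for the sketch):

```python
import numpy as np

def min_volume_nmf(X, k, lam=1e-3, lr=1e-4, n_iter=500, eps=1e-9):
    """Sketch of NMF with a volume penalty on the mixing matrix W.

    Objective: ||X - W @ H||_F^2 + lam * det(W.T @ W), where det(W.T @ W)
    grows with the volume of the simplex spanned by the columns of W.
    H is updated multiplicatively; W by a projected gradient step.
    Illustrative only, not the paper's natural-gradient algorithm.
    """
    rng = np.random.default_rng(0)
    d, n = X.shape
    W, H = rng.random((d, k)), rng.random((k, n))
    for _ in range(n_iter):
        # Standard multiplicative update for H.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        # Gradient of the fit term and of the volume term,
        # using d/dW det(W.T W) = 2 * det(W.T W) * W @ inv(W.T W).
        WtW = W.T @ W + eps * np.eye(k)
        grad_fit = 2.0 * (W @ H - X) @ H.T
        grad_vol = 2.0 * np.linalg.det(WtW) * W @ np.linalg.inv(WtW)
        W = np.maximum(W - lr * (grad_fit + lam * grad_vol), 0.0)
    return W, H
```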

Relevance:

100.00%

Publisher:

Abstract:

Nonnegative matrix factorization (NMF) is widely used in signal separation and image compression. Motivated by its successful applications, we propose a new cryptosystem based on NMF, in which a nonlinear mixing (NLM) model with strong noise is introduced for encryption and NMF is used for decryption. The security of the cryptosystem relies on the following two facts: 1) the constructed multivariable nonlinear function is not invertible; 2) the NMF process is one-way whenever the inverse of the constructed linear mixing matrix is not nonnegative. Compared with Lin's method (2006), a theoretical scheme that uses one-time padding, our cipher can be used repeatedly in practice, i.e., multi-time padding is used in our cryptosystem. Also, there is no restriction on the statistical characteristics of the ciphers and the plaintexts, so more signals can be processed (successfully encrypted and decrypted), no matter whether they are correlated, sparse, or Gaussian. Furthermore, instead of the zero-crossing-count-based method, which is often unstable in encryption and decryption, an improved method based on the kurtosis of the signals is introduced to resolve the permutation ambiguities in waveform reconstruction. Simulations are given to illustrate the security and availability of our cryptosystem.

Relevance:

100.00%

Publisher:

Abstract:

Nonnegative matrix factorization (NMF) is a widely used method for blind spectral unmixing (SU), which aims to obtain the endmembers and their fractional abundances from the collected mixed spectral data alone. Because the abundances may be sparse (i.e., the endmembers may have sparse distributions) and sparse NMF tends to yield a unique result, it is natural and meaningful to constrain NMF with sparseness when solving SU. However, owing to the abundance sum-to-one constraint in SU, the traditional sparseness measures based on the L0/L1-norm are no longer effective constraints. A novel sparseness measure (termed the S-measure), based on higher-order norms of the signal vector, is proposed in this paper; it has a clear physical significance. Using the S-measure constraint (SMC), a gradient-based sparse NMF algorithm (termed NMF-SMC) is proposed for solving the SU problem, in which the learning rate is selected adaptively and the endmembers and abundances are estimated simultaneously. The proposed NMF-SMC requires no pure-pixel assumption and no prior knowledge of the exact sparseness degree of the abundances, nor does it require dimension-reduction preprocessing, in which useful information may be lost. Experiments on synthetic mixtures and on real-world images collected by the AVIRIS and HYDICE sensors are performed to evaluate the validity of the proposed method.
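
A minimal sketch of NMF-based unmixing with the sum-to-one abundance constraint (the S-measure itself is defined in the paper; here a plain L1 penalty stands in for the SMC, and the delta-row augmentation, names, and parameters are illustrative assumptions):

```python
import numpy as np

def unmix(X, k, delta=15.0, lam=0.1, n_iter=300, eps=1e-9):
    """Sketch of NMF-based spectral unmixing X ~ A @ S.

    X: (bands, pixels) nonnegative mixed spectra; A: endmember signatures;
    S: fractional abundances whose columns should sum to one. The sum-to-one
    constraint is enforced softly by augmenting X and A with a constant row
    of delta's, a common trick in the SU literature. A plain L1 penalty
    (weight lam) stands in for the paper's S-measure constraint.
    """
    rng = np.random.default_rng(0)
    d, n = X.shape
    A, S = rng.random((d, k)), rng.random((k, n))
    S /= S.sum(axis=0, keepdims=True)
    ones_n = delta * np.ones((1, n))
    ones_k = delta * np.ones((1, k))
    for _ in range(n_iter):
        # Augmented matrices: the extra row penalizes abundance columns
        # whose entries do not sum to one.
        Xa, Aa = np.vstack([X, ones_n]), np.vstack([A, ones_k])
        # Multiplicative updates; lam enters the denominator of the S update
        # as the gradient of the L1 sparsity surrogate.
        S *= (Aa.T @ Xa) / (Aa.T @ Aa @ S + lam + eps)
        A *= (X @ S.T) / (A @ S @ S.T + eps)
    return A, S
```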

Relevance:

100.00%

Publisher:

Abstract:

Online blind source separation (BSS) has been proposed to overcome the high computational cost that limits the practical application of traditional batch BSS algorithms. However, existing online BSS methods are mainly used to separate independent or uncorrelated sources. Recently, nonnegative matrix factorization (NMF) has shown great potential for separating correlated sources, where constraints are often imposed to overcome the non-uniqueness of the factorization. In this paper, an incremental NMF with a volume constraint is derived and used to solve online BSS. The volume constraint on the mixing matrix enhances the identifiability of the sources, while the incremental learning mode reduces the computational cost. The proposed method takes advantage of the natural-gradient-based multiplicative update rule and performs especially well in the recovery of dependent sources. Simulations of BSS on dual-energy X-ray images, online encrypted speech signals, and highly correlated face images show the validity of the proposed method.
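
A minimal sketch of the incremental idea, with the volume constraint omitted (the sufficient-statistics update below follows a generic online dictionary-learning scheme and is an assumption, not the paper's algorithm):

```python
import numpy as np

def online_nmf(stream, d, k, inner_iter=50, eps=1e-9):
    """Sketch of incremental NMF: mixture vectors are processed one at a time.

    stream: iterable of nonnegative observation vectors of length d.
    W (d, k) is the mixing/basis matrix, refreshed from accumulated
    sufficient statistics A = sum(h h^T) and B = sum(x h^T), so earlier
    samples never need to be revisited. The paper's volume constraint on
    the mixing matrix is omitted here for brevity.
    """
    rng = np.random.default_rng(0)
    W = rng.random((d, k))
    A = np.zeros((k, k))
    B = np.zeros((d, k))
    for x in stream:
        # Encode the new sample: h = argmin_{h >= 0} ||x - W h||^2,
        # approximated with a few multiplicative update steps.
        h = rng.random(k)
        for _ in range(inner_iter):
            h *= (W.T @ x) / (W.T @ W @ h + eps)
        # Accumulate sufficient statistics for the new sample.
        A += np.outer(h, h)
        B += np.outer(x, h)
        # Block-coordinate refresh of W's columns with nonnegativity projection.
        for j in range(k):
            wj = W[:, j] + (B[:, j] - W @ A[:, j]) / (A[j, j] + eps)
            W[:, j] = np.maximum(wj, 0.0)
    return W
```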

Relevance:

100.00%

Publisher:

Abstract:

Spectral unmixing (SU) is an emerging problem in remote sensing image processing. Since both the endmember signatures and their abundances are nonnegative, it is natural to employ the attractive nonnegative matrix factorization (NMF) methods to solve this problem. Motivated by the fact that the abundances are sparse, NMF with a local smoothness constraint (NMF-LSC) is proposed in this paper. In the proposed method, the smoothness constraint is used to impose sparseness, instead of the traditional L1-norm, which is restricted by the underlying column-sum-to-one requirement on the abundance matrix. Simulations show the advantages of our algorithm over the compared methods.

Relevance:

100.00%

Publisher:

Abstract:

Nonnegative matrix factorization based methods provide one of the simplest and most effective approaches to text mining. However, their applicability is mainly limited to analyzing a single data source. In this chapter, we propose a novel joint matrix factorization framework which can jointly analyze multiple data sources by exploiting their shared and individual structures. The proposed framework is flexible enough to handle the arbitrary sharing configurations encountered in real-world data. We derive an efficient algorithm for learning the factorization and show that its convergence is theoretically guaranteed. We demonstrate the utility and effectiveness of the proposed framework in two real-world applications: improving social media retrieval using auxiliary sources, and cross-social media retrieval. Representing each social media source by its textual tags, we show for both applications that retrieval performance exceeds that of existing state-of-the-art techniques. The proposed solution provides a generic framework and is applicable to a wider context in data mining wherever one needs to exploit the mutual and individual knowledge present across multiple data sources.

Relevance:

100.00%

Publisher:

Abstract:

Nonnegative matrix factorization (NMF) is a hot topic in machine learning and data processing. Recently, a constrained version, non-smooth NMF (NsNMF), has shown great potential for learning meaningful sparse representations of observed data. However, it suffers from a slow linear convergence rate, discouraging its application to large-scale data representation. In this paper, a fast NsNMF (FNsNMF) algorithm is proposed to speed up NsNMF. We first show that the cost function of the derived sub-problem is convex and that its gradient is Lipschitz continuous. The optimization of this function is then replaced by minimizing a proximal function, which is designed using the Lipschitz constant and solved by means of a constructed fast-converging sequence. Owing to the proximal function and its efficient optimization, our method achieves a nonlinear convergence rate, much faster than NsNMF. Simulations on both computer-generated and real-world data show the advantages of our algorithm over the compared methods.
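
A hedged sketch of the generic accelerated proximal scheme the abstract refers to, applied to one NMF sub-problem (the NsNMF smoothing matrix is omitted; function and variable names are illustrative):

```python
import numpy as np

def fast_nnls(X, W, n_iter=200):
    """Accelerated projected-gradient solve of min_{H >= 0} 0.5*||X - W @ H||_F^2.

    Uses the Lipschitz constant L = ||W.T @ W||_2 of the gradient and a
    Nesterov-style momentum sequence, i.e., the generic fast proximal scheme
    the abstract refers to (the NsNMF smoothing matrix is omitted).
    """
    k, n = W.shape[1], X.shape[1]
    WtW, WtX = W.T @ W, W.T @ X
    L = np.linalg.norm(WtW, 2)                 # Lipschitz constant of the gradient
    H = np.zeros((k, n))
    Y, t = H.copy(), 1.0
    for _ in range(n_iter):
        grad = WtW @ Y - WtX                   # gradient at the extrapolated point
        H_new = np.maximum(Y - grad / L, 0.0)  # proximal step = projection onto >= 0
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        Y = H_new + ((t - 1.0) / t_new) * (H_new - H)   # momentum extrapolation
        H, t = H_new, t_new
    return H
```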

Relevance:

90.00%

Publisher:

Abstract:

The least-squares problem with l1 regularization has been proposed as a promising method for sparse signal reconstruction (e.g., basis pursuit denoising and compressed sensing) and feature selection (e.g., the Lasso algorithm) in signal processing, statistics, and related fields. These problems can be cast as an l1-regularized least-squares program (LSP). In this paper, we propose a novel monotonic fixed-point method for solving large-scale l1-regularized LSPs, and we prove the stability and convergence of the proposed method. Furthermore, we generalize the method to the least-squares matrix problem and apply it to nonnegative matrix factorization (NMF). The method is illustrated on sparse signal reconstruction, pattern recognition, and blind source separation problems, and it tends to converge faster and to yield sparser solutions than other l1-regularized algorithms.
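
For illustration of the l1-regularized LSP being solved, here is the standard soft-thresholding fixed-point iteration (ISTA); it is not the paper's monotone fixed-point method, and the names and step-size choice are assumptions:

```python
import numpy as np

def ista(A, b, lam, n_iter=500):
    """Fixed-point (ISTA) iteration for the l1-regularized LSP

        minimize 0.5 * ||A @ x - b||_2^2 + lam * ||x||_1.

    The solution is a fixed point of
        x = soft(x - (1/L) * A.T @ (A @ x - b), lam / L),
    where soft() is soft-thresholding and L = ||A.T @ A||_2.
    """
    L = np.linalg.norm(A.T @ A, 2)
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = x - (A.T @ (A @ x - b)) / L                        # gradient step on the LS term
        x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # soft-thresholding (prox of l1)
    return x
```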

Relevance:

80.00%

Publisher:

Abstract:

Joint modeling of related data sources has the potential to improve various data mining tasks such as transfer learning, multitask clustering, and information retrieval. However, the diversity among data sources might outweigh the advantages of joint modeling and thus degrade performance. To this end, we propose a regularized shared subspace learning framework which can exploit the mutual strengths of related data sources while remaining immune to the variability of each source. This is achieved by imposing a mutual orthogonality constraint on the constituent subspaces, which segregates the common patterns from the source-specific patterns and thus avoids performance degradation. Our approach is rooted in nonnegative matrix factorization and extends it to enable the joint analysis of related data sources. Experiments on three real-world data sets, for both retrieval and clustering applications, demonstrate the benefits of regularization and validate the effectiveness of the model. The proposed solution provides a formal framework for jointly analyzing related data sources and is therefore applicable to a wider context in data mining.
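
A small sketch of how the mutual-orthogonality regularizer can enter a joint NMF objective (names, shapes, and the penalty weight are illustrative assumptions, not the paper's exact formulation):

```python
import numpy as np

def shared_subspace_objective(Xs, Ws, W_ind, Hs, lam=1.0):
    """Evaluate a joint NMF objective with a mutual-orthogonality regularizer.

    Xs:    list of nonnegative data matrices, one per source.
    Ws:    shared basis (d, k_s), common to all sources.
    W_ind: list of source-specific bases (d, k_i).
    Hs:    list of coefficient matrices of shape (k_s + k_i, n_i).
    The penalty lam * ||Ws.T @ Wi||_F^2 pushes the shared and individual
    subspaces toward mutual orthogonality, separating common patterns from
    source-specific ones.
    """
    total = 0.0
    for X, Wi, H in zip(Xs, W_ind, Hs):
        F = np.hstack([Ws, Wi])                            # full basis for this source
        total += np.linalg.norm(X - F @ H, "fro") ** 2     # reconstruction error
        total += lam * np.linalg.norm(Ws.T @ Wi, "fro") ** 2
    return total
```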

Relevance:

80.00%

Publisher:

Abstract:

Although tagging has become increasingly popular in online image and video sharing systems, tags are known to be noisy, ambiguous, incomplete, and subjective. These factors can seriously affect the precision of a social tag-based web retrieval system, so improving the precision of such systems has become an increasingly important research topic. To this end, we propose a shared subspace learning framework that leverages a secondary source to improve retrieval performance on a primary dataset. This is achieved by learning a shared subspace between the two sources under a joint nonnegative matrix factorization in which the level of subspace sharing can be explicitly controlled. We derive an efficient algorithm for learning the factorization, analyze its complexity, and provide a proof of convergence. We validate the framework on image and video retrieval tasks in which tags from the LabelMe dataset are used to improve image retrieval performance on a Flickr dataset and video retrieval performance on a YouTube dataset. This has implications for how knowledge from readily available auxiliary tagging resources can be exploited and transferred to improve another social web retrieval system. Our shared subspace learning framework is applicable to a range of problems where one needs to exploit the strengths existing among multiple, heterogeneous datasets.

Relevance:

80.00%

Publisher:

Abstract:

This paper presents a novel Bayesian formulation to exploit shared structures across multiple data sources, constructing foundations for effective mining and retrieval across disparate domains. We jointly analyze diverse data sources using a unifying piece of metadata (textual tags). We propose a method based on Bayesian Probabilistic Matrix Factorization (BPMF) which is able to explicitly model the partial knowledge common to the datasets using shared subspaces and the knowledge specific to each dataset using individual subspaces. For the proposed model, we derive an efficient algorithm for learning the joint factorization based on Gibbs sampling. The effectiveness of the model is demonstrated by social media retrieval tasks across single and multiple media. The proposed solution is applicable to a wider context, providing a formal framework suitable for exploiting individual as well as mutual knowledge present across heterogeneous data sources of many kinds.

Relevance:

80.00%

Publisher:

Abstract:

We propose a nonparametric Bayesian linear Poisson-gamma model for count data and use it for dictionary learning. A key property of this model is that it captures a parts-based representation similar to that of nonnegative matrix factorization. We present an auxiliary-variable Gibbs sampler, which turns the intractable inference into a tractable one. Combining this inference procedure with the slice sampler of the Indian buffet process, we show that our model can learn the number of factors automatically. Using synthetic and real-world datasets, we show that the proposed model outperforms other state-of-the-art nonparametric factor models.