58 results for NMF


Relevance:

10.00% 10.00%

Publisher:

Abstract:

In this paper we formulate the nonnegative matrix factorisation (NMF) problem as a maximum likelihood estimation problem for hidden Markov models and propose online expectation-maximisation (EM) algorithms to estimate the NMF and the other unknown static parameters. We also propose a sequential Monte Carlo approximation of our online EM algorithm. We show the performance of the proposed method with two numerical examples. © 2012 IFAC.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

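Internet personalized recommender systems recommend the most relevant online information to users according to their interests. As online information overload worsens and users' demand for personalized information retrieval grows, recommender systems have come to play a key role in core Internet applications such as search engines, e-commerce, and online communities, and are receiving increasing attention. However, deploying a mature recommender system on a large website remains costly, requiring substantial computing and storage resources, and recommendation accuracy still has considerable room for improvement; both pose many challenges for recommender-system research. Among these challenges, accuracy and scalability have long been the field's two main concerns. Accuracy is the proportion of recommended items the user is genuinely interested in; scalability is whether the system can process massive data within tolerable time and space complexity. Improving accuracy while also improving scalability is the main research goal in improving recommender systems. Academic research, however, has focused mostly on accuracy. Many highly accurate algorithms require complex computation and so have limited ability to handle large-scale dynamic data, and their evaluations do not take scalability into account; as a result, they remain hard to apply at scale in industry. This thesis attempts to address that problem. It borrows the idea of incremental learning: updating an existing machine-learning model using the newest training data, with no reference or only partial reference to old training data. Compared with batch processing over all the data, incremental updating greatly reduces the complexity of model updates. This substantially improves the efficiency with which a recommendation model is updated when new training data arrive and lowers computational cost, so that the model can be updated more promptly, which in turn improves recommendation accuracy. Specifically, besides proposing two new incremental collaborative filtering algorithms, we use incremental learning to accelerate several of the most accurate current recommendation algorithms, in particular the speed and efficiency with which they update their models on new training data, making large-scale application of these algorithms possible. In addition, new training data reflect users' latest interests, so updates should weight them more heavily than old data; recommendations then account for users' long-term interests while giving particular weight to recent ones, making the results more accurate. In sum, we aim to use incremental learning to make recommendation algorithms more efficient and accurate when updating, and thus genuinely suited to recommendation over massive Internet data; the work may also inform other research on incremental recommender systems. Our contributions fall into the following areas: An incremental recommendation algorithm based on topic models. Topic models, especially probabilistic latent semantic analysis (PLSA), are a mainstream approach widely used in recommender systems, performing well in text recommendation, image recommendation, and collaborative filtering. A major factor limiting PLSA is the high complexity of updating it: the model can only be updated in batch, so recommendations are not timely and cannot reflect users' latest interests or the latest overall trends. We propose an incremental learning method applicable to both text analysis and collaborative filtering. When new training data arrive, for text-based recommendation the method updates topic distributions only for the most relevant users and documents and the words involved, and gives new documents higher weight; for collaborative filtering it updates only the items the current user has rated and the users who have rated the current item. This greatly reduces update complexity and raises the weight of new data, making recommendations more accurate and timely. The algorithm performed well on the Tianya Q&A text dataset and on collaborative filtering datasets including MovieLens (movies), Last.FM (songs), and Douban (books). A collaborative filtering method based on the ant colony algorithm. Inspired by swarm intelligence, we propose Ant Collaborative Filtering, a collaborative filtering method resembling the ant colony algorithm. At initialization, each user or group of users receives a unit amount of a globally unique pheromone. When a user rates an item or expresses interest in it, the user's pheromones are propagated to the item, and the pheromones already on the item (initially zero) are propagated to the user. The pheromones carried by users and items also evaporate at a certain rate over time; this evaporation mechanism lets recommendations give more weight to users' recent interests. At recommendation time, similarities are derived from the types and amounts of pheromone carried by users and items, and recommendations are made by classical similarity comparison. The advantages of the method are that it effectively reduces sparsity in the training data, that it can update and recommend in real time, and that it accounts for changes in user interest over time. We validated it on the MovieLens movie-rating, Douban book, and Last.FM music datasets. Finally, we built an Internet news recommender system, implemented as a Firefox extension, that automatically collects users' browsing interests and preferences and uses different recommendation algorithms on the back end to recommend news of interest.
A two-stage collaborative filtering method based on co-clustering. Clustering is an effective way to reduce data size and sparsity, and for large, sparse collaborative filtering training data it is a natural and, in practice, effective preprocessing step. We therefore propose a two-stage collaborative filtering framework. First, a co-clustering method we propose partitions the original rating matrix into many low-dimensional blocks, each containing the ratings of similar users for similar items. Then a matrix-fitting method (we use nonnegative matrix factorization, NMF, and the topic model PLSA) predicts the unknown ratings within each block. When a user adds a rating for an item, only the block containing that user or item needs to be updated and re-estimated, which greatly speeds up rating prediction. We validated the algorithm on the MovieLens movie-rating dataset. The results can be applied directly in large recommender systems and also offer guidance for later research on incremental recommender systems. First, the PLSA-based incremental algorithm is a useful reference for other recommender systems based on graphical models. Second, the ant-colony algorithm is a valuable exploration of a new class of collaborative filtering algorithms based on swarm intelligence. Finally, the two-stage framework offers a general, effective solution for improving the scalability and update efficiency of recommendation algorithms. Building recommender systems is an unending optimization process. Beyond continued gains in accuracy, the performance of recommendation algorithms must keep improving as the volume of Internet data grows. Incremental learning is undoubtedly the most important way to speed up the updating of recommendation algorithms, and this thesis provides a reference for work in that direction.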
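The pheromone mechanism of Ant Collaborative Filtering described in this abstract can be illustrated with a toy sketch. This is not the thesis's implementation: the class name `AntCF`, the `transfer` and `evaporation` rates, the choice to copy pheromone without depleting the source, and the use of cosine similarity are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def cosine(p, q):
    """Cosine similarity between two sparse pheromone vectors (dicts)."""
    dot = sum(v * q.get(k, 0.0) for k, v in p.items())
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(
        sum(v * v for v in q.values())
    )
    return dot / norm if norm else 0.0


class AntCF:
    """Toy pheromone-based collaborative filter (illustrative assumptions only)."""

    def __init__(self, transfer=0.5, evaporation=0.1):
        self.transfer = transfer        # assumed fraction copied per interaction
        self.evaporation = evaporation  # assumed fraction lost per time step
        self.users = {}                 # user -> {pheromone type: amount}
        self.items = defaultdict(dict)  # item -> {pheromone type: amount}, initially empty

    def add_user(self, user):
        # Each user starts with one unit of a pheromone type unique to that user.
        self.users[user] = {user: 1.0}

    def rate(self, user, item):
        # Exchange pheromone in both directions, using snapshots so that the
        # two transfers do not feed into each other within one interaction.
        u, i = self.users[user], self.items[item]
        u_before, i_before = dict(u), dict(i)
        for k, v in u_before.items():
            i[k] = i.get(k, 0.0) + self.transfer * v
        for k, v in i_before.items():
            u[k] = u.get(k, 0.0) + self.transfer * v

    def step(self):
        # Evaporation: older interactions fade, so recent interests weigh more.
        for vec in list(self.users.values()) + list(self.items.values()):
            for k in vec:
                vec[k] *= 1.0 - self.evaporation

    def user_similarity(self, a, b):
        return cosine(self.users[a], self.users[b])
```

Users who rate the same item end up sharing pheromone types, so their similarity becomes nonzero even when their rated item sets barely overlap, which is one way to read the abstract's claim about reduced sparsity.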

Relevance:

10.00% 10.00%

Publisher:

Abstract:

We study the application of matrix decomposition algorithms such as nonnegative matrix factorization (NMF) to frequency-domain representations of musical audio signals. These algorithms, driven by a reconstruction error function, learn a set of basis functions and a corresponding set of coefficients that approximate the input signal. We compare three reconstruction error functions when NMF is applied to monophonic and harmonized scales: least squares, Kullback-Leibler divergence, and a recently introduced phase-dependent divergence measure. New methods for interpreting the resulting decompositions are presented and compared with previously used methods that require knowledge of the acoustic domain. Finally, we analyze how well the learned basis functions generalize with respect to three musical parameters: amplitude, duration, and instrument type. To do so, we introduce two algorithms for labeling basis functions that outperform the previous approach in most of our tests, the instrument task with monophonic audio being the only notable exception.
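For reference, the first two reconstruction error functions named in this abstract are usually written as follows for a nonnegative input matrix V approximated by the product WH. The symbols V, W, and H are assumptions, since the abstract fixes no notation, and the phase-dependent measure is omitted because the abstract does not define it.

```latex
% Least-squares (Frobenius) reconstruction error
D_{\mathrm{LS}}(V \,\|\, WH) = \frac{1}{2} \sum_{i,j} \bigl( V_{ij} - (WH)_{ij} \bigr)^2

% Generalized Kullback-Leibler divergence
D_{\mathrm{KL}}(V \,\|\, WH) = \sum_{i,j} \Bigl( V_{ij} \log \frac{V_{ij}}{(WH)_{ij}} - V_{ij} + (WH)_{ij} \Bigr)
```

Both are minimized subject to W and H having no negative entries.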

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Biological systems exhibit rich and complex behavior through the orchestrated interplay of a large array of components. It is hypothesized that separable subsystems with some degree of functional autonomy exist; deciphering their independent behavior and functionality would greatly facilitate understanding the system as a whole. Discovering and analyzing such subsystems are hence pivotal problems in the quest to gain a quantitative understanding of complex biological systems. In this work, using approaches from machine learning, physics and graph theory, methods for the identification and analysis of such subsystems were developed. A novel methodology, based on a recent machine learning algorithm known as non-negative matrix factorization (NMF), was developed to discover such subsystems in a set of large-scale gene expression data. This set of subsystems was then used to predict functional relationships between genes, and this approach was shown to score significantly higher than conventional methods when benchmarking them against existing databases. Moreover, a mathematical treatment was developed to treat simple network subsystems based only on their topology (independent of particular parameter values). Application to a problem of experimental interest demonstrated the need for extensions to the conventional model to fully explain the experimental data. Finally, the notion of a subsystem was evaluated from a topological perspective. A number of different protein networks were examined to analyze their topological properties with respect to separability, seeking to find separable subsystems. These networks were shown to exhibit separability in a nonintuitive fashion, while the separable subsystems were of strong biological significance. It was demonstrated that the separability property found was not due to incomplete or biased data, but is likely to reflect biological structure.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This thesis examines the banana dispute between Ecuador and the European Union (EU), caused by the bloc's failure to comply with the agreements and rulings of the World Trade Organization (WTO), and discusses how this process has affected negotiations toward an agreement with the EU. In response to the rulings under the WTO Dispute Settlement Mechanism, the EU has applied a tariff-only import regime since 2006. Negotiations continued with the aim of reaching an agreement. Ecuador has proposed signing a Trade Agreement for Development, which covers areas beyond trade and is oriented toward development. Countries that signed association agreements with the EU negotiated lower tariffs and gradual tariff reductions, while Ecuador remains subject to the most-favoured-nation tariff (NMF, for nación más favorecida). The thesis presents a conceptual framework for analyzing aspects of foreign trade. It then examines the importance of banana production in Ecuador and reviews the history of the banana dispute and the alternative agreements proposed with the EU. Finally, it analyzes how the bloc's agreements with Central America and with Peru and Colombia on bananas affect the marketing of Ecuadorian bananas and Ecuador's negotiations with the European Union.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The issue of how children learn the meaning of words is fundamental to developmental psychology. Recent attempts to develop or evolve efficient communication protocols among interacting robots or virtual agents have also brought that issue to a central place in more applied research fields, such as computational linguistics and neural networks. An attractive approach to learning an object-word mapping is so-called cross-situational learning. This learning scenario is based on the intuitive notion that a learner can determine the meaning of a word by finding something in common across all observed uses of that word. Here we show how the deterministic Neural Modeling Fields (NMF) categorization mechanism can be used by the learner as an efficient algorithm to infer the correct object-word mapping. To achieve that, we first reduce the original on-line learning problem to a batch learning problem where the inputs to the NMF mechanism are all possible object-word associations that could be inferred from the cross-situational learning scenario. Since many of those associations are incorrect, they are considered clutter or noise and are discarded automatically by a clutter detector model included in our NMF implementation. With these two key ingredients, batch learning and clutter detection, the NMF mechanism was able to infer the correct object-word mapping perfectly. (C) 2009 Elsevier Ltd. All rights reserved.
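The intuitive notion behind cross-situational learning, finding what is common across all observed uses of a word, can be sketched with plain set intersection. This is only an illustration of that notion under noise-free observations; it is not the Neural Modeling Fields mechanism or the clutter detector from the paper, and the function name and data layout are hypothetical.

```python
def cross_situational_meanings(observations):
    """Map each word to the objects present in every situation where it was heard.

    observations: iterable of (words, objects) pairs, each a set of labels
    observed together in one situation (hypothetical data layout).
    """
    candidates = {}
    for words, objects in observations:
        for word in words:
            if word in candidates:
                # Keep only objects that also appeared with this word before.
                candidates[word] &= set(objects)
            else:
                candidates[word] = set(objects)
    return candidates


# Usage: three situations, each pairing two words with two objects.
observations = [
    ({"ball", "dog"}, {"BALL", "DOG"}),
    ({"ball", "cup"}, {"BALL", "CUP"}),
    ({"dog", "cup"}, {"DOG", "CUP"}),
]
meanings = cross_situational_meanings(observations)
```

With noisy input, a strict intersection can empty a word's candidate set, which is the kind of problem the clutter detection mentioned in the abstract is meant to handle.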

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The relationship between thought and language and, in particular, the issue of whether and how language influences thought is still a matter of fierce debate. Here we consider a discrimination task scenario to study language acquisition in which an agent receives linguistic input from an external teacher, in addition to sensory stimuli from the objects that exemplify the overlapping categories that make up the environment. Sensory and linguistic input signals are fused using the Neural Modelling Fields (NMF) categorization algorithm. We find that the agent with language is capable of differentiating object features that it could not distinguish without language. In this sense, the linguistic stimuli prompt the agent to redefine and refine the discrimination capacity of its sensory channels. (C) 2007 Elsevier Ltd. All rights reserved.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Background
Breast carcinoma is accompanied by changes in the acellular and cellular components of the microenvironment, the latter typified by a switch from fibroblasts to myofibroblasts.


Methods
We utilised conditioned media cultures, Western blot analysis and immunocytochemistry to investigate the differential effects of normal mammary fibroblasts (NMFs) and mammary cancer-associated fibroblasts (CAFs) on the phenotype and behaviour of PMC42-LA breast cancer cells. NMFs were obtained from a mammary gland at reduction mammoplasty, and CAFs from a mammary carcinoma after resection.


Results
We found greater expression of myofibroblastic markers in CAFs than in NMFs. Medium from both CAFs and NMFs induced novel expression of α-smooth muscle actin and cytokeratin-14 in PMC42-LA organoids. However, although conditioned media from NMFs resulted in distribution of vimentin-positive cells to the periphery of PMC42-LA organoids, this was not seen with CAF-conditioned medium. Upregulation of vimentin was accompanied by a mis-localization of E-cadherin, suggesting a loss of adhesive function. This was confirmed by visualizing the change in active β-catenin, localized to the cell junctions in control cells/cells in NMF-conditioned medium, to inactive β-catenin, localized to nuclei and cytoplasm in cells in CAF-conditioned medium.


Conclusion
We found no significant difference between the influences of NMFs and CAFs on PMC42-LA cell proliferation, viability, or apoptosis; significantly, we demonstrated a role for CAFs, but not for NMFs, in increasing the migratory ability of PMC42-LA cells. By concentrating NMF-conditioned media, we demonstrated the presence of factor(s) that induce epithelial-mesenchymal transition in NMF-conditioned media that are present at higher levels in CAF-conditioned media. Our in vitro results are consistent with observations in vivo showing that alterations in stroma influence the phenotype and behaviour of surrounding cells and provide evidence for a role for CAFs in stimulating cancer progression via an epithelial-mesenchymal transition. These findings have implications for our understanding of the roles of signalling between epithelial and stromal cells in the development and progression of mammary carcinoma.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The least-squares problem with l1 regularization has been proposed as a promising method for sparse signal reconstruction (e.g., basis pursuit de-noising and compressed sensing) and feature selection (e.g., the Lasso algorithm) in signal processing, statistics, and related fields. These problems can be cast as an l1-regularized least-squares program (LSP). In this paper, we propose a novel monotonic fixed-point method to solve large-scale l1-regularized LSPs, and we prove the stability and convergence of the proposed method. Furthermore, we generalize this method to the least-squares matrix problem and apply it to nonnegative matrix factorization (NMF). The method is illustrated on sparse signal reconstruction, pattern recognition, and blind source separation problems, and it tends to converge faster and yield sparser solutions than other l1-regularized algorithms.
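For orientation, the l1-regularized least-squares program is commonly stated as below. The symbols A (measurement or design matrix), y (observations), x (unknown sparse vector), and the weight lambda are assumptions, since the abstract introduces no notation.

```latex
% l1-regularized least-squares program (LSP); symbols are illustrative
\min_{x \in \mathbb{R}^n} \; \| A x - y \|_2^2 + \lambda \| x \|_1 , \qquad \lambda > 0
```

Larger values of lambda push more entries of x to exactly zero, which is the source of the sparsity the abstract refers to.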

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Recently, nonnegative matrix factorization (NMF) has attracted increasing attention because of its promise for a wide range of applications. A remaining problem, however, is that the resulting factors are not necessarily realistically interpretable. Constraints are usually added to standard NMF to produce interpretable results. In this paper, a minimum-volume constrained NMF is proposed, and an efficient multiplicative update algorithm is developed based on natural gradient optimization. The proposed method can be applied to the blind source separation (BSS) problem, a hot topic with many potential applications, especially when the sources are mutually dependent. Simulation results of BSS for images show the superiority of the proposed method.
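As background for the "standard NMF" and "multiplicative update" terms used in this abstract, here is a minimal sketch of the classical Lee-Seung multiplicative updates for the least-squares objective. It is the unconstrained baseline, not the minimum-volume method proposed in the paper; the function names, iteration count, and `eps` guard are assumptions.

```python
import numpy as np


def nmf_multiplicative(V, rank, n_iter=200, eps=1e-10, seed=0):
    """Approximate a nonnegative matrix V (m x n) as W @ H with W, H >= 0.

    Plain Lee-Seung multiplicative updates for the Frobenius-norm error;
    no volume or sparsity constraint is applied.
    """
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(n_iter):
        # Each factor is multiplied by a nonnegative ratio, so entries that
        # start nonnegative stay nonnegative; eps guards against division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def reconstruction_error(V, W, H):
    """Frobenius norm of the residual V - W @ H."""
    return float(np.linalg.norm(V - W @ H, "fro"))
```

Because the factorization is not unique (W and H can be rescaled or mixed without changing their product), constrained variants such as the one in this abstract add a penalty to pick out a more interpretable solution.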

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Nonnegative matrix factorization (NMF) is widely used in signal separation and image compression. Motivated by its successful applications, we propose a new cryptosystem based on NMF, where the nonlinear mixing (NLM) model with strong noise is introduced for encryption and NMF is used for decryption. The security of the cryptosystem relies on the following two facts: 1) the constructed multivariable nonlinear function is not invertible; 2) the process of NMF is unilateral if the inverse matrix of the constructed linear mixing matrix is not nonnegative. Compared with Lin's method (2006), a theoretical scheme using one-time padding, our cipher can be used repeatedly to meet practical needs, i.e., multi-time padding is used in our cryptosystem. Also, there is no restriction on the statistical characteristics of the ciphers and the plaintexts. Thus, more signals can be processed (successfully encrypted and decrypted), whether they are correlated, sparse, or Gaussian. Furthermore, instead of the zero-crossing-count method, which is often unstable in encryption and decryption, an improved method based on the kurtosis of the signals is introduced to resolve permutation ambiguities in waveform reconstruction. Simulations are given to illustrate the security and availability of our cryptosystem.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Nonnegative matrix factorization (NMF) is a widely used method for blind spectral unmixing (SU), which aims at obtaining the endmembers and corresponding fractional abundances, knowing only the collected mixed spectral data. The abundances may be sparse (i.e., the endmembers may be sparsely distributed), and sparse NMF tends to lead to a unique result, so it is intuitive and meaningful to constrain NMF with sparseness when solving SU. However, due to the abundance sum-to-one constraint in SU, traditional sparseness measured by the L0/L1-norm is no longer an effective constraint. A novel measure of sparseness (termed the S-measure) using higher-order norms of the signal vector is proposed in this paper. It has physical significance. By using the S-measure constraint (SMC), a gradient-based sparse NMF algorithm (termed NMF-SMC) is proposed for solving the SU problem, where the learning rate is adaptively selected and the endmembers and abundances are estimated simultaneously. The proposed NMF-SMC makes no pure index assumption and does not need to know the exact degree of sparseness of the abundances a priori. Moreover, it does not require dimension-reduction preprocessing, in which some useful information may be lost. Experiments based on synthetic mixtures and real-world images collected by AVIRIS and HYDICE sensors are performed to evaluate the validity of the proposed method.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Online blind source separation (BSS) has been proposed to overcome the high computational cost that limits the practical application of traditional batch BSS algorithms. However, existing online BSS methods are mainly used to separate independent or uncorrelated sources. Recently, nonnegative matrix factorization (NMF) has shown great potential for separating correlated sources, where constraints are often imposed to overcome the non-uniqueness of the factorization. In this paper, an incremental NMF with a volume constraint is derived and used to solve online BSS. The volume constraint on the mixing matrix enhances the identifiability of the sources, while the incremental learning mode reduces the computational cost. The proposed method takes advantage of the natural-gradient-based multiplicative update rule, and it performs especially well in the recovery of dependent sources. Simulations of BSS for dual-energy X-ray images, online encrypted speech signals, and highly correlated face images show the validity of the proposed method.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Autism Spectrum Disorder (ASD) is growing at a staggering rate, but little is known about the cause of this condition. Inferring learning patterns from therapeutic performance data, and subsequently clustering ASD children into subgroups, is important to understand this domain, and more importantly to inform evidence-based intervention. However, this data-driven task was difficult in the past due to insufficient data for reliable analysis. For the first time, using data from a recent application for early intervention in autism (TOBY Play pad), whose download count now exceeds 4500, we present in this paper the automatic discovery of learning patterns across 32 skills in sensory, imitation and language. We use unsupervised learning methods for this task, but a notorious problem with existing methods is the need to specify the number of patterns correctly in advance, which in our case is even more difficult due to the complexity of the data. To this end, we appeal to recent Bayesian nonparametric methods, in particular Bayesian Nonparametric Factor Analysis. This model uses the Indian Buffet Process (IBP) as a prior on a binary matrix with infinitely many columns to allocate groups of intervention skills to children. The optimal number of learning patterns as well as subgroup assignments are inferred automatically from data. Our experimental results follow an exploratory approach and present different newly discovered learning patterns. To provide quantitative results, we also report the clustering evaluation against K-means and nonnegative matrix factorization (NMF). In addition to the novelty of this new problem, we were able to demonstrate the suitability of Bayesian nonparametric models over parametric rivals.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Spectral unmixing (SU) is an emerging problem in remote sensing image processing. Since both the endmember signatures and their abundances have nonnegative values, it is a natural choice to employ the attractive nonnegative matrix factorization (NMF) methods to solve this problem. Motivated by the sparsity of the abundances, NMF with a local smoothness constraint (NMF-LSC) is proposed in this paper. In the proposed method, the smoothness constraint is used to impose sparseness, instead of the traditional L1-norm, which is restricted by the underlying column-sum-to-one requirement on the abundance matrix. Simulations show the advantages of our algorithm over the compared methods.