578 results for Dirichlet-multinomial
Abstract:
The joint sentiment-topic (JST) model was previously proposed to detect sentiment and topic simultaneously from text. The only supervision required by JST model learning is domain-independent polarity word priors. In this paper, we modify the JST model by incorporating word polarity priors through modifying the topic-word Dirichlet priors. We study the polarity-bearing topics extracted by JST and show that by augmenting the original feature space with polarity-bearing topics, the in-domain supervised classifiers learned from the augmented feature representation achieve state-of-the-art performance of 95% on the movie review data and an average of 90% on the multi-domain sentiment dataset. Furthermore, using feature augmentation and selection according to the information gain criterion for cross-domain sentiment classification, our proposed approach performs better than or comparably to previous approaches. Nevertheless, our approach is much simpler and does not require difficult parameter tuning.
Abstract:
Web APIs have gained increasing popularity in recent Web service technology development owing to the simplicity of their technology stack and the proliferation of mashups. However, efficiently discovering Web APIs and the relevant documentation on the Web is still a challenging task even with the best resources available on the Web. In this paper we cast the problem of detecting Web API documentation as a text classification problem of classifying a given Web page as Web API associated or not. We propose a supervised generative topic model called feature latent Dirichlet allocation (feaLDA) which offers a generic probabilistic framework for automatic detection of Web APIs. feaLDA not only captures the correspondence between data and the associated class labels, but also provides a mechanism for incorporating side information such as labelled features automatically learned from data that can effectively help improve classification performance. Extensive experiments on our Web API documentation dataset show that the feaLDA model outperforms three strong supervised baselines, namely naive Bayes, support vector machines, and the maximum entropy model, by over 3% in classification accuracy. In addition, feaLDA also gives superior performance when compared against other existing supervised topic models.
Abstract:
This article applies a multinomial logit estimator to investigate which factors affect SME owners' expectations to grow their businesses in Lithuania. Our findings provide evidence that SME owners' human capital (education) matters and that growth expectations are positively related to exporting. In addition, we analyse the link between the perceptions of business constraints and growth expectations and find that the factors perceived as the main business barriers are not necessarily those associated with reduced growth expectations. However, perceptions of corruption seem to affect growth expectations the most.
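As a minimal sketch of what a multinomial logit estimator models (not the authors' specification), the probability of each outcome category is a softmax over linear scores of the covariates. The category names, covariates, and coefficient values below are hypothetical and chosen only for illustration.

```python
import math

def multinomial_logit_probs(x, coefs):
    """Return P(category | x) under a multinomial logit model.

    x     : list of covariate values (hypothetical: intercept, education, exporter)
    coefs : dict mapping category name -> coefficient list; the reference
            category uses all-zero coefficients.
    """
    scores = {k: sum(b * v for b, v in zip(beta, x)) for k, beta in coefs.items()}
    m = max(scores.values())  # subtract the max for numerical stability
    exps = {k: math.exp(s - m) for k, s in scores.items()}
    total = sum(exps.values())
    return {k: e / total for k, e in exps.items()}

# Hypothetical coefficients: intercept, education, exporter dummy.
coefs = {
    "no_growth": [0.0, 0.0, 0.0],        # reference category
    "moderate_growth": [-0.5, 0.3, 0.4],
    "high_growth": [-1.0, 0.4, 0.9],
}
probs = multinomial_logit_probs([1.0, 1.0, 1.0], coefs)
```

In an actual study the coefficients would be estimated by maximum likelihood from survey data; here they are fixed by hand so that only the probability model is shown.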
Abstract:
Sentiment analysis or opinion mining aims to use automated tools to detect subjective information such as opinions, attitudes, and feelings expressed in text. This paper proposes a novel probabilistic modeling framework based on Latent Dirichlet Allocation (LDA), called joint sentiment/topic model (JST), which detects sentiment and topic simultaneously from text. Unlike other machine learning approaches to sentiment classification which often require labeled corpora for classifier training, the proposed JST model is fully unsupervised. The model has been evaluated on the movie review dataset to classify the review sentiment polarity, and minimum prior information has also been explored to further improve the sentiment classification accuracy. Preliminary experiments have shown promising results achieved by JST.
Abstract:
We investigate two numerical procedures for the Cauchy problem in linear elasticity, involving the relaxation of either the given boundary displacements (Dirichlet data) or the prescribed boundary tractions (Neumann data) on the over-specified boundary, in the alternating iterative algorithm of Kozlov et al. (1991). The two mixed direct (well-posed) problems associated with each iteration are solved using the method of fundamental solutions (MFS), in conjunction with the Tikhonov regularization method, while the optimal value of the regularization parameter is chosen via the generalized cross-validation (GCV) criterion. An efficient regularizing stopping criterion which ceases the iterative procedure at the point where the accumulation of noise becomes dominant and the errors in predicting the exact solutions increase, is also presented. The MFS-based iterative algorithms with relaxation are tested for Cauchy problems for isotropic linear elastic materials in various geometries to confirm the numerical convergence, stability, accuracy and computational efficiency of the proposed method.
Abstract:
We propose two algorithms involving the relaxation of either the given Dirichlet data or the prescribed Neumann data on the over-specified boundary, in the case of the alternating iterative algorithm of Kozlov et al. (1991) applied to Cauchy problems for the modified Helmholtz equation. A convergence proof of these relaxation methods is given, along with a stopping criterion. The numerical results obtained using these procedures, in conjunction with the boundary element method (BEM), show the numerical stability, convergence, consistency and computational efficiency of the proposed methods.
Abstract:
A Cauchy problem for general elliptic second-order linear partial differential equations in which the Dirichlet data in H^{1/2}(Γ1 ∪ Γ3) is assumed available on a larger part of the boundary Γ of the bounded domain Ω than the boundary portion Γ1 on which the Neumann data is prescribed, is investigated using a conjugate gradient method. We obtain an approximation to the solution of the Cauchy problem by minimizing a certain discrete functional and interpolating using the finite difference or boundary element method. The minimization involves solving equations obtained by discretising mixed boundary value problems for the same operator and its adjoint. It is proved that the solution of the discretised optimization problem converges to the continuous one, as the mesh size tends to zero. Numerical results are presented and discussed.
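For readers unfamiliar with the conjugate gradient method mentioned here, the following is a minimal sketch of the generic iteration applied to a small symmetric positive definite system, which is equivalent to minimizing the quadratic functional (1/2) x^T A x - b^T x. It is not the discrete functional of the paper: the matrix A, the vector b, and the function name are assumptions made purely for illustration.

```python
def conjugate_gradient(A, b, tol=1e-10, max_iter=100):
    """Solve A x = b for a symmetric positive definite matrix A (list of lists)."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                      # residual b - A x for the initial guess x = 0
    p = list(r)                      # initial search direction
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:      # stop once the residual norm is small
            break
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rs_old / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        # New direction is conjugate to the previous ones with respect to A.
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x

# Toy 2x2 system chosen for illustration only.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = conjugate_gradient(A, b)
```

In the setting of the abstract, each application of the operator would instead involve solving a discretised mixed boundary value problem for the operator and its adjoint.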
Abstract:
We propose two algorithms involving the relaxation of either the given Dirichlet data (boundary displacements) or the prescribed Neumann data (boundary tractions) on the over-specified boundary in the case of the alternating iterative algorithm of Kozlov et al. [16] applied to Cauchy problems in linear elasticity. A convergence proof of these relaxation methods is given, along with a stopping criterion. The numerical results obtained using these procedures, in conjunction with the boundary element method (BEM), show the numerical stability, convergence, consistency and computational efficiency of the proposed method.
Abstract:
In this paper, we consider analytical and numerical solutions to the Dirichlet boundary-value problem for the biharmonic partial differential equation on a disc of finite radius in the plane. The physical interpretation of these solutions is that of the harmonic oscillations of a thin, clamped plate. For the linear, fourth-order, biharmonic partial differential equation in the plane, it is well known that the solution method of separation in polar coordinates is not possible, in general. However, in this paper, for circular domains in the plane, it is shown that a method, here called quasi-separation of variables, does lead to solutions of the partial differential equation. These solutions are products of solutions of two ordinary linear differential equations: a fourth-order radial equation and a second-order angular differential equation. As is to be expected, without complete separation of the polar variables, there is some restriction on the range of these solutions in comparison with the corresponding separated solutions of the second-order harmonic differential equation in the plane. Notwithstanding these restrictions, the quasi-separation method leads to solutions of the Dirichlet boundary-value problem on a disc with centre at the origin, with boundary conditions determined by the solution and its inward drawn normal taking the value 0 on the edge of the disc. One significant feature of these biharmonic boundary-value problems, in general, follows from the form of the biharmonic differential expression when represented in polar coordinates. In this form, the differential expression has a singularity at the origin, in the radial variable. This singularity translates to a singularity at the origin of the fourth-order radial separated equation; this singularity necessitates the application of a third boundary condition in order to determine a self-adjoint solution to the Dirichlet boundary-value problem.
The penultimate section of the paper reports on numerical solutions to the Dirichlet boundary-value problem; these results are also presented graphically. Two specific cases are studied in detail and numerical values of the eigenvalues are compared with the results obtained in earlier studies.
Abstract:
We investigate a mixed problem with variable lateral conditions for the heat equation that arises in modelling exocytosis, i.e. the opening of a cell boundary in specific biological species for the release of certain molecules to the exterior of the cell. The Dirichlet condition is imposed on a surface patch of the boundary; this patch occupies a larger part of the boundary as time increases, modelling where the cell is opening (the fusion pore). On the remaining part, a zero Neumann condition is imposed (no molecules can cross this boundary). Uniform concentration is assumed at the initial time. We introduce a weak formulation of this problem and show that there is a unique weak solution. Moreover, we give an asymptotic expansion for the behaviour of the solution near the opening point and for small values in time. We also give an integral equation for the numerical construction of the leading term in this expansion.
Abstract:
We consider a Cauchy problem for the Laplace equation in a bounded region containing a cut, where the region is formed by removing a sufficiently smooth arc (the cut) from a bounded simply connected domain D. The aim is to reconstruct the solution on the cut from the values of the solution and its normal derivative on the boundary of the domain D. We propose an alternating iterative method which involves solving direct mixed problems for the Laplace operator in the same region. These mixed problems have either a Dirichlet or a Neumann boundary condition imposed on the cut and are solved by a potential approach. Each of these mixed problems is reduced to a system of integral equations of the first kind with logarithmic and hypersingular kernels and at most a square root singularity in the densities at the endpoints of the cut. The full discretization of the direct problems is realized by a trigonometric quadrature method which has super-algebraic convergence. The numerical examples presented illustrate the feasibility of the proposed method.
Abstract:
Latent topics derived by topic models such as Latent Dirichlet Allocation (LDA) are the result of hidden thematic structures which provide further insights into the data. The automatic labelling of such topics derived from social media poses new challenges, however, since topics may characterise novel events happening in the real world. Existing automatic topic labelling approaches which depend on external knowledge sources become less applicable here since relevant articles/concepts of the extracted topics may not exist in external sources. In this paper we propose to address the problem of automatic labelling of latent topics learned from Twitter as a summarisation problem. We introduce a framework which applies summarisation algorithms to generate topic labels. These algorithms are independent of external sources and only rely on the identification of dominant terms in documents related to the latent topic. We compare the efficiency of existing state-of-the-art summarisation algorithms. Our results suggest that summarisation algorithms generate better topic labels which capture event-related context compared to the top-n terms returned by LDA. © 2014 Association for Computational Linguistics.
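The "top-n terms" baseline mentioned above can be sketched as follows. This is an illustrative toy, not the authors' pipeline: the vocabulary is invented, and the topic-word distribution is drawn from a symmetric Dirichlet prior as a stand-in for a distribution learned by LDA. The helper names are hypothetical.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one sample from Dirichlet(alpha) via normalized gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def top_n_terms(topic_word_probs, vocab, n):
    """Return the n highest-probability terms of one topic."""
    ranked = sorted(zip(vocab, topic_word_probs), key=lambda t: t[1], reverse=True)
    return [term for term, _ in ranked[:n]]

rng = random.Random(0)
vocab = ["storm", "flood", "rescue", "match", "goal", "vote"]  # toy vocabulary
phi = sample_dirichlet([0.5] * len(vocab), rng)  # stand-in for a learned topic
label_terms = top_n_terms(phi, vocab, 3)
```

A summarisation-based labeller, as proposed in the abstract, would replace `top_n_terms` with an algorithm that selects dominant terms from the documents related to the topic.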
Abstract:
Improving the performance of private sector small and medium-sized enterprises (SMEs) in a cost-effective manner is a major concern for government. Governments have saved costs by moving information online rather than through more expensive face-to-face exchanges between advisers and clients. Building on previous work that distinguished between types of advice, this article evaluates whether these changes to delivery mechanisms affect the type of advice received. Using a multinomial logit model of 1334 cases of business advice to small firms collected in England, the study found that advice to improve capabilities was taken by smaller firms who were less likely to have limited liability or undertake business planning. SMEs sought word-of-mouth referrals before taking internal, capability-enhancing advice. This is also the case when that advice was part of a wider package of assistance involving both internal and external aspects. Only when firms took advice that used extant capabilities did they rely on the Internet. Therefore, when the Internet is privileged over face-to-face advice, the changes made by each recipient of advice are likely to diminish, causing less impact from advice within the economy. It implies that fewer firms will adopt the sorts of management practices that would improve their productivity. © 2014 Taylor & Francis.
Abstract:
The Resource Space Model is a kind of data model which can effectively and flexibly manage the digital resources in cyber-physical systems from multidimensional and hierarchical perspectives. This paper focuses on constructing a resource space automatically. We propose a framework that organizes a set of digital resources according to different semantic dimensions, combining human background knowledge in WordNet and Wikipedia. The construction process includes four steps: extracting candidate keywords, building semantic graphs, detecting semantic communities and generating the resource space. An unsupervised statistical language topic model (i.e., Latent Dirichlet Allocation) is applied to extract candidate keywords of the facets. To better interpret the meanings of the facets found by LDA, we map the keywords to Wikipedia concepts, calculate word relatedness using WordNet's noun synsets and construct corresponding semantic graphs. Moreover, semantic communities are identified by the GN algorithm. After extracting candidate axes based on the Wikipedia concept hierarchy, the final axes of the resource space are sorted and picked out through three different ranking strategies. The experimental results demonstrate that the proposed framework can organize resources automatically and effectively. © 2013 Published by Elsevier Ltd. All rights reserved.
Abstract:
* The work is supported by RFBR, grant 04-01-00858-a