992 results for Latent Semantic Indexing


Relevance:

20.00% 20.00%

Publisher:

Abstract:

Knowledge-based clusters are studied from the structural point of view. Generalized descriptions for such clusters are stated and illustrated. Peculiarities of certain knowledge-based cluster configurations are highlighted. The adequacy of the connectives logical and (“and”) and logical or (“exclusive-or”) in describing such clusters is justified. The definition of “concept” is elaborated from the clustering point of view and used to establish the equivalence between descriptions of clusters and concepts. The order-independence of the semantic-directed clustering approach is established formally, based on axiomatic considerations.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

It is important to identify the "correct" number of topics in mechanisms like Latent Dirichlet Allocation (LDA), as it determines the quality of the features presented to classifiers like SVM. In this work we propose a measure to identify the correct number of topics and offer empirical evidence in its favor in terms of classification accuracy and the number of topics that are naturally present in the corpus. We show the merit of the measure by applying it to real-world as well as synthetic data sets (both text and images). In proposing this measure, we view LDA as a matrix factorization mechanism, wherein a given corpus C is split into two matrix factors M1 and M2, as given by $C_{d \times w} = M1_{d \times t} \times Q_{t \times w}$ (with Q denoting the second factor, M2), where d is the number of documents present in the corpus and w is the size of the vocabulary. The quality of the split depends on t, the number of topics chosen. The measure is computed in terms of the symmetric KL-divergence of salient distributions that are derived from these matrix factors. We observe that the divergence values are higher for non-optimal numbers of topics; this shows up as a 'dip' at the right value of t.
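
The selection rule described in this abstract can be sketched in a few lines. The abstract does not say which "salient distributions" are derived from the two factors, so the sketch below assumes they have already been computed for each candidate topic count and are strictly positive; the function names and the `candidates` layout are illustrative assumptions, not the authors' code.

```python
import math


def normalize(values):
    """Scale non-negative values so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length.

    Assumes q is strictly positive wherever p is positive.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def symmetric_kl(p, q):
    """Symmetric KL-divergence: KL(p || q) + KL(q || p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)


def best_topic_count(candidates):
    """Pick the topic count whose two distributions diverge least.

    `candidates` maps a topic count t to a pair of unnormalized
    distributions (hypothetical inputs, one per matrix factor).
    Returns the t at the 'dip' together with all divergence scores.
    """
    scores = {
        t: symmetric_kl(normalize(a), normalize(b))
        for t, (a, b) in candidates.items()
    }
    return min(scores, key=scores.get), scores
```

In use, one would fit the model once per candidate t, derive the two distributions from the resulting factors, and pass them to `best_topic_count`; the t with the lowest score corresponds to the dip mentioned in the abstract.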

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The decision to patent a technology is a difficult one to make for the top management of any organization. The expected value that the patent might deliver in the market is an important factor that impacts this judgement. Earlier researchers have suggested that patent prices are better indicators of the value of a patent and that auction prices are the best way of determining value. However, the lack of public data on pricing has prevented research on understanding the dynamics of patent pricing. Our paper uses singleton patent auction price data of Ocean Tomo LLC to study the prices of patents. We describe price characteristics of these patents. We correlated the price of these patents with their age and found a significant correlation. A price-age matrix was developed, and we describe the price characteristics of patents using four quadrants of the matrix, namely young and old patents with low and high prices. We also found that patents owned by small firms are transacted more often and that inventor-owned patents attracted a better price than assignee-owned patents.
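
A price-age matrix of the kind described here can be illustrated with a short sketch. The abstract does not state how the quadrant boundaries were chosen, so the version below assumes the median age and median price as cut-offs; the record fields (`id`, `age`, `price`) are hypothetical.

```python
from statistics import median


def price_age_quadrants(patents):
    """Assign each patent to one of four price-age quadrants.

    `patents` is a list of dicts with hypothetical keys "id", "age"
    and "price". Median cut-offs are an assumption of this sketch.
    """
    age_cut = median(p["age"] for p in patents)
    price_cut = median(p["price"] for p in patents)
    labels = {}
    for p in patents:
        age = "old" if p["age"] > age_cut else "young"
        price = "high" if p["price"] > price_cut else "low"
        labels[p["id"]] = f"{age}/{price}-price"
    return labels
```

Any other cut-off (a fixed age in years, a price percentile) could be substituted without changing the structure of the function.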

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Indexing of a decagonal quasicrystal using a scheme based on five planar vectors and one vector perpendicular to them is examined in detail. A method is proposed for determining the indices of the zone axes that a reciprocal vector would make in a decagonal phase of any periodicity. By this method, the location of the zone axes made by any reciprocal vector can be predicted. The orthogonality condition has been simplified for the zone axes containing twofold vectors. The locations of zone axes have also been determined by an alternative method, utilizing spherical trigonometric calculations, which confirms the zone-axis locations given by the indices. The effect of one-dimensional periodicity on the indices and on the accuracy of the zone-axis determination is discussed. Rules are developed for the formation of zone axes between several reciprocal vectors and for the prediction of all the reciprocal vectors in a zone.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The least path criterion or least path length in the context of redundant basis vector systems is discussed and a mathematical proof is presented of the uniqueness of indices obtained by applying the least path criterion. Though the method has greater generality, this paper concentrates on the two-dimensional decagonal lattice. The order of redundancy is also discussed; this will help eventually to correlate with other redundant but desirable indexing sets.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Land use and land cover changes affect the partitioning of latent and sensible heat, which impacts the broader climate system. Increased latent heat flux to the atmosphere has a local cooling influence known as 'evaporative cooling', but this energy will be released back to the atmosphere wherever the water condenses. However, the extent to which local evaporative cooling provides a global cooling influence has not been well characterized. Here, we perform a highly idealized set of climate model simulations aimed at understanding the effects that changes in the balance between surface sensible and latent heating have on the global climate system. We find that globally adding a uniform 1 W m⁻² source of latent heat flux along with a uniform 1 W m⁻² sink of sensible heat leads to a decrease in global mean surface air temperature of 0.54 ± 0.04 K. This occurs largely as a consequence of planetary albedo increases associated with an increase in low-elevation cloudiness caused by increased evaporation. Thus, our model results indicate that, on average, when latent heating replaces sensible heating, global, and not merely local, surface temperatures decrease.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper focuses on studying the relationship between patent latent variables and patent price. From the existing literature, seven patent latent variables, namely age, generality, originality, foreign filings, technology field, forward citations, and backward citations, were identified as having an influence on patent value. We used Ocean Tomo's patent auction price data in this study. We transformed the price and the predictor variables (excluding the dummy variables) to their logarithmic values. The OLS estimates revealed that forward citations and foreign filings were positively correlated with price. The two variables jointly explained 14.79% of the variance in patent pricing. We did not find sufficient evidence to draw any definite conclusions on the relationship between price and variables such as age, technology field, generality, backward citations and originality. The Heckman two-stage sample selection model was used to test for selection bias. © 2011 Elsevier Ltd. All rights reserved.
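
The log transformation followed by OLS can be shown with a one-predictor sketch. This is a simplification made for illustration: the study uses several predictors plus dummy variables, and the function name and inputs below are assumptions rather than the authors' specification. Both inputs must be strictly positive for the logarithm to be defined.

```python
import math


def log_log_ols(x, y):
    """Fit log(y) = intercept + slope * log(x) by ordinary least squares.

    Returns (slope, intercept, r_squared). In a log-log model the
    slope reads as an elasticity: the percentage change in y that
    accompanies a one percent change in x.
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(lx, ly))
    ss_tot = sum((b - mean_y) ** 2 for b in ly)
    return slope, intercept, 1 - ss_res / ss_tot
```

The returned R² is the share of variance in log price explained by the predictor, which is the kind of quantity the abstract reports as 14.79% for its two significant variables taken together.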

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Background: Mycobacterium tuberculosis, a causative agent of chronic tuberculosis disease, is widespread among some animal species too. There is a paucity of information on the distribution, prevalence and true disease status of tuberculosis in Asian elephants (Elephas maximus). The aim of this study was to estimate the sensitivity and specificity of serological tests to diagnose M. tuberculosis infection in captive elephants in southern India while simultaneously estimating sero-prevalence. Methodology/Principal Findings: Health assessment of 600 elephants was carried out and their sera screened with a commercially available rapid serum test. Trunk wash culture of select rapid serum test positive animals yielded no animal positive for M. tuberculosis isolation. Under Indian field conditions, where the true disease status is unknown, we used a latent class model to estimate the diagnostic characteristics of an existing test (rapid serum test) and new tests (four in-house ELISAs). One hundred and seventy-nine sera were randomly selected for screening in the five tests. Diagnostic sensitivities of the four ELISAs were 91.3-97.6% (95% Credible Interval (CI): 74.8-99.9) and diagnostic specificities were 89.6-98.5% (95% CI: 79.4-99.9), based on the model we assumed. We estimate that 53.6% (95% CI: 44.6-62.8) of the samples tested were free from infection with M. tuberculosis and 15.9% (97.5% CI: 9.8-24.0) tested positive on all five tests. Conclusions/Significance: Our results provide evidence for a high prevalence of asymptomatic M. tuberculosis infection in Asian elephants in a captive Indian setting. Further validation of these tests would be important in formulating area-specific effective surveillance and control measures.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Song selection and mood are interdependent. If we capture a song’s sentiment, we can determine the mood of the listener, which can serve as a basis for recommendation systems. Songs are generally classified according to genres, which don’t entirely reflect sentiments. Thus, we require an unsupervised scheme to mine them. Sentiments are classified into either two (positive/negative) or multiple (happy/angry/sad/...) classes, depending on the application. We are interested in analyzing the feelings invoked by a song, involving multi-class sentiments. To mine the hidden sentimental structure behind a song in terms of “topics”, we consider its lyrics and use Latent Dirichlet Allocation (LDA). Each song is a mixture of moods, and the topics mined by LDA can represent these moods. This gives us a scheme for collecting songs of similar mood. For validation, we use a dataset of songs containing 6 moods annotated by users of a particular website.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

There are many popular models available for the classification of documents, such as the Naïve Bayes classifier, k-Nearest Neighbors and the Support Vector Machine. In all these cases, the representation is based on the “Bag of words” model. This model doesn't capture the actual semantic meaning of a word in a particular document. Semantics are better captured by the proximity of words and their occurrence in the document. We propose a new “Bag of Phrases” model to capture this discriminative power of phrases for text classification. We present a novel algorithm to extract phrases from the corpus using the well-known topic model Latent Dirichlet Allocation (LDA), and to integrate them into the vector space model for classification. Experiments show better classifier performance with the new Bag of Phrases model than with related representation models.
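
The limitation of the bag-of-words representation noted in this abstract can be seen in a small sketch. The code below does not reproduce the LDA-based phrase extraction the authors propose; it swaps in plain adjacent word pairs (bigrams) as a stand-in for phrases, only to show that a phrase-level representation keeps word-order information that word counts discard. Both function names are illustrative.

```python
from collections import Counter


def bag_of_words(text):
    """Count individual lowercase tokens, ignoring their order."""
    return Counter(text.lower().split())


def bag_of_bigrams(text):
    """Count adjacent token pairs, a crude stand-in for phrases."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


first = "dog bites man"
second = "man bites dog"

# The word counts are identical, so a bag-of-words model cannot
# tell the two sentences apart.
same_words = bag_of_words(first) == bag_of_words(second)

# The pair counts differ, so the phrase-level view separates them.
same_pairs = bag_of_bigrams(first) == bag_of_bigrams(second)
```

Either kind of count can then be used as a feature vector in a vector space model for classification.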