930 resultados para Probabilistic latent semantic analysis (PLSA)


Relevância:

30.00% 30.00%

Publicador:

Resumo:

The aim of our work is to present solutions and a methodical support for automated techniques and procedures in domain engineering, in particular for variability modeling. Our approach is based upon Semantic Modeling concepts, for which semantic description, representation patterns and inference mechanisms are defined. Thus, model-driven techniques enriched with semantics will allow flexibility and variability in representation means, reasoning power and the required analysis depth for the identification, interpretation and adaptation of artifact properties and qualities.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The paper presents an approach to extraction of facts from texts of documents. This approach is based on using knowledge about the subject domain, specialized dictionary and the schemes of facts that describe fact structures taking into consideration both semantic and syntactic compatibility of elements of facts. Actually extracted facts combine into one structure the dictionary lexical objects found in the text and match them against concepts of subject domain ontology.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Abstract A new LIBS quantitative analysis method based on analytical line adaptive selection and Relevance Vector Machine (RVM) regression model is proposed. First, a scheme of adaptively selecting analytical line is put forward in order to overcome the drawback of high dependency on a priori knowledge. The candidate analytical lines are automatically selected based on the built-in characteristics of spectral lines, such as spectral intensity, wavelength and width at half height. The analytical lines which will be used as input variables of regression model are determined adaptively according to the samples for both training and testing. Second, an LIBS quantitative analysis method based on RVM is presented. The intensities of analytical lines and the elemental concentrations of certified standard samples are used to train the RVM regression model. The predicted elemental concentration analysis results will be given with a form of confidence interval of probabilistic distribution, which is helpful for evaluating the uncertainness contained in the measured spectra. Chromium concentration analysis experiments of 23 certified standard high-alloy steel samples have been carried out. The multiple correlation coefficient of the prediction was up to 98.85%, and the average relative error of the prediction was 4.01%. The experiment results showed that the proposed LIBS quantitative analysis method achieved better prediction accuracy and better modeling robustness compared with the methods based on partial least squares regression, artificial neural network and standard support vector machine.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The approaches to the analysis of various information resources pertinent to user requirements at a semantic level are determined by the thesauruses of the appropriate subject domains. The algorithms of formation and normalization of the multilinguistic thesaurus, and also methods of their comparison are given.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

* This work was financially supported by RFBF-04-01-00858.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A new distance function to compare arbitrary partitions is proposed. Clustering of image collections and image segmentation give objects to be matched. Offered metric intends for combination of visual features and metadata analysis to solve a semantic gap between low-level visual features and high-level human concept.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

One of the ultimate aims of Natural Language Processing is to automate the analysis of the meaning of text. A fundamental step in that direction consists in enabling effective ways to automatically link textual references to their referents, that is, real world objects. The work presented in this paper addresses the problem of attributing a sense to proper names in a given text, i.e., automatically associating words representing Named Entities with their referents. The method for Named Entity Disambiguation proposed here is based on the concept of semantic relatedness, which in this work is obtained via a graph-based model over Wikipedia. We show that, without building the traditional bag of words representation of the text, but instead only considering named entities within the text, the proposed method achieves results competitive with the state-of-the-art on two different datasets.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We develop, implement and study a new Bayesian spatial mixture model (BSMM). The proposed BSMM allows for spatial structure in the binary activation indicators through a latent thresholded Gaussian Markov random field. We develop a Gibbs (MCMC) sampler to perform posterior inference on the model parameters, which then allows us to assess the posterior probabilities of activation for each voxel. One purpose of this article is to compare the HJ model and the BSMM in terms of receiver operating characteristics (ROC) curves. Also we consider the accuracy of the spatial mixture model and the BSMM for estimation of the size of the activation region in terms of bias, variance and mean squared error. We perform a simulation study to examine the aforementioned characteristics under a variety of configurations of spatial mixture model and BSMM both as the size of the region changes and as the magnitude of activation changes.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Sentiment lexicons for sentiment analysis offer a simple, yet effective way to obtain the prior sentiment information of opinionated words in texts. However, words’ sentiment orientations and strengths often change throughout various contexts in which the words appear. In this paper, we propose a lexicon adaptation approach that uses the contextual semantics of words to capture their contexts in tweet messages and update their prior sentiment orientations and/or strengths accordingly. We evaluate our approach on one state-of-the-art sentiment lexicon using three different Twitter datasets. Results show that the sentiment lexicons adapted by our approach outperform the original lexicon in accuracy and F-measure in two datasets, but give similar accuracy and slightly lower F-measure in one dataset.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents the main concepts of a project under development concerning the analysis process of a scene containing a large number of objects, represented as unstructured point clouds. To achieve what we called the "optimal scene interpretation" (the shortest scene description satisfying the MDL principle) we follow an approach for managing 3-D objects based on a semantic framework based on ontologies for adding and sharing conceptual knowledge about spatial objects.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents a research of linguistic structure of Bulgarian bells knowledge. The idea of building semantic structure of Bulgarian bells appeared during the “Multimedia fund - BellKnow” project. In this project was collected a lots of data about bells, their structure, history, technical data, etc. This is the first attempt for computation linguistic explain of bell knowledge and deliver a semantic representation of that knowledge. Based on this research some linguistic components, aiming to realize different types of analysis of text objects are implemented in term dictionaries. Thus, we lay the foundation of the linguistic analysis services in these digital dictionaries aiding the research of kinds, number and frequency of the lexical units that constitute various bell objects.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Report published in the Proceedings of the National Conference on "Education and Research in the Information Society", Plovdiv, May, 2014

Relevância:

30.00% 30.00%

Publicador:

Resumo:

2000 Mathematics Subject Classification: Primary 60J45, 60J50, 35Cxx; Secondary 31Cxx.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The Center for Epidemiologic Studies-Depression Scale (CES-D) is the most frequently used scale for measuring depressive symptomatology in caregiving research. The aim of this study is to test its construct structure and measurement equivalence between caregivers from two Spanish-speaking countries. Face-to-face interviews were carried out with 595 female dementia caregivers from Madrid, Spain, and from Coahuila, Mexico. The structure of the CES-D was analyzed using exploratory and confirmatory factor analysis (EFA and CFA, respectively). Measurement invariance across samples was analyzed comparing a baseline model with a more restrictive model. Significant differences between means were found for 7 items. The results of the EFA clearly supported a four-factor solution. The CFA for the whole sample with the four factors revealed high and statistically significant loading coefficients for all items (except item number 4). When equality constraints were imposed to test for the invariance between countries, the change in chi-square was significant, indicating that complete invariance could not be assumed. Significant between-countries differences were found for three of the four latent factor mean scores. Although the results provide general support for the original four-factor structure, caution should be exercised on reporting comparisons of depression scores between Spanish-speaking countries.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The focus of this thesis is the extension of topographic visualisation mappings to allow for the incorporation of uncertainty. Few visualisation algorithms in the literature are capable of mapping uncertain data with fewer able to represent observation uncertainties in visualisations. As such, modifications are made to NeuroScale, Locally Linear Embedding, Isomap and Laplacian Eigenmaps to incorporate uncertainty in the observation and visualisation spaces. The proposed mappings are then called Normally-distributed NeuroScale (N-NS), T-distributed NeuroScale (T-NS), Probabilistic LLE (PLLE), Probabilistic Isomap (PIso) and Probabilistic Weighted Neighbourhood Mapping (PWNM). These algorithms generate a probabilistic visualisation space with each latent visualised point transformed to a multivariate Gaussian or T-distribution, using a feed-forward RBF network. Two types of uncertainty are then characterised dependent on the data and mapping procedure. Data dependent uncertainty is the inherent observation uncertainty. Whereas, mapping uncertainty is defined by the Fisher Information of a visualised distribution. This indicates how well the data has been interpolated, offering a level of ‘surprise’ for each observation. These new probabilistic mappings are tested on three datasets of vectorial observations and three datasets of real world time series observations for anomaly detection. In order to visualise the time series data, a method for analysing observed signals and noise distributions, Residual Modelling, is introduced. The performance of the new algorithms on the tested datasets is compared qualitatively with the latent space generated by the Gaussian Process Latent Variable Model (GPLVM). A quantitative comparison using existing evaluation measures from the literature allows performance of each mapping function to be compared. Finally, the mapping uncertainty measure is combined with NeuroScale to build a deep learning classifier, the Cascading RBF. This new structure is tested on the MNist dataset achieving world record performance whilst avoiding the flaws seen in other Deep Learning Machines.