930 resultados para Probabilistic latent semantic analysis (PLSA)


Relevância:

40.00% 40.00%

Publicador:

Resumo:

Lexicon-based approaches to Twitter sentiment analysis are gaining much popularity due to their simplicity, domain independence, and relatively good performance. These approaches rely on sentiment lexicons, where a collection of words are marked with fixed sentiment polarities. However, words' sentiment orientation (positive, neural, negative) and/or sentiment strengths could change depending on context and targeted entities. In this paper we present SentiCircle; a novel lexicon-based approach that takes into account the contextual and conceptual semantics of words when calculating their sentiment orientation and strength in Twitter. We evaluate our approach on three Twitter datasets using three different sentiment lexicons. Results show that our approach significantly outperforms two lexicon baselines. Results are competitive but inconclusive when comparing to state-of-art SentiStrength, and vary from one dataset to another. SentiCircle outperforms SentiStrength in accuracy on average, but falls marginally behind in F-measure. © 2014 Springer International Publishing.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Cloud computing is a new technological paradigm offering computing infrastructure, software and platforms as a pay-as-you-go, subscription-based service. Many potential customers of cloud services require essential cost assessments to be undertaken before transitioning to the cloud. Current assessment techniques are imprecise as they rely on simplified specifications of resource requirements that fail to account for probabilistic variations in usage. In this paper, we address these problems and propose a new probabilistic pattern modelling (PPM) approach to cloud costing and resource usage verification. Our approach is based on a concise expression of probabilistic resource usage patterns translated to Markov decision processes (MDPs). Key costing and usage queries are identified and expressed in a probabilistic variant of temporal logic and calculated to a high degree of precision using quantitative verification techniques. The PPM cost assessment approach has been implemented as a Java library and validated with a case study and scalability experiments. © 2012 Springer-Verlag Berlin Heidelberg.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Sociolinguists have documented the substrate influence of various languages on the formation of dialects in numerous ethnic-regional setting throughout the United States. This literature shows that while phonological and grammatical influences from other languages may be instantiated as durable dialect features, lexical phenomena often fade over time as ethnolinguistic communities assimilate with contiguous dialect groups. In preliminary investigations of emerging Miami Latino English, we have observed that lexical forms based on Spanish lexical forms are not only ubiquitous among the speech of the first generation Cuban Americans but also of the second. Examples, observed in field work, casual observation, and studied formally in an experimental context include the following: “get down from the car,” which derives from the Spanish equivalent, bajar del carro instead of “get out of the car”. The translation task administered to thirty-one participants showed a variety lexical phenomena are still maintained at equal or higher frequencies.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Until recently the use of biometrics was restricted to high-security environments and criminal identification applications, for economic and technological reasons. However, in recent years, biometric authentication has become part of daily lives of people. The large scale use of biometrics has shown that users within the system may have different degrees of accuracy. Some people may have trouble authenticating, while others may be particularly vulnerable to imitation. Recent studies have investigated and identified these types of users, giving them the names of animals: Sheep, Goats, Lambs, Wolves, Doves, Chameleons, Worms and Phantoms. The aim of this study is to evaluate the existence of these users types in a database of fingerprints and propose a new way of investigating them, based on the performance of verification between subjects samples. Once introduced some basic concepts in biometrics and fingerprint, we present the biometric menagerie and how to evaluate them.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Until recently the use of biometrics was restricted to high-security environments and criminal identification applications, for economic and technological reasons. However, in recent years, biometric authentication has become part of daily lives of people. The large scale use of biometrics has shown that users within the system may have different degrees of accuracy. Some people may have trouble authenticating, while others may be particularly vulnerable to imitation. Recent studies have investigated and identified these types of users, giving them the names of animals: Sheep, Goats, Lambs, Wolves, Doves, Chameleons, Worms and Phantoms. The aim of this study is to evaluate the existence of these users types in a database of fingerprints and propose a new way of investigating them, based on the performance of verification between subjects samples. Once introduced some basic concepts in biometrics and fingerprint, we present the biometric menagerie and how to evaluate them.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

We formally compare fundamental factor and latent factor approaches to oil price modelling. Fundamental modelling has a long history in seeking to understand oil price movements, while latent factor modelling has a more recent and limited history, but has gained popularity in other financial markets. The two approaches, though competing, have not formally been compared as to effectiveness. For a range of short- medium- and long-dated WTI oil futures we test a recently proposed five-factor fundamental model and a Principal Component Analysis latent factor model. Our findings demonstrate that there is no discernible difference between the two techniques in a dynamic setting. We conclude that this infers some advantages in adopting the latent factor approach due to the difficulty in determining a well specified fundamental model.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

ABSTRACT Researchers frequently have to analyze scales in which some participants have failed to respond to some items. In this paper we focus on the exploratory factor analysis of multidimensional scales (i.e., scales that consist of a number of subscales) where each subscale is made up of a number of Likert-type items, and the aim of the analysis is to estimate participants' scores on the corresponding latent traits. We propose a new approach to deal with missing responses in such a situation that is based on (1) multiple imputation of non-responses and (2) simultaneous rotation of the imputed datasets. We applied the approach in a real dataset where missing responses were artificially introduced following a real pattern of non-responses, and a simulation study based on artificial datasets. The results show that our approach (specifically, Hot-Deck multiple imputation followed of Consensus Promin rotation) was able to successfully compute factor score estimates even for participants that have missing data.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

We study a climatologically important interaction of two of the main components of the geophysical system by adding an energy balance model for the averaged atmospheric temperature as dynamic boundary condition to a diagnostic ocean model having an additional spatial dimension. In this work, we give deeper insight than previous papers in the literature, mainly with respect to the 1990 pioneering model by Watts and Morantine. We are taking into consideration the latent heat for the two phase ocean as well as a possible delayed term. Non-uniqueness for the initial boundary value problem, uniqueness under a non-degeneracy condition and the existence of multiple stationary solutions are proved here. These multiplicity results suggest that an S-shaped bifurcation diagram should be expected to occur in this class of models generalizing previous energy balance models. The numerical method applied to the model is based on a finite volume scheme with nonlinear weighted essentially non-oscillatory reconstruction and Runge–Kutta total variation diminishing for time integration.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Maintenance of transport infrastructure assets is widely advocated as the key in minimizing current and future costs of the transportation network. While effective maintenance decisions are often a result of engineering skills and practical knowledge, efficient decisions must also account for the net result over an asset's life-cycle. One essential aspect in the long term perspective of transport infrastructure maintenance is to proactively estimate maintenance needs. In dealing with immediate maintenance actions, support tools that can prioritize potential maintenance candidates are important to obtain an efficient maintenance strategy. This dissertation consists of five individual research papers presenting a microdata analysis approach to transport infrastructure maintenance. Microdata analysis is a multidisciplinary field in which large quantities of data is collected, analyzed, and interpreted to improve decision-making. Increased access to transport infrastructure data enables a deeper understanding of causal effects and a possibility to make predictions of future outcomes. The microdata analysis approach covers the complete process from data collection to actual decisions and is therefore well suited for the task of improving efficiency in transport infrastructure maintenance. Statistical modeling was the selected analysis method in this dissertation and provided solutions to the different problems presented in each of the five papers. In Paper I, a time-to-event model was used to estimate remaining road pavement lifetimes in Sweden. In Paper II, an extension of the model in Paper I assessed the impact of latent variables on road lifetimes; displaying the sections in a road network that are weaker due to e.g. subsoil conditions or undetected heavy traffic. The study in Paper III incorporated a probabilistic parametric distribution as a representation of road lifetimes into an equation for the marginal cost of road wear. Differentiated road wear marginal costs for heavy and light vehicles are an important information basis for decisions regarding vehicle miles traveled (VMT) taxation policies. In Paper IV, a distribution based clustering method was used to distinguish between road segments that are deteriorating and road segments that have a stationary road condition. Within railway networks, temporary speed restrictions are often imposed because of maintenance and must be addressed in order to keep punctuality. The study in Paper V evaluated the empirical effect on running time of speed restrictions on a Norwegian railway line using a generalized linear mixed model.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

We create and study a generative model for Irish traditional music based on Variational Autoencoders and analyze the learned latent space trying to find musically significant correlations in the latent codes' distributions in order to perform musical analysis on data. We train two kinds of models: one trained on a dataset of Irish folk melodies, one trained on bars extrapolated from the melodies dataset, each one in five variations of increasing size. We conduct the following experiments: we inspect the latent space of tunes and bars in relation to key, time signature, and estimated harmonic function of bars; we search for links between tunes in a particular style (i.e. "reels'") and their positioning in latent space relative to other tunes; we compute distances between embedded bars in a tune to gain insight into the model's understanding of the similarity between bars. Finally, we show and evaluate generative examples. We find that the learned latent space does not explicitly encode musical information and is thus unusable for musical analysis of data, while generative results are generally good and not strictly dependent on the musical coherence of the model's internal representation.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Alpha oscillatory activity has long been associated with perceptual and cognitive processes related to attention control. The aim of this study is to explore the task-dependent role of alpha frequency in a lateralized visuo-spatial detection task. Specifically, the thesis focuses on consolidating the scientific literature's knowledge about the role of alpha frequency in perceptual accuracy, and deepening the understanding of what determines trial-by-trial fluctuations of alpha parameters and how these fluctuations influence overall task performance. The hypotheses, confirmed empirically, were that different implicit strategies are put in place based on the task context, in order to maximize performance with optimal resource distribution (namely alpha frequency, associated positively with performance): “Lateralization” of the attentive resources towards one hemifield should be associated with higher alpha frequency difference between contralateral and ipsilateral hemisphere; “Distribution” of the attentive resources across hemifields should be associated with lower alpha frequency difference between hemispheres; These strategies, used by the participants according to their brain capabilities, have proven themselves adaptive or maladaptive depending on the different tasks to which they have been set: "Distribution" of the attentive resources seemed to be the best strategy when the distribution probability between hemifields was balanced: i.e. the neutral condition task. "Lateralization" of the attentive resources seemed to be more effective when the distribution probability between hemifields was biased towards one hemifield: i.e., the biased condition task.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

ABSTRACT Microphysical and thermodynamical features of two tropical systems, namely Hurricane Ivan and Typhoon Conson, and one sub-tropical, Catarina, have been analyzed based on space-born radar PR measurements available on the TRMM satellite. The procedure to classify the reflectivity profiles followed the Heymsfield et al (2000) and Steiner et al (1995) methodologies. The water and ice content have been calculated using a relationship obtained with data of the surface SPOL radar and PR in Rondonia State in Brazil. The diabatic heating rate due to latent heat release has been estimated using the methodology developed by Tao et al (1990). A more detailed analysis has been performed for Hurricane Catarina, the first of its kind in South Atlantic. High water content mean value has been found in Conson and Ivan at low levels and close to their centers. Results indicate that hurricane Catarina was shallower than the other two systems, with less water and the water was concentrated closer to its center. The mean ice content in Catarina was about 0.05 g kg-1 while in Conson it was 0.06 g kg-1 and in Ivan 0.08 g kg-1. Conson and Ivan had water content up to 0.3 g kg-1 above the 0ºC layer, while Catarina had less than 0.15 g kg-1. The latent heat released by Catarina showed to be very similar to the other two systems, except in the regions closer to the center.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Stavskaya's model is a one-dimensional probabilistic cellular automaton (PCA) introduced in the end of the 1960s as an example of a model displaying a nonequilibrium phase transition. Although its absorbing state phase transition is well understood nowadays, the model never received a full numerical treatment to investigate its critical behavior. In this Brief Report we characterize the critical behavior of Stavskaya's PCA by means of Monte Carlo simulations and finite-size scaling analysis. The critical exponents of the model are calculated and indicate that its phase transition belongs to the directed percolation universality class of critical behavior, as would be expected on the basis of the directed percolation conjecture. We also explicitly establish the relationship of the model with the Domany-Kinzel PCA on its directed site percolation line, a connection that seems to have gone unnoticed in the literature so far.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Thanks to recent advances in molecular biology, allied to an ever increasing amount of experimental data, the functional state of thousands of genes can now be extracted simultaneously by using methods such as cDNA microarrays and RNA-Seq. Particularly important related investigations are the modeling and identification of gene regulatory networks from expression data sets. Such a knowledge is fundamental for many applications, such as disease treatment, therapeutic intervention strategies and drugs design, as well as for planning high-throughput new experiments. Methods have been developed for gene networks modeling and identification from expression profiles. However, an important open problem regards how to validate such approaches and its results. This work presents an objective approach for validation of gene network modeling and identification which comprises the following three main aspects: (1) Artificial Gene Networks (AGNs) model generation through theoretical models of complex networks, which is used to simulate temporal expression data; (2) a computational method for gene network identification from the simulated data, which is founded on a feature selection approach where a target gene is fixed and the expression profile is observed for all other genes in order to identify a relevant subset of predictors; and (3) validation of the identified AGN-based network through comparison with the original network. The proposed framework allows several types of AGNs to be generated and used in order to simulate temporal expression data. The results of the network identification method can then be compared to the original network in order to estimate its properties and accuracy. Some of the most important theoretical models of complex networks have been assessed: the uniformly-random Erdos-Renyi (ER), the small-world Watts-Strogatz (WS), the scale-free Barabasi-Albert (BA), and geographical networks (GG). The experimental results indicate that the inference method was sensitive to average degree k variation, decreasing its network recovery rate with the increase of k. The signal size was important for the inference method to get better accuracy in the network identification rate, presenting very good results with small expression profiles. However, the adopted inference method was not sensible to recognize distinct structures of interaction among genes, presenting a similar behavior when applied to different network topologies. In summary, the proposed framework, though simple, was adequate for the validation of the inferred networks by identifying some properties of the evaluated method, which can be extended to other inference methods.