903 resultados para Probabilistic latent semantic analysis (PLSA)


Relevância:

40.00% 40.00%

Publicador:

Resumo:

Colombia is one of the largest per capita mercury polluters in the world as a consequence of its artisanal gold mining activities. The severity of this problem in terms of potential health effects was evaluated by means of a probabilistic risk assessment carried out in the twelve departments (or provinces) in Colombia with the largest gold production. The two exposure pathways included in the risk assessment were inhalation of elemental Hg vapors and ingestion of fish contaminated with methyl mercury. Exposure parameters for the adult population (especially rates of fish consumption) were obtained from nation-wide surveys and concentrations of Hg in air and of methyl-mercury in fish were gathered from previous scientific studies. Fish consumption varied between departments and ranged from 0 to 0.3 kg d?1. Average concentrations of total mercury in fish (70 data) ranged from 0.026 to 3.3 lg g?1. A total of 550 individual measurements of Hg in workshop air (ranging from menor queDL to 1 mg m?3) and 261 measurements of Hg in outdoor air (ranging from menor queDL to 0.652 mg m?3) were used to generate the probability distributions used as concentration terms in the calculation of risk. All but two of the distributions of Hazard Quotients (HQ) associated with ingestion of Hg-contaminated fish for the twelve regions evaluated presented median values higher than the threshold value of 1 and the 95th percentiles ranged from 4 to 90. In the case of exposure to Hg vapors, minimum values of HQ for the general population exceeded 1 in all the towns included in this study, and the HQs for miner-smelters burning the amalgam is two orders of magnitude higher, reaching values of 200 for the 95th percentile. Even acknowledging the conservative assumptions included in the risk assessment and the uncertainties associated with it, its results clearly reveal the exorbitant levels of risk endured not only by miner-smelters but also by the general population of artisanal gold mining communities in Colombia.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Data from an attitudinal survey and stated preference ranking experiment conducted in two urban European interchanges (i.e. City-HUBs) in Madrid (Spain) and Thessaloniki (Greece) show that the importance that City-HUBs users attach to the intermodal infrastructure varies strongly as a function of their perceptions of time spent in the interchange (i.e.intermodal transfer and waiting time). A principal components analysis allocates respondents (i.e. city-HUB users) to two classes with substantially different perceptions of time saving when they make a transfer and of time using during their waiting time.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Semantic interoperability is essential to facilitate efficient collaboration in heterogeneous multi-site healthcare environments. The deployment of a semantic interoperability solution has the potential to enable a wide range of informatics supported applications in clinical care and research both within as ingle healthcare organization and in a network of organizations. At the same time, building and deploying a semantic interoperability solution may require significant effort to carryout data transformation and to harmonize the semantics of the information in the different systems. Our approach to semantic interoperability leverages existing healthcare standards and ontologies, focusing first on specific clinical domains and key applications, and gradually expanding the solution when needed. An important objective of this work is to create a semantic link between clinical research and care environments to enable applications such as streamlining the execution of multi-centric clinical trials, including the identification of eligible patients for the trials. This paper presents an analysis of the suitability of several widely-used medical ontologies in the clinical domain: SNOMED-CT, LOINC, MedDRA, to capture the semantics of the clinical trial eligibility criteria, of the clinical trial data (e.g., Clinical Report Forms), and of the corresponding patient record data that would enable the automatic identification of eligible patients. Next to the coverage provided by the ontologies we evaluate and compare the sizes of the sets of relevant concepts and their relative frequency to estimate the cost of data transformation, of building the necessary semantic mappings, and of extending the solution to new domains. This analysis shows that our approach is both feasible and scalable.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Deregulations and market practices in power industry have brought great challenges to the system planning area. In particular, they introduce a variety of uncertainties to system planning. New techniques are required to cope with such uncertainties. As a promising approach, probabilistic methods are attracting more and more attentions by system planners. In small signal stability analysis, generation control parameters play an important role in determining the stability margin. The objective of this paper is to investigate power system state matrix sensitivity characteristics with respect to system parameter uncertainties with analytical and numerical approaches and to identify those parameters have great impact on system eigenvalues, therefore, the system stability properties. Those identified parameter variations need to be investigated with priority. The results can be used to help Regional Transmission Organizations (RTOs) and Independent System Operators (ISOs) perform planning studies under the open access environment.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper presents a method to analyze the first order eigenvalue sensitivity with respect to the operating parameters of a power system. The method is based on explicitly expressing the system state matrix into sub-matrices. The eigenvalue sensitivity is calculated based on the explicitly formed system state matrix. The 4th order generator model and 4th order exciter system model are used to form the system state matrix. A case study using New England 10-machine 39-bus system is provided to demonstrate the effectiveness of the proposed method. This method can be applied into large scale power system eigenvalue sensitivity with respect to operating parameters.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Principal component analysis (PCA) is one of the most popular techniques for processing, compressing and visualising data, although its effectiveness is limited by its global linearity. While nonlinear variants of PCA have been proposed, an alternative paradigm is to capture data complexity by a combination of local linear PCA projections. However, conventional PCA does not correspond to a probability density, and so there is no unique way to combine PCA models. Previous attempts to formulate mixture models for PCA have therefore to some extent been ad hoc. In this paper, PCA is formulated within a maximum-likelihood framework, based on a specific form of Gaussian latent variable model. This leads to a well-defined mixture model for probabilistic principal component analysers, whose parameters can be determined using an EM algorithm. We discuss the advantages of this model in the context of clustering, density modelling and local dimensionality reduction, and we demonstrate its application to image compression and handwritten digit recognition.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

DUE TO COPYRIGHT RESTRICTIONS ONLY AVAILABLE FOR CONSULTATION AT ASTON UNIVERSITY LIBRARY WITH PRIOR ARRANGEMENT

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Sentiment analysis over Twitter offer organisations a fast and effective way to monitor the publics' feelings towards their brand, business, directors, etc. A wide range of features and methods for training sentiment classifiers for Twitter datasets have been researched in recent years with varying results. In this paper, we introduce a novel approach of adding semantics as additional features into the training set for sentiment analysis. For each extracted entity (e.g. iPhone) from tweets, we add its semantic concept (e.g. Apple product) as an additional feature, and measure the correlation of the representative concept with negative/positive sentiment. We apply this approach to predict sentiment for three different Twitter datasets. Our results show an average increase of F harmonic accuracy score for identifying both negative and positive sentiment of around 6.5% and 4.8% over the baselines of unigrams and part-of-speech features respectively. We also compare against an approach based on sentiment-bearing topic analysis, and find that semantic features produce better Recall and F score when classifying negative sentiment, and better Precision with lower Recall and F score in positive sentiment classification.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Sentiment analysis or opinion mining aims to use automated tools to detect subjective information such as opinions, attitudes, and feelings expressed in text. This paper proposes a novel probabilistic modeling framework based on Latent Dirichlet Allocation (LDA), called joint sentiment/topic model (JST), which detects sentiment and topic simultaneously from text. Unlike other machine learning approaches to sentiment classification which often require labeled corpora for classifier training, the proposed JST model is fully unsupervised. The model has been evaluated on the movie review dataset to classify the review sentiment polarity and minimum prior information have also been explored to further improve the sentiment classification accuracy. Preliminary experiments have shown promising results achieved by JST.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Lexicon-based approaches to Twitter sentiment analysis are gaining much popularity due to their simplicity, domain independence, and relatively good performance. These approaches rely on sentiment lexicons, where a collection of words are marked with fixed sentiment polarities. However, words' sentiment orientation (positive, neural, negative) and/or sentiment strengths could change depending on context and targeted entities. In this paper we present SentiCircle; a novel lexicon-based approach that takes into account the contextual and conceptual semantics of words when calculating their sentiment orientation and strength in Twitter. We evaluate our approach on three Twitter datasets using three different sentiment lexicons. Results show that our approach significantly outperforms two lexicon baselines. Results are competitive but inconclusive when comparing to state-of-art SentiStrength, and vary from one dataset to another. SentiCircle outperforms SentiStrength in accuracy on average, but falls marginally behind in F-measure. © 2014 Springer International Publishing.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Cloud computing is a new technological paradigm offering computing infrastructure, software and platforms as a pay-as-you-go, subscription-based service. Many potential customers of cloud services require essential cost assessments to be undertaken before transitioning to the cloud. Current assessment techniques are imprecise as they rely on simplified specifications of resource requirements that fail to account for probabilistic variations in usage. In this paper, we address these problems and propose a new probabilistic pattern modelling (PPM) approach to cloud costing and resource usage verification. Our approach is based on a concise expression of probabilistic resource usage patterns translated to Markov decision processes (MDPs). Key costing and usage queries are identified and expressed in a probabilistic variant of temporal logic and calculated to a high degree of precision using quantitative verification techniques. The PPM cost assessment approach has been implemented as a Java library and validated with a case study and scalability experiments. © 2012 Springer-Verlag Berlin Heidelberg.