25 resultados para Discriminative model training
em Aston University Research Archive
Resumo:
Since wind at the earth's surface has an intrinsically complex and stochastic nature, accurate wind power forecasts are necessary for the safe and economic use of wind energy. In this paper, we investigated a combination of numeric and probabilistic models: a Gaussian process (GP) combined with a numerical weather prediction (NWP) model was applied to wind-power forecasting up to one day ahead. First, the wind-speed data from NWP was corrected by a GP, then, as there is always a defined limit on power generated in a wind turbine due to the turbine controlling strategy, wind power forecasts were realized by modeling the relationship between the corrected wind speed and power output using a censored GP. To validate the proposed approach, three real-world datasets were used for model training and testing. The empirical results were compared with several classical wind forecast models, and based on the mean absolute error (MAE), the proposed model provides around 9% to 14% improvement in forecasting accuracy compared to an artificial neural network (ANN) model, and nearly 17% improvement on a third dataset which is from a newly-built wind farm for which there is a limited amount of training data. © 2013 IEEE.
Resumo:
In this paper, we discuss how discriminative training can be applied to the hidden vector state (HVS) model in different task domains. The HVS model is a discrete hidden Markov model (HMM) in which each HMM state represents the state of a push-down automaton with a finite stack size. In previous applications, maximum-likelihood estimation (MLE) is used to derive the parameters of the HVS model. However, MLE makes a number of assumptions and unfortunately some of these assumptions do not hold. Discriminative training, without making such assumptions, can improve the performance of the HVS model by discriminating the correct hypothesis from the competing hypotheses. Experiments have been conducted in two domains: the travel domain for the semantic parsing task using the DARPA Communicator data and the Air Travel Information Services (ATIS) data and the bioinformatics domain for the information extraction task using the GENIA corpus. The results demonstrate modest improvements of the performance of the HVS model using discriminative training. In the travel domain, discriminative training of the HVS model gives a relative error reduction rate of 31 percent in F-measure when compared with MLE on the DARPA Communicator data and 9 percent on the ATIS data. In the bioinformatics domain, a relative error reduction rate of 4 percent in F-measure is achieved on the GENIA corpus.
Resumo:
Attractor properties of a popular discrete-time neural network model are illustrated through numerical simulations. The most complex dynamics is found to occur within particular ranges of parameters controlling the symmetry and magnitude of the weight matrix. A small network model is observed to produce fixed points, limit cycles, mode-locking, the Ruelle-Takens route to chaos, and the period-doubling route to chaos. Training algorithms for tuning this dynamical behaviour are discussed. Training can be an easy or difficult task, depending whether the problem requires the use of temporal information distributed over long time intervals. Such problems require training algorithms which can handle hidden nodes. The most prominent of these algorithms, back propagation through time, solves the temporal credit assignment problem in a way which can work only if the relevant information is distributed locally in time. The Moving Targets algorithm works for the more general case, but is computationally intensive, and prone to local minima.
Resumo:
Radial Basis Function networks with linear outputs are often used in regression problems because they can be substantially faster to train than Multi-layer Perceptrons. For classification problems, the use of linear outputs is less appropriate as the outputs are not guaranteed to represent probabilities. We show how RBFs with logistic and softmax outputs can be trained efficiently using the Fisher scoring algorithm. This approach can be used with any model which consists of a generalised linear output function applied to a model which is linear in its parameters. We compare this approach with standard non-linear optimisation algorithms on a number of datasets.
Resumo:
We are concerned with the problem of image segmentation in which each pixel is assigned to one of a predefined finite number of classes. In Bayesian image analysis, this requires fusing together local predictions for the class labels with a prior model of segmentations. Markov Random Fields (MRFs) have been used to incorporate some of this prior knowledge, but this not entirely satisfactory as inference in MRFs is NP-hard. The multiscale quadtree model of Bouman and Shapiro (1994) is an attractive alternative, as this is a tree-structured belief network in which inference can be carried out in linear time (Pearl 1988). It is an hierarchical model where the bottom-level nodes are pixels, and higher levels correspond to downsampled versions of the image. The conditional-probability tables (CPTs) in the belief network encode the knowledge of how the levels interact. In this paper we discuss two methods of learning the CPTs given training data, using (a) maximum likelihood and the EM algorithm and (b) emphconditional maximum likelihood (CML). Segmentations obtained using networks trained by CML show a statistically-significant improvement in performance on synthetic images. We also demonstrate the methods on a real-world outdoor-scene segmentation task.
Resumo:
This field work study furthers understanding about expatriate management, in particular, the nature of cross-cultural management in Hong Kong involving Anglo-American expatriate and Chinese host national managers, the important features of adjustment for expatriates living and working there, and the type of training which will assist them to adjust and to work successfully in this Asian environment. Qualitative and quantitative data on each issue was gathered during in-depth interviews in Hong Kong, using structured interview schedules, with 39 expatriate and 31 host national managers drawn from a cross-section of functional areas and organizations. Despite the adoption of Western technology and the influence of Western business practices, micro-level management in Hong Kong retains a cultural specificity which is consistent with the norms and values of Chinese culture. There are differences in how expatriates and host nationals define their social roles, and Hong Kong's recent colonial history appears to influence cross-cultural interpersonal interactions. The inability of the spouse and/or family to adapt to Hong Kong is identified as a major reason for expatriate assignments to fail, though the causes have less to do with living away from family and friends, than with Hong Kong's highly urbanized environment and the heavy demands of work. Culture shock is not identified as a major problem, but in Hong Kong micro-level social factors require greater adjustment than macro-level societal factors. The adjustment of expatriate managers is facilitated by a strong orientation towards career development and hard work, possession of technical/professional expertise, and a willingness to engage in a process of continuous 'active learning' with respect to the host national society and culture. A four-part model of manager training suitable for Hong Kong is derived from the study data. It consists of a pre-departure briefing, post-arrival cross-cultural training, language training in basic Cantonese and in how to communicate more effectively in English with non-native speakers, and the assignment of a mentor to newly arrived expatriate managers.
Resumo:
Despite its increasing popularity, much intercultural training is not developed with the same level of rigour as training in other areas. Further, research on intercultural training has brought inconsistent results about the effectiveness of such training. This PhD thesis develops a rigorous model of intercultural training and applies it to the preparation of British students going on work/study placements in France and Germany. It investigates the reasons for inconsistent training success by looking at the cognitive learning processes in intercultural training, relating them to training goals, and by examining the short- and long-term transfer of intercultural training into real-life encounters with people from other cultures. Two cognitive trainings based on critical incidents were designed for online delivery. The training content relied on cultural practice dimensions from the GWBE study (House, Hanges, Javidan, Dorfman & Gupta, 2004). Of the two trainings, the 'singlemode training' aimed to develop declarative knowledge, which is necessary to analyse and understand other cultures. The 'concurrent training' aimed to develop declarative and procedural knowledge, which is needed to develop skills for dealing with difficult situations in a culturally appropriate way. Participants (N-48) were randomly assigned to one of the two training conditions. Declarative learning appeared as a process of steady knowledge increase, while procedural learning involved cognitive re-categorisation rather than knowledge increase. In a negotiation role play with host-country nationals directly after the online training, participants of the concurrent training exhibited a more initiative negotiation style than participants of the single-mode training. Comparing cultural adjustment and performance of training participants during their time abroad with an untrained control group, participants of the concurrent training showed the qualitatively best development in adjustment and performance. Besides intercultural training, multicultural personality traits were assessed and proved to be a powerful predictor of adjustment and, indirectly, of performance abroad.
Resumo:
It has been widely recognised that an in-depth textual analysis of a source text is relevant for translation. This book discusses the role of discourse analysis for translation and translator training. One particular model of discourse analysis is presented in detail, and its application in the context of translator training is critically examined.
Resumo:
Sentiment analysis concerns about automatically identifying sentiment or opinion expressed in a given piece of text. Most prior work either use prior lexical knowledge defined as sentiment polarity of words or view the task as a text classification problem and rely on labeled corpora to train a sentiment classifier. While lexicon-based approaches do not adapt well to different domains, corpus-based approaches require expensive manual annotation effort. In this paper, we propose a novel framework where an initial classifier is learned by incorporating prior information extracted from an existing sentiment lexicon with preferences on expectations of sentiment labels of those lexicon words being expressed using generalized expectation criteria. Documents classified with high confidence are then used as pseudo-labeled examples for automatical domain-specific feature acquisition. The word-class distributions of such self-learned features are estimated from the pseudo-labeled examples and are used to train another classifier by constraining the model's predictions on unlabeled instances. Experiments on both the movie-review data and the multi-domain sentiment dataset show that our approach attains comparable or better performance than existing weakly-supervised sentiment classification methods despite using no labeled documents.
Resumo:
Objective: Biomedical events extraction concerns about events describing changes on the state of bio-molecules from literature. Comparing to the protein-protein interactions (PPIs) extraction task which often only involves the extraction of binary relations between two proteins, biomedical events extraction is much harder since it needs to deal with complex events consisting of embedded or hierarchical relations among proteins, events, and their textual triggers. In this paper, we propose an information extraction system based on the hidden vector state (HVS) model, called HVS-BioEvent, for biomedical events extraction, and investigate its capability in extracting complex events. Methods and material: HVS has been previously employed for extracting PPIs. In HVS-BioEvent, we propose an automated way to generate abstract annotations for HVS training and further propose novel machine learning approaches for event trigger words identification, and for biomedical events extraction from the HVS parse results. Results: Our proposed system achieves an F-score of 49.57% on the corpus used in the BioNLP'09 shared task, which is only 2.38% lower than the best performing system by UTurku in the BioNLP'09 shared task. Nevertheless, HVS-BioEvent outperforms UTurku's system on complex events extraction with 36.57% vs. 30.52% being achieved for extracting regulation events, and 40.61% vs. 38.99% for negative regulation events. Conclusions: The results suggest that the HVS model with the hierarchical hidden state structure is indeed more suitable for complex event extraction since it could naturally model embedded structural context in sentences.
Resumo:
We propose a hybrid generative/discriminative framework for semantic parsing which combines the hidden vector state (HVS) model and the hidden Markov support vector machines (HM-SVMs). The HVS model is an extension of the basic discrete Markov model in which context is encoded as a stack-oriented state vector. The HM-SVMs combine the advantages of the hidden Markov models and the support vector machines. By employing a modified K-means clustering method, a small set of most representative sentences can be automatically selected from an un-annotated corpus. These sentences together with their abstract annotations are used to train an HVS model which could be subsequently applied on the whole corpus to generate semantic parsing results. The most confident semantic parsing results are selected to generate a fully-annotated corpus which is used to train the HM-SVMs. The proposed framework has been tested on the DARPA Communicator Data. Experimental results show that an improvement over the baseline HVS parser has been observed using the hybrid framework. When compared with the HM-SVMs trained from the fully-annotated corpus, the hybrid framework gave a comparable performance with only a small set of lightly annotated sentences. © 2008. Licensed under the Creative Commons.
Resumo:
Natural language understanding (NLU) aims to map sentences to their semantic mean representations. Statistical approaches to NLU normally require fully-annotated training data where each sentence is paired with its word-level semantic annotations. In this paper, we propose a novel learning framework which trains the Hidden Markov Support Vector Machines (HM-SVMs) without the use of expensive fully-annotated data. In particular, our learning approach takes as input a training set of sentences labeled with abstract semantic annotations encoding underlying embedded structural relations and automatically induces derivation rules that map sentences to their semantic meaning representations. The proposed approach has been tested on the DARPA Communicator Data and achieved 93.18% in F-measure, which outperforms the previously proposed approaches of training the hidden vector state model or conditional random fields from unaligned data, with a relative error reduction rate of 43.3% and 10.6% being achieved.
Resumo:
Sentiment analysis or opinion mining aims to use automated tools to detect subjective information such as opinions, attitudes, and feelings expressed in text. This paper proposes a novel probabilistic modeling framework based on Latent Dirichlet Allocation (LDA), called joint sentiment/topic model (JST), which detects sentiment and topic simultaneously from text. Unlike other machine learning approaches to sentiment classification which often require labeled corpora for classifier training, the proposed JST model is fully unsupervised. The model has been evaluated on the movie review dataset to classify the review sentiment polarity and minimum prior information have also been explored to further improve the sentiment classification accuracy. Preliminary experiments have shown promising results achieved by JST.
Resumo:
The appraisal and relative performance evaluation of nurses are very important and beneficial for both nurses and employers in an era of clinical governance, increased accountability and high standards of health care services. They enhance and consolidate the knowledge and practical skills of nurses by identification of training and career development plans as well as improvement in health care quality services, increase in job satisfaction and use of cost-effective resources. In this paper, a data envelopment analysis (DEA) model is proposed for the appraisal and relative performance evaluation of nurses. The model is validated on thirty-two nurses working at an Intensive Care Unit (ICU) at one of the most recognized hospitals in Lebanon. The DEA was able to classify nurses into efficient and inefficient ones. The set of efficient nurses was used to establish an internal best practice benchmark to project career development plans for improving the performance of other inefficient nurses. The DEA result confirmed the ranking of some nurses and highlighted injustice in other cases that were produced by the currently practiced appraisal system. Further, the DEA model is shown to be an effective talent management and motivational tool as it can provide clear managerial plans related to promoting, training and development activities from the perspective of nurses, hence increasing their satisfaction, motivation and acceptance of appraisal results. Due to such features, the model is currently being considered for implementation at ICU. Finally, the ratio of the number DEA units to the number of input/output measures is revisited with new suggested values on its upper and lower limits depending on the type of DEA models and the desired number of efficient units from a managerial perspective.
Resumo:
Social streams have proven to be the mostup-to-date and inclusive information on cur-rent events. In this paper we propose a novelprobabilistic modelling framework, called violence detection model (VDM), which enables the identification of text containing violent content and extraction of violence-related topics over social media data. The proposed VDM model does not require any labeled corpora for training, instead, it only needs the in-corporation of word prior knowledge which captures whether a word indicates violence or not. We propose a novel approach of deriving word prior knowledge using the relative entropy measurement of words based on the in-tuition that low entropy words are indicative of semantically coherent topics and therefore more informative, while high entropy words indicates words whose usage is more topical diverse and therefore less informative. Our proposed VDM model has been evaluated on the TREC Microblog 2011 dataset to identify topics related to violence. Experimental results show that deriving word priors using our proposed relative entropy method is more effective than the widely-used information gain method. Moreover, VDM gives higher violence classification results and produces more coherent violence-related topics compared toa few competitive baselines.