22 results for Training time
in Aston University Research Archive
Abstract:
Training Mixture Density Network (MDN) configurations within the NETLAB framework takes time due to the nature of the computation of the error function and the gradient of the error function. By optimising the computation of these functions, so that gradient information is computed in parameter space, training time is decreased by at least a factor of sixty for the example given. Decreased training time increases the spectrum of problems to which MDNs can be practically applied, making the MDN framework an attractive method for the applied problem solver.
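For orientation, the error function an MDN minimises is the negative log-likelihood of the targets under a mixture model whose parameters are network outputs. The following is a minimal sketch in plain Python, assuming scalar targets and Gaussian kernels. The function names are hypothetical, and this is neither the NETLAB code nor the optimised computation the abstract describes.

```python
import math


def mdn_negative_log_likelihood(mixing_coeffs, centres, variances, target):
    """Negative log-likelihood of one scalar target under a Gaussian mixture.

    mixing_coeffs, centres and variances are equal-length lists holding the
    per-kernel parameters an MDN would output for a single input pattern
    (assumed here; the abstract does not fix the kernel form).
    """
    density = 0.0
    for alpha, mu, var in zip(mixing_coeffs, centres, variances):
        # Normalising constant of a 1-D Gaussian with variance `var`.
        norm = 1.0 / math.sqrt(2.0 * math.pi * var)
        density += alpha * norm * math.exp(-((target - mu) ** 2) / (2.0 * var))
    return -math.log(density)


def mdn_error(patterns):
    """Sum the per-pattern negative log-likelihoods over a data set.

    Each element of `patterns` is a tuple
    (mixing_coeffs, centres, variances, target).
    """
    return sum(mdn_negative_log_likelihood(*p) for p in patterns)
```

The error sums over every pattern and, within each pattern, over every kernel, and its gradient has to be evaluated repeatedly during training, which suggests why the cost of these two functions dominates training time.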
Abstract:
We complement recent advances in thermodynamic limit analyses of mean on-line gradient descent learning dynamics in multi-layer networks by calculating fluctuations possessed by finite dimensional systems. Fluctuations from the mean dynamics are largest at the onset of specialisation as student hidden unit weight vectors begin to imitate specific teacher vectors, increasing with the degree of symmetry of the initial conditions. In light of this, we include a term to stimulate asymmetry in the learning process, which typically also leads to a significant decrease in training time.
Abstract:
Attractor properties of a popular discrete-time neural network model are illustrated through numerical simulations. The most complex dynamics is found to occur within particular ranges of parameters controlling the symmetry and magnitude of the weight matrix. A small network model is observed to produce fixed points, limit cycles, mode-locking, the Ruelle-Takens route to chaos, and the period-doubling route to chaos. Training algorithms for tuning this dynamical behaviour are discussed. Training can be an easy or difficult task, depending on whether the problem requires the use of temporal information distributed over long time intervals. Such problems require training algorithms which can handle hidden nodes. The most prominent of these algorithms, back propagation through time, solves the temporal credit assignment problem in a way which can work only if the relevant information is distributed locally in time. The Moving Targets algorithm works for the more general case, but is computationally intensive, and prone to local minima.
Abstract:
This paper reviews some basic issues and methods involved in using neural networks to respond in a desired fashion to a temporally-varying environment. Some popular network models and training methods are introduced. A speech recognition example is then used to illustrate the central difficulty of temporal data processing: learning to notice and remember relevant contextual information. Feedforward network methods are applicable to cases where this problem is not severe. The application of these methods is explained and applications are discussed in the areas of pure mathematics, chemical and physical systems, and economic systems. A more powerful but less practical algorithm for temporal problems, the moving targets algorithm, is sketched and discussed. For completeness, a few remarks are made on reinforcement learning.
Abstract:
A simple method for training the dynamical behavior of a neural network is derived. It is applicable to any training problem in discrete-time networks with arbitrary feedback. The algorithm resembles back-propagation in that an error function is minimized using a gradient-based method, but the optimization is carried out in the hidden part of state space either instead of, or in addition to, weight space. Computational results are presented for some simple dynamical training problems, one of which requires response to a signal 100 time steps in the past.
Abstract:
A simple method for training the dynamical behavior of a neural network is derived. It is applicable to any training problem in discrete-time networks with arbitrary feedback. The method resembles back-propagation in that it is a least-squares, gradient-based optimization method, but the optimization is carried out in the hidden part of state space instead of weight space. A straightforward adaptation of this method to feedforward networks offers an alternative to training by conventional back-propagation. Computational results are presented for simple dynamical training problems, with varied success. The failures appear to arise when the method converges to a chaotic attractor. A patch-up for this problem is proposed. The patch-up involves a technique for implementing inequality constraints which may be of interest in its own right.
Abstract:
In this paper, we discuss some practical implications for implementing adaptable network algorithms applied to non-stationary time series problems. Using electricity load data and training with the extended Kalman filter, we demonstrate that the dynamic model-order increment procedure of the resource allocating RBF network (RAN) is highly sensitive to the parameters of the novelty criterion. We investigate the use of system noise and forgetting factors for increasing the plasticity of the Kalman filter training algorithm, and discuss the consequences for on-line model order selection. We also find that a recently-proposed alternative novelty criterion, found to be more robust in stationary environments, does not fare so well in the non-stationary case due to the need for filter adaptability during training.
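The abstract does not spell out the novelty criterion. The sketch below assumes the two-threshold form commonly associated with the resource allocating network: a new basis function is allocated only when the input is far from every existing centre and the prediction error is large. The function name and the threshold parameters are hypothetical.

```python
import math


def is_novel(x, target, prediction, centres, distance_threshold, error_threshold):
    """Decide whether an observation should trigger a new RBF unit.

    Assumed criterion (not taken from the abstract): the input must lie
    further than `distance_threshold` from its nearest centre AND the
    absolute prediction error must exceed `error_threshold`.
    """
    # With no centres yet, the nearest distance is treated as infinite.
    nearest = min((math.dist(x, c) for c in centres), default=math.inf)
    return nearest > distance_threshold and abs(target - prediction) > error_threshold
```

Both thresholds gate model growth directly, so small changes to either can alter how many units are allocated, which is in line with the sensitivity the abstract reports.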
Abstract:
We are concerned with the problem of image segmentation in which each pixel is assigned to one of a predefined finite number of classes. In Bayesian image analysis, this requires fusing together local predictions for the class labels with a prior model of segmentations. Markov Random Fields (MRFs) have been used to incorporate some of this prior knowledge, but this is not entirely satisfactory as inference in MRFs is NP-hard. The multiscale quadtree model of Bouman and Shapiro (1994) is an attractive alternative, as this is a tree-structured belief network in which inference can be carried out in linear time (Pearl 1988). It is a hierarchical model where the bottom-level nodes are pixels, and higher levels correspond to downsampled versions of the image. The conditional-probability tables (CPTs) in the belief network encode the knowledge of how the levels interact. In this paper we discuss two methods of learning the CPTs given training data, using (a) maximum likelihood and the EM algorithm and (b) conditional maximum likelihood (CML). Segmentations obtained using networks trained by CML show a statistically-significant improvement in performance on synthetic images. We also demonstrate the methods on a real-world outdoor-scene segmentation task.
Abstract:
Most traditional methods for extracting the relationships between two time series are based on cross-correlation. In a non-linear non-stationary environment, these techniques are not sufficient. We show in this paper how to use hidden Markov models (HMMs) to identify the lag (or delay) between different variables for such data. We first present a method using maximum likelihood estimation and propose a simple algorithm which is capable of identifying associations between variables. We also adopt an information-theoretic approach and develop a novel procedure for training HMMs to maximise the mutual information between delayed time series. Both methods are successfully applied to real data. We model the oil drilling process with HMMs and estimate a crucial parameter, namely the lag for return.
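For context, the traditional cross-correlation baseline mentioned at the start of the abstract can be sketched as follows. This is not the HMM-based method the paper proposes; it is the simple approach the authors describe as insufficient for non-linear, non-stationary data. The function names are hypothetical, and the two series are assumed to have equal length.

```python
import math


def cross_correlation(x, y, lag):
    """Sample correlation between x[t] and y[t + lag], for lag >= 0."""
    n = len(x) - lag
    xs, ys = x[:n], y[lag:lag + n]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in xs))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in ys))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0  # a constant segment carries no correlation information
    return cov / (norm_x * norm_y)


def estimate_lag(x, y, max_lag):
    """Return the lag in 0..max_lag with the largest absolute correlation."""
    return max(range(max_lag + 1), key=lambda k: abs(cross_correlation(x, y, k)))
```

This estimator assumes a single fixed delay and a roughly linear relationship between the two series, which is exactly what breaks down in the setting the abstract targets.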
Abstract:
The deficiencies of stationary models applied to financial time series are well documented. A special form of non-stationarity, where the underlying generator switches between (approximately) stationary regimes, seems particularly appropriate for financial markets. We use dynamic switching (modelled by a hidden Markov model) combined with a linear dynamical system in a hybrid switching state space model (SSSM) and discuss the practical details of training such models with a variational EM algorithm due to [Ghahramani and Hinton, 1998]. The performance of the SSSM is evaluated on several financial data sets and it is shown to improve on a number of existing benchmark methods.
Abstract:
Most traditional methods for extracting the relationships between two time series are based on cross-correlation. In a non-linear non-stationary environment, these techniques are not sufficient. We show in this paper how to use hidden Markov models to identify the lag (or delay) between different variables for such data. Adopting an information-theoretic approach, we develop a procedure for training HMMs to maximise the mutual information (MMI) between delayed time series. The method is used to model the oil drilling process. We show that cross-correlation gives no information and that the MMI approach outperforms maximum likelihood.
Abstract:
The dynamics of supervised learning in layered neural networks were studied in the regime where the size of the training set is proportional to the number of inputs. The evolution of macroscopic observables, including the two relevant performance measures, can be predicted by using the dynamical replica theory. Three approximation schemes aimed at eliminating the need to solve a functional saddle-point equation at each time step have been derived.
Abstract:
Objectives — To map the tasks, activities and training provision for primary care pharmacists (PCPs) and to identify perceived future training needs. Methods — Survey undertaken in 1998/1999 using a pre-piloted, postal, self-completion questionnaire to two samples of PCPs. Setting — PCPs in (a) the West Midlands and (b) England (outside West Midlands). Key findings — The response rate was 66 per cent. A majority (68 per cent) had worked in the role for less than two years. Eighty per cent had some form of continuing education or training for the role although only 50 per cent had a formal qualification. Over two-thirds had contributed to the funding of their training, with one-third providing all funding. Seventy-four per cent of PCPs agreed that pharmacists should go through a procedure to ensure competence (accreditation) before being allowed to work for a general medical practice or primary care group. Views on the need for formal education/training prior to work differed: 82 per cent of those with formal qualifications, but only 46 per cent of those without, considered that this should be a requirement. There was general agreement that training/education had met training needs. Views on future training closely reflected previous training experiences, with a focus upon pharmaceutical roles rather than upon generic skill development and the acquisition of management skills. Conclusions — The study provides a snapshot in time of the experience of pioneer PCPs and the training available to them. PCPs will need further training or updating if they are to provide the wider roles required by the developing needs of the National Health Service. Consideration should be given to formal recognition of the training of PCPs in order to assure competence. The expectation that pharmacists should fund their own training is likely to be a barrier to uptake of training and uncertainties over funding will militate against consistency of training.
Abstract:
Despite its increasing popularity, much intercultural training is not developed with the same level of rigour as training in other areas. Further, research on intercultural training has brought inconsistent results about the effectiveness of such training. This PhD thesis develops a rigorous model of intercultural training and applies it to the preparation of British students going on work/study placements in France and Germany. It investigates the reasons for inconsistent training success by looking at the cognitive learning processes in intercultural training, relating them to training goals, and by examining the short- and long-term transfer of intercultural training into real-life encounters with people from other cultures. Two cognitive trainings based on critical incidents were designed for online delivery. The training content relied on cultural practice dimensions from the GLOBE study (House, Hanges, Javidan, Dorfman & Gupta, 2004). Of the two trainings, the 'single-mode training' aimed to develop declarative knowledge, which is necessary to analyse and understand other cultures. The 'concurrent training' aimed to develop declarative and procedural knowledge, which is needed to develop skills for dealing with difficult situations in a culturally appropriate way. Participants (N = 48) were randomly assigned to one of the two training conditions. Declarative learning appeared as a process of steady knowledge increase, while procedural learning involved cognitive re-categorisation rather than knowledge increase. In a negotiation role play with host-country nationals directly after the online training, participants of the concurrent training exhibited a more initiative negotiation style than participants of the single-mode training.
Comparing cultural adjustment and performance of training participants during their time abroad with an untrained control group, participants of the concurrent training showed the qualitatively best development in adjustment and performance. Besides intercultural training, multicultural personality traits were assessed and proved to be a powerful predictor of adjustment and, indirectly, of performance abroad.
Abstract:
This thesis covers two major aspects of pharmacy education: undergraduate education and pre-registration training. A cohort of pharmacy graduates was surveyed over a period of four years, on issues related to undergraduate education, pre-registration training and continuing education. These graduates were the first ever to sit the pre-registration examination. In addition, the opinions of pre-registration tutors were obtained on pre-registration training, during the year that competence-based assessment was introduced. It was concluded that although the undergraduate course provided a broad base of knowledge suitable for graduates in all branches of pharmacy, several issues were identified which would require attention in future developments of the course. These were: 1. the strong support for the expansion of clinical, social and practice-based teaching. 2. the strong support to retain the scientific content to the same extent as in the three-year course. 3. a greater use of problem-based learning methods. The graduates supported the provision of a pre-registration continuing education course to help prepare for the examination and in areas inadequately covered in the undergraduate course. There was also support for the introduction of some form of split branch training. There was no strong evidence to suggest that the training had been an application of undergraduate education. In general, competence-based training was well regarded by tutors as an appropriate and effective method of skill assessment. However, community tutors felt it was difficult to carry out effectively due to day-to-day time constraints. The assistant tutors in hospital pharmacy were found to have a very important role in provision of training, and should be adequately trained and supported. The study recommends the introduction of uniform training and a quality assurance mechanism for all tutors and assistants undertaking this role.