82 resultados para Statistical Learning Theory.


Relevância:

80.00% 80.00%

Publicador:

Resumo:

With the growth of the multi-national corporation (MNCs) has come the need to understand how parent companies transfer knowledge to, and manage the operations of, their subsidiaries. This is of particular interest to manufacturing companies transferring their operations overseas. Japanese companies in particular have been pioneering in the development of techniques such as Kaizen, and elements of the Toyota Production System (TPS) such as Kanban, which can be useful tools for transferring the ethos of Japanese manufacturing and maintaining quality and control in overseas subsidiaries. Much has been written about the process of transferring Japanese manufacturing techniques but much less is understood about how the subsidiaries themselves – which are required to make use of such techniques – actually acquire and incorporate them into their operations. This research therefore takes the perspective of the subsidiary in examining how knowledge of manufacturing techniques is transferred from the parent company within its surrounding (subsidiary). There is clearly a need to take a practice-based view to understanding how the local managers and operatives incorporate this knowledge into their working practices. A particularly relevant theme is how subsidiaries both replicate and adapt knowledge from parents and the circumstances in which replication or adaptation occurs. However, it is shown that there is a lack of research which takes an in-depth look at these processes from the perspective of the participants themselves. This is particularly important as much knowledge literature argues that knowledge is best viewed as enacted and learned in practice – and therefore transferred in person – rather than by the transfer of abstract and de-contextualised information. What is needed, therefore, is further research which makes an in-depth examination of what happens at the subsidiary level for this transfer process to occur. There is clearly a need to take a practice-based view to understanding how the local managers and operatives incorporate knowledge about manufacturing techniques into their working practices. In depth qualitative research was, therefore, conducted in the subsidiary of a Japanese multinational, Gambatte Corporation, involving three main manufacturing initiatives (or philosophies), namely 'TPS‘, 'TPM‘ and 'TS‘. The case data were derived from 52 in-depth interviews with project members, moderate-participant observations, and documentations and presented and analysed in episodes format. This study contributes to our understanding of knowledge transfer in relation to the approaches and circumstances of adaptation and replication of knowledge within the subsidiary, how the whole process is developed, and also how 'innovation‘ takes place. This study further understood that the process of knowledge transfer could be explained as a process of Reciprocal Provider-Learner Exchange that can be linked to the Experiential Learning Theory.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Purpose – The purpose of this paper is twofold: first, to provide a critical assessment of the literature on business incubation effectiveness and second, to submit a situated theoretical perspective on how business incubation management can provide an environment that supports the development of incubatee entrepreneurs and their businesses. Design/methodology/approach – The paper provides a narrative critical assessment of the literature on business incubation effectiveness. Definitional issues, performance aspects and approaches to establishing critical success factors in business incubation are discussed. Business incubation management is identified as an overarching factor for theorising on business incubation effectiveness. Findings – The literature on business incubation effectiveness suffers from several deficiencies, including definitional incongruence, descriptive accounts, fragmentation and lack of strong conceptual grounding. Notwithstanding the growth of research on this domain, understanding of how entrepreneurs and their businesses develop within the business incubator environment remains limited. Given the importance of relational, intangible factors in business incubation and the critical role of business incubation management in orchestrating and optimising such factors, it is suggested that theorising efforts would benefit from a situated perspective. Originality/value – The identification of specific shortcomings in the literature on business incubation highlights the need for more systematic efforts towards theory building. It is suggested that focusing on the role of business incubation management from a situated learning theory perspective can lend itself to a more profound understanding of the development process of incubatee entrepreneurs and their firms. Theoretical propositions are offered to this effect, as well as avenues for future research.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In this paper we review recent theoretical approaches for analysing the dynamics of on-line learning in multilayer neural networks using methods adopted from statistical physics. The analysis is based on monitoring a set of macroscopic variables from which the generalisation error can be calculated. A closed set of dynamical equations for the macroscopic variables is derived analytically and solved numerically. The theoretical framework is then employed for defining optimal learning parameters and for analysing the incorporation of second order information into the learning process using natural gradient descent and matrix-momentum based methods. We will also briefly explain an extension of the original framework for analysing the case where training examples are sampled with repetition.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

A formalism for describing the dynamics of Genetic Algorithms (GAs) using method s from statistical mechanics is applied to the problem of generalization in a perceptron with binary weights. The dynamics are solved for the case where a new batch of training patterns is presented to each population member each generation, which considerably simplifies the calculation. The theory is shown to agree closely to simulations of a real GA averaged over many runs, accurately predicting the mean best solution found. For weak selection and large problem size the difference equations describing the dynamics can be expressed analytically and we find that the effects of noise due to the finite size of each training batch can be removed by increasing the population size appropriately. If this population resizing is used, one can deduce the most computationally efficient size of training batch each generation. For independent patterns this choice also gives the minimum total number of training patterns used. Although using independent patterns is a very inefficient use of training patterns in general, this work may also prove useful for determining the optimum batch size in the case where patterns are recycled.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

A formalism for modelling the dynamics of Genetic Algorithms (GAs) using methods from statistical mechanics, originally due to Prugel-Bennett and Shapiro, is reviewed, generalized and improved upon. This formalism can be used to predict the averaged trajectory of macroscopic statistics describing the GA's population. These macroscopics are chosen to average well between runs, so that fluctuations from mean behaviour can often be neglected. Where necessary, non-trivial terms are determined by assuming maximum entropy with constraints on known macroscopics. Problems of realistic size are described in compact form and finite population effects are included, often proving to be of fundamental importance. The macroscopics used here are cumulants of an appropriate quantity within the population and the mean correlation (Hamming distance) within the population. Including the correlation as an explicit macroscopic provides a significant improvement over the original formulation. The formalism is applied to a number of simple optimization problems in order to determine its predictive power and to gain insight into GA dynamics. Problems which are most amenable to analysis come from the class where alleles within the genotype contribute additively to the phenotype. This class can be treated with some generality, including problems with inhomogeneous contributions from each site, non-linear or noisy fitness measures, simple diploid representations and temporally varying fitness. The results can also be applied to a simple learning problem, generalization in a binary perceptron, and a limit is identified for which the optimal training batch size can be determined for this problem. The theory is compared to averaged results from a real GA in each case, showing excellent agreement if the maximum entropy principle holds. Some situations where this approximation brakes down are identified. In order to fully test the formalism, an attempt is made on the strong sc np-hard problem of storing random patterns in a binary perceptron. Here, the relationship between the genotype and phenotype (training error) is strongly non-linear. Mutation is modelled under the assumption that perceptron configurations are typical of perceptrons with a given training error. Unfortunately, this assumption does not provide a good approximation in general. It is conjectured that perceptron configurations would have to be constrained by other statistics in order to accurately model mutation for this problem. Issues arising from this study are discussed in conclusion and some possible areas of further research are outlined.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

A theoretical model is presented which describes selection in a genetic algorithm (GA) under a stochastic fitness measure and correctly accounts for finite population effects. Although this model describes a number of selection schemes, we only consider Boltzmann selection in detail here as results for this form of selection are particularly transparent when fitness is corrupted by additive Gaussian noise. Finite population effects are shown to be of fundamental importance in this case, as the noise has no effect in the infinite population limit. In the limit of weak selection we show how the effects of any Gaussian noise can be removed by increasing the population size appropriately. The theory is tested on two closely related problems: the one-max problem corrupted by Gaussian noise and generalization in a perceptron with binary weights. The averaged dynamics can be accurately modelled for both problems using a formalism which describes the dynamics of the GA using methods from statistical mechanics. The second problem is a simple example of a learning problem and by considering this problem we show how the accurate characterization of noise in the fitness evaluation may be relevant in machine learning. The training error (negative fitness) is the number of misclassified training examples in a batch and can be considered as a noisy version of the generalization error if an independent batch is used for each evaluation. The noise is due to the finite batch size and in the limit of large problem size and weak selection we show how the effect of this noise can be removed by increasing the population size. This allows the optimal batch size to be determined, which minimizes computation time as well as the total number of training examples required.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

An unsupervised learning procedure based on maximizing the mutual information between the outputs of two networks receiving different but statistically dependent inputs is analyzed (Becker S. and Hinton G., Nature, 355 (1992) 161). By exploiting a formal analogy to supervised learning in parity machines, the theory of zero-temperature Gibbs learning for the unsupervised procedure is presented for the case that the networks are perceptrons and for the case of fully connected committees.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

A novel approach, based on statistical mechanics, to analyze typical performance of optimum code-division multiple-access (CDMA) multiuser detectors is reviewed. A `black-box' view ot the basic CDMA channel is introduced, based on which the CDMA multiuser detection problem is regarded as a `learning-from-examples' problem of the `binary linear perceptron' in the neural network literature. Adopting Bayes framework, analysis of the performance of the optimum CDMA multiuser detectors is reduced to evaluation of the average of the cumulant generating function of a relevant posterior distribution. The evaluation of the average cumulant generating function is done, based on formal analogy with a similar calculation appearing in the spin glass theory in statistical mechanics, by making use of the replica method, a method developed in the spin glass theory.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Natural language understanding is to specify a computational model that maps sentences to their semantic mean representation. In this paper, we propose a novel framework to train the statistical models without using expensive fully annotated data. In particular, the input of our framework is a set of sentences labeled with abstract semantic annotations. These annotations encode the underlying embedded semantic structural relations without explicit word/semantic tag alignment. The proposed framework can automatically induce derivation rules that map sentences to their semantic meaning representations. The learning framework is applied on two statistical models, the conditional random fields (CRFs) and the hidden Markov support vector machines (HM-SVMs). Our experimental results on the DARPA communicator data show that both CRFs and HM-SVMs outperform the baseline approach, previously proposed hidden vector state (HVS) model which is also trained on abstract semantic annotations. In addition, the proposed framework shows superior performance than two other baseline approaches, a hybrid framework combining HVS and HM-SVMs and discriminative training of HVS, with a relative error reduction rate of about 25% and 15% being achieved in F-measure.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This article reports on an investigationwith first year undergraduate ProductDesign and Management students within a School of Engineering and Applied Science. The students at the time of this investigation had studied fundamental engineering science and mathematics for one semester. The students were given an open ended, ill-formed problem which involved designing a simple bridge to cross a river.They were given a talk on problemsolving and given a rubric to follow, if they chose to do so.They were not given any formulae or procedures needed in order to resolve the problem. In theory, they possessed the knowledge to ask the right questions in order tomake assumptions but, in practice, it turned out they were unable to link their a priori knowledge to resolve this problem. They were able to solve simple beam problems when given closed questions. The results show they were unable to visualize a simple bridge as an augmented beam problem and ask pertinent questions and hence formulate appropriate assumptions in order to offer resolutions.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This paper reports on an investigation with first year undergraduate Product Design and Management students within a School of Engineering. The students at the time of this investigation had studied fundamental engineering science and mathematics for one semester. The students were given an open ended, ill formed problem which involved designing a simple bridge to cross a river. They were given a talk on problem solving and given a rubric to follow, if they chose to do so. They were not given any formulae or procedures needed in order to resolve the problem. In theory, they possessed the knowledge to ask the right questions in order to make assumptions but, in practice, it turned out they were unable to link their a priori knowledge to resolve this problem. They were able to solve simple beam problems when given closed questions. The results show they were unable to visualise a simple bridge as an augmented beam problem and ask pertinent questions and hence formulate appropriate assumptions in order to offer resolutions.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Neural networks can be regarded as statistical models, and can be analysed in a Bayesian framework. Generalisation is measured by the performance on independent test data drawn from the same distribution as the training data. Such performance can be quantified by the posterior average of the information divergence between the true and the model distributions. Averaging over the Bayesian posterior guarantees internal coherence; Using information divergence guarantees invariance with respect to representation. The theory generalises the least mean squares theory for linear Gaussian models to general problems of statistical estimation. The main results are: (1)~the ideal optimal estimate is always given by average over the posterior; (2)~the optimal estimate within a computational model is given by the projection of the ideal estimate to the model. This incidentally shows some currently popular methods dealing with hyperpriors are in general unnecessary and misleading. The extension of information divergence to positive normalisable measures reveals a remarkable relation between the dlt dual affine geometry of statistical manifolds and the geometry of the dual pair of Banach spaces Ld and Ldd. It therefore offers conceptual simplification to information geometry. The general conclusion on the issue of evaluating neural network learning rules and other statistical inference methods is that such evaluations are only meaningful under three assumptions: The prior P(p), describing the environment of all the problems; the divergence Dd, specifying the requirement of the task; and the model Q, specifying available computing resources.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Neural networks are statistical models and learning rules are estimators. In this paper a theory for measuring generalisation is developed by combining Bayesian decision theory with information geometry. The performance of an estimator is measured by the information divergence between the true distribution and the estimate, averaged over the Bayesian posterior. This unifies the majority of error measures currently in use. The optimal estimators also reveal some intricate interrelationships among information geometry, Banach spaces and sufficient statistics.