797 results for Statistical Learning Theory.
Abstract:
The Support Vector (SV) machine is a novel type of learning machine, based on statistical learning theory, which contains polynomial classifiers, neural networks, and radial basis function (RBF) networks as special cases. In the RBF case, the SV algorithm automatically determines centers, weights and threshold such as to minimize an upper bound on the expected test error. The present study is devoted to an experimental comparison of these machines with a classical approach, where the centers are determined by $k$--means clustering and the weights are found using error backpropagation. We consider three machines, namely a classical RBF machine, an SV machine with Gaussian kernel, and a hybrid system with the centers determined by the SV method and the weights trained by error backpropagation. Our results show that on the US postal service database of handwritten digits, the SV machine achieves the highest test accuracy, followed by the hybrid approach. The SV approach is thus not only theoretically well--founded, but also superior in a practical application.
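A minimal sketch (not the authors' code) of the two designs this abstract compares: an SV machine with Gaussian kernel, and a "classical" RBF network whose centers come from k-means clustering. scikit-learn's digits dataset stands in for the USPS database, and a logistic-regression readout stands in for the backpropagation-trained output weights; both substitutions are assumptions for illustration only.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# SV machine: centers (support vectors), weights and threshold are all chosen
# by the SV algorithm so as to minimize a bound on the expected test error.
svm = SVC(kernel="rbf", gamma=0.001, C=10.0).fit(X_train, y_train)
print("SVM test accuracy:", svm.score(X_test, y_test))

# Classical RBF machine: centers fixed in advance by k-means clustering,
# then only the output weights are trained on the hidden-layer activations.
centers = KMeans(n_clusters=100, n_init=10, random_state=0).fit(X_train).cluster_centers_
Phi_train = rbf_kernel(X_train, centers, gamma=0.001)
Phi_test = rbf_kernel(X_test, centers, gamma=0.001)
rbf_net = LogisticRegression(max_iter=2000).fit(Phi_train, y_train)
print("k-means RBF test accuracy:", rbf_net.score(Phi_test, y_test))
```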
Abstract:
We present an overview of current research on artificial neural networks, emphasizing a statistical perspective. We view neural networks as parameterized graphs that make probabilistic assumptions about data, and view learning algorithms as methods for finding parameter values that look probable in the light of the data. We discuss basic issues in representation and learning, and treat some of the practical issues that arise in fitting networks to data. We also discuss links between neural networks and the general formalism of graphical models.
Abstract:
When training Support Vector Machines (SVMs) over non-separable data sets, one sets the threshold $b$ using any dual cost coefficient that is strictly between the bounds of $0$ and $C$. We show that there exist SVM training problems with dual optimal solutions with all coefficients at bounds, but that all such problems are degenerate in the sense that the "optimal separating hyperplane" is given by $\mathbf{w} = \mathbf{0}$, and the resulting (degenerate) SVM will classify all future points identically (to the class that supplies more training data). We also derive necessary and sufficient conditions on the input data for this to occur. Finally, we show that an SVM training problem can always be made degenerate by the addition of a single data point belonging to a certain unbounded polyhedron, which we characterize in terms of its extreme points and rays.
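A small sketch (my own illustration, not the paper's code) of the threshold computation the abstract refers to: $b$ is recovered from any dual coefficient strictly between $0$ and $C$ via the KKT conditions, and the degenerate all-at-bounds case is flagged separately.

```python
import numpy as np

def compute_threshold(alpha, y, K, C, tol=1e-8):
    """Return b from a free support vector, or None if no coefficient is strictly inside (0, C).

    alpha : dual coefficients, shape (n,)
    y     : labels in {-1, +1}, shape (n,)
    K     : kernel (Gram) matrix, shape (n, n)
    C     : box constraint
    """
    free = np.where((alpha > tol) & (alpha < C - tol))[0]
    if free.size == 0:
        # All coefficients at bounds: the paper shows such problems are degenerate,
        # with w = 0 and every future point classified identically.
        return None
    i = free[0]
    # KKT condition for a free SV: y_i * (sum_j alpha_j y_j K(x_j, x_i) + b) = 1.
    return y[i] - np.dot(alpha * y, K[:, i])
```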
Abstract:
This paper investigates the detection of architectural distortion in mammographic images using a support vector machine. The Hausdorff dimension is used to characterise the texture features of mammographic images. A support vector machine, a learning machine based on statistical learning theory, is trained through supervised learning to detect architectural distortion. Compared to Radial Basis Function neural networks, the SVM produced more accurate classification results in distinguishing architectural distortion abnormality from normal breast parenchyma.
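A rough sketch of turning a region of interest into a single fractal-texture feature for such a classifier. It assumes a binary (thresholded) image and uses box counting as a proxy for the Hausdorff dimension mentioned in the abstract; the paper's actual estimator is not specified here.

```python
import numpy as np

def box_counting_dimension(img, sizes=(2, 4, 8, 16, 32)):
    """Estimate the fractal dimension of a 2-D binary array by box counting."""
    counts = []
    for s in sizes:
        h = img.shape[0] // s * s
        w = img.shape[1] // s * s
        blocks = img[:h, :w].reshape(h // s, s, w // s, s)
        occupied = blocks.any(axis=(1, 3)).sum()   # boxes containing structure
        counts.append(max(occupied, 1))
    # Slope of log(count) versus log(1/size) approximates the dimension.
    slope, _ = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
    return slope
```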
Abstract:
Skin cancer is the most common of all cancers, and the increase in its incidence is due, in part, to people's behavior with regard to sun exposure. In Brazil, non-melanoma skin cancer is the most frequent in most regions. Dermatoscopy and videodermatoscopy are the main types of examination for the diagnosis of dermatological skin diseases. The use of computational tools to aid or follow up the medical diagnosis of dermatological lesions is a very recent field, and several methods have been proposed for the automatic classification of skin pathologies from images. The present work presents a new intelligent methodology for the analysis and classification of skin cancer images, based on digital image processing techniques for the extraction of color, shape, and texture features, using the Wavelet Packet Transform (WPT) and a learning technique called the Support Vector Machine (SVM). The Wavelet Packet Transform is applied to extract texture features from the images. The WPT consists of a set of basis functions that represent the image in different frequency bands, each with a distinct resolution corresponding to its scale. In addition, the color characteristics of the lesion are computed; these depend on a visual context and are influenced by the colors present in its surroundings. Shape attributes are obtained through Fourier descriptors. The Support Vector Machine, which is based on the structural risk minimization principle from statistical learning theory, is used for the classification task. The SVM constructs optimal hyperplanes that represent the separation between classes; the generated hyperplane is determined by a subset of the training samples, called support vectors. For the database used in this work, the results showed good performance, with an overall accuracy of 92.73% for melanoma and 86% for non-melanoma and benign lesions. The extracted descriptors and the SVM classifier form a method capable of recognizing and classifying the analyzed skin lesions.
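A simplified sketch of the feature pipeline the abstract describes, not the paper's implementation: wavelet-packet subband energies as texture features, combined with simple color statistics, fed to an SVM. It assumes PyWavelets (pywt) and scikit-learn are available; the Fourier shape descriptors are omitted for brevity, and the color statistics are placeholders.

```python
import numpy as np
import pywt
from sklearn.svm import SVC

def texture_features(gray, wavelet="db2", level=2):
    """Energy of each wavelet-packet subband of a 2-D grayscale image."""
    wp = pywt.WaveletPacket2D(data=gray, wavelet=wavelet, maxlevel=level)
    return np.array([np.mean(node.data ** 2) for node in wp.get_level(level)])

def color_features(rgb):
    """Per-channel mean and standard deviation of the lesion region."""
    return np.concatenate([rgb.mean(axis=(0, 1)), rgb.std(axis=(0, 1))])

def lesion_features(rgb):
    gray = rgb.mean(axis=2)            # crude grayscale conversion
    return np.concatenate([texture_features(gray), color_features(rgb)])

# classifier = SVC(kernel="rbf").fit(np.stack([lesion_features(im) for im in images]), labels)
```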
Abstract:
Minimax lower bounds for concept learning state, for example, that for each sample size $n$ and learning rule $g_n$, there exists a distribution of the observation $X$ and a concept $C$ to be learnt such that the expected error of $g_n$ is at least a constant times $V/n$, where $V$ is the VC dimension of the concept class. However, these bounds do not tell anything about the rate of decrease of the error for a {\sl fixed} distribution--concept pair. In this paper we investigate minimax lower bounds in such a -- stronger -- sense. We show that for several natural $k$--parameter concept classes, including the class of linear halfspaces, the class of balls, the class of polyhedra with a certain number of faces, and a class of neural networks, for any {\sl sequence} of learning rules $\{g_n\}$, there exists a fixed distribution of $X$ and a fixed concept $C$ such that the expected error is larger than a constant times $k/n$ for {\sl infinitely many} $n$. We also obtain such strong minimax lower bounds for the tail distribution of the probability of error, which extend the corresponding minimax lower bounds.
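In the abstract's notation, the contrast between the two kinds of bound can be summarized schematically as follows (my paraphrase; the exact constants and conditions are in the paper).

```latex
% Classical minimax lower bound: for each fixed sample size n, some
% (distribution, concept) pair is hard for the rule g_n.
\inf_{g_n}\ \sup_{P_X,\, C \in \mathcal{C}} \mathbb{E}\!\left[L(g_n)\right] \;\ge\; c\,\frac{V}{n}

% Strong version studied in the paper: one fixed pair (P_X, C) is hard for a
% whole sequence of rules, infinitely often.
\exists\,(P_X, C):\quad \mathbb{E}\!\left[L(g_n)\right] \;\ge\; c'\,\frac{k}{n}
\quad \text{for infinitely many } n .
```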
Abstract:
We study opinion dynamics in a population of interacting adaptive agents voting on a set of issues represented by vectors. We consider agents who can classify issues into one of two categories and can arrive at their opinions using an adaptive algorithm. Adaptation comes from learning and the information for the learning process comes from interacting with other neighboring agents and trying to change the internal state in order to concur with their opinions. The change in the internal state is driven by the information contained in the issue and in the opinion of the other agent. We present results in a simple yet rich context where each agent uses a Boolean perceptron to state their opinion. If the update occurs with information asynchronously exchanged among pairs of agents, then the typical case, if the number of issues is kept small, is the evolution into a society torn by the emergence of factions with extreme opposite beliefs. This occurs even when seeking consensus with agents with opposite opinions. If the number of issues is large, the dynamics becomes trapped, the society does not evolve into factions and a distribution of moderate opinions is observed. The synchronous case is technically simpler and is studied by formulating the problem in terms of differential equations that describe the evolution of order parameters that measure the consensus between pairs of agents. We show that for a large number of issues and unidirectional information flow, global consensus is a fixed point; however, the approach to this consensus is glassy for large societies.
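A toy simulation sketch of the asynchronous setup described above, based on my own reading rather than the authors' code: each agent holds a weight vector, states an opinion sign(w . x) on a random issue x, and nudges its weights toward a randomly chosen neighbor's opinion with a perceptron-like update. The network, learning rate, and issue distribution are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, dim, n_steps = 50, 5, 20000
W = rng.normal(size=(n_agents, dim))      # each row: one agent's internal state
eta = 0.05                                # learning rate

for _ in range(n_steps):
    i, j = rng.choice(n_agents, size=2, replace=False)
    x = rng.normal(size=dim)              # the issue under discussion
    opinion_j = np.sign(W[j] @ x)
    if np.sign(W[i] @ x) != opinion_j:    # agent i adapts only on disagreement
        W[i] += eta * opinion_j * x       # move toward the neighbor's opinion

# Pairwise overlaps of the normalized states measure consensus vs. factionalization.
Wn = W / np.linalg.norm(W, axis=1, keepdims=True)
print("mean pairwise overlap:", (Wn @ Wn.T).mean())
```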
Abstract:
Evaluative learning theory states that affective learning, the acquisition of likes and dislikes, is qualitatively different from relational learning, the learning of predictive relationships among stimuli. Three experiments tested the prediction derived from evaluative learning theory that relational learning, but not affective learning, is affected by stimulus competition by comparing performance during two conditional stimuli, one trained in a superconditioning procedure and the other in a blocking procedure. Ratings of unconditional stimulus expectancy and electrodermal responses indicated stimulus competition in relational learning. Evidence for stimulus competition in affective learning was provided by verbal ratings of conditional stimulus pleasantness and by measures of blink startle modulation. Taken together, the present experiments demonstrate stimulus competition in relational and affective learning, a result inconsistent with evaluative learning theory. (C) 2001 Academic Press.
Abstract:
This theoretical note describes an expansion of the behavioral prediction equation, in line with the greater complexity encountered in models of structured learning theory (R. B. Cattell, 1996a). This presents learning theory with a vector substitute for the simpler scalar quantities by which traditional Pavlovian-Skinnerian models have hitherto been represented. Structured learning can be demonstrated by vector changes across a range of intrapersonal psychological variables (ability, personality, motivation, and state constructs). Its use with motivational dynamic trait measures (R. B. Cattell, 1985) should reveal new theoretical possibilities for scientifically monitoring change processes (dynamic calculus model; R. B. Cattell, 1996b), such as encountered within psychotherapeutic settings (R. B. Cattell, 1987). The enhanced behavioral prediction equation suggests that static conceptualizations of personality structure such as the Big Five model are less than optimal.
Abstract:
Low noise surfaces have been increasingly considered as a viable and cost-effective alternative to acoustical barriers. However, road planners and administrators frequently lack information on the correlation between the type of road surface and the resulting noise emission profile. To address this problem, a method to identify and classify different types of road pavements was developed, whereby near field road noise is analyzed using statistical learning methods. The vehicle rolling sound signal near the tires and close to the road surface was acquired by two microphones in a special arrangement which implements the Close-Proximity method. A set of features, characterizing the properties of the road pavement, was extracted from the corresponding sound profiles. A feature selection method was used to automatically select those that are most relevant in predicting the type of pavement, while reducing the computational cost. A set of different types of road pavement segments were tested and the performance of the classifier was evaluated. Results of pavement classification performed during a road journey are presented on a map, together with geographical data. This procedure leads to a considerable improvement in the quality of road pavement noise data, thereby increasing the accuracy of road traffic noise prediction models.
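A schematic sketch of the classification pipeline the abstract outlines: near-field rolling-noise signal, feature extraction, feature selection, then pavement-type prediction. The abstract does not name the features, selector, or classifier, so Welch band energies, univariate selection, and an SVM are used here purely as placeholder choices.

```python
import numpy as np
from scipy.signal import welch
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.svm import SVC

def band_energies(signal, fs, n_bands=32):
    """Average spectral power in n_bands equal-width frequency bands."""
    freqs, psd = welch(signal, fs=fs, nperseg=2048)
    return np.array([band.mean() for band in np.array_split(psd, n_bands)])

# X: one row of band energies per road segment, y: pavement-type labels.
pipeline = make_pipeline(StandardScaler(), SelectKBest(f_classif, k=10), SVC(kernel="rbf"))
# pipeline.fit(X_train, y_train); pipeline.predict(X_test)
```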
Abstract:
The aim of this paper is to present an adaptation model for an Adaptive Educational Hypermedia System, PCMAT. The adaptation of the application is based on progressive self-assessment (exercises, tasks, and so on) and applies the constructivist learning theory and the learning styles theory. Our objective is the creation of a better, more adequate adaptation model that takes into account the complexities of different users.
Abstract:
This study displays and analyzes the contents of the Mathematics subject in ESO's second cycle from a constructivist perspective. The analysis has been carried out by contrasting two groups of participants (a control group and an experimental group), formed by a sample of 240 students between the ages of 14 and 16 from four different educational centres in the Osona area. An action-research methodology has been employed, combining quantitative techniques (a statistical study with the SPSS package) with qualitative analysis (transcriptions of interviews and a discussion group). This study has been carried out after years of classroom observation, reflection and action. The theoretical framework employed is a cognitive one, based on Ausubel's Significative Learning Theory. The quantitative analysis shows how the researcher's design improves, on the one hand, the students' academic motivation and, on the other hand, their comprehensive memory, enabling them to achieve a more significant learning of the subject's contents. Furthermore, our analysis shows that the proposed method is more comprehensive than those employed by the teachers collaborating with the control groups. The main aim of the qualitative analysis is to identify the elements which configure the programme and contribute to an improvement of the aspects mentioned above. The key elements here are: co-operation as the basis of group dynamics; the use, in some cases, of easily handled materials; the type of interaction between teacher and students, where, through open discussion, students are led by teaching staff towards the course objectives; and induction, that is, deducing formulae by initially using examples which are close to the students' knowledge and experience or taken from everyday life (what we could call "down-top" mathematics). We should add that the qualitative analysis not only corroborates the results obtained by quantitative techniques, but also reveals an increase in the motivation of the teaching staff: teachers showed a positive attitude and welcomed the use and development of these materials in the next academic year. Finally, we discuss possible directions for further research.
Abstract:
We report experiments designed to test between Nash equilibria that are stable and unstable under learning. The "TASP" (Time Average of the Shapley Polygon) gives a precise prediction about what happens when there is divergence from equilibrium under fictitious-play-like learning processes. We use two 4 x 4 games, each with a unique mixed Nash equilibrium; one is stable and one is unstable under learning. Both games are versions of Rock-Paper-Scissors with the addition of a fourth strategy, Dumb. Nash equilibrium places a weight of 1/2 on Dumb in both games, but the TASP places no weight on Dumb when the equilibrium is unstable. We also vary the level of monetary payoffs, with higher payoffs predicted to increase instability. We find that the high-payoff unstable treatment differs from the others: the frequency of Dumb is lower and play is further from Nash than in the other treatments. That is, we find support for the comparative statics prediction of learning theory, although the frequency of Dumb is substantially greater than zero in the unstable treatments.
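A minimal fictitious-play sketch, my own illustration rather than the paper's design: an agent repeatedly best-responds to the empirical frequencies of past play, and the time average of its choices is the object the TASP prediction concerns. The payoff matrix below is a generic Rock-Paper-Scissors augmented with a fourth "Dumb" strategy, not the exact parameterization used in the experiments.

```python
import numpy as np

# Hypothetical row-player payoffs; the last row/column is the added "Dumb" strategy.
A = np.array([[ 0.0, -1.0,  1.0, 0.0],
              [ 1.0,  0.0, -1.0, 0.0],
              [-1.0,  1.0,  0.0, 0.0],
              [ 0.0,  0.0,  0.0, 0.1]])

counts = np.ones(4)                       # empirical counts of past play (uniform prior)
history = []
for t in range(20000):
    beliefs = counts / counts.sum()
    best_reply = int(np.argmax(A @ beliefs))   # best response to the empirical frequencies
    counts[best_reply] += 1
    history.append(best_reply)

freq = np.bincount(history, minlength=4) / len(history)
print("time-average frequencies (R, P, S, Dumb):", freq)
```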
Abstract:
This paper investigates the role of learning by private agents and the central bank (two-sided learning) in a New Keynesian framework in which both sides of the economy have asymmetric and imperfect knowledge about the true data generating process. We assume that all agents employ the data that they observe (which may be distinct for different sets of agents) to form beliefs about unknown aspects of the true model of the economy, use their beliefs to decide on actions, and revise these beliefs through a statistical learning algorithm as new information becomes available. We study the short-run dynamics of our model and derive its policy recommendations, particularly with respect to central bank communications. We demonstrate that two-sided learning can generate substantial increases in volatility and persistence, and alter the behavior of the variables in the model in a significant way. Our simulations do not converge to a symmetric rational expectations equilibrium and we highlight one source that invalidates the convergence results of Marcet and Sargent (1989). Finally, we identify a novel aspect of central bank communication in models of learning: communication can be harmful if the central bank's model is substantially mis-specified.
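A bare-bones sketch of the kind of statistical learning algorithm such models typically use: constant-gain recursive least squares, with each set of agents running its own copy on the data it observes. The specific model equations and the gain parameter are assumptions for illustration, not taken from the paper.

```python
import numpy as np

class RecursiveLeastSquares:
    """Update beliefs beta about y_t = x_t' beta + noise, one observation at a time."""

    def __init__(self, dim, gain=0.02):
        self.beta = np.zeros(dim)   # current belief about the reduced-form coefficients
        self.R = np.eye(dim)        # estimate of the regressor moment matrix
        self.gain = gain            # constant gain: geometrically discounts old data

    def update(self, x, y):
        self.R += self.gain * (np.outer(x, x) - self.R)
        self.beta += self.gain * np.linalg.solve(self.R, x * (y - x @ self.beta))
        return self.beta
```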