952 resultados para statistical learning


Relevância:

100.00% 100.00%

Publicador:

Resumo:

A classical condition for fast learning rates is the margin condition, first introduced by Mammen and Tsybakov. We tackle in this paper the problem of adaptivity to this condition in the context of model selection, in a general learning framework. Actually, we consider a weaker version of this condition that allows one to take into account that learning within a small model can be much easier than within a large one. Requiring this “strong margin adaptivity” makes the model selection problem more challenging. We first prove, in a general framework, that some penalization procedures (including local Rademacher complexities) exhibit this adaptivity when the models are nested. Contrary to previous results, this holds with penalties that only depend on the data. Our second main result is that strong margin adaptivity is not always possible when the models are not nested: for every model selection procedure (even a randomized one), there is a problem for which it does not demonstrate strong margin adaptivity.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This chapter argues for the need to restructure children’s statistical experiences from the beginning years of formal schooling. The ability to understand and apply statistical reasoning is paramount across all walks of life, as seen in the variety of graphs, tables, diagrams, and other data representations requiring interpretation. Young children are immersed in our data-driven society, with early access to computer technology and daily exposure to the mass media. With the rate of data proliferation have come increased calls for advancing children’s statistical reasoning abilities, commencing with the earliest years of schooling (e.g., Langrall et al. 2008; Lehrer and Schauble 2005; Shaughnessy 2010; Whitin and Whitin 2011). Several articles (e.g., Franklin and Garfield 2006; Langrall et al. 2008) and policy documents (e.g., National Council of Teachers ofMathematics 2006) have highlighted the need for a renewed focus on this component of early mathematics learning, with children working mathematically and scientifically in dealing with realworld data. One approach to this component in the beginning school years is through data modelling (English 2010; Lehrer and Romberg 1996; Lehrer and Schauble 2000, 2007)...

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Two algorithms are outlined, each of which has interesting features for modeling of spatial variability of rock depth. In this paper, reduced level of rock at Bangalore, India, is arrived from the 652 boreholes data in the area covering 220 sqa <.km. Support vector machine (SVM) and relevance vector machine (RVM) have been utilized to predict the reduced level of rock in the subsurface of Bangalore and to study the spatial variability of the rock depth. The support vector machine (SVM) that is firmly based on the theory of statistical learning theory uses regression technique by introducing epsilon-insensitive loss function has been adopted. RVM is a probabilistic model similar to the widespread SVM, but where the training takes place in a Bayesian framework. Prediction results show the ability of learning machine to build accurate models for spatial variability of rock depth with strong predictive capabilities. The paper also highlights the capability ofRVM over the SVM model.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Statistical learning can be used to extract the words from continuous speech. Gómez, Bion, and Mehler (Language and Cognitive Processes, 26, 212–223, 2011) proposed an online measure of statistical learning: They superimposed auditory clicks on a continuous artificial speech stream made up of a random succession of trisyllabic nonwords. Participants were instructed to detect these clicks, which could be located either within or between words. The results showed that, over the length of exposure, reaction times (RTs) increased more for within-word than for between-word clicks. This result has been accounted for by means of statistical learning of the between-word boundaries. However, even though statistical learning occurs without an intention to learn, it nevertheless requires attentional resources. Therefore, this process could be affected by a concurrent task such as click detection. In the present study, we evaluated the extent to which the click detection task indeed reflects successful statistical learning. Our results suggest that the emergence of RT differences between within- and between-word click detection is neither systematic nor related to the successful segmentation of the artificial language. Therefore, instead of being an online measure of learning, the click detection task seems to interfere with the extraction of statistical regularities.

Relevância:

100.00% 100.00%

Publicador:

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Learning to talk about motion in a second language is very difficult because it involves restructuring deeply entrenched patterns from the first language (Slobin 1996). In this paper we argue that statistical learning (Saffran et al. 1997) can explain why L2 learners are only partially successful in restructuring their second language grammars. We explore to what extent L2 learners make use of two mechanisms of statistical learning, entrenchment and pre-emption (Boyd and Goldberg 2011) to acquire target-like expressions of motion and retreat from overgeneralisation in this domain. Paying attention to the frequency of existing patterns in the input can help learners to adjust the frequency with which they use path and manner verbs in French but is insufficient to acquire the boundary crossing constraint (Slobin and Hoiting 1994) and learn what not to say. We also look at the role of language proficiency and exposure to French in explaining the findings.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This considers the challenging task of cancer prediction based on microarray data for the medical community. The research was conducted on mostly common cancers (breast, colon, long, prostate and leukemia) microarray data analysis, and suggests the use of modern machine learning techniques to predict cancer.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Recent advances in the field of statistical learning have established that learners are able to track regularities of multimodal stimuli, yet it is unknown whether the statistical computations are performed on integrated representations or on separate, unimodal representations. In the present study, we investigated the ability of adults to integrate audio and visual input during statistical learning. We presented learners with a speech stream synchronized with a video of a speaker's face. In the critical condition, the visual (e.g., /gi/) and auditory (e.g., /mi/) signals were occasionally incongruent, which we predicted would produce the McGurk illusion, resulting in the perception of an audiovisual syllable (e.g., /ni/). In this way, we used the McGurk illusion to manipulate the underlying statistical structure of the speech streams, such that perception of these illusory syllables facilitated participants' ability to segment the speech stream. Our results therefore demonstrate that participants can integrate audio and visual input to perceive the McGurk illusion during statistical learning. We interpret our findings as support for modality-interactive accounts of statistical learning.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Typically, statistical learning is investigated by testing the acquisition of specific items or forming general rules. As implicit sequence learning also involves the extraction of regularities from the environment, it can also be considered as an instance of statistical learning. In the present study, a Serial Reaction Time Task was used to test whether the continuous versus interleaved repetition of a sequence affects implicit learning despite the equal exposure to the sequences. The results revealed a sequence learning advantage for the continuous repetition condition compared to the interleaved condition. This suggests that by repetition, additional sequence information was extracted although the exposure to the sequences was identical as in the interleaved condition. The results are discussed in terms of similarities and potential differences between typical statistical learning paradigms and sequence learning.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

OBJECTIVES The objectives of the present study were to investigate temporal/spectral sound-feature processing in preschool children (4 to 7 years old) with peripheral hearing loss compared with age-matched controls. The results verified the presence of statistical learning, which was diminished in children with hearing impairments (HIs), and elucidated possible perceptual mediators of speech production. DESIGN Perception and production of the syllables /ba/, /da/, /ta/, and /na/ were recorded in 13 children with normal hearing and 13 children with HI. Perception was assessed physiologically through event-related potentials (ERPs) recorded by EEG in a multifeature mismatch negativity paradigm and behaviorally through a discrimination task. Temporal and spectral features of the ERPs during speech perception were analyzed, and speech production was quantitatively evaluated using speech motor maximum performance tasks. RESULTS Proximal to stimulus onset, children with HI displayed a difference in map topography, indicating diminished statistical learning. In later ERP components, children with HI exhibited reduced amplitudes in the N2 and early parts of the late disciminative negativity components specifically, which are associated with temporal and spectral control mechanisms. Abnormalities of speech perception were only subtly reflected in speech production, as the lone difference found in speech production studies was a mild delay in regulating speech intensity. CONCLUSIONS In addition to previously reported deficits of sound-feature discriminations, the present study results reflect diminished statistical learning in children with HI, which plays an early and important, but so far neglected, role in phonological processing. Furthermore, the lack of corresponding behavioral abnormalities in speech production implies that impaired perceptual capacities do not necessarily translate into productive deficits.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

This paper presents a robust stochastic framework for the incorporation of visual observations into conventional estimation, data fusion, navigation and control algorithms. The representation combines Isomap, a non-linear dimensionality reduction algorithm, with expectation maximization, a statistical learning scheme. The joint probability distribution of this representation is computed offline based on existing training data. The training phase of the algorithm results in a nonlinear and non-Gaussian likelihood model of natural features conditioned on the underlying visual states. This generative model can be used online to instantiate likelihoods corresponding to observed visual features in real-time. The instantiated likelihoods are expressed as a Gaussian mixture model and are conveniently integrated within existing non-linear filtering algorithms. Example applications based on real visual data from heterogenous, unstructured environments demonstrate the versatility of the generative models.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Since manually constructing domain-specific sentiment lexicons is extremely time consuming and it may not even be feasible for domains where linguistic expertise is not available. Research on the automatic construction of domain-specific sentiment lexicons has become a hot topic in recent years. The main contribution of this paper is the illustration of a novel semi-supervised learning method which exploits both term-to-term and document-to-term relations hidden in a corpus for the construction of domain specific sentiment lexicons. More specifically, the proposed two-pass pseudo labeling method combines shallow linguistic parsing and corpusbase statistical learning to make domain-specific sentiment extraction scalable with respect to the sheer volume of opinionated documents archived on the Internet these days. Another novelty of the proposed method is that it can utilize the readily available user-contributed labels of opinionated documents (e.g., the user ratings of product reviews) to bootstrap the performance of sentiment lexicon construction. Our experiments show that the proposed method can generate high quality domain-specific sentiment lexicons as directly assessed by human experts. Moreover, the system generated domain-specific sentiment lexicons can improve polarity prediction tasks at the document level by 2:18% when compared to other well-known baseline methods. Our research opens the door to the development of practical and scalable methods for domain-specific sentiment analysis.