18 resultados para Noisy corpora.
Resumo:
A theoretical model is presented which describes selection in a genetic algorithm (GA) under a stochastic fitness measure and correctly accounts for finite population effects. Although this model describes a number of selection schemes, we only consider Boltzmann selection in detail here as results for this form of selection are particularly transparent when fitness is corrupted by additive Gaussian noise. Finite population effects are shown to be of fundamental importance in this case, as the noise has no effect in the infinite population limit. In the limit of weak selection we show how the effects of any Gaussian noise can be removed by increasing the population size appropriately. The theory is tested on two closely related problems: the one-max problem corrupted by Gaussian noise and generalization in a perceptron with binary weights. The averaged dynamics can be accurately modelled for both problems using a formalism which describes the dynamics of the GA using methods from statistical mechanics. The second problem is a simple example of a learning problem and by considering this problem we show how the accurate characterization of noise in the fitness evaluation may be relevant in machine learning. The training error (negative fitness) is the number of misclassified training examples in a batch and can be considered as a noisy version of the generalization error if an independent batch is used for each evaluation. The noise is due to the finite batch size and in the limit of large problem size and weak selection we show how the effect of this noise can be removed by increasing the population size. This allows the optimal batch size to be determined, which minimizes computation time as well as the total number of training examples required.
Resumo:
Language learners ask a variety of questions about words and their meanings and uses: “What does X mean? What is the word for X in English? Can you say X? When do you use X and when do you use Y (e.g. synonyms, grammatical structures, prepositional choices, variant phrases, etc)?”
Resumo:
Properties of computing Boolean circuits composed of noisy logical gates are studied using the statistical physics methodology. A formula-growth model that gives rise to random Boolean functions is mapped onto a spin system, which facilitates the study of their typical behavior in the presence of noise. Bounds on their performance, derived in the information theory literature for specific gates, are straightforwardly retrieved, generalized and identified as the corresponding macroscopic phase transitions. The framework is employed for deriving results on error-rates at various function-depths and function sensitivity, and their dependence on the gate-type and noise model used. These are difficult to obtain via the traditional methods used in this field.
Resumo:
Corpora amylacea (CA) are spherical or ovoid bodies 50-50 microns in diameter. They have been described in normal elderly brain as well as in a number of neurodegenerative disorders. In this study, the incidence of CA in the optic nerves of Alzheimer's disease (AD) patients was compared with normal elderly controls. Samples of optic nerves (MRC Brain Bank, Institute of Psychiatry) were taken from 12 AD patients (age range 69-94 years) and 18 controls (43-82 years). Optic nerves were fixed in 2% buffered glutaraldehyde, post-fixed in osmium tetroxide, embedded in epoxy resin and then sectioned to a thickness of 2 microns. Sections were stained with toluidine blue. CA were present in all of the optic nerves examined. In addition, a number of similarly stained but more irregularly shaped bodies were present. Fewer CA were found in the optic nerves of AD patients compared with controls. By contrast, the number or irregularly shaped bodies was increased in AD. In AD, there may be a preferential decline in the large diameter fibres which may mediate the M-cell pathway. Hence, the decline in the incidence of CA in AD may be associated with a reduction in these fibres. It is also possible that the irregualrly shaped bodies are a degeneration product of the CA.
Resumo:
This study presents a detailed contrastive description of the textual functioning of connectives in English and Arabic. Particular emphasis is placed on the organisational force of connectives and their role in sustaining cohesion. The description is intended as a contribution for a better understanding of the variations in the dominant tendencies for text organisation in each language. The findings are expected to be utilised for pedagogical purposes, particularly in improving EFL teaching of writing at the undergraduate level. The study is based on an empirical investigation of the phenomenon of connectivity and, for optimal efficiency, employs computer-aided procedures, particularly those adopted in corpus linguistics, for investigatory purposes. One important methodological requirement is the establishment of two comparable and statistically adequate corpora, also the design of software and the use of existing packages and to achieve the basic analysis. Each corpus comprises ca 250,000 words of newspaper material sampled in accordance to a specific set of criteria and assembled in machine readable form prior to the computer-assisted analysis. A suite of programmes have been written in SPITBOL to accomplish a variety of analytical tasks, and in particular to perform a battery of measurements intended to quantify the textual functioning of connectives in each corpus. Concordances and some word lists are produced by using OCP. Results of these researches confirm the existence of fundamental differences in text organisation in Arabic in comparison to English. This manifests itself in the way textual operations of grouping and sequencing are performed and in the intensity of the textual role of connectives in imposing linearity and continuity and in maintaining overall stability. Furthermore, computation of connective functionality and range of operationality has identified fundamental differences in the way favourable choices for text organisation are made and implemented.
Resumo:
Distributed Brillouin sensing of strain and temperature works by making spatially resolved measurements of the position of the measurand-dependent extremum of the resonance curve associated with the scattering process in the weakly nonlinear regime. Typically, measurements of backscattered Stokes intensity (the dependent variable) are made at a number of predetermined fixed frequencies covering the design measurand range of the apparatus and combined to yield an estimate of the position of the extremum. The measurand can then be found because its relationship to the position of the extremum is assumed known. We present analytical expressions relating the relative error in the extremum position to experimental errors in the dependent variable. This is done for two cases: (i) a simple non-parametric estimate of the mean based on moments and (ii) the case in which a least squares technique is used to fit a Lorentzian to the data. The question of statistical bias in the estimates is discussed and in the second case we go further and present for the first time a general method by which the probability density function (PDF) of errors in the fitted parameters can be obtained in closed form in terms of the PDFs of the errors in the noisy data.
Resumo:
Contrast sensitivity improves with the area of a sine-wave grating, but why? Here we assess this phenomenon against contemporary models involving spatial summation, probability summation, uncertainty, and stochastic noise. Using a two-interval forced-choice procedure we measured contrast sensitivity for circular patches of sine-wave gratings with various diameters that were blocked or interleaved across trials to produce low and high extrinsic uncertainty, respectively. Summation curves were steep initially, becoming shallower thereafter. For the smaller stimuli, sensitivity was slightly worse for the interleaved design than for the blocked design. Neither area nor blocking affected the slope of the psychometric function. We derived model predictions for noisy mechanisms and extrinsic uncertainty that was either low or high. The contrast transducer was either linear (c1.0) or nonlinear (c2.0), and pooling was either linear or a MAX operation. There was either no intrinsic uncertainty, or it was fixed or proportional to stimulus size. Of these 10 canonical models, only the nonlinear transducer with linear pooling (the noisy energy model) described the main forms of the data for both experimental designs. We also show how a cross-correlator can be modified to fit our results and provide a contemporary presentation of the relation between summation and the slope of the psychometric function.
Resumo:
We show a new method for term extraction from a domain relevant corpus using natural language processing for the purposes of semi-automatic ontology learning. Literature shows that topical words occur in bursts. We find that the ranking of extracted terms is insensitive to the choice of population model, but calculating frequencies relative to the burst size rather than the document length in words yields significantly different results.
Resumo:
Distributed Brillouin sensing of strain and temperature works by making spatially resolved measurements of the position of the measurand-dependent extremum of the resonance curve associated with the scattering process in the weakly nonlinear regime. Typically, measurements of backscattered Stokes intensity (the dependent variable) are made at a number of predetermined fixed frequencies covering the design measurand range of the apparatus and combined to yield an estimate of the position of the extremum. The measurand can then be found because its relationship to the position of the extremum is assumed known. We present analytical expressions relating the relative error in the extremum position to experimental errors in the dependent variable. This is done for two cases: (i) a simple non-parametric estimate of the mean based on moments and (ii) the case in which a least squares technique is used to fit a Lorentzian to the data. The question of statistical bias in the estimates is discussed and in the second case we go further and present for the first time a general method by which the probability density function (PDF) of errors in the fitted parameters can be obtained in closed form in terms of the PDFs of the errors in the noisy data.