912 resultados para sequencing error
Resumo:
Statistical physics is employed to evaluate the performance of error-correcting codes in the case of finite message length for an ensemble of Gallager's error correcting codes. We follow Gallager's approach of upper-bounding the average decoding error rate, but invoke the replica method to reproduce the tightest general bound to date, and to improve on the most accurate zero-error noise level threshold reported in the literature. The relation between the methods used and those presented in the information theory literature are explored.
Resumo:
We employ the methods of statistical physics to study the performance of Gallager type error-correcting codes. In this approach, the transmitted codeword comprises Boolean sums of the original message bits selected by two randomly-constructed sparse matrices. We show that a broad range of these codes potentially saturate Shannon's bound but are limited due to the decoding dynamics used. Other codes show sub-optimal performance but are not restricted by the decoding dynamics. We show how these codes may also be employed as a practical public-key cryptosystem and are of competitive performance to modern cyptographical methods.
Resumo:
We analyse Gallager codes by employing a simple mean-field approximation that distorts the model geometry and preserves important interactions between sites. The method naturally recovers the probability propagation decoding algorithm as a minimization of a proper free-energy. We find a thermodynamical phase transition that coincides with information theoretical upper-bounds and explain the practical code performance in terms of the free-energy landscape.
Resumo:
We study the performance of Low Density Parity Check (LDPC) error-correcting codes using the methods of statistical physics. LDPC codes are based on the generation of codewords using Boolean sums of the original message bits by employing two randomly-constructed sparse matrices. These codes can be mapped onto Ising spin models and studied using common methods of statistical physics. We examine various regular constructions and obtain insight into their theoretical and practical limitations. We also briefly report on results obtained for irregular code constructions, for codes with non-binary alphabet, and on how a finite system size effects the error probability.
Resumo:
Using the magnetization enumerator method, we evaluate the practical and theoretical limitations of symmetric channels with real outputs. Results are presented for several regular Gallager code constructions.
Resumo:
The replica method, developed in statistical physics, is employed in conjunction with Gallager's methodology to accurately evaluate zero error noise thresholds for Gallager code ensembles. Our approach generally provides more optimistic evaluations than those reported in the information theory literature for sparse matrices; the difference vanishes as the parity check matrix becomes dense.
Resumo:
We present a theoretical method for a direct evaluation of the average error exponent in Gallager error-correcting codes using methods of statistical physics. Results for the binary symmetric channel(BSC)are presented for codes of both finite and infinite connectivity.
Resumo:
We present a theoretical method for a direct evaluation of the average and reliability error exponents in low-density parity-check error-correcting codes using methods of statistical physics. Results for the binary symmetric channel are presented for codes of both finite and infinite connectivity.
Resumo:
To describe the prevalence of refractive error (myopia and hyperopia) and visual impairment in a representative sample of white school children.
Resumo:
A chip shooter machine for electronic components assembly has a movable feeder carrier holding components, a movable X-Y table carrying a printed circuit board (PCB), and a rotary turret having multiple assembly heads. This paper presents a hybrid genetic algorithm to optimize the sequence of component placements for a chip shooter machine. The objective of the problem is to minimize the total traveling distance of the X-Y table or the board. The genetic algorithm developed in the paper hybridizes the nearest neighbor heuristic, and an iterated swap procedure, which is a new improved heuristic. We have compared the performance of the hybrid genetic algorithm with that of the approach proposed by other researchers and have demonstrated our algorithm is superior in terms of the distance traveled by the X-Y table or the board.
Resumo:
This paper presents a hybrid genetic algorithm to optimize the sequence of component placements on a printed circuit board and the arrangement of component types to feeders simultaneously for a pick-and-place machine with multiple stationary feeders, a fixed board table and a movable placement head. The objective of the problem is to minimize the total travelling distance, or the travelling time, of the placement head. The genetic algorithm developed in the paper hybrisizes different search heuristics including the nearest neighbor heuristic, the 2-opt heuristic, and an iterated swap procedure, which is a new improving heuristic. Compared with the results obtained by other researchers, the performance of the hybrid genetic algorithm is superior to others in terms of the distance travelled by the placement head.
Resumo:
We have successfully linked protein library screening directly with the identification of active proteins, without the need for individual purification, display technologies or physical linkage between the protein and its encoding sequence. By using 'MAX' randomization we have rapidly constructed 60 overlapping gene libraries that encode zinc finger proteins, randomized variously at the three principal DNA-contacting residues. Expression and screening of the libraries against five possible target DNA sequences generated data points covering a potential 40,000 individual interactions. Comparative analysis of the resulting data enabled direct identification of active proteins. Accuracy of this library analysis methodology was confirmed by both in vitro and in vivo analyses of identified proteins to yield novel zinc finger proteins that bind to their target sequences with high affinity, as indicated by low nanomolar apparent dissociation constants.
Resumo:
Purpose To investigate the utility of uncorrected visual acuity measures in screening for refractive error in white school children aged 6-7-years and 12-13-years. Methods The Northern Ireland Childhood Errors of Refraction (NICER) study used a stratified random cluster design to recruit children from schools in Northern Ireland. Detailed eye examinations included assessment of logMAR visual acuity and cycloplegic autorefraction. Spherical equivalent refractive data from the right eye were used to classify significant refractive error as myopia of at least 1DS, hyperopia as greater than +3.50DS and astigmatism as greater than 1.50DC, whether it occurred in isolation or in association with myopia or hyperopia. Results Results are presented from 661 white 12-13-year-old and 392 white 6-7-year-old school-children. Using a cut-off of uncorrected visual acuity poorer than 0.20 logMAR to detect significant refractive error gave a sensitivity of 50% and specificity of 92% in 6-7-year-olds and 73% and 93% respectively in 12-13-year-olds. In 12-13-year-old children a cut-off of poorer than 0.20 logMAR had a sensitivity of 92% and a specificity of 91% in detecting myopia and a sensitivity of 41% and a specificity of 84% in detecting hyperopia. Conclusions Vision screening using logMAR acuity can reliably detect myopia, but not hyperopia or astigmatism in school-age children. Providers of vision screening programs should be cognisant that where detection of uncorrected hyperopic and/or astigmatic refractive error is an aspiration, current UK protocols will not effectively deliver.
Resumo:
Large monitoring networks are becoming increasingly common and can generate large datasets from thousands to millions of observations in size, often with high temporal resolution. Processing large datasets using traditional geostatistical methods is prohibitively slow and in real world applications different types of sensor can be found across a monitoring network. Heterogeneities in the error characteristics of different sensors, both in terms of distribution and magnitude, presents problems for generating coherent maps. An assumption in traditional geostatistics is that observations are made directly of the underlying process being studied and that the observations are contaminated with Gaussian errors. Under this assumption, sub–optimal predictions will be obtained if the error characteristics of the sensor are effectively non–Gaussian. One method, model based geostatistics, assumes that a Gaussian process prior is imposed over the (latent) process being studied and that the sensor model forms part of the likelihood term. One problem with this type of approach is that the corresponding posterior distribution will be non–Gaussian and computationally demanding as Monte Carlo methods have to be used. An extension of a sequential, approximate Bayesian inference method enables observations with arbitrary likelihoods to be treated, in a projected process kriging framework which is less computationally intensive. The approach is illustrated using a simulated dataset with a range of sensor models and error characteristics.
Resumo:
Regression problems are concerned with predicting the values of one or more continuous quantities, given the values of a number of input variables. For virtually every application of regression, however, it is also important to have an indication of the uncertainty in the predictions. Such uncertainties are expressed in terms of the error bars, which specify the standard deviation of the distribution of predictions about the mean. Accurate estimate of error bars is of practical importance especially when safety and reliability is an issue. The Bayesian view of regression leads naturally to two contributions to the error bars. The first arises from the intrinsic noise on the target data, while the second comes from the uncertainty in the values of the model parameters which manifests itself in the finite width of the posterior distribution over the space of these parameters. The Hessian matrix which involves the second derivatives of the error function with respect to the weights is needed for implementing the Bayesian formalism in general and estimating the error bars in particular. A study of different methods for evaluating this matrix is given with special emphasis on the outer product approximation method. The contribution of the uncertainty in model parameters to the error bars is a finite data size effect, which becomes negligible as the number of data points in the training set increases. A study of this contribution is given in relation to the distribution of data in input space. It is shown that the addition of data points to the training set can only reduce the local magnitude of the error bars or leave it unchanged. Using the asymptotic limit of an infinite data set, it is shown that the error bars have an approximate relation to the density of data in input space.