47 results for Data compression (Electronic computers)

in CentAUR: Central Archive, University of Reading - UK


Relevance:

100.00%

Publisher:

Abstract:

Bloom filters are a data structure for storing data in a compressed form. They offer excellent space and time efficiency at the cost of some loss of accuracy (so-called lossy compression). This work presents a yes-no Bloom filter, which is a data structure consisting of two parts: the yes-filter, which is a standard Bloom filter, and the no-filter, which is another Bloom filter whose purpose is to represent those objects that were recognised incorrectly by the yes-filter (that is, to recognise the false positives of the yes-filter). By querying the no-filter after an object has been recognised by the yes-filter, we get a chance of rejecting it, which improves the accuracy of data recognition in comparison with a standard Bloom filter of the same total length. A further increase in accuracy is possible if one chooses the objects to include in the no-filter so that the no-filter recognises as many false positives as possible but no true positives, thus producing the most accurate yes-no Bloom filter among all yes-no Bloom filters. This paper studies how optimization techniques can be used to maximize the number of false positives recognised by the no-filter, subject to the constraint that it recognise no true positives. To achieve this aim, an Integer Linear Program (ILP) is proposed for the optimal selection of false positives. In practice the problem size is normally large, making the optimal solution intractable. Exploiting the similarity of the ILP to the Multidimensional Knapsack Problem, an Approximate Dynamic Programming (ADP) model is developed that uses a reduced ILP for the value function approximation. Numerical results show that the ADP model performs best compared with a number of heuristics as well as the CPLEX built-in branch-and-bound (B&B) solver, and it is therefore the approach recommended for use in yes-no Bloom filters. In the wider context of the study of lossy compression algorithms, our research is an example of how the arsenal of optimization methods can be applied to improving the accuracy of compressed data.
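
As an illustration of the two-part structure described above, here is a minimal Python sketch of a yes-no Bloom filter. The hash scheme (double hashing of a SHA-256 digest), the filter sizes and the class names are illustrative assumptions, not the authors' implementation, and the ILP/ADP selection of which false positives to insert into the no-filter is left out.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter using double hashing (illustrative parameters)."""
    def __init__(self, m_bits=1024, k_hashes=4):
        self.m, self.k = m_bits, k_hashes
        self.bits = bytearray(m_bits)  # one byte per bit, for simplicity

    def _positions(self, item):
        digest = hashlib.sha256(str(item).encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big")
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        return all(self.bits[pos] for pos in self._positions(item))

class YesNoBloomFilter:
    """Yes-filter answers membership; no-filter stores selected false positives of the yes-filter."""
    def __init__(self, m_yes=1024, m_no=256, k=4):
        self.yes = BloomFilter(m_yes, k)
        self.no = BloomFilter(m_no, k)

    def add(self, item):
        self.yes.add(item)

    def add_false_positive(self, item):
        # Items chosen for the no-filter (e.g. by the ILP/ADP selection) must not be
        # true positives, otherwise genuine members would be wrongly rejected.
        self.no.add(item)

    def __contains__(self, item):
        return item in self.yes and item not in self.no
```

In use, one would build the yes-filter from the data set, identify its false positives on a stream of non-member queries, and pass the subset chosen by the optimization to add_false_positive.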

Relevance:

100.00%

Publisher:

Abstract:

It is generally assumed that the variability of neuronal morphology has an important effect on both the connectivity and the activity of the nervous system, but this effect has not been thoroughly investigated. Neuroanatomical archives represent a crucial tool to explore structure–function relationships in the brain. We are developing computational tools to describe, generate, store and render large sets of three-dimensional neuronal structures in a format that is compact, quantitative, accurate and readily accessible to the neuroscientist. Single-cell neuroanatomy can be characterized quantitatively at several levels. In computer-aided neuronal tracing files, a dendritic tree is described as a series of cylinders, each represented by diameter, spatial coordinates and the connectivity to other cylinders in the tree. This 'Cartesian' description constitutes a completely accurate mapping of dendritic morphology but it bears little intuitive information for the neuroscientist. In contrast, a classical neuroanatomical analysis characterizes neuronal dendrites on the basis of the statistical distributions of morphological parameters, e.g. maximum branching order or bifurcation asymmetry. This description is intuitively more accessible, but it only yields information on the collective anatomy of a group of dendrites, i.e. it is not complete enough to provide a precise 'blueprint' of the original data. We are adopting a third, intermediate level of description, which consists of the algorithmic generation of neuronal structures within a certain morphological class based on a set of 'fundamental', measured parameters. This description is as intuitive as a classical neuroanatomical analysis (parameters have an intuitive interpretation), and as complete as a Cartesian file (the algorithms generate and display complete neurons). The advantages of the algorithmic description of neuronal structure are immense. If an algorithm can measure the values of a handful of parameters from an experimental database and generate virtual neurons whose anatomy is statistically indistinguishable from that of their real counterparts, a great deal of data compression and amplification can be achieved. Data compression results from the quantitative and complete description of thousands of neurons with a handful of statistical distributions of parameters. Data amplification is possible because, from a set of experimental neurons, many more virtual analogues can be generated. This approach could allow one, in principle, to create and store a neuroanatomical database containing data for an entire human brain in a personal computer. We are using two programs, L-NEURON and ARBORVITAE, to investigate systematically the potential of several different algorithms for the generation of virtual neurons. Using these programs, we have generated anatomically plausible virtual neurons for several morphological classes, including guinea pig cerebellar Purkinje cells and cat spinal cord motor neurons. These virtual neurons are stored in an online electronic archive of dendritic morphology. This process highlights the potential and the limitations of the 'computational neuroanatomy' strategy for neuroscience databases.
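
To make the 'Cartesian' tracing description concrete, the sketch below shows a hypothetical minimal data structure for a dendritic tree as a list of connected cylinders, together with one classical summary statistic. The field names and the statistic are illustrative choices loosely in the spirit of reconstruction files; they are not the formats used by L-NEURON or ARBORVITAE.

```python
from dataclasses import dataclass
from typing import List, Optional
import math

@dataclass
class Compartment:
    """One cylinder of a traced dendrite: position, diameter, and a link to its parent."""
    ident: int
    x: float
    y: float
    z: float
    diameter: float
    parent: Optional[int]  # None for the root (soma-attached) cylinder

def total_dendritic_length(tree: List[Compartment]) -> float:
    """Sum of distances between each cylinder and its parent (a classical summary statistic)."""
    by_id = {c.ident: c for c in tree}
    length = 0.0
    for c in tree:
        if c.parent is not None:
            p = by_id[c.parent]
            length += math.dist((c.x, c.y, c.z), (p.x, p.y, p.z))
    return length
```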

Relevance:

100.00%

Publisher:

Abstract:

Written for communications and electronic engineers, technicians and students, this book begins with an introduction to data communications and goes on to explain the concept of layered communications. Other chapters deal with physical communications channels, baseband digital transmission, analog data transmission, error control and data compression codes, physical layer standards, the data link layer, the higher layers of the protocol hierarchy, and local area networks (LANs). Finally, the book explores some likely future developments.

Relevance:

100.00%

Publisher:

Abstract:

Numerical weather prediction (NWP) centres use numerical models of the atmospheric flow to forecast future weather states from an estimate of the current state. Variational data assimilation (VAR) is commonly used to determine an optimal state estimate that minimizes the errors between observations of the dynamical system and model predictions of the flow. The rate of convergence of the VAR scheme and the sensitivity of the solution to errors in the data are dependent on the condition number of the Hessian of the variational least-squares objective function. The traditional formulation of VAR is ill-conditioned and hence leads to slow convergence and an inaccurate solution. In practice, operational NWP centres precondition the system via a control variable transform to reduce the condition number of the Hessian. In this paper we investigate the conditioning of VAR for a single, periodic, spatially-distributed state variable. We present theoretical bounds on the condition number of the original and preconditioned Hessians and hence demonstrate the improvement produced by the preconditioning. We also investigate theoretically the effect of observation position and error variance on the preconditioned system and show that the problem becomes more ill-conditioned with increasingly dense and accurate observations. Finally, we confirm the theoretical results in an operational setting by giving experimental results from the Met Office variational system.
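
A toy numerical illustration of the conditioning argument: for a linear observation operator the Hessian of the variational objective is B^{-1} + H^T R^{-1} H, and the control variable transform replaces it with I + B^{1/2} H^T R^{-1} H B^{1/2} (realised below with a Cholesky factor, which has the same eigenvalues). The covariance choices, an exponential background correlation and ten accurate point observations, are assumptions made purely for this sketch, not the Met Office configuration.

```python
import numpy as np

n, p = 50, 10                                   # state size, number of observations
idx = np.arange(n)

# Assumed background error covariance B: variance 0.5, exponential correlation, length scale 5
B = 0.5 * np.exp(-np.abs(idx[:, None] - idx[None, :]) / 5.0)

# Assumed observations: p evenly spaced direct measurements with error variance 0.01
H = np.zeros((p, n))
H[np.arange(p), np.linspace(0, n - 1, p, dtype=int)] = 1.0
R_inv = np.eye(p) / 0.01

# Hessian of the variational cost function, before and after the control variable transform
S = np.linalg.inv(B) + H.T @ R_inv @ H          # original Hessian
L = np.linalg.cholesky(B)                       # B = L L^T, used as the transform
S_pre = np.eye(n) + L.T @ H.T @ R_inv @ H @ L   # preconditioned Hessian

print("condition number, original      :", np.linalg.cond(S))
print("condition number, preconditioned:", np.linalg.cond(S_pre))
```

In this toy setting the preconditioned Hessian is typically far better conditioned, and making the observations denser or more accurate raises the condition number of the preconditioned system, in line with the behaviour described above.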

Relevance:

100.00%

Publisher:

Abstract:

The resolution of remotely sensed data is becoming increasingly fine, and there are now many sources of data with a pixel size of 1 m x 1 m. This produces huge amounts of data that have to be stored, processed and transmitted. For environmental applications this resolution possibly provides far more data than are needed: data overload. This poses the question: how much is too much? We have explored two resolutions of data, 20 m pixel SPOT data and 1 m pixel Computerized Airborne Multispectral Imaging System (CAMIS) data from Fort A. P. Hill (Virginia, USA), using the variogram of geostatistics. For both we used the normalized difference vegetation index (NDVI). Three scales of spatial variation were identified in both the SPOT and 1 m data: there was some overlap at the intermediate spatial scales of about 150 m and of 500-600 m. We subsampled the 1 m data, and scales of variation of about 30 m and of 300 m were identified consistently until the separation between pixel centroids was 15 m (or 1 in 225 pixels). At this stage, spatial scales of about 100 m and 600 m were described, which suggested that only now was there a real difference in the amount of spatial information available from an environmental perspective. These latter were similar spatial scales to those identified from the SPOT image. We have also analysed 1 m CAMIS data from Fort Story (Virginia, USA) for comparison, and the outcome is similar. From these analyses it seems that a pixel size of 20 m is adequate for many environmental applications, and that if more detail is required the higher resolution data could be sub-sampled to a 10 m separation between pixel centroids without any serious loss of information. This reduces significantly the amount of data that needs to be stored, transmitted and analysed and has important implications for data compression.
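
A small sketch of the quantities involved: NDVI computed from the red and near-infrared bands, and subsampling that increases the separation between pixel centroids (for 1 m pixels, a 15 m separation keeps 1 pixel in 225). The array sizes and reflectance values are synthetic placeholders, not the SPOT or CAMIS data.

```python
import numpy as np

def ndvi(nir, red):
    """Normalized difference vegetation index, computed pixel-wise."""
    nir, red = np.asarray(nir, dtype=float), np.asarray(red, dtype=float)
    return (nir - red) / (nir + red)

def subsample(image, step):
    """Keep one pixel per step x step block, so 1 m pixels give a step-metre centroid separation."""
    return image[::step, ::step]

# Illustrative use with random reflectances standing in for real imagery
rng = np.random.default_rng(1)
red = rng.uniform(0.05, 0.3, size=(300, 300))
nir = rng.uniform(0.3, 0.6, size=(300, 300))
v = ndvi(nir, red)
v10 = subsample(v, 10)   # 10 m centroid separation: keeps 1 pixel in 100
v15 = subsample(v, 15)   # 15 m centroid separation: keeps 1 pixel in 225
```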

Relevance:

100.00%

Publisher:

Abstract:

Empirical orthogonal function (EOF) analysis is a powerful tool for data compression and dimensionality reduction used broadly in meteorology and oceanography. Often in the literature, EOF modes are interpreted individually, independent of other modes. In fact, it can be shown that no such attribution can generally be made. This review demonstrates that in general individual EOF modes (i) will not correspond to individual dynamical modes, (ii) will not correspond to individual kinematic degrees of freedom, (iii) will not be statistically independent of other EOF modes, and (iv) will be strongly influenced by the nonlocal requirement that modes maximize variance over the entire domain. The goal of this review is not to argue against the use of EOF analysis in meteorology and oceanography; rather, it is to demonstrate the care that must be taken in the interpretation of individual modes in order to distinguish the medium from the message.
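
For reference, EOF modes are typically obtained from the singular value decomposition of the anomaly (time x space) data matrix; the sketch below shows that standard computation on a synthetic field. It covers only the mechanics of the decomposition; the interpretive caveats listed in the review apply to the resulting modes.

```python
import numpy as np

def eof_analysis(data):
    """EOFs of a (time x space) data matrix via SVD of the anomalies.

    Returns the spatial modes, the principal component time series, and the
    fraction of total variance associated with each mode.
    """
    anomalies = data - data.mean(axis=0)          # remove the time mean at each grid point
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    eofs = vt                                     # rows are spatial EOF patterns
    pcs = u * s                                   # principal component time series
    variance_fraction = s**2 / np.sum(s**2)
    return eofs, pcs, variance_fraction

# Illustrative use with a synthetic field (100 times, 50 grid points)
rng = np.random.default_rng(0)
field = rng.standard_normal((100, 50))
eofs, pcs, var_frac = eof_analysis(field)
print("variance fraction of leading mode:", var_frac[0])
```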

Relevance:

100.00%

Publisher:

Abstract:

Objectives: To clarify the role of growth monitoring in primary school children, including obesity, and to examine issues that might impact on the effectiveness and cost-effectiveness of such programmes. Data sources: Electronic databases were searched up to July 2005. Experts in the field were also consulted. Review methods: Data extraction and quality assessment were performed on studies meeting the review's inclusion criteria. The performance of growth monitoring to detect disorders of stature and obesity was evaluated against National Screening Committee (NSC) criteria. Results: In the 31 studies that were included in the review, there were no controlled trials of the impact of growth monitoring and no studies of the diagnostic accuracy of different methods for growth monitoring. Analysis of the studies that presented a 'diagnostic yield' of growth monitoring suggested that one-off screening might identify between 1:545 and 1:1793 new cases of potentially treatable conditions. Economic modelling suggested that growth monitoring is associated with health improvements [incremental cost per quality-adjusted life-year (QALY) of £9500] and indicated that monitoring was cost-effective 100% of the time over the given probability distributions for a willingness-to-pay threshold of £30,000 per QALY. Studies of obesity focused on the performance of body mass index against measures of body fat. A number of issues relating to human resources required for growth monitoring were identified, but data on attitudes to growth monitoring were extremely sparse. Preliminary findings from economic modelling suggested that primary prevention may be the most cost-effective approach to obesity management, but the model incorporated a great deal of uncertainty. Conclusions: This review has indicated the potential utility and cost-effectiveness of growth monitoring in terms of increased detection of stature-related disorders. It has also pointed strongly to the need for further research. Growth monitoring does not currently meet all NSC criteria. However, it is questionable whether some of these criteria can be meaningfully applied to growth monitoring, given that short stature is not a disease in itself but is used as a marker for a range of pathologies and as an indicator of general health status. Identification of effective interventions for the treatment of obesity is likely to be considered a prerequisite to any move from monitoring to a screening programme designed to identify individual overweight and obese children. Similarly, further long-term studies of the predictors of obesity-related co-morbidities in adulthood are warranted. A cluster randomised trial comparing growth monitoring strategies with no growth monitoring in the general population would most reliably determine the clinical effectiveness of growth monitoring. Studies of diagnostic accuracy, alongside evidence of effective treatment strategies, could provide an alternative approach. In this context, careful consideration would need to be given to target conditions and intervention thresholds. Diagnostic accuracy studies would require long-term follow-up of both short and normal children to determine the sensitivity and specificity of growth monitoring.

Relevance:

30.00%

Publisher:

Abstract:

Point defects in metal oxides such as TiO2 are key to their applications in numerous technologies. The investigation of thermally induced nonstoichiometry in TiO2 is complicated by the difficulties in preparing and determining a desired degree of nonstoichiometry. We study controlled self-doping of TiO2 by adsorption of 1/8 and 1/16 monolayer Ti at the (110) surface using a combination of experimental and computational approaches to unravel the details of the adsorption process and the oxidation state of Ti. Upon adsorption of Ti, x-ray and ultraviolet photoemission spectroscopy (XPS and UPS) show formation of reduced Ti. Comparison of pure density functional theory (DFT) with experiment shows that pure DFT provides an inconsistent description of the electronic structure. To surmount this difficulty, we apply DFT corrected for on-site Coulomb interaction (DFT+U) to describe reduced Ti ions. The optimal value of U is 3 eV, determined from comparison of the computed Ti 3d electronic density of states with the UPS data. DFT+U and UPS show the appearance of a Ti 3d adsorbate-induced state at 1.3 eV above the valence band and 1.0 eV below the conduction band. The computations show that the adsorbed Ti atom is oxidized to Ti2+ and a fivefold coordinated surface Ti atom is reduced to Ti3+, while the remaining electron is distributed among other surface Ti atoms. The UPS data are best fitted with reduced Ti2+ and Ti3+ ions. These results demonstrate that the complexity of doped metal oxides is best understood with a combination of experiment and appropriate computations.

Relevance:

30.00%

Publisher:

Abstract:

The simulated annealing approach to structure solution from powder diffraction data, as implemented in the DASH program, is easily amenable to parallelization at the individual run level. Very large scale increases in speed of execution can therefore be achieved by distributing individual DASH runs over a network of computers. The GDASH program achieves this by packaging DASH in a form that enables it to run under the Univa UD Grid MP system, which harnesses networks of existing computing resources to perform calculations.
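
Run-level parallelism of this kind is straightforward because individual simulated annealing runs are independent. The sketch below illustrates the general idea with Python's standard library and a toy objective function; it is an assumption-laden stand-in, not how GDASH or the Univa UD Grid MP system is implemented.

```python
import math
import random
from concurrent.futures import ProcessPoolExecutor

def annealing_run(seed, n_steps=10_000):
    """One independent simulated annealing run on a toy objective (minimise x**2);
    a stand-in for a single structure-solution run."""
    rng = random.Random(seed)
    x = rng.uniform(-10.0, 10.0)
    best = x * x
    temperature = 1.0
    for _ in range(n_steps):
        candidate = x + rng.gauss(0.0, temperature)
        delta = candidate * candidate - x * x
        if delta < 0 or rng.random() < math.exp(-delta / temperature):
            x = candidate
            best = min(best, x * x)
        temperature *= 0.9995                    # geometric cooling schedule
    return seed, best

if __name__ == "__main__":
    # Runs are independent, so they can simply be farmed out to the available workers;
    # a grid system does the same across many machines instead of local processes.
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(annealing_run, range(8)))
    print(min(results, key=lambda r: r[1]))
```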

Relevance:

30.00%

Publisher:

Abstract:

Matheron's usual variogram estimator can result in unreliable variograms when data are strongly asymmetric or skewed. Asymmetry in a distribution can arise from a long tail of values in the underlying process or from outliers that belong to another population that contaminate the primary process. This paper examines the effects of underlying asymmetry on the variogram and on the accuracy of prediction; a second paper examines the effects arising from outliers. Standard geostatistical texts suggest ways of dealing with underlying asymmetry; however, these are based on informed intuition rather than detailed investigation. To determine whether the methods generally used to deal with underlying asymmetry are appropriate, the effects of different coefficients of skewness on the shape of the experimental variogram and on the model parameters were investigated. Simulated annealing was used to create normally distributed random fields of different sizes from variograms with different nugget:sill ratios. These data were then modified to give different degrees of asymmetry, and the experimental variogram was computed in each case. The effects of standard data transformations on the form of the variogram were also investigated. Cross-validation was used to assess quantitatively the performance of the different variogram models for kriging. The results showed that the shape of the variogram was affected by the degree of asymmetry, and that the effect increased as the size of the data set decreased. Transformations of the data were more effective in reducing the skewness coefficient in the larger sets of data. Cross-validation confirmed that variogram models from transformed data were more suitable for kriging than were those from the raw asymmetric data. The results of this study have implications for the 'standard best practice' in dealing with asymmetry in data for geostatistical analyses. (C) 2007 Elsevier Ltd. All rights reserved.
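
Matheron's method-of-moments estimator referred to above computes, for each lag h, half the mean squared difference between values separated by h. Below is a minimal sketch for a regularly spaced one-dimensional transect, with a lognormal series standing in for skewed data; the transect and lags are illustrative only.

```python
import numpy as np

def matheron_variogram(z, max_lag):
    """Matheron's estimator gamma(h) = 1 / (2 N(h)) * sum (z(x_i) - z(x_i + h))**2
    for a regularly spaced 1D transect z."""
    z = np.asarray(z, dtype=float)
    gamma = np.empty(max_lag)
    for h in range(1, max_lag + 1):
        diffs = z[h:] - z[:-h]
        gamma[h - 1] = 0.5 * np.mean(diffs**2)
    return gamma

# Illustrative use on a strongly right-skewed (lognormal) transect
rng = np.random.default_rng(0)
z = np.exp(rng.standard_normal(500))
print(matheron_variogram(z, max_lag=10))
print(matheron_variogram(np.log(z), max_lag=10))   # after a log transform
```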

Relevance:

30.00%

Publisher:

Abstract:

Asymmetry in a distribution can arise from a long tail of values in the underlying process or from outliers that belong to another population that contaminate the primary process. The first paper of this series examined the effects of the former on the variogram and this paper examines the effects of asymmetry arising from outliers. Simulated annealing was used to create normally distributed random fields of different size that are realizations of known processes described by variograms with different nugget:sill ratios. These primary data sets were then contaminated with randomly located and spatially aggregated outliers from a secondary process to produce different degrees of asymmetry. Experimental variograms were computed from these data by Matheron's estimator and by three robust estimators. The effects of standard data transformations on the coefficient of skewness and on the variogram were also investigated. Cross-validation was used to assess the performance of models fitted to experimental variograms computed from a range of data contaminated by outliers for kriging. The results showed that where skewness was caused by outliers the variograms retained their general shape, but showed an increase in the nugget and sill variances and nugget:sill ratios. This effect was only slightly more for the smallest data set than for the two larger data sets and there was little difference between the results for the latter. Overall, the effect of size of data set was small for all analyses. The nugget:sill ratio showed a consistent decrease after transformation to both square roots and logarithms; the decrease was generally larger for the latter, however. Aggregated outliers had different effects on the variogram shape from those that were randomly located, and this also depended on whether they were aggregated near to the edge or the centre of the field. The results of cross-validation showed that the robust estimators and the removal of outliers were the most effective ways of dealing with outliers for variogram estimation and kriging. (C) 2007 Elsevier Ltd. All rights reserved.
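
The abstract does not list which three robust estimators were used, so the sketch below shows one common robust alternative, the Cressie-Hawkins estimator, purely as an illustration, applied to a transect contaminated with randomly located outliers.

```python
import numpy as np

def cressie_hawkins_variogram(z, max_lag):
    """Cressie-Hawkins robust estimator for a regularly spaced 1D transect:
    2 * gamma(h) = (mean |z(x_i + h) - z(x_i)|**0.5)**4 / (0.457 + 0.494 / N(h))."""
    z = np.asarray(z, dtype=float)
    gamma = np.empty(max_lag)
    for h in range(1, max_lag + 1):
        diffs = np.abs(z[h:] - z[:-h])
        n = diffs.size
        gamma[h - 1] = 0.5 * np.mean(np.sqrt(diffs))**4 / (0.457 + 0.494 / n)
    return gamma

# Contaminate a Gaussian transect with a few large outliers and estimate the variogram
rng = np.random.default_rng(0)
z = rng.standard_normal(500)
z[rng.choice(500, size=10, replace=False)] += 8.0    # randomly located outliers
print(cressie_hawkins_variogram(z, max_lag=10))
```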

Relevance:

30.00%

Publisher:

Abstract:

We investigate the “flux excess” effect, whereby open solar flux estimates from spacecraft increase with increasing heliocentric distance. We analyze the kinematic effect on these open solar flux estimates of large-scale longitudinal structure in the solar wind flow, with particular emphasis on correcting estimates made using data from near-Earth satellites. We show that scatter, but no net bias, is introduced by the kinematic “bunching effect” on sampling and that this is true for both compression and rarefaction regions. The observed flux excesses, as a function of heliocentric distance, are shown to be consistent with open solar flux estimates from solar magnetograms made using the potential field source surface method and are well explained by the kinematic effect of solar wind speed variations on the frozen-in heliospheric field. Applying this kinematic correction to the Omni-2 interplanetary data set shows that the open solar flux at solar minimum fell from an annual mean of 3.82 × 1016 Wb in 1987 to close to half that value (1.98 × 1016 Wb) in 2007, making the fall in the minimum value over the last two solar cycles considerably faster than the rise inferred from geomagnetic activity observations over four solar cycles in the first half of the 20th century.

Relevance:

30.00%

Publisher:

Abstract:

Objective To assess the impact of a closed-loop electronic prescribing and automated dispensing system on the time spent providing a ward pharmacy service and the activities carried out. Setting Surgical ward, London teaching hospital. Method All data were collected two months pre- and one year post-intervention. First, the ward pharmacist recorded the time taken each day for four weeks. Second, an observational study was conducted over 10 weekdays, using two-dimensional work sampling, to identify the ward pharmacist's activities. Finally, medication orders were examined to identify pharmacists' endorsements that should have been, and were actually, made. Key findings Mean time to provide a weekday ward pharmacy service increased from 1 h 8 min to 1 h 38 min per day (P = 0.001; unpaired t-test). There were significant increases in time spent prescription monitoring, recommending changes in therapy/monitoring, giving advice or information, and non-productive time. There were decreases for supply, looking for charts and checking patients' own drugs. There was an increase in the amount of time spent with medical and pharmacy staff, and with 'self'. Seventy-eight per cent of patients' medication records could be assessed for endorsements pre- and 100% post-intervention. Endorsements were required for 390 (50%) of 787 medication orders pre-intervention and 190 (21%) of 897 afterwards (P < 0.0001; chi-square test). Endorsements were made for 214 (55%) of endorsement opportunities pre-intervention and 57 (30%) afterwards (P < 0.0001; chi-square test). Conclusion The intervention increased the overall time required to provide a ward pharmacy service and changed the types of activity undertaken. Contact time with medical and pharmacy staff increased. There was no significant change in time spent with patients. Fewer pharmacy endorsements were required post-intervention, but a lower percentage were actually made. The findings have important implications for the design, introduction and use of similar systems.
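
The chi-square comparisons reported above can be reconstructed from the counts given in the abstract; a minimal sketch using scipy.stats, with the 2x2 tables built directly from those figures:

```python
from scipy.stats import chi2_contingency

# Medication orders requiring endorsement: 390 of 787 pre-intervention, 190 of 897 post.
required = [[390, 787 - 390],
            [190, 897 - 190]]
chi2, p, dof, _ = chi2_contingency(required)
print(f"orders requiring endorsement: chi2={chi2:.1f}, p={p:.2g}")

# Endorsements actually made: 214 of 390 opportunities pre-intervention, 57 of 190 post.
made = [[214, 390 - 214],
        [57, 190 - 57]]
chi2, p, dof, _ = chi2_contingency(made)
print(f"endorsements made: chi2={chi2:.1f}, p={p:.2g}")
```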

Relevance:

30.00%

Publisher:

Abstract:

This presentation describes a system for measuring claddings as an example of the many possible advantages to be obtained by applying a personal computer to eddy current testing. A theoretical model and a learning algorithm are integrated into an instrument; they run on the PC and serve to simplify and enhance multiparameter testing. The PC gives additional assistance by simplifying set-up procedures, data logging and related tasks.