20 results for Data compression. Seismic data. Mathematical Transforms. Huffman coding
in CentAUR: Central Archive University of Reading - UK
Abstract:
Cross-hole anisotropic electrical and seismic tomograms of fractured metamorphic rock have been obtained at a test site where extensive hydrological data were available. A strong correlation between electrical resistivity anisotropy and seismic compressional-wave velocity anisotropy has been observed. Analysis of core samples from the site reveals that the shale-rich rocks have fabric-related average velocity anisotropy of between 10% and 30%. The cross-hole seismic data are consistent with these values, indicating that the observed anisotropy might be principally due to the inherent rock fabric rather than to aligned sets of open fractures. One region with velocity anisotropy greater than 30% has been modelled as aligned open fractures within an anisotropic rock matrix, and this model is consistent with the available fracture density and hydraulic transmissivity data from the boreholes and with the cross-hole resistivity tomography data. In general, however, the study highlights the uncertainties that can arise from the relative influence of rock fabric and fluid-filled fractures when using geophysical techniques for hydrological investigations.
Abstract:
Bloom filters are a data structure for storing data in compressed form. They offer excellent space and time efficiency at the cost of some loss of accuracy (so-called lossy compression). This work presents the yes-no Bloom filter, a data structure consisting of two parts: the yes-filter, which is a standard Bloom filter, and the no-filter, another Bloom filter whose purpose is to represent those objects that were recognised incorrectly by the yes-filter (that is, to recognise the false positives of the yes-filter). By querying the no-filter after an object has been recognised by the yes-filter, we get a chance of rejecting it, which improves the accuracy of data recognition in comparison with a standard Bloom filter of the same total length. A further increase in accuracy is possible if one chooses the objects to include in the no-filter so that it recognises as many false positives as possible but no true positives, thus producing the most accurate yes-no Bloom filter among all yes-no Bloom filters. This paper studies how optimization techniques can be used to maximize the number of false positives recognised by the no-filter, subject to the constraint that it recognise no true positives. To achieve this aim, an Integer Linear Program (ILP) is proposed for the optimal selection of false positives. In practice the problem size is normally large, making the optimal solution intractable. Exploiting the similarity of the ILP to the Multidimensional Knapsack Problem, an Approximate Dynamic Programming (ADP) model is developed that uses a reduced ILP for the value function approximation. Numerical results show that the ADP model performs best in comparison with a number of heuristics as well as the CPLEX built-in branch-and-bound solver, and it is therefore the approach we recommend for use in yes-no Bloom filters. In the wider context of the study of lossy compression algorithms, our research is an example of how the arsenal of optimization methods can be applied to improving the accuracy of compressed data.
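The two-part query logic is easy to make concrete. Below is a minimal Python sketch under our own assumptions: the salted SHA-256 hashing, the sizes and the class names are illustrative, and the ILP/ADP selection of which false positives to store in the no-filter is assumed to happen outside the sketch.

```python
import hashlib

class BloomFilter:
    """Standard Bloom filter: k salted hashes over an m-bit array."""

    def __init__(self, m, k):
        self.m, self.k = m, k
        self.bits = [False] * m

    def _indices(self, obj):
        # Derive k bit positions by hashing the object with distinct salts.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{obj}".encode()).hexdigest()
            yield int(digest, 16) % self.m

    def add(self, obj):
        for idx in self._indices(obj):
            self.bits[idx] = True

    def __contains__(self, obj):
        return all(self.bits[idx] for idx in self._indices(obj))


class YesNoBloomFilter:
    """Yes-filter stores the objects; no-filter stores selected false positives."""

    def __init__(self, m_yes, m_no, k):
        self.yes = BloomFilter(m_yes, k)
        self.no = BloomFilter(m_no, k)

    def add(self, obj):
        self.yes.add(obj)

    def reject(self, false_positive):
        # The caller must guarantee this object is a false positive of the
        # yes-filter (e.g. selected by the ILP/ADP), never a stored object.
        self.no.add(false_positive)

    def __contains__(self, obj):
        # Accept only what the yes-filter recognises and the no-filter
        # does not subsequently reject.
        return obj in self.yes and obj not in self.no
```

Because the no-filter can itself produce false positives, carelessly adding objects to it may cause true positives to be rejected; this is exactly why the selection problem carries the "no true positives" constraint.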
Abstract:
Data assimilation is a sophisticated mathematical technique for combining observational data with model predictions to produce state and parameter estimates that most accurately approximate the current and future states of the true system. The technique is commonly used in atmospheric and oceanic modelling, combining empirical observations with model predictions to produce more accurate and well-calibrated forecasts. Here, we consider a novel application within a coastal environment and describe how the method can also be used to deliver improved estimates of uncertain morphodynamic model parameters. This is achieved using a technique known as state augmentation. Earlier applications of state augmentation have typically employed the 4D-Var, Kalman filter or ensemble Kalman filter assimilation schemes. Our new method is based on a computationally inexpensive 3D-Var scheme, where the specification of the error covariance matrices is crucial for success. A simple 1D model of bed-form propagation is used to demonstrate the method. The scheme is capable of recovering near-perfect parameter values and, therefore, improves the capability of our model to predict future bathymetry. Such positive results suggest the potential for application to more complex morphodynamic models.
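To make the state-augmentation idea concrete, here is a minimal NumPy sketch of a single 3D-Var analysis step on an augmented vector; the function name and interfaces are our own, and the careful specification of the error covariances that the abstract stresses as crucial is not reproduced here.

```python
import numpy as np

def augmented_3dvar_step(xb, theta_b, B, R, H, y):
    """One 3D-Var analysis on the augmented vector z = [x; theta].

    xb      : background state estimate (length n)
    theta_b : background guess of the uncertain model parameters
    B       : background error covariance of the *augmented* vector
    R       : observation error covariance
    H       : observation operator acting on the state part only
    y       : observation vector
    """
    zb = np.concatenate([xb, theta_b])
    n = xb.size
    # Parameters are not observed directly, so pad H with zeros.
    H_aug = np.hstack([H, np.zeros((H.shape[0], theta_b.size))])
    innovation = y - H_aug @ zb
    S = H_aug @ B @ H_aug.T + R                      # innovation covariance
    za = zb + B @ H_aug.T @ np.linalg.solve(S, innovation)
    return za[:n], za[n:]                            # analysed state, parameters
```

The cross-covariance block of B between the state and the parameters is what carries observational information into the parameter estimate; if that block is zero, the parameters are never updated, which is why the specification of the error covariance matrices is decisive.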
Abstract:
We investigate the “flux excess” effect, whereby open solar flux estimates from spacecraft increase with increasing heliocentric distance. We analyze the kinematic effect of large-scale longitudinal structure in the solar wind flow on these open solar flux estimates, with particular emphasis on correcting estimates made using data from near-Earth satellites. We show that the kinematic “bunching effect” on sampling introduces scatter, but no net bias, and that this is true for both compression and rarefaction regions. The observed flux excesses, as a function of heliocentric distance, are shown to be consistent with open solar flux estimates from solar magnetograms made using the potential field source surface method, and are well explained by the kinematic effect of solar wind speed variations on the frozen-in heliospheric field. Applying this kinematic correction to the Omni-2 interplanetary data set shows that the open solar flux at solar minimum fell from an annual mean of 3.82 × 10^16 Wb in 1987 to close to half that value (1.98 × 10^16 Wb) in 2007, making the fall in the minimum value over the last two solar cycles considerably faster than the rise inferred from geomagnetic activity observations over four solar cycles in the first half of the 20th century.
Abstract:
In a world of almost permanent and rapidly increasing electronic data availability, techniques for filtering, compressing, and interpreting these data to transform them into valuable and easily comprehensible information are of utmost importance. One key topic in this area is the capability to deduce future system behaviour from a given data input. This book brings together for the first time the complete theory of data-based neurofuzzy modelling and the linguistic attributes of fuzzy logic in a single cohesive mathematical framework. After introducing the basic theory of data-based modelling, new concepts including extended additive and multiplicative submodels are developed, and their extensions to state estimation and data fusion are derived. All these algorithms are illustrated with benchmark and real-life examples to demonstrate their efficiency. Chris Harris and his group have carried out pioneering work which has tied together the fields of neural networks and linguistic rule-based algorithms. This book is aimed at researchers and scientists in time series modelling, empirical data modelling, knowledge discovery, data mining, and data fusion.
Abstract:
It is generally assumed that the variability of neuronal morphology has an important effect on both the connectivity and the activity of the nervous system, but this effect has not been thoroughly investigated. Neuroanatomical archives represent a crucial tool to explore structure-function relationships in the brain. We are developing computational tools to describe, generate, store and render large sets of three-dimensional neuronal structures in a format that is compact, quantitative, accurate and readily accessible to the neuroscientist. Single-cell neuroanatomy can be characterized quantitatively at several levels. In computer-aided neuronal tracing files, a dendritic tree is described as a series of cylinders, each represented by diameter, spatial coordinates and the connectivity to other cylinders in the tree. This ‘Cartesian’ description constitutes a completely accurate mapping of dendritic morphology but it bears little intuitive information for the neuroscientist. In contrast, a classical neuroanatomical analysis characterizes neuronal dendrites on the basis of the statistical distributions of morphological parameters, e.g. maximum branching order or bifurcation asymmetry. This description is intuitively more accessible, but it only yields information on the collective anatomy of a group of dendrites, i.e. it is not complete enough to provide a precise ‘blueprint’ of the original data. We are adopting a third, intermediate level of description, which consists of the algorithmic generation of neuronal structures within a certain morphological class based on a set of ‘fundamental’, measured parameters. This description is as intuitive as a classical neuroanatomical analysis (parameters have an intuitive interpretation), and as complete as a Cartesian file (the algorithms generate and display complete neurons). The advantages of the algorithmic description of neuronal structure are immense. If an algorithm can measure the values of a handful of parameters from an experimental database and generate virtual neurons whose anatomy is statistically indistinguishable from that of their real counterparts, a great deal of data compression and amplification can be achieved. Data compression results from the quantitative and complete description of thousands of neurons with a handful of statistical distributions of parameters. Data amplification is possible because, from a set of experimental neurons, many more virtual analogues can be generated. This approach could allow one, in principle, to create and store a neuroanatomical database containing data for an entire human brain on a personal computer. We are using two programs, L-NEURON and ARBORVITAE, to investigate systematically the potential of several different algorithms for the generation of virtual neurons. Using these programs, we have generated anatomically plausible virtual neurons for several morphological classes, including guinea pig cerebellar Purkinje cells and cat spinal cord motor neurons. These virtual neurons are stored in an online electronic archive of dendritic morphology. This process highlights the potential and the limitations of the ‘computational neuroanatomy’ strategy for neuroscience databases.
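The ‘Cartesian’ cylinder representation described above is simple to write down. The following Python sketch is illustrative only: the field names follow the spirit of computer-aided tracing formats such as SWC, and the statistic shown is one example of a classical morphometric parameter.

```python
from dataclasses import dataclass

@dataclass
class Cylinder:
    """One compartment of a traced dendritic tree."""
    ident: int      # unique index of this cylinder
    x: float        # spatial coordinates of the distal end
    y: float
    z: float
    radius: float   # half the recorded diameter
    parent: int     # index of the parent cylinder, -1 for the root

def branching_order(tree, ident):
    """Number of bifurcations between a cylinder and the root: one of the
    summary statistics a classical neuroanatomical analysis tabulates."""
    children = {}
    for c in tree.values():
        children.setdefault(c.parent, []).append(c.ident)
    order, node = 0, ident
    while tree[node].parent != -1:
        node = tree[node].parent
        if len(children.get(node, [])) > 1:
            order += 1
    return order
```

A traced neuron is then just a mapping from indices to Cylinder records; the algorithmic level of description replaces the full list of such records with the statistical distributions from which equivalent records can be resampled.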
Abstract:
Written for communications and electronic engineers, technicians and students, this book begins with an introduction to data communications, and goes on to explain the concept of layered communications. Other chapters deal with physical communications channels, baseband digital transmission, analog data transmission, error control and data compression codes, physical layer standards, the data link layer, the higher layers of the protocol hierarchy, and local area networks (LANs). Finally, the book explores some likely future developments.
Abstract:
This dissertation deals with aspects of sequential data assimilation (in particular ensemble Kalman filtering) and numerical weather forecasting. In the first part, the recently formulated Ensemble Kalman-Bucy filter (EnKBF) is revisited. It is shown that the previously used numerical integration scheme fails when the magnitude of the background error covariance grows beyond that of the observational error covariance in the forecast window. We therefore present a suitable integration scheme that handles the stiffening of the differential equations involved without incurring further computational expense. Moreover, a transform-based alternative to the EnKBF is developed: under this scheme, the operations are performed in ensemble space instead of in state space, and the advantages of this formulation are explained. For the first time, the EnKBF is implemented in an atmospheric model. The second part of this work deals with ensemble clustering, a phenomenon that arises when performing data assimilation using deterministic ensemble square root filters (EnSRFs) in highly nonlinear forecast models: an M-member ensemble detaches into an outlier and a cluster of M-1 members. Previous works may suggest that this issue represents a failure of EnSRFs; this work dispels that notion. It is shown that ensemble clustering can also be reverted by nonlinear processes, in particular the alternation between nonlinear expansion and compression of the ensemble in different regions of the attractor. Some EnSRFs that use random rotations have been developed to overcome this issue; these formulations are analyzed and their advantages and disadvantages with respect to common EnSRFs are discussed. The third and last part contains the implementation of the Robert-Asselin-Williams (RAW) filter in an atmospheric model. The RAW filter is an improvement to the widely used Robert-Asselin filter that successfully suppresses spurious computational waves while avoiding any distortion in the mean value of the filtered function. Using statistical significance tests at both the local and field level, it is shown that the climatology of the SPEEDY model is not modified by the changed time-stepping scheme; hence, no retuning of the parameterizations is required. It is found that the accuracy of medium-term forecasts is increased by using the RAW filter.
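The RAW filter itself is a two-line modification of the Robert-Asselin filter. The following Python sketch is our own minimal rendering of the published scheme (the parameter values are typical choices, not the dissertation's); it shows how the filter displacement is split between two time levels:

```python
import numpy as np

def leapfrog_raw(f, x0, dt, nsteps, nu=0.2, alpha=0.53):
    """Integrate dx/dt = f(x) with leapfrog and the RAW time filter.

    alpha = 1 recovers the classical Robert-Asselin filter; values near
    0.5 preserve the mean of the filtered quantity.
    """
    x_prev = x0
    x_curr = x0 + dt * f(x0)                     # start-up: one Euler step
    history = [x_prev, x_curr]
    for _ in range(nsteps - 1):
        x_next = x_prev + 2.0 * dt * f(x_curr)   # leapfrog step
        # Robert-Asselin displacement, split between levels n and n+1.
        d = 0.5 * nu * (x_prev - 2.0 * x_curr + x_next)
        x_curr = x_curr + alpha * d
        x_next = x_next + (alpha - 1.0) * d
        x_prev, x_curr = x_curr, x_next
        history.append(x_curr)
    return np.array(history)
```

With alpha = 1 the second correction vanishes and the scheme reduces to the standard Robert-Asselin filter; the nonzero (alpha - 1) term applied to the new time level is what removes the distortion of the mean mentioned above.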
Abstract:
Variational data assimilation in continuous time is revisited. The central techniques applied in this paper are in part adopted from the theory of optimal nonlinear control. Alternatively, the investigated approach can be considered a continuous-time generalization of what is known as weakly constrained four-dimensional variational assimilation (4D-Var) in the geosciences. The technique allows trajectories to be assimilated in the case of partial observations and in the presence of model error. Several mathematical aspects of the approach are studied. Computationally, it amounts to solving a two-point boundary value problem. For imperfect models, the trade-off between small dynamical error (i.e. the trajectory obeys the model dynamics) and small observational error (i.e. the trajectory closely follows the observations) is investigated. This trade-off turns out to be trivial if the model is perfect. However, even in this situation, allowing for minute deviations from the perfect model is shown to have positive effects, namely to regularize the problem. The presented formalism is dynamical in character; no statistical assumptions on dynamical or observational noise are imposed.
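In our notation (not taken from the paper), the trade-off just described can be written as a weak-constraint cost functional over candidate trajectories x(t):

```latex
J[x] \;=\; \frac{1}{2}\int_0^T \big(\dot{x}-f(x)\big)^{\mathsf T} Q^{-1}\big(\dot{x}-f(x)\big)\,\mathrm{d}t
\;+\; \frac{1}{2}\int_0^T \big(y-h(x)\big)^{\mathsf T} R^{-1}\big(y-h(x)\big)\,\mathrm{d}t ,
```

where f is the model vector field, h the observation operator, and Q and R weight the dynamical and observational errors respectively. Taking Q → 0 enforces the model exactly, which is the trivial trade-off for a perfect model; keeping Q small but nonzero is the regularizing “minute deviation” discussed above, and the stationarity conditions of J yield the two-point boundary value problem to be solved.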
Abstract:
Data assimilation algorithms are a crucial part of operational systems in numerical weather prediction, hydrology and climate science, but are also important for dynamical reconstruction in medical applications and quality control for manufacturing processes. Usually, a variety of diverse measurement data are employed to determine the state of the atmosphere or of a wider system including land and oceans. Modern data assimilation systems use more and more remote sensing data, in particular radiances measured by satellites, radar data and integrated water vapour measurements via GPS/GNSS signals. The inversion of some of these measurements is ill-posed in the classical sense, i.e. the inverse of the operator H which maps the state onto the data is unbounded. In this case, the use of such data can lead to significant instabilities of data assimilation algorithms. The goal of this work is to provide a rigorous mathematical analysis of the instability of well-known data assimilation methods. Here, we restrict our attention to particular linear systems in which the instability can be explicitly analyzed. We investigate three-dimensional variational assimilation (3D-Var) and four-dimensional variational assimilation (4D-Var). A theory for the instability is developed using the classical theory of ill-posed problems in a Banach space framework. Further, we demonstrate by numerical examples that instabilities can and will occur, including an example from dynamic magnetic tomography.
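A toy numerical illustration of this amplification effect follows; the operator, its singular spectrum and all dimensions are invented for the example, and no background term or regularization is applied.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
# Build a smoothing observation operator H whose singular values decay
# from 1 to 1e-12, mimicking an H with an effectively unbounded inverse.
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = 10.0 ** (-12.0 * np.arange(n) / (n - 1))
H = U @ np.diag(s) @ V.T

x_true = rng.standard_normal(n)
for sigma_obs in (0.0, 1e-8):
    y = H @ x_true + sigma_obs * rng.standard_normal(n)
    x_hat = np.linalg.solve(H, y)        # naive, unregularized inversion
    print(f"obs noise {sigma_obs:.0e} -> state error "
          f"{np.linalg.norm(x_hat - x_true):.2e}")
```

Even observation noise of order 1e-8 is amplified by the small singular values into a large state error; this unbounded amplification of small data errors is precisely the classical ill-posedness the abstract refers to.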
Abstract:
Data quality is a difficult notion to define precisely, and different communities have different views and understandings of the subject. This causes confusion, a lack of harmonization of data across communities and omission of vital quality information. For some existing data infrastructures, data quality standards cannot address the problem adequately and cannot fulfil all user needs or cover all concepts of data quality. In this study, we discuss some philosophical issues on data quality. We identify actual user needs on data quality, review existing standards and specifications on data quality, and propose an integrated model for data quality in the field of Earth observation (EO). We also propose a practical mechanism for applying the integrated quality information model to a large number of datasets through metadata inheritance. While our data quality management approach is in the domain of EO, we believe that the ideas and methodologies for data quality management can be applied to wider domains and disciplines to facilitate quality-enabled scientific research.