892 resultados para lossy compression
Resumo:
A novel approach for lossless as well as lossy compression of monochrome images using Boolean minimization is proposed. The image is split into bit planes. Each bit plane is divided into windows or blocks of variable size. Each block is transformed into a Boolean switching function in cubical form, treating the pixel values as output of the function. Compression is performed by minimizing these switching functions using ESPRESSO, a cube based two level function minimizer. The minimized cubes are encoded using a code set which satisfies the prefix property. Our technique of lossless compression involves linear prediction as a preprocessing step and has compression ratio comparable to that of JPEG lossless compression technique. Our lossy compression technique involves reducing the number of bit planes as a preprocessing step which incurs minimal loss in the information of the image. The bit planes that remain after preprocessing are compressed using our lossless compression technique based on Boolean minimization. Qualitatively one cannot visually distinguish between the original image and the lossy image and the value of mean square error is kept low. For mean square error value close to that of JPEG lossy compression technique, our method gives better compression ratio. The compression scheme is relatively slower while the decompression time is comparable to that of JPEG.
Resumo:
Bloom filters are a data structure for storing data in a compressed form. They offer excellent space and time efficiency at the cost of some loss of accuracy (so-called lossy compression). This work presents a yes-no Bloom filter, which as a data structure consisting of two parts: the yes-filter which is a standard Bloom filter and the no-filter which is another Bloom filter whose purpose is to represent those objects that were recognised incorrectly by the yes-filter (that is, to recognise the false positives of the yes-filter). By querying the no-filter after an object has been recognised by the yes-filter, we get a chance of rejecting it, which improves the accuracy of data recognition in comparison with the standard Bloom filter of the same total length. A further increase in accuracy is possible if one chooses objects to include in the no-filter so that the no-filter recognises as many as possible false positives but no true positives, thus producing the most accurate yes-no Bloom filter among all yes-no Bloom filters. This paper studies how optimization techniques can be used to maximize the number of false positives recognised by the no-filter, with the constraint being that it should recognise no true positives. To achieve this aim, an Integer Linear Program (ILP) is proposed for the optimal selection of false positives. In practice the problem size is normally large leading to intractable optimal solution. Considering the similarity of the ILP with the Multidimensional Knapsack Problem, an Approximate Dynamic Programming (ADP) model is developed making use of a reduced ILP for the value function approximation. Numerical results show the ADP model works best comparing with a number of heuristics as well as the CPLEX built-in solver (B&B), and this is what can be recommended for use in yes-no Bloom filters. In a wider context of the study of lossy compression algorithms, our researchis an example showing how the arsenal of optimization methods can be applied to improving the accuracy of compressed data.
Resumo:
The main goal of this research is to design an efficient compression al~ gorithm for fingerprint images. The wavelet transform technique is the principal tool used to reduce interpixel redundancies and to obtain a parsimonious representation for these images. A specific fixed decomposition structure is designed to be used by the wavelet packet in order to save on the computation, transmission, and storage costs. This decomposition structure is based on analysis of information packing performance of several decompositions, two-dimensional power spectral density, effect of each frequency band on the reconstructed image, and the human visual sensitivities. This fixed structure is found to provide the "most" suitable representation for fingerprints, according to the chosen criteria. Different compression techniques are used for different subbands, based on their observed statistics. The decision is based on the effect of each subband on the reconstructed image according to the mean square criteria as well as the sensitivities in human vision. To design an efficient quantization algorithm, a precise model for distribution of the wavelet coefficients is developed. The model is based on the generalized Gaussian distribution. A least squares algorithm on a nonlinear function of the distribution model shape parameter is formulated to estimate the model parameters. A noise shaping bit allocation procedure is then used to assign the bit rate among subbands. To obtain high compression ratios, vector quantization is used. In this work, the lattice vector quantization (LVQ) is chosen because of its superior performance over other types of vector quantizers. The structure of a lattice quantizer is determined by its parameters known as truncation level and scaling factor. In lattice-based compression algorithms reported in the literature the lattice structure is commonly predetermined leading to a nonoptimized quantization approach. In this research, a new technique for determining the lattice parameters is proposed. In the lattice structure design, no assumption about the lattice parameters is made and no training and multi-quantizing is required. The design is based on minimizing the quantization distortion by adapting to the statistical characteristics of the source in each subimage. 11 Abstract Abstract Since LVQ is a multidimensional generalization of uniform quantizers, it produces minimum distortion for inputs with uniform distributions. In order to take advantage of the properties of LVQ and its fast implementation, while considering the i.i.d. nonuniform distribution of wavelet coefficients, the piecewise-uniform pyramid LVQ algorithm is proposed. The proposed algorithm quantizes almost all of source vectors without the need to project these on the lattice outermost shell, while it properly maintains a small codebook size. It also resolves the wedge region problem commonly encountered with sharply distributed random sources. These represent some of the drawbacks of the algorithm proposed by Barlaud [26). The proposed algorithm handles all types of lattices, not only the cubic lattices, as opposed to the algorithms developed by Fischer [29) and Jeong [42). Furthermore, no training and multiquantizing (to determine lattice parameters) is required, as opposed to Powell's algorithm [78). For coefficients with high-frequency content, the positive-negative mean algorithm is proposed to improve the resolution of reconstructed images. For coefficients with low-frequency content, a lossless predictive compression scheme is used to preserve the quality of reconstructed images. A method to reduce bit requirements of necessary side information is also introduced. Lossless entropy coding techniques are subsequently used to remove coding redundancy. The algorithms result in high quality reconstructed images with better compression ratios than other available algorithms. To evaluate the proposed algorithms their objective and subjective performance comparisons with other available techniques are presented. The quality of the reconstructed images is important for a reliable identification. Enhancement and feature extraction on the reconstructed images are also investigated in this research. A structural-based feature extraction algorithm is proposed in which the unique properties of fingerprint textures are used to enhance the images and improve the fidelity of their characteristic features. The ridges are extracted from enhanced grey-level foreground areas based on the local ridge dominant directions. The proposed ridge extraction algorithm, properly preserves the natural shape of grey-level ridges as well as precise locations of the features, as opposed to the ridge extraction algorithm in [81). Furthermore, it is fast and operates only on foreground regions, as opposed to the adaptive floating average thresholding process in [68). Spurious features are subsequently eliminated using the proposed post-processing scheme.
Resumo:
This thesis investigates aspects of encoding the speech spectrum at low bit rates, with extensions to the effect of such coding on automatic speaker identification. Vector quantization (VQ) is a technique for jointly quantizing a block of samples at once, in order to reduce the bit rate of a coding system. The major drawback in using VQ is the complexity of the encoder. Recent research has indicated the potential applicability of the VQ method to speech when product code vector quantization (PCVQ) techniques are utilized. The focus of this research is the efficient representation, calculation and utilization of the speech model as stored in the PCVQ codebook. In this thesis, several VQ approaches are evaluated, and the efficacy of two training algorithms is compared experimentally. It is then shown that these productcode vector quantization algorithms may be augmented with lossless compression algorithms, thus yielding an improved overall compression rate. An approach using a statistical model for the vector codebook indices for subsequent lossless compression is introduced. This coupling of lossy compression and lossless compression enables further compression gain. It is demonstrated that this approach is able to reduce the bit rate requirement from the current 24 bits per 20 millisecond frame to below 20, using a standard spectral distortion metric for comparison. Several fast-search VQ methods for use in speech spectrum coding have been evaluated. The usefulness of fast-search algorithms is highly dependent upon the source characteristics and, although previous research has been undertaken for coding of images using VQ codebooks trained with the source samples directly, the product-code structured codebooks for speech spectrum quantization place new constraints on the search methodology. The second major focus of the research is an investigation of the effect of lowrate spectral compression methods on the task of automatic speaker identification. The motivation for this aspect of the research arose from a need to simultaneously preserve the speech quality and intelligibility and to provide for machine-based automatic speaker recognition using the compressed speech. This is important because there are several emerging applications of speaker identification where compressed speech is involved. Examples include mobile communications where the speech has been highly compressed, or where a database of speech material has been assembled and stored in compressed form. Although these two application areas have the same objective - that of maximizing the identification rate - the starting points are quite different. On the one hand, the speech material used for training the identification algorithm may or may not be available in compressed form. On the other hand, the new test material on which identification is to be based may only be available in compressed form. Using the spectral parameters which have been stored in compressed form, two main classes of speaker identification algorithm are examined. Some studies have been conducted in the past on bandwidth-limited speaker identification, but the use of short-term spectral compression deserves separate investigation. Combining the major aspects of the research, some important design guidelines for the construction of an identification model when based on the use of compressed speech are put forward.
Resumo:
The thesis explores the area of still image compression. The image compression techniques can be broadly classified into lossless and lossy compression. The most common lossy compression techniques are based on Transform coding, Vector Quantization and Fractals. Transform coding is the simplest of the above and generally employs reversible transforms like, DCT, DWT, etc. Mapped Real Transform (MRT) is an evolving integer transform, based on real additions alone. The present research work aims at developing new image compression techniques based on MRT. Most of the transform coding techniques employ fixed block size image segmentation, usually 8×8. Hence, a fixed block size transform coding is implemented using MRT and the merits and demerits are analyzed for both 8×8 and 4×4 blocks. The N2 unique MRT coefficients, for each block, are computed using templates. Considering the merits and demerits of fixed block size transform coding techniques, a hybrid form of these techniques is implemented to improve the performance of compression. The performance of the hybrid coder is found to be better compared to the fixed block size coders. Thus, if the block size is made adaptive, the performance can be further improved. In adaptive block size coding, the block size may vary from the size of the image to 2×2. Hence, the computation of MRT using templates is impractical due to memory requirements. So, an adaptive transform coder based on Unique MRT (UMRT), a compact form of MRT, is implemented to get better performance in terms of PSNR and HVS The suitability of MRT in vector quantization of images is then experimented. The UMRT based Classified Vector Quantization (CVQ) is implemented subsequently. The edges in the images are identified and classified by employing a UMRT based criteria. Based on the above experiments, a new technique named “MRT based Adaptive Transform Coder with Classified Vector Quantization (MATC-CVQ)”is developed. Its performance is evaluated and compared against existing techniques. A comparison with standard JPEG & the well-known Shapiro’s Embedded Zero-tree Wavelet (EZW) is done and found that the proposed technique gives better performance for majority of images
Resumo:
Programa de doctorado: Ingeniería de Telecomunicación Avanzada
Resumo:
The use of 3D data in mobile robotics applications provides valuable information about the robot’s environment but usually the huge amount of 3D information is unmanageable by the robot storage and computing capabilities. A data compression is necessary to store and manage this information but preserving as much information as possible. In this paper, we propose a 3D lossy compression system based on plane extraction which represent the points of each scene plane as a Delaunay triangulation and a set of points/area information. The compression system can be customized to achieve different data compression or accuracy ratios. It also supports a color segmentation stage to preserve original scene color information and provides a realistic scene reconstruction. The design of the method provides a fast scene reconstruction useful for further visualization or processing tasks.
Resumo:
Medical imaging technology and applications are continuously evolving, dealing with images of increasing spatial and temporal resolutions, which allow easier and more accurate medical diagnosis. However, this increase in resolution demands a growing amount of data to be stored and transmitted. Despite the high coding efficiency achieved by the most recent image and video coding standards in lossy compression, they are not well suited for quality-critical medical image compression where either near-lossless or lossless coding is required. In this dissertation, two different approaches to improve lossless coding of volumetric medical images, such as Magnetic Resonance and Computed Tomography, were studied and implemented using the latest standard High Efficiency Video Encoder (HEVC). In a first approach, the use of geometric transformations to perform inter-slice prediction was investigated. For the second approach, a pixel-wise prediction technique, based on Least-Squares prediction, that exploits inter-slice redundancy was proposed to extend the current HEVC lossless tools. Experimental results show a bitrate reduction between 45% and 49%, when compared with DICOM recommended encoders, and 13.7% when compared with standard HEVC.
Resumo:
Two decades after its inception, Latent Semantic Analysis(LSA) has become part and parcel of every modern introduction to Information Retrieval. For any tool that matures so quickly, it is important to check its lore and limitations, or else stagnation will set in. We focus here on the three main aspects of LSA that are well accepted, and the gist of which can be summarized as follows: (1) that LSA recovers latent semantic factors underlying the document space, (2) that such can be accomplished through lossy compression of the document space by eliminating lexical noise, and (3) that the latter can best be achieved by Singular Value Decomposition. For each aspect we performed experiments analogous to those reported in the LSA literature and compared the evidence brought to bear in each case. On the negative side, we show that the above claims about LSA are much more limited than commonly believed. Even a simple example may show that LSA does not recover the optimal semantic factors as intended in the pedagogical example used in many LSA publications. Additionally, and remarkably deviating from LSA lore, LSA does not scale up well: the larger the document space, the more unlikely that LSA recovers an optimal set of semantic factors. On the positive side, we describe new algorithms to replace LSA (and more recent alternatives as pLSA, LDA, and kernel methods) by trading its l2 space for an l1 space, thereby guaranteeing an optimal set of semantic factors. These algorithms seem to salvage the spirit of LSA as we think it was initially conceived.
Resumo:
Conventional encryption techniques are usually applicable for text data and often unsuited for encrypting multimedia objects for two reasons. Firstly, the huge sizes associated with multimedia objects make conventional encryption computationally costly. Secondly, multimedia objects come with massive redundancies which are useful in avoiding encryption of the objects in their entirety. Hence a class of encryption techniques devoted to encrypting multimedia objects like images have been developed. These techniques make use of the fact that the data comprising multimedia objects like images could in general be seggregated into two disjoint components, namely salient and non-salient. While the former component contributes to the perceptual quality of the object, the latter only adds minor details to it. In the context of images, the salient component is often much smaller in size than the non-salient component. Encryption effort is considerably reduced if only the salient component is encrypted while leaving the other component unencrypted. A key challenge is to find means to achieve a desirable seggregation so that the unencrypted component does not reveal any information about the object itself. In this study, an image encryption approach that uses fractal structures known as space-filling curves- in order to reduce the encryption overload is presented. In addition, the approach also enables a high quality lossy compression of images.
Resumo:
O tema principal desta tese é o problema de cancelamento de interferência para sistemas multi-utilizador, com antenas distribuídas. Como tal, ao iniciar, uma visão geral das principais propriedades de um sistema de antenas distribuídas é apresentada. Esta descrição inclui o estudo analítico do impacto da ligação, dos utilizadores do sistema, a mais antenas distribuídas. Durante essa análise é demonstrado que a propriedade mais importante do sistema para obtenção do ganho máximo, através da ligação de mais antenas de transmissão, é a simetria espacial e que os utilizadores nas fronteiras das células são os mais bene ciados. Tais resultados são comprovados através de simulação. O problema de cancelamento de interferência multi-utilizador é considerado tanto para o caso unidimensional (i.e. sem codi cação) como para o multidimensional (i.e. com codi cação). Para o caso unidimensional um algoritmo de pré-codi cação não-linear é proposto e avaliado, tendo como objectivo a minimização da taxa de erro de bit. Tanto o caso de portadora única como o de multipla-portadora são abordados, bem como o cenário de antenas colocadas e distribuidas. É demonstrado que o esquema proposto pode ser visto como uma extensão do bem conhecido esquema de zeros forçados, cuja desempenho é provado ser um limite inferior para o esquema generalizado. O algoritmo é avaliado, para diferentes cenários, através de simulação, a qual indica desempenho perto do óptimo, com baixa complexidade. Para o caso multi-dimensional um esquema para efectuar "dirty paper coding" binário, tendo como base códigos de dupla camada é proposto. No desenvolvimento deste esquema, a compressão com perdas de informação, é considerada como um subproblema. Resultados de simulação indicam transmissão dedigna proxima do limite de Shannon.
Resumo:
Der technische Fortschritt konfrontiert die medizinische Bildgebung wie keine andere Sparte der Medizin mit einem rasanten Anstieg zu speichernder Daten. Anschaffung, Wartung und Ausbau der nötigen Infrastruktur entwickeln sich zunehmend zu einem ökonomischen Faktor. Ein Verfahren, welches diesem Trend etwas entgegensetzten könnte ist die irreversible Bilddatenkompression. Sie ist seit über 10 Jahren Gegenstand vieler Studien, deren Ergebnisse sich wiederum in Empfehlungen zum Einsatz irreversibler Kompression mehrerer nationaler und internationaler Organisation, wie CAR, DRG, RCR und ESR wiederspiegeln. Tenor dieser Empfehlungen ist, dass der Einsatz von moderater irreversibler Bilddatenkompression sicher und sinnvoll ist. Teil dieser Empfehlungen sind auch Angaben über das Maß an Kompression, ausgedrückt in Kompressionsraten, welche je nach Untersuchung und anatomischer Region als sicher anwendbar gelten und keinen diagnostisch relevanten Verlust der komprimierten Bilder erzeugen.rnVerschiedene Kompressionsalgorithmen wurden vorgeschlagen. Letztendlich haben sich vor allem die beiden weit verbreiteten Algorithmen JPEG und JPEG2000 bewährt. Letzterer erfährt in letzter Zeit zunehmen Anwendung, aufgrund seiner einfacheren Handhabung und seiner umfangreichen Zusatzfunktionen.rnAufgrund rechtlich-ethischer Bedenken hat die irreversible Kompression keine breite praktische Anwendung finden können. Dafür verantwortlich ist unter anderem auch die Unklarheit, wie sich irreversible Kompression auf Nach- und Weiterverarbeitung (sog. Postprocessing) medizinischer Bilder, wie Segmentierung, Volumetrie oder 3D-Darstellung, auswirkt. Bisherige Studien zu diesem Thema umfassen vier verschiedene Postprocessing-Algorithmen. Die untersuchten Algorithmen zeigten sich bei verlustbehafteter Kompression im Bereich der erwähnten, publizierten Kompressionsraten weitgehend unbeeinflusst. Lediglich die computergestützte Messung von Stenosegraden in der digitalen Koronarangiographie kollidiert mit den in Großbritannien geltenden Empfehlungen. Die Verwendung unterschiedlicher Kompressionsalgorithmen schränkt die allgemeinernAussagekraft dieser Studienergebnisse außerdem ein.rnZur Erweiterung der Studienlage wurden vier weitere Nach- und Weiterverarbeitungsalgorithmen auf ihre Kompressionstoleranz untersucht. Dabei wurden die Kompressionsraten von 8:1, 10:1 und 15:1 verwendet, welche um die empfohlenen Kompressionsraten von CAR, DRG, RCR und ESR liegen und so ein praxisnahes Setting bieten. Als Kompressionsalgorithmus wurde JPEG2000 verwendet, aufgrund seiner zunehmenden Nutzung in Studien sowie seiner bereits erwähnten Vorzüge in Sachen Handhabung und Zusatzfunktionen. Die vier Algorithmen umfassten das 3D-Volume rendering von CT-Angiographien der Becken-Bein-Gefäße, die Computer-assistierte Detektion von Lungenrundherden, die automatisierte Volumetrie von Leberrundherden und die funktionelle Bestimmung der Ejektionsfraktion in computertomographischen Aufnahmen des Herzens.rnAlle vier Algorithmen zeigten keinen Einfluss durch irreversibler Bilddatenkompression in denrngewählten Kompressionsraten (8:1, 10:1 und 15:1). Zusammen mit der bestehenden Literatur deuten die Ergebnisse an, dass moderate irreversible Kompression im Rahmen aktueller Empfehlungen keinen Einfluss auf Nach- und Weiterverarbeitung medizinischer Bilder hat. Eine explizitere Vorhersage zu einem bestimmten, noch nicht untersuchten Algorithmus ist jedoch aufgrund der unterschiedlichen Funktionsweisen und Programmierungen nicht sicher möglich.rnSofern ein Postprocessing Algorithmus auf komprimiertes Bildmaterial angewendet werden soll, muss dieser zunächst auf seine Kompressionstoleranz getestet werden. Dabei muss der Test eine rechtlich-ethische Grundlage für den Einsatz des Algorithmus bei komprimiertem Bildmaterial schaffen. Es sind vor allem zwei Optionen denkbar, die Testung institutsintern, eventuell unter Zuhilfenahme von vorgefertigten Bibliotheken, oder die Testung durch den Hersteller des Algorithmus.
Resumo:
Quizás el Código Morse, inventado en 1838 para su uso en la telegrafía, es uno de los primeros ejemplos de la utilización práctica de la compresión de datos [1], donde las letras más comunes del alfabeto son codificadas con códigos más cortos que las demás. A partir de 1940 y tras el desarrollo de la teoría de la información y la creación de los primeros ordenadores, la compresión de la información ha sido un reto constante y fundamental entre los campos de trabajo de investigadores de todo tipo. Cuanto mayor es nuestra comprensión sobre el significado de la información, mayor es nuestro éxito comprimiéndola. En el caso de la información multimedia, su naturaleza permite la compresión con pérdidas, alcanzando así cotas de compresión imposibles para los algoritmos sin pérdidas. Estos “recientes” algoritmos con pérdidas han estado mayoritariamente basados en transformación de la información al dominio de la frecuencia y en la eliminación de parte de la información en dicho dominio. Transformar al dominio de la frecuencia posee ventajas pero también involucra unos costes computacionales inevitables. Esta tesis presenta un nuevo algoritmo de compresión multimedia llamado “LHE” (Logarithmical Hopping Encoding) que no requiere transformación al dominio de la frecuencia, sino que trabaja en el dominio del espacio. Esto lo convierte en un algoritmo lineal de reducida complejidad computacional. Los resultados del algoritmo son prometedores, superando al estándar JPEG en calidad y velocidad. Para ello el algoritmo utiliza como base la respuesta fisiológica del ojo humano ante el estímulo luminoso. El ojo, al igual que el resto de los sentidos, responde al logaritmo de la señal de acuerdo a la ley de Weber. El algoritmo se compone de varias etapas. Una de ellas es la medición de la “Relevancia Perceptual”, una nueva métrica que nos va a permitir medir la relevancia que tiene la información en la mente del sujeto y en base a la misma, degradar en mayor o menor medida su contenido, a través de lo que he llamado “sub-muestreado elástico”. La etapa de sub-muestreado elástico constituye una nueva técnica sin precedentes en el tratamiento digital de imágenes. Permite tomar más o menos muestras en diferentes áreas de una imagen en función de su relevancia perceptual. En esta tesis se dan los primeros pasos para la elaboración de lo que puede llegar a ser un nuevo formato estándar de compresión multimedia (imagen, video y audio) libre de patentes y de alto rendimiento tanto en velocidad como en calidad. ABSTRACT The Morse code, invented in 1838 for use in telegraphy, is one of the first examples of the practical use of data compression [1], where the most common letters of the alphabet are coded shorter than the rest of codes. From 1940 and after the development of the theory of information and the creation of the first computers, compression of information has been a constant and fundamental challenge among any type of researchers. The greater our understanding of the meaning of information, the greater our success at compressing. In the case of multimedia information, its nature allows lossy compression, reaching impossible compression rates compared with lossless algorithms. These "recent" lossy algorithms have been mainly based on information transformation to frequency domain and elimination of some of the information in that domain. Transforming the frequency domain has advantages but also involves inevitable computational costs. This thesis introduces a new multimedia compression algorithm called "LHE" (logarithmical Hopping Encoding) that does not require transformation to frequency domain, but works in the space domain. This feature makes LHE a linear algorithm of reduced computational complexity. The results of the algorithm are promising, outperforming the JPEG standard in quality and speed. The basis of the algorithm is the physiological response of the human eye to the light stimulus. The eye, like other senses, responds to the logarithm of the signal according with Weber law. The algorithm consists of several stages. One is the measurement of "perceptual relevance," a new metric that will allow us to measure the relevance of information in the subject's mind and based on it; degrade accordingly their contents, through what I have called "elastic downsampling". Elastic downsampling stage is an unprecedented new technique in digital image processing. It lets take more or less samples in different areas of an image based on their perceptual relevance. This thesis introduces the first steps for the development of what may become a new standard multimedia compression format (image, video and audio) free of patents and high performance in both speed and quality.