978 results for Spectral bands
Abstract:
Robust image hashing seeks to transform a given input image into a shorter hashed version using a key-dependent non-invertible transform. These image hashes can be used for watermarking, image integrity authentication or image indexing for fast retrieval. This paper introduces a new method of generating image hashes based on extracting Higher Order Spectral features from the Radon projection of an input image. The feature extraction process is non-invertible and non-linear, and different hashes can be produced from the same image through the use of random permutations of the input. We show that the transform is robust to typical image transformations such as JPEG compression, noise, scaling, rotation, smoothing and cropping. We evaluate our system using a verification-style framework based on calculating false match and false non-match likelihoods using the publicly available Uncompressed Colour Image Database (UCID) of 1320 images. We also compare our results to Swaminathan’s Fourier-Mellin based hashing method, achieving at least 1% improvement in equal error rate (EER) under noise, scaling and sharpening.
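As a rough illustration of the verification-style evaluation mentioned above, the following Python sketch computes false match and false non-match rates at a threshold and locates an approximate equal error rate. It assumes that hashes are compared with some distance measure and that a pair is accepted when the distance is at or below the threshold; the abstract does not state the comparison rule, and the function names and sample distances are hypothetical.

```python
def error_rates(genuine_distances, impostor_distances, threshold):
    """Return (false_match_rate, false_non_match_rate) at a threshold.

    Assumption: a pair is declared a match when its distance <= threshold.
    """
    # Genuine pairs (same image, modified) wrongly rejected.
    false_non_match = sum(d > threshold for d in genuine_distances) / len(genuine_distances)
    # Impostor pairs (different images) wrongly accepted.
    false_match = sum(d <= threshold for d in impostor_distances) / len(impostor_distances)
    return false_match, false_non_match


def approximate_eer(genuine_distances, impostor_distances):
    """Scan the observed distances as candidate thresholds and return
    (threshold, rate) where the two error rates are closest to each other."""
    best = None
    for t in sorted(set(genuine_distances) | set(impostor_distances)):
        fmr, fnmr = error_rates(genuine_distances, impostor_distances, t)
        gap = abs(fmr - fnmr)
        if best is None or gap < best[0]:
            best = (gap, t, (fmr + fnmr) / 2)
    return best[1], best[2]


if __name__ == "__main__":
    # Made-up distances, for illustration only.
    genuine = [1, 2, 3, 8]
    impostor = [4, 9, 10, 12]
    print(approximate_eer(genuine, impostor))
```

With real data the two rates rarely cross exactly at an observed threshold, so the returned value is only an approximation of the EER.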
Abstract:
This thesis investigates aspects of encoding the speech spectrum at low bit rates, with extensions to the effect of such coding on automatic speaker identification. Vector quantization (VQ) is a technique for jointly quantizing a block of samples at once, in order to reduce the bit rate of a coding system. The major drawback in using VQ is the complexity of the encoder. Recent research has indicated the potential applicability of the VQ method to speech when product code vector quantization (PCVQ) techniques are utilized. The focus of this research is the efficient representation, calculation and utilization of the speech model as stored in the PCVQ codebook. In this thesis, several VQ approaches are evaluated, and the efficacy of two training algorithms is compared experimentally. It is then shown that these product-code vector quantization algorithms may be augmented with lossless compression algorithms, thus yielding an improved overall compression rate. An approach using a statistical model for the vector codebook indices for subsequent lossless compression is introduced. This coupling of lossy compression and lossless compression enables further compression gain. It is demonstrated that this approach is able to reduce the bit rate requirement from the current 24 bits per 20 millisecond frame to below 20, using a standard spectral distortion metric for comparison. Several fast-search VQ methods for use in speech spectrum coding have been evaluated. The usefulness of fast-search algorithms is highly dependent upon the source characteristics and, although previous research has been undertaken for coding of images using VQ codebooks trained with the source samples directly, the product-code structured codebooks for speech spectrum quantization place new constraints on the search methodology. The second major focus of the research is an investigation of the effect of low-rate spectral compression methods on the task of automatic speaker identification.
The motivation for this aspect of the research arose from a need to simultaneously preserve the speech quality and intelligibility and to provide for machine-based automatic speaker recognition using the compressed speech. This is important because there are several emerging applications of speaker identification where compressed speech is involved. Examples include mobile communications where the speech has been highly compressed, or where a database of speech material has been assembled and stored in compressed form. Although these two application areas have the same objective - that of maximizing the identification rate - the starting points are quite different. On the one hand, the speech material used for training the identification algorithm may or may not be available in compressed form. On the other hand, the new test material on which identification is to be based may only be available in compressed form. Using the spectral parameters which have been stored in compressed form, two main classes of speaker identification algorithm are examined. Some studies have been conducted in the past on bandwidth-limited speaker identification, but the use of short-term spectral compression deserves separate investigation. Combining the major aspects of the research, some important design guidelines for the construction of an identification model when based on the use of compressed speech are put forward.
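The coding ideas in this abstract can be sketched in a few lines of Python: a full-search VQ encoder that maps a vector to its nearest codeword, the frame-rate arithmetic behind the quoted figures, and the empirical entropy of an index stream. The entropy is used here only as a simple stand-in for the thesis's statistical model of codebook indices, which the abstract does not specify. All function names and the toy codebook are hypothetical.

```python
import math
from collections import Counter


def nearest_codeword(vector, codebook):
    """Full-search VQ encoding: index of the codeword with the
    smallest squared Euclidean distance to `vector`."""
    best_index, best_dist = 0, float("inf")
    for index, codeword in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vector, codeword))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def bits_per_second(bits_per_frame, frame_ms):
    """Convert a per-frame bit budget into a bit rate."""
    return bits_per_frame * 1000.0 / frame_ms


def index_entropy_bits(indices):
    """Empirical entropy (bits per index) of a stream of codebook indices,
    treating the indices as independent draws (an assumption)."""
    counts = Counter(indices)
    total = len(indices)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    toy_codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]  # illustrative only
    print(nearest_codeword([0.9, 1.1], toy_codebook))
    print(bits_per_second(24, 20), bits_per_second(20, 20))
    print(index_entropy_bits([0, 0, 1, 1]))
```

In rate terms, 24 bits per 20 ms frame corresponds to 1200 bits/s, and 20 bits per frame to 1000 bits/s. When the indices are not uniformly distributed, their entropy falls below the fixed index width, which is the room a lossless coder can exploit.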
Abstract:
This thesis presents an original approach to parametric speech coding at rates below 1 kbits/sec, primarily for speech storage applications. Essential processes considered in this research encompass efficient characterization of the evolving configuration of the vocal tract to follow phonemic features with high fidelity, representation of speech excitation using minimal parameters with minor degradation in the naturalness of synthesized speech, and finally, quantization of the resulting parameters at the nominated rates. For encoding speech spectral features, a new method relying on Temporal Decomposition (TD) is developed which efficiently compresses spectral information through interpolation between the most steady points over time trajectories of spectral parameters using a new basis function. The compression ratio provided by the method is independent of the updating rate of the feature vectors, and hence allows high resolution in tracking significant temporal variations of speech formants with no effect on the spectral data rate. Accordingly, regardless of the quantization technique employed, the method yields a high compression ratio without sacrificing speech intelligibility. Several new techniques for improving performance of the interpolation of spectral parameters through phonetically-based analysis are proposed and implemented in this research, comprising event approximated TD, near-optimal shaping of event approximating functions, efficient speech parametrization for TD on the basis of an extensive investigation originally reported in this thesis, and a hierarchical error minimization algorithm for decomposition of feature parameters which significantly reduces the complexity of the interpolation process.
Speech excitation in this work is characterized based on a novel Multi-Band Excitation paradigm which accurately determines the harmonic structure in the LPC (linear predictive coding) residual spectra, within individual bands, using the concept of Instantaneous Frequency (IF) estimation in the frequency domain. The model yields an effective two-band approximation to excitation and computes pitch and voicing with high accuracy as well. New methods for interpolative coding of pitch and gain contours are also developed in this thesis. For pitch, relying on the correlation between phonetic evolution and pitch variations during voiced speech segments, TD is employed to interpolate the pitch contour between critical points introduced by event centroids. This compresses the pitch contour in the ratio of about 1/10 with negligible error. To approximate the gain contour, a set of uniformly-distributed Gaussian event-like functions is used which reduces the amount of gain information to about 1/6 with acceptable accuracy. The thesis also addresses a new quantization method applied to spectral features on the basis of statistical properties and spectral sensitivity of spectral parameters extracted from TD-based analysis. The experimental results show that good quality speech, comparable to that of conventional coders at rates over 2 kbits/sec, can be achieved at rates of 650-990 bits/sec.
Abstract:
The NIR spectra of reichenbachite, scholzite and parascholzite have been studied at 298 K. The spectra of the minerals are different, in line with composition and crystal structural variations. Cation substitution effects are significant in their electronic spectra and three distinctly different electronic transition bands are observed in the near-infrared spectra at high wavenumbers in the 12000-7600 cm-1 spectral region. The electronic spectrum of reichenbachite is characterised by Cu(II) transition bands at 9755 and 7520 cm-1. A broad spectral feature is observed for ferrous ion in the 12000-9000 cm-1 region in both scholzite and parascholzite. Some similarities in the vibrational spectra of the three phosphate minerals are observed, particularly in the OH stretching region. The observation of a strong band at 5090 cm-1 indicates strong hydrogen bonding in the structure of the dimorphs, scholzite and parascholzite. The three phosphates exhibit overlapping bands in the 4800-4000 cm-1 region resulting from the combinations of vibrational modes of (PO4)3- units.
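For readers more used to wavelengths, the band positions quoted above can be converted with the standard relation wavelength (nm) = 10^7 / wavenumber (cm-1). The short Python sketch below applies it to the values in this abstract; the function name is hypothetical and the conversion is not part of the original study.

```python
def wavenumber_to_nm(wavenumber_cm1):
    """Convert a wavenumber in cm-1 to a wavelength in nanometres.

    Uses wavelength (nm) = 1e7 / wavenumber (cm-1), since 1 cm = 1e7 nm.
    """
    if wavenumber_cm1 <= 0:
        raise ValueError("wavenumber must be positive")
    return 1e7 / wavenumber_cm1


if __name__ == "__main__":
    # Band positions and region limits taken from the abstract above.
    for wavenumber in (12000, 9755, 7600, 7520, 5090, 4800, 4000):
        print(f"{wavenumber} cm-1 is about {wavenumber_to_nm(wavenumber):.0f} nm")
```

On this scale the 12000-7600 cm-1 electronic region spans roughly 833 to 1316 nm, and the 5090 cm-1 band lies near 1965 nm.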
Abstract:
Mid-infrared (MIR) and near-infrared (NIR) spectroscopy have been used to study the molecular structure of halloysite and potassium acetate intercalated halloysite and to determine the structural changes of halloysite through intercalation. The MIR spectra show all fundamental vibrations including the hydroxyl units, basic aluminosilicate framework and water molecules in the structure of halloysite and its intercalation complex. Comparison between halloysite and the halloysite-potassium acetate intercalation complex shows that almost all bands observed for halloysite are also observed for the intercalation complex, apart from bands observed in the 1700-1300 cm-1 region, but with differences in band intensity. However, the NIR spectra, based on the MIR spectra, provide sufficient evidence to analyze the structural changes of halloysite through intercalation. There are obvious differences between halloysite and the halloysite-potassium acetate intercalation complex in all spectral ranges. Therefore, the reproducibility of measurement and richness of qualitative information should be simultaneously considered for proper selection of a spectroscopic method for molecular structural analysis.
Abstract:
The use of appropriate features to represent an output class or object is critical for all classification problems. In this paper, we propose a biologically inspired object descriptor to represent the spectral-texture patterns of image-objects. The proposed feature descriptor is generated from the pulse spectral frequencies (PSF) of a pulse coupled neural network (PCNN), which is invariant to rotation, translation and small scale changes. The proposed method is first evaluated in a rotation and scale invariant texture classification task using the USC-SIPI texture database. It is further evaluated in an application of vegetation species classification in power line corridor monitoring using airborne multi-spectral aerial imagery. The results from the two experiments demonstrate that the PSF feature is effective in representing spectral-texture patterns of objects and that it gives better results than classic color histogram and texture features.
Abstract:
Near infrared (NIR), infrared (IR) spectroscopy and X-ray diffraction (XRD) have been applied to halotrichites of the formula FeAl2(SO4)4∙22H2O and Fe2+Fe23+(SO4)4∙22H2O. Comparison of the halotrichites and their starting materials has been used to give a better understanding of the bonding involved in these types of minerals. The vibrational spectroscopy data show that Fe2+ oxidises during the formation of halotrichite, since no measures were implemented to prevent oxidation. This is clearly shown by the position and broadness of the electronic bands of transition metals in the NIR spectra (12500 to 7500 cm-1). It is apparent from this region that Fe3+ substitutes for Al3+ in the synthesis of halotrichite. Due to the oxidation of Fe2+ to Fe3+, the halotrichite sample contains a small portion of bilinite. This has been confirmed by XRD: peaks at 9 and 14° 2θ were observed in the halotrichite sample and are identical to those in the XRD pattern obtained for bilinite. Substitution of aluminium for Fe3+ has resulted in significant changes in the overall infrared and NIR spectral profiles. However, the lower wavenumber regions of the NIR spectra have very similar spectral profiles, which indicates that bilinite has formed with a structure similar to that of halotrichite. This work has shown that iron halotrichites can be synthesised and characterised by infrared and NIR spectroscopy.
Abstract:
Raman spectroscopy has been used to study selected mineral samples of the copiapite group. Copiapite (Fe2+Fe3+(SO4)6(OH)2 · 20H2O) is a secondary mineral formed through the oxidation of pyrite. Minerals of the copiapite group have the general formula AFe4(SO4)6(OH)2 · 20H2O, where A has a +2 charge and can be magnesium, iron, copper, calcium and/or zinc. The formula can also be B2/3Fe4(SO4)6(OH)2 · 20H2O, where B has a +3 charge and may be either aluminum or iron. For each mineral, two Raman bands are observed at around 992 and 1029 cm-1, assigned to the (SO4)2- ν1 symmetric stretching mode. The observation of two bands provides evidence for the existence of two non-equivalent sulfate anions in the mineral structure. Three Raman bands at 1112, 1142 and 1161 cm-1 are observed in the Raman spectrum of copiapites, indicating a reduction of symmetry of the sulfate anion in the copiapite structure. This reduction in symmetry is supported by multiple bands in the ν2 and ν4 (SO4)2- spectral regions.
Abstract:
The single crystal Raman spectra of the minerals brandholzite and bottinoite, formula M[Sb(OH)6]2•6H2O, where M is Mg+2 and Ni+2 respectively, and the non-aligned Raman spectrum of mopungite, formula Na[Sb(OH)6], are presented for the first time. The mixed metal minerals consist of alternating layers of [Sb(OH)6]-1 octahedra and mixed [M(H2O)6]+2 / [Sb(OH)6]-1 octahedra. Mopungite comprises hydrogen bonded layers of [Sb(OH)6]-1 octahedra linked within the layer by Na+ ions. The spectra of the three minerals were dominated by the Sb-O symmetric stretch of the [Sb(OH)6]-1 octahedron, which occurs at approximately 620 cm-1. The Raman spectrum of mopungite showed many similarities to spectra of the di-octahedral minerals, informing the view that the Sb octahedra gave rise to most of the Raman bands observed, particularly below 1200 cm-1. Assignments have been proposed based on the spectral comparison between the minerals, prior literature and density functional theory calculations of the vibrational spectra of the free [Sb(OH)6]-1 and [M(H2O)6]+2 octahedra by a model chemistry of B3LYP/6-31G(d) and lanl2dz for the Sb atom. The single crystal spectra showed good mode separation, allowing the majority of the bands to be assigned a symmetry species of A or E.
Abstract:
The single crystal Raman spectra of natural mineral schafarzikite FeSb2O4 from the Pernek locality of the Slovak Republic are presented for the first time. Raman spectra of natural mineral apuanite Fe2+Fe43+Sb4O12S, originating from the Apuan Alps in Italy, as well as spectra of synthetic ZnSb2O4 and arsenite mineral trippkeite CuAs2O4 are also presented for the first time. The spectra of the antimonite minerals are characterized by a strong band in the region 660 – 680 cm-1 with shoulders on either side, and a band of medium intensity near 300 cm-1. The spectrum of the arsenite mineral is characterized by a medium band near 780 cm-1 with a shoulder on the high wavenumber side and a strong band at 370 cm-1. Assignments are proposed based on the spectral comparison between the compounds, symmetry modes of the bands and prior literature. The single crystal spectra of schafarzikite showed good mode separation, allowing bands to be assigned a symmetry species of A1g, B1g, B2g or Eg.