23 results for Dictionaries, Polyglot.


Relevance: 10.00%

Abstract:

A property of sparse representations in relation to their capacity for information storage is discussed. It is shown that this feature can be used for an application that we term Encrypted Image Folding. The proposed procedure is realizable through any suitable transformation. In particular, in this paper we illustrate the approach by recourse to the Discrete Cosine Transform and a combination of redundant Cosine and Dirac dictionaries. The main advantage of the proposed technique is that both storage and encryption can be achieved simultaneously using simple processing steps.
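The folding and encryption steps themselves are not reproduced here, but the storage property the abstract appeals to — a signal that is sparse in a cosine dictionary occupies only a few coefficient slots, leaving the rest free — can be sketched in a few lines. This is an illustrative numpy construction, not the paper's procedure:

```python
import numpy as np

def dct2_matrix(n):
    # Orthonormal DCT-II analysis matrix; row k is the k-th cosine atom.
    k = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    M = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * j + 1) * k / (2 * n))
    M[0] /= np.sqrt(2)
    return M

n = 256
D = dct2_matrix(n)

# Synthesize a signal from only 3 of the 256 cosine atoms.
true_coeffs = np.zeros(n)
true_coeffs[[3, 7, 20]] = [1.0, -0.5, 0.25]
signal = D.T @ true_coeffs

# Analysis recovers exactly those 3 coefficients; the remaining slots are
# numerically zero — the kind of unused room a folding scheme can exploit.
coeffs = D @ signal
significant = np.abs(coeffs) > 1e-8
print(int(significant.sum()))  # 3
```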

Relevance: 10.00%

Abstract:

Sparse representation of astronomical images is discussed. It is shown that a significant gain in sparsity is achieved when particular mixed dictionaries are used for approximating these types of images with greedy selection strategies. Experiments are conducted to confirm (i) the effectiveness at producing sparse representations and (ii) competitiveness with respect to the time required to process large images. The latter is a consequence of the suitability of the proposed dictionaries for approximating images in partitions of small blocks. This feature makes it possible to apply the effective greedy selection technique called orthogonal matching pursuit, up to some block size. For blocks exceeding that size, a refinement of the original matching pursuit approach is considered. The resulting method is termed "self-projected matching pursuit," because it is shown to be effective for implementing, via matching pursuit itself, the optional backprojection intermediate steps in that approach. © 2013 Optical Society of America.
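As a rough illustration of greedy selection over a mixed dictionary (here Dirac spikes plus DCT-style cosines, loosely in the spirit of the dictionaries above, not the authors' exact construction), a minimal matching pursuit might look like:

```python
import numpy as np

def matching_pursuit(signal, D, n_steps):
    """Plain matching pursuit: greedily pick the best-correlated atom,
    subtract its contribution, and repeat on the residual."""
    residual = signal.astype(float).copy()
    coeffs = np.zeros(D.shape[1])
    for _ in range(n_steps):
        corr = D.T @ residual              # correlation with every atom
        k = int(np.argmax(np.abs(corr)))   # greedy selection
        coeffs[k] += corr[k]
        residual = residual - corr[k] * D[:, k]
    return coeffs, residual

# A mixed, redundant dictionary: Dirac spikes plus unit-norm cosine atoms.
n = 64
j = np.arange(n)[:, None]
k = np.arange(1, n)[None, :]
cosines = np.cos(np.pi * (2 * j + 1) * k / (2 * n))
cosines /= np.linalg.norm(cosines, axis=0)
D = np.hstack([np.eye(n), cosines])

# One spike plus one cosine: sparse in the mixed dictionary,
# but dense in either component dictionary alone.
signal = 2.0 * D[:, 5] + 1.5 * D[:, n + 3]
coeffs, residual = matching_pursuit(signal, D, 10)
print(np.linalg.norm(residual) / np.linalg.norm(signal))  # shrinks quickly
```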

Relevance: 10.00%

Abstract:

The Semantic Web relies on carefully structured, well-defined data to allow machines to communicate with and understand one another. In many domains (e.g. geospatial) the data being described contains some uncertainty, often due to incomplete knowledge; meaningful processing of this data requires these uncertainties to be carefully analysed and integrated into the process chain. Currently, within the Semantic Web there is no standard mechanism for the interoperable description and exchange of uncertain information, which renders the automated processing of such information implausible, particularly where error must be considered and captured as it propagates through a processing sequence. In particular, we adopt a Bayesian perspective and focus on the case where the inputs/outputs are naturally treated as random variables. This paper discusses a solution to the problem in the form of the Uncertainty Markup Language (UncertML). UncertML is a conceptual model, realised as an XML schema, that allows uncertainty to be quantified in a variety of ways, i.e. as realisations, statistics and probability distributions. UncertML is based upon a soft-typed XML schema design that provides a generic framework from which any statistic or distribution may be created. Making extensive use of Geography Markup Language (GML) dictionaries, UncertML provides a collection of definitions for common uncertainty types. Containing both written descriptions and mathematical functions, encoded as MathML, the definitions within these dictionaries provide a robust mechanism for defining any statistic or distribution and can be easily extended. Uniform Resource Identifiers (URIs) are used to introduce semantics to the soft-typed elements by linking to these dictionary definitions. The INTAMAP (INTeroperability and Automated MAPping) project provides a use case for UncertML.
This paper demonstrates how observation errors can be quantified using UncertML and wrapped within an Observations & Measurements (O&M) Observation. The interpolation service uses the information within these observations to influence the prediction outcome. The output uncertainties may be encoded in a variety of UncertML types, e.g. a series of marginal Gaussian distributions, a set of statistics, such as the first three marginal moments, or a set of realisations from a Monte Carlo treatment. Quantifying and propagating uncertainty in this way allows such interpolation results to be consumed by other services. This could form part of a risk management chain or a decision support system, and ultimately paves the way for complex data processing chains in the Semantic Web.
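As a purely schematic illustration of the soft-typed design described above — the element names and dictionary URI below are hypothetical stand-ins, not taken from the actual UncertML schema — a statistic might be encoded by pointing a generic element at a dictionary definition:

```xml
<!-- Hypothetical sketch: names and URI are illustrative, not the real schema. -->
<Statistic definition="http://example.org/uncertml/dictionary/statistics/mean">
  <value uom="degC">12.4</value>
</Statistic>
```

The point of the soft typing is that a new statistic or distribution needs only a new dictionary entry, not a schema change.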

Relevance: 10.00%

Abstract:

Corpora (large collections of written and/or spoken text, stored and accessed electronically) provide a means of investigating language that is of growing importance both academically and professionally. Corpora are now routinely used in the following fields: the production of dictionaries and other reference materials; the development of aids to translation; language teaching materials; the investigation of ideologies and cultural assumptions; natural language processing; and the investigation of all aspects of linguistic behaviour, including vocabulary, grammar and pragmatics.

Relevance: 10.00%

Abstract:

Many people think of language as words. Words are small, convenient units, especially in written English, where they are separated by spaces. Dictionaries seem to reinforce this idea, because entries are arranged as a list of alphabetically ordered words. Traditionally, linguists and teachers focused on grammar and treated words as self-contained units of meaning, which fill the available grammatical slots in a sentence. More recently, attention has shifted from grammar to lexis, and from words to chunks. Dictionary headwords are convenient points of access for the user, but modern dictionary entries usually deal with chunks, because meanings often do not arise from individual words, but from the chunks in which the words occur. Corpus research confirms that native speakers of a language actually work with larger "chunks" of language. This paper will show that teachers and learners will benefit from treating language as chunks rather than words.

Relevance: 10.00%

Abstract:

Cooperative greedy pursuit strategies are considered for approximating a signal partition subject to a global constraint on sparsity. The approach aims at producing a high-quality sparse approximation of the whole signal using highly coherent redundant dictionaries. The cooperation takes place by ranking the partition units for their sequential stepwise approximation, and is realized by means of (i) forward steps for the upgrading of an approximation and/or (ii) backward steps for the corresponding downgrading. The advantage of the strategy is illustrated by the approximation of music signals using redundant trigonometric dictionaries. In addition to rendering stunning improvements in sparsity with respect to the concomitant trigonometric basis, these dictionaries enable a fast implementation of the approach via the Fast Fourier Transform.
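The forward/backward interplay across partition units is specific to the cooperative strategy, but the back-projection ingredient can be illustrated by standard orthogonal matching pursuit, where each greedy forward selection is followed by a least-squares re-fit over the atoms chosen so far. A minimal sketch with an assumed spike-plus-cosine dictionary (not the authors' algorithm):

```python
import numpy as np

def omp(signal, D, n_atoms):
    """Orthogonal matching pursuit: a greedy forward step selects an atom;
    a least-squares re-fit (back-projection) over the chosen atoms then
    keeps the residual orthogonal to the current support."""
    residual = signal.astype(float)
    support, coef = [], np.zeros(0)
    for _ in range(n_atoms):
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k not in support:
            support.append(k)
        sub = D[:, support]
        coef, *_ = np.linalg.lstsq(sub, signal, rcond=None)
        residual = signal - sub @ coef
    return support, coef, residual

# A redundant dictionary: Dirac spikes plus unit-norm cosine atoms.
n = 64
j = np.arange(n)[:, None]
k = np.arange(1, n)[None, :]
cosines = np.cos(np.pi * (2 * j + 1) * k / (2 * n))
cosines /= np.linalg.norm(cosines, axis=0)
D = np.hstack([np.eye(n), cosines])

signal = 2.0 * D[:, 5] + 1.5 * D[:, n + 3]
support, coef, residual = omp(signal, D, 2)
print(sorted(support))  # the two active atoms, recovered exactly
```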

Relevance: 10.00%

Abstract:

I describe and discuss a series of court cases which focus upon decoding the meaning of slang terms. Examples include sexual slang used in a description by a child and an Internet Relay Chat containing a conspiracy to murder. I consider the task presented by these cases for the forensic linguist and the roles the linguist may assume in determining the meaning of slang terms for the courts. These roles are identified as linguist as naïve interpreter, lexicographer, case researcher and cultural mediator. Each of these roles suggests different strategies that might be used, from consulting formal slang dictionaries and less formal Internet sources, to collecting case-specific corpora and examining all the extraneous material in a particular case. Each strategy is evaluated both in terms of the strength of evidence provided and its applicability to the forensic context.

Relevance: 10.00%

Abstract:

A dedicated algorithm for the sparse spectral representation of music sound is presented. The goal is to enable the representation of a piece of music signal as a linear superposition of as few spectral components as possible, without affecting the quality of the reproduction. A representation of this nature is said to be sparse. In the present context, sparsity is accomplished by greedy selection of the spectral components from an overcomplete set, called a dictionary. The proposed algorithm is tailored to be applied with trigonometric dictionaries. Its distinctive feature is that it avoids the need for the actual construction of the whole dictionary by implementing the required operations via the fast Fourier transform. The achieved sparsity is theoretically equivalent to that rendered by the orthogonal matching pursuit (OMP) method. The contribution of the proposed dedicated implementation is to extend the applicability of the standard OMP algorithm by reducing its storage and computational demands. The suitability of the approach for producing sparse spectral representations is illustrated by comparison with the traditional method, in the line of the short-time Fourier transform, involving only the corresponding orthonormal trigonometric basis.
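The key implementation idea — computing correlations against every atom without ever materializing the dictionary — can be sketched with complex exponential atoms, where the FFT evaluates all the inner products at once. The paper's dictionaries are real trigonometric ones, so its actual implementation differs; this only shows the generic identity being exploited:

```python
import numpy as np

n = 512
rng = np.random.default_rng(0)
residual = rng.standard_normal(n)

# Explicit DFT "dictionary": row k is the atom exp(-2*pi*i*k*j/n).
idx = np.arange(n)
F = np.exp(-2j * np.pi * np.outer(idx, idx) / n)

explicit = F @ residual      # O(n^2) time, O(n^2) memory for the dictionary
fast = np.fft.fft(residual)  # O(n log n), no dictionary stored at all

print(np.allclose(explicit, fast))  # True
```

The same trick, applied blockwise at each greedy step, is what lets OMP-style selection scale to dictionaries far too large to build explicitly.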