973 resultados para Compressed text search


Relevância:

20.00% 20.00%

Publicador:

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This thesis investigates aspects of encoding the speech spectrum at low bit rates, with extensions to the effect of such coding on automatic speaker identification. Vector quantization (VQ) is a technique for jointly quantizing a block of samples at once, in order to reduce the bit rate of a coding system. The major drawback in using VQ is the complexity of the encoder. Recent research has indicated the potential applicability of the VQ method to speech when product code vector quantization (PCVQ) techniques are utilized. The focus of this research is the efficient representation, calculation and utilization of the speech model as stored in the PCVQ codebook. In this thesis, several VQ approaches are evaluated, and the efficacy of two training algorithms is compared experimentally. It is then shown that these productcode vector quantization algorithms may be augmented with lossless compression algorithms, thus yielding an improved overall compression rate. An approach using a statistical model for the vector codebook indices for subsequent lossless compression is introduced. This coupling of lossy compression and lossless compression enables further compression gain. It is demonstrated that this approach is able to reduce the bit rate requirement from the current 24 bits per 20 millisecond frame to below 20, using a standard spectral distortion metric for comparison. Several fast-search VQ methods for use in speech spectrum coding have been evaluated. The usefulness of fast-search algorithms is highly dependent upon the source characteristics and, although previous research has been undertaken for coding of images using VQ codebooks trained with the source samples directly, the product-code structured codebooks for speech spectrum quantization place new constraints on the search methodology. The second major focus of the research is an investigation of the effect of lowrate spectral compression methods on the task of automatic speaker identification. The motivation for this aspect of the research arose from a need to simultaneously preserve the speech quality and intelligibility and to provide for machine-based automatic speaker recognition using the compressed speech. This is important because there are several emerging applications of speaker identification where compressed speech is involved. Examples include mobile communications where the speech has been highly compressed, or where a database of speech material has been assembled and stored in compressed form. Although these two application areas have the same objective - that of maximizing the identification rate - the starting points are quite different. On the one hand, the speech material used for training the identification algorithm may or may not be available in compressed form. On the other hand, the new test material on which identification is to be based may only be available in compressed form. Using the spectral parameters which have been stored in compressed form, two main classes of speaker identification algorithm are examined. Some studies have been conducted in the past on bandwidth-limited speaker identification, but the use of short-term spectral compression deserves separate investigation. Combining the major aspects of the research, some important design guidelines for the construction of an identification model when based on the use of compressed speech are put forward.

Relevância:

20.00% 20.00%

Publicador:

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper describes the development and evaluation of a tactical lane change model using the forward search algorithm, for use in a traffic simulator. The tactical lane change model constructs a set of possible choices of near-term maneuver sequences available to the driver and selects the lane change action at the present time to realize the best maneuver plan. Including near term maneuver planning in the driver behavior model can allow a better representation of the complex interactions in situations such as a weaving section and high-occupancy vehicle (HOV) lane systems where drivers must weave across several lanes in order to access the HOV lanes. To support the investigation, a longitudinal control model and a basic lane change model were also analyzed. The basic lane change model is similar to those used by today's commonly-used traffic simulators. Parameters in all models were best-fit estimated for selected vehicles from a real-world freeway vehicle trajectory data set. The best-fit estimation procedure minimizes the discrepancy between the model vehicle and real vehicle's trajectories. With the best fit parameters, the proposed tactical lane change model gave a better overall performance for a greater number of cases than the basic lane change model.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Digital collections are growing exponentially in size as the information age takes a firm grip on all aspects of society. As a result Information Retrieval (IR) has become an increasingly important area of research. It promises to provide new and more effective ways for users to find information relevant to their search intentions. Document clustering is one of the many tools in the IR toolbox and is far from being perfected. It groups documents that share common features. This grouping allows a user to quickly identify relevant information. If these groups are misleading then valuable information can accidentally be ignored. There- fore, the study and analysis of the quality of document clustering is important. With more and more digital information available, the performance of these algorithms is also of interest. An algorithm with a time complexity of O(n2) can quickly become impractical when clustering a corpus containing millions of documents. Therefore, the investigation of algorithms and data structures to perform clustering in an efficient manner is vital to its success as an IR tool. Document classification is another tool frequently used in the IR field. It predicts categories of new documents based on an existing database of (doc- ument, category) pairs. Support Vector Machines (SVM) have been found to be effective when classifying text documents. As the algorithms for classifica- tion are both efficient and of high quality, the largest gains can be made from improvements to representation. Document representations are vital for both clustering and classification. Representations exploit the content and structure of documents. Dimensionality reduction can improve the effectiveness of existing representations in terms of quality and run-time performance. Research into these areas is another way to improve the efficiency and quality of clustering and classification results. Evaluating document clustering is a difficult task. Intrinsic measures of quality such as distortion only indicate how well an algorithm minimised a sim- ilarity function in a particular vector space. Intrinsic comparisons are inherently limited by the given representation and are not comparable between different representations. Extrinsic measures of quality compare a clustering solution to a “ground truth” solution. This allows comparison between different approaches. As the “ground truth” is created by humans it can suffer from the fact that not every human interprets a topic in the same manner. Whether a document belongs to a particular topic or not can be subjective.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Theatre Audience Contribution introduces a new approach to theatre audience research: audience contribution through the post-performance discussion. This volume considers the physical and vocal behaviour of audience members as an integral part of the theatrical event that changes, adds to and informs the theatrical experience. Post-performance discussions, although rising in popularity, are yet an under-explored and under-utilised avenue for audience contribution. Beginning with an overview of reception theory and the historical role of theatre audiences, the author introduces a new method for the facilitation of post-performance discussions that encourages audience contribution and privileges the audience voice. Two case studies explore post-performance discussions that inform the theatrical event and discover a new role for the contemporary audience: audience critic. This accessible volume has significant implications for theatre theorists, practitioners and audiences alike.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Consider a person searching electronic health records, a search for the term ‘cracked skull’ should return documents that contain the term ‘cranium fracture’. A information retrieval systems is required that matches concepts, not just keywords. Further more, determining relevance of a query to a document requires inference – its not simply matching concepts. For example a document containing ‘dialysis machine’ should align with a query for ‘kidney disease’. Collectively we describe this problem as the ‘semantic gap’ – the difference between the raw medical data and the way a human interprets it. This paper presents an approach to semantic search of health records by combining two previous approaches: an ontological approach using the SNOMED CT medical ontology; and a distributional approach using semantic space vector space models. Our approach will be applied to a specific problem in health informatics: the matching of electronic patient records to clinical trials.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This study investigates the application of local search methods on the railway junction traffic conflict-resolution problem, with the objective of attaining a quick and reasonable solution. A procedure based on local search relies on finding a better solution than the current one by a search in the neighbourhood of the current one. The structure of neighbourhood is therefore very important to an efficient local search procedure. In this paper, the formulation of the structure of the solution, which is the right-of-way sequence assignment, is first described. Two new neighbourhood definitions are then proposed and the performance of the corresponding local search procedures is evaluated by simulation. It has been shown that they provide similar results but they can be used to handle different traffic conditions and system requirements.