923 resultados para Extensions


Relevância:

10.00% 10.00%

Publicador:

Resumo:

This thesis investigates aspects of encoding the speech spectrum at low bit rates, with extensions to the effect of such coding on automatic speaker identification. Vector quantization (VQ) is a technique for jointly quantizing a block of samples at once, in order to reduce the bit rate of a coding system. The major drawback in using VQ is the complexity of the encoder. Recent research has indicated the potential applicability of the VQ method to speech when product code vector quantization (PCVQ) techniques are utilized. The focus of this research is the efficient representation, calculation and utilization of the speech model as stored in the PCVQ codebook. In this thesis, several VQ approaches are evaluated, and the efficacy of two training algorithms is compared experimentally. It is then shown that these productcode vector quantization algorithms may be augmented with lossless compression algorithms, thus yielding an improved overall compression rate. An approach using a statistical model for the vector codebook indices for subsequent lossless compression is introduced. This coupling of lossy compression and lossless compression enables further compression gain. It is demonstrated that this approach is able to reduce the bit rate requirement from the current 24 bits per 20 millisecond frame to below 20, using a standard spectral distortion metric for comparison. Several fast-search VQ methods for use in speech spectrum coding have been evaluated. The usefulness of fast-search algorithms is highly dependent upon the source characteristics and, although previous research has been undertaken for coding of images using VQ codebooks trained with the source samples directly, the product-code structured codebooks for speech spectrum quantization place new constraints on the search methodology. The second major focus of the research is an investigation of the effect of lowrate spectral compression methods on the task of automatic speaker identification. The motivation for this aspect of the research arose from a need to simultaneously preserve the speech quality and intelligibility and to provide for machine-based automatic speaker recognition using the compressed speech. This is important because there are several emerging applications of speaker identification where compressed speech is involved. Examples include mobile communications where the speech has been highly compressed, or where a database of speech material has been assembled and stored in compressed form. Although these two application areas have the same objective - that of maximizing the identification rate - the starting points are quite different. On the one hand, the speech material used for training the identification algorithm may or may not be available in compressed form. On the other hand, the new test material on which identification is to be based may only be available in compressed form. Using the spectral parameters which have been stored in compressed form, two main classes of speaker identification algorithm are examined. Some studies have been conducted in the past on bandwidth-limited speaker identification, but the use of short-term spectral compression deserves separate investigation. Combining the major aspects of the research, some important design guidelines for the construction of an identification model when based on the use of compressed speech are put forward.

Relevância:

10.00% 10.00%

Publicador:

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This dissertation is primarily an applied statistical modelling investigation, motivated by a case study comprising real data and real questions. Theoretical questions on modelling and computation of normalization constants arose from pursuit of these data analytic questions. The essence of the thesis can be described as follows. Consider binary data observed on a two-dimensional lattice. A common problem with such data is the ambiguity of zeroes recorded. These may represent zero response given some threshold (presence) or that the threshold has not been triggered (absence). Suppose that the researcher wishes to estimate the effects of covariates on the binary responses, whilst taking into account underlying spatial variation, which is itself of some interest. This situation arises in many contexts and the dingo, cypress and toad case studies described in the motivation chapter are examples of this. Two main approaches to modelling and inference are investigated in this thesis. The first is frequentist and based on generalized linear models, with spatial variation modelled by using a block structure or by smoothing the residuals spatially. The EM algorithm can be used to obtain point estimates, coupled with bootstrapping or asymptotic MLE estimates for standard errors. The second approach is Bayesian and based on a three- or four-tier hierarchical model, comprising a logistic regression with covariates for the data layer, a binary Markov Random field (MRF) for the underlying spatial process, and suitable priors for parameters in these main models. The three-parameter autologistic model is a particular MRF of interest. Markov chain Monte Carlo (MCMC) methods comprising hybrid Metropolis/Gibbs samplers is suitable for computation in this situation. Model performance can be gauged by MCMC diagnostics. Model choice can be assessed by incorporating another tier in the modelling hierarchy. This requires evaluation of a normalization constant, a notoriously difficult problem. Difficulty with estimating the normalization constant for the MRF can be overcome by using a path integral approach, although this is a highly computationally intensive method. Different methods of estimating ratios of normalization constants (N Cs) are investigated, including importance sampling Monte Carlo (ISMC), dependent Monte Carlo based on MCMC simulations (MCMC), and reverse logistic regression (RLR). I develop an idea present though not fully developed in the literature, and propose the Integrated mean canonical statistic (IMCS) method for estimating log NC ratios for binary MRFs. The IMCS method falls within the framework of the newly identified path sampling methods of Gelman & Meng (1998) and outperforms ISMC, MCMC and RLR. It also does not rely on simplifying assumptions, such as ignoring spatio-temporal dependence in the process. A thorough investigation is made of the application of IMCS to the three-parameter Autologistic model. This work introduces background computations required for the full implementation of the four-tier model in Chapter 7. Two different extensions of the three-tier model to a four-tier version are investigated. The first extension incorporates temporal dependence in the underlying spatio-temporal process. The second extensions allows the successes and failures in the data layer to depend on time. The MCMC computational method is extended to incorporate the extra layer. A major contribution of the thesis is the development of a fully Bayesian approach to inference for these hierarchical models for the first time. Note: The author of this thesis has agreed to make it open access but invites people downloading the thesis to send her an email via the 'Contact Author' function.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Lateral gene transfer (LGT) from prokaryotes to microbial eukaryotes is usually detected by chance through genome-sequencing projects. Here, we explore a different, hypothesis-driven approach. We show that the fitness advantage associated with the transferred gene, typically invoked only in retrospect, can be used to design a functional screen capable of identifying postulated LGT cases. We hypothesized that beta-glucuronidase (gus) genes may be prone to LGT from bacteria to fungi (thought to lack gus) because this would enable fungi to utilize glucuronides in vertebrate urine as a carbon source. Using an enrichment procedure based on a glucose-releasing glucuronide analog (cellobiouronic acid), we isolated two gus(+) ascomycete fungi from soils (Penicillium canescens and Scopulariopsis sp.). A phylogenetic analysis suggested that their gus genes, as well as the gus genes identified in genomic sequences of the ascomycetes Aspergillus nidulans and Gibberella zeae, had been introgressed laterally from high-GC gram(+) bacteria. Two such bacteria (Arthrobacter spp.), isolated together with the gus(+) fungi, appeared to be the descendants of a bacterial donor organism from which gus had been transferred to fungi. This scenario was independently supported by similar substrate affinities of the encoded beta-glucuronidases, the absence of introns from fungal gus genes, and the similarity between the signal peptide-encoding 5' extensions of some fungal gus genes and the Arthrobacter sequences upstream of gus. Differences in the sequences of the fungal 5' extensions suggested at least two separate introgression events after the divergence of the two main Euascomycete classes. We suggest that deposition of glucuronides on soils as a result of the colonization of land by vertebrates may have favored LGT of gus from bacteria to fungi in soils.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Component software has many benefits, most notably increased software re-use; however, the component software process places heavy burdens on programming language technology, which modern object-oriented programming languages do not address. In particular, software components require specifications that are both sufficiently expressive and sufficiently abstract, and, where possible, these specifications should be checked formally by the programming language. This dissertation presents a programming language called Mentok that provides two novel programming language features enabling improved specification of stateful component roles. Negotiable interfaces are interface types extended with protocols, and allow specification of changing method availability, including some patterns of out-calls and re-entrance. Type layers are extensions to module signatures that allow specification of abstract control flow constraints through the interfaces of a component-based application. Development of Mentok's unique language features included creation of MentokC, the Mentok compiler, and formalization of key properties of Mentok in mini-languages called MentokP and MentokL.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Establishing a nationwide Electronic Health Record system has become a primary objective for many countries around the world, including Australia, in order to improve the quality of healthcare while at the same time decreasing its cost. Doing so will require federating the large number of patient data repositories currently in use throughout the country. However, implementation of EHR systems is being hindered by several obstacles, among them concerns about data privacy and trustworthiness. Current IT solutions fail to satisfy patients’ privacy desires and do not provide a trustworthiness measure for medical data. This thesis starts with the observation that existing EHR system proposals suer from six serious shortcomings that aect patients’ privacy and safety, and medical practitioners’ trust in EHR data: accuracy and privacy concerns over linking patients’ existing medical records; the inability of patients to have control over who accesses their private data; the inability to protect against inferences about patients’ sensitive data; the lack of a mechanism for evaluating the trustworthiness of medical data; and the failure of current healthcare workflow processes to capture and enforce patient’s privacy desires. Following an action research method, this thesis addresses the above shortcomings by firstly proposing an architecture for linking electronic medical records in an accurate and private way where patients are given control over what information can be revealed about them. This is accomplished by extending the structure and protocols introduced in federated identity management to link a patient’s EHR to his existing medical records by using pseudonym identifiers. Secondly, a privacy-aware access control model is developed to satisfy patients’ privacy requirements. The model is developed by integrating three standard access control models in a way that gives patients access control over their private data and ensures that legitimate uses of EHRs are not hindered. Thirdly, a probabilistic approach for detecting and restricting inference channels resulting from publicly-available medical data is developed to guard against indirect accesses to a patient’s private data. This approach is based upon a Bayesian network and the causal probabilistic relations that exist between medical data fields. The resulting definitions and algorithms show how an inference channel can be detected and restricted to satisfy patients’ expressed privacy goals. Fourthly, a medical data trustworthiness assessment model is developed to evaluate the quality of medical data by assessing the trustworthiness of its sources (e.g. a healthcare provider or medical practitioner). In this model, Beta and Dirichlet reputation systems are used to collect reputation scores about medical data sources and these are used to compute the trustworthiness of medical data via subjective logic. Finally, an extension is made to healthcare workflow management processes to capture and enforce patients’ privacy policies. This is accomplished by developing a conceptual model that introduces new workflow notions to make the workflow management system aware of a patient’s privacy requirements. These extensions are then implemented in the YAWL workflow management system.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Where object-oriented languages deal with objects as described by classes, model-driven development uses models, as graphs of interconnected objects, described by metamodels. A number of new languages have been and continue to be developed for this model- based paradigm, both for model transformation and for general programming using models. Many of these use single-object approaches to typing, derived from solutions found in object-oriented systems, while others use metamodels as model types, but without a clear notion of polymorphism. Both of these approaches lead to brittle and overly restrictive reuse characteristics. In this paper we propose a simple extension to object-oriented typing to better cater for a model-oriented context, including a simple strategy for typing models as a collection of interconnected objects. We suggest extensions to existing type system formalisms to support these concepts and their manipulation. Using a simple example we show how this extended approach permits more flexible reuse, while preserving type safety.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Statisticians along with other scientists have made significant computational advances that enable the estimation of formerly complex statistical models. The Bayesian inference framework combined with Markov chain Monte Carlo estimation methods such as the Gibbs sampler enable the estimation of discrete choice models such as the multinomial logit (MNL) model. MNL models are frequently applied in transportation research to model choice outcomes such as mode, destination, or route choices or to model categorical outcomes such as crash outcomes. Recent developments allow for the modification of the potentially limiting assumptions of MNL such as the independence from irrelevant alternatives (IIA) property. However, relatively little transportation-related research has focused on Bayesian MNL models, the tractability of which is of great value to researchers and practitioners alike. This paper addresses MNL model specification issues in the Bayesian framework, such as the value of including prior information on parameters, allowing for nonlinear covariate effects, and extensions to random parameter models, so changing the usual limiting IIA assumption. This paper also provides an example that demonstrates, using route-choice data, the considerable potential of the Bayesian MNL approach with many transportation applications. This paper then concludes with a discussion of the pros and cons of this Bayesian approach and identifies when its application is worthwhile

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Streaming SIMD Extensions (SSE) is a unique feature embedded in the Pentium III and P4 classes of microprocessors. By fully exploiting SSE, parallel algorithms can be implemented on a standard personal computer and a theoretical speedup of four can be achieved. In this paper, we demonstrate the implementation of a parallel LU matrix decomposition algorithm for solving power systems network equations with SSE and discuss advantages and disadvantages of this approach.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Streaming SIMD Extensions (SSE) is a unique feature embedded in the Pentium III class of microprocessors. By fully exploiting SSE, parallel algorithms can be implemented on a standard personal computer and a theoretical speedup of four can be achieved. In this paper, we demonstrate the implementation of a parallel LU matrix decomposition algorithm for solving power systems network equations with SSE and discuss advantages and disadvantages of this approach.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Streaming SIMD Extensions (SSE) is a unique feature embedded in the Pentium III and IV classes of microprocessors. By fully exploiting SSE, parallel algorithms can be implemented on a standard personal computer and a theoretical speedup of four can be achieved. In this paper, we demonstrate the implementation of a parallel LU matrix decomposition algorithm for solving linear systems with SSE and discuss advantages and disadvantages of this approach based on our experimental study.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Parallel computing is currently used in many engineering problems. However, because of limitations in curriculum design, it is not always possible to offer students specific formal teaching in this topic. Furthermore, parallel machines are still too expensive for many institutions. The latest microprocessors, such as Intel’s Pentium III and IV, embody single instruction multiple-data (SIMD) type parallel features, which makes them a viable solution for introducing parallel computing concepts to students. Final year projects have been initiated utilizing SSE (streaming SIMD extensions) features and it has been observed that students can easily learn parallel programming concepts after going through some programming exercises. They can now experiment with parallel algorithms on their own PCs at home. Keywords

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The use of appropriate features to characterize an output class or object is critical for all classification problems. This paper evaluates the capability of several spectral and texture features for object-based vegetation classification at the species level using airborne high resolution multispectral imagery. Image-objects as the basic classification unit were generated through image segmentation. Statistical moments extracted from original spectral bands and vegetation index image are used as feature descriptors for image objects (i.e. tree crowns). Several state-of-art texture descriptors such as Gray-Level Co-Occurrence Matrix (GLCM), Local Binary Patterns (LBP) and its extensions are also extracted for comparison purpose. Support Vector Machine (SVM) is employed for classification in the object-feature space. The experimental results showed that incorporating spectral vegetation indices can improve the classification accuracy and obtained better results than in original spectral bands, and using moments of Ratio Vegetation Index obtained the highest average classification accuracy in our experiment. The experiments also indicate that the spectral moment features also outperform or can at least compare with the state-of-art texture descriptors in terms of classification accuracy.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The Comprehensive Australian Study of Entrepreneurial Emergence (CAUSEE) is a research programme that aims to uncover the factors that initiate, hinder and facilitate the process of emergence of new economic activities and organizations. It is widely acknowledged that entrepreneurship is one of the most important forces shaping changes in a country’s economic landscape (Baumol 1968; Birch 1987; Acs 1999). An understanding of the process by which new economic activity and business entities emerge is vital (Gartner 1993; Sarasvathy 2001). An important development in the study of ‘nascent entrepreneurs’ and ‘firms in gestation’ was the Panel Study of Entrepreneurial Dynamics (PSED) (Gartner et al. 2004) and its extensions in Argentina, Canada, Greece, the Netherlands, Norway and Sweden. Yet while PSED I is an important first step towards systematically studying new venture emergence, it represents just the beginning of a stream of nascent venture studies – most notably PSED II is currently being undertaken in the US (2005– 10) (Reynolds and Curtin 2008).