888 resultados para Corpus Paroemiographorum Graecorum


Relevância:

20.00% 20.00%

Publicador:

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The effectiveness of higher-order spectral (HOS) phase features in speaker recognition is investigated by comparison with Mel Cepstral features on the same speech data. HOS phase features retain phase information from the Fourier spectrum unlikeMel–frequency Cepstral coefficients (MFCC). Gaussian mixture models are constructed from Mel– Cepstral features and HOS features, respectively, for the same data from various speakers in the Switchboard telephone Speech Corpus. Feature clusters, model parameters and classification performance are analyzed. HOS phase features on their own provide a correct identification rate of about 97% on the chosen subset of the corpus. This is the same level of accuracy as provided by MFCCs. Cluster plots and model parameters are compared to show that HOS phase features can provide complementary information to better discriminate between speakers.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In this paper we analyse a 600,000 word corpus comprised of policy statements produced within supranational, national, state and local legislatures about the nature and causes of(un)employment. We identify significant rhetorical and discursive features deployed by third sector (un)employment policy authors that function to extend their legislative grasp to encompass the most intimate aspects of human association.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In this article I outline and demonstrate a synthesis of the methods developed by Lemke (1998) and Martin (2000) for analyzing evaluations in English. I demonstrate the synthesis using examples from a 1.3-million-word technology policy corpus drawn from institutions at the local, state, national, and supranational levels. Lemke's (1998) critical model is organized around the broad 'evaluative dimensions' that are deployed to evaluate propositions and proposals in English. Martin's (2000) model is organized with a more overtly systemic-functional orientation around the concept of 'encoded feeling'. In applying both these models at different times, whilst recognizing their individual usefulness and complementarity, I found specific limitations that led me to work towards a synthesis of the two approaches. I also argue for the need to consider genre, media, and institutional aspects more explicitly when claiming intertextual and heteroglossic relations as the basis for inferred evaluations. A basic assertion made in this article is that the perceived Desirability of a process, person, circumstance, or thing is identical to its 'value'. But the Desirability of anything is a socially and thus historically conditioned attribution that requires significant amounts of institutional inculcation of other 'types' of value-appropriateness, importance, beauty, power, and so on. I therefore propose a method informed by critical discourse analysis (CDA) that sees evaluation as happening on at least four interdependent levels of abstraction.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Although internet chat is a significant aspect of many internet users’ lives, the manner in which participants in quasi-synchronous chat situations orient to issues of social and moral order remains to be studied in depth. The research presented here is therefore at the forefront of a continually developing area of study. This work contributes new insights into how members construct and make accountable the social and moral orders of an adult-oriented Internet Relay Chat (IRC) channel by addressing three questions: (1) What conversational resources do participants use in addressing matters of social and moral order? (2) How are these conversational resources deployed within IRC interaction? and (3) What interactional work is locally accomplished through use of these resources? A survey of the literature reveals considerable research in the field of computer-mediated communication, exploring both asynchronous and quasi-synchronous discussion forums. The research discussed represents a range of communication interests including group and collaborative interaction, the linguistic construction of social identity, and the linguistic features of online interaction. It is suggested that the present research differs from previous studies in three ways: (1) it focuses on the interaction itself, rather than the ways in which the medium affects the interaction; (2) it offers turn-by-turn analysis of interaction in situ; and (3) it discusses membership categories only insofar as they are shown to be relevant by participants through their talk. Through consideration of the literature, the present study is firmly situated within the broader computer-mediated communication field. Ethnomethodology, conversation analysis and membership categorization analysis were adopted as appropriate methodological approaches to explore the research focus on interaction in situ, and in particular to investigate the ways in which participants negotiate and co-construct social and moral orders in the course of their interaction. IRC logs collected from one chat room were analysed using a two-pass method, based on a modification of the approaches proposed by Pomerantz and Fehr (1997) and ten Have (1999). From this detailed examination of the data corpus three interaction topics are identified by means of which participants clearly orient to issues of social and moral order: challenges to rule violations, ‘trolling’ for cybersex, and experiences regarding the 9/11 attacks. Instances of these interactional topics are subjected to fine-grained analysis, to demonstrate the ways in which participants draw upon various interactional resources in their negotiation and construction of channel social and moral orders. While these analytical topics stand alone in individual focus, together they illustrate different instances in which participants’ talk serves to negotiate social and moral orders or collaboratively construct new orders. Building on the work of Vallis (2001), Chapter 5 illustrates three ways that rule violation is initiated as a channel discussion topic: (1) through a visible violation in open channel, (2) through an official warning or sanction by a channel operator regarding the violation, and (3) through a complaint or announcement of a rule violation by a non-channel operator participant. Once the topic has been initiated, it is shown to become available as a topic for others, including the perceived violator. The fine-grained analysis of challenges to rule violations ultimately demonstrates that channel participants orient to the rules as a resource in developing categorizations of both the rule violation and violator. These categorizations are contextual in that they are locally based and understood within specific contexts and practices. Thus, it is shown that compliance with rules and an orientation to rule violations as inappropriate within the social and moral orders of the channel serves two purposes: (1) to orient the speaker as a group member, and (2) to reinforce the social and moral orders of the group. Chapter 6 explores a particular type of rule violation, solicitations for ‘cybersex’ known in IRC parlance as ‘trolling’. In responding to trolling violations participants are demonstrated to use affiliative and aggressive humour, in particular irony, sarcasm and insults. These conversational resources perform solidarity building within the group, positioning non-Troll respondents as compliant group members. This solidarity work is shown to have three outcomes: (1) consensus building, (2) collaborative construction of group membership, and (3) the continued construction and negotiation of existing social and moral orders. Chapter 7, the final data analysis chapter, offers insight into how participants, in discussing the events of 9/11 on the actual day, collaboratively constructed new social and moral orders, while orienting to issues of appropriate and reasonable emotional responses. This analysis demonstrates how participants go about ‘doing being ordinary’ (Sacks, 1992b) in formulating their ‘first thoughts’ (Jefferson, 2004). Through sharing their initial impressions of the event, participants perform support work within the interaction, in essence working to normalize both the event and their initial misinterpretation of it. Normalising as a support work mechanism is also shown in relation to participants constructing the ‘quiet’ following the event as unusual. Normalising is accomplished by reference to the indexical ‘it’ and location formulations, which participants use both to negotiate who can claim to experience the ‘unnatural quiet’ and to identify the extent of the quiet. Through their talk participants upgrade the quiet from something legitimately experienced by one person in a particular place to something that could be experienced ‘anywhere’, moving the phenomenon from local to global provenance. With its methodological design and detailed analysis and findings, this research contributes to existing knowledge in four ways. First, it shows how rules are used by participants as a resource in negotiating and constructing social and moral orders. Second, it demonstrates that irony, sarcasm and insults are three devices of humour which can be used to perform solidarity work and reinforce existing social and moral orders. Third, it demonstrates how new social and moral orders are collaboratively constructed in relation to extraordinary events, which serve to frame the event and evoke reasonable responses for participants. And last, the detailed analysis and findings further support the use of conversation analysis and membership categorization as valuable methods for approaching quasi-synchronous computer-mediated communication.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

An information filtering (IF) system monitors an incoming document stream to find the documents that match the information needs specified by the user profiles. To learn to use the user profiles effectively is one of the most challenging tasks when developing an IF system. With the document selection criteria better defined based on the users’ needs, filtering large streams of information can be more efficient and effective. To learn the user profiles, term-based approaches have been widely used in the IF community because of their simplicity and directness. Term-based approaches are relatively well established. However, these approaches have problems when dealing with polysemy and synonymy, which often lead to an information overload problem. Recently, pattern-based approaches (or Pattern Taxonomy Models (PTM) [160]) have been proposed for IF by the data mining community. These approaches are better at capturing sematic information and have shown encouraging results for improving the effectiveness of the IF system. On the other hand, pattern discovery from large data streams is not computationally efficient. Also, these approaches had to deal with low frequency pattern issues. The measures used by the data mining technique (for example, “support” and “confidences”) to learn the profile have turned out to be not suitable for filtering. They can lead to a mismatch problem. This thesis uses the rough set-based reasoning (term-based) and pattern mining approach as a unified framework for information filtering to overcome the aforementioned problems. This system consists of two stages - topic filtering and pattern mining stages. The topic filtering stage is intended to minimize information overloading by filtering out the most likely irrelevant information based on the user profiles. A novel user-profiles learning method and a theoretical model of the threshold setting have been developed by using rough set decision theory. The second stage (pattern mining) aims at solving the problem of the information mismatch. This stage is precision-oriented. A new document-ranking function has been derived by exploiting the patterns in the pattern taxonomy. The most likely relevant documents were assigned higher scores by the ranking function. Because there is a relatively small amount of documents left after the first stage, the computational cost is markedly reduced; at the same time, pattern discoveries yield more accurate results. The overall performance of the system was improved significantly. The new two-stage information filtering model has been evaluated by extensive experiments. Tests were based on the well-known IR bench-marking processes, using the latest version of the Reuters dataset, namely, the Reuters Corpus Volume 1 (RCV1). The performance of the new two-stage model was compared with both the term-based and data mining-based IF models. The results demonstrate that the proposed information filtering system outperforms significantly the other IF systems, such as the traditional Rocchio IF model, the state-of-the-art term-based models, including the BM25, Support Vector Machines (SVM), and Pattern Taxonomy Model (PTM).

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In this paper, we propose an unsupervised segmentation approach, named "n-gram mutual information", or NGMI, which is used to segment Chinese documents into n-character words or phrases, using language statistics drawn from the Chinese Wikipedia corpus. The approach alleviates the tremendous effort that is required in preparing and maintaining the manually segmented Chinese text for training purposes, and manually maintaining ever expanding lexicons. Previously, mutual information was used to achieve automated segmentation into 2-character words. The NGMI approach extends the approach to handle longer n-character words. Experiments with heterogeneous documents from the Chinese Wikipedia collection show good results.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The problem of impostor dataset selection for GMM-based speaker verification is addressed through the recently proposed data-driven background dataset refinement technique. The SVM-based refinement technique selects from a candidate impostor dataset those examples that are most frequently selected as support vectors when training a set of SVMs on a development corpus. This study demonstrates the versatility of dataset refinement in the task of selecting suitable impostor datasets for use in GMM-based speaker verification. The use of refined Z- and T-norm datasets provided performance gains of 15% in EER in the NIST 2006 SRE over the use of heuristically selected datasets. The refined datasets were shown to generalise well to the unseen data of the NIST 2008 SRE.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A data-driven background dataset refinement technique was recently proposed for SVM based speaker verification. This method selects a refined SVM background dataset from a set of candidate impostor examples after individually ranking examples by their relevance. This paper extends this technique to the refinement of the T-norm dataset for SVM-based speaker verification. The independent refinement of the background and T-norm datasets provides a means of investigating the sensitivity of SVM-based speaker verification performance to the selection of each of these datasets. Using refined datasets provided improvements of 13% in min. DCF and 9% in EER over the full set of impostor examples on the 2006 SRE corpus with the majority of these gains due to refinement of the T-norm dataset. Similar trends were observed for the unseen data of the NIST 2008 SRE.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This work presents an extended Joint Factor Analysis model including explicit modelling of unwanted within-session variability. The goals of the proposed extended JFA model are to improve verification performance with short utterances by compensating for the effects of limited or imbalanced phonetic coverage, and to produce a flexible JFA model that is effective over a wide range of utterance lengths without adjusting model parameters such as retraining session subspaces. Experimental results on the 2006 NIST SRE corpus demonstrate the flexibility of the proposed model by providing competitive results over a wide range of utterance lengths without retraining and also yielding modest improvements in a number of conditions over current state-of-the-art.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This article explores two matrix methods to induce the ``shades of meaning" (SoM) of a word. A matrix representation of a word is computed from a corpus of traces based on the given word. Non-negative Matrix Factorisation (NMF) and Singular Value Decomposition (SVD) compute a set of vectors corresponding to a potential shade of meaning. The two methods were evaluated based on loss of conditional entropy with respect to two sets of manually tagged data. One set reflects concepts generally appearing in text, and the second set comprises words used for investigations into word sense disambiguation. Results show that for NMF consistently outperforms SVD for inducing both SoM of general concepts as well as word senses. The problem of inducing the shades of meaning of a word is more subtle than that of word sense induction and hence relevant to thematic analysis of opinion where nuances of opinion can arise.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In this paper we argue that the term “capitalism” is no longer useful for understanding the current system of political economic relations in which we live. Rather, we argue that the system can be more usefully characterised as neofeudal corporatism. Using examples drawn from a 300,000 word corpus of public utterances by three political leaders from the “coalition of the willing”— George W. Bush, Tony Blair, and John Howard—we show some defining characteristics of this relatively new system and how they are manifest in political language about the invasion of Iraq.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In this study, a nanofiber mesh made by co-electrospinning medical grade poly(epsilon-caprolactone) and collagen (mPCL/Col) was fabricated and studied. Its mechanical properties and characteristics were analyzed and compared to mPCL meshes. mPCL/Col meshes showed a reduction in strength but an increase in ductility when compared to PCL meshes. In vitro assays revealed that mPCL/Col supported the attachment and proliferation of smooth muscle cells on both sides of the mesh. In vivo studies in the corpus cavernosa of rabbits revealed that the mPCL/Col scaffold used in conjunction with autologous smooth muscle cells resulted in better integration with host tissue when compared to cell free scaffolds. On a cellular level preseeded scaffolds showed a minimized foreign body reaction.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

So much has been made over the crisis in English literature as field, as corpus, and as canon in recent years, that some of it undoubtedly has spilled over into English education. This has been the case in predominantly English-speaking Anglo-American and Commonwealth nations, as well as in those postcolonial states where English remains the medium of instruction and lingua franca of economic and cultural elites. Yet to attribute the pressures for change in pedagogic practice to academic paradigm shift per se would prop up the shaky axiom that English education is forever caught in some kind of perverse evolutionary time-lag, parasitic of university literary studies. I, too, believe that English education has reached a crucial moment in its history, but that this moment is contingent upon the changing demographics, cultural knowledges, and practices of economic globalization.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Automatic Speech Recognition (ASR) has matured into a technology which is becoming more common in our everyday lives, and is emerging as a necessity to minimise driver distraction when operating in-car systems such as navigation and infotainment. In “noise-free” environments, word recognition performance of these systems has been shown to approach 100%, however this performance degrades rapidly as the level of background noise is increased. Speech enhancement is a popular method for making ASR systems more ro- bust. Single-channel spectral subtraction was originally designed to improve hu- man speech intelligibility and many attempts have been made to optimise this algorithm in terms of signal-based metrics such as maximised Signal-to-Noise Ratio (SNR) or minimised speech distortion. Such metrics are used to assess en- hancement performance for intelligibility not speech recognition, therefore mak- ing them sub-optimal ASR applications. This research investigates two methods for closely coupling subtractive-type enhancement algorithms with ASR: (a) a computationally-efficient Mel-filterbank noise subtraction technique based on likelihood-maximisation (LIMA), and (b) in- troducing phase spectrum information to enable spectral subtraction in the com- plex frequency domain. Likelihood-maximisation uses gradient-descent to optimise parameters of the enhancement algorithm to best fit the acoustic speech model given a word se- quence known a priori. Whilst this technique is shown to improve the ASR word accuracy performance, it is also identified to be particularly sensitive to non-noise mismatches between the training and testing data. Phase information has long been ignored in spectral subtraction as it is deemed to have little effect on human intelligibility. In this work it is shown that phase information is important in obtaining highly accurate estimates of clean speech magnitudes which are typically used in ASR feature extraction. Phase Estimation via Delay Projection is proposed based on the stationarity of sinusoidal signals, and demonstrates the potential to produce improvements in ASR word accuracy in a wide range of SNR. Throughout the dissertation, consideration is given to practical implemen- tation in vehicular environments which resulted in two novel contributions – a LIMA framework which takes advantage of the grounding procedure common to speech dialogue systems, and a resource-saving formulation of frequency-domain spectral subtraction for realisation in field-programmable gate array hardware. The techniques proposed in this dissertation were evaluated using the Aus- tralian English In-Car Speech Corpus which was collected as part of this work. This database is the first of its kind within Australia and captures real in-car speech of 50 native Australian speakers in seven driving conditions common to Australian environments.