883 results for "corpus diacrônico" (diachronic corpus)


Relevance:

10.00%

Publisher:

Abstract:

In this paper we propose a novel scheme for carrying out speaker diarization in an iterative manner. We aim to show that the information obtained through the first pass of speaker diarization can be reused to refine and improve the original diarization results. We call this technique speaker rediarization and demonstrate the practical application of our rediarization algorithm using a large archive of two-speaker telephone conversation recordings. We use the NIST 2008 SRE summed telephone corpora for evaluating our speaker rediarization system. This corpus contains recurring speaker identities across independent recording sessions that need to be linked across the entire corpus. We show that our speaker rediarization scheme can take advantage of inter-session speaker information, linked in the initial diarization pass, to achieve a 30% relative improvement over the original diarization error rate (DER) after only two iterations of rediarization.
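As a brief illustration of how a relative improvement in diarization error rate (DER) is computed, the sketch below uses made-up DER values. The function name `relative_improvement` and both numbers are assumptions for illustration only; they are not figures reported in the paper.

```python
def relative_improvement(baseline_der, new_der):
    """Fraction by which new_der improves on baseline_der (lower DER is better)."""
    if baseline_der <= 0:
        raise ValueError("baseline DER must be positive")
    return (baseline_der - new_der) / baseline_der


# Hypothetical DER values in percent, chosen only to illustrate the arithmetic.
baseline = 10.0
rediarized = 7.0

gain = relative_improvement(baseline, rediarized)
print(f"{gain:.0%} relative improvement")
```

A relative improvement is measured against the baseline error, so a drop from 10% to 7% DER is a 30% relative gain but only a 3-point absolute gain.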

Relevance:

10.00%

Publisher:

Abstract:

In this paper we present a novel scheme for improving speaker diarization by making use of repeating speakers across multiple recordings within a large corpus. We call this technique speaker re-diarization and demonstrate that it is possible to reuse the initial speaker-linked diarization outputs to boost diarization accuracy within individual recordings. We first propose and evaluate two novel re-diarization techniques. We demonstrate their complementary characteristics and fuse the two techniques to successfully conduct speaker re-diarization across the SAIVT-BNEWS corpus of Australian broadcast data. This corpus contains recurring speakers in various independent recordings that need to be linked across the dataset. We show that our speaker re-diarization approach can provide a relative improvement of 23% in diarization error rate (DER), over the original diarization results, as well as improve the estimated number of speakers and the cluster purity and coverage metrics.

Relevance:

10.00%

Publisher:

Abstract:

We present a novel method for improving hierarchical speaker clustering in the tasks of speaker diarization and speaker linking. In hierarchical clustering, a tree can be formed that demonstrates various levels of clustering. We propose a ratio that expresses the impact of each cluster on the formation of this tree and use this to rescale cluster scores. This provides score normalisation based on the impact of each cluster. We use a state-of-the-art speaker diarization and linking system across the SAIVT-BNEWS corpus to show that our proposed impact ratio can provide a relative improvement of 16% in diarization error rate (DER).

Relevance:

10.00%

Publisher:

Abstract:

Agent-based modeling and simulation (ABMS) may fit well with entrepreneurship research and practice because the core concepts and basic premises of entrepreneurship coincide with the characteristics of ABMS. However, it is difficult to find cases where ABMS is applied to entrepreneurship research. To apply ABMS to entrepreneurship and organization studies, designing a conceptual model is important; thus to effectively design a conceptual model, various mixed method approaches are being attempted. As a new mixed method approach to ABMS, this study proposes a bibliometric approach to designing agent based models, which establishes and analyzes a domain corpus. This study presents an example on the venture creation process using the bibliometric approach. This example shows us that the results of the multi-agent simulations on the venturing process based on the bibliometric approach are close to each nation’s surveyed data on the venturing activities. In conclusion, by the bibliometric approach proposed in this study, all the agents and the agents’ behaviors related to a phenomenon can be extracted effectively, and a conceptual model for ABMS can be designed with the agents and their behaviors. This study contributes to the entrepreneurship and organization studies by promoting the application of ABMS.

Relevance:

10.00%

Publisher:

Abstract:

In this paper, I discuss the representation of Sweden and Swedes in the Íslendingasögur, with an emphasis on identifying patterns across the works, both in terms of narrative structure and content. The aim in doing so is to shed light on modes of representing non-Icelanders in the Íslendingasögur, as well as on medieval Icelandic conceptions of Sweden as a distinct region within Scandinavia. I also aim here to add to a longer-term project that examines the place of foreign visitors to Iceland in the saga corpus more generally. As the scope of this paper is limited to Swedish characters, I am cautious about drawing broad conclusions about their representation – observations given here will need to be framed by a wider study, and one that reads for the characterisation of Swedes in the context both of other genres of saga literature and representations of characters from other regions besides Sweden. However, it is clear that some similarities exist in saga episodes involving Swedish characters: in four of the Íslendingasögur, Swedes are given roles as intruders or outsiders who threaten the community of the saga and whose deaths bring about a change in the fortunes of their killers.

Relevance:

10.00%

Publisher:

Abstract:

Livecoding is an artistic programming practice in which an artist's low-level interaction can be observed with sufficiently high fidelity to allow for transcription and analysis. This paper presents the first reported "coding" of livecoding videos. From an identified corpus of videos available on the web, we coded performances of two different livecoding artists, recording both the (textual) programming edit events and the musical effect of these edits.

Relevance:

10.00%

Publisher:

Abstract:

Acoustic sensors allow scientists to scale environmental monitoring over large spatiotemporal scales. The faunal vocalisations captured by these sensors can answer ecological questions; however, identifying these vocalisations within recorded audio is difficult: automatic recognition is currently intractable and manual recognition is slow and error-prone. In this paper, a semi-automated approach to call recognition is presented. An automated decision support tool is tested that assists users in the manual annotation process. The respective strengths of human and computer analysis are used to complement one another. The tool recommends the species of an unknown vocalisation and thereby minimises the need for the memorization of a large corpus of vocalisations. In the case of a folksonomic tagging system, recommending species tags also minimises the proliferation of redundant tag categories. We describe two algorithms: (1) a “naïve” decision support tool (16%–64% sensitivity) with efficiency of O(n) but which becomes unscalable as more data is added and (2) a scalable alternative with 48% sensitivity and an efficiency of O(log n). The improved algorithm was also tested in an HTML-based annotation prototype. The result of this work is a decision support tool for annotating faunal acoustic events that may be utilised by other bioacoustics projects.

Relevance:

10.00%

Publisher:

Abstract:

With the overwhelming increase in the amount of data on the web and in databases, many text mining techniques have been proposed for mining useful patterns in text documents. Extracting closed sequential patterns using the Pattern Taxonomy Model (PTM) is one of the pruning methods to remove noisy, inconsistent, and redundant patterns. However, the PTM model treats each extracted pattern as a whole without considering the terms it includes, which could affect the quality of extracted patterns. This paper proposes an innovative and effective method that extends the random set to accurately weigh patterns based on their distribution in the documents and the distribution of their terms in patterns. The proposed approach then finds the specific closed sequential patterns (SCSP) based on the newly calculated weight. The experimental results on the Reuters Corpus Volume 1 (RCV1) data collection and TREC topics show that the proposed method significantly outperforms other state-of-the-art methods on several popular measures.

Relevance:

10.00%

Publisher:

Abstract:

Like many other cataclysmic events September 11, a day now popularly believed to have 'changed the world', has become a topic taken up by children's writers. This thesis, titled The Whole World Shook: Ethnic, National and Heroic Identities in Children's Fiction About 9/11, examines how cultural identities are constructed within fictional texts for young people written about the attacks on the Twin Towers. It identifies three significant identity categories encoded in 9/11 books for children: ethnic identities, national identities, and heroic identities. The thesis argues that the identities formed within the selected children's texts are in flux, privileging performances of identities that are contingent on post-9/11 politics. This study is located within the field of children's literature criticism, which supports the understanding that children's books, like all texts, play a role in the production of identities. Children's literature is highly significant both in its pedagogical intent (to instruct and induct children into cultural practices and beliefs) and in its obscurity (in making the complex simple enough for children, and from sometimes intentionally shying away from difficult things). This literary criticism informed the study that the texts, if they were to be written at all, would be complex, varied and most likely as ambiguous and contradictory as the responses to the attacks on New York themselves. The theoretical framework for this thesis draws on a range of critical theories including literary theory, cultural studies, studies of performativity and postmodernism. This critical framework informs the approach by providing ways for: (i) understanding how political and ideological work is performed in children's literature; (ii) interrogating the constructed nature of cultural identities; (iii) developing a nuanced methodology for carrying out a close textual analysis. 
The textual analysis examines a representative sample of children's texts about 9/11, including picture books, young adult fiction, and a selection of DC Comics. Each chapter focuses on a different though related identity category. Chapter Four examines the performance of ethnic identities and race politics within a sample of picture books and young adult fiction; Chapter Five analyses the construction of collective, national identities in another set of texts; and Chapter Six does analytic work on a third set of texts, demonstrating the strategic performance of particular kinds of heroic identities. I argue that performances of cultural identities constructed in these texts draw on familiar versions of identities as well as contribute to new ones. These textual constructions can be seen as offering some certainties in increasingly uncertain times. The study finds, in its sample of books, a co-mingling of xenophobia and tolerance; a binaried competition between good and evil and global harmony and national insularity; and a lauding of both the commonplace hero and the super-human. Being a recent corpus of texts about 9/11, these texts provide information on the kinds of 'selves' that appear to be privileged in the West since 2001. The thesis concludes that the shifting identities evident in texts that are being produced for children about 9/11 offer implicit and explicit accounts of what constitutes good citizenship, loyalty to nation and community, and desirable attributes in a Western post-9/11 context. This thesis makes an original contribution to the field of children's literature by providing a focussed and sustained analysis of how texts for children about 9/11 contribute to formations of identity in these complex times of cultural unease and global unrest.

Relevance:

10.00%

Publisher:

Abstract:

The celebrated work of Lortie (1975) alerted teacher educators to the extended period of 'apprenticeship' that student teachers have been through before they arrive at teacher education programmes. The subjective implicit theories (Marland, 1992) developed by prospective teachers are shaped by their lifeworld experiences at school and in the case of physical education teachers, their experiences in sport. The biography of physical education teacher education (PETE) students tends to be characterised by ecto-mesomorphic individuals who have been socialised by the rigours of highly competitive sport (Gore, 1990; Macdonald, 1992; Rossi, 1996). We can add to this, the requirements of teacher preparation in physical education which for the most part are dominated by the traditions and rhetoric of the 'natural' bio-physical sciences; largely a legacy of Henry's (1964) work on physical education as an academic discipline, as well as that of Abernathy and Waltz the same year (Abernathy & Waltz, 1964). In the United Kingdom, Curl (1973) further advanced the argument in an attempt to justify human movement as an independent field of study with its own corpus of knowledge. It is little wonder then, that the dominant pedagogical discourse in physical education is, as Tinning (1991) discusses, one of performance pedagogy (see also Hendry, 1986 for an earlier discussion). The knowledge required to support such a discourse could be described as 'official' (Apple, 1993) and it assumes such status by virtue of the power appropriated by and bestowed upon the scientific community in PETE (Macdonald & Tinning, 1995; Sparkes, 1989, 1993). However, there are social reifiers too, and these tend to relate to the social construction of the body (Kirk, 1993; Kirk & Spiller, 1994; Gilroy, 1994) and what Tinning (1985) has termed the Cult of Slenderness. Furthermore the 'slender image' has become a signifier of 'good health'. 
This is inextricably linked to what might be considered as a health triplex—'exercise = fitness = health' (see Kirk & Colquhoun, 1989; Tinning & Kirk, 1991) which in Australia, underpins curriculum packages such as Daily Physical Education which teachers (often including physical education primary...

Relevance:

10.00%

Publisher:

Abstract:

This article begins with the premise that morality is an intrinsic, although often invisible, aspect of everyday social action. Drawing on a corpus of fifty audio-recorded telephone calls to Kids Helpline, an Australian helpline for children and young people, we examine one call to show how the young caller and counsellor co-construct ‘morality-in-action’. Ethnomethodological understandings and, in particular, Sacks’ (1992) description of ‘Class 2’ rules and infractions show how an adolescent caller and counsellor collaboratively assemble moral versions of the caller. In puzzling out possible motives, the caller and counsellor can be seen to be attending to the implications of different moral versions of the caller. This attribution of motives is moral work in action: motives are contingently assembled, displayed and evaluated, and such work is understood as a display of moral reasoning. The counselling call makes visible the counsellor’s interactional work to support and empower the client. Analysis such as this offers counsellors ways of understanding and making visible their interactional and moral work within helpline call interactions.

Relevance:

10.00%

Publisher:

Abstract:

The early years are significant in optimising children’s educational, emotional and social outcomes and have become a major international policy priority. Within Australia, policy levers have prioritised early childhood education, with a focus on program quality, as it is associated with lifelong success. Longitudinal studies have found that high quality teacher-child interactions are an essential element of high quality programs, and teacher questioning is one aspect of teacher-child interactions that has been attributed to affecting the quality of education, linking open ended questioning to higher cognitive achievement. Teachers, however, overwhelmingly ask more closed than open questions. In the classroom, like everyday interaction, questions in interaction require answers. They are used to request, offer, repair, challenge, seek agreement (Curl & Drew, 2008; Enfield, Stivers, & Levinson, 2010; Hayano, 2013; Schegloff, 2007). Teachers use questions to set agendas and manage lessons (McHoul, 1978; Mehan, 1979; Sacks, 1995), and to gauge students’ knowledge and understanding (Lerner, 1995; McHoul, 1978; Mehan, 1979). Drawing on data from the Australian Research Council project Interacting with Knowledge: Interacting with people: Web searching in early childhood, this paper focuses on an extended sequence of talk between a teacher with two students aged between 3.5 and 5 years in a preschool classroom. The episode, drawn from a corpus of over 200 hours of video recorded data, captures how the teacher and children undertake an online search for images of lady beetles and hairy caterpillars on the Web. Ethnomethodological and conversation analysis approaches examine how the teacher asks questions, which call on the children to display their factual knowledge about the search topic. 
The fine-grained analysis shows how teachers design their interactions to prompt children’s displays of factual knowledge, and how the design of factual questions affects both what a student says in response and how they say it. In focussing on how the teacher designs factual questions and how children respond to these questions, it shows that question design can close down a student’s reply or elicit a range of answers, from one word to extended, more detailed responses. Understanding how the design of teachers’ questions can influence students’ responses has pedagogic implications and may help educators make intentional decisions regarding their own questioning techniques.

Relevance:

10.00%

Publisher:

Abstract:

This is the third (but first edited) volume in Sen and Hill’s corpus on Indonesian media. An anthology built from contributions to a 2006 workshop, it is necessarily more fragmented than the editors’ earlier monographs. While this fragmented character helps to evoke a fractured context, it also makes for unwieldiness...

Relevance:

10.00%

Publisher:

Abstract:

This article presents and evaluates a model to automatically derive word association networks from text corpora. Two aspects were evaluated: To what degree can corpus-based word association networks (CANs) approximate human word association networks with respect to (1) their ability to quantitatively predict word associations and (2) their structural network characteristics. Word association networks are the basis of the human mental lexicon. However, extracting such networks from human subjects is laborious, time consuming and thus necessarily limited in relation to the breadth of human vocabulary. Automatic derivation of word associations from text corpora would address these limitations. In both evaluations corpus-based processing provided vector representations for words. These representations were then employed to derive CANs using two measures: (1) the well known cosine metric, which is a symmetric measure, and (2) a new asymmetric measure computed from orthogonal vector projections. For both evaluations, the full set of 4068 free association networks (FANs) from the University of South Florida word association norms were used as baseline human data. Two corpus based models were benchmarked for comparison: a latent topic model and latent semantic analysis (LSA). We observed that CANs constructed using the asymmetric measure were slightly less effective than the topic model in quantitatively predicting free associates, and slightly better than LSA. The structural networks analysis revealed that CANs do approximate the FANs to an encouraging degree.
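To make the contrast between a symmetric and an asymmetric association measure concrete, here is a minimal sketch. The abstract does not give the formula for its projection-based measure, so `projection_strength` below (the length of the orthogonal projection of one vector onto another, relative to the length of the target vector) is an assumed stand-in, and the two-dimensional word vectors are invented for illustration.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def cosine(u, v):
    """Symmetric measure: cosine(u, v) == cosine(v, u)."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def projection_strength(u, v):
    """Assumed asymmetric measure: projection of u onto v, relative to |v|."""
    return dot(u, v) / dot(v, v)


# Hypothetical word vectors; real ones would come from corpus processing.
cue = [2.0, 0.0]
target = [1.0, 1.0]

forward = projection_strength(cue, target)   # cue -> target
backward = projection_strength(target, cue)  # target -> cue
print(cosine(cue, target), forward, backward)
```

Because the projection is normalised by the vector being projected onto, swapping the arguments changes the score whenever the two vectors differ in length. That directionality is what lets such a measure model the fact that human free association is often stronger in one direction than the other.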

Relevance:

10.00%

Publisher:

Abstract:

Traditional text classification technology based on machine learning and data mining techniques has made great progress. However, it remains difficult to draw an exact decision boundary between relevant and irrelevant objects in binary classification because of the uncertainty produced by traditional algorithms. The proposed model, CTTC (Centroid Training for Text Classification), aims to build an uncertainty boundary that absorbs as many indeterminate objects as possible, so as to raise the certainty of the relevant and irrelevant groups through a centroid clustering and training process. The clustering starts from the two training subsets, labelled relevant and irrelevant respectively, to create two principal centroid vectors, by which all the training samples are further separated into three groups: POS, NEG and BND, with all the indeterminate objects absorbed into the uncertain decision boundary BND. Two pairs of centroid vectors are then trained and optimized through a subsequent iterative multi-learning process, and together they help predict the polarities of incoming objects. For the assessment of the proposed model, F1 and Accuracy have been chosen as the key evaluation measures. We stress the F1 measure because it can display the overall performance improvement of the final classifier better than Accuracy. A large number of experiments have been completed using the proposed model on the Reuters Corpus Volume 1 (RCV1), which is an important standard dataset in the field. The experimental results show that the proposed model significantly improves binary text classification performance in both F1 and Accuracy compared with three other influential baseline models.
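The three-way separation into POS, NEG and BND can be sketched roughly as follows. This is not the CTTC algorithm itself: the abstract does not specify how the boundary is drawn, so the cosine-difference rule, the `margin` parameter, and the toy two-dimensional vectors are all assumptions made for illustration, and the iterative centroid training step is omitted.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def three_way_split(samples, pos_centroid, neg_centroid, margin=0.1):
    """Assign each sample to POS, NEG, or the uncertain boundary BND.

    `margin` is a hypothetical threshold: samples whose similarity to the
    two centroids differs by no more than `margin` are treated as uncertain.
    """
    groups = {"POS": [], "NEG": [], "BND": []}
    for sample in samples:
        diff = cosine(sample, pos_centroid) - cosine(sample, neg_centroid)
        if diff > margin:
            groups["POS"].append(sample)
        elif diff < -margin:
            groups["NEG"].append(sample)
        else:
            groups["BND"].append(sample)
    return groups


# Toy term-weight vectors standing in for labelled training documents.
relevant = [[1.0, 0.0], [0.9, 0.1]]
irrelevant = [[0.0, 1.0], [0.1, 0.9]]

groups = three_way_split(
    [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
    centroid(relevant),
    centroid(irrelevant),
)
print(groups)
```

Widening `margin` pushes more samples into BND, which trades coverage for higher certainty in the POS and NEG groups.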