889 results for Authorship, autobiography


Relevance:

20.00%

Publisher:

Abstract:

Mode of access: Internet.

Relevance:

20.00%

Publisher:

Abstract:

In this paper we explore the use of text-mining methods for identifying the author of a text. We apply the support vector machine (SVM) to this problem because it can cope with half a million inputs, requires no feature selection, and can process the frequency vector of all words of a text. We performed a number of experiments with texts from a German newspaper. With nearly perfect reliability the SVM was able to reject other authors, and it detected the target author in 60–80% of cases. In a second experiment, we ignored nouns, verbs and adjectives and replaced them with grammatical tags and bigrams. This resulted in slightly reduced performance. Author detection with SVMs on full word forms was remarkably robust even if the author wrote about different topics.
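The approach in this abstract (word-frequency vectors fed to a linear SVM, with no feature selection) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy texts, the sub-gradient training loop and the hyperparameters (`epochs`, `lr`, `reg`) are all assumptions chosen for illustration, and it uses only the Python standard library.

```python
from collections import Counter


def word_frequencies(text):
    """Relative frequency of every word form in a text (no feature selection)."""
    words = text.lower().split()
    if not words:
        return {}
    counts = Counter(words)
    return {w: c / len(words) for w, c in counts.items()}


def decision(weights, bias, x):
    """Signed score: positive leans towards the target author."""
    return sum(weights.get(w, 0.0) * v for w, v in x.items()) + bias


def train_linear_svm(samples, labels, epochs=200, lr=0.1, reg=0.01):
    """Sub-gradient descent on the L2-regularised hinge loss (illustrative only).

    samples: sparse frequency dicts; labels: +1 (target author) or -1 (other).
    """
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * decision(weights, bias, x)
            # L2 shrinkage of all weights seen so far.
            for w in weights:
                weights[w] *= 1 - lr * reg
            # Hinge loss is active only when the margin is below 1.
            if margin < 1:
                for w, v in x.items():
                    weights[w] = weights.get(w, 0.0) + lr * y * v
                bias += lr * y
    return weights, bias


# Hypothetical toy corpus: the target author favours "indeed", the other "however".
target_texts = ["indeed the council met", "indeed the mayor spoke"]
other_texts = ["however the council met", "however the mayor spoke"]
samples = [word_frequencies(t) for t in target_texts + other_texts]
labels = [1, 1, -1, -1]
weights, bias = train_linear_svm(samples, labels)
```

In this toy setting the only words that separate the two authors are "indeed" and "however", so training pushes their weights in opposite directions while topic words shared by both authors receive updates from both classes.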

Relevance:

20.00%

Publisher:

Abstract:

In response to Chaski’s article (published in this volume) an examination is made of the methodological understanding necessary to identify dependable markers for forensic (and general) authorship attribution work. This examination concentrates on three methodological areas of concern which researchers intending to identify markers of authorship must address. These areas are sampling linguistic data, establishing the reliability of authorship markers and establishing the validity of authorship markers. It is suggested that the complexity of sampling problems in linguistic data is often underestimated and that theoretical issues in this area are both difficult and unresolved. It is further argued that the concepts of reliability and validity must be well understood and accounted for in any attempts to identify authorship markers and that largely this is not done. Finally, Principal Component Analysis is identified as an alternative approach which avoids some of the methodological problems inherent in identifying reliable, valid markers of authorship.

Relevance:

20.00%

Publisher:

Abstract:

The judicial interest in ‘scientific’ evidence has driven recent work to quantify results for forensic linguistic authorship analysis. Through a methodological discussion and a worked example, this paper examines the issues which complicate attempts to quantify results in such work. The solution suggested to some of the difficulties is a sampling and testing strategy which helps to identify potentially useful, valid and reliable markers of authorship. An important feature of the sampling strategy is that markers identified as being generally valid and reliable are retested for use in specific authorship analysis cases. The suggested approach for drawing quantified conclusions combines discriminant function analysis and Bayesian likelihood measures. The worked example starts with twenty comparison texts for each of three potential authors and then uses a progressively smaller comparison corpus, reducing to fifteen, ten, five and finally three texts per author. This worked example demonstrates how reducing the amount of data affects the way conclusions can be drawn. With greater numbers of reference texts, quantified and safe attributions are shown to be possible, but as the number of reference texts falls, the analysis shows that the conclusion which should be reached is that no attribution can be made. At no point does the testing process result in a misattribution.
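One way to picture a Bayesian likelihood measure of the kind this abstract mentions is a likelihood ratio computed from a single discriminant score. The sketch below is an assumption-laden illustration, not the paper's procedure: it assumes each candidate author's comparison texts have already been reduced to one numeric score by some discriminant function, and that those scores are roughly normally distributed. The function names and parameters are hypothetical.

```python
import math
import statistics


def gaussian_density(x, mean, sd):
    """Normal probability density at x."""
    return math.exp(-((x - mean) ** 2) / (2 * sd ** 2)) / (sd * math.sqrt(2 * math.pi))


def likelihood_ratio(queried_score, scores_a, scores_b):
    """p(score | author A) / p(score | author B) under a normality assumption.

    scores_a, scores_b: scores of the comparison texts for each candidate author.
    Values above 1 favour author A, values below 1 favour author B.
    """
    if len(scores_a) < 2 or len(scores_b) < 2:
        raise ValueError("need at least two comparison texts per author")
    mean_a, sd_a = statistics.mean(scores_a), statistics.stdev(scores_a)
    mean_b, sd_b = statistics.mean(scores_b), statistics.stdev(scores_b)
    # Assumes the scores within each group are not all identical (sd > 0).
    return (gaussian_density(queried_score, mean_a, sd_a)
            / gaussian_density(queried_score, mean_b, sd_b))
```

With only a handful of comparison texts per author, the estimated means and standard deviations become unstable, which is one reason a shrinking comparison corpus can make a quantified conclusion unsafe.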

Relevance:

20.00%

Publisher:

Abstract:

This chapter demonstrates diversity in the activity of authorship and the corresponding diversity of forensic authorship analysis questions and techniques. Authorship is discussed in terms of Love’s (2002) multifunctional description of precursory, executive, declarative and revisionary authorship activities and the implications of this distinction for forensic problem solving. Four different authorship questions are considered. These are ‘How was the text produced?’, ‘How many people wrote the text?’, ‘What kind of person wrote the text?’ and ‘What is the relationship of a queried text with comparison texts?’ Different approaches to forensic authorship analysis are discussed in terms of their appropriateness to answering different authorship questions. The conclusion drawn is that no one technique will ever be appropriate to all problems.

Relevance:

20.00%

Publisher:

Abstract:

Current debate within forensic authorship analysis has tended to polarise those who argue that analysis methods should reflect a strong cognitive theory of idiolect and others who see less of a need to look behind the stylistic variation of the texts they are examining. This chapter examines theories of idiolect and asks how useful or necessary they are to the practice of forensic authorship analysis. Taking a specific text messaging case, the chapter demonstrates that methodologically rigorous, theoretically informed authorship analysis need not appeal to cognitive theories of idiolect in order to be valid. By considering text messaging forensics, lessons are drawn which can contribute to wider debates on the role of theories of idiolect in forensic casework.

Relevance:

20.00%

Publisher:

Abstract:

Previous research into formulaic language has focussed on specialised groups of people (e.g. L1 acquisition by infants and adult L2 acquisition) with ordinary adult native speakers of English receiving less attention. Additionally, whilst some features of formulaic language have been used as evidence of authorship (e.g. the Unabomber’s use of you can’t eat your cake and have it too) there has been no systematic investigation into this as a potential marker of authorship. This thesis reports the first full-scale study into the use of formulaic sequences by individual authors. The theory of formulaic language hypothesises that formulaic sequences contained in the mental lexicon are shaped by experience combined with what each individual has found to be communicatively effective. Each author’s repertoire of formulaic sequences should therefore differ. To test this assertion, three automated approaches to the identification of formulaic sequences are tested on a specially constructed corpus containing 100 short narratives. The first approach explores a limited subset of formulaic sequences using recurrence across a series of texts as the criterion for identification. The second approach focuses on a word which frequently occurs as part of formulaic sequences and also investigates alternative non-formulaic realisations of the same semantic content. Finally, a reference list approach is used. Whilst claiming authority for any reference list can be difficult, the proposed method utilises internet examples derived from lists prepared by others, a procedure which, it is argued, is akin to asking large groups of judges to reach consensus about what is formulaic. The empirical evidence supports the notion that formulaic sequences have potential as a marker of authorship since in some cases a Questioned Document was correctly attributed. 
Although this marker of authorship is not universally applicable, it does promise to become a viable new tool in the forensic linguist’s tool-kit.
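The first approach described in this abstract uses recurrence across a series of texts as the criterion for identifying formulaic sequences. A minimal sketch of that idea follows. It assumes that a candidate sequence is simply a word n-gram found in at least `min_texts` different texts, which is a simplification introduced here rather than the thesis's actual criterion. The function names and defaults are hypothetical.

```python
from collections import defaultdict


def word_ngrams(text, n):
    """Set of distinct word n-grams in a text (lower-cased, whitespace-split)."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def recurrent_sequences(texts, n=3, min_texts=2):
    """Word n-grams that recur in at least `min_texts` of the given texts."""
    seen_in = defaultdict(int)
    for text in texts:
        # Each text counts at most once per n-gram.
        for gram in word_ngrams(text, n):
            seen_in[gram] += 1
    return {gram for gram, count in seen_in.items() if count >= min_texts}
```

Candidates found this way would still need filtering, since frequent grammatical strings recur across texts without being formulaic in any interesting sense.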

Relevance:

20.00%

Publisher:

Abstract:

By applying narrative theory to the party political texts emerging within the UK Labour Party after 2010, which make up the corpus of One Nation discourse, we can grasp the underlying significance of this ideational revision of Labour Party and leftist thought. Through an identification and analysis of the sequence of texts and their constitution as a "story" that interpolates an underlying "plot," we can see how a revision of Labour's "tale" offers to leadership a new party discourse appropriate to it, mediating, if not reconciling, the problematic duality of narrative authorship by both party and leader. © The Political Quarterly Publishing Co. Ltd. 2013.