38 results for corpus, collocations, corpus linguistics, EPTIC


Relevance:

100.00% 100.00%

Publisher:

Abstract:

While language use has been argued to reflect gender asymmetry, increasing parity has been evidenced in official settings (Holmes, 2000; Dister and Moreau, 2006). Our hypothesis is that the French national press has developed a norm of equal linguistic treatment of men and women. In a corpus of articles from Libération, Le Monde, and Le Figaro, we examine the treatment of Arlette Laguiller, the female leader of the French extreme-left 'Workers' Struggle' party (Lutte Ouvrière), during the run-up to the 2007 presidential elections. The ways in which Laguiller is referred to and described, compared with her male counterparts, show no asymmetry, with one exception: breaches of parity are found only in the right-wing newspaper Le Figaro. The ideological distance between that newspaper and the candidate suggests that power struggles are a primary source of asymmetrical treatment. The discursive functions of such treatment can be understood through an investigation based on a portable corpus linguistics methodology for measuring discrimination. © 2011 Elsevier B.V.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This article is the first linguistic analysis of a new category of lifestyle magazines in the German-speaking countries, based on methods of corpus linguistics and multimodal discourse analysis. Since the launch of the magazine LandLust in Germany in 2005, more than twenty so-called "land magazines" have appeared on the market, attracting millions of readers. Our research analyses land magazines as discursive events. We examine the specific combination of discourses that land magazines serve or create by looking at the semiotic practices (writing and images) through which these discourses manifest themselves. Our results show that the magazine under scrutiny does not simply provide new forms of escapism but also positions itself politically, in subtle ways, as part of the traditional-conservative spectrum by reacting to metalinguistic discourses such as purism and feminist criticism.

Relevance:

50.00% 50.00%

Publisher:

Relevance:

50.00% 50.00%

Publisher:

Relevance:

50.00% 50.00%

Publisher:

Abstract:

We propose a hybrid generative/discriminative framework for semantic parsing which combines the hidden vector state (HVS) model and hidden Markov support vector machines (HM-SVMs). The HVS model is an extension of the basic discrete Markov model in which context is encoded as a stack-oriented state vector. HM-SVMs combine the advantages of hidden Markov models and support vector machines. By employing a modified K-means clustering method, a small set of the most representative sentences can be selected automatically from an unannotated corpus. These sentences, together with their abstract annotations, are used to train an HVS model, which can subsequently be applied to the whole corpus to generate semantic parsing results. The most confident semantic parsing results are selected to generate a fully annotated corpus, which is used to train the HM-SVMs. The proposed framework has been tested on the DARPA Communicator data. Experimental results show that the hybrid framework improves on the baseline HVS parser. When compared with HM-SVMs trained on the fully annotated corpus, the hybrid framework gave comparable performance with only a small set of lightly annotated sentences. © 2008. Licensed under the Creative Commons.
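The sentence-selection step can be illustrated with a minimal sketch. The abstract does not describe how the K-means method was modified, so the code below uses plain K-means over bag-of-words vectors and picks the sentence closest to each centroid. The function names, the vectorisation and the parameters (`k`, `iterations`, `seed`) are illustrative assumptions, not the authors' method.

```python
import math
import random
from collections import Counter


def vectorise(sentence, vocab):
    """Bag-of-words count vector over a fixed vocabulary (an assumed representation)."""
    counts = Counter(sentence.lower().split())
    return [float(counts[w]) for w in vocab]


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def representative_sentences(sentences, k, iterations=20, seed=0):
    """Plain K-means; returns the sentence nearest to each final centroid."""
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    vectors = [vectorise(s, vocab) for s in sentences]
    rng = random.Random(seed)
    # Initialise centroids from k distinct sentences.
    centroids = [vectors[i] for i in rng.sample(range(len(vectors)), k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda c: distance(v, centroids[c]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    chosen = []
    for centroid in centroids:
        idx = min(range(len(vectors)), key=lambda i: distance(vectors[i], centroid))
        if idx not in chosen:
            chosen.append(idx)
    return [sentences[i] for i in sorted(chosen)]


if __name__ == "__main__":
    corpus = [
        "flights to boston",
        "flights to boston please",
        "flights to denver",
        "book a hotel room",
        "book a hotel room tonight",
        "book a hotel",
    ]
    print(representative_sentences(corpus, k=2))
```

In the framework described above, only the selected sentences would need abstract annotation before training the HVS model; the remaining corpus stays unannotated.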

Relevance:

50.00% 50.00%

Publisher:

Abstract:

This study uses a purpose-built corpus to explore the linguistic legacy of Britain’s maritime history, found in the form of hundreds of specialised ‘Maritime Expressions’ (MEs), such as TAKEN ABACK, ANCHOR and ALOOF, that permeate modern English. Selecting just those expressions commencing with ‘A’, it analyses 61 MEs in detail and describes the processes by which these technical expressions, from a highly specialised occupational discourse community, have made their way into modern English. The Maritime Text Corpus (MTC) comprises 8.8 million words, encompassing a range of text types and registers, selected to provide a cross-section of ‘maritime’ writing. It is analysed using WordSmith analytical software (Scott, 2010), with the 100 million-word British National Corpus (BNC) as a reference corpus. Using the MTC, a list of keywords of specific salience within the maritime discourse has been compiled and, using frequency data, concordances and collocations, these MEs are described in detail and their use and form in the MTC and the BNC are compared. The study examines the transformation from ME to figurative use in the general discourse, in terms of form and metaphoricity. MEs are classified according to their metaphorical strength and their transference from maritime usage into new registers and domains, such as those of business, politics, sports and reportage. A revised model of metaphoricity is developed and a new category of figurative expression, the ‘resonator’, is proposed. Additionally, developing the work of Lakoff and Johnson, Kövecses and others on Conceptual Metaphor Theory (CMT), a number of Maritime Conceptual Metaphors are identified and their cultural significance is discussed.
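Keyword extraction against a reference corpus can be sketched as follows. The abstract does not state which keyness statistic or settings were used, so this example assumes the log-likelihood measure commonly used for two-corpus comparison. The function names and the toy token lists are hypothetical and serve only to show the calculation.

```python
import math
from collections import Counter


def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Log-likelihood keyness of one word, comparing a study and a reference corpus."""
    total = freq_study + freq_ref
    expected_study = size_study * total / (size_study + size_ref)
    expected_ref = size_ref * total / (size_study + size_ref)
    score = 0.0
    # A zero observed frequency contributes nothing to the sum.
    if freq_study > 0:
        score += freq_study * math.log(freq_study / expected_study)
    if freq_ref > 0:
        score += freq_ref * math.log(freq_ref / expected_ref)
    return 2.0 * score


def keywords(study_tokens, ref_tokens, top_n=10):
    """Words relatively more frequent in the study corpus, ranked by keyness."""
    study_counts = Counter(study_tokens)
    ref_counts = Counter(ref_tokens)
    n_study, n_ref = len(study_tokens), len(ref_tokens)
    scored = []
    for word, freq in study_counts.items():
        ref_freq = ref_counts.get(word, 0)
        # Keep positive keywords only: higher relative frequency in the study corpus.
        if freq / n_study > ref_freq / n_ref:
            scored.append((word, log_likelihood(freq, n_study, ref_freq, n_ref)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    study = ["anchor", "anchor", "anchor", "the", "sea"]
    reference = ["the", "the", "the", "sea", "anchor"]
    for word, score in keywords(study, reference):
        print(word, round(score, 2))
```

With real data, `study_tokens` would hold the tokens of the specialised corpus and `ref_tokens` those of the reference corpus; a score of zero means the word is equally frequent, relative to corpus size, in both.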

Relevance:

40.00% 40.00%

Publisher:

Relevance:

40.00% 40.00%

Publisher:

Relevance:

40.00% 40.00%

Publisher:

Abstract:

This paper is a progress report on a research path I first outlined in my contribution to “Words in Context: A Tribute to John Sinclair on his Retirement” (Heffer and Sauntson, 2000). Therefore, I first summarize that paper here, in order to provide the relevant background. The second half of the current paper consists of some further manual analyses, exploring various parameters and procedures that might assist in the design of an automated computational process for the identification of lexical sets. The automation itself is beyond the scope of the current paper.

Relevance:

40.00% 40.00%

Publisher:

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Almost everyone who has an email account receives unwanted emails from time to time. These emails can be jokes from friends or commercial product offers from unknown people. In this paper we focus on those unwanted messages that try to promote a product or service, or to offer some “hot” business opportunity. These messages are called junk emails. Several methods for filtering junk emails have been proposed, but none considers the linguistic characteristics of junk emails. In this paper, we investigate the linguistic features of a corpus of junk emails and try to decide whether they constitute a distinct genre. Our corpus of junk emails was built from the messages received by the authors over a period of time. Initially, the corpus consisted of 1563 files, but after automatically eliminating the duplicates we kept only 673 files, totalling just over 373,000 tokens. In order to decide whether the junk emails constitute a different genre, a comparison with a corpus of leaflets extracted from the BNC and with the whole BNC is carried out. Several characteristics at the lexical and grammatical levels were identified.
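The automatic removal of duplicates can be illustrated with a short sketch. The abstract does not say how duplicates were detected, so this example assumes exact matching after lowercasing and whitespace normalisation, and counts tokens by whitespace splitting. All names are hypothetical.

```python
import hashlib


def normalise(text):
    """Lowercase and collapse runs of whitespace (an assumed notion of 'same message')."""
    return " ".join(text.lower().split())


def deduplicate(messages):
    """Keep the first occurrence of each message, comparing normalised content."""
    seen = set()
    unique = []
    for message in messages:
        digest = hashlib.sha256(normalise(message).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(message)
    return unique


def count_tokens(messages):
    """Whitespace token count over all messages (a deliberately crude tokeniser)."""
    return sum(len(message.split()) for message in messages)


if __name__ == "__main__":
    inbox = ["Buy now!", "buy   NOW!", "Hot business opportunity"]
    kept = deduplicate(inbox)
    print(len(kept), "messages,", count_tokens(kept), "tokens")
```

Near-duplicates, such as the same offer sent with a different greeting, would pass through this filter; catching them would need a similarity measure rather than exact hashing.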