Biblioteca Digital

31 resultados para corpus paralelo

The science and technology of corpus, and corpus for science and technology

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Corpus Linguistics is a young discipline. The earliest work was done in the 1960s, but corpora only began to be widely used by lexicographers and linguists in the late 1980s, by language teachers in the late 1990s, and by language students only very recently. This course in corpus linguistics was held at the Departamento de Linguistica Aplicada, E.T.S.I. de Minas, Universidad Politecnica de Madrid from June 15-19 1998. About 45 teachers registered for the course. 30% had PhDs in linguistics, 20% in literature, and the rest were doctorandi or qualified English teachers. The course was designed to introduce the use of corpora and other computational resources in teaching and research, with special reference to scientific and technological discourse in English. Each participant had a computer networked with the lecturer’s machine, whose display could be projected onto a large screen. Application programs were loaded onto the central server, and telnet and a web browser were available. COBUILD gave us permission to access the 323 million word Bank of English corpus, Mike Scott allowed us to use his Wordsmith Tools software, and Tim Johns gave us a copy of his MicroConcord program.

Size matters: creating dictionaries from the world's largest corpus

Relevância:

20.00% 20.00%

Publicador:

ACORN (the Aston Corpus Network)

Relevância:

20.00% 20.00%

Publicador:

Corpus direct to your classroom

Relevância:

20.00% 20.00%

Publicador:

The discourse of climate change: a corpus-based approach

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Based on Goffman’s definition that frames are general ‘schemata of interpretation’ that people use to ‘locate, perceive, identify, and label’, other scholars have used the concept in a more specific way to analyze media coverage. Frames are used in the sense of organizing devices that allow journalists to select and emphasise topics, to decide ‘what matters’ (Gitlin 1980). Gamson and Modigliani (1989) consider frames as being embedded within ‘media packages’ that can be seen as ‘giving meaning’ to an issue. According to Entman (1993), framing comprises a combination of different activities such as: problem definition, causal interpretation, moral evaluation, and/or treatment recommendation for the item described. Previous research has analysed climate change with the purpose of testing Downs’s model of the issue attention cycle (Trumbo 1996), to uncover media biases in the US press (Boykoff and Boykoff 2004), to highlight differences between nations (Brossard et al. 2004; Grundmann 2007) or to analyze cultural reconstructions of scientific knowledge (Carvalho and Burgess 2005). In this paper we shall present data from a corpus linguistics-based approach. We will be drawing on results of a pilot study conducted in Spring 2008 based on the Nexis news media archive. Based on comparative data from the US, the UK, France and Germany, we aim to show how the climate change issue has been framed differently in these countries and how this framing indicates differences in national climate change policies.

A new venture in corpus-based lexicography: towards a dictionary of academic English

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper asserts the increasing importance of academic English in an increasingly Anglophone world, and looks at the differences between academic English and general English, especially in terms of vocabulary. The creation of wordlists has played an important role in trying to establish the academic English lexicon, but these wordlists are not based on appropriate data, or are implemented inappropriately. There is as yet no adequate dictionary of academic English, and this paper reports on new efforts at Aston University to create a suitable corpus on which such a dictionary could be based.

Classroom cornucopia: the new COBUILD dictionary and the Bank of English corpus

Relevância:

20.00% 20.00%

Publicador:

Corpus-driven lexicography

Relevância:

20.00% 20.00%

Publicador:

Corpus, collocation, and lexical sets

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper is a progress report on a research path I first outlined in my contribution to “Words in Context: A Tribute to John Sinclair on his Retirement” (Heffer and Sauntson, 2000). Therefore, I first summarize that paper here, in order to provide the relevant background. The second half of the current paper consists of some further manual analyses, exploring various parameters and procedures that might assist in the design of an automated computational process for the identification of lexical sets. The automation itself is beyond the scope of the current paper.

The 'corpus approach' used by present-day lexicography in dictionary-writing applied to the archive resources of the Dictionary of Traded Goods and Commodities

Relevância:

20.00% 20.00%

Publicador:

Corpus direct to your classroom

Relevância:

20.00% 20.00%

Publicador:

Open the black box: corpus-based classroom exercises

Relevância:

20.00% 20.00%

Publicador:

A corpus-based investigation of junk emails

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Almost everyone who has an email account receives from time to time unwanted emails. These emails can be jokes from friends or commercial product offers from unknown people. In this paper we focus on these unwanted messages which try to promote a product or service, or to offer some “hot” business opportunities. These messages are called junk emails. Several methods to filter junk emails were proposed, but none considers the linguistic characteristics of junk emails. In this paper, we investigate the linguistic features of a corpus of junk emails, and try to decide if they constitute a distinct genre. Our corpus of junk emails was build from the messages received by the authors over a period of time. Initially, the corpus consisted of 1563, but after eliminating the duplications automatically we kept only 673 files, totalising just over 373,000 tokens. In order to decide if the junk emails constitute a different genre, a comparison with a corpus of leaflets extracted from BNC and with the whole BNC corpus is carried out. Several characteristics at the lexical and grammatical levels were identified.

Corpus, collocation, and lexical sets

Relevância:

20.00% 20.00%

Publicador:

A corpus of 714 full-color images of depth-rotated objects

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A set of full-color images of objects is described for use in experiments investigating the effects of in-depth rotation on the identification of three-dimensional objects. The corpus contains up to 11 perspective views of 70 nameable objects. We also provide ratings of the "goodness" of each view, based on Thurstonian scaling of subjects' preferences in a paired-comparison experiment. An exploratory cluster analysis on the scaling solutions indicates that the amount of information available in a given view generally is the major determinant of the goodness of the view. For instance, objects with an elongated front-back axis tend to cluster together, and the front and back views of these objects, which do not reveal the object's major surfaces and features, are evaluated as the worst views.

«
1
2
3
»