18 results for Language Resources
in Aston University Research Archive
Abstract:
This is a multiple case study of the leadership language of three senior women working in a large corporation in Bahrain. The study’s main aim is to explore the linguistic practices the women leaders use with their colleagues and subordinates in corporate meetings. Adopting a Foucauldian (1972) notion of ‘discourses’ as social practices and a view of gender as socially constructed and discursively performed (Butler 1990), this research aims to unveil the competing discourses which may shape the leadership language of senior women in their communities of practice. The research is situated within the broader field of Sociolinguistics and the specific field of Language and Gender. To address the research aim, a case study approach incorporating multiple methods of qualitative data collection (observation, interviews, and shadowing) was utilised to gather information about the three women leaders and produce a rich description of their use of language in and out of meeting contexts. For analysis, principles of Qualitative Data Analysis (QDA) were used to organise and sort the large amount of data, and Feminist Post-Structuralist Discourse Analysis (FPDA) was adopted to produce a multi-faceted analysis of the subjects, their leadership language, power relations, and competing discourses in the context. It was found that the three senior women enact leadership differently, making variable use of a repertoire of conventionally masculine and feminine linguistic practices. However, they all appear to have limited language resources and even more limiting subject positions, and they all have to exercise considerable linguistic expertise to police and modify their language in order to avoid the ‘double bind’. Yet the extent of these limitations and constraints depends on the community of practice with its prevailing discourses, which appear to have their roots in Islamic and cultural practices as well as some Western influences acquired throughout the company’s history. It is concluded that it may be particularly challenging for Middle Eastern women to achieve any degree of equality with men in the workplace because discourses of gender difference lie at the core of Islamic teaching and ideology.
Abstract:
Procedural knowledge is the knowledge required to perform certain tasks, and it forms an important part of expertise. A major source of procedural knowledge is natural language instructions. While these readable instructions have been useful learning resources for humans, they are not interpretable by machines. Automatically acquiring procedural knowledge in machine-interpretable formats from instructions has become an increasingly popular research topic because of its potential applications in process automation, yet it remains insufficiently addressed. This paper presents an approach and an implemented system that assist users in automatically acquiring procedural knowledge in structured forms from instructions. We introduce a generic semantic representation of procedures for analysing instructions, to which natural language processing techniques are applied in order to automatically extract structured procedures from instructions. The method is evaluated in three domains to demonstrate the generality of the proposed semantic representation as well as the effectiveness of the implemented automatic system.
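A minimal illustrative sketch of the general idea of turning imperative instruction sentences into structured steps. The Step fields and the naive verb-first heuristic below are assumptions for illustration only, not the paper's actual semantic representation or extraction pipeline.

```python
# Illustrative sketch only: a toy structured-procedure extractor.
# The paper's semantic representation and NLP pipeline are not reproduced here;
# the Step fields and the "first word is the action" heuristic are assumptions.
import re
from dataclasses import dataclass

@dataclass
class Step:
    action: str       # the main verb of the instruction
    arguments: str    # the rest of the clause (objects, modifiers)

def extract_steps(instruction_text: str) -> list[Step]:
    """Split an instruction into sentences and treat the leading verb
    of each imperative sentence as the action."""
    steps = []
    for sentence in re.split(r"[.!?]\s*", instruction_text):
        words = sentence.strip().split()
        if not words:
            continue
        steps.append(Step(action=words[0].lower(), arguments=" ".join(words[1:])))
    return steps

if __name__ == "__main__":
    text = "Open the cover. Remove the old filter. Insert the new filter."
    for step in extract_steps(text):
        print(step)
```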
Abstract:
In this paper we present a new approach to ontology learning, grounded in a dynamic and iterative view of knowledge acquisition for ontologies. The Abraxas approach is founded on three resources: a set of texts, a set of learning patterns and a set of ontological triples, each of which must remain in equilibrium. As events occur which disturb this equilibrium, various actions are triggered to re-establish a balance between the resources. Such events include the acquisition of a further text from external resources such as the Web, or the addition of ontological triples to the ontology. We develop the concept of a knowledge gap between the coverage of an ontology and the corpus of texts as a measure that triggers actions. We present an overview of the algorithm and its functionalities.
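A minimal sketch of the knowledge-gap idea, assuming the gap can be approximated as the set of corpus terms not yet covered by the ontology; the actual Abraxas measure and its triggering logic are more elaborate than this.

```python
# Minimal sketch, not the Abraxas algorithm itself: the "gap" is approximated
# here as corpus terms with no counterpart in the ontology, and a simple
# threshold decides whether a re-balancing action should be triggered.
def knowledge_gap(corpus_terms: set[str], ontology_concepts: set[str]) -> set[str]:
    return corpus_terms - ontology_concepts

def rebalance(corpus_terms, ontology_concepts, threshold=5):
    gap = knowledge_gap(corpus_terms, ontology_concepts)
    if len(gap) > threshold:
        return "acquire-patterns-or-triples", gap   # trigger a learning action
    return "in-equilibrium", gap

print(rebalance({"cat", "dog", "lion", "tiger"}, {"cat", "dog"}, threshold=1))
```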
Abstract:
This paper describes part of the corpus collection efforts underway in the EC-funded Companions project. The Companions project is collecting substantial quantities of dialogue, a large part of which focuses on reminiscing about photographs. The texts are in English and Czech. We describe the context and objectives for which this dialogue corpus is being collected and the methodology being used, and we make observations on the resulting data. The corpora will be made available to the wider research community through the Companions Project web site.
Abstract:
Automatic Term Recognition (ATR) is a fundamental processing step that precedes more complex tasks such as semantic search and ontology learning. Of the large number of methodologies available in the literature, only a few are able to handle both single-word and multi-word terms. In this paper we present a comparison of five such algorithms and propose a combined approach that uses a voting mechanism. We evaluated the six approaches on two different corpora and show that the voting algorithm performs best on one corpus (a collection of texts from Wikipedia) and less well on the Genia corpus (a standard life science corpus). This indicates that the choice and design of the corpus have a major impact on the evaluation of term recognition algorithms. Our experiments also showed that single-word terms can be equally important and occupy a fairly large proportion of the terms in certain domains. As a result, algorithms that ignore single-word terms may cause problems for tasks built on top of ATR. Effective ATR systems also need to take into account both the unstructured text and its structured aspects, which means that information extraction techniques need to be integrated into the term recognition process.
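A hedged sketch of one way a voting mechanism could combine the rankings produced by several ATR algorithms, using a Borda-style reciprocal-rank count; the paper's actual voting scheme may differ.

```python
# Hypothetical sketch of combining term-recognition rankings by voting.
# Each input ranking is a best-first list of candidate terms; a term scores
# higher the nearer it sits to the top of each list.
from collections import defaultdict

def vote(rankings: list[list[str]]) -> list[str]:
    scores = defaultdict(float)
    for ranking in rankings:
        for position, term in enumerate(ranking, start=1):
            scores[term] += 1.0 / position
    return sorted(scores, key=scores.get, reverse=True)

r1 = ["cell cycle", "protein", "gene expression"]
r2 = ["protein", "cell cycle", "kinase"]
print(vote([r1, r2]))   # terms ordered by their combined vote
```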
Abstract:
Almost everyone who has an email account receives unwanted emails from time to time. These emails can be jokes from friends or commercial product offers from unknown people. In this paper we focus on those unwanted messages which try to promote a product or service, or to offer some “hot” business opportunities. These messages are called junk emails. Several methods to filter junk emails have been proposed, but none of them considers the linguistic characteristics of junk emails. In this paper, we investigate the linguistic features of a corpus of junk emails and try to decide whether they constitute a distinct genre. Our corpus of junk emails was built from the messages received by the authors over a period of time. Initially, the corpus consisted of 1,563 files, but after automatically eliminating duplicates we kept only 673 files, totalling just over 373,000 tokens. In order to decide whether the junk emails constitute a different genre, a comparison with a corpus of leaflets extracted from the BNC and with the whole BNC corpus is carried out. Several characteristics at the lexical and grammatical levels were identified.
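As a rough illustration of the deduplication step described above, the sketch below drops messages whose normalised bodies hash to the same value; the authors' actual procedure is not specified in the abstract, so the normalisation choices here are assumptions.

```python
# Illustrative only: one simple way to drop duplicate messages from an email
# corpus. Bodies are normalised (whitespace collapsed, lower-cased) before
# hashing, which is an assumption about what counts as a "duplicate".
import hashlib

def deduplicate(messages: list[str]) -> list[str]:
    seen, unique = set(), []
    for body in messages:
        digest = hashlib.sha1(" ".join(body.split()).lower().encode()).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(body)
    return unique

emails = ["Buy now!!", "buy   NOW!!", "Hot business opportunity"]
print(len(deduplicate(emails)))   # -> 2
```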
Abstract:
This is a study of specific aspects of classroom interaction at primary school level in Kenya. The study entailed identifying the sources of particular communication problems during the change-over period from Kiswahili to English-medium teaching in two primary schools. This was followed by an examination of the language resources which teachers employed to maintain pupil participation in communication in the light of the occurrence, or possibility of occurrence, of specific communication problems. The language resources which were found to be significant in this regard concerned, firstly, the use of different elicitation types by teachers to stimulate pupils into giving responses and, secondly, teachers' recourse to code-switching from English to Kiswahili and vice versa. It was also found that, although the use of English as the medium of instruction in the classrooms which were observed resulted in certain communication problems, some of these problems need not have arisen if teachers had been more careful in their use of language. This finding, considered alongside the role of different elicitation types and code-switching as interpretable from the data samples, has certain implications, specified in the study, for teaching in Kenyan primary schools. The corpus for the study consisted of audio-recordings of English, Science and Number-Work lessons which were later transcribed. Relevant data samples were subsequently extracted from the transcripts for analysis. Many of the samples contain examples of communication breakdowns, but they also illustrate how teachers maintained interaction with pupils who had yet to acquire an operational mastery of English. This study thus differs from most studies on classroom interaction in its basic concern with examining the resources available to teachers for overcoming problem areas of classroom communication.
Abstract:
This paper presents a novel prosody model in the context of computer text-to-speech synthesis applications for tone languages. We have demonstrated its applicability using the Standard Yorùbá (SY) language. Our approach is motivated by the theory that abstract and realised forms of various prosody dimensions should be modelled within a modular and unified framework [Coleman, J.S., 1994. Polysyllabic words in the YorkTalk synthesis system. In: Keating, P.A. (Ed.), Phonological Structure and Forms: Papers in Laboratory Phonology III, Cambridge University Press, Cambridge, pp. 293–324]. We have implemented this framework using the Relational Tree (R-Tree) technique. The R-Tree is a sophisticated data structure for representing a multi-dimensional waveform in the form of a tree. The underlying assumption of this research is that it is possible to develop a practical prosody model by using appropriate computational tools and techniques which combine acoustic data with an encoding of the phonological and phonetic knowledge provided by experts. To implement the intonation dimension, fuzzy logic-based rules were developed using speech data from native speakers of Yorùbá. The Fuzzy Decision Tree (FDT) and the Classification and Regression Tree (CART) techniques were tested for modelling the duration dimension. For practical reasons, we selected the FDT for implementing the duration dimension of our prosody model. To establish the effectiveness of our prosody model, we also developed a Stem-ML prosody model for SY. We performed both quantitative and qualitative evaluations of our implemented prosody models. The results suggest that, although the R-Tree model does not predict the numerical speech prosody data as accurately as the Stem-ML model, it produces synthetic speech prosody with better intelligibility and naturalness. The R-Tree model is particularly suitable for speech prosody modelling for languages with limited language resources and expertise, e.g. African languages. Furthermore, the R-Tree model is easy to implement, interpret and analyse.
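A very rough sketch of the general idea of representing multi-dimensional prosody as a tree whose leaves carry per-syllable values and which is walked leaf by leaf to recover one dimension. The node layout and dimension names are illustrative assumptions, not the Relational Tree (R-Tree) formalism used in the paper.

```python
# Very rough sketch only: a tree whose leaves carry per-syllable prosody
# values for several dimensions (here pitch in Hz and duration in ms).
# The actual R-Tree formalism is richer; these names are assumptions.
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str                                     # e.g. "utterance", "word", or a syllable
    prosody: dict = field(default_factory=dict)    # dimension -> value (leaves only)
    children: list = field(default_factory=list)

def contour(node, dimension):
    """Walk the tree left-to-right and collect one prosody dimension."""
    if not node.children:
        return [node.prosody.get(dimension)]
    values = []
    for child in node.children:
        values.extend(contour(child, dimension))
    return values

utterance = Node("utterance", children=[
    Node("word", children=[Node("ba", {"pitch": 180, "duration": 120}),
                           Node("ta", {"pitch": 150, "duration": 140})])
])
print(contour(utterance, "pitch"))   # [180, 150]
```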
Abstract:
Sentiment classification over Twitter is usually affected by the noisy nature (abbreviations, irregular forms) of tweet data. A popular procedure to reduce the noise of textual data is to remove stopwords, either by using pre-compiled stopword lists or by using more sophisticated methods for dynamic stopword identification. However, the effectiveness of removing stopwords in the context of Twitter sentiment classification has been debated in the last few years. In this paper we investigate whether removing stopwords helps or hampers the effectiveness of Twitter sentiment classification methods. To this end, we apply six different stopword identification methods to Twitter data from six different datasets and observe how removing stopwords affects two well-known supervised sentiment classification methods. We assess the impact of removing stopwords by observing fluctuations in the level of data sparsity, the size of the classifier's feature space and its classification performance. Our results show that using pre-compiled lists of stopwords negatively impacts the performance of Twitter sentiment classification approaches. On the other hand, the dynamic generation of stopword lists, by removing those infrequent terms appearing only once in the corpus, appears to be the optimal method for maintaining high classification performance while reducing data sparsity and substantially shrinking the feature space.
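A small sketch of the singleton-based dynamic stopword generation described above: terms occurring only once in the corpus are treated as stopwords and removed before classification. The tokenisation is a simplifying assumption.

```python
# Sketch of the singleton-based dynamic stopword idea: tokens that occur
# exactly once in the corpus are treated as stopwords and filtered out.
from collections import Counter

def singleton_stopwords(tweets: list[str]) -> set[str]:
    counts = Counter(token for tweet in tweets for token in tweet.lower().split())
    return {token for token, n in counts.items() if n == 1}

def remove_stopwords(tweets, stopwords):
    return [" ".join(t for t in tweet.lower().split() if t not in stopwords)
            for tweet in tweets]

tweets = ["great phone love it", "love this phone", "gr8 battery tho"]
sw = singleton_stopwords(tweets)
print(remove_stopwords(tweets, sw))   # singleton tokens are dropped
```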
Abstract:
The growth of social networking platforms has drawn a lot of attention to the need for social computing. Social computing utilises human insights for computational tasks as well as for the design of systems that support social behaviours and interactions. One of the key aspects of social computing is the ability to attribute responsibility, such as blame or praise, to social events. This ability helps an intelligent entity account for and understand other intelligent entities’ social behaviours, and it enriches both the social functionalities and the cognitive aspects of intelligent agents. In this paper, we present an approach with a model for blame and praise detection in text. We build our model on various theories of blame and include in it features used by humans when determining judgement, such as moral agent causality, foreknowledge, intentionality and coercion. An annotated corpus has been created for the task of blame and praise detection from text. The experimental results show that while our model gives results similar to supervised classifiers when classifying text as blame, praise or other, it outperforms supervised classifiers on the finer-grained task of determining the direction of blame and praise, i.e., self-blame, blame-others, self-praise or praise-others, despite not using labelled training data.
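A toy, heavily simplified sketch of how features such as agent causality, intentionality and coercion might feed a rule-based decision that also yields the finer-grained direction labels mentioned above; the rules, names and thresholds are illustrative assumptions, not the authors' model.

```python
# Toy sketch only: combines simplified responsibility features with the
# direction labels (self-blame, blame-others, self-praise, praise-others).
# The rule that intentionality is required is an illustrative assumption.
def attribute(caused_outcome: bool, intentional: bool, coerced: bool,
              outcome_is_negative: bool, agent_is_speaker: bool) -> str:
    if not caused_outcome or coerced or not intentional:
        return "other"    # no blame/praise without free, causal agency
    polarity = "blame" if outcome_is_negative else "praise"
    return f"self-{polarity}" if agent_is_speaker else f"{polarity}-others"

print(attribute(True, True, False, True, True))    # self-blame
print(attribute(True, True, False, False, False))  # praise-others
```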
Abstract:
In Spring 2009, the School of Languages and Social Sciences (LSS) at Aston University responded to a JISC (Joint Information Systems Committee) and Higher Education Academy (HEA) call for partners in Open Educational Resources (OER) projects. This led to participation in not one but two different OER projects from within one small School of the University. This paper will share, from this unusual position, the experience of our English tutors, who participated in the HumBox Project, led by Languages, Linguistics and Area Studies (LLAS), and will compare the approach taken with the Sociology partnership in the C-SAP OER Project, led by the Centre for Sociology, Anthropology and Politics (C-SAP). These two HEA Subject Centre-led projects have taken different approaches to the challenges of encouraging tutors to deposit teaching resources, as an ongoing process, for others to openly access, download and re-purpose. As the projects draw to a close, findings will be discussed in relation to the JISC OER call, with an emphasis on examining the language and discourses from the two collaborations to see where there are shared issues and outcomes, or different subject-specific concerns to consider.
Abstract:
This paper reconceptualises a classic theory (Kanter 1993[1977]) on gender and leadership in order to provide fresh insights for both sociolinguistic and management thinking. Kanter claimed that there are four approved ‘role traps’ for women leaders in male-dominated organisations: Mother, Pet, Seductress and Iron Maiden, based on familiar historical archetypes of women in power. This paper reinterprets Kanter's construct of role traps in sociolinguistic terms as gendered, discursive resources that senior women utilise proactively to interact with their predominantly male colleagues. Based on a Research Council-funded study of 14 senior leaders (seven female and seven male), each conducting at least one senior management meeting in the U.K., the paper finds that individual speakers can transform stereotyped subject positions into powerful discursive resources to accomplish the goals of leadership, albeit marked by gender.
Abstract:
Basic literacy skills are fundamental building blocks of education, yet for a very large number of adults, tasks such as understanding and using everyday items are a challenge. While research, industry, and policy-making are looking at improving access to textual information for low-literacy adults, the literacy-based demands of today's society are continually increasing. Although many community-based organizations offer resources and support to adults with limited literacy skills, current programs have difficulty reaching and retaining those who would benefit most from them. To address these challenges, the National Research Council of Canada is proposing a technological solution to support literacy programs and to assist low-literacy adults in today's information-centric society: ALEX© – Adult Literacy support application for EXperiential learning. ALEX© has been created together with low-literacy adults, following guidelines for the inclusive design of mobile assistive tools. It is a mobile language assistant designed to be used both in the classroom and in daily life, in order to help low-literacy adults become increasingly literate and independent.