27 resultados para vignette in-text
em Aston University Research Archive
Resumo:
This thesis sets out to investigate the role of cohesion in the organisation and processing of three text types in English and Arabic. In other words, it attempts to shed some light on the descriptive and explanatory power of cohesion in different text typologies. To this effect, three text types, namely, literary fictional narrative, newspaper editorial and science were analysed to ascertain the intra- and inter-sentential trends in textual cohesion characteristic of each text type in each language. In addition, two small scale experiments which aimed at exploring the facilitatory effect of one cohesive device (i.e. lexical repetition) on the comprehension of three English text types by Arab learners were carried out. The first experiment examined this effect in an English science text; the second covered three English text types, i.e. fictional narrative, culturally-oriented and science. Some interesting and significant results have emerged from the textual analysis and the pilot studies. Most importantly, each text type tends to utilize the cohesive trends that are compatible with its readership, reader knowledge, reading style and pedagogical purpose. Whereas fictional narratives largely cohere through pronominal co-reference, editorials and science texts derive much cohesion from lexical repetition. As for cross-language differences English opts for economy in the use of cohesive devices, while Arabic largely coheres through the redundant effect created by the high frequency of most of those devices. Thus, cohesion is proved to be a variable rather than a homogeneous phenomenon which is dictated by text type among other factors. The results of the experiments suggest that lexical repetition does facilitate the comprehension of English texts by Arab learners. Fictional narratives are found to be easier to process and understand than expository texts. Consequently, cohesion can assist in the processing of text as it can in its creation.
Resumo:
Most research in the area of emotion detection in written text focused on detecting explicit expressions of emotions in text. In this paper, we present a rule-based pipeline approach for detecting implicit emotions in written text without emotion-bearing words based on the OCC Model. We have evaluated our approach on three different datasets with five emotion categories. Our results show that the proposed approach outperforms the lexicon matching method consistently across all the three datasets by a large margin of 17–30% in F-measure and gives competitive performance compared to a supervised classifier. In particular, when dealing with formal text which follows grammatical rules strictly, our approach gives an average F-measure of 82.7% on “Happy”, “Angry-Disgust” and “Sad”, even outperforming the supervised baseline by nearly 17% in F-measure. Our preliminary results show the feasibility of the approach for the task of implicit emotion detection in written text.
Resumo:
The growth of social networking platforms has drawn a lot of attentions to the need for social computing. Social computing utilises human insights for computational tasks as well as design of systems that support social behaviours and interactions. One of the key aspects of social computing is the ability to attribute responsibility such as blame or praise to social events. This ability helps an intelligent entity account and understand other intelligent entities’ social behaviours, and enriches both the social functionalities and cognitive aspects of intelligent agents. In this paper, we present an approach with a model for blame and praise detection in text. We build our model based on various theories of blame and include in our model features used by humans determining judgment such as moral agent causality, foreknowledge, intentionality and coercion. An annotated corpus has been created for the task of blame and praise detection from text. The experimental results show that while our model gives similar results compared to supervised classifiers on classifying text as blame, praise or others, it outperforms supervised classifiers on more finer-grained classification of determining the direction of blame and praise, i.e., self-blame, blame-others, self-praise or praise-others, despite not using labelled training data.
Resumo:
This study presents a detailed contrastive description of the textual functioning of connectives in English and Arabic. Particular emphasis is placed on the organisational force of connectives and their role in sustaining cohesion. The description is intended as a contribution for a better understanding of the variations in the dominant tendencies for text organisation in each language. The findings are expected to be utilised for pedagogical purposes, particularly in improving EFL teaching of writing at the undergraduate level. The study is based on an empirical investigation of the phenomenon of connectivity and, for optimal efficiency, employs computer-aided procedures, particularly those adopted in corpus linguistics, for investigatory purposes. One important methodological requirement is the establishment of two comparable and statistically adequate corpora, also the design of software and the use of existing packages and to achieve the basic analysis. Each corpus comprises ca 250,000 words of newspaper material sampled in accordance to a specific set of criteria and assembled in machine readable form prior to the computer-assisted analysis. A suite of programmes have been written in SPITBOL to accomplish a variety of analytical tasks, and in particular to perform a battery of measurements intended to quantify the textual functioning of connectives in each corpus. Concordances and some word lists are produced by using OCP. Results of these researches confirm the existence of fundamental differences in text organisation in Arabic in comparison to English. This manifests itself in the way textual operations of grouping and sequencing are performed and in the intensity of the textual role of connectives in imposing linearity and continuity and in maintaining overall stability. Furthermore, computation of connective functionality and range of operationality has identified fundamental differences in the way favourable choices for text organisation are made and implemented.
Resumo:
Abstract: Loss of central vision caused by age-related macular degeneration (AMD) is a problem affecting increasingly large numbers of people within the ageing population. AMD is the leading cause of blindness in the developed world, with estimates of over 600,000 people affected in the UK . Central vision loss can be devastating for the sufferer, with vision loss impacting on the ability to carry out daily activities. In particular, inability to read is linked to higher rates of depression in AMD sufferers compared to age-matched controls. Methods to improve reading ability in the presence of central vision loss will help maintain independence and quality of life for those affected. Various attempts to improve reading with central vision loss have been made. Most textual manipulations, including font size, have led to only modest gains in reading speed. Previous experimental work and theoretical arguments on spatial integrative properties of the peripheral retina suggest that ‘visual crowding’ may be a major factor contributing to inefficient reading. Crowding refers to the phenomena in which juxtaposed targets viewed eccentrically may be difficult to identify. Manipulating text spacing of reading material may be a simple method that reduces crowding and benefits reading ability in macular disease patients. In this thesis the effect of textual manipulation on reading speed was investigated, firstly for normally sighted observers using eccentric viewing, and secondly for observers with central vision loss. Test stimuli mimicked normal reading conditions by using whole sentences that required normal saccadic eye movements and observer comprehension. Preliminary measures on normally-sighted observers (n = 2) used forced-choice procedures in conjunction with the method of constant stimuli. Psychometric functions relating the proportion of correct responses to exposure time were determined for text size, font type (Lucida Sans and Times New Roman) and text spacing, with threshold exposure time (75% correct responses) used as a measure of reading performance. The results of these initial measures were used to derive an appropriate search space, in terms of text spacing, for assessing reading performance in AMD patients. The main clinical measures were completed on a group of macular disease sufferers (n=24). Firstly, high and low contrast reading acuity and critical print size were measured using modified MNREAD test charts, and secondly, the effect of word and line spacing was investigated using a new test, designed specifically for this study, called the Equal Readability Passages (ERP) test. The results from normally-sighted observers were in close agreement with those from the group of macular disease sufferers. Results show that: (i) optimum reading performance was achieved when using both double line and double word spacing; (ii) the effect of line spacing was greater than the effect of word spacing (iii) a text size of approximately 0.85o is sufficiently large for reading at 5o eccentricity. In conclusion, the results suggest that crowding is detrimental to reading with peripheral vision, and its effects can be minimized with a modest increase in text spacing.
Resumo:
Germany's latest attempt at unification raises again the question of German nationhood and nationality. The present study examines the links between the development of the German language and the political history of Germany, principally in the nineteenth and twentieth centuries. By examining the role of language in the establishment and exercise of political power and in the creation of national and group solidarity in Germany, the study both provides insights into the nature of language as political action and contributes to the socio-cultural history of the German language. The language-theoretical hypothesis on which the study is based sees language as a central factor in political action, and opposes the notion that language is a reflection of underlying political 'realities' which exist independently of language. Language is viewed as language-in-text which performs identifiable functions. Following Leech, five functions are distinguished, two of which (the regulative and the phatic) are regarded as central to political processes. The phatic function is tested against the role of the German language as a creator and symbol of national identity, with particular attention being paid to concepts of the 'purity' of the language. The regulative function (under which a persuasive function is also subsumed) is illustrated using the examples of German fascist discourse and selected cases from German history post-1945. In addition, the interactions are examined between language change and socio-economic change by postulating that language change is both a condition and consequence of socio-economic change, in that socio-economic change both requires and conditions changes in the communicative environment. Finally, three politocolinguistic case studies from the eight and ninth decades of the twentieth century are introduced in order to demonstrate specific ways in which language has been deployed in an attempt to create political realities, thus verifying the initial hypothesis of the centrality of language to the political process.
Resumo:
Sentiment analysis or opinion mining aims to use automated tools to detect subjective information such as opinions, attitudes, and feelings expressed in text. This paper proposes a novel probabilistic modeling framework called joint sentiment-topic (JST) model based on latent Dirichlet allocation (LDA), which detects sentiment and topic simultaneously from text. A reparameterized version of the JST model called Reverse-JST, obtained by reversing the sequence of sentiment and topic generation in the modeling process, is also studied. Although JST is equivalent to Reverse-JST without a hierarchical prior, extensive experiments show that when sentiment priors are added, JST performs consistently better than Reverse-JST. Besides, unlike supervised approaches to sentiment classification which often fail to produce satisfactory performance when shifting to other domains, the weakly supervised nature of JST makes it highly portable to other domains. This is verified by the experimental results on data sets from five different domains where the JST model even outperforms existing semi-supervised approaches in some of the data sets despite using no labeled documents. Moreover, the topics and topic sentiment detected by JST are indeed coherent and informative. We hypothesize that the JST model can readily meet the demand of large-scale sentiment analysis from the web in an open-ended fashion.
Resumo:
In this paper we present the design and analysis of an intonation model for text-to-speech (TTS) synthesis applications using a combination of Relational Tree (RT) and Fuzzy Logic (FL) technologies. The model is demonstrated using the Standard Yorùbá (SY) language. In the proposed intonation model, phonological information extracted from text is converted into an RT. RT is a sophisticated data structure that represents the peaks and valleys as well as the spatial structure of a waveform symbolically in the form of trees. An initial approximation to the RT, called Skeletal Tree (ST), is first generated algorithmically. The exact numerical values of the peaks and valleys on the ST is then computed using FL. Quantitative analysis of the result gives RMSE of 0.56 and 0.71 for peak and valley respectively. Mean Opinion Scores (MOS) of 9.5 and 6.8, on a scale of 1 - -10, was obtained for intelligibility and naturalness respectively.
Resumo:
This study focuses on the interactional functions of non-standard spelling, in particular letter repetition, used in text-based computer-mediated communication as a means of non-verbal signalling. The aim of this paper is to assess the current state of non-verbal cue research in computer-mediated discourse and demonstrate the need for a more comprehensive and methodologically rigorous exploration of written non-verbal signalling. The study proposes a contextual and usage-centered view of written paralanguage. Through illustrative, close linguistic analyses the study proves that previous approaches to non-standard spelling based on their relation to the spoken word might not account for the complexities of this CMC cue, and in order to further our understanding of their interactional functions it is more fruitful to describe the role they play during the contextualisation of the verbal messages. The interactional sociolinguistic approach taken in the analysis demonstrates the range of interactional functions letter repetition can achieve, including contribution to the inscription of socio-emotional information into writing, to the evoking of auditory cues or to a display of informality through using a relaxed writing style.
Resumo:
In this chapter we outline a sensory-linguistic approach to the, study of reading skill development. We call this a sensory-linguistic approach because the focus of interest is on the relationship between basic sensory processing skills and the ability to extract efficiently the orthographic and phonological information available in text during reading. Our review discusses how basic sensory processing deficits are associated with developmental dyslexia, and how these impairments may degrade word-decoding skills. We then review studies that demonstrate a more direct relationship between sensitivity to particular types of auditory and visual stimuli and the normal development of literacy skills. Specifically, we suggest that the phonological and orthographic skills engaged while reading are constrained by the ability to detect and discriminate dynamic stimuli in the auditory and visual systems respectively.
Resumo:
A major challenge in text mining for biomedicine is automatically extracting protein-protein interactions from the vast amount of biomedical literature. We have constructed an information extraction system based on the Hidden Vector State (HVS) model for protein-protein interactions. The HVS model can be trained using only lightly annotated data whilst simultaneously retaining sufficient ability to capture the hierarchical structure. When applied in extracting protein-protein interactions, we found that it performed better than other established statistical methods and achieved 61.5% in F-score with balanced recall and precision values. Moreover, the statistical nature of the pure data-driven HVS model makes it intrinsically robust and it can be easily adapted to other domains.
Resumo:
During the last decade, biomedicine has witnessed a tremendous development. Large amounts of experimental and computational biomedical data have been generated along with new discoveries, which are accompanied by an exponential increase in the number of biomedical publications describing these discoveries. In the meantime, there has been a great interest with scientific communities in text mining tools to find knowledge such as protein-protein interactions, which is most relevant and useful for specific analysis tasks. This paper provides a outline of the various information extraction methods in biomedical domain, especially for discovery of protein-protein interactions. It surveys methodologies involved in plain texts analyzing and processing, categorizes current work in biomedical information extraction, and provides examples of these methods. Challenges in the field are also presented and possible solutions are discussed.
Resumo:
Sentiment analysis or opinion mining aims to use automated tools to detect subjective information such as opinions, attitudes, and feelings expressed in text. This paper proposes a novel probabilistic modeling framework based on Latent Dirichlet Allocation (LDA), called joint sentiment/topic model (JST), which detects sentiment and topic simultaneously from text. Unlike other machine learning approaches to sentiment classification which often require labeled corpora for classifier training, the proposed JST model is fully unsupervised. The model has been evaluated on the movie review dataset to classify the review sentiment polarity and minimum prior information have also been explored to further improve the sentiment classification accuracy. Preliminary experiments have shown promising results achieved by JST.
Resumo:
While much of a company's knowledge can be found in text repositories, current content management systems have limited capabilities for structuring and interpreting documents. In the emerging Semantic Web, search, interpretation and aggregation can be addressed by ontology-based semantic mark-up. In this paper, we examine semantic annotation, identify a number of requirements, and review the current generation of semantic annotation systems. This analysis shows that, while there is still some way to go before semantic annotation tools will be able to address fully all the knowledge management needs, research in the area is active and making good progress.