889 results for Text categorization
Abstract:
With the explosion of information resources, there is a pressing need to identify interesting text features or topics in massive volumes of text. This thesis proposes a theoretical model to accurately weight specific text features, such as patterns and n-grams. The proposed model achieves impressive performance on two data collections, Reuters Corpus Volume 1 (RCV1) and Reuters 21578.
Abstract:
In my master’s thesis I analyse mystical Islamic poetry in the context of ritual performance, samā`, focusing on the poetry used by the Chishti Sufis. The work is based on both literary sources and ethnographic material collected in India. The central textual source is Surūd-i Rūhānī, a compilation of mystical poetry. Textual sources, however, can be properly understood only in relation to the living performance context, and I therefore also draw on interviews with Sufis and performers of mystical music and on recordings of samā` assemblies. The first part of the thesis gives a thematic overview of the poems and describes the process of selecting a suitable text for performance. The poems are written in three languages: Persian, Urdu and Hindi. Among the authors are both Sufis and non-Sufis. The poems, mystical and non-mystical alike, share the same poetic images, and they acquire a mystical meaning when they are set to qawwali music and performed in samā` assemblies. My work includes several translations of verses not previously translated. The latter part of the thesis analyses the musical idiom of qawwali and the ways in which the impact of the text on listeners is intensified in performance. Typically the intensification is accomplished at the level of a single poem through three different techniques: using introductory verses, inserting verses between the verses of the main poem, and repeating individual units of text. The first two techniques are tied to creating a mystical state in the listeners, while the third aims at sustaining it. It is customary that a listener enraptured by mystical experience offers a monetary contribution to the performers. Thus, intensification of the text’s impact aims at enabling the listeners to experience mystical states.
Abstract:
Relative Constructions with Pronominal Heads in Contemporary Russian. Chapter 1 introduces the distinctive syntactic and semantic properties of Russian relative constructions (RCs), which are then divided into two main classes according to the type of the head phrase. The study concentrates on RCs with pronominal heads, which are systematically compared with noun-headed RCs. Chapter 2 clarifies the categorization of pronouns in Russian. The conclusion is that Russian pronouns include only personal, reflexive and wh-pronouns. The remaining words that are traditionally seen as pronouns are actually functional equivalents of determiners. This idea leads to the suggestion that RCs with these determiner-like words as the only constituent of the head phrase are actually headed by zero pronouns. In the other type of RCs with pronominal heads, the head position is occupied by wh-pronouns with clitics expressing different types of indefiniteness and quantification. Comparison of the two types of pronoun-headed RCs shows that the wh-heads and zero-heads share a number of properties with respect to grammatical gender, number and person as well as to the semantic distinction between animates and inanimates. The rest of Chapter 2 gives an overview of various uses of wh-pronouns in Russian and an experimental analysis of RCs headed by pronominal adverbs. Chapter 3 discusses fundamental differences between RCs with noun heads and RCs with pronominal heads. One of the main findings is that the choice of the relative pronoun (kto 'who' and chto 'what' versus kotoryj 'which') is motivated by a tendency to reproduce maximally the essential grammatical and semantic properties of the antecedent. Chapter 4 gives a detailed description of the determiner-like words and wh-based heads used in the two types of RCs with pronominal heads. In addition, several issues related to the syntax and semantics of free relatives are discussed.
The conclusion is that there is no need to establish a separate category of free relatives in Russian. Chapter 5 discusses the syntax and semantics of correlative and free concessive constructions. They share a number of properties with pronoun-headed RCs and the two are often confused in Russian linguistics. However, a detailed analysis shows that these constructions must be distinguished from RCs. The study combines the methods of functionally-oriented Russian structuralism with some insights from generative syntax.
Abstract:
This thesis explores melodic and harmonic features of heavy metal and, in doing so, examines various methods of music analysis, their applicability and their limitations in the study of heavy metal music. The study is built on three general hypotheses according to which 1) acoustic characteristics play a significant role in chord construction in heavy metal, 2) heavy metal has strong ties and similarities with other Western musical styles, and 3) theories and analytical methods of Western art music may be applied to heavy metal. It seems evident that in heavy metal some chord structures appear far more frequently than others. It is suggested here that the fundamental reason for this is the use of the guitar distortion effect. Subsequently, theories as to how and under what principles heavy metal is constructed need to be put under discussion; analytical models regarding the classification of consonance and dissonance and chord categorization are here revised to meet the common practices of this music. It is evident that heavy metal is not an isolated style of music; it is seen here as a cultural fusion of various musical styles. Moreover, it is suggested that the theoretical background to the construction of Western music and its analysis can offer invaluable insights into heavy metal. However, the analytical methods need to be reformed to some extent to meet the characteristics of the music. This reformation includes an accommodation of linear and functional theories that has been found rather rarely in music theory and musicology.
Abstract:
This paper considers two special cases of bottleneck grouped assignment problems in which n jobs belong to m distinct categories (m < n). Solving these special problems with the available branch and bound algorithms results in a heavy computational burden. By sequentially identifying nonoptimal variables, this paper provides more efficient methods for those cases. Propositions leading to the algorithms have been established. Numerical examples illustrate the respective algorithms.
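For orientation only, the sketch below shows the plain (ungrouped) bottleneck assignment objective: choose a one-to-one assignment that minimises the largest single cost rather than the total cost. It is a brute-force illustration and not the paper's method; the grouped special cases and the elimination of nonoptimal variables are not reproduced here, and the function name and cost matrix are hypothetical.

```python
from itertools import permutations


def bottleneck_assignment(cost):
    """Brute-force bottleneck assignment on a square cost matrix.

    Returns (value, perm), where perm[i] is the column assigned to row i
    and value is the smallest achievable maximum cost. Runs in O(n!) time,
    so it is only suitable for tiny illustrative instances.
    """
    n = len(cost)
    best_value, best_perm = None, None
    for perm in permutations(range(n)):
        # The bottleneck objective looks at the worst single assignment.
        value = max(cost[i][perm[i]] for i in range(n))
        if best_value is None or value < best_value:
            best_value, best_perm = value, perm
    return best_value, best_perm


# Hypothetical 3x3 cost matrix (rows: jobs, columns: positions).
example_cost = [
    [4, 9, 2],
    [7, 3, 8],
    [5, 6, 1],
]
example_value, example_perm = bottleneck_assignment(example_cost)
```

The factorial running time of this enumeration is the kind of computational burden that motivates more efficient, problem-specific methods.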
Abstract:
The possibilities of developmental rehabilitation. A study on the construction of work relatedness and the customer in Aslak rehabilitation. The challenge of work-related rehabilitation is to anticipate the factors threatening work ability and to affect them. The purpose of this study was to analyze how work-related rehabilitation is constructed in practice and what challenges and, at the same time, possibilities there are for an innovative transformation of rehabilitation in trying to achieve this goal. The theoretical basis is cultural-historical activity theory and developmental work research. Based on a historical analysis, I studied rehabilitation activity empirically using the data gathered from one Aslak programme (Aslak = occupationally oriented medical rehabilitation) over two years. I described and analysed the construction of Aslak using ethnographic data and interviews. The data includes audio- and video-recordings of the Aslak course, fieldnotes, documents and other materials used in the course. The study aimed to reveal rehabilitation practices from different perspectives carried out by different stakeholders and participants in the Aslak course. It focused on the Aslak trajectory produced by a multiorganizational subject. I analyzed the rehabilitation activity using the method of ethnographic analysis of infrastructure. The method of analyzing the construction of the object of rehabilitation, the customer, was a membership categorization analysis (MCD) based on the ethnomethodological research tradition. I analyzed the meanings denoting customers given by different parties during one Aslak process and the relations between the meanings. Based on this analysis, I studied the disturbances, ruptures, and innovations in the rehabilitation activity. The results of the study show that the infrastructure of Aslak has different basic ideas. Aslak is constructed most explicitly on the infrastructure of medical rehabilitation.
The second layer has been provided with some tools for identifying and preventing well-defined occupation-specific load factors. However, it has failed to perform a new structure, as Aslak has encountered, at the same time, rapid changes in working life. The study identified some promising markers representing new kinds of work-related rehabilitation ideas, but they proved to be incomplete and fragile. As a consequence of the multilayered infrastructure, the contents of the Aslak course were split into fragmented phases and disconnected themes, which were blocked in by the master idea of medical orientation. Its relationship to work remained weak and obscure. The categorizations of customers in Aslak were manifold and contradictory. According to the results, the possibilities for transforming work-related rehabilitation lie both in changing the orientation to the customer to be more relevant to changing working life and in forging the infrastructural innovations related to this change. The results showed that a new work-relatedness would be difficult but possible to construct. What is needed is the construction of an infrastructure that will support a coherent master idea of work-related rehabilitation over the entire trajectory of a process. A shared idea of a rehabilitation object must be constructed in close collaboration between different stakeholders, such as Kela (the Social Insurance Institution of Finland), occupational health services, work organizations, and rehabilitation institutes. Key words: Aslak rehabilitation, work-related rehabilitation, development of rehabilitation, customer of rehabilitation, developmental work research, analysis of infrastructure, membership category analysis
Abstract:
The aim of this research was to explore how issues of power manifest themselves in bringing up children at home. The starting point for the study was a phenomenon-centred, power-focused and theoretically oriented view, which also included an empirical part. The general aim of the research was to identify and theorize a conception of power that is suitable for bringing up children at home. Power was defined and researched on the basis of existing power theories, mostly those presented in Anglo-American research on power. For closer investigation I chose the most common categorizations and theories of power, namely, the nature of power, the four dimensions of power, and forms of power. The empirical part of the research consisted of 22 thematic interviews with mothers, fathers and 14- to 16-year-old teenagers from 11 different families. The interviewees were found through snowball sampling. The questions for the interviews were based on power theories. The result of the research was that the most common categorizations and theories of power were useful but not satisfactory in the study of power in bringing up children at home. The nature of authority in bringing up children at home appears to have the same characteristics as the categorization of authority put forward by Weber, but in addition it included extra categories called moral authority and ontological-existential authority. Theoretically the most challenging problem concerns the conflict between modern and postmodern views of power. Neither of them alone is able to describe power in bringing up children at home. The best solution appeared to be to add an assumption about the inner relation to the modern power view and an assumption about the Popperian three worlds to the postmodern view of power. The relationship between the parent and the child is necessarily an inner power relation in which the relation itself modifies the parties’ identities.
In that case positive and productive elements are also included in the power relationship. Parents use many forms of power in bringing up children at home. Manipulative and violent forms of power are not justifiable, but other forms of power and their open exercise are sometimes necessary. The important criterion for determining the most suitable forms of power and the most appropriate ways of exercising that power is how they improve the development of the child's identity and the child's internalization of values. An ethically justified exercise of power in bringing up children at home is based on a dialogical, pedagogical relationship between the parent and the child, focuses on the relationship between the parent and the child, looks beyond the present, aspires to promote the good of the child, and is realised in a caring atmosphere.
Abstract:
Objective: Death certificates provide an invaluable source for cancer mortality statistics; however, this value can only be realised if accurate, quantitative data can be extracted from certificates – an aim hampered by both the volume and variable nature of certificates written in natural language. This paper proposes an automatic classification system for identifying cancer-related causes of death from death certificates. Methods: Detailed features, including terms, n-grams and SNOMED CT concepts, were extracted from a collection of 447,336 death certificates. These features were used to train Support Vector Machine classifiers (one classifier for each cancer type). The classifiers were deployed in a cascaded architecture: the first level identified the presence of cancer (i.e., binary cancer/no cancer) and the second level identified the type of cancer (according to the ICD-10 classification system). A held-out test set was used to evaluate the effectiveness of the classifiers according to precision, recall and F-measure. In addition, detailed feature analysis was performed to reveal the characteristics of a successful cancer classification model. Results: The system was highly effective at identifying cancer as the underlying cause of death (F-measure 0.94). The system was also effective at determining the type of cancer for common cancers (F-measure 0.7). Rare cancers, for which there was little training data, were difficult to classify accurately (F-measure 0.12). Factors influencing performance were the amount of training data and certain ambiguous cancers (e.g., those in the stomach region). The feature analysis revealed that a combination of features was important for cancer type classification, with SNOMED CT concept and oncology-specific morphology features proving the most valuable. Conclusion: The system proposed in this study provides automatic identification and characterisation of cancers from large collections of free-text death certificates.
This allows organisations such as Cancer Registries to monitor and report on cancer mortality in a timely and accurate manner. In addition, the methods and findings are generally applicable beyond cancer classification and to other sources of medical text besides death certificates.
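The cascaded architecture and the evaluation measures described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the two keyword rules are hypothetical toy stand-ins for the trained Support Vector Machine classifiers, the example certificates are invented, and only the cascade control flow and the precision, recall and F-measure arithmetic are meant to carry over.

```python
def cascade_predict(text, is_cancer, cancer_type):
    """Level 1 decides cancer / no cancer; level 2 runs only on positives."""
    if not is_cancer(text):
        return None  # no cancer detected, so no type is assigned
    return cancer_type(text)


def precision_recall_f1(gold, predicted, label):
    """Per-label precision, recall and F-measure (harmonic mean of the two)."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy rules standing in for the trained classifiers.
def toy_is_cancer(text):
    return any(word in text.lower() for word in ("cancer", "carcinoma"))


def toy_cancer_type(text):
    lowered = text.lower()
    if "lung" in lowered:
        return "C34"  # ICD-10: malignant neoplasm of bronchus and lung
    if "breast" in lowered:
        return "C50"  # ICD-10: malignant neoplasm of breast
    return "other"


# Invented example certificates with gold labels (None = not cancer).
certificates = [
    "metastatic carcinoma of the lung",
    "acute myocardial infarction",
    "breast cancer with lung metastases",
    "pneumonia, chronic lung disease",
]
gold_labels = ["C34", None, "C50", None]
predictions = [
    cascade_predict(c, toy_is_cancer, toy_cancer_type) for c in certificates
]
lung_scores = precision_recall_f1(gold_labels, predictions, "C34")
```

In this toy run the fourth certificate mentions "lung" but is never passed to the type classifier, because the first level rejects it. The third certificate shows the kind of ambiguity that lowers per-type precision.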
Abstract:
The aims of this study were to examine how workers' negative age stereotypes (i.e., denying older workers' ability to develop) and negative meta-stereotypes (i.e., beliefs that the majority of colleagues feel negative about older workers) are related to their attitudes towards retirement (i.e., occupational future time perspective and intention to retire), and whether the strength of these relationships is influenced by workers' self-categorization as an “older” person. Results of a study among Dutch taxi drivers provided mixed support for the hypotheses. Negative meta-stereotypes, but not negative age stereotypes, were associated with fewer perceived opportunities until retirement and, in turn, a stronger intention to retire. Self-categorization moderated the relationships between negative age (meta-)stereotypes and occupational future time perspective. However, contrary to expectations, the relations were stronger among workers with a low self-categorization as an older person in comparison with workers with a high self-categorization in this regard. Overall, results highlight the importance of psychosocial processes in the study of retirement intentions and their antecedents.
Abstract:
Objective: Melanoma is on the rise, especially in Caucasian populations exposed to high ultraviolet radiation, such as in Australia. This paper examined the psychological components facilitating change in skin cancer prevention or early detection behaviours following a text message intervention. Methods: The Queensland-based participants were 18 to 42 years old, from the Healthy Text study (N = 546). Overall, 512 (94%) participants completed the 12-month follow-up questionnaires. Following the social cognitive model, potential mediators of skin self-examination (SSE) and sun protection behaviour change were examined using stepwise logistic regression models. Results: At 12-month follow-up, odds of performing an SSE in the past 12 months were mediated by baseline confidence in finding time to check skin (an outcome expectation), with a change in odds ratio of 11.9% in the SSE group versus the control group when including the mediator. Odds of greater than average sun protective habits index at 12-month follow-up were mediated by (a) an attempt to get a suntan at baseline (an outcome expectation) and (b) baseline sun protective habits index, with a change in odds ratio of 10.0% and 11.8%, respectively, in the SSE group versus the control group. Conclusions: Few of the suspected mediation pathways were confirmed, with the exception of outcome expectations and past behaviours. Future intervention programmes could use alternative theoretical models to elucidate how improvements in health behaviours can optimally be facilitated.
Abstract:
XML documents are becoming more and more common in various environments. In particular, enterprise-scale document management is commonly centred around XML, and desktop applications as well as online document collections are soon to follow. The growing number of XML documents increases the importance of appropriate indexing methods and search tools in keeping the information accessible. Therefore, we focus on content that is stored in XML format as we develop such indexing methods. Because XML is used for different kinds of content ranging all the way from records of data fields to narrative full-texts, the methods for Information Retrieval are facing a new challenge in identifying which content is subject to data queries and which should be indexed for full-text search. In response to this challenge, we analyse the relation of character content and XML tags in XML documents in order to separate the full-text from data. As a result, we are able to both reduce the size of the index by 5-6% and improve the retrieval precision as we select the XML fragments to be indexed. Besides being challenging, XML comes with many unexplored opportunities which have received little attention in the literature. For example, authors often tag the content they want to emphasise by using a typeface that stands out. The tagged content constitutes phrases that are descriptive of the content and useful for full-text search. They are simple to detect in XML documents, but also possible to confuse with other inline-level text. Nonetheless, the search results seem to improve when the detected phrases are given additional weight in the index. Similar improvements are reported when related content is associated with the indexed full-text including titles, captions, and references. Experimental results show that for certain types of document collections, at least, the proposed methods help us find the relevant answers.
Even when we know nothing about the document structure but the XML syntax, we are able to take advantage of the XML structure when the content is indexed for full-text search.
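One way to picture the idea of relating character content to XML tags is the sketch below, which scores a fragment by the number of text characters per element. The measure, the threshold and the function names are illustrative assumptions and are not taken from the thesis; only the general idea of separating data-like fragments from full-text fragments comes from the abstract.

```python
import xml.etree.ElementTree as ET


def text_to_tag_ratio(element):
    """Characters of text content per element in the subtree.

    An illustrative measure only: data-like fragments tend to have many
    tags around short values, narrative fragments few tags around long text.
    """
    n_chars = sum(len(chunk.strip()) for chunk in element.itertext())
    n_tags = sum(1 for _ in element.iter())  # counts the element itself too
    return n_chars / n_tags


def looks_like_full_text(element, threshold=10.0):
    """Hypothetical decision rule; the threshold is an arbitrary assumption."""
    return text_to_tag_ratio(element) >= threshold


# Invented sample document with one data-like and one narrative fragment.
sample = ET.fromstring(
    "<doc>"
    "<record><id>17</id><year>2004</year></record>"
    "<section><p>XML documents are becoming more and more common in "
    "<em>various</em> environments.</p></section>"
    "</doc>"
)
record = sample.find("record")
section = sample.find("section")
fragments_to_index = [
    child.tag for child in sample if looks_like_full_text(child)
]
```

A rule of this kind needs only the XML syntax and no knowledge of the document type, which matches the setting described in the abstract. The inline `em` element in the sample is also the sort of emphasised phrase that could be given extra weight in an index.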