925 results for text analytic approaches
Abstract:
Acquiring correct user profiles for personalized text classification is a major challenge, since users may be unsure when describing their interests. Traditional approaches to user profiling adopt machine learning (ML) to automatically discover classification knowledge from explicit user feedback describing personal interests. However, the accuracy of ML-based methods cannot be significantly improved in many cases due to the term independence assumption and the uncertainties associated with them. This paper presents a novel relevance feedback approach for personalized text classification. It applies data mining to discover knowledge from relevant and non-relevant text and constrains specific knowledge with reasoning rules to eliminate some conflicting information. We also developed a Dempster-Shafer (DS) approach as the means to utilise the specific knowledge to build high-quality data models for classification. Experimental results on Reuters Corpus Volume 1 and TREC topics show that the proposed technique achieves encouraging performance in comparison with state-of-the-art relevance feedback models.
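The abstract does not say how the Dempster-Shafer step is implemented. As a rough illustration only, the sketch below applies the standard Dempster rule of combination to two mass functions over the hypotheses "relevant" and "non-relevant". The function name `combine`, the two evidence sources and all mass values are hypothetical and are not taken from the paper.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule of combination.

    Each mass function maps a frozenset of hypotheses to a mass in [0, 1].
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence supports the intersection of the two sets.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint sets contribute to the conflict mass K.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Normalise by 1 - K so the combined masses sum to one.
    return {h: mass / (1.0 - conflict) for h, mass in combined.items()}


REL = frozenset({"relevant"})
NON = frozenset({"non-relevant"})
EITHER = REL | NON  # mass left on the whole frame expresses uncertainty

# Hypothetical evidence from two sources about one document.
m_terms = {REL: 0.6, EITHER: 0.4}
m_patterns = {REL: 0.5, NON: 0.3, EITHER: 0.2}

fused = combine(m_terms, m_patterns)
for hypothesis, mass in fused.items():
    print(sorted(hypothesis), round(mass, 4))
```

Leaving mass on the whole frame (`EITHER`) lets a source stay uncommitted instead of forcing a probability onto each class, which is the main difference from a plain Bayesian update.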
Abstract:
Much has been written on Michel Foucault’s reluctance to clearly delineate a research method, particularly with respect to genealogy (Harwood 2000; Meadmore, Hatcher, & McWilliam 2000; Tamboukou 1999). Foucault (1994, p. 288) himself disliked prescription stating, “I take care not to dictate how things should be” and wrote provocatively to disrupt equilibrium and certainty, so that “all those who speak for others or to others” no longer know what to do. It is doubtful, however, that Foucault ever intended for researchers to be stricken by that malaise to the point of being unwilling to make an intellectual commitment to methodological possibilities. Taking criticism of “Foucauldian” discourse analysis as a convenient point of departure to discuss the objectives of poststructural analyses of language, this paper develops what might be called a discursive analytic; a methodological plan to approach the analysis of discourses through the location of statements that function with constitutive effects.
Abstract:
Many older people have difficulties using modern consumer products due to increased product complexity, both in terms of functionality and interface design. Previous research has shown that older people have more difficulty using complex devices intuitively than younger people. Furthermore, increased life expectancy and a falling birth rate have been catalysts for changes in world demographics over the past two decades. This trend also suggests a proportional increase of older people in the workforce. This realisation has led to research on the effective use of technology by older populations in an effort to engage them more productively and to assist them in leading independent lives. Ironically, not enough attention has been paid to the development of interaction design strategies that would actually enable older users to better exploit new technologies. Previous research suggests that if products are designed to reflect people's prior knowledge, they will appear intuitive to use. Since intuitive interfaces utilise domain-specific prior knowledge of users, they require minimal learning for effective interaction. However, older people are very diverse in their capabilities and domain-specific prior knowledge. In addition, ageing slows down the process of acquiring new knowledge. Keeping these suggestions and limitations in view, this study aimed to investigate possible approaches to developing interfaces that facilitate intuitive use by older people. To this end, two experiments were conducted that systematically investigated redundancy (the use of both text and icons) in interface design, complexity of interface structure (nested versus flat), and personal user factors such as cognitive abilities, perceived self-efficacy and technology anxiety. All of these factors could interfere with intuitive use.
The results from the first experiment suggest that, contrary to what was hypothesised, older people (65+ years) completed the tasks faster on the text-only interface design than on the redundant interface design. The outcome of the second experiment showed that, as expected, older people took more time on a nested interface. However, they did not make significantly more errors compared with younger age groups. Contrary to what was expected, older age groups also did better under anxious conditions. The findings of this study also suggest that older age groups are more heterogeneous in their capabilities, and that their intuitive use of contemporary technological devices is mediated more by domain-specific prior technology knowledge and by their cognitive abilities than by chronological age. This makes it extremely difficult to develop product interfaces that are entirely intuitive to use. However, keeping in view the cognitive limitations of older people when interfaces are developed, and using simple text-based interfaces with a flat interface structure, would help them intuitively learn and use complex technological products successfully during early encounters with a product. These findings indicate that it might be more pragmatic to design interfaces for intuitive learning rather than for intuitive use. Based on this research and the existing literature, a model for adaptable interface design as a strategy for developing intuitively learnable product interfaces was proposed. An adaptable interface can initially use a simple text-only interface to help older users learn and successfully use the new system. Over time, this can be progressively changed to a symbols-based nested interface for more efficient and intuitive use.
Abstract:
Internet services are an important part of daily activities for most of us. These services come with sophisticated authentication requirements which may not be handled well by average Internet users. The management of secure passwords, for example, creates an extra overhead which is often neglected for usability reasons. Furthermore, password-based approaches apply only to initial logins and do not protect against unlocked workstation attacks. In this paper, we provide a non-intrusive identity verification scheme based on behavioural biometrics, where keystroke dynamics based on free text are used continuously to verify the identity of a user in real time. We improve existing keystroke dynamics based verification schemes in four aspects. First, we improve scalability by using a constant number of users, instead of the whole user space, to verify the identity of the target user. Second, we provide an adaptive user model which enables our solution to take changes in user behaviour into consideration in the verification decision. Third, we identify a new distance measure which enables us to verify the identity of a user with shorter text. Fourth, we decrease the number of false results. Our solution is evaluated on a data set which we collected from users while they were interacting with their mailboxes during their daily activities.
Abstract:
Previous studies on lay theories of anorexia nervosa have examined the 'accuracy' of lay knowledge, and the identification of factors by family and friends that would encourage early interventions. In contrast to these approaches, we examine lay theories of anorexia nervosa using a critical psychology perspective. We argue that the use of a discourse analysis methodology enables the examination of the construction of lay theories through dominant concepts and ideas. Ten semi-structured interviews with five women and five men aged between 15 and 25 years were carried out. Participants were asked questions about three main aspects of anorexia nervosa: aetiology, treatment and relationship to gender. Each interview was analysed in terms of the structure, function and variability of discourse. Three discourses: sociocultural, individual and femininity, are discussed in relation to the interview questions. We conclude that, in this study, lay theories of anorexia nervosa were structured through key discourses that maintained a separation between sociocultural aspects of anorexia nervosa and individual psychology. This separation exists in dominant psychomedical conceptualizations of anorexia nervosa, reinforcing the concept that it is a form of psychopathology.
Abstract:
Health care interventions in the area of body image disturbance and eating disorders largely involve individual treatment approaches, while prevention and health promotion are relatively underexplored. A review of health promotion activities in the area of body image in Australia revealed three programmes, the most extensive and longest standing having been established in 1992. The aims of this programme are to reduce body image dissatisfaction and inappropriate eating behaviour, especially among women. Because health promotion is concerned with the social aspects of health, it was hypothesized by the authors that a social understanding of body image and eating disorders might be advanced in a health promotion setting and reflected in the approach to practice. In order to examine approaches to body image in health promotion, 10 health professionals responsible for the design and management of this programme participated in a series of semi-structured interviews between 1997 and 2000. Three discursive themes were evident in health workers' explanations of body image problems: (1) cognitive-behavioural themes; (2) gender themes; and (3) socio-cultural themes. While body image problems were constructed as psychological problems that are particularly experienced by women, their origins were largely conceived to be socio-cultural. The implications of these constructions are critically discussed in terms of the approach to health promotion used in this programme.
Abstract:
Previous studies on lay theories of anorexia nervosa have examined the ‘accuracy’ of lay knowledge, and the identification of factors by family and friends that would encourage early interventions (Huon, Brown, & Morris, 1988, 7, 239–252; Murray, Touyz, & Beumont, 1990, 9, 87–93). In contrast to these approaches, we examine lay theories of anorexia nervosa using a critical psychology perspective. We argue that the use of a discourse analysis methodology enables the examination of the construction of lay theories through dominant concepts and ideas. Ten semi-structured interviews with five women and five men aged between 15 and 25 years were carried out. Participants were asked questions about three main aspects of anorexia nervosa: aetiology, treatment and relationship to gender. Each interview was analysed in terms of the structure, function and variability of discourse. Three discourses: sociocultural, individual and femininity, are discussed in relation to the interview questions. We conclude that, in this study, lay theories of anorexia nervosa were structured through key discourses that maintained a separation between sociocultural aspects of anorexia nervosa and individual psychology. This separation exists in dominant psychomedical conceptualizations of anorexia nervosa, reinforcing the concept that it is a form of psychopathology.
Abstract:
Over the last decade, the majority of existing search techniques have been either keyword-based or category-based, resulting in unsatisfactory effectiveness. Meanwhile, studies have illustrated that more than 80% of users preferred personalized search results. As a result, many studies have devoted a great deal of effort (referred to as collaborative filtering) to investigating personalized notions for enhancing retrieval performance. One of the fundamental yet most challenging steps is to capture precise user information needs. Most Web users are inexperienced or lack the capability to express their needs properly, whereas existing retrieval systems are highly sensitive to vocabulary. Researchers have increasingly proposed the use of ontology-based techniques to improve current mining approaches. The related techniques are not only able to refine search intentions among specific generic domains, but also to access new knowledge by tracking semantic relations. In recent years, some researchers have attempted to build ontological user profiles according to discovered user background knowledge. The knowledge is considered in both global and local analyses, which aim to produce tailored ontologies from a group of concepts. However, a key problem that has not been addressed is how to accurately match diverse local information to universal global knowledge. This research conducts a theoretical study on the use of personalized ontologies to enhance text mining performance. The objective is to understand user information needs by a "bag-of-concepts" rather than "words". The concepts are gathered from a general world knowledge base named the Library of Congress Subject Headings. To return desirable search results, a novel ontology-based mining approach is introduced to discover accurate search intentions and learn personalized ontologies as user profiles.
The approach can not only pinpoint users' individual intentions in a rough hierarchical structure, but can also interpret their needs by a set of acknowledged concepts. Along with global and local analyses, another solid concept matching approach is carried out to address the mismatch between local information and world knowledge. Relevance features produced by the Relevance Feature Discovery model are taken as representatives of local information. These features have been proven to be the best alternative to user queries for avoiding ambiguity, and they consistently outperform the features extracted by other filtering models. The two proposed approaches are both evaluated scientifically on the standard Reuters Corpus Volume 1 testing set. A comprehensive comparison is made with a number of state-of-the-art baseline models, including TF-IDF, Rocchio, Okapi BM25, the deploying Pattern Taxonomy Model, and an ontology-based model. The gathered results indicate that the top precision can be improved remarkably with the proposed ontology mining approach, and that the matching approach is successful and achieves significant improvements in most information filtering measurements. This research contributes to the fields of ontological filtering, user profiling, and knowledge representation. The related outputs are critical when systems are expected to return proper mining results and provide personalized services. The scientific findings have the potential to facilitate the design of advanced preference mining models, which have an impact on people's daily lives.
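Okapi BM25 is one of the baselines named in this abstract. For readers unfamiliar with it, here is a minimal sketch of BM25 scoring. The parameter values `k1=1.2` and `b=0.75`, the smoothed non-negative IDF variant, the whitespace tokenizer and the three example documents are all assumptions made for illustration; the abstract does not state the configuration used in the evaluation.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Smoothed IDF variant that never goes negative (an assumption).
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalisation.
            denom = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


# Hypothetical three-document collection.
docs = [
    "oil prices rise as crude supply falls",
    "central bank holds interest rates",
    "crude oil exports and oil futures climb",
]
scores = bm25_scores("crude oil", docs)
for doc, score in zip(docs, scores):
    print(f"{score:.3f}  {doc}")
```

The first and third documents have the same length, so the third ranks higher only because "oil" occurs twice in it; the second shares no query term and scores zero.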
Abstract:
Objective: To develop and evaluate machine learning techniques that identify limb fractures and other abnormalities (e.g. dislocations) from radiology reports. Materials and Methods: 99 free-text reports of limb radiology examinations were acquired from an Australian public hospital. Two clinicians were employed to identify fractures and abnormalities from the reports; a third senior clinician resolved disagreements. These assessors found that, of the 99 reports, 48 referred to fractures or abnormalities of limb structures. Automated methods were then used to extract features from these reports that could be useful for their automatic classification. The Naive Bayes classification algorithm and two implementations of the support vector machine algorithm were formally evaluated using cross-fold validation over the 99 reports. Results: The results show that the Naive Bayes classifier accurately identifies fractures and other abnormalities from the radiology reports. These results were achieved when extracting stemmed token bigram and negation features, as well as using these features in combination with SNOMED CT concepts related to abnormalities and disorders. The latter feature has not been used in previous works that attempted to classify free-text radiology reports. Discussion: Automated classification methods have proven effective at identifying fractures and other abnormalities from radiology reports (F-Measure up to 92.31%). Key to the success of these techniques are features such as stemmed token bigrams, negations, and SNOMED CT concepts associated with morphologic abnormalities and disorders. Conclusion: This investigation shows early promising results, and future work will further validate and strengthen the proposed approaches.
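To make the feature types and the evaluation metric mentioned above concrete, the following sketch extracts token bigram and negation features from a report sentence by sentence and computes the F-measure as the harmonic mean of precision and recall. It is an illustration under assumptions, not the study's pipeline: stemming and SNOMED CT concept mapping are omitted, and the cue list `NEGATION_CUES`, the two-token negation scope, the example report and the label vectors are all hypothetical.

```python
NEGATION_CUES = {"no", "not", "without"}  # hypothetical cue list


def extract_features(report, scope=2):
    """Return token bigram and negation features for one report.

    Stemming is omitted here; the study describes stemmed token bigrams.
    """
    features = set()
    for sentence in report.lower().split("."):
        tokens = sentence.replace(",", " ").split()
        # Token bigrams within a sentence.
        for first, second in zip(tokens, tokens[1:]):
            features.add(f"bigram={first}_{second}")
        # Mark the tokens that follow a negation cue within a fixed scope.
        for i, token in enumerate(tokens):
            if token in NEGATION_CUES:
                for negated in tokens[i + 1 : i + 1 + scope]:
                    features.add(f"negated={negated}")
    return features


def f_measure(gold, predicted):
    """F-measure (F1) for binary labels, where 1 means abnormal."""
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical report text and labels.
features = extract_features("No fracture seen. Soft tissue swelling noted.")
score = f_measure([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
print(sorted(features))
print(round(score, 4))
```

Negation features matter for this task because "no fracture seen" and "fracture seen" share most of their tokens but carry opposite labels.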
Abstract:
Background: A major challenge for assessing students’ conceptual understanding of STEM subjects is the capacity of assessment tools to reliably and robustly evaluate student thinking and reasoning. Multiple-choice tests are typically used to assess student learning and are designed to include distractors that can indicate students’ incomplete understanding of a topic or concept based on which distractor the student selects. However, these tests fail to provide the critical information uncovering the how and why of students’ reasoning for their multiple-choice selections. Open-ended or structured response questions are one method for capturing higher level thinking, but are often costly in terms of the time and attention needed to properly assess student responses. Purpose: The goal of this study is to evaluate methods for automatically assessing open-ended responses, e.g. students’ written explanations and reasoning for multiple-choice selections. Design/Method: We incorporated an open response component into an online signals and systems multiple-choice test to capture written explanations of students’ selections. The effectiveness of an automated approach for identifying and assessing student conceptual understanding was evaluated by comparing the results of lexical analysis software packages (Leximancer and NVivo) with expert human analysis of student responses. In order to understand and delineate the process for effectively analysing text provided by students, the researchers evaluated the strengths and weaknesses of both the human and automated approaches. Results: Human and automated analyses revealed both correct and incorrect associations for certain conceptual areas. For some questions, these were not anticipated or included in the distractor selections, showing how multiple-choice questions alone fail to capture the comprehensive picture of student understanding.
The comparison of textual analysis methods revealed the capability of automated lexical analysis software to assist in the identification of concepts and their relationships for large textual data sets. We also identified several challenges in using automated analysis as well as manual and computer-assisted analysis. Conclusions: This study highlighted the usefulness of incorporating and analysing students’ reasoning or explanations in understanding how students think about certain conceptual ideas. The ultimate value of automating the evaluation of written explanations is that it can be applied more frequently and at various stages of instruction to formatively evaluate conceptual understanding and engage students in reflective
Abstract:
Guaranteeing the quality of relevance features discovered in text documents for describing user preferences is a major challenge because of large-scale terms and data patterns. Most existing popular text mining and classification methods have adopted term-based approaches. However, they have all suffered from the problems of polysemy and synonymy. Over the years, it has often been hypothesized that pattern-based methods should perform better than term-based ones in describing user preferences; yet how to effectively use large-scale patterns remains a hard problem in text mining. To make a breakthrough on this challenging issue, this paper presents an innovative model for relevance feature discovery. It discovers both positive and negative patterns in text documents as higher-level features and deploys them over low-level features (terms). It also classifies terms into categories and updates term weights based on their specificity and their distributions in patterns. Substantial experiments using this model on RCV1, TREC topics and Reuters-21578 show that the proposed model significantly outperforms both the state-of-the-art term-based methods and the pattern-based methods.
Abstract:
Narrative text is a useful source for identifying injury circumstances in routine emergency department data collections. Automatically classifying narratives with machine learning is a promising approach that can reduce the tedious manual classification process. Existing works focus on using Naive Bayes, which does not always offer the best performance. This paper proposes Matrix Factorization approaches along with a learning enhancement process for this task. The results are compared with the performance of various other classification approaches. The impact of parameter settings on the classification results for a medical text dataset is discussed. With the right choice of dimension k, the Non-negative Matrix Factorization model achieves a 10-fold cross-validation (CV) accuracy of 0.93.
Abstract:
Experience has shown that developing business applications that are based on text analysis normally requires a lot of time and expertise in the field of computational linguistics. Several approaches to integrating text analysis systems with business applications have been proposed, but so far there has been no coordinated approach that would enable building scalable and flexible applications of text analysis in enterprise scenarios. In this paper, a service-oriented architecture for text processing applications in the business domain is introduced. It comprises various groups of processing components and knowledge resources. The architecture, created as a result of our experience with building natural language processing applications in business scenarios, allows for the reuse of text analysis and other components, and facilitates the development of business applications. We verify our approach by showing how the proposed architecture can be applied to create a text-analytics-enabled business application that addresses a concrete business scenario. © 2010 IEEE.
Abstract:
Concept mapping involves determining relevant concepts from a free-text input, where concepts are defined in an external reference ontology. This is an important process that underpins many applications for clinical information reporting, derivation of phenotypic descriptions, and a number of state-of-the-art medical information retrieval methods. Concept mapping can be cast into an information retrieval (IR) problem: free-text mentions are treated as queries and concepts from a reference ontology as the documents to be indexed and retrieved. This paper presents an empirical investigation applying general-purpose IR techniques for concept mapping in the medical domain. A dataset used for evaluating medical information extraction is adapted to measure the effectiveness of the considered IR approaches. Standard IR approaches used here are contrasted with the effectiveness of two established benchmark methods specifically developed for medical concept mapping. The empirical findings show that the IR approaches are comparable with one benchmark method but well below the best benchmark.
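The framing of concept mapping as retrieval can be sketched as follows: each concept description is indexed as a document, and a free-text mention is issued as a query against that index. The example ranks concepts by TF-IDF cosine similarity, which is only one of many general-purpose IR models and is not necessarily among those evaluated in the paper. The concept identifiers and descriptions in `CONCEPTS` are hypothetical and do not come from any real reference ontology.

```python
import math
from collections import Counter

# Hypothetical reference ontology: concept id -> textual description.
CONCEPTS = {
    "C001": "fracture of bone",
    "C002": "dislocation of joint",
    "C003": "myocardial infarction heart attack",
}


def tokenize(text):
    return text.lower().split()


def build_index(concepts):
    """Index concept descriptions as TF-IDF vectors."""
    n_concepts = len(concepts)
    df = Counter(t for text in concepts.values() for t in set(tokenize(text)))
    idf = {term: math.log(n_concepts / count) + 1.0 for term, count in df.items()}
    vectors = {
        cid: {t: tf * idf[t] for t, tf in Counter(tokenize(text)).items()}
        for cid, text in concepts.items()
    }
    return idf, vectors


def cosine(u, v):
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def map_mention(mention, idf, vectors):
    """Treat the mention as a query and rank concepts by similarity."""
    counts = Counter(tokenize(mention))
    # Terms that never occur in the ontology carry no weight.
    query = {t: tf * idf[t] for t, tf in counts.items() if t in idf}
    return sorted(vectors, key=lambda cid: cosine(query, vectors[cid]), reverse=True)


idf, vectors = build_index(CONCEPTS)
ranking = map_mention("patient had a heart attack", idf, vectors)
print(ranking[0])
```

A purely lexical matcher like this cannot map a mention that shares no words with any concept description, which is one plausible reason general-purpose IR models can fall short of methods built specifically for medical concept mapping.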