898 results for Text mining, Classificazione, Stemming, Text categorization
Abstract:
Retrieving information from Twitter is always challenging because of its large volume, inconsistent writing and noise. Most existing information retrieval (IR) and text mining methods take a term-based approach, but they suffer from term variation problems such as polysemy and synonymy. These problems worsen when such methods are applied to Twitter because of the length limit on tweets. Over the years, researchers have hypothesised that pattern-based methods should perform better than term-based methods because they provide more context, but few studies have tested this hypothesis, especially on Twitter. This paper presents an innovative framework for performing IR on microblogs. The proposed framework discovers patterns in tweets as higher-level features and uses them to assign weights to low-level features (i.e. terms) based on their distributions in the higher-level features. We present experimental results on the TREC11 microblog dataset and show that the proposed approach significantly outperforms the term-based methods Okapi BM25 and TF-IDF, as well as pattern-based methods, in terms of precision, recall and F measures.
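The evaluation measures named in this abstract can be illustrated with a minimal sketch. It assumes set-based relevance judgements for a single query and the balanced F measure (F1); the function name and the tweet IDs are hypothetical and are not taken from the paper.

```python
def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall and F1 for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant items that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical tweet IDs, for illustration only.
p, r, f = precision_recall_f1(["t1", "t2", "t3", "t4"], ["t1", "t3", "t5"])
print(f"precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```

Two of the four retrieved tweets are relevant and two of the three relevant tweets are retrieved, so the sketch reports a precision of 1/2 and a recall of 2/3.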
Abstract:
Much has been written on Michel Foucault’s reluctance to clearly delineate a research method, particularly with respect to genealogy (Harwood 2000; Meadmore, Hatcher, & McWilliam 2000; Tamboukou 1999). Foucault (1994, p. 288) himself disliked prescription stating, “I take care not to dictate how things should be” and wrote provocatively to disrupt equilibrium and certainty, so that “all those who speak for others or to others” no longer know what to do. It is doubtful, however, that Foucault ever intended for researchers to be stricken by that malaise to the point of being unwilling to make an intellectual commitment to methodological possibilities. Taking criticism of “Foucauldian” discourse analysis as a convenient point of departure to discuss the objectives of poststructural analyses of language, this paper develops what might be called a discursive analytic; a methodological plan to approach the analysis of discourses through the location of statements that function with constitutive effects.
Abstract:
Our everyday environment is full of text but this rich source of information remains largely inaccessible to mobile robots. In this paper we describe an active text spotting system that uses a small number of wide angle views to locate putative text in the environment and then foveates and zooms onto that text in order to improve the reliability of text recognition. We present extensive experimental results obtained with a pan/tilt/zoom camera and a ROS-based mobile robot operating in an indoor environment.
Abstract:
Information retrieval (IR) by clinicians in the healthcare setting is critical for informing clinical decision-making. However, a large part of this information is in the form of free-text and inhibits clinical decision support and effective healthcare services. This makes meaningful use of clinical free-text in electronic health records (EHRs) for patient care a difficult task. Within the context of IR, given a repository of free-text clinical reports, one might want to retrieve and analyse data for patients who have a known clinical finding.
Abstract:
Tax law and policy is a vital part of Australian society. Australian society insists that the Federal Government provide extensive public programs, such as health services, education, social security, foreign aid, legal infrastructure, regulation, police services, national defence and funding for sports development. These programs are costly to provide and are funded by taxation. The aim of this book is to introduce and explain the principles of tax law and tax policy in plain English. The book contains detailed commentary on tax principles together with extracts from cases and materials that illustrate the application of the principles. The book considers tax policy and the economic and social aspects of tax law. While tax students must develop technical competence in tax law, given the speed with which changes are made to the technical details of tax law, it is also important to grasp tax principles and policy to understand why tax law has changed or why it should change. The chapters are structured to direct readers to the key provisions of the tax law. Each case is introduced by an explanation of the facts, followed by the taxpayer’s arguments, the Commissioner’s assertions and the decision of the Administrative Appeals Tribunal or a court. The commentary guides readers through the issues considered in the judgments. The book contains extracts from: articles; materials dealing with tax policy; and the Commissioner’s rulings. The book also has references for further reading and medium-neutral citations (Internet citations) for cases decided since 1998.
Abstract:
Internet services are an important part of daily activities for most of us. These services come with sophisticated authentication requirements that average Internet users may not be able to handle. The management of secure passwords, for example, creates an extra overhead that is often neglected for usability reasons. Furthermore, password-based approaches apply only to initial logins and do not protect against unlocked-workstation attacks. In this paper, we provide a non-intrusive identity verification scheme based on behavior biometrics, in which keystroke dynamics based on free text are used continuously to verify the identity of a user in real time. We improve existing keystroke-dynamics-based verification schemes in four respects. First, we improve scalability by using a constant number of users, instead of the whole user space, to verify the identity of the target user. Second, we provide an adaptive user model that enables our solution to take changes in user behavior into account in the verification decision. Third, we identify a new distance measure that enables us to verify the identity of a user with shorter text. Fourth, we decrease the number of false results. Our solution is evaluated on a data set that we collected from users while they were interacting with their mailboxes during their daily activities.
Abstract:
This thesis is a study of the automatic discovery of text features for describing user information needs. It presents an innovative data-mining approach that discovers useful knowledge from both relevance and non-relevance feedback information. The proposed approach can largely reduce noise in discovered patterns and significantly improve the performance of text mining systems. This study provides a promising method for the study of Data Mining and Web Intelligence.
Abstract:
Reliability of the performance of biometric identity verification systems remains a significant challenge. Individual biometric samples of the same person (identity class) are not identical at each presentation, and performance degradation arises from intra-class variability and inter-class similarity. These limitations lead to false accepts and false rejects that are dependent. It is therefore difficult to reduce the rate of one type of error without increasing the other. The focus of this dissertation is to investigate a method based on classifier fusion techniques to better control the trade-off between the verification errors, using text-dependent speaker verification as the test platform. A sequential classifier fusion architecture that integrates multi-instance and multi-sample fusion schemes is proposed. This fusion method enables a controlled trade-off between false alarms and false rejects. For statistically independent classifier decisions, analytical expressions for each type of verification error are derived using base classifier performances. As this assumption may not always be valid, these expressions are modified to incorporate the correlation between statistically dependent decisions from clients and impostors. The architecture is empirically evaluated by applying it to text-dependent speaker verification, using Hidden Markov Model based digit-dependent speaker models in each stage with multiple attempts for each digit utterance. The trade-off between the verification errors is controlled using two parameters, the number of decision stages (instances) and the number of attempts at each decision stage (samples), fine-tuned on an evaluation/tune set. The statistical validity of the derived expressions for error estimates is evaluated on test data. The performance of the sequential method is further demonstrated to depend on the order of the combination of digits (instances) and the nature of repetitive attempts (samples).
The false rejection and false acceptance rates for the proposed fusion are estimated using the base classifier performances, the variance in correlation between classifier decisions and the sequence of classifiers with favourable dependence selected using the 'Sequential Error Ratio' criterion. The error rates are better estimated by incorporating user-dependent (such as speaker-dependent thresholds and speaker-specific digit combinations) and class-dependent (such as client-impostor dependent favourable combinations and class-error based threshold estimation) information. The proposed architecture is desirable in most speaker verification applications, such as remote authentication and telephone and internet shopping applications. The tuning of the parameters (the number of instances and samples) serves both the security and user convenience requirements of speaker-specific verification. The architecture investigated here is applicable to verification using other biometric modalities such as handwriting, fingerprints and keystrokes.
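The trade-off described in this abstract can be illustrated with a small sketch for the statistically independent case. The abstract does not give the combination rules, so the sketch rests on assumptions: a decision stage accepts if any of its attempts is accepted, the claimant must be accepted at every stage, and every base classifier has the same false acceptance rate and false rejection rate. The function name is hypothetical and these are not the thesis's own expressions.

```python
def sequential_fusion_errors(far, frr, n_stages, n_attempts):
    """Overall (FAR, FRR) under the independence assumptions stated above.

    far, frr   -- base classifier false acceptance / false rejection rates
    n_stages   -- number of decision stages (instances), all must accept
    n_attempts -- attempts allowed per stage (samples), any one may accept
    """
    # Within a stage: an impostor gets in if any attempt is falsely accepted,
    # a client is rejected only if every attempt is falsely rejected.
    stage_far = 1.0 - (1.0 - far) ** n_attempts
    stage_frr = frr ** n_attempts
    # Across stages: an impostor must pass all stages, a client fails if
    # rejected at any stage.
    total_far = stage_far ** n_stages
    total_frr = 1.0 - (1.0 - stage_frr) ** n_stages
    return total_far, total_frr


for stages, attempts in [(1, 1), (2, 1), (1, 2), (2, 2)]:
    fa, fr = sequential_fusion_errors(0.1, 0.1, stages, attempts)
    print(f"stages={stages} attempts={attempts} FAR={fa:.4f} FRR={fr:.4f}")
```

Under these assumptions, adding stages lowers false acceptances and raises false rejections, while adding attempts does the opposite, which is why the two parameters together allow the balance between the errors to be tuned.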
Abstract:
Managing large cohorts of undergraduate student nurses during off-campus clinical placement is complex and challenging. Clinical facilitators are required to support and assess nursing students during clinical placement. Clear communication between university academic coordinators and clinical facilitators is therefore essential for consistency and prompt management of emerging issues. Increasing work demands require both coordinators and facilitators to have an efficient and effective mode of communication. The aim of this study was to explore the use of Short Message Service (SMS) texts, sent between mobile phones, for communication between university Unit Coordinators and off-campus Clinical Facilitators. This study used an after-only design. During a two-week clinical placement, 46 clinical facilitators working with first- and second-year Bachelor of Nursing students from a large metropolitan Australian university were regularly sent SMS texts of relevant updates and reminders from the university coordinator. A 15-item questionnaire comprising x of 5-point Likert scale and 3 open-ended questions was then used to survey the clinical facilitators. The response rate was 47.8% (n=22). Correlations were found between the approachability of the coordinator and the facilitators' perceptions that a) the coordinator understood issues on clinical placement (r=0.785, p<0.001), and b) they were part of the teaching team (r=0.768, p<0.001). Analysis of responses to qualitative questions revealed three themes: connection, approachability and collaboration. Results indicate that SMS communication is convenient and appropriate in this setting. This quasi-experimental after-test study found that regular SMS communication improves a sense of connection, approachability and collaboration.
Abstract:
Background The prevalence of type 2 diabetes is rising internationally. Patients with diabetes have a higher risk of cardiovascular events accounting for substantial premature morbidity and mortality, and health care expenditure. Given healthcare workforce limitations, there is a need to improve interventions that promote positive self-management behaviours that enable patients to manage their chronic conditions effectively, across different cultural contexts. Previous studies have evaluated the feasibility of including telephone and Short Message Service (SMS) follow up in chronic disease self-management programs, but only for single diseases or in one specific population. Therefore, the aim of this study is to evaluate the feasibility and short-term efficacy of incorporating telephone and text messaging to support the care of patients with diabetes and cardiac disease, in Australia and in Taiwan. Methods/design A randomised controlled trial design will be used to evaluate a self-management program for people with diabetes and cardiac disease that incorporates the use of simple remote-access communication technologies. A sample size of 180 participants from Australia and Taiwan will be recruited and randomised in a one-to-one ratio to receive either the intervention in addition to usual care (intervention) or usual care alone (control). The intervention will consist of in-hospital education as well as follow up utilising personal telephone calls and SMS reminders. Primary short term outcomes of interest include self-care behaviours and self-efficacy assessed at baseline and four weeks. Discussion If the results of this investigation substantiate the feasibility and efficacy of the telephone and SMS intervention for promoting self management among patients with diabetes and cardiac disease in Australia and Taiwan, it will support the external validity of the intervention. 
It is anticipated that empirical data from this investigation will provide valuable information to inform future international collaborations, while providing a platform for further enhancements of the program, which has potential to benefit patients internationally.
Abstract:
Purpose Contrast adaptation has been speculated to be an error signal for emmetropization. Myopic children exhibit higher contrast adaptation than emmetropic children. This study aimed to determine whether contrast adaptation varies with the type of text viewed by emmetropic and myopic young adults. Methods Baseline contrast sensitivity was determined in 25 emmetropic and 25 spectacle-corrected myopic young adults for 0.5, 1.2, 2.7, 4.4, and 6.2 cycles per degree (cpd) horizontal sine wave gratings. The adults spent periods looking at a 6.2 cpd high-contrast horizontal grating and reading lines of English and Chinese text (these texts comprised 1.2 cpd row and 6 cpd stroke frequencies). The effects of these near tasks on contrast sensitivity were determined, with decreases in sensitivity indicating contrast adaptation. Results Contrast adaptation was affected by the near task (F(2,672) = 43.0; P < 0.001). Adaptation was greater for the grating task (0.13 ± 0.17 log unit, averaged across all frequencies) than reading tasks, but there was no significant difference between the two reading tasks (English 0.05 ± 0.13 log unit versus Chinese 0.04 ± 0.13 log unit). The myopic group showed significantly greater adaptation (by 0.04, 0.04, and 0.05 log units for English, Chinese, and grating tasks, respectively) than the emmetropic group (F(1,48) = 5.0; P = 0.03). Conclusions In young adults, reading Chinese text induced similar contrast adaptation as reading English text. Myopes exhibited greater contrast adaptation than emmetropes. Contrast adaptation, independent of text type, might be associated with myopia development.