101 resultados para Text categorization


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Information retrieval (IR) by clinicians in the healthcare setting is critical for informing clinical decision-making. However, a large part of this information is in the form of free-text and inhibits clinical decision support and effective healthcare services. This makes meaningful use of clinical free-­text in electronic health records (EHRs) for patient care a difficult task. Within the context of IR, given a repository of free-­text clinical reports, one might want to retrieve and analyse data for patients who have a known clinical finding.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Tax law and policy is a vital part of Australian society. Australian society insists that the Federal Government provide extensive public programs, such as health services, education, social security, foreign aid, legal infra¬structure, regulation, police services, national defence and funding for sports development. These programs are costly to provide and are funded by taxation. The aim of this book is to introduce and explain the principles of tax law and tax policy in plain English. The book contains detailed commentary on tax principles together with extracts from cases and materials that illustrate the application of the principles. The book considers tax policy and the economic and social aspects of tax law. While tax students must develop technical competence in tax law, given the speed with which changes are made to the technical details of tax law, it is also important to grasp tax principles and policy to understand why tax law has changed or why it should change. The chapters are structured to direct readers to the key provisions of the tax law. Each case is introduced by an explanation of the facts, followed by the taxpayer’s arguments, the Commissioner’s assertions and the decision of the Administrative Appeals Tribunal or a court. The commentary guides readers through the issues considered in the judgments. The book contains extracts from: articles; materials dealing with tax policy; and the Commissioner’s rulings. The book also has references for further reading and medium-neutral citations (Internet citations) for cases decided since 1998.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

With the overwhelming increase in the amount of texts on the web, it is almost impossible for people to keep abreast of up-to-date information. Text mining is a process by which interesting information is derived from text through the discovery of patterns and trends. Text mining algorithms are used to guarantee the quality of extracted knowledge. However, the extracted patterns using text or data mining algorithms or methods leads to noisy patterns and inconsistency. Thus, different challenges arise, such as the question of how to understand these patterns, whether the model that has been used is suitable, and if all the patterns that have been extracted are relevant. Furthermore, the research raises the question of how to give a correct weight to the extracted knowledge. To address these issues, this paper presents a text post-processing method, which uses a pattern co-occurrence matrix to find the relation between extracted patterns in order to reduce noisy patterns. The main objective of this paper is not only reducing the number of closed sequential patterns, but also improving the performance of pattern mining as well. The experimental results on Reuters Corpus Volume 1 data collection and TREC filtering topics show that the proposed method is promising.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Internet services are important part of daily activities for most of us. These services come with sophisticated authentication requirements which may not be handled by average Internet users. The management of secure passwords for example creates an extra overhead which is often neglected due to usability reasons. Furthermore, password-based approaches are applicable only for initial logins and do not protect against unlocked workstation attacks. In this paper, we provide a non-intrusive identity verification scheme based on behavior biometrics where keystroke dynamics based-on free-text is used continuously for verifying the identity of a user in real-time. We improved existing keystroke dynamics based verification schemes in four aspects. First, we improve the scalability where we use a constant number of users instead of whole user space to verify the identity of target user. Second, we provide an adaptive user model which enables our solution to take the change of user behavior into consideration in verification decision. Next, we identify a new distance measure which enables us to verify identity of a user with shorter text. Fourth, we decrease the number of false results. Our solution is evaluated on a data set which we have collected from users while they were interacting with their mail-boxes during their daily activities.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A big challenge for classification on text is the noisy of text data. It makes classification quality low. Many classification process can be divided into two sequential steps scoring and threshold setting (thresholding). Therefore to deal with noisy data problem, it is important to describe positive feature effectively scoring and to set a suitable threshold. Most existing text classifiers do not concentrate on these two jobs. In this paper, we propose a novel text classifier with pattern-based scoring that describe positive feature effectively, followed by threshold setting. The thresholding is based on score of training set, make it is simple to implement in other scoring methods. Experiment shows that our pattern-based classifier is promising.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Much has been written on Michel Foucault’s reluctance to clearly delineate a research method, particularly with respect to genealogy (Harwood 2000; Meadmore, Hatcher, & McWilliam 2000; Tamboukou 1999). Foucault (1994, p. 288) himself disliked prescription stating, “I take care not to dictate how things should be” and wrote provocatively to disrupt equilibrium and certainty, so that “all those who speak for others or to others” no longer know what to do. It is doubtful, however, that Foucault ever intended for researchers to be stricken by that malaise to the point of being unwilling to make an intellectual commitment to methodological possibilities. Taking criticism of “Foucauldian” discourse analysis as a convenient point of departure to discuss the objectives of poststructural analyses of language, this paper develops what might be called a discursive analytic; a methodological plan to approach the analysis of discourses through the location of statements that function with constitutive effects.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Text categorisation is challenging, due to the complex structure with heterogeneous, changing topics in documents. The performance of text categorisation relies on the quality of samples, effectiveness of document features, and the topic coverage of categories, depending on the employing strategies; supervised or unsupervised; single labelled or multi-labelled. Attempting to deal with these reliability issues in text categorisation, we propose an unsupervised multi-labelled text categorisation approach that maps the local knowledge in documents to global knowledge in a world ontology to optimise categorisation result. The conceptual framework of the approach consists of three modules; pattern mining for feature extraction; feature-subject mapping for categorisation; concept generalisation for optimised categorisation. The approach has been promisingly evaluated by compared with typical text categorisation methods, based on the ground truth encoded by human experts.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In a classification problem typically we face two challenging issues, the diverse characteristic of negative documents and sometimes a lot of negative documents that are closed to positive documents. Therefore, it is hard for a single classifier to clearly classify incoming documents into classes. This paper proposes a novel gradual problem solving to create a two-stage classifier. The first stage identifies reliable negatives (negative documents with weak positive characteristics). It concentrates on minimizing the number of false negative documents (recall-oriented). We use Rocchio, an existing recall based classifier, for this stage. The second stage is a precision-oriented “fine tuning”, concentrates on minimizing the number of false positive documents by applying pattern (a statistical phrase) mining techniques. In this stage a pattern-based scoring is followed by threshold setting (thresholding). Experiment shows that our statistical phrase based two-stage classifier is promising.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The question as to whether poser race affects the happy categorization advantage, the faster categorization of happy than of negative emotional expressions, has been answered inconsistently. Hugenberg (2005) found the happy categorization advantage only for own race faces whereas faster categorization of angry expressions was evident for other race faces. Kubota and Ito (2007) found a happy categorization advantage for both own race and other race faces. These results have vastly different implications for understanding the influence of race cues on the processing of emotional expressions. The current study replicates the results of both prior studies and indicates that face type (computer-generated vs. photographic), presentation duration, and especially stimulus set size influence the happy categorization advantage as well as the moderating effect of poser race.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We propose a cluster ensemble method to map the corpus documents into the semantic space embedded in Wikipedia and group them using multiple types of feature space. A heterogeneous cluster ensemble is constructed with multiple types of relations i.e. document-term, document-concept and document-category. A final clustering solution is obtained by exploiting associations between document pairs and hubness of the documents. Empirical analysis with various real data sets reveals that the proposed meth-od outperforms state-of-the-art text clustering approaches.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Reliability of the performance of biometric identity verification systems remains a significant challenge. Individual biometric samples of the same person (identity class) are not identical at each presentation and performance degradation arises from intra-class variability and inter-class similarity. These limitations lead to false accepts and false rejects that are dependent. It is therefore difficult to reduce the rate of one type of error without increasing the other. The focus of this dissertation is to investigate a method based on classifier fusion techniques to better control the trade-off between the verification errors using text-dependent speaker verification as the test platform. A sequential classifier fusion architecture that integrates multi-instance and multisample fusion schemes is proposed. This fusion method enables a controlled trade-off between false alarms and false rejects. For statistically independent classifier decisions, analytical expressions for each type of verification error are derived using base classifier performances. As this assumption may not be always valid, these expressions are modified to incorporate the correlation between statistically dependent decisions from clients and impostors. The architecture is empirically evaluated by applying the proposed architecture for text dependent speaker verification using the Hidden Markov Model based digit dependent speaker models in each stage with multiple attempts for each digit utterance. The trade-off between the verification errors is controlled using the parameters, number of decision stages (instances) and the number of attempts at each decision stage (samples), fine-tuned on evaluation/tune set. The statistical validation of the derived expressions for error estimates is evaluated on test data. The performance of the sequential method is further demonstrated to depend on the order of the combination of digits (instances) and the nature of repetitive attempts (samples). The false rejection and false acceptance rates for proposed fusion are estimated using the base classifier performances, the variance in correlation between classifier decisions and the sequence of classifiers with favourable dependence selected using the 'Sequential Error Ratio' criteria. The error rates are better estimated by incorporating user-dependent (such as speaker-dependent thresholds and speaker-specific digit combinations) and class-dependent (such as clientimpostor dependent favourable combinations and class-error based threshold estimation) information. The proposed architecture is desirable in most of the speaker verification applications such as remote authentication, telephone and internet shopping applications. The tuning of parameters - the number of instances and samples - serve both the security and user convenience requirements of speaker-specific verification. The architecture investigated here is applicable to verification using other biometric modalities such as handwriting, fingerprints and key strokes.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Managing large cohorts of undergraduate student nurses during off-campus clinical placement is complex and challenging. Clinical facilitators are required to support and assess nursing students during clinical placement. Therefore clear communication between university academic coordinators and clinical facilitators is essential for consistency and prompt management of emerging issues. Increasing work demands require both coordinators and facilitators to have an efficient and effective mode of communication. The aim of this study was to explore the use of Short Message Service (SMS) texts, sent between mobile phones, for communication between university Unit Coordinators and off-campus Clinical Facilitators. This study used an after-only design. During a two week clinical placement 46 clinical facilitators working with first and second year Bachelor of Nursing students from a large metropolitan Australian university were regularly sent SMS texts of relevant updates and reminders from the university coordinator. A 15 item questionnaire comprising x of 5 point likert scale and 3 open-ended questions was then used to survey the clinical facilitators. The response rate was 47.8% (n=22). Correlations were found between the approachability of the coordinator and facilitator perception of a) that the coordinator understood issues on clinical placement (r=0.785, p<0.001,), and b) being part of the teaching team (r=0.768, p<0.001). Analysis of responses to qualitative questions revealed three themes: connection, approachability and collaboration. Results indicate that SMS communication is convenient and appropriate in this setting. This quasi-experimental after-test study found regular SMS communication improves a sense of connection, approachability and collaboration.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Over the last decade, the majority of existing search techniques is either keyword- based or category-based, resulting in unsatisfactory effectiveness. Meanwhile, studies have illustrated that more than 80% of users preferred personalized search results. As a result, many studies paid a great deal of efforts (referred to as col- laborative filtering) investigating on personalized notions for enhancing retrieval performance. One of the fundamental yet most challenging steps is to capture precise user information needs. Most Web users are inexperienced or lack the capability to express their needs properly, whereas the existent retrieval systems are highly sensitive to vocabulary. Researchers have increasingly proposed the utilization of ontology-based tech- niques to improve current mining approaches. The related techniques are not only able to refine search intentions among specific generic domains, but also to access new knowledge by tracking semantic relations. In recent years, some researchers have attempted to build ontological user profiles according to discovered user background knowledge. The knowledge is considered to be both global and lo- cal analyses, which aim to produce tailored ontologies by a group of concepts. However, a key problem here that has not been addressed is: how to accurately match diverse local information to universal global knowledge. This research conducts a theoretical study on the use of personalized ontolo- gies to enhance text mining performance. The objective is to understand user information needs by a \bag-of-concepts" rather than \words". The concepts are gathered from a general world knowledge base named the Library of Congress Subject Headings. To return desirable search results, a novel ontology-based mining approach is introduced to discover accurate search intentions and learn personalized ontologies as user profiles. The approach can not only pinpoint users' individual intentions in a rough hierarchical structure, but can also in- terpret their needs by a set of acknowledged concepts. Along with global and local analyses, another solid concept matching approach is carried out to address about the mismatch between local information and world knowledge. Relevance features produced by the Relevance Feature Discovery model, are determined as representatives of local information. These features have been proven as the best alternative for user queries to avoid ambiguity and consistently outperform the features extracted by other filtering models. The two attempt-to-proposed ap- proaches are both evaluated by a scientific evaluation with the standard Reuters Corpus Volume 1 testing set. A comprehensive comparison is made with a num- ber of the state-of-the art baseline models, including TF-IDF, Rocchio, Okapi BM25, the deploying Pattern Taxonomy Model, and an ontology-based model. The gathered results indicate that the top precision can be improved remarkably with the proposed ontology mining approach, where the matching approach is successful and achieves significant improvements in most information filtering measurements. This research contributes to the fields of ontological filtering, user profiling, and knowledge representation. The related outputs are critical when systems are expected to return proper mining results and provide personalized services. The scientific findings have the potential to facilitate the design of advanced preference mining models, where impact on people's daily lives.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Background The prevalence of type 2 diabetes is rising internationally. Patients with diabetes have a higher risk of cardiovascular events accounting for substantial premature morbidity and mortality, and health care expenditure. Given healthcare workforce limitations, there is a need to improve interventions that promote positive self-management behaviours that enable patients to manage their chronic conditions effectively, across different cultural contexts. Previous studies have evaluated the feasibility of including telephone and Short Message Service (SMS) follow up in chronic disease self-management programs, but only for single diseases or in one specific population. Therefore, the aim of this study is to evaluate the feasibility and short-term efficacy of incorporating telephone and text messaging to support the care of patients with diabetes and cardiac disease, in Australia and in Taiwan. Methods/design A randomised controlled trial design will be used to evaluate a self-management program for people with diabetes and cardiac disease that incorporates the use of simple remote-access communication technologies. A sample size of 180 participants from Australia and Taiwan will be recruited and randomised in a one-to-one ratio to receive either the intervention in addition to usual care (intervention) or usual care alone (control). The intervention will consist of in-hospital education as well as follow up utilising personal telephone calls and SMS reminders. Primary short term outcomes of interest include self-care behaviours and self-efficacy assessed at baseline and four weeks. Discussion If the results of this investigation substantiate the feasibility and efficacy of the telephone and SMS intervention for promoting self management among patients with diabetes and cardiac disease in Australia and Taiwan, it will support the external validity of the intervention. It is anticipated that empirical data from this investigation will provide valuable information to inform future international collaborations, while providing a platform for further enhancements of the program, which has potential to benefit patients internationally.