152 resultados para Rule-Based Classification


Relevância:

30.00% 30.00%

Publicador:

Resumo:

Purpose – The work presented in this paper aims to provide an approach to classifying web logs by personal properties of users. Design/methodology/approach – The authors describe an iterative system that begins with a small set of manually labeled terms, which are used to label queries from the log. A set of background knowledge related to these labeled queries is acquired by combining web search results on these queries. This background set is used to obtain many terms that are related to the classification task. The system then ranks each of the related terms, choosing those that most fit the personal properties of the users. These terms are then used to begin the next iteration. Findings – The authors identify the difficulties of classifying web logs, by approaching this problem from a machine learning perspective. By applying the approach developed, the authors are able to show that many queries in a large query log can be classified. Research limitations/implications – Testing results in this type of classification work is difficult, as the true personal properties of web users are unknown. Evaluation of the classification results in terms of the comparison of classified queries to well known age-related sites is a direction that is currently being exploring. Practical implications – This research is background work that can be incorporated in search engines or other web-based applications, to help marketing companies and advertisers. Originality/value – This research enhances the current state of knowledge in short-text classification and query log learning. Classification schemes, Computer networks, Information retrieval, Man-machine systems, User interfaces

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In information retrieval (IR) research, more and more focus has been placed on optimizing a query language model by detecting and estimating the dependencies between the query and the observed terms occurring in the selected relevance feedback documents. In this paper, we propose a novel Aspect Language Modeling framework featuring term association acquisition, document segmentation, query decomposition, and an Aspect Model (AM) for parameter optimization. Through the proposed framework, we advance the theory and practice of applying high-order and context-sensitive term relationships to IR. We first decompose a query into subsets of query terms. Then we segment the relevance feedback documents into chunks using multiple sliding windows. Finally we discover the higher order term associations, that is, the terms in these chunks with high degree of association to the subsets of the query. In this process, we adopt an approach by combining the AM with the Association Rule (AR) mining. In our approach, the AM not only considers the subsets of a query as “hidden” states and estimates their prior distributions, but also evaluates the dependencies between the subsets of a query and the observed terms extracted from the chunks of feedback documents. The AR provides a reasonable initial estimation of the high-order term associations by discovering the associated rules from the document chunks. Experimental results on various TREC collections verify the effectiveness of our approach, which significantly outperforms a baseline language model and two state-of-the-art query language models namely the Relevance Model and the Information Flow model

Relevância:

30.00% 30.00%

Publicador:

Resumo:

It is a big challenge to acquire correct user profiles for personalized text classification since users may be unsure in providing their interests. Traditional approaches to user profiling adopt machine learning (ML) to automatically discover classification knowledge from explicit user feedback in describing personal interests. However, the accuracy of ML-based methods cannot be significantly improved in many cases due to the term independence assumption and uncertainties associated with them. This paper presents a novel relevance feedback approach for personalized text classification. It basically applies data mining to discover knowledge from relevant and non-relevant text and constraints specific knowledge by reasoning rules to eliminate some conflicting information. We also developed a Dempster-Shafer (DS) approach as the means to utilise the specific knowledge to build high-quality data models for classification. The experimental results conducted on Reuters Corpus Volume 1 and TREC topics support that the proposed technique achieves encouraging performance in comparing with the state-of-the-art relevance feedback models.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Inspection of solder joints has been a critical process in the electronic manufacturing industry to reduce manufacturing cost, improve yield, and ensure project quality and reliability. This paper proposes the use of the Log-Gabor filter bank, Discrete Wavelet Transform and Discrete Cosine Transform for feature extraction of solder joint images on Printed Circuit Boards (PCBs). A distance based on the Mahalanobis Cosine metric is also presented for classification of five different types of solder joints. From the experimental results, this methodology achieved high accuracy and a well generalised performance. This can be an effective method to reduce cost and improve quality in the production of PCBs in the manufacturing industry.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Design-build (DB) is a generic form of construction procurement, and, rather than simply representing a single system, it has evolved in practice into a variety of forms, each of which is similar to, and yet different from each other. Although the importance of selecting an appropriate DB variant has been widely accepted, difficulties occur in practice due to the multiplicity of terms and concepts used. What is needed is some kind of taxonomy or framework within which the individual variants can be placed and their relative attributes identified and understood. Through a comprehensive literature review and content analysis, this paper establishes a systematic classification framework for DB variants based on their operational attributes. In addition to providing much needed support for decision-making, this classification framework provides client/owners with perspectives to understand and examine different categories of DB variants from an operational perspective.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The development of text classification techniques has been largely promoted in the past decade due to the increasing availability and widespread use of digital documents. Usually, the performance of text classification relies on the quality of categories and the accuracy of classifiers learned from samples. When training samples are unavailable or categories are unqualified, text classification performance would be degraded. In this paper, we propose an unsupervised multi-label text classification method to classify documents using a large set of categories stored in a world ontology. The approach has been promisingly evaluated by compared with typical text classification methods, using a real-world document collection and based on the ground truth encoded by human experts.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Conventional rainfall classification for modelling and prediction is quantity based. This approach can lead to inaccuracies in stormwater quality modelling due to the assignment of stochastic pollutant parameters to a rainfall event. A taxonomy for natural rainfall events in the context of stormwater quality is presented based on an in-depth investigation of the influence of rainfall characteristics on stormwater quality. In the research study, the natural rainfall events were classified into three types based on average rainfall intensity and rainfall duration and the classification was found to be independent of the catchment characteristics. The proposed taxonomy provides an innovative concept in stormwater quality modelling and prediction and will contribute to enhancing treatment design for stormwater quality mitigation.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Background: To derive preference-based measures from various condition-specific descriptive health-related quality of life (HRQOL) measures. A general 2-stage method is evolved: 1) an item from each domain of the HRQOL measure is selected to form a health state classification system (HSCS); 2) a sample of health states is valued and an algorithm derived for estimating the utility of all possible health states. The aim of this analysis was to determine whether confirmatory or exploratory factor analysis (CFA, EFA) should be used to derive a cancer-specific utility measure from the EORTC QLQ-C30. Methods: Data were collected with the QLQ-C30v3 from 356 patients receiving palliative radiotherapy for recurrent or metastatic cancer (various primary sites). The dimensional structure of the QLQ-C30 was tested with EFA and CFA, the latter based on a conceptual model (the established domain structure of the QLQ-C30: physical, role, emotional, social and cognitive functioning, plus several symptoms) and clinical considerations (views of both patients and clinicians about issues relevant to HRQOL in cancer). The dimensions determined by each method were then subjected to item response theory, including Rasch analysis. Results: CFA results generally supported the proposed conceptual model, with residual correlations requiring only minor adjustments (namely, introduction of two cross-loadings) to improve model fit (increment χ2(2) = 77.78, p < .001). Although EFA revealed a structure similar to the CFA, some items had loadings that were difficult to interpret. Further assessment of dimensionality with Rasch analysis aligned the EFA dimensions more closely with the CFA dimensions. Three items exhibited floor effects (>75% observation at lowest score), 6 exhibited misfit to the Rasch model (fit residual > 2.5), none exhibited disordered item response thresholds, 4 exhibited DIF by gender or cancer site. Upon inspection of the remaining items, three were considered relatively less clinically important than the remaining nine. Conclusions: CFA appears more appropriate than EFA, given the well-established structure of the QLQ-C30 and its clinical relevance. Further, the confirmatory approach produced more interpretable results than the exploratory approach. Other aspects of the general method remain largely the same. The revised method will be applied to a large number of data sets as part of the international and interdisciplinary project to develop a multi-attribute utility instrument for cancer (MAUCa).

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Articular cartilage is a complex structure with an architecture in which fluid-swollen proteoglycans constrained within a 3D network of collagen fibrils. Because of the complexity of the cartilage structure, the relationship between its mechanical behaviours at the macroscale level and its components at the micro-scale level are not completely understood. The research objective in this thesis is to create a new model of articular cartilage that can be used to simulate and obtain insight into the micro-macro-interaction and mechanisms underlying its mechanical responses during physiological function. The new model of articular cartilage has two characteristics, namely: i) not use fibre-reinforced composite material idealization ii) Provide a framework for that it does probing the micro mechanism of the fluid-solid interaction underlying the deformation of articular cartilage using simple rules of repartition instead of constitutive / physical laws and intuitive curve-fitting. Even though there are various microstructural and mechanical behaviours that can be studied, the scope of this thesis is limited to osmotic pressure formation and distribution and their influence on cartilage fluid diffusion and percolation, which in turn governs the deformation of the compression-loaded tissue. The study can be divided into two stages. In the first stage, the distributions and concentrations of proteoglycans, collagen and water were investigated using histological protocols. Based on this, the structure of cartilage was conceptualised as microscopic osmotic units that consist of these constituents that were distributed according to histological results. These units were repeated three-dimensionally to form the structural model of articular cartilage. In the second stage, cellular automata were incorporated into the resulting matrix (lattice) to simulate the osmotic pressure of the fluid and the movement of water within and out of the matrix; following the osmotic pressure gradient in accordance with the chosen rule of repartition of the pressure. The outcome of this study is the new model of articular cartilage that can be used to simulate and study the micromechanical behaviours of cartilage under different conditions of health and loading. These behaviours are illuminated at the microscale level using the socalled neighbourhood rules developed in the thesis in accordance with the typical requirements of cellular automata modelling. Using these rules and relevant Boundary Conditions to simulate pressure distribution and related fluid motion produced significant results that provided the following insight into the relationships between osmotic pressure gradient and associated fluid micromovement, and the deformation of the matrix. For example, it could be concluded that: 1. It is possible to model articular cartilage with the agent-based model of cellular automata and the Margolus neighbourhood rule. 2. The concept of 3D inter connected osmotic units is a viable structural model for the extracellular matrix of articular cartilage. 3. Different rules of osmotic pressure advection lead to different patterns of deformation in the cartilage matrix, enabling an insight into how this micromechanism influences macromechanical deformation. 4. When features such as transition coefficient were changed, permeability (representing change) is altered due to the change in concentrations of collagen, proteoglycans (i.e. degenerative conditions), the deformation process is impacted. 5. The boundary conditions also influence the relationship between osmotic pressure gradient and fluid movement at the micro-scale level. The outcomes are important to cartilage research since we can use these to study the microscale damage in the cartilage matrix. From this, we are able to monitor related diseases and their progression leading to potential insight into drug-cartilage interaction for treatment. This innovative model is an incremental progress on attempts at creating further computational modelling approaches to cartilage research and other fluid-saturated tissues and material systems.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A big challenge for classification on text is the noisy of text data. It makes classification quality low. Many classification process can be divided into two sequential steps scoring and threshold setting (thresholding). Therefore to deal with noisy data problem, it is important to describe positive feature effectively scoring and to set a suitable threshold. Most existing text classifiers do not concentrate on these two jobs. In this paper, we propose a novel text classifier with pattern-based scoring that describe positive feature effectively, followed by threshold setting. The thresholding is based on score of training set, make it is simple to implement in other scoring methods. Experiment shows that our pattern-based classifier is promising.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Currently, recommender systems (RS) have been widely applied in many commercial e-commerce sites to help users deal with the information overload problem. Recommender systems provide personalized recommendations to users and thus help them in making good decisions about which product to buy from the vast number of product choices available to them. Many of the current recommender systems are developed for simple and frequently purchased products like books and videos, by using collaborative-filtering and content-based recommender system approaches. These approaches are not suitable for recommending luxurious and infrequently purchased products as they rely on a large amount of ratings data that is not usually available for such products. This research aims to explore novel approaches for recommending infrequently purchased products by exploiting user generated content such as user reviews and product click streams data. From reviews on products given by the previous users, association rules between product attributes are extracted using an association rule mining technique. Furthermore, from product click streams data, user profiles are generated using the proposed user profiling approach. Two recommendation approaches are proposed based on the knowledge extracted from these resources. The first approach is developed by formulating a new query from the initial query given by the target user, by expanding the query with the suitable association rules. In the second approach, a collaborative-filtering recommender system and search-based approaches are integrated within a hybrid system. In this hybrid system, user profiles are used to find the target user’s neighbour and the subsequent products viewed by them are then used to search for other relevant products. Experiments have been conducted on a real world dataset collected from one of the online car sale companies in Australia to evaluate the effectiveness of the proposed recommendation approaches. The experiment results show that user profiles generated from user click stream data and association rules generated from user reviews can improve recommendation accuracy. In addition, the experiment results also prove that the proposed query expansion and the hybrid collaborative filtering and search-based approaches perform better than the baseline approaches. Integrating the collaborative-filtering and search-based approaches has been challenging as this strategy has not been widely explored so far especially for recommending infrequently purchased products. Therefore, this research will provide a theoretical contribution to the recommender system field as a new technique of combining collaborative-filtering and search-based approaches will be developed. This research also contributes to a development of a new query expansion technique for infrequently purchased products recommendation. This research will also provide a practical contribution to the development of a prototype system for recommending cars.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Authenticated Encryption (AE) is the cryptographic process of providing simultaneous confidentiality and integrity protection to messages. This approach is more efficient than applying a two-step process of providing confidentiality for a message by encrypting the message, and in a separate pass providing integrity protection by generating a Message Authentication Code (MAC). AE using symmetric ciphers can be provided by either stream ciphers with built in authentication mechanisms or block ciphers using appropriate modes of operation. However, stream ciphers have the potential for higher performance and smaller footprint in hardware and/or software than block ciphers. This property makes stream ciphers suitable for resource constrained environments, where storage and computational power are limited. There have been several recent stream cipher proposals that claim to provide AE. These ciphers can be analysed using existing techniques that consider confidentiality or integrity separately; however currently there is no existing framework for the analysis of AE stream ciphers that analyses these two properties simultaneously. This thesis introduces a novel framework for the analysis of AE using stream cipher algorithms. This thesis analyzes the mechanisms for providing confidentiality and for providing integrity in AE algorithms using stream ciphers. There is a greater emphasis on the analysis of the integrity mechanisms, as there is little in the public literature on this, in the context of authenticated encryption. The thesis has four main contributions as follows. The first contribution is the design of a framework that can be used to classify AE stream ciphers based on three characteristics. The first classification applies Bellare and Namprempre's work on the the order in which encryption and authentication processes take place. The second classification is based on the method used for accumulating the input message (either directly or indirectly) into the into the internal states of the cipher to generate a MAC. The third classification is based on whether the sequence that is used to provide encryption and authentication is generated using a single key and initial vector, or two keys and two initial vectors. The second contribution is the application of an existing algebraic method to analyse the confidentiality algorithms of two AE stream ciphers; namely SSS and ZUC. The algebraic method is based on considering the nonlinear filter (NLF) of these ciphers as a combiner with memory. This method enables us to construct equations for the NLF that relate the (inputs, outputs and memory of the combiner) to the output keystream. We show that both of these ciphers are secure from this type of algebraic attack. We conclude that using a keydependent SBox in the NLF twice, and using two different SBoxes in the NLF of ZUC, prevents this type of algebraic attack. The third contribution is a new general matrix based model for MAC generation where the input message is injected directly into the internal state. This model describes the accumulation process when the input message is injected directly into the internal state of a nonlinear filter generator. We show that three recently proposed AE stream ciphers can be considered as instances of this model; namely SSS, NLSv2 and SOBER-128. Our model is more general than a previous investigations into direct injection. Possible forgery attacks against this model are investigated. It is shown that using a nonlinear filter in the accumulation process of the input message when either the input message or the initial states of the register is unknown prevents forgery attacks based on collisions. The last contribution is a new general matrix based model for MAC generation where the input message is injected indirectly into the internal state. This model uses the input message as a controller to accumulate a keystream sequence into an accumulation register. We show that three current AE stream ciphers can be considered as instances of this model; namely ZUC, Grain-128a and Sfinks. We establish the conditions under which the model is susceptible to forgery and side-channel attacks.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In a classification problem typically we face two challenging issues, the diverse characteristic of negative documents and sometimes a lot of negative documents that are closed to positive documents. Therefore, it is hard for a single classifier to clearly classify incoming documents into classes. This paper proposes a novel gradual problem solving to create a two-stage classifier. The first stage identifies reliable negatives (negative documents with weak positive characteristics). It concentrates on minimizing the number of false negative documents (recall-oriented). We use Rocchio, an existing recall based classifier, for this stage. The second stage is a precision-oriented “fine tuning”, concentrates on minimizing the number of false positive documents by applying pattern (a statistical phrase) mining techniques. In this stage a pattern-based scoring is followed by threshold setting (thresholding). Experiment shows that our statistical phrase based two-stage classifier is promising.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Cardiomyopathies represent a group of diseases of the myocardium of the heart and include diseases both primarily of the cardiac muscle and systemic diseases leading to adverse effects on the heart muscle size, shape, and function. Traditionally cardiomyopathies were defined according to phenotypical appearance. Now, as our understanding of the pathophysiology of the different entities classified under each of the different phenotypes improves and our knowledge of the molecular and genetic basis for these entities progresses, the traditional classifications seem oversimplistic and do not reflect current understanding of this myriad of diseases and disease processes. Although our knowledge of the exact basis of many of the disease processes of cardiomyopathies is still in its infancy, it is important to have a classification system that has the ability to incorporate the coming tide of molecular and genetic information. This paper discusses how the traditional classification of cardiomyopathies based on morphology has evolved due to rapid advances in our understanding of the genetic and molecular basis for many of these clinical entities.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

PURPOSE We wanted to assess the effectiveness of a home-based physical activity program, the Depression in Late Life Intervention Trial of Exercise (DeLLITE), in improving function, quality of life, and mood in older people with depressive symptoms. METHODS We undertook a randomized controlled trial involving 193 people aged 75 years and older with depressive symptoms at enrollment who were recruited from primary health care practices in Auckland, New Zealand. Participants received either an individualized physical activity program or social visits to control for the contact time of the activity intervention delivered over 6 months. Primary outcome measures were function, a short physical performance battery comprising balance and mobility, and the Nottingham Extended Activities of Daily Living scale. Secondary outcome measures were quality of life, the Medical Outcomes Study 36-item short form, mood, Geriatric Depression Scale (GDS-15), physical activity, Auckland Heart Study Physical Activity Questionnaire, and self-report of falls. Repeated measures analyses tested the differential impact on outcomes over 12 months’ follow-up. RESULTS The mean age of the participants was 81 years, and 59% were women. All participants scored in the at–risk category on the depression screen, 53% had a Diagnostic and Statistical Manual of Mental Disorders or International Classification of Diseases, Tenth Revision diagnosis of major depression or scored more than 4 on the GDS-15 at baseline, indicating moderate or severe depression. Almost all participants, 187 (97%), completed the trial. Overall there were no differences in the impact of the 2 interventions on outcomes. Mood and mental health related quality of life improved for both groups. CONCLUSION he DeLLITE activity program improved mood and quality of life for older people with depressive symptoms as much as the effect of social visits. Future social and activity interventions should be tested against a true usual care control.