13 results for complexity classification

in Deakin Research Online - Australia


Relevance: 30.00%

Abstract:

Artificial neural networks (ANN) are increasingly used to solve many problems related to pattern recognition and object classification. In this paper, we report on a study using artificial neural networks to classify two kinds of animal fibers: merino and mohair. We have developed two different models: one extracts nine scale parameters with image processing, while the other uses an unsupervised artificial neural network to extract features automatically, with the extracted features determined by the complexity of the scale structure and the accuracy of the model. Although the first model can achieve higher accuracy, it requires more effort for image processing and more prior knowledge, since the accuracy of the ANN largely depends on the parameters selected. The second model is more robust than the first, since only raw images are used. Because only ordinary optical images taken with a microscope are employed, the approach can be used for many textile applications without expensive equipment such as scanning electron microscopy.
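As an illustration of the second, raw-image pipeline, the sketch below uses PCA as a stand-in for the paper's unsupervised feature-extraction network, followed by a small supervised ANN. The data, image sizes and labels are synthetic placeholders, not the study's dataset.

```python
# Minimal sketch, assuming fixed-size grey-scale fibre images.
# PCA stands in for the unsupervised feature extractor; data are synthetic.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.random((200, 32 * 32))                   # 200 flattened 32x32 images
y = (X[:, :10].mean(axis=1) > 0.5).astype(int)   # 0 = merino, 1 = mohair (synthetic)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = make_pipeline(PCA(n_components=16),
                      MLPClassifier(max_iter=1000, random_state=0))
model.fit(X_tr, y_tr)
print("test accuracy:", model.score(X_te, y_te))
```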


Relevance: 30.00%

Abstract:

Data mining refers to extracting or "mining" knowledge from large amounts of data. It is an increasingly popular field that uses statistical, visualization, machine learning, and other data manipulation and knowledge extraction techniques to gain insight into the relationships and patterns hidden in the data. The availability of digital data within picture archiving and communication systems raises the possibility of enhancing health care and research through the manipulation, processing and handling of data by computers. That is the basis for the development of computer-assisted radiology. Its further development is associated with the use of new intelligent capabilities such as multimedia support and data mining in order to discover the knowledge relevant for diagnosis. It is very useful if the results of data mining can be communicated to humans in an understandable way. In this paper, we present our work on data mining in medical image archiving systems. We investigate the use of a very efficient data mining technique, the decision tree, to learn the knowledge for computer-assisted image analysis. We apply our method to the classification of x-ray images for lung cancer diagnosis. The proposed technique is based on an inductive decision tree learning algorithm that has low complexity with high transparency and accuracy. The results show that the proposed algorithm is robust, accurate and fast, and that it produces a comprehensible structure summarizing the knowledge it induces.
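As a sketch of the comprehensible-structure claim, the snippet below trains a shallow decision tree on placeholder image-derived features and prints the induced rules. The feature names and data are illustrative assumptions, not the paper's dataset.

```python
# Learn a shallow tree on synthetic image features, then print its rules.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(1)
X = rng.random((300, 4))                           # per-region texture/shape features
y = (X[:, 0] + 0.5 * X[:, 2] > 0.9).astype(int)    # synthetic "suspicious" label

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(export_text(tree, feature_names=["mean_intensity", "contrast",
                                       "area", "roundness"]))
```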

Relevance: 30.00%

Abstract:

A major challenge facing freshwater ecologists and managers is the development of models that link stream ecological condition to catchment scale effects, such as land use. Previous attempts to make such models have followed two general approaches. The bottom-up approach employs mechanistic models, which can quickly become too complex to be useful. The top-down approach employs empirical models derived from large data sets, and has often suffered from large amounts of unexplained variation in stream condition.

We believe that the lack of success of both modelling approaches may be at least partly explained by scientists considering too broad a range of catchment types. Thus, we believe that by stratifying large sets of catchments into groups of similar types prior to modelling, both types of models may be improved. This paper describes preliminary work using a Bayesian classification software package, ‘Autoclass’ (Cheeseman and Stutz 1996), to create classes of catchments within the Murray Darling Basin based on physiographic data.

Autoclass uses a model-based classification method that employs finite mixture modelling and trades off model fit versus complexity, leading to a parsimonious solution. The software provides information on the posterior probability that the classification is ‘correct’ and also probabilities for alternative classifications. The importance of each attribute in defining the individual classes is calculated and presented, assisting description of the classes. Each case is ‘assigned’ to a class based on membership probability, but the probability of membership of other classes is also provided. This feature deals very well with cases that do not fit neatly into a larger class. Lastly, Autoclass requires the user to specify the measurement error of continuous variables.
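Autoclass itself is a standalone package; as a rough sketch of the same ideas, the snippet below uses scikit-learn's GaussianMixture to fit finite mixtures of increasing size, trade fit against complexity via the BIC, and report soft class memberships. The catchment attributes here are synthetic placeholders.

```python
# Finite mixture classification with a fit-versus-complexity trade-off,
# mimicking Autoclass with scikit-learn; data are synthetic.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 1, (100, 3)),      # synthetic catchment attributes
               rng.normal(3, 1, (80, 3))])

models = [GaussianMixture(k, random_state=0).fit(X) for k in range(1, 7)]
best = min(models, key=lambda m: m.bic(X))      # parsimony: penalised fit
print("classes found:", best.n_components)

membership = best.predict_proba(X)              # soft assignment per catchment
print("first case class probabilities:", membership[0].round(3))
```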

Catchments were derived from the Australian digital elevation model. Physiographic data were derived from national spatial data sets. There was very little information on measurement errors for the spatial data, so a conservative error of 5% of the data range was adopted for all continuous attributes. The incorporation of uncertainty into spatial data sets remains a research challenge.

The results of the classification were very encouraging. The software found nine classes of catchments in the Murray Darling Basin. The classes grouped together geographically, and followed altitude and latitude gradients, despite the fact that these variables were not included in the classification. Descriptions of the classes reveal very different physiographic environments, ranging from dry and flat catchments (i.e. lowlands), through to wet and hilly catchments (i.e. mountainous areas). Rainfall and slope were two important discriminators between classes. These two attributes, in particular, will affect the ways in which the stream interacts with the catchment, and can thus be expected to modify the effects of land use change on ecological condition. Thus, realistic models of the effects of land use change on streams will differ between the different types of catchments, and sound management practices will differ accordingly.

A small number of catchments were assigned to their primary class with relatively low probability. These catchments lie on the boundaries of groups of catchments, with the second most likely class being an adjacent group. The locations of these ‘uncertain’ catchments show that the Bayesian classification dealt well with cases that do not fit neatly into larger classes.

Although the results are intuitive, we cannot yet assess whether the classifications described in this paper would assist the modelling of catchment scale effects on stream ecological condition. It is most likely that catchment classification and modelling will be an iterative process, where the needs of the model are used to guide classification, and the results of classifications used to suggest further refinements to models.

Relevance: 30.00%

Abstract:

Aim: To determine the time needed to provide clinical pharmacy services to individual patient episodes for medical and surgical patients and the effect of patient presentation and complexity on the clinical pharmacy workload. Method: During a 5-month period in 2006 at two general hospitals, pharmacists recorded a defined range of activities that they provided for patients, including the actual times required for these tasks. A customised database linked to the two hospitals' patient administration systems stored the data according to the specific patient episode number. The influence of patient presentation and complexity on the clinical pharmacy activities provided was also examined. Results: The average time required by pharmacists to undertake a medication history interview and medication reconciliation was 9.6 (SD 4.9) minutes. Interventions required 5.7 (SD 4.6) minutes, clinical review of the medical record 5.5 (SD 4.0) minutes and medication order review 3.5 (SD 2.0) minutes. For all of these activities, the time required for medical patients was greater than for surgical patients and greater for 'complicated' patients. The average time required to perform all clinical pharmacy activities for 1071 completed patient episodes was 14.4 (SD 10.9) minutes and was greater for medical and 'complicated' patients. Conclusion: The time needed to provide clinical pharmacy services was affected by whether the patients were medical or surgical. The existence of comorbidities or complications affected these times. The times required to perform clinical pharmacy activities may not be consistent with recently proposed staff ratios for the provision of a basic clinical pharmacy service.

Relevance: 30.00%

Abstract:

In this paper we propose a spam filtering technique using a (2+1)-tier classification approach. The main focus of this paper is to reduce the false positive (FP) rate, which is considered an important research issue in spam filtering. In our approach, the email message is first classified by the two first-tier classifiers and the outputs are passed to the analyzer. When the predictions are identical, the analyzer checks the labels of the output emails and sends them to the corresponding mailboxes based on those labels. If the first two tiers disagree, the analyzer invokes the tier-3 classifier, which takes the final decision. This technique reduces the analysis complexity of our previous work. We also show that the proposed technique gives better performance in terms of reducing false positives as well as better accuracy.
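A minimal sketch of the analyzer logic described above, assuming each classifier object exposes a predict(email) method returning "spam" or "ham"; the names are illustrative, not the authors' code.

```python
# (2+1)-tier analyser: agree -> deliver; disagree -> tier-3 arbitrates.
def classify_email(email, tier1, tier2, tier3):
    label1, label2 = tier1.predict(email), tier2.predict(email)
    if label1 == label2:              # identical prediction: route directly
        return label1
    return tier3.predict(email)      # disagreement: tier-3 takes the decision

# Toy stand-in classifiers for demonstration only.
class Stub:
    def __init__(self, label): self.label = label
    def predict(self, email): return self.label

print(classify_email("win $$$ now", Stub("spam"), Stub("ham"), Stub("spam")))
```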

Relevance: 30.00%

Abstract:

This thesis proposes an innovative adaptive multi-classifier spam filtering model, with a grey-list analyser and a dynamic feature selection method, to overcome false-positive problems in email classification. It also presents additional techniques to minimize the added complexity. Empirical evidence indicates the success of this model over existing approaches.

Relevance: 30.00%

Abstract:

In the last decade, email has become one of the primary methods of communication, used widely for the exchange of ideas and information. However, in recent years, along with the rapid growth of the Internet and email, there has been a dramatic growth in spam. Classification algorithms have been used successfully to filter spam, but with a certain amount of false-positive trade-off. This problem is mainly caused by the dynamic nature of spam content and spam delivery strategies, as well as the diversification of classification algorithms. This paper presents an email classification approach that reduces the analysis burden of the grey list (GL) analyser, as a further refinement of our previous multi-classifier based email classification [10]. In this approach, we introduce a "majority voting grey list" (MVGL) analysis technique, with two different variations, which analyzes only the GL emails. Our empirical evidence proves the improvements of this approach, in terms of complexity and cost, compared to the existing GL analyser. The approach also overcomes the human interaction limitation of the existing analysis technique.
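A minimal sketch of the majority-voting idea, assuming each base classifier returns a "spam" or "ham" label for a grey-list email; the names and interfaces are illustrative assumptions, not the MVGL implementation.

```python
# Majority voting over grey-list emails with toy stand-in classifiers.
from collections import Counter

class Stub:
    def __init__(self, label): self.label = label
    def predict(self, email): return self.label

def mvgl_vote(email, classifiers):
    """Return the majority label over all base classifiers."""
    votes = Counter(c.predict(email) for c in classifiers)
    return votes.most_common(1)[0][0]   # majority label wins

voters = [Stub("spam"), Stub("spam"), Stub("ham")]
print(mvgl_vote("cheap pills", voters))  # -> "spam"
```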

Relevance: 30.00%

Abstract:

The effective management of our marine ecosystems requires the capability to identify, characterise and predict the distribution of benthic biological communities within the overall seascape architecture. The rapid expansion of seabed mapping studies has seen an increase in the application of automated classification techniques to map benthic habitats efficiently, and in the need for techniques to assess the confidence of model outputs. We use towed video observations and 11 seafloor complexity variables derived from multibeam echosounder (MBES) bathymetry and backscatter to predict the distribution of 8 dominant benthic biological communities in a 54 km2 site off the central coast of Victoria, Australia. The same training and evaluation datasets were used to compare the accuracies of a Maximum Likelihood Classifier (MLC) and two new-generation decision tree methods, QUEST (Quick Unbiased Efficient Statistical Tree) and CRUISE (Classification Rule with Unbiased Interaction Selection and Estimation), for predicting dominant biological communities. The QUEST classifier produced significantly better results than the CRUISE and MLC model runs, with an overall accuracy of 80% (Kappa 0.75). We found that the relationship between accuracy and training-set size varies between algorithms: QUEST accuracy generally increased in a linear fashion, CRUISE performed well with smaller training data sets, and MLC performed least favourably overall, generating anomalous results with changes to training size. We also demonstrate how predicted habitat maps can provide insights into habitat spatial complexity on the continental shelf. Significant variation in patch size between habitat types and significant correlations between patch size and depth were also observed.
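QUEST and CRUISE have no common scikit-learn implementations, so the sketch below substitutes a generic decision tree trained on synthetic "seafloor complexity" variables and reports overall accuracy and Cohen's kappa, the two figures quoted above.

```python
# Generic decision-tree stand-in evaluated with accuracy and kappa.
import numpy as np
from sklearn.metrics import accuracy_score, cohen_kappa_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(3)
X = rng.random((500, 11))                        # 11 bathymetry/backscatter derivatives
y = (4 * X[:, 0] + 2 * X[:, 5]).astype(int)      # synthetic community labels (0..5)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
pred = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr).predict(X_te)
print("accuracy:", accuracy_score(y_te, pred),
      "kappa:", cohen_kappa_score(y_te, pred))
```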

Relevance: 30.00%

Abstract:

In this paper, we propose a rule pruning strategy to reduce the number of rules in a fuzzy rule-based classification system. A confidence factor, formulated based on the compatibility of the rules with the input patterns, is deployed for rule pruning. The pruning strategy aims to reduce the complexity of the fuzzy classification system while, at the same time, maintaining the accuracy rate at a good level. To evaluate the effectiveness of the pruning strategy, two benchmark data sets are first tested. Then, a fault classification problem with real sensor measurements collected from a power generation plant is evaluated. The results obtained are analyzed and explained, and the implications of the proposed rule pruning strategy for the fuzzy classification system are discussed.
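A minimal sketch of confidence-based rule pruning, assuming each fuzzy rule carries a precomputed compatibility-based confidence factor; the Rule type and threshold are illustrative, not the paper's formulation.

```python
# Keep only rules whose confidence factor clears a threshold.
from dataclasses import dataclass

@dataclass
class Rule:
    antecedent: str
    consequent: str
    confidence: float   # compatibility-based confidence factor

def prune(rules, threshold=0.2):
    """Discard low-confidence rules to reduce system complexity."""
    return [r for r in rules if r.confidence >= threshold]

rules = [Rule("x1 is LOW", "class A", 0.9),
         Rule("x2 is HIGH", "class B", 0.05)]
print(len(prune(rules)), "of", len(rules), "rules kept")
```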

Relevance: 30.00%

Abstract:

Brain Computer Interfaces (BCIs) play an important role in communication between humans and machines. This communication is based on human brain signals: users perform tasks using their brain signals instead of limb or body movements. The brain signals are analyzed and translated into commands to control communication devices, robots or computers. In this paper, the aim is to enhance the performance of BCI systems through better classification of prosthetic motor imagery tasks. The challenging part is to use only a single channel of electroencephalography (EEG). The user's task is arm movement imagination: participants were asked to imagine moving their arm up or down, and our system detected the imagination from the input brain signal. EEG quality features were extracted from the brain signal, and a Decision Tree was used to classify the participant's imagination based on the extracted features. Our system is online, meaning that it gives a decision as soon as the signal is presented to the system (taking only 20 ms). Also, only one EEG channel is used for classification, which reduces the complexity of the system and leads to fast performance. One hundred signals were used for testing; on average, 97.4% of the up-down prosthetic motor imagery tasks were detected correctly. Due to its high speed and accuracy, this method can be used in many different applications, such as moving artificial limbs and wheelchairs.
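A minimal sketch of the single-channel pipeline: a few summary features are extracted from each EEG window and a decision tree classifies the imagined movement. The features and data below are placeholders, not the paper's exact quality features.

```python
# Feature extraction from single-channel EEG windows + decision tree.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def eeg_features(window):
    """Simple summary features of one EEG window (illustrative only)."""
    return [window.mean(), window.std(), np.abs(np.diff(window)).mean()]

rng = np.random.default_rng(4)
signals = rng.normal(size=(100, 256))   # 100 one-channel EEG windows
labels = rng.integers(0, 2, 100)        # 0 = up, 1 = down (synthetic)

X = np.array([eeg_features(s) for s in signals])
clf = DecisionTreeClassifier(random_state=0).fit(X, labels)
print(clf.predict(X[:5]))               # decisions are near-instant at this size
```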

Relevance: 30.00%

Abstract:

Heart rate complexity analysis is a powerful non-invasive means of diagnosing several cardiac ailments. Non-linear tools of complexity measurement are indispensable for bringing out the complete non-linear behavior of physiological signals. The most popular non-linear tools for measuring signal complexity are entropy measures such as approximate entropy (ApEn) and sample entropy (SampEn). However, these methods can become unreliable and inaccurate, particularly for short-length data. Recently, a novel method of complexity measurement called distribution entropy (DistEn) was introduced, which showed reliable performance in capturing the complexity of both short-term synthetic and short-term physiologic data. This study aims to i) examine the competence of DistEn in discriminating Arrhythmia from Normal Sinus Rhythm (NSR) subjects, using RR interval time series data; ii) explore the consistency of DistEn with data length N; and iii) compare the performance of DistEn with ApEn and SampEn. Sixty-six RR interval time series belonging to two groups of cardiac conditions, namely 'Arrhythmia' and 'NSR', were used for the analysis. The data length N was varied from 50 to 1000 beats, with embedding dimension m = 2 for all entropy measurements. The maximum ROC areas obtained using ApEn, SampEn and DistEn were 0.83, 0.86 and 0.94, for data lengths of 1000, 1000 and 500 beats respectively. The results show that DistEn exhibits a consistently high performance as a classification feature in comparison with ApEn and SampEn. Therefore, DistEn shows promising behavior as a biomarker for detecting Arrhythmia from short-length RR interval data.
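For reference, a compact sketch of DistEn following its published definition: embed the series, take all pairwise Chebyshev distances, histogram them into M bins, and return the normalised Shannon entropy of that histogram. Here m = 2 matches the abstract, while the bin count M = 512 and the synthetic RR series are assumed settings.

```python
# Distribution entropy of a time series (sketch; M = 512 is assumed).
import numpy as np

def dist_en(x, m=2, M=512):
    x = np.asarray(x, dtype=float)
    # Embedding: overlapping vectors of length m
    vecs = np.array([x[i:i + m] for i in range(len(x) - m + 1)])
    # Chebyshev (max-norm) distance between every pair of vectors
    d = np.abs(vecs[:, None, :] - vecs[None, :, :]).max(axis=2)
    d = d[np.triu_indices_from(d, k=1)]
    p, _ = np.histogram(d, bins=M)       # empirical distance distribution
    p = p / p.sum()
    p = p[p > 0]
    return -(p * np.log2(p)).sum() / np.log2(M)   # normalised Shannon entropy

rr = np.random.default_rng(5).normal(0.8, 0.05, 500)   # synthetic RR intervals
print("DistEn:", round(dist_en(rr), 3))
```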

Relevance: 30.00%

Abstract:

A two-stage hybrid model for data classification and rule extraction is proposed. The first stage uses a Fuzzy ARTMAP (FAM) classifier with Q-learning (known as QFAM) for incremental learning of data samples, while the second stage uses a Genetic Algorithm (GA) for rule extraction from QFAM. Given a new data sample, the resulting hybrid model, known as QFAM-GA, is able to predict the target class of the sample and to give a fuzzy if-then rule explaining the prediction. To reduce the network complexity, a pruning scheme based on Q-values is applied to reduce the number of prototypes generated by QFAM. A 'don't care' technique is employed to minimize the number of input features using the GA. A number of benchmark problems are used to evaluate the effectiveness of QFAM-GA in terms of test accuracy, noise tolerance, and model complexity (number of rules and total rule length). The results are comparable with, if not better than, those of many other models reported in the literature. The main significance of this research is a usable and useful intelligent model (i.e., QFAM-GA) for data classification in noisy conditions, with the capability of yielding a set of explanatory rules with minimum antecedents. In addition, QFAM-GA is able to maximize accuracy and minimize model complexity simultaneously. The empirical outcome positively demonstrates the potential impact of QFAM-GA in practical environments, i.e., providing domain users with an accurate prediction and a concise justification for it, allowing them to adopt QFAM-GA as a useful decision support tool in their decision-making processes.
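As a loose sketch of the Q-value pruning step only, the snippet below drops prototypes whose learned Q-values fall below a threshold; the prototype and Q-value structures are illustrative assumptions, not the QFAM-GA implementation.

```python
# Prune low-Q-value prototypes to reduce network complexity (sketch).
import numpy as np

def prune_prototypes(prototypes, q_values, q_min=0.0):
    """Keep only prototypes whose Q-values exceed q_min."""
    keep = np.asarray(q_values) > q_min
    return [p for p, k in zip(prototypes, keep) if k]

protos = [np.array([0.1, 0.9]), np.array([0.5, 0.5]), np.array([0.8, 0.2])]
print(len(prune_prototypes(protos, q_values=[1.4, -0.3, 0.7])),
      "prototypes kept")
```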

Relevance: 30.00%

Abstract:

Epilepsy is an electrophysiological disorder of the brain, the hallmark of which is recurrent and unprovoked seizures. The electroencephalogram (EEG) measures the electrical activity of the brain and is commonly applied as a non-invasive technique for seizure detection. Although a vast number of publications have appeared on intelligent algorithms to classify interictal and ictal EEG, it remains an open question whether seizures can be detected using short-length EEG recordings. In this study, we proposed three protocols for selecting 5 s EEG segments to classify interictal and ictal EEG from normal. We used the publicly accessible Bonn database, which consists of normal, interictal, and ictal EEG signals with a length of 4097 sampling points (23.6 s) per record. We selected three segments of 868 points (5 s) from each recording and evaluated the results for each of them separately. The well-studied irregularity measure, sample entropy (SampEn), and a more recently proposed complexity measure, distribution entropy (DistEn), were used as classification features. A total of 20 combinations of the input parameters m and τ for the calculation of SampEn and DistEn were selected for compatibility. The results showed that SampEn was undefined for half of the parameter combinations used and exhibited large intra-class variance. Moreover, DistEn performed robustly on short-length EEG data, showing relative independence from the input parameters and small intra-class fluctuations. In addition, it showed acceptable performance on all three classification problems (interictal EEG from normal, ictal EEG from normal, and ictal EEG from interictal) compared with SampEn, which performed better only in distinguishing normal EEG from interictal and ictal. Both SampEn and DistEn showed good reproducibility and consistency, as evidenced by the independence of the results from the analysis protocol.
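To complement the DistEn sketch given earlier in this listing, here is a compact SampEn sketch: the negative logarithm of the ratio of template matches at length m + 1 to those at length m, within tolerance r. When no matches occur the value is undefined, which is the failure mode on short segments noted above; r = 0.2 of the standard deviation is an assumed setting.

```python
# Sample entropy of a time series (sketch; undefined cases returned as inf).
import numpy as np

def samp_en(x, m=2, r=0.2):
    x = np.asarray(x, dtype=float)
    r *= x.std()                      # tolerance scaled by signal spread
    def count_matches(mm):
        # All templates of length mm and their pairwise Chebyshev distances
        v = np.array([x[i:i + mm] for i in range(len(x) - mm)])
        d = np.abs(v[:, None, :] - v[None, :, :]).max(axis=2)
        return (d[np.triu_indices_from(d, k=1)] <= r).sum()
    B, A = count_matches(m), count_matches(m + 1)
    return np.inf if A == 0 or B == 0 else -np.log(A / B)

eeg = np.random.default_rng(6).normal(size=868)   # synthetic 5 s EEG segment
print("SampEn:", round(samp_en(eeg), 3))
```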