50 results for Macrobrachium - Classification


Relevance:

20.00% 20.00%

Publisher:

Abstract:

Data mining refers to extracting or "mining" knowledge from large amounts of data. It is an increasingly popular field that uses statistical, visualization, machine learning, and other data manipulation and knowledge extraction techniques aimed at gaining an insight into the relationships and patterns hidden in the data. Availability of digital data within picture archiving and communication systems raises a possibility of health care and research enhancement associated with manipulation, processing and handling of data by computers. That is the basis for computer-assisted radiology development. Further development of computer-assisted radiology is associated with the use of new intelligent capabilities such as multimedia support and data mining in order to discover the relevant knowledge for diagnosis. It is very useful if results of data mining can be communicated to humans in an understandable way. In this paper, we present our work on data mining in medical image archiving systems. We investigate the use of a very efficient data mining technique, a decision tree, in order to learn the knowledge for computer-assisted image analysis. We apply our method to the classification of x-ray images for lung cancer diagnosis. The proposed technique is based on an inductive decision tree learning algorithm that has low complexity with high transparency and accuracy. The results show that the proposed algorithm is robust, accurate, fast, and it produces a comprehensible structure, summarizing the knowledge it induces.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Communication via email is one of the most popular services of the Internet. Emails have brought us great convenience in our daily work and life. However, unsolicited messages, or spam, flood our inboxes, wasting bandwidth, time and money. To address this, this paper presents a rough set based model that classifies emails into three categories (spam, non-spam and suspicious) rather than the two classes (spam and non-spam) used by most current approaches. Compared with popular classification methods such as Naive Bayes classification, the proposed model reduces the error rate at which non-spam is misclassified as spam.
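The three-category idea can be illustrated with a minimal sketch. This is not the paper's rough set model: it assumes a hypothetical spam score in [0, 1] produced by some upstream classifier, and the two thresholds `lower` and `upper` are illustrative values, not figures from the abstract. Scores between the thresholds land in a "suspicious" boundary class instead of being forced into spam or non-spam.

```python
def three_way_label(spam_score, lower=0.3, upper=0.7):
    """Map a spam score in [0, 1] to spam, non-spam, or suspicious.

    The score and both thresholds are hypothetical; they stand in for
    whatever evidence a real model would use to define its boundary region.
    """
    if not 0.0 <= spam_score <= 1.0:
        raise ValueError("spam_score must lie in [0, 1]")
    if lower >= upper:
        raise ValueError("lower must be smaller than upper")
    if spam_score >= upper:
        return "spam"          # confident enough to reject
    if spam_score <= lower:
        return "non-spam"      # confident enough to accept
    return "suspicious"        # boundary region: defer the decision


for score in (0.05, 0.5, 0.95):
    print(score, three_way_label(score))
```

Widening the gap between the thresholds sends more borderline emails to the suspicious class, which is one way to trade a larger review queue for fewer legitimate emails marked as spam.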

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Purpose - Research has so far not approached the contents of corporate codes of ethics from a strategic classification point of view. Therefore, the objective of this paper is to introduce and describe a framework of classification and empirical illustration to provide insights into the strategic approaches of corporate codes of ethics content within and across contextual business environments.

Design/methodology/approach - The paper summarizes the content analysis of code prescription and the intensity of codification in the contents of 78 corporate codes of ethics in Australia.

Findings - The paper finds that, generally, the studied corporate codes of ethics in Australia follow standardized and replicated strategic approaches. In particular, customized and individualized strategic approaches are far from penetrating the ethos of corporate codes of ethics content.

Research limitations/implications - The research is limited to Australian codes of ethics. Suggestions for further research are provided in terms of the search for best practice of customized and individualized corporate codes of ethics content across countries.

Practical implications - The framework contributes to an identification of four strategic approaches of corporate codes of ethics content, namely standardized, replicated, individualized and customized.

Originality/value - The principal contribution of this paper is a generic framework to identify strategic approaches of corporate codes of ethics content. The framework is derived from two generic dimensions: the context of application and the application of content. The timing of application is also a crucial generic dimension to the success or failure of codes of ethics content. Empirical illustrations based upon corporate codes of ethics in Australia's top companies underpin the topic explored.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper contributes to a better understanding of geophysical characteristics and benthic communities in the Hopkins site in Victoria, Australia. An automated decision tree classification system was used to classify substrata and dominant biota communities. Geophysical sampling and underwater video data collected in this study reveal a complex bathymetry and biological structure, complementing the limited information on benthic marine ecosystems in coastal waters of Victoria. The technique of combining derivative products from the backscatter and the bathymetry datasets was found to improve separability for broad biota and substrata categories over the use of either of these datasets alone.


Relevance:

20.00% 20.00%

Publisher:

Abstract:

One of the key applications of microarray studies is to select and classify gene expression profiles of cancer and normal subjects. In this study, two hybrid approaches, genetic algorithm with decision tree (GADT) and genetic algorithm with neural network (GANN), are used to select optimal gene sets that yield the highest classification accuracy. Two benchmark microarray datasets were tested, and the most significant disease-related genes were identified. Furthermore, the selected gene sets achieved comparably high sample classification accuracy (96.79% and 94.92% on the colon cancer dataset, 98.67% and 98.05% on the leukemia dataset) compared with that obtained by the mRMR algorithm. The study results indicate that these two hybrid methods are able to select disease-related genes and improve classification accuracy.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Clustering of multivariate data is a commonly used technique in ecology, and many approaches to clustering are available. The results from a clustering algorithm are uncertain, but few clustering approaches explicitly acknowledge this uncertainty. One exception is Bayesian mixture modelling, which treats all results probabilistically, and allows comparison of multiple plausible classifications of the same data set. We used this method, implemented in the AutoClass program, to classify catchments (watersheds) in the Murray Darling Basin (MDB), Australia, based on their physiographic characteristics (e.g. slope, rainfall, lithology). The most likely classification found nine classes of catchments. Members of each class were aggregated geographically within the MDB. Rainfall and slope were the two most important variables that defined classes. The second-most likely classification was very similar to the first, but had one fewer class. Increasing the nominal uncertainty of continuous data resulted in a most likely classification with five classes, which were again aggregated geographically. Membership probabilities suggested that a small number of cases could be members of either of two classes. Such cases were located on the edges of groups of catchments that belonged to one class, with a group belonging to the second-most likely class adjacent. A comparison of the Bayesian approach to a distance-based deterministic method showed that the Bayesian mixture model produced solutions that were more spatially cohesive and intuitively appealing. The probabilistic presentation of results from the Bayesian classification allows richer interpretation, including decisions on how to treat cases that are intermediate between two or more classes, and whether to consider more than one classification. The explicit consideration and presentation of uncertainty makes this approach useful for ecological investigations, where both data and expectations are often highly uncertain.
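The idea of cases that "could be members of either of two classes" can be sketched as follows. This is an illustration only and does not reproduce AutoClass: it assumes each catchment already has a dictionary of class membership probabilities, and the cutoff `min_second` is a hypothetical value chosen for the example.

```python
def ambiguous_classes(membership, min_second=0.25):
    """Return the two most probable classes when the runner-up is plausible.

    `membership` maps class labels to probabilities for one case.
    `min_second` is a hypothetical cutoff, not a value from the abstract.
    Returns None when one class clearly dominates.
    """
    if len(membership) < 2:
        return None
    # Rank classes from most to least probable.
    ranked = sorted(membership.items(), key=lambda item: item[1], reverse=True)
    (first, _), (second, p_second) = ranked[0], ranked[1]
    if p_second >= min_second:
        return (first, second)
    return None


edge_case = {"class_1": 0.55, "class_2": 0.40, "class_3": 0.05}
core_case = {"class_1": 0.95, "class_2": 0.04, "class_3": 0.01}
print(ambiguous_classes(edge_case))
print(ambiguous_classes(core_case))
```

Keeping the probabilities instead of a single hard label is what makes this kind of check possible; a deterministic distance-based method would report only the winning class.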

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Microarray data classification is one of the most important emerging clinical applications in the medical community. Machine learning algorithms are most frequently used to complete this task. We selected one of the state-of-the-art kernel-based algorithms, the support vector machine (SVM), to classify microarray data. As a large number of kernels are available, a significant research question is: what is the best kernel for patient diagnosis based on microarray data classification using SVM? We first suggest three solutions based on data visualization and quantitative measures. We then test the proposed solutions on different types of microarray problems. Finally, we find that the rule-based approach is the most useful for automatic kernel selection when classifying microarray data with SVM.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A novel approach to Episodic Associative Memory (EAM), known as Episodic Associative Memory with a Neighborhood Effect (EAMwNE), is presented in this paper. It overcomes the representation limitations of existing episodic memory models and increases the potential for their use in practical applications.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper presents a random forest-based face image classification method. The random forest is an ensemble learning method that grows many classification trees. Each tree gives a classification, and the forest selects the classification that has the most votes. Three experiments are performed. The random forest-based method and several existing approaches are trained and evaluated. The experimental results are presented and discussed.
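The voting step described above can be sketched in a few lines. This shows only the aggregation of per-tree outputs, not tree training or the paper's face image pipeline; the function name and the example labels are hypothetical.

```python
from collections import Counter


def forest_vote(tree_predictions):
    """Return the label predicted by the largest number of trees.

    `tree_predictions` holds one predicted label per tree. When two labels
    tie, the one that was seen first in the list wins, which is an
    arbitrary choice made for this sketch.
    """
    if not tree_predictions:
        raise ValueError("at least one tree prediction is required")
    counts = Counter(tree_predictions)
    label, _ = counts.most_common(1)[0]
    return label


votes = ["person_a", "person_b", "person_a", "person_c", "person_a"]
print(forest_vote(votes))
```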

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper presents an innovative email categorization technique using serialized multi-stage classification ensembles. Many approaches are used in practice for email categorization to control the menace of spam emails. Content-based email categorization employs filtering techniques that use classification algorithms to learn to predict spam emails from a corpus of training emails. This process achieves substantial performance with some FP (false positive) tradeoffs. Studies with different classification algorithms have found that the outputs vary from one classifier to another on the same email corpora. In this paper we propose a multi-stage classification technique using different popular learning algorithms with an analyser, which substantially reduces the FP problems and increases classification accuracy compared to similar existing techniques.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this paper we propose a new email classification technique based on grey list (GL) analysis of user emails. The technique analyses the output emails of an integrated model that uses multiple classifiers based on statistical learning algorithms. The GL is a list of classifier outputs that are considered neither true positive (TP) nor true negative (TN) but fall between the two. Much work has been done to filter spam from legitimate emails using classification algorithms, and substantial performance has been achieved with some false positive (FP) tradeoffs. In spam detection, however, FPs are sometimes unacceptable. The proposed technique provides a list of output emails, called the "grey list (GL)", to the analyser, which decides the status of these emails. The results show that the proposed technique performs much better than existing systems in terms of reducing FP problems and improving accuracy.
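One way to picture a grey list is the sketch below. The abstract does not say exactly how an email ends up in the GL, so the rule used here is an assumption: an email is grey-listed whenever the classifiers disagree. The function name and the boolean vote encoding are likewise hypothetical.

```python
def split_by_agreement(votes_by_email):
    """Split emails into spam, legitimate, and grey lists.

    `votes_by_email` maps an email id to a list of booleans, one per
    classifier, where True means "spam". Unanimous votes decide the
    label; any disagreement sends the email to the grey list (an assumed
    rule, not one stated in the abstract).
    """
    spam, legitimate, grey = [], [], []
    for email_id, votes in votes_by_email.items():
        if not votes:
            raise ValueError(f"no classifier votes for {email_id!r}")
        if all(votes):
            spam.append(email_id)
        elif not any(votes):
            legitimate.append(email_id)
        else:
            grey.append(email_id)  # left for the analyser to decide
    return spam, legitimate, grey


votes = {
    "msg-1": [True, True, True],
    "msg-2": [False, False, False],
    "msg-3": [True, False, True],
}
print(split_by_agreement(votes))
```

Under this rule, an email is never labelled spam on the strength of a single classifier, which reflects the abstract's concern with keeping false positives low.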

Relevance:

20.00% 20.00%

Publisher:

Abstract:

There exists an enormous gap between low-level visual features and high-level semantic information, and the accuracy of content-based image classification and retrieval depends greatly on the description of low-level visual features. Taking this into consideration, a novel texture and edge descriptor, which can be represented with a histogram, is proposed in this paper. Furthermore, by seamlessly combining the color, texture and edge histograms, the images are grouped into semantic classes using a support vector machine (SVM). Experimental results show that the combined descriptor is more discriminative than other feature descriptors such as Gabor texture.