791 results for Rule-Based Classification
Abstract:
In this paper, we present a study of a deterministic partially self-avoiding walk (the tourist walk), which provides a novel method for texture feature extraction. The method is able to explore an image on all scales simultaneously. Experiments were conducted using different dynamics of the tourist walk. A new strategy, based on histograms, is presented to extract information from the walk's joint probability distribution. The promising results are discussed and compared to the best-known methods for texture description reported in the literature.
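A minimal sketch of the tourist-walk idea described above, assuming a grayscale image, a memory window mu, and the usual rule of stepping to the 8-neighbor of closest intensity; attractor detection is simplified to the first revisited pixel, and the function name is illustrative rather than the authors':

import numpy as np

def tourist_walk_histogram(img, mu=2, n_bins=16, max_steps=100):
    # Tourist walk sketch: from every pixel, repeatedly step to the
    # 8-neighbor whose intensity is closest to the current pixel's,
    # avoiding the last `mu` visited pixels (partial self-avoidance).
    # Attractor detection is simplified: the walk stops at the first
    # revisited pixel, and the transient length is recorded.
    h, w = img.shape
    transients = []
    for sy in range(h):
        for sx in range(w):
            y, x = sy, sx
            recent = [(y, x)]            # self-avoidance memory window
            first_seen = {(y, x): 0}     # step index of first visit
            t = max_steps
            for step in range(1, max_steps + 1):
                best, best_d = None, None
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if ((dy or dx) and 0 <= ny < h and 0 <= nx < w
                                and (ny, nx) not in recent):
                            d = abs(int(img[ny, nx]) - int(img[y, x]))
                            if best_d is None or d < best_d:
                                best, best_d = (ny, nx), d
                if best is None:         # trapped: no admissible neighbor
                    t = step
                    break
                if best in first_seen:   # entered an attractor (simplified)
                    t = first_seen[best]
                    break
                first_seen[best] = step
                recent = (recent + [best])[-mu:]
                y, x = best
            transients.append(t)
    hist, _ = np.histogram(transients, bins=n_bins, range=(0, max_steps))
    return hist / max(hist.sum(), 1)     # normalized histogram signature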
Abstract:
Texture is an important visual attribute used to describe the pixel organization in an image. Although it is easily identified by humans, its analysis demands a high level of sophistication and computational complexity. This paper presents a novel approach for texture analysis based on analyzing the complexity of the surface generated from a texture in order to describe and characterize it. The proposed method produces a texture signature which is able to efficiently characterize different texture classes. The paper also illustrates the method's performance in an experiment using texture images of leaves. Leaf identification is a difficult and complex task due to the nature of plants, which present huge pattern variation. The high classification rate achieved shows the potential of the method, which improves on traditional texture techniques such as Gabor filters and Fourier analysis.
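The abstract does not give the exact complexity measure, but one common way to realize "complexity of the surface generated from a texture" is to treat gray levels as heights and estimate a fractal dimension by differential box counting; the sketch below, with illustrative scales, is one such realization, not necessarily the authors':

import numpy as np

def surface_complexity(img, scales=(2, 4, 8, 16)):
    # Treat the gray level at each pixel as a height z, so the texture
    # becomes a surface, and estimate that surface's complexity with a
    # differential box-counting fractal dimension across several scales.
    h, w = img.shape
    counts = []
    for s in scales:
        box_h = 256.0 * s / max(h, w)    # box height in gray levels
        n = 0
        for y in range(0, h - s + 1, s):
            for x in range(0, w - s + 1, s):
                block = img[y:y + s, x:x + s]
                # boxes of height box_h needed to cover this block's range
                n += int(np.ceil((int(block.max()) - int(block.min()) + 1) / box_h))
        counts.append(n)
    # Slope of log(count) versus log(1/scale) approximates the fractal
    # dimension; the per-scale counts themselves can serve as a signature.
    return np.polyfit(np.log(1.0 / np.asarray(scales)), np.log(counts), 1)[0]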
Abstract:
This paper introduces a novel methodology for shape boundary characterization, in which a shape is modeled as a small-world complex network. It uses degree and joint-degree measurements of a dynamically evolving network to compose a set of shape descriptors. The proposed method is an efficient shape characterizer: it is robust, noise tolerant, scale invariant, and rotation invariant. A plant leaf classification experiment is presented on three image databases in order to evaluate the method and compare it with other descriptors in the literature (Fourier descriptors, curvature, Zernike moments, and the multiscale fractal dimension).
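A hedged sketch of the shape-as-network construction: boundary points become nodes, edges join pairs closer than a threshold, and degree statistics recorded as the threshold evolves form the descriptor. The thresholds and normalization below are assumptions:

import numpy as np

def shape_network_descriptors(points, thresholds=(0.05, 0.10, 0.15, 0.20)):
    # Boundary points become network nodes; an edge joins every pair
    # whose normalized Euclidean distance falls below a threshold.
    # Evolving the threshold and recording degree statistics yields a
    # descriptor that is scale invariant (coordinates are normalized)
    # and rotation invariant (only pairwise distances are used).
    pts = np.asarray(points, dtype=float)
    pts = (pts - pts.min(axis=0)) / np.ptp(pts, axis=0).max()
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    n = len(pts)
    descriptor = []
    for t in thresholds:
        adj = (dist <= t) & ~np.eye(n, dtype=bool)
        degree = adj.sum(axis=1) / (n - 1)     # normalized node degree
        descriptor += [degree.mean(), degree.max()]
    return np.array(descriptor)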
Abstract:
Unlike theoretical scale-free networks, most real networks exhibit multi-scale behavior, with nodes structured in different types of functional groups and communities. While the majority of approaches for classifying nodes in a complex network have relied on local measurements of the topology/connectivity around each node, valuable information about node functionality can be obtained by concentric (or hierarchical) measurements. This paper extends previous methodologies based on concentric measurements by studying the use of agglomerative clustering methods to obtain functional groups of nodes in an institutional collaboration network that includes various known communities (departments of the University of Sao Paulo). Among the findings, we emphasize the scale-free nature of the obtained network, as well as the identification of different authorship patterns emerging from different areas (e.g., the human and exact sciences). Another interesting result concerns the relatively uniform distribution of hubs along concentric levels, in contrast to the non-uniform pattern found in theoretical scale-free networks such as the BA model.
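A small sketch of concentric measurements followed by agglomerative clustering, using networkx and scikit-learn; the Barabasi-Albert graph stands in for the collaboration network, and the ring depth and cluster count are arbitrary choices:

import networkx as nx
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def concentric_features(g, max_level=3):
    # For each node, count how many nodes lie at shortest-path distance
    # 1, 2, ..., max_level (the node's concentric rings).
    feats = []
    for v in g.nodes():
        dists = nx.single_source_shortest_path_length(g, v, cutoff=max_level)
        ring = [0] * max_level
        for d in dists.values():
            if 1 <= d <= max_level:
                ring[d - 1] += 1
        feats.append(ring)
    return np.array(feats, dtype=float)

g = nx.barabasi_albert_graph(200, 3)     # stand-in collaboration network
labels = AgglomerativeClustering(n_clusters=4).fit_predict(concentric_features(g))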
Abstract:
In this paper we present a novel approach to multispectral image contextual classification by combining iterative combinatorial optimization algorithms. The pixel-wise decision rule is defined using a Bayesian approach that combines two MRF models: a Gaussian Markov Random Field (GMRF) for the observations (likelihood) and a Potts model for the a priori knowledge, to regularize the solution in the presence of noisy data. Hence, the classification problem is stated according to a Maximum a Posteriori (MAP) framework. To approximate the MAP solution, we apply several combinatorial optimization methods with multiple simultaneous initializations, making the solution less sensitive to the initial conditions and reducing both computational cost and time in comparison to Simulated Annealing, which is often unfeasible in many real image processing applications. The Markov Random Field model parameters are estimated by a Maximum Pseudo-Likelihood (MPL) approach, avoiding manual adjustment of the regularization parameters. Asymptotic evaluations assess the accuracy of the proposed parameter estimation procedure. To test and evaluate the proposed classification method, we adopt metrics for quantitative performance assessment (Cohen's Kappa coefficient), allowing a robust and accurate statistical analysis. The obtained results clearly show that combining sub-optimal contextual algorithms significantly improves the classification performance, indicating the effectiveness of the proposed methodology.
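The abstract names a family of sub-optimal combinatorial optimizers without listing them; Iterated Conditional Modes (ICM) is a classic member of that family, sketched below with a plain pixel-wise Gaussian likelihood (a simplification of the GMRF observation model) and a fixed Potts parameter beta instead of an MPL estimate:

import numpy as np

def icm_potts(obs, means, sigma, beta=1.0, n_iter=5):
    # MAP-MRF sketch: Iterated Conditional Modes with a pixel-wise
    # Gaussian likelihood and a Potts prior that penalizes label
    # disagreement between 4-neighbors. `means` is an array with one
    # class mean per label; `beta` is fixed here rather than estimated.
    h, w = obs.shape
    labels = np.abs(obs[..., None] - means).argmin(-1)  # ML initialization
    k = len(means)
    for _ in range(n_iter):
        for y in range(h):
            for x in range(w):
                best, best_e = labels[y, x], np.inf
                for c in range(k):
                    e = (obs[y, x] - means[c]) ** 2 / (2 * sigma ** 2)
                    for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                        if 0 <= ny < h and 0 <= nx < w:
                            e += beta * (labels[ny, nx] != c)
                    if e < best_e:
                        best, best_e = c, e
                labels[y, x] = best
    return labels

# e.g. labels = icm_potts(noisy_img, np.array([0.0, 128.0, 255.0]), 20.0)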
Abstract:
BACKGROUND: A major problem in Chagas disease donor screening is the high frequency of samples with inconclusive results. The objective of this study was to describe patterns of serologic results among donors at the three Brazilian REDS-II blood centers and to correlate them with epidemiologic characteristics. STUDY DESIGN AND METHODS: The centers screened donor samples with one Trypanosoma cruzi lysate enzyme immunoassay (EIA). EIA-reactive samples were tested with a second lysate EIA, a recombinant-antigen-based EIA, and an immunofluorescence assay. Based on the serologic results, samples were classified as confirmed positive (CP), probable positive (PP), possible other parasitic infection (POPI), or false positive (FP). RESULTS: In 2007 to 2008, a total of 877 of 615,433 donations were discarded due to Chagas assay reactivity. The prevalences (95% confidence intervals [CIs]) among first-time donors for the CP, PP, POPI, and FP patterns were 114 (99-129), 26 (19-34), 10 (5-14), and 96 (82-110) per 100,000 donations, respectively. CP and PP had similar patterns of prevalence when analyzed by age, sex, education, and location, suggesting that PP cases represent true T. cruzi infections; in contrast, the demographics of donors with POPI were distinct and likely unrelated to Chagas disease. No CP cases were detected among 218,514 repeat donors followed for a total of 718,187 person-years. CONCLUSION: We have proposed a classification algorithm that may have practical importance for donor counseling and epidemiologic analyses of T. cruzi-seroreactive donors. The absence of incident T. cruzi infections is reassuring with respect to the risk of window-phase infections within Brazil and travel-related infections in nonendemic countries such as the United States.
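The abstract does not spell out the decision rules behind the CP/PP/POPI/FP classes; purely as an illustration of how such a serologic classification algorithm could be coded, the rules below are hypothetical:

def classify_donor(lysate_eia_2, recombinant_eia, ifa):
    # Hypothetical rule set for a sample that was reactive on the
    # screening lysate EIA; the real algorithm used in the study is
    # not given in the abstract, so these rules are illustrative only.
    if lysate_eia_2 and recombinant_eia and ifa:
        return "CP"    # confirmed positive
    if recombinant_eia and ifa:
        return "PP"    # probable positive
    if lysate_eia_2 and not recombinant_eia:
        return "POPI"  # possible other parasitic infection
    return "FP"        # false positive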
Abstract:
This project is based on Artificial Intelligence (AI) and Digital Image Processing (IP) for automatic condition monitoring of sleepers in the railway track. Rail inspection is a very important task in railway maintenance for traffic safety and in preventing dangerous situations. Monitoring railway track infrastructure is an important aspect in which periodic inspection of the rail rolling plane is required. To date, inspection of the railroad has been performed manually by trained personnel: a human operator walks along the railway track searching for sleeper anomalies. This way of monitoring is no longer acceptable because of its slowness and subjectivity. Hence, it is desirable to automate such intuitive human skills in order to develop more robust and reliable testing methods. Images of wooden sleepers have been used as data for my project. The aim of this project is to present a vision-based technique for inspecting railway sleepers (wooden planks under the railway track) by automatic interpretation of Non-Destructive Testing (NDT) data, using AI techniques to determine the results of the inspection.
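One plausible realization of the vision-based inspection step, assuming grayscale sleeper patches: gray-level co-occurrence matrix (GLCM) texture features fed to a generic classifier. The feature choice and classifier are assumptions, not the project's documented pipeline:

import numpy as np
from skimage.feature import graycomatrix, graycoprops
from sklearn.ensemble import RandomForestClassifier

def sleeper_features(patch):
    # GLCM texture features for one grayscale (uint8) sleeper patch;
    # cracks and rot change local texture, which these statistics capture.
    glcm = graycomatrix(patch, distances=[1, 2], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    return np.hstack([graycoprops(glcm, p).ravel()
                      for p in ("contrast", "homogeneity", "energy")])

# Hypothetical training on labelled patches (0 = sound, 1 = defective):
# X = np.array([sleeper_features(p) for p in patches])
# clf = RandomForestClassifier().fit(X, y)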
Abstract:
Wikipedia is a free, web-based, collaborative, multilingual encyclopedia project supported by the non-profit Wikimedia Foundation. Because Wikipedia is free and allows anyone to edit articles, article quality may be affected. Since not all people have the same level of knowledge, and different people hold different opinions about a topic, contributions made by different authors may differ. To address this, it is important to classify articles so that good-quality articles can be separated from poor-quality ones, which can then be removed from the database. The aim of this study is to classify Wikipedia articles into two classes, class 0 (poor quality) and class 1 (good quality), using the Adaptive Neuro-Fuzzy Inference System (ANFIS) and data mining techniques. Two ANFIS models are built using the Fuzzy Logic Toolbox [1] available in Matlab. The first ANFIS is based on the rules obtained from the J48 classifier in WEKA, while the other was built using expert knowledge. The data used for this research work contain records of 226 articles taken from the German version of Wikipedia. The dataset consists of 19 inputs and one output. The data were preprocessed to remove redundant attributes. The input variables relate to the editors, contributors, length of articles, and the lifecycle of articles. Finally, the methods implemented in this research are analyzed to compare the performance of each classification method used.
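ANFIS itself has no standard Python implementation, so the sketch below covers only the rule-extraction half of the workflow, with scikit-learn's CART tree standing in for WEKA's J48 and invented feature names in place of the thesis's 19 real inputs:

import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical article features (illustrative names only): number of
# editors, number of edits, article length.
rng = np.random.default_rng(0)
x = rng.random((226, 3))
y = (x[:, 2] > 0.5).astype(int)          # 1 = good quality, 0 = poor

tree = DecisionTreeClassifier(max_depth=3).fit(x, y)   # J48 stand-in
print(export_text(tree, feature_names=["editors", "edits", "length"]))
# The printed if/then rules are the kind of rule base that would seed
# the first ANFIS in the thesis workflow.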
Abstract:
Parkinson's disease (PD) is a degenerative illness whose cardinal symptoms include rigidity, tremor, and slowness of movement. In addition to its widely recognized effects, PD can have a profound effect on speech and voice. The speech symptoms most commonly demonstrated by patients with PD are reduced vocal loudness, monopitch, disruptions of voice quality, and an abnormally fast rate of speech; this cluster of speech symptoms is often termed hypokinetic dysarthria. The disease can be difficult to diagnose accurately, especially in its early stages; for this reason, automatic techniques based on Artificial Intelligence may increase diagnostic accuracy and help doctors make better decisions. The aim of this thesis work is to predict PD based on audio files collected from various patients. The audio files are preprocessed to obtain the features; the preprocessed data contain 23 attributes and 195 instances. On average there are six voice recordings per person, and by using a data compression technique such as the Discrete Cosine Transform (DCT) the number of instances can be reduced. After data compression, attribute selection is performed using several WEKA built-in methods such as ChiSquared, GainRatio, and InfoGain; after identifying the important attributes, we evaluate the attributes one by one using stepwise regression. Based on the selected attributes, we proceed in WEKA using a cost-sensitive classifier with various algorithms such as MultiPass LVQ, Logistic Model Tree (LMT), and K-Star. The classification results average 80%; using the selected features, approximately 95% correct classification of PD is achieved. This shows that, using the audio dataset, PD can be predicted with a high level of accuracy.
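A minimal sketch of the DCT-based instance compression described above, assuming each subject's six recordings form a (6, 23) attribute matrix and that only low-frequency coefficients are kept; the shapes and the `keep` value are illustrative:

import numpy as np
from scipy.fft import dct

def compress_recordings(recordings, keep=2):
    # `recordings` is a (6, 23) array: six voice recordings of one
    # subject, 23 acoustic attributes each. A DCT along the recording
    # axis concentrates the information in a few low-frequency
    # coefficients, so the six instances shrink to `keep` per subject.
    c = dct(np.asarray(recordings), axis=0, norm="ortho")
    return c[:keep]                        # (keep, 23) compressed block

six = np.random.rand(6, 23)                # stand-in for one subject
print(compress_recordings(six).shape)      # (2, 23)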
Abstract:
The aim of this thesis is to investigate computerized voice assessment methods to classify between normal and dysarthric speech signals. In the proposed system, computerized assessment methods equipped with signal processing and artificial intelligence techniques are introduced. The sentences used for the measurement of inter-stress intervals (ISI) were read by each subject and used for comparisons between normal and impaired voices. A band-pass filter has been used for the preprocessing of the speech samples. Speech segmentation is performed using signal energy and the spectral centroid to separate voiced and unvoiced regions in the speech signal. Acoustic features are extracted from the LPC model and from the speech segments of each audio signal to find the anomalies. The speech features assessed for classification are energy entropy, zero crossing rate (ZCR), spectral centroid, mean fundamental frequency (meanF0), jitter (RAP), jitter (PPQ), and shimmer (APQ). Naïve Bayes (NB) has been used for speech classification. For speech tests 1 and 2, classification accuracies of 72% and 80%, respectively, between healthy and impaired speech samples have been achieved using NB. For speech test 3, 64% correct classification is achieved using NB. The results point to the possibility of classifying speech impairment in PD patients based on the clinical rating scale.
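A short sketch of the frame-level energy and spectral-centroid computation used for voiced/unvoiced segmentation, plus the Naïve Bayes step; the frame length is illustrative and the thresholding itself is left out:

import numpy as np
from sklearn.naive_bayes import GaussianNB

def frame_energy_centroid(signal, sr, frame=0.02):
    # Short-time energy and spectral centroid per frame; thresholding
    # these two features (not shown) separates voiced from unvoiced
    # regions of the speech signal.
    n = int(sr * frame)
    feats = []
    for i in range(0, len(signal) - n + 1, n):
        x = signal[i:i + n]
        energy = np.mean(x ** 2)
        mag = np.abs(np.fft.rfft(x))
        freqs = np.fft.rfftfreq(n, d=1.0 / sr)
        centroid = (freqs * mag).sum() / (mag.sum() + 1e-12)
        feats.append((energy, centroid))
    return np.array(feats)

# Hypothetical classification of per-utterance feature vectors (energy
# entropy, ZCR, centroid, meanF0, jitter, shimmer) with Naïve Bayes:
# clf = GaussianNB().fit(train_features, train_labels)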
Predictive models for chronic renal disease using decision trees, Naïve Bayes and case-based methods
Abstract:
Data mining can be used in the healthcare industry to “mine” clinical data and discover hidden information for intelligent and effective decision making. Hidden patterns and relationships often go undiscovered, yet advanced data mining techniques can remedy this scenario. This thesis mainly deals with Intelligent Prediction of Chronic Renal Disease (IPCRD). The data cover blood tests, urine tests, and external symptoms used to predict chronic renal disease. Data from the database are initially imported into Weka (3.6), and the Chi-Square method is used for feature selection. After normalizing the data, three classifiers were applied and the efficiency of the output was evaluated: Decision Tree, Naïve Bayes, and the K-Nearest Neighbour (KNN) algorithm. Results show that each technique has its unique strength in realizing the objectives of the defined mining goals. The efficiency of the Decision Tree and KNN was almost the same, but Naïve Bayes showed a comparative edge over the others. Further, sensitivity and specificity tests are used as statistical measures to examine the performance of the binary classification. Sensitivity (also called the recall rate in some fields) measures the proportion of actual positives that are correctly identified, while specificity measures the proportion of negatives that are correctly identified. The CRISP-DM methodology is applied to build the mining models. It consists of six major phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment.
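A compact sketch of the pipeline described above (chi-square feature selection, then Decision Tree, Naïve Bayes, and KNN, scored by sensitivity and specificity), with synthetic stand-in data since the thesis dataset is not public:

import numpy as np
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Hypothetical renal-disease data: non-negative features (as chi2
# requires) and a binary outcome.
rng = np.random.default_rng(1)
x = rng.random((400, 20))
y = (x[:, 0] + x[:, 1] > 1).astype(int)

x_sel = SelectKBest(chi2, k=8).fit_transform(x, y)    # feature selection
xtr, xte, ytr, yte = train_test_split(x_sel, y, random_state=1)

for clf in (DecisionTreeClassifier(), GaussianNB(), KNeighborsClassifier()):
    tn, fp, fn, tp = confusion_matrix(yte, clf.fit(xtr, ytr).predict(xte)).ravel()
    sens = tp / (tp + fn)        # proportion of actual positives found
    spec = tn / (tn + fp)        # proportion of actual negatives found
    print(type(clf).__name__, round(sens, 2), round(spec, 2))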