867 resultados para Typological Classification


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Suffix separation plays a vital role in improving the quality of training in the Statistical Machine Translation from English into Malayalam. The morphological richness and the agglutinative nature of Malayalam make it necessary to retrieve the root word from its inflected form in the training process. The suffix separation process accomplishes this task by scrutinizing the Malayalam words and by applying sandhi rules. In this paper, various handcrafted rules designed for the suffix separation process in the English Malayalam SMT are presented. A classification of these rules is done based on the Malayalam syllable preceding the suffix in the inflected form of the word (check_letter). The suffixes beginning with the vowel sounds like ആല, ഉെെ, ഇല etc are mainly considered in this process. By examining the check_letter in a word, the suffix separation rules can be directly applied to extract the root words. The quick look up table provided in this paper can be used as a guideline in implementing suffix separation in Malayalam language

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The aim of this study is to show the importance of two classification techniques, viz. decision tree and clustering, in prediction of learning disabilities (LD) of school-age children. LDs affect about 10 percent of all children enrolled in schools. The problems of children with specific learning disabilities have been a cause of concern to parents and teachers for some time. Decision trees and clustering are powerful and popular tools used for classification and prediction in Data mining. Different rules extracted from the decision tree are used for prediction of learning disabilities. Clustering is the assignment of a set of observations into subsets, called clusters, which are useful in finding the different signs and symptoms (attributes) present in the LD affected child. In this paper, J48 algorithm is used for constructing the decision tree and K-means algorithm is used for creating the clusters. By applying these classification techniques, LD in any child can be identified

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A spectral angle based feature extraction method, Spectral Clustering Independent Component Analysis (SC-ICA), is proposed in this work to improve the brain tissue classification from Magnetic Resonance Images (MRI). SC-ICA provides equal priority to global and local features; thereby it tries to resolve the inefficiency of conventional approaches in abnormal tissue extraction. First, input multispectral MRI is divided into different clusters by a spectral distance based clustering. Then, Independent Component Analysis (ICA) is applied on the clustered data, in conjunction with Support Vector Machines (SVM) for brain tissue analysis. Normal and abnormal datasets, consisting of real and synthetic T1-weighted, T2-weighted and proton density/fluid-attenuated inversion recovery images, were used to evaluate the performance of the new method. Comparative analysis with ICA based SVM and other conventional classifiers established the stability and efficiency of SC-ICA based classification, especially in reproduction of small abnormalities. Clinical abnormal case analysis demonstrated it through the highest Tanimoto Index/accuracy values, 0.75/98.8%, observed against ICA based SVM results, 0.17/96.1%, for reproduced lesions. Experimental results recommend the proposed method as a promising approach in clinical and pathological studies of brain diseases

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, we propose a multispectral analysis system using wavelet based Principal Component Analysis (PCA), to improve the brain tissue classification from MRI images. Global transforms like PCA often neglects significant small abnormality details, while dealing with a massive amount of multispectral data. In order to resolve this issue, input dataset is expanded by detail coefficients from multisignal wavelet analysis. Then, PCA is applied on the new dataset to perform feature analysis. Finally, an unsupervised classification with Fuzzy C-Means clustering algorithm is used to measure the improvement in reproducibility and accuracy of the results. A detailed comparative analysis of classified tissues with those from conventional PCA is also carried out. Proposed method yielded good improvement in classification of small abnormalities with high sensitivity/accuracy values, 98.9/98.3, for clinical analysis. Experimental results from synthetic and clinical data recommend the new method as a promising approach in brain tissue analysis.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Multispectral analysis is a promising approach in tissue classification and abnormality detection from Magnetic Resonance (MR) images. But instability in accuracy and reproducibility of the classification results from conventional techniques keeps it far from clinical applications. Recent studies proposed Independent Component Analysis (ICA) as an effective method for source signals separation from multispectral MR data. However, it often fails to extract the local features like small abnormalities, especially from dependent real data. A multisignal wavelet analysis prior to ICA is proposed in this work to resolve these issues. Best de-correlated detail coefficients are combined with input images to give better classification results. Performance improvement of the proposed method over conventional ICA is effectively demonstrated by segmentation and classification using k-means clustering. Experimental results from synthetic and real data strongly confirm the positive effect of the new method with an improved Tanimoto index/Sensitivity values, 0.884/93.605, for reproduced small white matter lesions

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper an attempt has been made to determine the number of Premature Ventricular Contraction (PVC) cycles accurately from a given Electrocardiogram (ECG) using a wavelet constructed from multiple Gaussian functions. It is difficult to assess the ECGs of patients who are continuously monitored over a long period of time. Hence the proposed method of classification will be helpful to doctors to determine the severity of PVC in a patient. Principal Component Analysis (PCA) and a simple classifier have been used in addition to the specially developed wavelet transform. The proposed wavelet has been designed using multiple Gaussian functions which when summed up looks similar to that of a normal ECG. The number of Gaussians used depends on the number of peaks present in a normal ECG. The developed wavelet satisfied all the properties of a traditional continuous wavelet. The new wavelet was optimized using genetic algorithm (GA). ECG records from Massachusetts Institute of Technology-Beth Israel Hospital (MIT-BIH) database have been used for validation. Out of the 8694 ECG cycles used for evaluation, the classification algorithm responded with an accuracy of 97.77%. In order to compare the performance of the new wavelet, classification was also performed using the standard wavelets like morlet, meyer, bior3.9, db5, db3, sym3 and haar. The new wavelet outperforms the rest

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Cancer treatment is most effective when it is detected early and the progress in treatment will be closely related to the ability to reduce the proportion of misses in the cancer detection task. The effectiveness of algorithms for detecting cancers can be greatly increased if these algorithms work synergistically with those for characterizing normal mammograms. This research work combines computerized image analysis techniques and neural networks to separate out some fraction of the normal mammograms with extremely high reliability, based on normal tissue identification and removal. The presence of clustered microcalcifications is one of the most important and sometimes the only sign of cancer on a mammogram. 60% to 70% of non-palpable breast carcinoma demonstrates microcalcifications on mammograms [44], [45], [46].WT based techniques are applied on the remaining mammograms, those are obviously abnormal, to detect possible microcalcifications. The goal of this work is to improve the detection performance and throughput of screening-mammography, thus providing a ‘second opinion ‘ to the radiologists. The state-of- the- art DWT computation algorithms are not suitable for practical applications with memory and delay constraints, as it is not a block transfonn. Hence in this work, the development of a Block DWT (BDWT) computational structure having low processing memory requirement has also been taken up.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The characterization and grading of glioma tumors, via image derived features, for diagnosis, prognosis, and treatment response has been an active research area in medical image computing. This paper presents a novel method for automatic detection and classification of glioma from conventional T2 weighted MR images. Automatic detection of the tumor was established using newly developed method called Adaptive Gray level Algebraic set Segmentation Algorithm (AGASA).Statistical Features were extracted from the detected tumor texture using first order statistics and gray level co-occurrence matrix (GLCM) based second order statistical methods. Statistical significance of the features was determined by t-test and its corresponding p-value. A decision system was developed for the grade detection of glioma using these selected features and its p-value. The detection performance of the decision system was validated using the receiver operating characteristic (ROC) curve. The diagnosis and grading of glioma using this non-invasive method can contribute promising results in medical image computing

Relevância:

20.00% 20.00%

Publicador:

Resumo:

As a result of the drive towards waste-poor world and reserving the non-renewable materials, recycling the construction and demolition materials become very essential. Now reuse of the recycled concrete aggregate more than 4 mm in producing new concrete is allowed but with natural sand a fine aggregate while. While the sand portion that represent about 30\% to 60\% of the crushed demolition materials is disposed off. To perform this research, recycled concrete sand was produced in the laboratory while nine recycled sands produced from construction and demolitions materials and two sands from natural crushed limestone were delivered from three plants. Ten concrete mix designs representing the concrete exposition classes XC1, XC2, XF3 and XF4 according to European standard EN 206 were produced with partial and full replacement of natural sand by the different recycled sands. Bituminous mixtures achieving the requirements of base courses according to Germany standards and both base and binder courses according to Egyptian standards were produced with the recycled sands as a substitution to the natural sands. The mechanical properties and durability of concrete produced with the different recycled sands were investigated and analyzed. Also the volumetric analysis and Marshall test were performed hot bituminous mixtures produced with the recycled sands. According to the effect of replacement the natural sand by the different recycled sands on the concrete compressive strength and durability, the recycled sands were classified into three groups. The maximum allowable recycled sand that can be used in the different concrete exposition class was determined for each group. For the asphalt concrete mixes all the investigated recycled sands can be used in mixes for base and binder courses up to 21\% of the total aggregate mass.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Die thermische Verarbeitung von Lebensmitteln beeinflusst deren Qualität und ernährungsphysiologischen Eigenschaften. Im Haushalt ist die Überwachung der Temperatur innerhalb des Lebensmittels sehr schwierig. Zudem ist das Wissen über optimale Temperatur- und Zeitparameter für die verschiedenen Speisen oft unzureichend. Die optimale Steuerung der thermischen Zubereitung ist maßgeblich abhängig von der Art des Lebensmittels und der äußeren und inneren Temperatureinwirkung während des Garvorgangs. Das Ziel der Arbeiten war die Entwicklung eines automatischen Backofens, der in der Lage ist, die Art des Lebensmittels zu erkennen und die Temperatur im Inneren des Lebensmittels während des Backens zu errechnen. Die für die Temperaturberechnung benötigten Daten wurden mit mehreren Sensoren erfasst. Hierzu kam ein Infrarotthermometer, ein Infrarotabstandssensor, eine Kamera, ein Temperatursensor und ein Lambdasonde innerhalb des Ofens zum Einsatz. Ferner wurden eine Wägezelle, ein Strom- sowie Spannungs-Sensor und ein Temperatursensor außerhalb des Ofens genutzt. Die während der Aufheizphase aufgenommen Datensätze ermöglichten das Training mehrerer künstlicher neuronaler Netze, die die verschiedenen Lebensmittel in die entsprechenden Kategorien einordnen konnten, um so das optimale Backprogram auszuwählen. Zur Abschätzung der thermische Diffusivität der Nahrung, die von der Zusammensetzung (Kohlenhydrate, Fett, Protein, Wasser) abhängt, wurden mehrere künstliche neuronale Netze trainiert. Mit Ausnahme des Fettanteils der Lebensmittel konnten alle Komponenten durch verschiedene KNNs mit einem Maximum von 8 versteckten Neuronen ausreichend genau abgeschätzt werden um auf deren Grundlage die Temperatur im inneren des Lebensmittels zu berechnen. Die durchgeführte Arbeit zeigt, dass mit Hilfe verschiedenster Sensoren zur direkten beziehungsweise indirekten Messung der äußeren Eigenschaften der Lebensmittel sowie KNNs für die Kategorisierung und Abschätzung der Lebensmittelzusammensetzung die automatische Erkennung und Berechnung der inneren Temperatur von verschiedensten Lebensmitteln möglich ist.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

There are numerous text documents available in electronic form. More and more are becoming available every day. Such documents represent a massive amount of information that is easily accessible. Seeking value in this huge collection requires organization; much of the work of organizing documents can be automated through text classification. The accuracy and our understanding of such systems greatly influences their usefulness. In this paper, we seek 1) to advance the understanding of commonly used text classification techniques, and 2) through that understanding, improve the tools that are available for text classification. We begin by clarifying the assumptions made in the derivation of Naive Bayes, noting basic properties and proposing ways for its extension and improvement. Next, we investigate the quality of Naive Bayes parameter estimates and their impact on classification. Our analysis leads to a theorem which gives an explanation for the improvements that can be found in multiclass classification with Naive Bayes using Error-Correcting Output Codes. We use experimental evidence on two commonly-used data sets to exhibit an application of the theorem. Finally, we show fundamental flaws in a commonly-used feature selection algorithm and develop a statistics-based framework for text feature selection. Greater understanding of Naive Bayes and the properties of text allows us to make better use of it in text classification.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This thesis describes a representation of gait appearance for the purpose of person identification and classification. This gait representation is based on simple localized image features such as moments extracted from orthogonal view video silhouettes of human walking motion. A suite of time-integration methods, spanning a range of coarseness of time aggregation and modeling of feature distributions, are applied to these image features to create a suite of gait sequence representations. Despite their simplicity, the resulting feature vectors contain enough information to perform well on human identification and gender classification tasks. We demonstrate the accuracy of recognition on gait video sequences collected over different days and times and under varying lighting environments. Each of the integration methods are investigated for their advantages and disadvantages. An improved gait representation is built based on our experiences with the initial set of gait representations. In addition, we show gender classification results using our gait appearance features, the effect of our heuristic feature selection method, and the significance of individual features.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A novel approach to multiclass tumor classification using Artificial Neural Networks (ANNs) was introduced in a recent paper cite{Khan2001}. The method successfully classified and diagnosed small, round blue cell tumors (SRBCTs) of childhood into four distinct categories, neuroblastoma (NB), rhabdomyosarcoma (RMS), non-Hodgkin lymphoma (NHL) and the Ewing family of tumors (EWS), using cDNA gene expression profiles of samples that included both tumor biopsy material and cell lines. We report that using an approach similar to the one reported by Yeang et al cite{Yeang2001}, i.e. multiclass classification by combining outputs of binary classifiers, we achieved equal accuracy with much fewer features. We report the performances of 3 binary classifiers (k-nearest neighbors (kNN), weighted-voting (WV), and support vector machines (SVM)) with 3 feature selection techniques (Golub's Signal to Noise (SN) ratios cite{Golub99}, Fisher scores (FSc) and Mukherjee's SVM feature selection (SVMFS))cite{Sayan98}.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We compare Naive Bayes and Support Vector Machines on the task of multiclass text classification. Using a variety of approaches to combine the underlying binary classifiers, we find that SVMs substantially outperform Naive Bayes. We present full multiclass results on two well-known text data sets, including the lowest error to date on both data sets. We develop a new indicator of binary performance to show that the SVM's lower multiclass error is a result of its improved binary performance. Furthermore, we demonstrate and explore the surprising result that one-vs-all classification performs favorably compared to other approaches even though it has no error-correcting properties.