8 results for infrared spectroscopy, chemometrics, least squares support vector machines
in Cochin University of Science
Abstract:
This paper presents the application of wavelet processing to handwritten character recognition. Attaining a high recognition rate requires robust feature extractors and powerful classifiers that are invariant to the variability of human writing. The proposed scheme consists of two stages: a feature extraction stage based on the Haar wavelet transform, and a classification stage that uses a support vector machine classifier. Experimental results show that the proposed method is effective.
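The abstract does not say how features are derived from the Haar transform, so the following is only a minimal sketch of the transform itself. It works on a 1D sequence, whereas character images would normally be transformed along rows and then columns. The function names `haar_dwt_1d` and `haar_features` are hypothetical, and using the coarse approximation as the feature vector is an assumption, not the authors' method.

```python
import math


def haar_dwt_1d(signal):
    """One level of the orthonormal Haar transform on an even-length sequence.

    Returns (approximation, detail): scaled pairwise sums and differences.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_features(signal, levels):
    """Assumed feature scheme: keep the approximation after `levels` transforms."""
    approx = list(signal)
    for _ in range(levels):
        approx, _detail = haar_dwt_1d(approx)
    return approx


# A piecewise-constant signal has zero detail coefficients at the first level.
approx, detail = haar_dwt_1d([1.0, 1.0, 3.0, 3.0])
```

The resulting coefficient vector could then be passed to any classifier, such as the support vector machine the abstract mentions.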
Abstract:
In this study we use a kernel-based technique, Support Vector Machine regression, to predict the melting point of drug-like compounds from topological descriptors, topological charge indices, connectivity indices and 2D autocorrelations. The machine learning model was designed, trained and tested on a dataset of 100 compounds. An SVMReg model with an RBF kernel predicted the melting point with a mean absolute error of 15.5854 and a root mean squared error of 19.7576.
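For reference, the RBF kernel and the two error measures quoted above can be sketched as follows. This is an illustration of the standard definitions, not the authors' code. The `gamma` parameter and the sample values are made up for the example and are not taken from the study.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF kernel k(x, y) = exp(-gamma * ||x - y||^2); gamma is an assumed parameter."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_squared_error(y_true, y_pred):
    """Square root of the average squared difference."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


# Hypothetical melting points, for illustration only (not data from the paper).
observed = [100.0, 150.0, 200.0]
predicted = [110.0, 140.0, 230.0]
mae = mean_absolute_error(observed, predicted)
rmse = root_mean_squared_error(observed, predicted)
```

RMSE is never smaller than MAE on the same data, and the gap widens when a few predictions are far off, which is consistent with the reported RMSE exceeding the reported MAE.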
Abstract:
This is a Named Entity based Question Answering system for the Malayalam language. Although a vast amount of information is available today in digital form, no effective mechanism exists to give people convenient access to it. Information Retrieval and Question Answering systems are the two mechanisms currently available for information access. Information Retrieval systems typically return a long list of documents in response to a user's query, which the user must skim to determine whether they contain an answer. A Question Answering system, by contrast, allows the user to state an information need as a natural language question and receive the most appropriate answer as a word, a sentence or a paragraph. This system is based on Named Entity tagging and Question Classification. Document tagging extracts useful information from the documents, which is then used to find the answer to the question. Question Classification extracts useful information from the question to determine its type and the way in which it is to be answered. Various machine learning methods are used to tag the documents, and a rule-based approach is used for Question Classification. Malayalam belongs to the Dravidian family of languages and is one of the four major languages of this family. It is one of the 22 Scheduled Languages of India, with official language status in the state of Kerala, and is spoken by 40 million people. Malayalam is a morphologically rich agglutinative language with relatively free word order. It also has a productive morphology that allows the creation of complex words, which are often highly ambiguous. Document tagging tools such as a Parts-of-Speech Tagger, Phrase Chunker, Named Entity Tagger and Compound Word Splitter were developed as part of this research work; no such tools were previously available for the Malayalam language.
Finite State Transducers, High Order Conditional Random Fields, Artificial Immunity System principles and Support Vector Machines are the techniques used to design these document preprocessing tools. This research work describes how Named Entities are used to represent the documents. Single-sentence questions are used to test the system. The overall precision and recall obtained are 88.5% and 85.9% respectively. This work can be extended in several directions: the coverage of non-factoid questions can be increased, and the system can be extended to open-domain applications. Reference Resolution and Word Sense Disambiguation techniques are suggested as future enhancements.
Abstract:
Speech is the most natural means of communication among human beings, and speech processing and recognition have been intensive areas of research for the last five decades. Since speech recognition is a pattern recognition problem, classification is an important part of any speech recognition system. In this work, a speech recognition system is developed for recognizing speaker-independent spoken digits in Malayalam. Voice signals are sampled directly from the microphone. The proposed method is implemented for 1000 speakers uttering 10 digits each. Since the speech signals are affected by background noise, they are cleaned using a wavelet denoising method based on soft thresholding. Features are then extracted from the signals using Discrete Wavelet Transforms (DWT), which are well suited to processing non-stationary signals like speech because of their multi-resolution, multi-scale analysis characteristics. Speech recognition is a multiclass classification problem, so the resulting feature vectors are classified using three classifiers capable of handling multiple classes: Artificial Neural Networks (ANN), Support Vector Machines (SVM) and Naive Bayes. During the classification stage, each classifier is trained on feature vectors from known patterns and then tested on the test data set. The performance of the classifiers is evaluated based on recognition accuracy. All three methods produced good recognition accuracy: DWT with ANN achieved 89%, DWT with SVM achieved 86.6%, and DWT with Naive Bayes achieved 83.5%. ANN is found to be the best of the three methods.
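The soft thresholding step used in wavelet denoising can be sketched as below. This shows only the standard shrinkage rule applied to a list of wavelet coefficients. The abstract does not state how the threshold is chosen, so `threshold` is left as a free parameter, and the function name `soft_threshold` is hypothetical.

```python
import math


def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold`.

    Coefficients whose magnitude is at or below the threshold become 0;
    larger ones keep their sign and lose `threshold` in magnitude.
    """
    shrunk = []
    for c in coeffs:
        magnitude = abs(c) - threshold
        if magnitude <= 0:
            shrunk.append(0.0)
        else:
            shrunk.append(math.copysign(magnitude, c))
    return shrunk


# Small coefficients (assumed to be noise) are zeroed, large ones are shrunk.
denoised = soft_threshold([3.0, -0.5, -2.0, 0.2], 1.0)  # [2.0, 0.0, -1.0, 0.0]
```

In a full denoising pipeline this rule would be applied to the detail coefficients of the DWT before reconstructing the signal.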
Abstract:
A spectral angle based feature extraction method, Spectral Clustering Independent Component Analysis (SC-ICA), is proposed in this work to improve brain tissue classification from Magnetic Resonance Images (MRI). SC-ICA gives equal priority to global and local features, and thereby tries to resolve the inefficiency of conventional approaches in abnormal tissue extraction. First, the input multispectral MRI is divided into different clusters by spectral distance based clustering. Then, Independent Component Analysis (ICA) is applied to the clustered data, in conjunction with Support Vector Machines (SVM), for brain tissue analysis. Normal and abnormal datasets, consisting of real and synthetic T1-weighted, T2-weighted and proton density/fluid-attenuated inversion recovery images, were used to evaluate the performance of the new method. Comparative analysis with ICA based SVM and other conventional classifiers established the stability and efficiency of SC-ICA based classification, especially in the reproduction of small abnormalities. Clinical abnormal case analysis demonstrated this through the highest Tanimoto Index/accuracy values for reproduced lesions, 0.75/98.8%, against 0.17/96.1% for ICA based SVM. The experimental results recommend the proposed method as a promising approach for clinical and pathological studies of brain diseases.
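The abstract does not define how the Tanimoto Index was computed. A common definition for binary segmentation masks is the size of the overlap divided by the size of the union, which the sketch below assumes. The function name `tanimoto_index` and the flattened-mask representation are illustrative choices.

```python
def tanimoto_index(predicted, reference):
    """Overlap of two binary masks: |A and B| / |A or B|.

    Both masks are flat sequences of 0/1 (or False/True) of equal length.
    Two empty masks are treated as a perfect match (assumed convention).
    """
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, r in zip(predicted, reference) if p and r)
    union = sum(1 for p, r in zip(predicted, reference) if p or r)
    return intersection / union if union else 1.0


# Two of the four voxels marked in either mask are marked in both.
score = tanimoto_index([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])  # 0.5
```

Unlike voxel-wise accuracy, this measure ignores the background that both masks agree is empty, which is why it can differ sharply between methods whose accuracies are close.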
Abstract:
Rice husk silica was utilized as a promoter of ceria for preparing supported vanadia catalysts. The effect of vanadium content was investigated with 2–10 wt.% V2O5 loading over the support. Structural characterization of the catalysts was done by various techniques: energy dispersive X-ray (EDX), X-ray diffraction (XRD), BET surface area, thermal analysis (TGA/DTA), FT-infrared spectroscopy (FT-IR), UV–vis diffuse reflectance spectroscopy (DR UV–vis), electron paramagnetic resonance spectroscopy (EPR) and solid-state nuclear magnetic resonance spectroscopies (29Si and 51V MAS NMR). Catalytic activity was studied for the liquid-phase oxidation of benzene. The surface area of ceria was enhanced upon rice husk silica promotion, which makes dispersion of the active vanadia sites easier. Highly dispersed vanadia was found at low V2O5 loading, and cerium orthovanadate (CeVO4) forms as the loading increases. Spectroscopic investigation clearly confirms the formation of the CeVO4 phase at higher V2O5 loadings. The oxidation activity increases with vanadia loading up to 8 wt.% V2O5, and a further increase reduces the conversion rate. The selective formation of phenol can be attributed to the presence of highly dispersed active vanadia sites over the support.
Abstract:
A series of rare-earth neodymia supported vanadium oxide catalysts with V2O5 loadings ranging from 3 to 15 wt.% were prepared by the wet impregnation method using ammonium metavanadate as the vanadium precursor. The nature of the vanadia species formed on the support surface is characterized by a series of physicochemical techniques: X-ray diffraction (XRD), Fourier transform infrared spectroscopy (FTIR), BET surface area, diffuse reflectance UV-vis spectroscopy (DR UV-vis), thermal analysis (TG-DTG/DTA) and SEM. The acidity of the prepared systems was verified by stepwise temperature programmed desorption of ammonia (NH3-TPD), and the total acidity was found to increase with the percentage of vanadia loading. XRD and FTIR results show the presence of surface dispersed vanadyl species at lower loadings and the formation of higher vanadate species as the vanadia content is increased above 9 wt.%. The low surface area of the support, the calcination temperature and the percentage of vanadia loading are found to influence the formation of higher vanadia species. The catalytic activity of the V2O5-Nd2O3 catalysts was probed in the liquid phase hydroxylation of phenol, and the results show that the present catalysts are active at lower vanadia concentrations.
Abstract:
Near-infrared spectroscopy can be a workhorse technique for materials analysis in industries such as agriculture, pharmaceuticals, chemicals and polymers. A near-infrared spectrum represents combination bands and overtone bands that are harmonics of absorption frequencies in the mid-infrared. Near-infrared absorption includes a combination-band region immediately adjacent to the mid-infrared and three overtone regions. All four near-infrared regions contain "echoes" of the fundamental mid-infrared absorptions. For example, vibrations in the mid-infrared due to the C-H stretches will produce four distinct bands in each of the overtone and combination regions. As the bands become more removed from the fundamental frequencies, they become more widely separated from their neighbors, more broadened and dramatically reduced in intensity. Because near-infrared bands are much less intense, more of the sample can be used to produce a spectrum, and sample preparation is greatly reduced or eliminated. In addition, long path lengths and the ability to sample through glass in the near-infrared allow samples to be measured in common media such as culture tubes, cuvettes and reaction bottles. This is unlike the mid-infrared, where very small amounts of a sample produce a strong spectrum, so sample preparation techniques must be employed to limit the amount of sample that interacts with the beam. In the present work we describe the successful fabrication and calibration of a high resolution linear spectrometer using a tunable diode laser and a 36 m path length cell, and the measurement of a highly resolved structure of the OH group in methanol in the Δv = 3 transition region. We then analyse the NIR spectra of certain aromatic molecules and study the substituent effects using local mode theory.