8 results for Compound Word Splitter
in Cochin University of Science
Abstract:
This work presents a Named Entity based Question Answering System for the Malayalam language. Although a vast amount of information is available today in digital form, no effective mechanism exists to give people convenient access to it. Information Retrieval and Question Answering systems are the two mechanisms currently available for information access. Information Retrieval systems typically return a long list of documents in response to a user's query, which the user must skim to determine whether they contain an answer. A Question Answering System, by contrast, allows the user to state an information need as a natural language question and receive the most appropriate answer as a word, a sentence, or a paragraph. This system is based on Named Entity Tagging and Question Classification. Document tagging extracts useful information from the documents, which is then used in finding the answer to the question. Question Classification extracts useful information from the question to determine the type of the question and the way in which it is to be answered. Various Machine Learning methods are used to tag the documents. A Rule-Based Approach is used for Question Classification. Malayalam belongs to the Dravidian family of languages and is one of the four major languages of this family. It is one of the 22 Scheduled Languages of India, with official language status in the state of Kerala. It is spoken by 40 million people. Malayalam is a morphologically rich agglutinative language with relatively free word order. It also has a productive morphology that allows the creation of complex words, which are often highly ambiguous. Document tagging tools such as a Parts-of-Speech Tagger, Phrase Chunker, Named Entity Tagger, and Compound Word Splitter are developed as a part of this research work. No such tools were previously available for the Malayalam language.
Finite State Transducer, High Order Conditional Random Field, Artificial Immunity System Principles, and Support Vector Machines are the techniques used for the design of these document preprocessing tools. This research work describes how Named Entities are used to represent the documents. Single sentence questions are used to test the system. The overall Precision and Recall obtained are 88.5% and 85.9% respectively. This work can be extended in several directions. The coverage of non-factoid questions can be increased, and the system can be extended to open domain applications. Reference Resolution and Word Sense Disambiguation techniques are suggested as future enhancements.
Abstract:
The mechanism of devulcanization of sulfur-vulcanized natural rubber with aromatic disulfides and aliphatic amines has been studied using 2,3-dimethyl-2-butene (C6H12) as a low-molecular weight model compound. First, C6H12 was vulcanized with a mixture of sulfur, zinc stearate and N-cyclohexyl-2-benzothiazylsulfenamide (CBS) as accelerator at 140 °C, resulting in a mixture of addition products (C6H11-Sx-C6H11). The compounds were isolated and identified by High Performance Liquid Chromatography (HPLC) with respect to their various sulfur ranks. In a second stage, the vulcanized products were devulcanized using the agents mentioned above at 200 °C. The kinetics and chemistry of the breakdown of the sulfur bridges were monitored. Both devulcanization agents decompose sulfidic vulcanization products with sulfur ranks equal to or higher than 3 quite effectively and with comparable speed. Diphenyldisulfide as devulcanization agent gives rise to a high amount of mono- and disulfidic compounds formed during the devulcanization; hexadecylamine, as devulcanization agent, prevents these lower sulfur ranks from being formed.
Abstract:
It is observed that reclamation of natural rubber latex based rubber using 2,2'-dibenzamidodiphenyldisulphide as reclaiming agent is an optional methodology for recycling of waste latex rubber (WLR). For progressive replacement of virgin natural rubber by the reclaim, two alternative curing systems were investigated: an adjusted curing system, in which the curatives are reduced with increasing reclaim content to compensate for the extra amount of curatives brought along by the reclaim; and a fixed curing system, as if the reclaim were equivalent to virgin NR. The cure behavior, final crosslink density and distribution, mechanical properties, and dynamic viscoelastic properties of the blends with reclaimed WLR are measured and compared with the virgin compound. The morphology of the blends, sulfur migration, and final distribution are analyzed. The mechanical and dynamic viscoelastic properties deteriorate for both curing systems, but to a lesser extent for the fixed curing system than for the adjusted curing system. With the fixed cure system, many properties like tensile strength and compression set do still deteriorate, but tan δ and M300/M100, representative of the rolling resistance of tires, are improved. On the other hand, with the adjusted cure system both mechanical and dynamic properties still deteriorate.
Abstract:
The present work is mainly concentrated on setting up a NIR tunable diode laser absorption (TDLA) spectrometer for high-resolution molecular spectroscopic studies. To record a high-resolution tunable diode laser spectrum successfully, various experimental considerations must be taken into account: the setup should be free from mechanical vibrations, the sample should be kept at a low pressure, the laser should be in single mode operation, and so on. The present experimental setup considers all these factors. It is to be mentioned here that the setting up of a high-resolution NIR TDLA spectrometer is a novel experiment requiring much effort and patience. The analysis of near infrared (NIR) vibrational overtone spectra of some substituted benzene compounds using the local mode model forms another part of the present work. An attempt is made to record the pulsed laser induced fluorescence/Raman spectra of some organic compounds. A Q-switched Nd:YAG laser is used as the excitation source. A TRIAX monochromator and a CCD detector are used for the spectral recording. The observed fluorescence emission for carbon disulphide is centered at 680 nm; this is assigned to the n,π* transition. Aniline also shows a broad fluorescence emission centered at 725 nm, which is due to the π,π* transition. The pulsed laser Raman spectra of some organic compounds are also recorded using the same experimental setup. The calibration of the setup is done using the laser Raman spectra of carbon tetrachloride and carbon disulphide. The observed laser Raman spectra for aniline, o-chloroaniline and m-chlorotoluene show peaks characteristic of the aromatic ring in common, as well as the characteristic peaks due to the substituent groups. Some new peaks corresponding to low-lying vibrations of these molecules are also assigned.
Abstract:
A simple and efficient method for determining the complex permittivity of dielectric materials from both reflected and transmitted signals is presented. The technique is also novel in that it is implemented using two pyramidal horns without any focusing mechanisms. The dielectric constant of a noninteractive and distributive (NID) mixture of dielectrics is also determined.
Abstract:
Persistence of the antivibrio property of the potential antagonistic probiotics, Pseudomonas MCCB 102 and 103, at different temperatures, pH and in organic solvents was studied. The antivibrio compound was extracted, purified and characterized using thin-layer chromatography, high-pressure liquid chromatography, liquid chromatography-mass spectroscopy, UV-Vis and nuclear magnetic resonance spectroscopy and identified as N-methyl-1-hydroxyphenazine, a phenazine antibiotic. The toxicity of the compound was tested in Penaeus monodon haemocyte culture and the IC50 value was found to be 1.4 ± 0.31 mg L⁻¹. The compound was found to be bacteriostatic at 0.5 mg L⁻¹. Its stability to varying temperature, pH, organic solvents, prolonged shelf-life and vibriostatic nature point to its suitability for prophylactic aquaculture application.
Abstract:
In Statistical Machine Translation (SMT) from English to Malayalam, an unseen English sentence is translated into its equivalent Malayalam translation using statistical models such as a translation model, a language model and a decoder. A parallel English-Malayalam corpus is used in the training phase. Word-to-word alignments have to be set up among the sentence pairs of the source and target languages before subjecting them to training. This paper deals with techniques that can be adopted for improving the alignment model of SMT. Incorporating parts-of-speech information into the bilingual corpus has eliminated many of the insignificant alignments. Identifying the named entities and cognates present in the sentence pairs has also proved to be advantageous while setting up the alignments. Moreover, reduction of the unwanted alignments has brought in better training results. Experiments conducted on a sample corpus have generated reasonably good Malayalam translations, and the results are verified with the F measure, BLEU and WER evaluation metrics.
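As an illustration of one of the evaluation metrics named above, word error rate (WER) is commonly computed as the word-level edit distance between a reference translation and the system output, divided by the number of reference words. The following is a minimal sketch under that standard definition; the function name `word_error_rate` and whitespace tokenization are assumptions made for illustration, not details taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Tokenization by whitespace is an assumption; a real evaluation
    for Malayalam may need a language-specific tokenizer.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

A lower value is better: identical sentences score 0.0, and the score can exceed 1.0 when the hypothesis contains many extra words.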