972 resultados para User classification


Relevância:

20.00% 20.00%

Publicador:

Resumo:

This is a Named Entity Based Question Answering System for Malayalam Language. Although a vast amount of information is available today in digital form, no effective information access mechanism exists to provide humans with convenient information access. Information Retrieval and Question Answering systems are the two mechanisms available now for information access. Information systems typically return a long list of documents in response to a user’s query which are to be skimmed by the user to determine whether they contain an answer. But a Question Answering System allows the user to state his/her information need as a natural language question and receives most appropriate answer in a word or a sentence or a paragraph. This system is based on Named Entity Tagging and Question Classification. Document tagging extracts useful information from the documents which will be used in finding the answer to the question. Question Classification extracts useful information from the question to determine the type of the question and the way in which the question is to be answered. Various Machine Learning methods are used to tag the documents. Rule-Based Approach is used for Question Classification. Malayalam belongs to the Dravidian family of languages and is one of the four major languages of this family. It is one of the 22 Scheduled Languages of India with official language status in the state of Kerala. It is spoken by 40 million people. Malayalam is a morphologically rich agglutinative language and relatively of free word order. Also Malayalam has a productive morphology that allows the creation of complex words which are often highly ambiguous. Document tagging tools such as Parts-of-Speech Tagger, Phrase Chunker, Named Entity Tagger, and Compound Word Splitter are developed as a part of this research work. No such tools were available for Malayalam language. Finite State Transducer, High Order Conditional Random Field, Artificial Immunity System Principles, and Support Vector Machines are the techniques used for the design of these document preprocessing tools. This research work describes how the Named Entity is used to represent the documents. Single sentence questions are used to test the system. Overall Precision and Recall obtained are 88.5% and 85.9% respectively. This work can be extended in several directions. The coverage of non-factoid questions can be increased and also it can be extended to include open domain applications. Reference Resolution and Word Sense Disambiguation techniques are suggested as the future enhancements

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Treating e-mail filtering as a binary text classification problem, researchers have applied several statistical learning algorithms to email corpora with promising results. This paper examines the performance of a Naive Bayes classifier using different approaches to feature selection and tokenization on different email corpora

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Underwater target localization and tracking attracts tremendous research interest due to various impediments to the estimation task caused by the noisy ocean environment. This thesis envisages the implementation of a prototype automated system for underwater target localization, tracking and classification using passive listening buoy systems and target identification techniques. An autonomous three buoy system has been developed and field trials have been conducted successfully. Inaccuracies in the localization results, due to changes in the environmental parameters, measurement errors and theoretical approximations are refined using the Kalman filter approach. Simulation studies have been conducted for the tracking of targets with different scenarios even under maneuvering situations. This system can as well be used for classifying the unknown targets by extracting the features of the noise emanations from the targets.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Suffix separation plays a vital role in improving the quality of training in the Statistical Machine Translation from English into Malayalam. The morphological richness and the agglutinative nature of Malayalam make it necessary to retrieve the root word from its inflected form in the training process. The suffix separation process accomplishes this task by scrutinizing the Malayalam words and by applying sandhi rules. In this paper, various handcrafted rules designed for the suffix separation process in the English Malayalam SMT are presented. A classification of these rules is done based on the Malayalam syllable preceding the suffix in the inflected form of the word (check_letter). The suffixes beginning with the vowel sounds like ആല, ഉെെ, ഇല etc are mainly considered in this process. By examining the check_letter in a word, the suffix separation rules can be directly applied to extract the root words. The quick look up table provided in this paper can be used as a guideline in implementing suffix separation in Malayalam language

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The aim of this study is to show the importance of two classification techniques, viz. decision tree and clustering, in prediction of learning disabilities (LD) of school-age children. LDs affect about 10 percent of all children enrolled in schools. The problems of children with specific learning disabilities have been a cause of concern to parents and teachers for some time. Decision trees and clustering are powerful and popular tools used for classification and prediction in Data mining. Different rules extracted from the decision tree are used for prediction of learning disabilities. Clustering is the assignment of a set of observations into subsets, called clusters, which are useful in finding the different signs and symptoms (attributes) present in the LD affected child. In this paper, J48 algorithm is used for constructing the decision tree and K-means algorithm is used for creating the clusters. By applying these classification techniques, LD in any child can be identified

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A spectral angle based feature extraction method, Spectral Clustering Independent Component Analysis (SC-ICA), is proposed in this work to improve the brain tissue classification from Magnetic Resonance Images (MRI). SC-ICA provides equal priority to global and local features; thereby it tries to resolve the inefficiency of conventional approaches in abnormal tissue extraction. First, input multispectral MRI is divided into different clusters by a spectral distance based clustering. Then, Independent Component Analysis (ICA) is applied on the clustered data, in conjunction with Support Vector Machines (SVM) for brain tissue analysis. Normal and abnormal datasets, consisting of real and synthetic T1-weighted, T2-weighted and proton density/fluid-attenuated inversion recovery images, were used to evaluate the performance of the new method. Comparative analysis with ICA based SVM and other conventional classifiers established the stability and efficiency of SC-ICA based classification, especially in reproduction of small abnormalities. Clinical abnormal case analysis demonstrated it through the highest Tanimoto Index/accuracy values, 0.75/98.8%, observed against ICA based SVM results, 0.17/96.1%, for reproduced lesions. Experimental results recommend the proposed method as a promising approach in clinical and pathological studies of brain diseases

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, we propose a multispectral analysis system using wavelet based Principal Component Analysis (PCA), to improve the brain tissue classification from MRI images. Global transforms like PCA often neglects significant small abnormality details, while dealing with a massive amount of multispectral data. In order to resolve this issue, input dataset is expanded by detail coefficients from multisignal wavelet analysis. Then, PCA is applied on the new dataset to perform feature analysis. Finally, an unsupervised classification with Fuzzy C-Means clustering algorithm is used to measure the improvement in reproducibility and accuracy of the results. A detailed comparative analysis of classified tissues with those from conventional PCA is also carried out. Proposed method yielded good improvement in classification of small abnormalities with high sensitivity/accuracy values, 98.9/98.3, for clinical analysis. Experimental results from synthetic and clinical data recommend the new method as a promising approach in brain tissue analysis.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Multispectral analysis is a promising approach in tissue classification and abnormality detection from Magnetic Resonance (MR) images. But instability in accuracy and reproducibility of the classification results from conventional techniques keeps it far from clinical applications. Recent studies proposed Independent Component Analysis (ICA) as an effective method for source signals separation from multispectral MR data. However, it often fails to extract the local features like small abnormalities, especially from dependent real data. A multisignal wavelet analysis prior to ICA is proposed in this work to resolve these issues. Best de-correlated detail coefficients are combined with input images to give better classification results. Performance improvement of the proposed method over conventional ICA is effectively demonstrated by segmentation and classification using k-means clustering. Experimental results from synthetic and real data strongly confirm the positive effect of the new method with an improved Tanimoto index/Sensitivity values, 0.884/93.605, for reproduced small white matter lesions

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The purpose of this paper is to examine the information-seeking behavior of science and social science research scholars, including service effectiveness, satisfaction level on different type of sources and various methods adopted by the scholars for keeping up to dateData were gathered using a questionnaire survey of 200, randomly selected, PhD students of science and social science departments of four universities in Kerala, IndiaAlthough similarities exist between social science and science PhD students with regard to information-seeking behavior, there are significant differences as well. There is a significant difference between science and social science scholars on the perception of the adequacy of print journals and database collection which are very relevant to the research purposes. There is no significant difference between science and social science scholars on the perception of the adequacy of e-journals, the most used source for keeping up to date. The study proved that scholars of both the fields are dissatisfied with the effectiveness of the library in keeping them up to date with latest developments

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper an attempt has been made to determine the number of Premature Ventricular Contraction (PVC) cycles accurately from a given Electrocardiogram (ECG) using a wavelet constructed from multiple Gaussian functions. It is difficult to assess the ECGs of patients who are continuously monitored over a long period of time. Hence the proposed method of classification will be helpful to doctors to determine the severity of PVC in a patient. Principal Component Analysis (PCA) and a simple classifier have been used in addition to the specially developed wavelet transform. The proposed wavelet has been designed using multiple Gaussian functions which when summed up looks similar to that of a normal ECG. The number of Gaussians used depends on the number of peaks present in a normal ECG. The developed wavelet satisfied all the properties of a traditional continuous wavelet. The new wavelet was optimized using genetic algorithm (GA). ECG records from Massachusetts Institute of Technology-Beth Israel Hospital (MIT-BIH) database have been used for validation. Out of the 8694 ECG cycles used for evaluation, the classification algorithm responded with an accuracy of 97.77%. In order to compare the performance of the new wavelet, classification was also performed using the standard wavelets like morlet, meyer, bior3.9, db5, db3, sym3 and haar. The new wavelet outperforms the rest

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Cancer treatment is most effective when it is detected early and the progress in treatment will be closely related to the ability to reduce the proportion of misses in the cancer detection task. The effectiveness of algorithms for detecting cancers can be greatly increased if these algorithms work synergistically with those for characterizing normal mammograms. This research work combines computerized image analysis techniques and neural networks to separate out some fraction of the normal mammograms with extremely high reliability, based on normal tissue identification and removal. The presence of clustered microcalcifications is one of the most important and sometimes the only sign of cancer on a mammogram. 60% to 70% of non-palpable breast carcinoma demonstrates microcalcifications on mammograms [44], [45], [46].WT based techniques are applied on the remaining mammograms, those are obviously abnormal, to detect possible microcalcifications. The goal of this work is to improve the detection performance and throughput of screening-mammography, thus providing a ‘second opinion ‘ to the radiologists. The state-of- the- art DWT computation algorithms are not suitable for practical applications with memory and delay constraints, as it is not a block transfonn. Hence in this work, the development of a Block DWT (BDWT) computational structure having low processing memory requirement has also been taken up.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The characterization and grading of glioma tumors, via image derived features, for diagnosis, prognosis, and treatment response has been an active research area in medical image computing. This paper presents a novel method for automatic detection and classification of glioma from conventional T2 weighted MR images. Automatic detection of the tumor was established using newly developed method called Adaptive Gray level Algebraic set Segmentation Algorithm (AGASA).Statistical Features were extracted from the detected tumor texture using first order statistics and gray level co-occurrence matrix (GLCM) based second order statistical methods. Statistical significance of the features was determined by t-test and its corresponding p-value. A decision system was developed for the grade detection of glioma using these selected features and its p-value. The detection performance of the decision system was validated using the receiver operating characteristic (ROC) curve. The diagnosis and grading of glioma using this non-invasive method can contribute promising results in medical image computing

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The country has witnessed tremendous increase in the vehicle population and increased axle loading pattern during the last decade, leaving its road network overstressed and leading to premature failure. The type of deterioration present in the pavement should be considered for determining whether it has a functional or structural deficiency, so that appropriate overlay type and design can be developed. Structural failure arises from the conditions that adversely affect the load carrying capability of the pavement structure. Inadequate thickness, cracking, distortion and disintegration cause structural deficiency. Functional deficiency arises when the pavement does not provide a smooth riding surface and comfort to the user. This can be due to poor surface friction and texture, hydro planning and splash from wheel path, rutting and excess surface distortion such as potholes, corrugation, faulting, blow up, settlement, heaves etc. Functional condition determines the level of service provided by the facility to its users at a particular time and also the Vehicle Operating Costs (VOC), thus influencing the national economy. Prediction of the pavement deterioration is helpful to assess the remaining effective service life (RSL) of the pavement structure on the basis of reduction in performance levels, and apply various alternative designs and rehabilitation strategies with a long range funding requirement for pavement preservation. In addition, they can predict the impact of treatment on the condition of the sections. The infrastructure prediction models can thus be classified into four groups, namely primary response models, structural performance models, functional performance models and damage models. The factors affecting the deterioration of the roads are very complex in nature and vary from place to place. Hence there is need to have a thorough study of the deterioration mechanism under varied climatic zones and soil conditions before arriving at a definite strategy of road improvement. Realizing the need for a detailed study involving all types of roads in the state with varying traffic and soil conditions, the present study has been attempted. This study attempts to identify the parameters that affect the performance of roads and to develop performance models suitable to Kerala conditions. A critical review of the various factors that contribute to the pavement performance has been presented based on the data collected from selected road stretches and also from five corporations of Kerala. These roads represent the urban conditions as well as National Highways, State Highways and Major District Roads in the sub urban and rural conditions. This research work is a pursuit towards a study of the road condition of Kerala with respect to varying soil, traffic and climatic conditions, periodic performance evaluation of selected roads of representative types and development of distress prediction models for roads of Kerala. In order to achieve this aim, the study is focused into 2 parts. The first part deals with the study of the pavement condition and subgrade soil properties of urban roads distributed in 5 Corporations of Kerala; namely Thiruvananthapuram, Kollam, Kochi, Thrissur and Kozhikode. From selected 44 roads, 68 homogeneous sections were studied. The data collected on the functional and structural condition of the surface include pavement distress in terms of cracks, potholes, rutting, raveling and pothole patching. The structural strength of the pavement was measured as rebound deflection using Benkelman Beam deflection studies. In order to collect the details of the pavement layers and find out the subgrade soil properties, trial pits were dug and the in-situ field density was found using the Sand Replacement Method. Laboratory investigations were carried out to find out the subgrade soil properties, soil classification, Atterberg limits, Optimum Moisture Content, Field Moisture Content and 4 days soaked CBR. The relative compaction in the field was also determined. The traffic details were also collected by conducting traffic volume count survey and axle load survey. From the data thus collected, the strength of the pavement was calculated which is a function of the layer coefficient and thickness and is represented as Structural Number (SN). This was further related to the CBR value of the soil and the Modified Structural Number (MSN) was found out. The condition of the pavement was represented in terms of the Pavement Condition Index (PCI) which is a function of the distress of the surface at the time of the investigation and calculated in the present study using deduct value method developed by U S Army Corps of Engineers. The influence of subgrade soil type and pavement condition on the relationship between MSN and rebound deflection was studied using appropriate plots for predominant types of soil and for classified value of Pavement Condition Index. The relationship will be helpful for practicing engineers to design the overlay thickness required for the pavement, without conducting the BBD test. Regression analysis using SPSS was done with various trials to find out the best fit relationship between the rebound deflection and CBR, and other soil properties for Gravel, Sand, Silt & Clay fractions. The second part of the study deals with periodic performance evaluation of selected road stretches representing National Highway (NH), State Highway (SH) and Major District Road (MDR), located in different geographical conditions and with varying traffic. 8 road sections divided into 15 homogeneous sections were selected for the study and 6 sets of continuous periodic data were collected. The periodic data collected include the functional and structural condition in terms of distress (pothole, pothole patch, cracks, rutting and raveling), skid resistance using a portable skid resistance pendulum, surface unevenness using Bump Integrator, texture depth using sand patch method and rebound deflection using Benkelman Beam. Baseline data of the study stretches were collected as one time data. Pavement history was obtained as secondary data. Pavement drainage characteristics were collected in terms of camber or cross slope using camber board (slope meter) for the carriage way and shoulders, availability of longitudinal side drain, presence of valley, terrain condition, soil moisture content, water table data, High Flood Level, rainfall data, land use and cross slope of the adjoining land. These data were used for finding out the drainage condition of the study stretches. Traffic studies were conducted, including classified volume count and axle load studies. From the field data thus collected, the progression of each parameter was plotted for all the study roads; and validated for their accuracy. Structural Number (SN) and Modified Structural Number (MSN) were calculated for the study stretches. Progression of the deflection, distress, unevenness, skid resistance and macro texture of the study roads were evaluated. Since the deterioration of the pavement is a complex phenomena contributed by all the above factors, pavement deterioration models were developed as non linear regression models, using SPSS with the periodic data collected for all the above road stretches. General models were developed for cracking progression, raveling progression, pothole progression and roughness progression using SPSS. A model for construction quality was also developed. Calibration of HDM–4 pavement deterioration models for local conditions was done using the data for Cracking, Raveling, Pothole and Roughness. Validation was done using the data collected in 2013. The application of HDM-4 to compare different maintenance and rehabilitation options were studied considering the deterioration parameters like cracking, pothole and raveling. The alternatives considered for analysis were base alternative with crack sealing and patching, overlay with 40 mm BC using ordinary bitumen, overlay with 40 mm BC using Natural Rubber Modified Bitumen and an overlay of Ultra Thin White Topping. Economic analysis of these options was done considering the Life Cycle Cost (LCC). The average speed that can be obtained by applying these options were also compared. The results were in favour of Ultra Thin White Topping over flexible pavements. Hence, Design Charts were also plotted for estimation of maximum wheel load stresses for different slab thickness under different soil conditions. The design charts showed the maximum stress for a particular slab thickness and different soil conditions incorporating different k values. These charts can be handy for a design engineer. Fuzzy rule based models developed for site specific conditions were compared with regression models developed using SPSS. The Riding Comfort Index (RCI) was calculated and correlated with unevenness to develop a relationship. Relationships were developed between Skid Number and Macro Texture of the pavement. The effort made through this research work will be helpful to highway engineers in understanding the behaviour of flexible pavements in Kerala conditions and for arriving at suitable maintenance and rehabilitation strategies. Key Words: Flexible Pavements – Performance Evaluation – Urban Roads – NH – SH and other roads – Performance Models – Deflection – Riding Comfort Index – Skid Resistance – Texture Depth – Unevenness – Ultra Thin White Topping

Relevância:

20.00% 20.00%

Publicador:

Resumo:

As a result of the drive towards waste-poor world and reserving the non-renewable materials, recycling the construction and demolition materials become very essential. Now reuse of the recycled concrete aggregate more than 4 mm in producing new concrete is allowed but with natural sand a fine aggregate while. While the sand portion that represent about 30\% to 60\% of the crushed demolition materials is disposed off. To perform this research, recycled concrete sand was produced in the laboratory while nine recycled sands produced from construction and demolitions materials and two sands from natural crushed limestone were delivered from three plants. Ten concrete mix designs representing the concrete exposition classes XC1, XC2, XF3 and XF4 according to European standard EN 206 were produced with partial and full replacement of natural sand by the different recycled sands. Bituminous mixtures achieving the requirements of base courses according to Germany standards and both base and binder courses according to Egyptian standards were produced with the recycled sands as a substitution to the natural sands. The mechanical properties and durability of concrete produced with the different recycled sands were investigated and analyzed. Also the volumetric analysis and Marshall test were performed hot bituminous mixtures produced with the recycled sands. According to the effect of replacement the natural sand by the different recycled sands on the concrete compressive strength and durability, the recycled sands were classified into three groups. The maximum allowable recycled sand that can be used in the different concrete exposition class was determined for each group. For the asphalt concrete mixes all the investigated recycled sands can be used in mixes for base and binder courses up to 21\% of the total aggregate mass.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The process of developing software that takes advantage of multiple processors is commonly referred to as parallel programming. For various reasons, this process is much harder than the sequential case. For decades, parallel programming has been a problem for a small niche only: engineers working on parallelizing mostly numerical applications in High Performance Computing. This has changed with the advent of multi-core processors in mainstream computer architectures. Parallel programming in our days becomes a problem for a much larger group of developers. The main objective of this thesis was to find ways to make parallel programming easier for them. Different aims were identified in order to reach the objective: research the state of the art of parallel programming today, improve the education of software developers about the topic, and provide programmers with powerful abstractions to make their work easier. To reach these aims, several key steps were taken. To start with, a survey was conducted among parallel programmers to find out about the state of the art. More than 250 people participated, yielding results about the parallel programming systems and languages in use, as well as about common problems with these systems. Furthermore, a study was conducted in university classes on parallel programming. It resulted in a list of frequently made mistakes that were analyzed and used to create a programmers' checklist to avoid them in the future. For programmers' education, an online resource was setup to collect experiences and knowledge in the field of parallel programming - called the Parawiki. Another key step in this direction was the creation of the Thinking Parallel weblog, where more than 50.000 readers to date have read essays on the topic. For the third aim (powerful abstractions), it was decided to concentrate on one parallel programming system: OpenMP. Its ease of use and high level of abstraction were the most important reasons for this decision. Two different research directions were pursued. The first one resulted in a parallel library called AthenaMP. It contains so-called generic components, derived from design patterns for parallel programming. These include functionality to enhance the locks provided by OpenMP, to perform operations on large amounts of data (data-parallel programming), and to enable the implementation of irregular algorithms using task pools. AthenaMP itself serves a triple role: the components are well-documented and can be used directly in programs, it enables developers to study the source code and learn from it, and it is possible for compiler writers to use it as a testing ground for their OpenMP compilers. The second research direction was targeted at changing the OpenMP specification to make the system more powerful. The main contributions here were a proposal to enable thread-cancellation and a proposal to avoid busy waiting. Both were implemented in a research compiler, shown to be useful in example applications, and proposed to the OpenMP Language Committee.