979 resultados para contextual text mining


Relevância:

20.00% 20.00%

Publicador:

Resumo:

In the current study, epidemiology study is done by means of literature survey in groups identified to be at higher potential for DDIs as well as in other cases to explore patterns of DDIs and the factors affecting them. The structure of the FDA Adverse Event Reporting System (FAERS) database is studied and analyzed in detail to identify issues and challenges in data mining the drug-drug interactions. The necessary pre-processing algorithms are developed based on the analysis and the Apriori algorithm is modified to suit the process. Finally, the modules are integrated into a tool to identify DDIs. The results are compared using standard drug interaction database for validation. 31% of the associations obtained were identified to be new and the match with existing interactions was 69%. This match clearly indicates the validity of the methodology and its applicability to similar databases. Formulation of the results using the generic names expanded the relevance of the results to a global scale. The global applicability helps the health care professionals worldwide to observe caution during various stages of drug administration thus considerably enhancing pharmacovigilance

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper a method of copy detection in short Malayalam text passages is proposed. Given two passages one as the source text and another as the copied text it is determined whether the second passage is plagiarized version of the source text. An algorithm for plagiarism detection using the n-gram model for word retrieval is developed and found tri-grams as the best model for comparing the Malayalam text. Based on the probability and the resemblance measures calculated from the n-gram comparison , the text is categorized on a threshold. Texts are compared by variable length n-gram(n={2,3,4}) comparisons. The experiments show that trigram model gives the average acceptable performance with affordable cost in terms of complexity

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A methodology for translating text from English into the Dravidian language, Malayalam using statistical models is discussed in this paper. The translator utilizes a monolingual Malayalam corpus and a bilingual English/Malayalam corpus in the training phase and generates automatically the Malayalam translation of an unseen English sentence. Various techniques to improve the alignment model by incorporating the morphological inputs into the bilingual corpus are discussed. Removing the insignificant alignments from the sentence pairs by this approach has ensured better training results. Pre-processing techniques like suffix separation from the Malayalam corpus and stop word elimination from the bilingual corpus also proved to be effective in producing better alignments. Difficulties in translation process that arise due to the structural difference between the English Malayalam pair is resolved in the decoding phase by applying the order conversion rules. The handcrafted rules designed for the suffix separation process which can be used as a guideline in implementing suffix separation in Malayalam language are also presented in this paper. Experiments conducted on a sample corpus have generated reasonably good Malayalam translations and the results are verified with F measure, BLEU and WER evaluation metrics

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper discusses the implementation details of a child friendly, good quality, English text-to-speech (TTS) system that is phoneme-based, concatenative, easy to set up and use with little memory. Direct waveform concatenation and linear prediction coding (LPC) are used. Most existing TTS systems are unit-selection based, which use standard speech databases available in neutral adult voices.Here reduced memory is achieved by the concatenation of phonemes and by replacing phonetic wave files with their LPC coefficients. Linguistic analysis was used to reduce the algorithmic complexity instead of signal processing techniques. Sufficient degree of customization and generalization catering to the needs of the child user had been included through the provision for vocabulary and voice selection to suit the requisites of the child. Prosody had also been incorporated. This inexpensive TTS systemwas implemented inMATLAB, with the synthesis presented by means of a graphical user interface (GUI), thus making it child friendly. This can be used not only as an interesting language learning aid for the normal child but it also serves as a speech aid to the vocally disabled child. The quality of the synthesized speech was evaluated using the mean opinion score (MOS).

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Recognizing that high satisfaction leads to high customer loyalty, companies today are aiming for total customer satisfaction. This article explains relative impact of product quality, service quality and contextual experience on customer perceived value and intention to shop in the future. The data has been collected using a questionnaire from 205 customers of a national retailer chain. The relative importance of product quality, service quality and contextual experience on customer perceived value and thus on customer preference and future intentions was measured using multiple regression. Also, the contribution of perceived value to preference and thus on future buying intention was also measured. Structural Equation Model (SEM) using Amos 4 was used to find the overall fitness of the model. It was found that product quality, service quality and contextual experience have a major influence on customer perceived value

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Data mining means to summarize information from large amounts of raw data. It is one of the key technologies in many areas of economy, science, administration and the internet. In this report we introduce an approach for utilizing evolutionary algorithms to breed fuzzy classifier systems. This approach was exercised as part of a structured procedure by the students Achler, Göb and Voigtmann as contribution to the 2006 Data-Mining-Cup contest, yielding encouragingly positive results.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The aim of this paper is to indicate how TOSCANA may be extended to allow graphical representations not only of concept lattices but also of concept graphs in the sense of Contextual Logic. The contextual-logic extension of TOSCANA requires the logical scaling of conceptual and relatioal scales for which we propose the Peircean Algebraic Logic as reconstructed by R. W. Burch. As graphical representations we recommend, besides labelled line diagrams of concept lattices and Sowa's diagrams of conceptual graphs, particular information maps for utilizing background knowledge as much as possible. Our considerations are illustrated by a small information system about the domestic flights in Austria.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present a new algorithm called TITANIC for computing concept lattices. It is based on data mining techniques for computing frequent itemsets. The algorithm is experimentally evaluated and compared with B. Ganter's Next-Closure algorithm.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The problem of the relevance and the usefulness of extracted association rules is of primary importance because, in the majority of cases, real-life databases lead to several thousands association rules with high confidence and among which are many redundancies. Using the closure of the Galois connection, we define two new bases for association rules which union is a generating set for all valid association rules with support and confidence. These bases are characterized using frequent closed itemsets and their generators; they consist of the non-redundant exact and approximate association rules having minimal antecedents and maximal consequences, i.e. the most relevant association rules. Algorithms for extracting these bases are presented and results of experiments carried out on real-life databases show that the proposed bases are useful, and that their generation is not time consuming.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Semantic Web Mining aims at combining the two fast-developing research areas Semantic Web and Web Mining. The idea is to improve, on the one hand, the results of Web Mining by exploiting the new semantic structures in the Web; and to make use of Web Mining, on overview of where the two areas meet today, and sketches ways of how a closer integration could be profitable.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Formal Concept Analysis is an unsupervised learning technique for conceptual clustering. We introduce the notion of iceberg concept lattices and show their use in Knowledge Discovery in Databases (KDD). Iceberg lattices are designed for analyzing very large databases. In particular they serve as a condensed representation of frequent patterns as known from association rule mining. In order to show the interplay between Formal Concept Analysis and association rule mining, we discuss the algorithm TITANIC. We show that iceberg concept lattices are a starting point for computing condensed sets of association rules without loss of information, and are a visualization method for the resulting rules.

Relevância:

20.00% 20.00%

Publicador:

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Semantic Web Mining aims at combining the two fast-developing research areas Semantic Web and Web Mining. This survey analyzes the convergence of trends from both areas: Growing numbers of researchers work on improving the results of Web Mining by exploiting semantic structures in the Web, and they use Web Mining techniques for building the Semantic Web. Last but not least, these techniques can be used for mining the Semantic Web itself. The second aim of this paper is to use these concepts to circumscribe what Web space is, what it represents and how it can be represented and analyzed. This is used to sketch the role that Semantic Web Mining and the software agents and human agents involved in it can play in the evolution of Web space.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Social bookmark tools are rapidly emerging on the Web. In such systems users are setting up lightweight conceptual structures called folksonomies. These systems provide currently relatively few structure. We discuss in this paper, how association rule mining can be adopted to analyze and structure folksonomies, and how the results can be used for ontology learning and supporting emergent semantics. We demonstrate our approach on a large scale dataset stemming from an online system.