600 results for Association mining
Abstract:
Automated analysis of the sentiments expressed in online consumer feedback can facilitate both organizations’ business strategy development and individual consumers’ comparison shopping. Nevertheless, existing opinion mining methods either adopt a context-free sentiment classification approach or rely on a large number of manually annotated training examples to perform context-sensitive sentiment classification. Guided by the design science research methodology, we illustrate the design, development, and evaluation of a novel fuzzy domain ontology-based context-sensitive opinion mining system. Our novel ontology extraction mechanism, underpinned by a variant of Kullback-Leibler divergence, can automatically acquire contextual sentiment knowledge across various product domains to improve the sentiment analysis processes. Evaluated on a benchmark dataset and real consumer reviews collected from Amazon.com, our system shows a remarkable performance improvement over the context-free baseline.
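The abstract does not say which variant of Kullback-Leibler divergence the system uses, so the following is only a minimal sketch of the standard discrete form, D_KL(P || Q). The function name `kl_divergence`, the smoothing constant `eps`, and the example distributions are assumptions made for illustration and are not taken from the paper.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Standard discrete KL divergence D_KL(P || Q), in bits.

    p and q are sequences of probabilities over the same outcomes.
    eps is a hypothetical floor that avoids division by zero when
    q assigns zero probability to an outcome that p supports.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:  # terms with p_i = 0 contribute nothing
            total += pi * math.log2(pi / max(qi, eps))
    return total


# Hypothetical example: how a term is distributed over
# (positive, negative) reviews in one domain versus overall.
term_in_domain = [1.0, 0.0]
term_overall = [0.5, 0.5]
divergence = kl_divergence(term_in_domain, term_overall)
```

A larger divergence would suggest that the term behaves differently in the given context than it does in general, which is the kind of signal a context-sensitive method could exploit.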
Abstract:
Many data mining techniques have been proposed for mining useful patterns in text documents. However, how to effectively use and update discovered patterns is still an open research issue, especially in the domain of text mining. Since most existing text mining methods adopted term-based approaches, they all suffer from the problems of polysemy and synonymy. Over the years, people have often held the hypothesis that pattern (or phrase) based approaches should perform better than the term-based ones, but many experiments did not support this hypothesis. This paper presents an innovative technique, effective pattern discovery which includes the processes of pattern deploying and pattern evolving, to improve the effectiveness of using and updating discovered patterns for finding relevant and interesting information. Substantial experiments on RCV1 data collection and TREC topics demonstrate that the proposed solution achieves encouraging performance.
Abstract:
It is a big challenge to guarantee the quality of discovered relevance features in text documents for describing user preferences because of the large number of terms, patterns, and noise. Most existing popular text mining and classification methods have adopted term-based approaches. However, they have all suffered from the problems of polysemy and synonymy. Over the years, people have often held the hypothesis that pattern-based methods should perform better than term-based ones in describing user preferences, but many experiments do not support this hypothesis. The innovative technique presented in this paper makes a breakthrough on this difficulty. This technique discovers both positive and negative patterns in text documents as higher-level features in order to accurately weight low-level features (terms) based on their specificity and their distributions in the higher-level features. Substantial experiments using this technique on Reuters Corpus Volume 1 and TREC topics show that the proposed approach significantly outperforms both the state-of-the-art term-based methods underpinned by Okapi BM25, Rocchio or Support Vector Machine and pattern-based methods on precision, recall and F measures.
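For readers unfamiliar with the evaluation measures named above, the sketch below computes set-based precision, recall and the balanced F measure (F1). The paper may use other variants (for example, measures at a rank cutoff), so this is an illustration only. The function name and the document identifiers are hypothetical.

```python
def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall and F1 for one topic."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical run: four documents returned, three truly relevant.
p, r, f1 = precision_recall_f1(
    retrieved=["d1", "d2", "d3", "d4"],
    relevant=["d1", "d2", "d5"],
)
```

Here two of the four returned documents are relevant, so precision is 2/4 and recall is 2/3.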
Abstract:
This paper presents a novel two-stage information filtering model which combines the merits of term-based and pattern-based approaches to effectively filter the sheer volume of information. In particular, the first filtering stage is supported by a novel rough analysis model which efficiently removes a large number of irrelevant documents, thereby addressing the overload problem. The second filtering stage is empowered by a semantically rich pattern taxonomy mining model which effectively fetches incoming documents according to the specific information needs of a user, thereby addressing the mismatch problem. Experiments were conducted to compare the proposed two-stage filtering (T-SM) model with other possible "term-based + pattern-based" or "term-based + term-based" IF models. The results based on the RCV1 corpus show that the T-SM model significantly outperforms other types of "two-stage" IF models.
Abstract:
This paper presents an automated image‐based safety assessment method for earthmoving and surface mining activities. The literature review revealed the possible causes of accidents on earthmoving operations, investigated the spatial risk factors of these types of accident, and identified spatial data needs for automated safety assessment based on current safety regulations. Image‐based data collection devices and algorithms for safety assessment were then evaluated. Analysis methods and rules for monitoring safety violations were also discussed. The experimental results showed that the safety assessment method collected spatial data using stereo vision cameras, applied object identification and tracking algorithms, and finally utilized identified and tracked object information for safety decision making.
Abstract:
This article presents the results of a study on the association between measured air pollutants and the respiratory health of resident women and children in Lao PDR, one of the least developed countries in Southeast Asia. The study, commissioned by the World Health Organisation, included PM10, CO and NO2 measurements made inside 181 dwellings in nine districts within two provinces in Lao PDR over a 5-month period (12/05–04/06), and respiratory health information (via questionnaires and peak expiratory flow rate (PEFR) measurements) for all residents in the same dwellings. Adjusted odds ratios were calculated separately for each health outcome using binary logistic regression. There was a strong and consistent positive association of NO2 and CO with almost all questionnaire-based health outcomes for both women and children. Women in dwellings with higher measured NO2 had more than triple the odds of almost all of the health outcomes, and higher concentrations of NO2 and CO were significantly associated with lower PEFR. This study supports a growing literature confirming the role of indoor air pollution in the burden of respiratory disease in developing countries. The results will directly support changes in health and housing policy in Lao PDR.
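To make "triple the odds" concrete, the sketch below computes an unadjusted odds ratio and a Wald 95% confidence interval from a 2x2 table. This is a simplification: the study reports adjusted odds ratios from binary logistic regression, which also account for covariates. The counts are hypothetical and are not data from the study.

```python
import math


def odds_ratio(exposed_cases, exposed_noncases,
               unexposed_cases, unexposed_noncases):
    """Unadjusted odds ratio from a 2x2 table (all cells must be > 0)."""
    cells = (exposed_cases, exposed_noncases,
             unexposed_cases, unexposed_noncases)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive")
    return (exposed_cases * unexposed_noncases) / (
        exposed_noncases * unexposed_cases)


def wald_confidence_interval(a, b, c, d, z=1.96):
    """Approximate 95% CI for the odds ratio, computed on the log scale."""
    log_or = math.log(odds_ratio(a, b, c, d))
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (math.exp(log_or - z * standard_error),
            math.exp(log_or + z * standard_error))


# Hypothetical counts: symptom present/absent in dwellings with
# higher versus lower measured NO2.
unadjusted_or = odds_ratio(30, 20, 10, 20)
lower, upper = wald_confidence_interval(30, 20, 10, 20)
```

With these made-up counts, the odds of the symptom are 30/20 in the higher-exposure group and 10/20 in the lower-exposure group, so the ratio is 3.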
Abstract:
Recently, a polymorphism was identified in exon 25 of the factor V gene that is possibly a functional candidate for the HR2 haplotype. This haplotype is characterized by a single base substitution named R2 (A4070G) in the B domain of the protein. A mutation (A6755G; 2194Asp→Gly) located near the C terminus has been hypothesized to influence protein folding and glycosylation, and might be responsible for the shift in factor V isoform (FV1 / FV2) ratio. This study investigated the prevalence of these two factor V HR2 haplotype polymorphisms in a cohort of normal blood donors, patients with osteoarthritis and women with complications during pregnancy, and in families of factor V Leiden individuals. A high allele frequency for the two polymorphisms was found in the blood donor group (6.2% R2, 5.6% A6755G). No significant difference in allele frequency was observed in the clinical groups (obstetric complications and osteoarthritis, 4.1-4.9% for the two polymorphisms) when compared with that of healthy blood donors. We confirm that the factor V A6755G polymorphism shows strong linkage to the R2 allele, although it is not exclusively inherited with the exon 13 A4070G variant and can occur independently. © 2001 Lippincott Williams & Wilkins.
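The allele frequencies quoted above are conventionally obtained by counting alleles across genotyped individuals, each of whom carries two copies of the gene. The sketch below shows that standard calculation. The genotype counts are hypothetical and are not taken from the study.

```python
def allele_frequency(homozygous_carriers, heterozygous_carriers,
                     total_individuals):
    """Frequency of a variant allele in a diploid sample.

    Each homozygous carrier contributes two variant alleles, each
    heterozygous carrier contributes one, and the sample contains
    2 * total_individuals alleles in all.
    """
    if total_individuals <= 0:
        raise ValueError("total_individuals must be positive")
    if homozygous_carriers + heterozygous_carriers > total_individuals:
        raise ValueError("carrier counts exceed sample size")
    variant_alleles = 2 * homozygous_carriers + heterozygous_carriers
    return variant_alleles / (2 * total_individuals)


# Hypothetical sample: 100 donors, 1 homozygous and 10 heterozygous.
frequency = allele_frequency(1, 10, 100)
```

In this made-up sample, 12 of the 200 alleles are the variant, giving a frequency of 6%.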
Abstract:
Background: Nitric oxide is released by immune, epithelial and endothelial cells, and plays an important part in the pathophysiology of asthma. Objective: To investigate the association of inducible nitric oxide synthase (iNOS) gene repeat polymorphisms with asthma. Methods: 230 families with asthma (842 individuals) were recruited to identify and establish the genetic association of iNOS repeats with asthma and associated phenotypes. Serum nitric oxide levels in selected individuals were measured and correlated with specific genotypes. Multiple logistic regression analysis was performed to determine the effect of age and sex. Results: A total of four repeats were identified and genotyped: a (CCTTT)n promoter repeat, a novel intron 2 (GT)n repeat (BV680047), an intron 4 (GT)n repeat (AFM311ZB1) and an intron 5 (CA)n repeat (D17S1878). A significant transmission distortion to the probands with asthma was seen for allele 3 of the AFM311ZB1 gene (p = 0.006). This allele was also found to be significantly associated with percentage blood eosinophils (p<0.001) and asthma severity (p = 0.04). Moreover, it was functionally correlated with high serum nitric oxide levels (p = 0.006). Similarly, the promoter repeat was found to be associated with serum total immunoglobulin (Ig)E (p = 0.028). Individuals carrying allele 4 of this repeat have high serum IgE (p<0.001) and nitric oxide levels (p = 0.03). Conclusion: This is the first study to identify the repeat polymorphisms in the iNOS gene that are associated with severity of asthma and eosinophils. The functional relevance of the associated alleles with serum nitric oxide levels was also shown. Therefore, these results could be valuable in elucidating the role of nitric oxide in asthma pathogenesis.