420 results for Big data, Spark, Hadoop


Relevance:

20.00%

Publisher:

Abstract:

It is a big challenge to guarantee the quality of relevance features discovered in text documents for describing user preferences, because of the large number of terms, patterns, and noise. Most existing popular text mining and classification methods have adopted term-based approaches. However, they have all suffered from the problems of polysemy and synonymy. Over the years, people have often held the hypothesis that pattern-based methods should perform better than term-based ones in describing user preferences, but many experiments do not support this hypothesis. The innovative technique presented in this paper makes a breakthrough on this difficulty. The technique discovers both positive and negative patterns in text documents as higher-level features in order to accurately weight low-level features (terms) based on their specificity and their distributions in the higher-level features. Substantial experiments using this technique on Reuters Corpus Volume 1 and TREC topics show that the proposed approach significantly outperforms both the state-of-the-art term-based methods underpinned by Okapi BM25, Rocchio, or Support Vector Machine and pattern-based methods on precision, recall, and F-measure.
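The abstract reports results on precision, recall, and F-measure without defining them. As a reminder of the standard set-based definitions, here is a minimal Python sketch. The function name `precision_recall_f`, the `beta` parameter, and the document identifiers are illustrative assumptions, not taken from the paper, and the paper's actual evaluation protocol may differ (for example, ranked or averaged variants).

```python
def precision_recall_f(retrieved, relevant, beta=1.0):
    """Set-based precision, recall and F-beta for one topic.

    `retrieved` and `relevant` are iterables of document ids
    (hypothetical ids here; the paper's data is not reproduced).
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved

    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0

    # F-beta is defined as 0 when both precision and recall are 0.
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0

    b2 = beta * beta
    f = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f


# Illustrative call with made-up document ids.
p, r, f1 = precision_recall_f(["d1", "d2", "d3", "d4"], ["d1", "d3", "d5"])
```

With `beta=1.0` the last value is the usual F1 score, the harmonic mean of precision and recall.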

Relevance:

20.00%

Publisher:

Abstract:

Participatory sensing enables ordinary citizens to collect, process, disseminate, and analyse environmental sensory data through mobile devices. Researchers have recognized the potential of participatory sensing and have attempted to apply it in many areas. However, participants may submit low-quality, misleading, inaccurate, or even malicious data. Therefore, finding a way to improve data quality has become a significant issue. This study proposes using reputation management to classify the gathered data and provide useful information for campaign organizers and data analysts to facilitate their decisions.

Relevance:

20.00%

Publisher:

Abstract:

We examined properties of culture-level personality traits in ratings of targets (N=5,109) ages 12 to 17 in 24 cultures. Aggregate scores were generalizable across gender, age, and relationship groups and showed convergence with culture-level scores from previous studies of self-reports and observer ratings of adults, but they were unrelated to national character stereotypes. Trait profiles also showed cross-study agreement within most cultures, 8 of which had not previously been studied. Multidimensional scaling showed that Western and non-Western cultures clustered along a dimension related to Extraversion. A culture-level factor analysis replicated earlier findings of a broad Extraversion factor but generally resembled the factor structure found in individuals. Continued analysis of aggregate personality scores is warranted.