28 results for Frequent itemset


Relevance:

20.00% 20.00%

Publisher:

Abstract:

Many previous approaches to frequent episode discovery accept only simple sequences. Although a recent approach is able to find frequent episodes in complex sequences, the discovered sets are neither condensed nor accurate. This paper investigates the discovery of condensed sets of frequent episodes from complex sequences. We adopt a novel anti-monotonic frequency measure based on non-redundant occurrences, and define a condensed set, nDaCF (the set of non-derivable approximately closed frequent episodes), within a given maximal error bound of support. We then introduce a series of effective pruning strategies and develop a method, nDaCF-Miner, for discovering nDaCF sets. Experimental results show that, when the error bound is somewhat high, the discovered nDaCF sets are two orders of magnitude smaller than complete sets, and nDaCF-Miner is more efficient than previous mining approaches. In addition, the nDaCF sets are more accurate than the sets found by previous approaches.
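The idea of condensing a result set within a support error bound can be illustrated on plain itemsets. The sketch below is not nDaCF-Miner and does not handle episodes or non-derivability; it is a simplified stand-in in which a pattern is dropped when some proper superset has a support within `max_error` of its own. The function name `approx_closed`, the parameter `max_error`, and the sample supports are all hypothetical.

```python
def approx_closed(supports, max_error):
    """Keep patterns that have no proper superset whose support is
    within max_error of their own (max_error=0 gives exact closure)."""
    kept = {}
    for pattern, sup in supports.items():
        absorbed = any(
            pattern < other and sup - other_sup <= max_error
            for other, other_sup in supports.items()
        )
        if not absorbed:
            kept[pattern] = sup
    return kept


# Hypothetical supports of four frequent itemsets.
supports = {
    frozenset("a"): 10,
    frozenset("b"): 9,
    frozenset("ab"): 9,
    frozenset("abc"): 4,
}
exact = approx_closed(supports, 0)  # only {b} is absorbed by {a, b}
loose = approx_closed(supports, 1)  # {a} is absorbed as well
```

A larger error bound yields a smaller set, which matches the trade-off the abstract describes. This naive filter does not guard against chained absorption, where accumulated differences exceed the bound; a real method would have to control that.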

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The knowledge embedded in an online data stream is likely to change over time due to the dynamic evolution of the stream. Consequently, in frequent episode mining over an online stream, frequent episodes should be adaptively extracted from recently generated stream segments instead of the whole stream. However, almost all existing frequent episode mining approaches find episodes that occur frequently over the whole sequence. This paper proposes and investigates a new problem: online mining of recently frequent episodes over data streams. To meet the strict requirements of stream mining, such as one-scan processing, adaptive result update and instant result return, we choose a novel frequency metric and define a highly condensed set called the base of recently frequent episodes. We then introduce a one-pass method for mining bases of recently frequent episodes. Experimental results show that the proposed method finds bases of recently frequent episodes quickly and adaptively. The proposed method outperforms previous approaches, with the advantages of one-pass processing, instant result update and return, more condensed result sets and lower space usage.
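The difference between "frequent over the whole stream" and "recently frequent" can be shown with a minimal one-pass sketch. It counts single events in a fixed-size sliding window rather than episodes, and it does not use the paper's frequency metric or its base representation; the class name `RecentCounter` and the parameters `window_size` and `min_count` are assumptions made for illustration.

```python
from collections import Counter, deque


class RecentCounter:
    """One-pass counter over the most recent window_size events."""

    def __init__(self, window_size, min_count):
        self.window = deque()
        self.counts = Counter()
        self.window_size = window_size
        self.min_count = min_count

    def push(self, event):
        # Add the new event, then expire the oldest one if the window is full.
        self.window.append(event)
        self.counts[event] += 1
        if len(self.window) > self.window_size:
            old = self.window.popleft()
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]

    def frequent(self):
        # The result is available at any time without rescanning the stream.
        return {e for e, c in self.counts.items() if c >= self.min_count}


rc = RecentCounter(window_size=4, min_count=2)
for event in "aaa":
    rc.push(event)
early = rc.frequent()
for event in "bbbb":
    rc.push(event)
late = rc.frequent()
```

After the stream shifts from `a` to `b`, the event `a` is still frequent over the whole stream but no longer over the recent window, so it drops out of the result.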

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We consider a CPU-constrained environment for finding approximations of frequent sets in data streams using the landmark window. Our algorithm can detect overload situations, i.e., breaches of the CPU capacity, and sheds data in the stream to “keep up”. This is done within a controlled error threshold by exploiting the Chernoff bound. Empirical evaluation of the algorithm confirms its feasibility.
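The abstract does not state which form of the Chernoff bound is used, so the following sketch assumes the additive Chernoff-Hoeffding form, P(|p_hat - p| >= epsilon) <= 2 * exp(-2 * n * epsilon^2). Solving for n gives the number of sampled transactions needed to estimate a support within epsilon with probability at least 1 - delta. The function names and the simple rate-based shedding rule are hypothetical and are not taken from the paper.

```python
import math


def required_sample_size(epsilon, delta):
    """Smallest n with 2 * exp(-2 * n * epsilon**2) <= delta
    (additive Chernoff-Hoeffding bound, assumed form)."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def keep_probability(arrival_rate, capacity):
    """Fraction of transactions to keep when the stream arrives
    faster than the CPU can process (assumed shedding rule)."""
    return min(1.0, capacity / arrival_rate)


n = required_sample_size(epsilon=0.01, delta=0.05)
p_overloaded = keep_probability(arrival_rate=2000, capacity=500)
p_normal = keep_probability(arrival_rate=100, capacity=500)
```

Under these assumptions, shedding is safe as long as the number of transactions actually kept stays at or above `n`; the required sample size grows quadratically as epsilon shrinks but only logarithmically as delta shrinks.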

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Discovering frequent patterns plays an essential role in many data mining applications. The aim of frequent pattern mining is to obtain information about the patterns that most commonly appear together. However, designing an efficient model to mine these patterns remains demanding because of the size of current databases. Therefore, we propose an Efficient Frequent Pattern Mining Model (EFP-M2) to mine frequent patterns in a timely manner. The results show that the algorithm in EFP-M2 outperforms the benchmarked FP-Growth by at least two orders of magnitude.
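To make "patterns that commonly appear together" concrete, here is a minimal level-wise (Apriori-style) sketch. It is neither EFP-M2 nor FP-Growth, and it is far slower than either on real data; the function name `frequent_itemsets`, the absolute `min_support` threshold, and the sample baskets are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for all itemsets contained in at
    least min_support transactions, found level by level."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({i for t in transactions for i in t})
    current = [frozenset([i]) for i in items]
    result = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        frequent = {c: n for c, n in counts.items() if n >= min_support}
        result.update(frequent)
        k += 1
        # Join: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a, b in combinations(frequent, 2)
                      if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        current = [c for c in candidates
                   if all(frozenset(s) in frequent
                          for s in combinations(c, k - 1))]
    return result


baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
found = frequent_itemsets(baskets, min_support=2)
```

The prune step relies on the anti-monotonicity of support: an itemset cannot be frequent if one of its subsets is not. Pattern-growth methods such as FP-Growth avoid the repeated database scans that this sketch performs at every level.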

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Adequate vegetable and fruit consumption is necessary for preventing nutrition-related diseases. Socio-economically disadvantaged adolescents tend to consume relatively few vegetables and fruits. However, despite nutritional challenges associated with socio-economic disadvantage, a minority of adolescents manage to eat vegetables and fruit in quantities that are more in line with dietary recommendations. This investigation aimed to identify predictors of more frequent intakes of fruits and vegetables among adolescents over a 2-year follow-up period. Data were drawn from 521 socio-economically disadvantaged (maternal education ≤Year 10 of secondary school) Australian adolescents aged 12–15 years. Participants were recruited from 37 secondary schools and were asked to complete online surveys in 2004/2005 (baseline) and 2006/2007 (follow-up). Surveys comprised a 38-item FFQ and questions based on Social Ecological models examining intrapersonal, social and environmental influences on diet. At baseline and follow-up, respectively, 29% and 24% of adolescents frequently consumed vegetables (≥2 times/day); 33% and 36% frequently consumed fruit (≥1 time/day). In multivariable logistic regressions, baseline consumption strongly predicted consumption at follow-up. Frequently being served vegetables at dinner predicted frequent vegetable consumption. Female sex, rarely purchasing food or drink from school vending machines, and usually being expected to eat all foods served predicted frequent fruit consumption. Findings suggest that nutrition promotion initiatives aimed at improving eating behaviours among this at-risk population should focus on younger adolescents, particularly boys; on improving adolescent eating behaviours at school; and on encouraging families to increase home availability of healthy foods and to implement meal time rules.

Relevance:

20.00% 20.00%

Publisher:

Relevance:

20.00% 20.00%

Publisher:

Abstract:

To examine demographic and behavioural correlates of unhealthy snack-food consumption among Australian secondary-school students and the association between their perceptions of availability, convenience and intake with consumption.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Sufficient dairy food consumption during adolescence is necessary for preventing disease. While socio-economically disadvantaged adolescents tend to consume few dairy foods, some eat quantities more in line with dietary recommendations despite socio-economic challenges. Socio-economic variations in factors supportive of adolescents' frequent dairy consumption remain unexplored. The present study aimed to identify cross-sectional and longitudinal associations between intrapersonal, social and environmental factors and adolescents' frequent dairy consumption at baseline and two years later across socio-economic strata, and to examine whether socio-economic position moderated observed effects.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In big data analysis, frequent itemset mining plays a key role in mining associations, correlations and causality. Some traditional frequent itemset mining algorithms cannot handle datasets made up of massive numbers of small files effectively: they suffer from high memory cost, high I/O overhead and low computing performance. In this paper we propose a novel parallel frequent itemset mining algorithm based on the FP-Growth algorithm and discuss its applications. First, we introduce a small-file processing strategy for such datasets to compensate for the low read-write speed and low processing efficiency of Hadoop. Moreover, we use MapReduce to redesign the FP-Growth algorithm for parallel computing, thereby improving the overall performance of frequent itemset mining. Finally, we apply the proposed algorithm to the association analysis of data from the national college entrance examination and admission of China. The experimental results show that the proposed algorithm is feasible and valid, achieves a good speedup and a higher mining efficiency, and can meet the practical requirements of frequent itemset mining for massive small-file datasets. © 2014 ISSN 2185-2766.
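FP-Growth begins with a pass over the data that counts the support of each item, and that pass maps naturally onto MapReduce. The sketch below imitates the map and reduce phases in plain Python, running sequentially in one process; it does not use Hadoop, does not build FP-trees, and is not the algorithm proposed in the abstract. The function names and the sample splits are hypothetical.

```python
from collections import Counter
from functools import reduce


def map_count(split):
    """Map phase: count items within one split of transactions.
    Each item is counted once per transaction."""
    local = Counter()
    for transaction in split:
        local.update(set(transaction))
    return local


def reduce_counts(a, b):
    """Reduce phase: merge two partial counts."""
    return a + b


def parallel_item_counts(splits, min_support):
    # On a cluster, each map_count call would run on a separate worker.
    partials = map(map_count, splits)
    total = reduce(reduce_counts, partials, Counter())
    return {item: n for item, n in total.items() if n >= min_support}


# Two hypothetical splits, e.g. each built by merging several small files.
splits = [
    [["a", "b"], ["a", "c"]],
    [["a", "b", "b"], ["c"]],
]
frequent_items = parallel_item_counts(splits, min_support=3)
all_items = parallel_item_counts(splits, min_support=2)
```

Because the partial counts are merged by addition, the result does not depend on how the transactions are divided into splits, which is what allows small files to be combined into larger inputs without changing the outcome.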

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Time since last fire and fire frequency are strong determinants of plant community composition in fire-prone landscapes. Our study aimed to establish the influence of time since last fire and fire frequency on plant community composition and diversity of a south-west Australian semi-arid shrubland. We employed a space-for-time approach using four fire age classes: 'young', 8-15 years since last fire; 'medium', 16-34; 'old', 35-50; and 'very old', 51-100; and three fire frequency classes: burnt once, twice and three times within the last 50 years. Species diversity was compared using one-way ANOVA and species composition using PERMANOVA. Soil and climatic variables were included as covariables to partition underlying environmental drivers. We found that time since last fire influenced species richness, diversity and composition. Specifically, we recorded a late successional transition from woody seeders to long-lived, arid-zone, resprouting shrub species. Fire frequency did not influence species richness and diversity but did influence species composition via a reduction in cover of longer-lived resprouter species, presumably because of a reduced ability to replenish epicormic buds and/or sufficient starch stores. The distinct floristic composition of old and very old habitat, and the vulnerability of these areas to wildfires, indicate that these areas are ecologically important and that management should seek to preserve them.