991 results for Frequent itemset


Many previous approaches to frequent episode discovery only accept simple sequences. Although a recent approach has been able to find frequent episodes from complex sequences, the discovered sets are neither condensed nor accurate. This paper investigates the discovery of condensed sets of frequent episodes from complex sequences. We adopt a novel anti-monotonic frequency measure based on non-redundant occurrences, and define a condensed set, nDaCF (the set of non-derivable approximately closed frequent episodes), within a given maximal error bound of support. We then introduce a series of effective pruning strategies, and develop a method, nDaCF-Miner, for discovering nDaCF sets. Experimental results show that, when the error bound is somewhat high, the discovered nDaCF sets are two orders of magnitude smaller than complete sets, and nDaCF-Miner is more efficient than previous mining approaches. In addition, the nDaCF sets are more accurate than the sets found by previous approaches.


The knowledge embedded in an online data stream is likely to change over time due to the dynamic evolution of the stream. Consequently, in frequent episode mining over an online stream, frequent episodes should be adaptively extracted from recently generated stream segments instead of the whole stream. However, almost all existing frequent episode mining approaches find episodes that occur frequently over the whole sequence. This paper proposes and investigates a new problem: online mining of recently frequent episodes over data streams. In order to meet the strict requirements of stream mining, such as one-scan processing, adaptive result update and instant result return, we choose a novel frequency metric and define a highly condensed set called the base of recently frequent episodes. We then introduce a one-pass method for mining bases of recently frequent episodes. Experimental results show that the proposed method is capable of finding bases of recently frequent episodes quickly and adaptively. The proposed method outperforms the previous approaches, with the advantages of one-pass processing, instant result update and return, more condensed resulting sets and less space usage.


We consider a CPU-constrained environment for finding approximations of frequent sets in data streams using the landmark window. Our algorithm can detect overload situations, i.e., breaches of the CPU capacity, and sheds data from the stream to “keep up”. This is done within a controlled error threshold by exploiting the Chernoff bound. Empirical evaluation of the algorithm confirms its feasibility.
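
As a rough illustration of how a Chernoff-type bound can control the error of load shedding, the sketch below computes how many transactions must be kept so that a sampled support estimate stays within an additive error epsilon with probability at least 1 - delta. The abstract does not give the exact bound or the shedding policy, so the additive Chernoff-Hoeffding form used here, the uniform-sampling assumption, and the function names min_sample_size and max_shed_fraction are all assumptions made for illustration, not the authors' algorithm.

```python
import math


def min_sample_size(epsilon: float, delta: float) -> int:
    """Smallest n such that 2 * exp(-2 * n * epsilon**2) <= delta.

    Assumed bound (additive Chernoff-Hoeffding form): for n uniformly
    sampled transactions, the estimated support of an itemset deviates
    from its true support by more than epsilon with probability at
    most 2 * exp(-2 * n * epsilon**2).
    """
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def max_shed_fraction(transactions_seen: int, epsilon: float, delta: float) -> float:
    """Largest fraction of the stream that could be dropped (hypothetical policy)
    while still keeping at least min_sample_size(epsilon, delta) transactions."""
    needed = min_sample_size(epsilon, delta)
    if transactions_seen <= needed:
        return 0.0  # too little data so far: nothing can be shed
    return 1.0 - needed / transactions_seen
```

Under these assumptions, tightening epsilon increases the required sample quadratically, so the amount of data that can be shed during overload shrinks quickly as the error threshold is lowered.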


Discovering frequent patterns plays an essential role in many data mining applications. The aim of frequent pattern mining is to obtain information about the patterns that most commonly appear together. However, designing an efficient model to mine these patterns remains demanding because of the size of current databases. Therefore, we propose an Efficient Frequent Pattern Mining Model (EFP-M2) to mine frequent patterns in a timely manner. The results show that the algorithm in EFP-M2 outperforms the benchmark FP-Growth by at least two orders of magnitude.


Adequate vegetable and fruit consumption is necessary for preventing nutrition-related diseases. Socio-economically disadvantaged adolescents tend to consume relatively few vegetables and fruits. However, despite nutritional challenges associated with socio-economic disadvantage, a minority of adolescents manage to eat vegetables and fruit in quantities that are more in line with dietary recommendations. This investigation aimed to identify predictors of more frequent intakes of fruits and vegetables among adolescents over a 2-year follow-up period. Data were drawn from 521 socio-economically disadvantaged (maternal education ≤Year 10 of secondary school) Australian adolescents aged 12–15 years. Participants were recruited from 37 secondary schools and were asked to complete online surveys in 2004/2005 (baseline) and 2006/2007 (follow-up). Surveys comprised a 38-item FFQ and questions based on Social Ecological models examining intrapersonal, social and environmental influences on diet. At baseline and follow-up, respectively, 29% and 24% of adolescents frequently consumed vegetables (≥2 times/day); 33% and 36% frequently consumed fruit (≥1 time/day). In multivariable logistic regressions, baseline consumption strongly predicted consumption at follow-up. Frequently being served vegetables at dinner predicted frequent vegetable consumption. Female sex, rarely purchasing food or drink from school vending machines, and usually being expected to eat all foods served predicted frequent fruit consumption. Findings suggest that nutrition promotion initiatives aimed at improving eating behaviours among this at-risk population should focus on younger adolescents, particularly boys; improving adolescent eating behaviours at school; and encouraging families to increase home availability of healthy foods and to implement meal time rules.


To examine demographic and behavioural correlates of unhealthy snack-food consumption among Australian secondary-school students and the association between their perceptions of availability, convenience and intake with consumption.


Sufficient dairy food consumption during adolescence is necessary for preventing disease. While socio-economically disadvantaged adolescents tend to consume few dairy foods, some eat quantities more in line with dietary recommendations despite socio-economic challenges. Socio-economic variations in factors supportive of adolescents' frequent dairy consumption remain unexplored. The present study aimed to identify cross-sectional and longitudinal associations between intrapersonal, social and environmental factors and adolescents' frequent dairy consumption at baseline and two years later across socio-economic strata, and to examine whether socio-economic position moderated observed effects.


In big data analysis, frequent itemset mining plays a key role in mining associations, correlations and causality. Because some traditional frequent itemset mining algorithms cannot handle datasets made up of massive numbers of small files effectively, suffering from high memory cost, high I/O overhead and low computing performance, we propose a novel parallel frequent itemset mining algorithm based on the FP-Growth algorithm and discuss its applications in this paper. First, we introduce a small-file processing strategy for such datasets to compensate for the low read-write speed and low processing efficiency of Hadoop. Moreover, we use MapReduce to redesign the FP-Growth algorithm for parallel computing, thereby improving the overall performance of frequent itemset mining. Finally, we apply the proposed algorithm to the association analysis of data from the national college entrance examination and admission of China. The experimental results show that the proposed algorithm is feasible and valid, achieves a good speedup and a higher mining efficiency, and can meet the practical requirements of frequent itemset mining for datasets of massive small files. © 2014 ISSN 2185-2766.
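
To make the MapReduce idea concrete, the sketch below simulates in plain Python the item-counting pass that FP-Growth needs before it builds its tree: each shard of transactions is counted independently (the map step) and the partial counts are merged (the reduce step). This is not the authors' algorithm and does not use Hadoop; the shard layout, the absolute min_support threshold, and the function names map_count, reduce_counts and frequent_items are assumptions chosen for illustration.

```python
from collections import Counter
from functools import reduce


def map_count(shard):
    """Map step: count, within one shard, how many transactions contain each item."""
    counts = Counter()
    for transaction in shard:
        counts.update(set(transaction))  # set(): count an item once per transaction
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial count tables."""
    return left + right


def frequent_items(shards, min_support):
    """Return items whose global transaction count is at least min_support."""
    partials = (map_count(shard) for shard in shards)  # independent per shard
    total = reduce(reduce_counts, partials, Counter())
    return {item: n for item, n in total.items() if n >= min_support}


if __name__ == "__main__":
    shards = [
        [["a", "b"], ["a", "c"]],
        [["a", "b", "b"], ["c"]],
    ]
    print(frequent_items(shards, min_support=2))
```

Because each shard is counted without reference to the others, the map step can in principle run on separate workers; only the much smaller count tables need to be combined.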


Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)


Cancer is regarded as an abnormal multiplication of cells that is not controlled by the organism and whose cells present altered DNA. Initially, the disease does not show clinical signs, but it can be diagnosed by laboratory examinations. When tumors are present in the maxillofacial area, they can entail the loss of organs in this area, which can lead to the patient's social exclusion. This paper aimed to show, through a literature review, the cancers that most commonly occur in the face and the possibilities for restoring the mutilated patient through surgical reconstruction and prostheses.


Background. Loss of heterozygosity (LOH) correlates with inactivated tumor suppressor genes. LOH at chromosome arm 22q has been found in a variety of human neoplasms, suggesting that this region contains a tumor suppressor gene(s) other than NF2 important to tumorigenesis. The aim of this study was to evaluate the presence of LOH on chromosome 22q11.2-13 and determine whether there was a relationship between loss in this genomic region and tumor histologic parameters, anatomic site, and survival in patients with squamous cell carcinoma of the head and neck (HNSCC). Methods. Fifty matched blood and HNSCC tumor samples taken at the time of surgical treatment were evaluated for LOH by use of four microsatellite markers mapping to 22q11.2-q13. Clinical information was available for all patients. The frequency and distribution of LOH was correlated with clinical (age, sex, use of tobacco and alcohol, site of primary tumor, clinical stage, adjuvant therapy and overall survival) and histologic parameters (histopathologic stage, tumor differentiation). Results. LOH at 22q was found in 19 of 50 (38%) informative tumors. The respective incidence of allelic loss for the patients was as follows: 28% at D22S421, 10% at D22S277, 8% at D22S44S, and 4% at D22S280. No statistical differences were apparent with a mean follow-up of 30 months. Laryngeal tumors showed a higher incidence of LOH compared with oral tumors. Conclusions. These results suggest that the D22S277 locus may be closely linked to a tumor suppressor gene (TSG) and involved in upper aerodigestive tract carcinogenesis. In particular, laryngeal tumors may harbor another putative TSG on 22q11.2-q12.3 that may play a role in aggressive stage III/IV disease. (C) 2000 John Wiley & Sons, Inc.


Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)