991 results for Frequent itemset


Relevance:

20.00%

Abstract:

Frequent episode discovery is a popular framework for pattern discovery from sequential data. It has found many applications in domains such as alarm management in telecommunication networks, fault analysis in manufacturing plants, and prediction of user behavior in web click streams. In this paper, we address the discovery of serial episodes. In the episodes context, there have been multiple ways to quantify the frequency of an episode. Most of the current algorithms for episode discovery under various frequencies are apriori-based level-wise methods. These methods essentially perform a breadth-first search of the pattern space. However, there are currently no depth-first methods of pattern discovery in the frequent episode framework under many of the frequency definitions. In this paper, we try to bridge this gap. We provide new depth-first algorithms for serial episode discovery under non-overlapped and total frequencies. Under non-overlapped frequency, we present algorithms that can handle span and gap constraints on episode occurrences. Under total frequency, we present an algorithm that can handle a span constraint. We provide proofs of correctness for the proposed algorithms. We demonstrate the effectiveness of the proposed algorithms by extensive simulations. We also give detailed run-time comparisons with the existing apriori-based methods and illustrate scenarios under which the proposed pattern-growth algorithms perform better than their apriori counterparts. (C) 2013 Elsevier B.V. All rights reserved.
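
To make the notion of non-overlapped frequency concrete, the following is a minimal sketch of counting non-overlapped occurrences of a single serial episode in an event sequence. It is not the depth-first mining algorithm of the paper, and it ignores span and gap constraints; the function name `count_non_overlapped` and the representation of events as plain labels are assumptions made for illustration.

```python
def count_non_overlapped(events, episode):
    """Count non-overlapped occurrences of a serial episode.

    events  -- sequence of event labels in time order (assumed representation)
    episode -- ordered sequence of labels that must occur in this order

    Greedy scan: match the episode left to right, and as soon as one
    occurrence completes, count it and start looking for the next one.
    """
    if not episode:
        return 0
    count = 0
    pos = 0  # index of the next episode label we are waiting for
    for ev in events:
        if ev == episode[pos]:
            pos += 1
            if pos == len(episode):
                count += 1  # one full occurrence finished
                pos = 0     # the next occurrence must start after this one
    return count


if __name__ == "__main__":
    print(count_non_overlapped(list("ABCABC"), ("A", "B", "C")))
    print(count_non_overlapped(list("ABACBC"), ("A", "B", "C")))
```

An episode would then be called frequent when this count reaches a user-chosen threshold; how candidate episodes are generated (level-wise or depth-first) is a separate question that this sketch does not address.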

Relevance:

20.00%

Abstract:

Although Microcystis-based toxins have been intensively studied, previous studies using laboratory cultures of Microcystis strains have difficulty explaining why microcystin concentrations and toxin variants in natural blooms differ widely and frequently within a short period. The present study was designed to unravel the mechanisms behind the frequent variations of intracellular toxins related to differences in cyanobacterial colonies during bloom seasons in Lake Taihu, China. Monitoring of Microcystis colonies during warm seasons indicated that variations in both the concentrations and the species of microcystins were associated with the frequent alteration of Microcystis colonies in Lake Taihu. High concentrations of microcystins in the blooms were always associated with two Microcystis colonies, Microcystis flos-aquae and Microcystis aeruginosa, whereas when Microcystis wesenbergii was the dominant colonial type, the toxin production of the blooms was low. Additionally, environmental factors such as temperature and nutrition were also shown to have an effect on the toxin production of the blooms, and may also influence the Microcystis species present. The results of the present study provide insight into a new consideration for quick water quality monitoring, assessment and risk alert in cyanobacterium- and toxin-contaminated freshwaters, which will be beneficial not only for water agencies but also for public health. (C) 2009 Elsevier Ltd. All rights reserved.

Relevance:

20.00%

Abstract:

We grow InGaAs quantum dots (QDs) at a low growth rate with 70 insertions of growth interruption in an MBE system. It is found that, because of the extreme growth conditions, the QDs exhibit a thick wetting layer, a large QD height and a special surface morphology, which is attributed to the In segregation effect. The temperature dependence of the photoluminescence shows that this kind of QD has good thermal stability, which is explained in terms of a "group coupling" model that we propose. (C) 2007 Elsevier B.V. All rights reserved.

Relevance:

20.00%

Abstract:

We grow InGaAs quantum dots (QDs) at a low growth rate with 70 insertions of growth interruption in an MBE system. It is found that, because of the extreme growth conditions, the QDs exhibit a thick wetting layer, a large QD height and a special surface morphology, which is attributed to enhanced adatom surface diffusion and the In-segregation effect. The temperature dependence of the photoluminescence from surface QDs shows that this kind of QD has good thermal stability, which is explained in terms of the presence of surface oxide. The special distribution of the QDs may also play a role in this thermal behavior. (C) 2006 Elsevier B.V. All rights reserved.

Relevance:

20.00%

Abstract:

The problem of discovering frequent poly-regions (i.e. regions of high occurrence of a set of items or patterns of a given alphabet) in a sequence is studied, and three efficient approaches are proposed to solve it. The first one is entropy-based and applies a recursive segmentation technique that produces a set of candidate segments which may potentially lead to a poly-region. The key idea of the second approach is the use of a set of sliding windows over the sequence. Each sliding window covers a sequence segment and keeps a set of statistics that mainly include the number of occurrences of each item or pattern in that segment. Combining these statistics efficiently yields the complete set of poly-regions in the given sequence. The third approach applies a technique based on the majority vote, achieving linear running time with a minimal number of false negatives. After identifying the poly-regions, the sequence is converted to a sequence of labeled intervals (each one corresponding to a poly-region). An efficient algorithm for mining frequent arrangements of intervals is applied to the converted sequence to discover frequently occurring arrangements of poly-regions in different parts of DNA, including coding regions. The proposed algorithms are tested on various DNA sequences producing results of significant biological meaning.
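
The sliding-window idea described above can be illustrated with a minimal sketch: slide a fixed-width window over the sequence, keep per-item occurrence counts up to date incrementally, and report the windows in which a given item is dense. This is only an illustration of the window statistics, not the authors' algorithm for assembling complete poly-regions; the function name `dense_windows` and the parameters `width` and `min_frac` are assumptions.

```python
from collections import Counter


def dense_windows(seq, item, width, min_frac):
    """Return start positions of windows where `item` is dense.

    seq      -- the input sequence (for example, a DNA string)
    item     -- the item whose density is checked
    width    -- window width (hypothetical parameter)
    min_frac -- minimum fraction of the window that `item` must occupy
    """
    if width <= 0 or width > len(seq):
        return []
    counts = Counter(seq[:width])  # statistics for the first window
    hits = []
    for start in range(len(seq) - width + 1):
        if start > 0:
            # Slide by one: drop the item leaving, add the item entering.
            counts[seq[start - 1]] -= 1
            counts[seq[start + width - 1]] += 1
        if counts[item] / width >= min_frac:
            hits.append(start)
    return hits


if __name__ == "__main__":
    print(dense_windows("AAATTAAA", "A", width=3, min_frac=1.0))
```

Updating the counts incrementally keeps the work per window step constant, which is the practical reason for maintaining statistics per window rather than recounting each segment.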

Relevance:

20.00%

Abstract:

The problem of discovering frequent arrangements of regions of high occurrence of one or more items of a given alphabet in a sequence is studied, and two efficient approaches are proposed to solve it. The first approach is entropy-based and uses an existing recursive segmentation technique to split the input sequence into a set of homogeneous segments. The key idea of the second approach is to use a set of sliding windows over the sequence. Each sliding window keeps a set of statistics of a sequence segment that mainly includes the number of occurrences of each item in that segment. Combining these statistics efficiently yields the complete set of regions of high occurrence of the items of the given alphabet. After identifying these regions, the sequence is converted to a sequence of labeled intervals (each one corresponding to a region). An efficient algorithm for mining frequent arrangements of temporal intervals on a single sequence is applied on the converted sequence to discover frequently occurring arrangements of these regions. The proposed algorithms are tested on various DNA sequences producing results with significant biological meaning.

Relevance:

20.00%

Abstract:

The problem of discovering frequent arrangements of temporal intervals is studied. It is assumed that the database consists of sequences of events, where an event occurs during a time-interval. The goal is to mine temporal arrangements of event intervals that appear frequently in the database. The motivation of this work is the observation that in practice most events are not instantaneous but occur over a period of time and different events may occur concurrently. Thus, there are many practical applications that require mining such temporal correlations between intervals including the linguistic analysis of annotated data from American Sign Language as well as network and biological data. Two efficient methods to find frequent arrangements of temporal intervals are described; the first one is tree-based and uses depth first search to mine the set of frequent arrangements, whereas the second one is prefix-based. The above methods apply efficient pruning techniques that include a set of constraints consisting of regular expressions and gap constraints that add user-controlled focus into the mining process. Moreover, based on the extracted patterns a standard method for mining association rules is employed that applies different interestingness measures to evaluate the significance of the discovered patterns and rules. The performance of the proposed algorithms is evaluated and compared with other approaches on real (American Sign Language annotations and network data) and large synthetic datasets.
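
As a rough illustration of what a temporal arrangement of two event intervals means, the sketch below classifies how two intervals relate in time. The relation names (`equal`, `before`, `meets`, `contains`, `overlaps`) and the function `relation` are assumptions chosen for illustration; the abstract does not specify which set of relations the paper uses.

```python
def relation(a, b):
    """Classify the temporal relation between two closed intervals.

    a, b -- (start, end) tuples with start <= end.
    The pair is first ordered by start time (longer interval first on ties),
    and the returned name describes the first interval relative to the second.
    """
    first, second = sorted([a, b], key=lambda iv: (iv[0], -iv[1]))
    f_start, f_end = first
    s_start, s_end = second
    if (f_start, f_end) == (s_start, s_end):
        return "equal"
    if f_end < s_start:
        return "before"    # a gap separates the two intervals
    if f_end == s_start:
        return "meets"     # the second starts exactly when the first ends
    if f_end >= s_end:
        return "contains"  # the second lies entirely within the first
    return "overlaps"      # partial overlap


if __name__ == "__main__":
    print(relation((0, 4), (2, 6)))
    print(relation((0, 5), (1, 3)))
```

An arrangement of several intervals can then be described by the pairwise relations among them, and mining asks which such descriptions recur frequently across the database.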

Relevance:

20.00%

Abstract:

Aims: to study the gambling history and stories of participants, motivations, impact and help-seeking. Method: Details were advertised on websites and in newspapers. 30 frequent gamblers were interviewed over the telephone for approximately one hour. Verbatim transcriptions were analysed using NVIVO and grounded theory. Results/conclusions: Not all women had gambled before. However, internet accessibility meant prolonged periods were spent gambling to the neglect of other life areas. Some were originally motivated by excitement but others gambled to escape from current difficulties. Depression, anxiety, panic attacks and suicidal ideation were common. The women were ambivalent towards their gambling and receiving help.

Relevance:

20.00%

Abstract:

Frequent locations of thermal fronts in UK shelf seas were identified using an archive of 30,000 satellite images acquired between 1999 and 2008, and applied as a proxy for pelagic diversity in the designation of Marine Protected Areas (MPAs). Networks of MPAs are required for conservation of critical marine habitats within Europe, and there are similar initiatives worldwide. Many pelagic biodiversity hotspots are related to fronts, for example cetaceans and basking sharks around the Isle of Man, Hebrides and Cornwall, and hence remote sensing can address this policy need in regions with insufficient species distribution data. This is the first study of UK Continental Shelf front locations to use a 10-year archive of full-resolution (1.1 km) AVHRR data, revealing new aspects of their spatial and seasonal variability. Frontal locations determined at sea or predicted by ocean models agreed closely with the new frequent front maps, which also identified many additional frontal zones. These front maps were among the most widely used datasets in the recommendation of UK MPAs, and would be applicable to other geographic regions and to other policy drivers such as facilitating the deployment of offshore renewable energy devices with minimal environmental impact.

Relevance:

20.00%

Abstract:

The relatively high levels of cannabis use among young people are a cause of concern because of the positive relationship between early onset use, antisocial behaviours and associated lifestyle. In a survey of 3919 young people at school year 11 in Northern Ireland (aged 14/15 years), 142 reported daily cannabis use. These young people also reported particularly high levels of legal and illegal drug use and accounted for a high proportion of the use of hard drugs such as cocaine and heroin in the full school cohort. Daily cannabis users also reported high levels of antisocial behaviour and disaffection with school. The findings perhaps raise questions about the existence of a potentially ‘hidden’ high-risk school-based group of young people during adolescence who require specific targeted prevention strategies.