5 results for Frequent itemset

in Boston University Digital Common


Relevance:

20.00%

Publisher:

Abstract:

The problem of discovering frequent poly-regions (i.e. regions of high occurrence of a set of items or patterns of a given alphabet) in a sequence is studied, and three efficient approaches are proposed to solve it. The first one is entropy-based and applies a recursive segmentation technique that produces a set of candidate segments which may potentially lead to a poly-region. The key idea of the second approach is the use of a set of sliding windows over the sequence. Each sliding window covers a sequence segment and keeps a set of statistics that mainly include the number of occurrences of each item or pattern in that segment. Combining these statistics efficiently yields the complete set of poly-regions in the given sequence. The third approach applies a technique based on the majority vote, achieving linear running time with a minimal number of false negatives. After identifying the poly-regions, the sequence is converted to a sequence of labeled intervals (each one corresponding to a poly-region). An efficient algorithm for mining frequent arrangements of intervals is applied to the converted sequence to discover frequently occurring arrangements of poly-regions in different parts of DNA, including coding regions. The proposed algorithms are tested on various DNA sequences producing results of significant biological meaning.
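
The sliding-window idea in this abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name `dense_regions`, the single-item focus, and the parameters `window` and `min_frac` are assumptions made only for illustration. Each window keeps one statistic (the count of `item`), updated incrementally as the window slides, and qualifying windows are merged into intervals.

```python
from typing import List, Tuple


def dense_regions(seq: str, item: str, window: int, min_frac: float) -> List[Tuple[int, int]]:
    """Return merged half-open intervals [start, end) covered by windows in
    which `item` fills at least `min_frac` of the positions.

    Hypothetical helper: a simplification of the window statistics described
    in the abstract, tracking a single item rather than a set of patterns.
    """
    if window <= 0 or window > len(seq):
        return []
    regions: List[Tuple[int, int]] = []
    # Count of `item` in the first window.
    count = sum(1 for c in seq[:window] if c == item)
    for start in range(len(seq) - window + 1):
        if start > 0:
            # Slide by one position: drop the leftmost symbol, add the new rightmost one.
            count -= seq[start - 1] == item
            count += seq[start + window - 1] == item
        if count >= min_frac * window:
            end = start + window
            if regions and start <= regions[-1][1]:
                # Overlapping or adjacent windows extend the current region.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions
```

Because the count is updated in constant time per step, the scan is linear in the sequence length for a fixed window size; handling several window sizes or multi-symbol patterns would need additional bookkeeping not shown here.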

Relevance:

20.00%

Publisher:

Abstract:

The problem of discovering frequent arrangements of regions of high occurrence of one or more items of a given alphabet in a sequence is studied, and two efficient approaches are proposed to solve it. The first approach is entropy-based and uses an existing recursive segmentation technique to split the input sequence into a set of homogeneous segments. The key idea of the second approach is to use a set of sliding windows over the sequence. Each sliding window keeps a set of statistics of a sequence segment that mainly includes the number of occurrences of each item in that segment. Combining these statistics efficiently yields the complete set of regions of high occurrence of the items of the given alphabet. After identifying these regions, the sequence is converted to a sequence of labeled intervals (each one corresponding to a region). An efficient algorithm for mining frequent arrangements of temporal intervals on a single sequence is applied to the converted sequence to discover frequently occurring arrangements of these regions. The proposed algorithms are tested on various DNA sequences, producing results with significant biological meaning.
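
As a rough illustration of entropy-based recursive segmentation, the sketch below splits a sequence at the cut point that maximizes the weighted Jensen-Shannon divergence between the two sides, then recurses. The abstract does not name the specific technique it relies on, so the divergence criterion, the function names, and the `min_gain` stopping threshold are all assumptions; published methods typically stop on a statistical significance test instead.

```python
import math
from collections import Counter
from typing import List, Tuple


def entropy(segment: str) -> float:
    """Shannon entropy (bits) of the symbol distribution in `segment`."""
    n = len(segment)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(segment).values())


def best_split(seq: str) -> Tuple[int, float]:
    """Return (cut index, divergence) for the cut maximizing the weighted
    Jensen-Shannon divergence; (0, 0.0) means no cut improves homogeneity."""
    n = len(seq)
    total = entropy(seq)
    best_i, best_d = 0, 0.0
    for i in range(1, n):
        # Entropy of the whole minus the length-weighted entropies of the parts.
        d = total - (i / n) * entropy(seq[:i]) - ((n - i) / n) * entropy(seq[i:])
        if d > best_d:
            best_i, best_d = i, d
    return best_i, best_d


def segment(seq: str, min_gain: float, offset: int = 0) -> List[Tuple[int, int]]:
    """Recursively split `seq` into half-open segments [start, end).

    `min_gain` is a hypothetical stopping threshold used for simplicity.
    """
    i, d = best_split(seq)
    if i == 0 or d < min_gain:
        return [(offset, offset + len(seq))]
    return segment(seq[:i], min_gain, offset) + segment(seq[i:], min_gain, offset + i)
```

This version recomputes entropies from scratch at every cut, so it is quadratic per level; a practical implementation would maintain running symbol counts.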

Relevance:

20.00%

Publisher:

Abstract:

The problem of discovering frequent arrangements of temporal intervals is studied. It is assumed that the database consists of sequences of events, where an event occurs during a time interval. The goal is to mine temporal arrangements of event intervals that appear frequently in the database. The motivation of this work is the observation that in practice most events are not instantaneous but occur over a period of time, and different events may occur concurrently. Thus, there are many practical applications that require mining such temporal correlations between intervals, including the linguistic analysis of annotated data from American Sign Language as well as network and biological data. Two efficient methods to find frequent arrangements of temporal intervals are described; the first one is tree-based and uses depth-first search to mine the set of frequent arrangements, whereas the second one is prefix-based. The above methods apply efficient pruning techniques that include a set of constraints consisting of regular expressions and gap constraints that add user-controlled focus into the mining process. Moreover, based on the extracted patterns, a standard method for mining association rules is employed that applies different interestingness measures to evaluate the significance of the discovered patterns and rules. The performance of the proposed algorithms is evaluated and compared with other approaches on real (American Sign Language annotations and network data) and large synthetic datasets.
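
To make the notion of an "arrangement" of event intervals concrete, the sketch below classifies the relation between two intervals and counts the support of a two-interval arrangement across a database of sequences. It does not reproduce the tree-based or prefix-based miners from the abstract: the four coarse relations, the `Interval` tuple layout, and the function names are assumptions chosen for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: (label, start, end) with start < end.
Interval = Tuple[str, int, int]


def relation(a: Interval, b: Interval) -> str:
    """Coarse relation of `a` to `b`, assuming `a` starts no later than `b`
    (ties broken so that the longer interval comes first)."""
    _, _, a_end = a
    _, b_start, b_end = b
    if a_end < b_start:
        return "before"
    if a_end == b_start:
        return "meets"
    if a_end >= b_end:
        return "contains"
    return "overlaps"


def support(db: List[List[Interval]], label_a: str, rel: str, label_b: str) -> int:
    """Number of sequences containing at least one pair of intervals labeled
    (label_a, label_b) that stand in relation `rel`."""
    hits = 0
    for seq in db:
        # Order by start time; on ties, the longer interval first.
        ordered = sorted(seq, key=lambda iv: (iv[1], -iv[2]))
        found = any(
            x[0] == label_a and y[0] == label_b and relation(x, y) == rel
            for i, x in enumerate(ordered)
            for y in ordered[i + 1:]
        )
        hits += found
    return hits
```

An arrangement would be called frequent when its support reaches a user-chosen minimum; larger arrangements would be described by the relations among all pairs of their intervals, which is where pruning and constraints become important.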

Relevance:

10.00%

Publisher:

Abstract:

Supported housing for individuals with severe mental illness strives to provide the services necessary to place and keep individuals in independent housing that is integrated into the community and in which the consumer has choice and control over his or her services and supports. Supported housing can be contrasted to an earlier model called the “linear residential approach” in which individuals are moved from the most restrictive settings (e.g., inpatient settings) through a series of more independent settings (e.g., group homes, supervised apartments) and then finally to independent housing. This approach has been criticized as punishing the client due to frequent moves, and as being less likely to result in independent housing. In the supported housing model (Anthony & Blanch, 1988) consumers have choice and control over their living environment, their treatment, and supports (e.g., case management, mental health and substance abuse services). Supports are flexible and faded in and out depending on needs. Results of this systematic review of supported housing suggest that there are several well-controlled studies of supported housing and several studies conducted with less rigorous designs. Overall, our synthesis suggests that supported housing can improve the living situation of individuals who are psychiatrically disabled, homeless and with substance abuse problems. Results show that supported housing can help people stay in apartments or homes up to about 80% of the time over an extended period. These results are contrary to concerns expressed by proponents of the linear residential model and housing models that espoused more restrictive environments. Results also show that housing subsidies or vouchers are helpful in getting and keeping individuals housed. Housing services appear to be cost effective and to reduce the costs of other social and clinical services. 
In order to be most effective, intensive case management services (rather than traditional case management) are needed and will generally lead to better housing outcomes. Having access to affordable housing and having a service system that is well-integrated is also important. Providing a person with supported housing reduces the likelihood that they will be re-hospitalized, although supported housing does not always lead to reduced psychiatric symptoms. Supported housing can improve clients’ quality of life and satisfaction with their living situation. Providing supported housing options that are of decent quality is important in order to keep people housed and satisfied with their housing. In addition, rapid entry into housing, with the provision of choices is critical. Program and clinical supports may be able to mitigate the social isolation that has sometimes been associated with supported housing.

Relevance:

10.00%

Publisher:

Abstract:

This article introduces a new neural network architecture, called ARTMAP, that autonomously learns to classify arbitrarily many, arbitrarily ordered vectors into recognition categories based on predictive success. This supervised learning system is built up from a pair of Adaptive Resonance Theory modules (ARTa and ARTb) that are capable of self-organizing stable recognition categories in response to arbitrary sequences of input patterns. During training trials, the ARTa module receives a stream {a^(p)} of input patterns, and ARTb receives a stream {b^(p)} of input patterns, where b^(p) is the correct prediction given a^(p). These ART modules are linked by an associative learning network and an internal controller that ensures autonomous system operation in real time. During test trials, the remaining patterns a^(p) are presented without b^(p), and their predictions at ARTb are compared with b^(p). Tested on a benchmark machine learning database in both on-line and off-line simulations, the ARTMAP system learns orders of magnitude more quickly, efficiently, and accurately than alternative algorithms, and achieves 100% accuracy after training on less than half the input patterns in the database. It achieves these properties by using an internal controller that conjointly maximizes predictive generalization and minimizes predictive error by linking predictive success to category size on a trial-by-trial basis, using only local operations. This computation increases the vigilance parameter ρa of ARTa by the minimal amount needed to correct a predictive error at ARTb. Parameter ρa calibrates the minimum confidence that ARTa must have in a category, or hypothesis, activated by an input a^(p) in order for ARTa to accept that category, rather than search for a better one through an automatically controlled process of hypothesis testing.
Parameter ρa is compared with the degree of match between a^(p) and the top-down learned expectation, or prototype, that is read out subsequent to activation of an ARTa category. Search occurs if the degree of match is less than ρa. ARTMAP is hereby a type of self-organizing expert system that calibrates the selectivity of its hypotheses based upon predictive success. As a result, rare but important events can be quickly and sharply distinguished even if they are similar to frequent events with different consequences. Between input trials ρa relaxes to a baseline vigilance ρ̄a. When ρa is large, the system runs in a conservative mode, wherein predictions are made only if the system is confident of the outcome. Very few false-alarm errors then occur at any stage of learning, yet the system reaches asymptote with no loss of speed. Because ARTMAP learning is self-stabilizing, it can continue learning one or more databases, without degrading its corpus of memories, until its full memory capacity is utilized.
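
The vigilance mechanism described in this abstract (search when the match falls below ρa, and raise ρa just enough to reject a category that predicts wrongly) can be sketched as follows. This is a simplified illustration, not the published architecture: it assumes binary inputs, a match degree of |a AND w| / |a|, a choice ordering of |a AND w| / (beta + |w|), and it omits learning entirely. The names `search`, `beta`, and `eps` are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def overlap(a: Sequence[int], w: Sequence[int]) -> int:
    """Size of the intersection |a AND w| of two binary vectors."""
    return sum(x & y for x, y in zip(a, w))


def match(a: Sequence[int], w: Sequence[int]) -> float:
    """Degree of match between input `a` and prototype `w`."""
    norm = sum(a)
    return overlap(a, w) / norm if norm else 1.0


def search(
    a: Sequence[int],
    prototypes: List[Sequence[int]],
    labels: List[str],
    target: str,
    baseline: float,
    beta: float = 0.01,
    eps: float = 1e-6,
) -> Tuple[Optional[int], float]:
    """Return (accepted category index or None, final vigilance).

    Vigilance starts at `baseline`; each predictive error raises it to just
    above the match of the offending category, which forces further search.
    """
    rho = baseline
    # Assumed choice order: larger |a AND w| / (beta + |w|) is tried first.
    order = sorted(
        range(len(prototypes)),
        key=lambda j: -overlap(a, prototypes[j]) / (beta + sum(prototypes[j])),
    )
    for j in order:
        m = match(a, prototypes[j])
        if m < rho:
            continue  # fails the vigilance test, so the search moves on
        if labels[j] == target:
            return j, rho  # confident enough and the prediction is correct
        rho = m + eps  # minimal increase that rejects this category
    return None, rho  # no existing category fits; a new one would be needed
```

With input `[1, 1, 1, 0]`, prototypes `[1, 1, 0, 0]` (label "x") and `[1, 1, 1, 1]` (label "y"), and a baseline of 0.5, a training target of "y" first selects the "x" category, raises the vigilance just above 2/3, and then accepts the "y" category.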