119 results for Frequent Sequential Patterns
Abstract:
With the emergence of large-volume, high-speed streaming data, recent techniques for stream mining of closed frequent itemsets (CFIs) will become inefficient. When concept drift occurs at a slow rate in high-speed data streams, the rate of change of information across different sliding windows will be negligible. So the user will not lose changes in information if we slide the window by multiple transactions at a time. Therefore, we propose a novel approach for mining CFIs cumulatively with a sliding width (≥ 1) over high-speed data streams. However, it is nontrivial to mine CFIs cumulatively over a stream, because such growth may lead to the generation of an exponential number of candidates for closure checking. In this study, we develop an efficient algorithm, stream-close, for mining CFIs over a stream by exploring some interesting properties. Our performance study reveals that stream-close achieves good scalability and has promising results.
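To make the setting concrete, here is a minimal brute-force sketch of mining closed frequent itemsets over a sliding window that advances by several transactions at a time. It is not the stream-close algorithm, whose details the abstract does not give; the function names and the parameters `window_size`, `slide_width` and `min_support` are assumptions made for illustration. The exhaustive candidate enumeration is exactly the exponential cost the abstract warns about, so this is only usable on tiny examples.

```python
from collections import deque
from itertools import combinations


def closed_frequent_itemsets(transactions, min_support):
    """Brute-force closed frequent itemsets of a small list of frozensets."""
    items = sorted(set().union(*transactions)) if transactions else []
    support = {}
    # Enumerate every candidate itemset (exponential in the number of items).
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            count = sum(1 for t in transactions if itemset <= t)
            if count >= min_support:
                support[itemset] = count
    # An itemset is closed if no proper superset has the same support.
    return {
        itemset: count
        for itemset, count in support.items()
        if not any(itemset < other and count == support[other] for other in support)
    }


def slide_and_mine(stream, window_size, slide_width, min_support):
    """Mine each full window, advancing the window by slide_width transactions."""
    window = deque(maxlen=window_size)  # oldest transactions drop out automatically
    results = []
    pending = 0  # transactions seen since the last mining step
    for transaction in stream:
        window.append(frozenset(transaction))
        pending += 1
        if len(window) == window_size and pending >= slide_width:
            results.append(closed_frequent_itemsets(list(window), min_support))
            pending = 0
    return results
```

With `slide_width` greater than 1 the window is mined less often, which is the trade-off the abstract argues is acceptable when concept drift is slow.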
Abstract:
Frequent episode discovery is a popular framework for mining data available as a long sequence of events. An episode is essentially a short ordered sequence of event types, and the frequency of an episode is some suitable measure of how often the episode occurs in the data sequence. Recently, we proposed a new frequency measure for episodes based on the notion of non-overlapped occurrences of episodes in the event sequence, and showed that such a definition, in addition to yielding computationally efficient algorithms, has some important theoretical properties in connecting frequent episode discovery with HMM learning. This paper presents some new algorithms for frequent episode discovery under this non-overlapped occurrences-based frequency definition. The algorithms presented here are better (by a factor of N, where N denotes the size of episodes being discovered) in terms of both time and space complexities when compared to existing methods for frequent episode discovery. We show through some simulation experiments that our algorithms are very efficient. The new algorithms presented here have arguably the least possible orders of space and time complexities for the task of frequent episode discovery.
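As an illustration of the non-overlapped frequency, the sketch below counts non-overlapped occurrences of a single serial episode with one finite-state automaton that restarts after each completed occurrence. It is a simplified sketch, not necessarily the algorithm of the paper: it assumes events are plain event-type labels in time order, ignores any expiry-time constraint, and the function name `count_non_overlapped` is hypothetical.

```python
def count_non_overlapped(events, episode):
    """Count non-overlapped occurrences of a serial episode in an event sequence.

    events  -- iterable of event types in time order
    episode -- non-empty sequence of event types that must occur in this order
    """
    if not episode:
        raise ValueError("episode must contain at least one event type")
    count = 0
    position = 0  # index of the next event type the automaton is waiting for
    for event in events:
        if event == episode[position]:
            position += 1
            if position == len(episode):
                # Occurrence complete: count it and restart the automaton so the
                # next occurrence begins strictly after this one ends.
                count += 1
                position = 0
    return count
```

For example, `count_non_overlapped("ABABCABC", "ABC")` tracks one occurrence ending at the first `C` and a second one in the trailing `ABC`.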
Abstract:
Femtosecond spectroscopy carried out earlier on Monellin and some other systems has given insights into the hydration dynamics of the proteins. In the present work, molecular dynamics simulations have been performed on Monellin to study the hydration dynamics. A method has been described to follow up the molecular events of the protein–water interactions in detail. The time constants of the survival correlation function match well with the reported experimental values. This validates the procedure, adapted here for Monellin, to investigate the hydration dynamics in general.
Abstract:
We consider the classical problem of sequential detection of change in a distribution (from hypothesis 0 to hypothesis 1), where the fusion centre receives vectors of periodic measurements, with the measurements being i.i.d. over time and across the vector components, under each of the two hypotheses. In our problem, the sensor devices ("motes") that generate the measurements constitute an ad hoc wireless network. The motes contend using a random access protocol (such as CSMA/CA) to transmit their measurement packets to the fusion centre. The fusion centre waits for vectors of measurements to accumulate before taking decisions. We formulate the optimal detection problem, taking into account the network delay experienced by the vectors of measurements, and find that, under periodic sampling, the detection delay decouples into network delay and decision delay. We obtain a lower bound on the network delay, and propose a censoring scheme, where lagging sensors drop their delayed observations in order to mitigate network delay. We show that this scheme can achieve the lower bound. This approach is explored via simulation. We also use numerical evaluation and simulation to study issues such as the optimal sampling rate for a given number of sensors, and the optimal number of sensors for a given measurement rate.
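For orientation, the decision step at the fusion centre can be sketched with the standard CUSUM rule, a common procedure for sequential change detection. The abstract does not say which procedure the authors use, so CUSUM is an assumption here, as are the Gaussian mean-shift model and the names `cusum_alarm_time`, `mu0`, `mu1`, `sigma` and `threshold`. Network delay and the censoring scheme are not modelled.

```python
def cusum_alarm_time(batches, mu0, mu1, sigma, threshold):
    """Return the 1-based index of the batch at which CUSUM raises an alarm.

    batches -- iterable of measurement vectors, one vector per sampling instant
    mu0/mu1 -- assumed Gaussian means before and after the change
    sigma   -- assumed common standard deviation
    Returns None if the statistic never reaches the threshold.
    """
    statistic = 0.0
    for k, batch in enumerate(batches, start=1):
        # Components are i.i.d., so the vector log-likelihood ratio is the sum
        # of the per-component log-likelihood ratios.
        llr = sum(
            (mu1 - mu0) / sigma**2 * (x - (mu0 + mu1) / 2.0) for x in batch
        )
        # CUSUM recursion: accumulate evidence for a change, clipped at zero.
        statistic = max(0.0, statistic + llr)
        if statistic >= threshold:
            return k
    return None
```

A higher `threshold` lowers the false-alarm rate at the cost of a longer decision delay, which is the delay component the abstract separates from network delay.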
Abstract:
How the brain maintains perceptual continuity across eye movements that yield discontinuous snapshots of the world is still poorly understood. In this study, we adapted a framework from the dual-task paradigm, well suited to reveal bottlenecks in mental processing, to study how information is processed across sequential saccades. The pattern of RTs allowed us to distinguish among three forms of trans-saccadic processing (no trans-saccadic processing, trans-saccadic visual processing, and trans-saccadic visual processing and saccade planning models). Using a cued double-step saccade task, we show that even though saccade execution is a processing bottleneck, limiting access to incoming visual information, partial visual and motor processing that occurs prior to saccade execution is used to guide the next eye movement. These results provide insights into how the oculomotor system is designed to process information across multiple fixations that occur during natural scanning.
Abstract:
Wavelet transform analysis of projected fringe patterns for phase recovery in 3-D shape measurement of objects is investigated. The present communication specifically outlines and evaluates the errors that creep into the reconstructed profiles when fringe images do not satisfy periodicity. Three specific cases that give rise to non-periodicity of the fringe image are simulated, and the leakage effects caused by each of them are analyzed with the continuous complex Morlet wavelet transform. The same images are analyzed with the FFT method to compare the profiles reconstructed by the two methods. Simulation results revealed a significant advantage of wavelet transform profilometry (WTP): the distortions that arise due to leakage are confined to the locations of discontinuity and do not spread out over the entire projection, as they do in Fourier transform profilometry (FTP).
Abstract:
The frequent episode discovery framework is a popular framework in temporal data mining with many applications. Over the years, many different notions of frequencies of episodes have been proposed along with different algorithms for episode discovery. In this paper, we present a unified view of all the apriori-based discovery methods for serial episodes under these different notions of frequencies. Specifically, we present a unified view of the various frequency counting algorithms. We propose a generic counting algorithm such that all current algorithms are special cases of it. This unified view allows one to gain insights into different frequencies, and we present quantitative relationships among different frequencies. Our unified view also helps in obtaining correctness proofs for various counting algorithms, as we show here. It also aids in understanding and obtaining the anti-monotonicity properties satisfied by the various frequencies, the properties exploited by the candidate generation step of any apriori-based method. We also point out how our unified view of counting helps to consider generalization of the algorithm to count episodes with general partial orders.
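The role of anti-monotonicity in candidate generation can be sketched as follows. This is a generic apriori-style join for serial episodes, not the paper's generic counting algorithm. It assumes the strongest form of anti-monotonicity, namely that every subepisode of a frequent episode is frequent; some frequency notions satisfy only weaker properties, in which case the pruning step would have to be relaxed. Episodes are represented as tuples of event types, and `generate_candidates` is a hypothetical name.

```python
def generate_candidates(frequent):
    """Build (k+1)-node serial episode candidates from frequent k-node episodes.

    frequent -- iterable of equal-length tuples of event types
    Assumes every subepisode of a frequent episode is itself frequent.
    """
    frequent = set(frequent)
    candidates = set()
    for alpha in frequent:
        for beta in frequent:
            # Join step: alpha without its first node must equal beta without
            # its last node, so the two episodes overlap in k-1 nodes.
            if alpha[1:] == beta[:-1]:
                candidate = alpha + beta[-1:]
                # Prune step: dropping any single node must leave a frequent
                # k-node episode, otherwise the candidate cannot be frequent.
                subepisodes = (
                    candidate[:i] + candidate[i + 1:] for i in range(len(candidate))
                )
                if all(sub in frequent for sub in subepisodes):
                    candidates.add(candidate)
    return candidates
```

Only the surviving candidates need to be passed to the frequency counting step, which is where the different frequency notions discussed in the abstract come in.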
Abstract:
We present reduced dimensionality (RD) 3D HN(CA)NH for efficient sequential assignment in proteins. The experiment correlates the N-15 and H-1 chemical shift of a residue ('i') with those of its immediate N-terminal (i - 1) and C-terminal (i + 1) neighbors and provides four-dimensional chemical shift correlations rapidly with high resolution. An assignment strategy is presented which combines the correlations observed in this experiment with amino acid type information obtained from 3D CBCA(CO)NH. By classifying the 20 amino acid types into seven distinct categories based on C-13(beta) chemical shifts, it is observed that a stretch of five sequentially connected residues is sufficient to map uniquely on to the polypeptide for sequence specific resonance assignments. This method is exemplified by application to three different systems: maltose binding protein (42 kDa), intrinsically disordered domain of insulin-like growth factor binding protein-2 and Ubiquitin. Fast data acquisition is demonstrated using longitudinal H-1 relaxation optimization. Overall, 3D HN(CA)NH is a powerful tool for high throughput resonance assignment, in particular for unfolded or intrinsically disordered polypeptides.