47 results for stream mining

at Indian Institute of Science - Bangalore - India


Relevance:

80.00%

Abstract:

With the emergence of large-volume, high-speed streaming data, recent techniques for stream mining of closed frequent itemsets (CFIs) will become inefficient. When concept drift occurs at a slow rate in high-speed data streams, the change in information across successive sliding windows is negligible, so the user loses little information if the window slides by multiple transactions at a time. Therefore, we propose a novel approach for mining CFIs cumulatively by allowing a sliding width (≥ 1) over high-speed data streams. However, it is nontrivial to mine CFIs cumulatively over a stream, because such growth may lead to the generation of an exponential number of candidates for closure checking. In this study, we develop an efficient algorithm, stream-close, for mining CFIs over a stream by exploring some interesting properties. Our performance study reveals that stream-close achieves good scalability and has promising results.
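The idea of sliding the window by several transactions at a time can be illustrated with a minimal sketch. This is not the stream-close algorithm, whose details the abstract does not give: it is a brute-force enumeration suitable only for tiny windows, and the names `closed_frequent_itemsets`, `mine_stream`, `window_size`, `slide_width` and `min_support` are hypothetical.

```python
from collections import deque
from itertools import combinations


def closed_frequent_itemsets(transactions, min_support):
    """Brute-force closed frequent itemsets (illustrative, exponential cost)."""
    items = sorted(set().union(*transactions))
    support = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            count = sum(1 for t in transactions if set(candidate) <= t)
            if count >= min_support:
                support[frozenset(candidate)] = count
    # An itemset is closed if no proper superset has the same support.
    return {
        itemset: count
        for itemset, count in support.items()
        if not any(itemset < other and count == support[other] for other in support)
    }


def mine_stream(stream, window_size, slide_width, min_support):
    """Yield closed frequent itemsets each time the full window has
    advanced by at least slide_width transactions."""
    window = deque(maxlen=window_size)  # oldest transactions drop out automatically
    since_last_report = 0
    for transaction in stream:
        window.append(frozenset(transaction))
        since_last_report += 1
        if len(window) == window_size and since_last_report >= slide_width:
            since_last_report = 0
            yield closed_frequent_itemsets(list(window), min_support)
```

With `slide_width=1` the sketch re-mines after every transaction; larger values re-mine less often, which is the trade-off the abstract argues is acceptable when concept drift is slow.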

Relevance:

30.00%

Abstract:

The impact of riparian land use on stream insect communities was studied at Kudremukh National Park, located within the Western Ghats, a tropical biodiversity hotspot in India. The diversity and community composition of stream insects varied across streams with different riparian land use types. The rarefied family and generic richness was highest in streams with natural semi-evergreen forests as riparian vegetation. However, when the streams had human habitations and areca nut plantations as the riparian land use type, the rarefied richness was higher than that of streams with natural evergreen forests and grasslands. The streams with scrub lands and iron ore mining as the riparian land use had the lowest rarefied richness. Within a landscape, the streams with natural riparian vegetation had similar community composition. However, streams with natural grasslands as the riparian vegetation had low diversity, and their community composition was similar to that of paddy fields. We discuss how stream insect assemblages differ due to varied riparian land use patterns, reflecting fundamental alterations in the functioning of stream ecosystems. This understanding is vital to conserve, manage and restore tropical riverine ecosystems.

Relevance:

30.00%

Abstract:

Automatic identification of software faults has enormous practical significance. This requires characterizing program execution behavior and using appropriate data mining techniques on the chosen representation. In this paper, we use the sequence of system calls to characterize program execution. The data mining tasks addressed are learning to map system call streams to fault labels and automatic identification of fault causes. Spectrum kernels and SVM are used for the former, while latent semantic analysis is used for the latter. The techniques are demonstrated on the intrusion dataset containing system call traces. The results show that kernel techniques are as accurate as the best available results but are faster by orders of magnitude. We also show that latent semantic indexing is capable of revealing fault-specific features.
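As a rough illustration of what a spectrum kernel computes on system call traces, the sketch below counts the length-k subsequences (k-mers) of each trace and takes the inner product of the two count vectors. This is the standard k-spectrum kernel definition, not the paper's implementation. The function names and the example call names are assumptions.

```python
from collections import Counter


def spectrum(sequence, k):
    """Count every contiguous length-k subsequence (k-mer) of the trace."""
    return Counter(tuple(sequence[i:i + k]) for i in range(len(sequence) - k + 1))


def spectrum_kernel(seq_a, seq_b, k):
    """k-spectrum kernel: inner product of the two k-mer count vectors."""
    counts_a, counts_b = spectrum(seq_a, k), spectrum(seq_b, k)
    # Counter returns 0 for k-mers that never occur in seq_b.
    return sum(count * counts_b[kmer] for kmer, count in counts_a.items())


# Hypothetical system call traces, for illustration only.
trace_a = ["open", "read", "read", "close"]
trace_b = ["open", "read", "close"]
similarity = spectrum_kernel(trace_a, trace_b, k=2)
```

A matrix of such pairwise values could then be passed to an SVM that accepts a precomputed kernel; how the paper wires this up is not stated in the abstract.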

Relevance:

20.00%

Abstract:

The critical stream power criterion may be used to describe the incipient motion of cohesionless particles of plane sediment beds. The governing equation relating "critical stream power" to "shear Reynolds number" is developed by using the present experimental data as well as the data from several other sources. Simultaneously, a resistance equation relating the "particle Reynolds number" to the "shear Reynolds number" is developed for plane sediment beds in wide channels with little or no transport. By making use of these relations, a procedure is developed to design plane sediment beds such that any two of the four design variables (particle size, energy/friction slope, flow depth, and discharge per unit width in the channel) should be known to predict the remaining two variables. Finally, a straightforward design procedure using design tables/design curves and analytical methods is presented to solve six possible design problems.

Relevance:

20.00%

Abstract:

The StreamIt programming model has been proposed to exploit parallelism in streaming applications on general purpose multicore architectures. The StreamIt graphs describe task, data and pipeline parallelism which can be exploited on accelerators such as Graphics Processing Units (GPUs) or CellBE, which support abundant parallelism in hardware. In this paper, we describe a novel method to orchestrate the execution of a StreamIt program on a multicore platform equipped with an accelerator. The proposed approach identifies, using profiling, the relative benefits of executing a task on the superscalar CPU cores and the accelerator. We formulate the problem of partitioning the work between the CPU cores and the GPU, taking into account the latencies for data transfers and the required buffer layout transformations associated with the partitioning, as an integrated Integer Linear Program (ILP) which can then be solved by an ILP solver. We also propose an efficient heuristic algorithm for the work partitioning between the CPU and the GPU, which provides solutions that are within 9.05% of the optimal solution on average across the benchmark suite. The partitioned tasks are then software pipelined to execute on the multiple CPU cores and the Streaming Multiprocessors (SMs) of the GPU. The software pipelining algorithm orchestrates the execution between CPU cores and the GPU by emitting the code for the CPU and the GPU, and the code for the required data transfers. Our experiments on a platform with 8 CPU cores and a GeForce 8800 GTS 512 GPU show a geometric mean speedup of 6.94X with a maximum of 51.96X over a single-threaded CPU execution across the StreamIt benchmarks. This is an 18.9% improvement over a partitioning strategy that maps only the filters that cannot be executed on the GPU (the filters with state that is persistent across firings) onto the CPU.

Relevance:

20.00%

Abstract:

Understanding the functioning of a neural system in terms of its underlying circuitry is an important problem in neuroscience. Recent developments in electrophysiology and imaging allow one to simultaneously record the activities of hundreds of neurons. Inferring the underlying neuronal connectivity patterns from such multi-neuronal spike train data streams is a challenging statistical and computational problem. This task involves finding significant temporal patterns in vast amounts of symbolic time series data. In this paper we show that the frequent episode mining methods from the field of temporal data mining can be very useful in this context. In the frequent episode discovery framework, the data is viewed as a sequence of events, each of which is characterized by an event type and its time of occurrence, and episodes are certain types of temporal patterns in such data. Here we show that, using the set of discovered frequent episodes from multi-neuronal data, one can infer different types of connectivity patterns in the neural system that generated it. For this purpose, we introduce the notion of mining for frequent episodes under certain temporal constraints; the structure of these temporal constraints is motivated by the application. We present algorithms for discovering serial and parallel episodes under these temporal constraints. Through extensive simulation studies we demonstrate that these methods are useful for unearthing patterns of neuronal network connectivity.

Relevance:

20.00%

Abstract:

The role of the Acidithiobacillus group of bacteria in acid generation and heavy metal dissolution was studied with relevance to some Indian mines. Microorganisms implicated in acid generation such as Acidithiobacillus Acidithiobacillus thiooxidans and Leptospirillum ferrooxidans were isolated from abandoned mines, waste rocks and tailing dumps. Arsenite-oxidizing Thiomonas and Bacillus groups of bacteria were isolated and their ability to oxidize As(III) to As(V) established. Mine-isolated sulfate-reducing bacteria were used to remove dissolved copper, zinc, iron and arsenic from solutions.

Relevance:

20.00%

Abstract:

Data mining involves the nontrivial process of extracting knowledge or patterns from large databases. Genetic Algorithms are efficient and robust searching and optimization methods that are used in data mining. In this paper we propose a Self-Adaptive Migration Model GA (SAMGA), where the population size, the number of crossover points and the mutation rate for each population are adaptively fixed. Further, the migration of individuals between populations is decided dynamically. This paper gives a mathematical schema analysis of the method, stating and showing that the algorithm exploits previously discovered knowledge for a more focused and concentrated search of heuristically high-yielding regions while simultaneously performing a highly explorative search on the other regions of the search space. The effective performance of the algorithm is then shown using standard testbed functions and a set of actual classification data mining problems. A Michigan-style classifier was used to build the classifier, and the system was tested with machine learning databases including the Pima Indian Diabetes database, the Wisconsin Breast Cancer database and a few others. The performance of our algorithm is better than that of the others.

Relevance:

20.00%

Abstract:

The StreamIt programming model has been proposed to exploit parallelism in streaming applications on general purpose multi-core architectures. This model allows programmers to specify the structure of a program as a set of filters that act upon data, and a set of communication channels between them. The StreamIt graphs describe task, data and pipeline parallelism which can be exploited on modern Graphics Processing Units (GPUs), as they support abundant parallelism in hardware. In this paper, we describe the challenges in mapping StreamIt to GPUs and propose an efficient technique to software pipeline the execution of stream programs on GPUs. We formulate this problem (both scheduling and assignment of filters to processors) as an efficient Integer Linear Program (ILP), which is then solved using ILP solvers. We also describe a novel buffer layout technique for GPUs which facilitates exploiting the high memory bandwidth available in GPUs. The proposed scheduling utilizes both the scalar units in the GPU, to exploit data parallelism, and the multiprocessors, to exploit task and pipeline parallelism. Further, it takes into consideration the synchronization and bandwidth limitations of GPUs, and yields speedups between 1.87X and 36.83X over a single-threaded CPU.

Relevance:

20.00%

Abstract:

Classification of large datasets is a challenging task in Data Mining. In the current work, we propose a novel method that compresses the data and classifies the test data directly in its compressed form. The work forms a hybrid learning approach integrating the activities of data abstraction, frequent item generation, compression, classification and use of rough sets.

Relevance:

20.00%

Abstract:

The problem of two-stream instability in plasma is studied by specifying the importance of initial magnetic field associated with the motion of the charged particles and the boundary effects. In Part I the accurate initial steady state is studied when the streams of electrons and ions move with different uniform speeds in plasmas with plane and cylindrical geometry. In Part II, in order to show the effects of finiteness and inhomogeneity of the system, small transverse plasma oscillations are studied in the case of plane plasmas. The role of plasma-sheath oscillations at the boundaries is found to be very important in driving the instabilities associated with the electromagnetic modes. The numerical estimates of the growth rates of the instability are given for the specific case of the physical data in discharge tubes.

Relevance:

20.00%

Abstract:

The effect of vibration on heat transfer from a horizontal copper cylinder, 0.344 in. in diameter and 6 in. long, was investigated. The cylinder was placed normal to an air stream and was sinusoidally vibrated in a direction perpendicular to the direction of the air stream. The flow velocity varied from 19 ft/s to 92 ft/s, the double amplitude of vibration from 0.75 cm to 3.2 cm, and the frequency of vibration from 200 to 2800 cycles/min. A transient technique was used to determine the heat transfer coefficients. The experimental data in the absence of vibration are expressed by $N_{Nu} = 0.226\,N_{Re}^{0.6}$ in the range $2500 < N_{Re} < 15\,000$. With imposed vibrational velocities as high as 20 per cent of the flow velocity, no appreciable change in the heat transfer coefficient was observed. An analysis using the resultant of the vibration and the flow velocity explains the observed phenomenon.
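The no-vibration correlation quoted in this abstract can be evaluated directly. The sketch below assumes the extracted "NRe0.6" is an exponent, that is, Nusselt number = 0.226 × (Reynolds number)^0.6, and rejects Reynolds numbers outside the stated range. The function name `nusselt_number` is hypothetical.

```python
def nusselt_number(reynolds):
    """Evaluate Nu = 0.226 * Re**0.6 (assumed reading of the abstract's
    correlation), valid only for 2500 < Re < 15000."""
    if not 2500 < reynolds < 15000:
        raise ValueError("correlation is stated only for 2500 < Re < 15000")
    return 0.226 * reynolds ** 0.6


# Example: Re = 10000 gives Nu of roughly 56.8.
nu = nusselt_number(10_000)
```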

Relevance:

20.00%

Abstract:

Pt2+ ions dispersed in CeO2, Ce1-xTixO2-δ and TiO2 have been tested for preferential oxidation of carbon monoxide (PROX) in a hydrogen-rich stream. It is found that Pt2+-substituted CeO2 and Ce1-xTixO2-δ, in the form of the solid solutions Ce0.98Pt0.02O2-δ and Ce0.83Ti0.15Pt0.02O2-δ, are highly CO-selective low-temperature PROX catalysts in a hydrogen-rich stream. Just 15% Ti substitution in CeO2 improves the overall PROX activity.

Relevance:

20.00%

Abstract:

The flow and heat transfer over an upstream-moving non-isothermal wall with a parallel free stream have been considered. The magnetic field has been applied in the free stream parallel to the wall, and the effect of the induced magnetic field has been included in the analysis. The boundary layer equations governing the steady, incompressible, electrically conducting fluid flow have been solved numerically using a shooting method. This problem is interesting because a solution exists only when the ratio of the wall velocity to the free-stream velocity does not exceed a certain critical value, and this critical value depends on the magnetic field and the magnetic Prandtl number. Also, dual solutions exist for a certain range of wall velocity.