126 results for Emerging pattern mining


Relevance:

20.00% 20.00%

Publisher:

Abstract:

Frequent episode discovery is a popular framework for pattern discovery from sequential data. It has found many applications in domains such as alarm management in telecommunication networks, fault analysis in manufacturing plants, and prediction of user behavior in web click streams. In this paper, we address the discovery of serial episodes. In the episode context, there are multiple ways to quantify the frequency of an episode. Most of the current algorithms for episode discovery under various frequencies are apriori-based level-wise methods. These methods essentially perform a breadth-first search of the pattern space. However, there are currently no depth-first methods of pattern discovery in the frequent episode framework under many of the frequency definitions. In this paper, we try to bridge this gap. We provide new depth-first algorithms for serial episode discovery under non-overlapped and total frequencies. Under non-overlapped frequency, we present algorithms that can handle span and gap constraints on episode occurrences. Under total frequency, we present an algorithm that can handle a span constraint. We provide proofs of correctness for the proposed algorithms. We demonstrate the effectiveness of the proposed algorithms by extensive simulations. We also give detailed run-time comparisons with the existing apriori-based methods and illustrate scenarios under which the proposed pattern-growth algorithms perform better than their apriori counterparts. (C) 2013 Elsevier B.V. All rights reserved.
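
To make the non-overlapped frequency mentioned in this abstract concrete, here is a minimal sketch of how such a count might be obtained for a single serial episode with no span or gap constraints. It is not the depth-first discovery algorithm of the paper; the function name `count_non_overlapped` and the single-pass greedy strategy are illustrative assumptions.

```python
from typing import Hashable, Sequence


def count_non_overlapped(events: Sequence[Hashable],
                         episode: Sequence[Hashable]) -> int:
    """Count non-overlapped occurrences of a serial episode.

    Illustrative sketch only: a single left-to-right scan that waits for
    the next expected event type and restarts once the episode completes.
    No span or gap constraints are applied.
    """
    if not episode:
        return 0
    count = 0
    pos = 0  # index of the event type we are currently waiting for
    for ev in events:
        if ev == episode[pos]:
            pos += 1
            if pos == len(episode):
                # One full occurrence found; the next one must start
                # strictly after this one ends.
                count += 1
                pos = 0
    return count


stream = list("ABCABABC")
print(count_non_overlapped(stream, list("ABC")))
print(count_non_overlapped(stream, list("AB")))
```

Because each occurrence is closed as early as possible before the next one is allowed to start, no two counted occurrences share or interleave events.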

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this paper, we consider the setting of the pattern maximum likelihood (PML) problem studied by Orlitsky et al. We present a well-motivated heuristic algorithm for deciding the question of when the PML distribution of a given pattern is uniform. The algorithm is based on the concept of a "uniform threshold". This is a threshold at which the uniform distribution exhibits an interesting phase transition in the PML problem, going from being a local maximum to being a local minimum.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Electrophilic halogen-induced reactions of unactivated olefins are an important class of transformations, whose catalytic enantioselective variants have surfaced during the past few years as effective means of olefin heterodifunctionalization. This article covers important developments in the area of enantioselective halocyclizations, specifically in the context of the synthesis of nitrogenous heterocycles.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Genetic mutations in microcephalin 1 (MCPH1) cause primary autosomal recessive microcephaly, which is characterized by a marked reduction in brain size. MCPH1 encodes a centrosomal protein with three BRCT (BRCA1 C-terminal) domains. It is also a key regulator of the DNA repair pathway and cell cycle checkpoints. Interestingly, in the past few years, many research studies have explored the role of MCPH1, a neurodevelopmental gene, in several cancers, and its tumor suppressor functions have been elucidated. Given these diverse emerging roles, it becomes critical to review and summarize the multiple roles of MCPH1, a summary that is currently lacking in the literature. In this review, after systematic analysis of the literature, we summarise the multiple functional roles of MCPH1 in centrosomal, DNA repair and apoptotic pathways. Additionally, we discuss the considerable efforts taken to understand the implications of MCPH1 in diseases such as primary microcephaly and its other emerging associations with cancer and otitis media. The promising view is that MCPH1 has distinct roles, and its clinical associations with various diseases make it an attractive therapeutic target. (C) 2014 Elsevier GmbH. All rights reserved.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The classification of time series data is an interesting problem in the field of data mining. Even though several algorithms have been proposed for time series classification, we have developed an innovative algorithm that is computationally fast and, in several cases, more accurate than the 1NN classifier. Our method calculates the fuzzy membership of each test pattern in each class. We have experimented with 6 benchmark datasets and compared our method with the 1NN classifier.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Group VB and VIB M-Si systems are considered to show an interesting pattern in the diffusion of components with the change in atomic number in a particular group (M = V, Nb, Ta or M = Mo, W, respectively). Mainly two phases, MSi2 and M5Si3, are considered for this discussion. Except for Ta-silicides, the activation energy for the integrated diffusion of MSi2 is always lower than that of M5Si3. In both phases, the relative mobilities, measured by the ratio of the tracer diffusion coefficients, decrease with increasing atomic number in the given group. If determined at the same homologous temperature, the interdiffusion coefficients increase with the atomic number of the refractory metal in the MSi2 phases and decrease in the M5Si3 ones. This behaviour reflects the basic changes in the defect concentrations on different sublattices with a change in the atomic number of the refractory components.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In today's API-rich world, programmer productivity depends heavily on the programmer's ability to discover the required APIs. In this paper, we present a technique and tool, called MATHFINDER, to discover APIs for mathematical computations by mining unit tests of API methods. Given a math expression, MATHFINDER synthesizes pseudo-code to compute the expression by mapping its subexpressions to API method calls. For each subexpression, MATHFINDER searches for a method such that there is a mapping between method inputs and variables of the subexpression. The subexpression, when evaluated on the test inputs of the method under this mapping, should produce results that match the method output on a large number of tests. We implemented MATHFINDER as an Eclipse plugin for discovery of third-party Java APIs and performed a user study to evaluate its effectiveness. In the study, the use of MATHFINDER resulted in a 2x improvement in programmer productivity. In 96% of the subexpressions queried for in the study, MATHFINDER retrieved the desired API methods as the top-most result. The top-most pseudo-code snippet to implement the entire expression was correct in 93% of the cases. Since the number of methods and unit tests to mine could be large in practice, we also implement MATHFINDER in a MapReduce framework and evaluate its scalability and response time.
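
The matching step described in this abstract (find a method and a mapping from subexpression variables to method inputs such that the subexpression reproduces the method's output on many unit tests) can be sketched roughly as follows. This is only an illustration under assumptions, not the MATHFINDER implementation: the function names `score_method` and `rank_methods`, the representation of mined tests as `(inputs, output)` pairs, and the method names in the usage example are all hypothetical. Each method is assumed to have at least one mined test.

```python
from itertools import permutations
from math import isclose


def score_method(expr, variables, tests):
    """Return (best_fraction, best_mapping) for one candidate method.

    expr      -- callable taking the subexpression variables as keywords
    variables -- list of variable names used by expr
    tests     -- non-empty list of (inputs_tuple, expected_output) pairs
    A mapping assigns each variable to one argument position.
    """
    best_fraction, best_mapping = 0.0, None
    arity = len(tests[0][0])
    for positions in permutations(range(arity), len(variables)):
        hits = 0
        for inputs, output in tests:
            env = {v: inputs[p] for v, p in zip(variables, positions)}
            try:
                if isclose(expr(**env), output, rel_tol=1e-9, abs_tol=1e-9):
                    hits += 1
            except (ArithmeticError, ValueError):
                pass  # expression undefined on this test input
        fraction = hits / len(tests)
        if fraction > best_fraction:
            best_fraction = fraction
            best_mapping = dict(zip(variables, positions))
    return best_fraction, best_mapping


def rank_methods(expr, variables, mined_tests):
    """Rank candidate methods by the fraction of tests the expression matches."""
    scored = [(name, *score_method(expr, variables, tests))
              for name, tests in mined_tests.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical mined unit tests: method name -> [(inputs, output), ...]
mined = {
    "hypot": [((3.0, 4.0), 5.0), ((5.0, 12.0), 13.0)],
    "pow": [((2.0, 3.0), 8.0), ((3.0, 2.0), 9.0)],
}
ranking = rank_methods(lambda x, y: x ** y, ["x", "y"], mined)
print(ranking[0])
```

Trying every assignment of variables to argument positions is exponential in the number of arguments, which is acceptable for a sketch over small method signatures.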

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Today's programming languages are supported by powerful third-party APIs. For a given application domain, it is common to have many competing APIs that provide similar functionality. Programmer productivity therefore depends heavily on the programmer's ability to discover suitable APIs both during an initial coding phase, as well as during software maintenance. The aim of this work is to support the discovery and migration of math APIs. Math APIs are at the heart of many application domains ranging from machine learning to scientific computations. Our approach, called MATHFINDER, combines executable specifications of mathematical computations with unit tests (operational specifications) of API methods. Given a math expression, MATHFINDER synthesizes pseudo-code comprised of API methods to compute the expression by mining unit tests of the API methods. We present a sequential version of our unit test mining algorithm and also design a more scalable data-parallel version. We perform extensive evaluation of MATHFINDER (1) for API discovery, where math algorithms are to be implemented from scratch and (2) for API migration, where client programs utilizing a math API are to be migrated to another API. We evaluated the precision and recall of MATHFINDER on a diverse collection of math expressions, culled from algorithms used in a wide range of application areas such as control systems and structural dynamics. In a user study to evaluate the productivity gains obtained by using MATHFINDER for API discovery, the programmers who used MATHFINDER finished their programming tasks twice as fast as their counterparts who used the usual techniques like web and code search, IDE code completion, and manual inspection of library documentation. For the problem of API migration, as a case study, we used MATHFINDER to migrate Weka, a popular machine learning library. 
Overall, our evaluation shows that MATHFINDER is easy to use, provides highly precise results across several math APIs and application domains even with a small number of unit tests per method, and scales to large collections of unit tests.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The performance of prediction models is often based on "abstract metrics" that estimate the model's ability to limit residual errors between the observed and predicted values. However, meaningful evaluation and selection of prediction models for end-user domains requires holistic and application-sensitive performance measures. Inspired by energy consumption prediction models used in the emerging "big data" domain of Smart Power Grids, we propose a suite of performance measures to rationally compare models along the dimensions of scale independence, reliability, volatility and cost. We include both application independent and dependent measures, the latter parameterized to allow customization by domain experts to fit their scenario. While our measures are generalizable to other domains, we offer an empirical analysis using real energy use data for three Smart Grid applications: planning, customer education and demand response, which are relevant for energy sustainability. Our results underscore the value of the proposed measures to offer a deeper insight into models' behavior and their impact on real applications, which benefit both data mining researchers and practitioners.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

During the early stages of operation, high-tech startups need to overcome the liability of newness and manage a high degree of uncertainty. Several high-tech startups fail due to an inability to deal with skeptical customers, underdeveloped markets and limited resources in selling an offering that has no precedent. This paper leverages the principles of effectuation (a logic of entrepreneurial decision making under uncertainty) to explain the journey from creation to survival of high-tech startups in an emerging economy. Based on the 99tests.com case study, this paper suggests that early stage high-tech startups in emerging economies can increase their probability of survival by adopting the principles of effectuation.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We demonstrate a new technique to generate multiple light-sheets for fluorescence microscopy. This is possible by illuminating the cylindrical lens using multiple copies of Gaussian beams. A diffraction grating placed just before the cylindrical lens splits the incident Gaussian beam into multiple beams traveling at different angles. Subsequently, this gives rise to diffraction-limited light-sheets after the Gaussian beams pass through the combined cylindrical lens-objective sub-system. Direct measurement of the field at and around the focus of the objective lens shows a multi-sheet pattern with an average thickness of 7.5 μm and an inter-sheet separation of 380 μm. Employing an independent orthogonal detection sub-system, we successfully imaged fluorescently-coated yeast cells (≈ 4 μm) encaged in an agarose gel-matrix. Such a diffraction-limited sheet-pattern equipped with a dedicated detection system may find immediate applications in the field of optical microscopy and fluorescence imaging. (C) 2015 Optical Society of America

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The disclosure of information and its misuse in Privacy Preserving Data Mining (PPDM) systems is a concern to the parties involved. In PPDM systems, data is distributed among multiple parties that collaborate to achieve cumulative mining accuracy. The vertically partitioned data held by each party cannot provide mining results as accurate as the collaborative mining results. To overcome the privacy issue in data disclosure, this paper describes a Key Distribution-Less Privacy Preserving Data Mining (KDLPPDM) system in which the local association rules generated by the parties are published. The association rules are securely combined to form the combined rule set using the Commutative RSA algorithm. The combined rule sets are used to classify or mine the data. The results discussed in this paper compare the accuracy of the rules generated using the C4.5-based KDLPPDM system and the C5.0-based KDLPPDM system using receiver operating characteristic (ROC) curves.
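
The property that makes a commutative RSA-style scheme useful for combining rules from several parties is that encrypting under two keys gives the same result in either order. The sketch below illustrates only that property, assuming all parties share one modulus and each holds its own exponent pair. It is not the KDLPPDM protocol itself; the primes are toy values that offer no security, and the names `make_key`, `encrypt` and `decrypt` are illustrative.

```python
from math import gcd

# Toy shared modulus (assumption for illustration; far too small to be secure).
P, Q = 1009, 1013
N = P * Q
PHI = (P - 1) * (Q - 1)


def make_key(e: int) -> tuple[int, int]:
    """Return (e, d) with e * d = 1 (mod PHI); e must be coprime to PHI."""
    if gcd(e, PHI) != 1:
        raise ValueError("exponent must be coprime to phi(N)")
    return e, pow(e, -1, PHI)  # modular inverse, Python 3.8+


def encrypt(x: int, e: int) -> int:
    return pow(x, e, N)


def decrypt(c: int, d: int) -> int:
    return pow(c, d, N)


# Two parties with their own exponents over the shared modulus.
e_a, d_a = make_key(5)
e_b, d_b = make_key(17)

rule_id = 4242  # a rule encoded as an integer smaller than N (assumption)
ab = encrypt(encrypt(rule_id, e_a), e_b)
ba = encrypt(encrypt(rule_id, e_b), e_a)
print(ab == ba)                                   # order does not matter
print(decrypt(decrypt(ab, d_a), d_b) == rule_id)  # layers come off in any order
```

Since identical rules encrypt to identical values once every party has applied its key, duplicates across parties can be detected without any party seeing another's plaintext rules, which is one common way such schemes are used for secure set union.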

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The remarkable capability of nature to design and create excellent self-assembled nano-structures, especially in the biological world, has motivated chemists to mimic such systems with synthetic molecular and supramolecular systems. The hierarchically organized self-assembly of low molecular weight gelators (LMWGs) based on non-covalent interactions has been proven to be a useful tool in the development of well-defined nanostructures. Among these, the self-assembly of sugar-derived LMWGs has received immense attention because of their propensity to furnish biocompatible, hierarchical, supramolecular architectures that are macroscopically expressed in gel formation. This review sheds light on various aspects of sugar-derived LMWGs, uncovering their mechanisms of gelation, structural analysis, and tailorable properties, and their diverse applications such as stimuli-responsiveness, sensing, self-healing, environmental problems, and nano and biomaterials synthesis.