984 results for Mining law


Relevance: 20.00%

Abstract:

We address the problem of mining targeted association rules over multidimensional market-basket data. Here, each transaction has, in addition to the set of purchased items, ancillary dimension attributes associated with it. Based on these dimensions, transactions can be visualized as distributed over cells of an n-dimensional cube. In this framework, a targeted association rule is of the form {X -> Y} R, where R is a convex region in the cube and X -> Y is a traditional association rule within region R. We first describe the TOARM algorithm, based on classical techniques, for identifying targeted association rules. Then, we discuss the concepts of bottom-up aggregation and cubing, leading to the CellUnion technique. This approach is further extended, using notions of cube-count interleaving and credit-based pruning, to derive the IceCube algorithm. Our experiments demonstrate that IceCube consistently provides the best execution time performance, especially for large and complex data cubes.
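To make the notion of a targeted rule concrete, the following is a minimal Python sketch that evaluates one rule X -> Y restricted to a region R. It is not TOARM, CellUnion, or IceCube; it assumes (hypothetically) that each transaction is stored as a pair of cell coordinates and an item set, and that R is an axis-aligned box, which is one simple kind of convex region. The names `in_region` and `rule_stats` are illustrative only.

```python
from typing import FrozenSet, List, Tuple

# Assumed layout: (cell coordinates in the cube, set of purchased items).
Transaction = Tuple[Tuple[int, ...], FrozenSet[str]]
# Assumed region shape: one inclusive (low, high) range per dimension.
Region = Tuple[Tuple[int, int], ...]


def in_region(cell: Tuple[int, ...], region: Region) -> bool:
    """True when every coordinate of the cell lies inside the box."""
    return all(lo <= c <= hi for c, (lo, hi) in zip(cell, region))


def rule_stats(transactions: List[Transaction],
               x: FrozenSet[str],
               y: FrozenSet[str],
               region: Region) -> Tuple[float, float]:
    """Return (support, confidence) of X -> Y among transactions in the region."""
    local = [items for cell, items in transactions if in_region(cell, region)]
    if not local:
        return 0.0, 0.0
    n_x = sum(1 for items in local if x <= items)          # contain X
    n_xy = sum(1 for items in local if (x | y) <= items)   # contain X and Y
    support = n_xy / len(local)
    confidence = n_xy / n_x if n_x else 0.0
    return support, confidence


transactions = [
    ((0, 0), frozenset({"bread", "butter"})),
    ((0, 1), frozenset({"bread"})),
    ((1, 0), frozenset({"bread", "butter"})),
    ((3, 3), frozenset({"bread", "butter"})),  # outside the region below
]
region = ((0, 1), (0, 1))
support, confidence = rule_stats(
    transactions, frozenset({"bread"}), frozenset({"butter"}), region
)
```

Support and confidence are computed only over the transactions falling inside the region, so the same rule can pass a threshold in one region and fail it in another.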

Relevance: 20.00%

Abstract:

In this paper, we discuss the issues related to word recognition in born-digital word images. We introduce a novel method of power-law transformation on the word image for binarization. We show the improvement in image binarization and the consequent increase in the recognition performance of an OCR engine on the word image. The optimal value of gamma for a word image is automatically chosen by our algorithm with a fixed stroke width threshold. We have experimented exhaustively with our algorithm by varying the gamma and stroke width threshold values. By varying the gamma value, we found that our algorithm performed better than the results reported in the literature. On the ICDAR Robust Reading Systems Challenge-1: Word Recognition Task on the born-digital dataset, as compared to the recognition rate of 61.5% achieved by TH-OCR after suitable pre-processing by Yang et al. and 63.4% by ABBYY Fine Reader (used as baseline by the competition organizers without any preprocessing), we achieved 82.9% using Omnipage OCR applied on the images after being processed by our algorithm.
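A power-law (gamma) transformation is commonly written as s = r^gamma on intensities normalized to [0, 1]. The sketch below, in Python, shows that transformation followed by a fixed global threshold. It assumes 8-bit grayscale pixels held in nested lists, and the threshold of 128 is an arbitrary illustrative choice; the automatic gamma selection and the stroke-width criterion mentioned in the abstract are not specified there and are not reproduced here.

```python
from typing import List

Image = List[List[float]]


def power_law_transform(gray: Image, gamma: float) -> Image:
    """Map each 0-255 intensity r to 255 * (r / 255) ** gamma."""
    return [[255.0 * (p / 255.0) ** gamma for p in row] for row in gray]


def binarize(gray: Image, gamma: float, threshold: float = 128.0) -> List[List[int]]:
    """Apply the gamma transform, then mark pixels at or above the threshold as 1."""
    transformed = power_law_transform(gray, gamma)
    return [[1 if p >= threshold else 0 for p in row] for row in transformed]


row = [[0, 64, 128, 255]]
brightened = binarize(row, gamma=0.5)  # gamma < 1 lifts mid-tones
darkened = binarize(row, gamma=2.0)    # gamma > 1 suppresses mid-tones
```

Varying gamma moves mid-range pixels across the fixed threshold, which is why the choice of gamma changes the binarized result even when the threshold itself does not change.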

Relevance: 20.00%

Abstract:

The rapid growth in the field of data mining has led to the development of various methods for outlier detection. Though detection of outliers has been well explored in the context of numerical data, dealing with categorical data is still evolving. In this paper, we propose a two-phase algorithm for detecting outliers in categorical data based on a novel definition of outliers. In the first phase, this algorithm explores a clustering of the given data, followed by the ranking phase for determining the set of most likely outliers. The proposed algorithm is expected to perform better as it can identify different types of outliers, employing two independent ranking schemes based on the attribute value frequencies and the inherent clustering structure in the given data. Unlike some existing methods, the computational complexity of this algorithm is not affected by the number of outliers to be detected. The efficacy of this algorithm is demonstrated through experiments on various public domain categorical data sets.
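As an illustration of ranking by attribute value frequencies only, here is a minimal Python sketch. It is not the two-phase algorithm of the abstract (the clustering phase and the second ranking scheme are omitted); it simply scores each record by the average frequency of its attribute values and treats the lowest-scoring records as the most likely outliers. The function names are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def frequency_scores(records: Sequence[Sequence[str]]) -> List[float]:
    """Average, over attributes, of how often each record's value occurs."""
    n_attrs = len(records[0])
    counts = [Counter(r[j] for r in records) for j in range(n_attrs)]
    return [
        sum(counts[j][r[j]] for j in range(n_attrs)) / n_attrs
        for r in records
    ]


def rank_outliers(records: Sequence[Sequence[str]], top: int) -> List[int]:
    """Indices of the `top` records with the rarest attribute values."""
    scores = frequency_scores(records)
    order = sorted(range(len(records)), key=lambda i: scores[i])
    return order[:top]


records = [
    ("red", "small"),
    ("red", "small"),
    ("red", "large"),
    ("blue", "huge"),
]
scores = frequency_scores(records)
likely_outliers = rank_outliers(records, top=2)
```

Because every record is scored once, the cost of this sketch does not grow with the number of outliers requested, apart from slicing the sorted list.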

Relevance: 20.00%

Abstract:

The problem of designing good space-time block codes (STBCs) with low maximum-likelihood (ML) decoding complexity has gathered much attention in the literature. All the known low ML decoding complexity techniques utilize the same approach of exploiting either the multigroup decodable or the fast-decodable (conditionally multigroup decodable) structure of a code. We refer to this well-known technique of decoding STBCs as conditional ML (CML) decoding. In this paper, we introduce a new framework to construct ML decoders for STBCs based on the generalized distributive law (GDL) and the factor-graph-based sum-product algorithm. We say that an STBC is fast GDL decodable if the order of GDL decoding complexity of the code, with respect to the constellation size M, is strictly less than M^lambda, where lambda is the number of independent symbols in the STBC. We give sufficient conditions for an STBC to admit fast GDL decoding, and show that both multigroup and conditionally multigroup decodable codes are fast GDL decodable. For any STBC, whether fast GDL decodable or not, we show that the GDL decoding complexity is strictly less than the CML decoding complexity. For instance, for any STBC obtained from cyclic division algebras which is not multigroup or conditionally multigroup decodable, the GDL decoder provides about 12 times reduction in complexity compared to the CML decoder. Similarly, for the Golden code, which is conditionally multigroup decodable, the GDL decoder is only half as complex as the CML decoder.

Relevance: 20.00%

Abstract:

This paper primarily intends to develop a GIS (geographical information system)-based data mining approach for optimally selecting the locations and determining installed capacities for setting up distributed biomass power generation systems in the context of decentralized energy planning for rural regions. The optimal locations within a cluster of villages are obtained by matching the installed capacity needed with the demand for power, minimizing the cost of transportation of biomass from dispersed sources to the power generation system, and minimizing the cost of distribution of electricity from the power generation system to demand centers or villages. The methodology was validated by using it to develop an optimal plan for implementing distributed biomass-based power systems for meeting the rural electricity needs of Tumkur district in India, consisting of 2700 villages. The approach uses a k-medoid clustering algorithm to divide the total region into clusters of villages and locate biomass power generation systems at the medoids. The optimal value of k is determined iteratively by running the algorithm for the entire search space for different values of k along with demand-supply matching constraints. The optimal value of k is chosen such that it minimizes the total cost of system installation, costs of transportation of biomass, and transmission and distribution. A smaller region, consisting of 293 villages, was selected to study the sensitivity of the results to varying demand and supply parameters. The results of clustering are represented on a GIS map for the region.
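The combination of k-medoid clustering with a search over k can be sketched in a few lines of Python. This is a simplified illustration, not the paper's method: villages are assumed to be 2-D points with Euclidean distance, the initial medoids are simply the first k points, demand-supply matching constraints are left out, and the cost model (a fixed `plant_cost` per medoid plus `transport_cost_per_unit` times total distance) is a hypothetical stand-in for the installation, transportation, and distribution costs named in the abstract.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def dist(a: Point, b: Point) -> float:
    return math.hypot(a[0] - b[0], a[1] - b[1])


def k_medoids(points: Sequence[Point], k: int, max_iter: int = 100) -> Tuple[List[int], float]:
    """Alternate assignment and medoid update; return (medoid indices, total distance)."""
    medoids = list(range(k))  # assumption: deterministic start from the first k points
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(points):
            nearest = min(medoids, key=lambda m: dist(p, points[m]))
            clusters[nearest].append(i)
        # Update step: within each cluster, pick the member closest to all others.
        new_medoids = []
        for m, members in clusters.items():
            if not members:  # can only happen with duplicate points
                new_medoids.append(m)
                continue
            best = min(
                members,
                key=lambda c: sum(dist(points[c], points[j]) for j in members),
            )
            new_medoids.append(best)
        if sorted(new_medoids) == sorted(medoids):
            break
        medoids = new_medoids
    total = sum(min(dist(p, points[m]) for m in medoids) for p in points)
    return sorted(medoids), total


def choose_k(points: Sequence[Point], k_values: Sequence[int],
             plant_cost: float, transport_cost_per_unit: float) -> Tuple[float, int, List[int]]:
    """Return (cost, k, medoids) for the k with the lowest hypothetical total cost."""
    best = None
    for k in k_values:
        medoids, total_distance = k_medoids(points, k)
        cost = plant_cost * k + transport_cost_per_unit * total_distance
        if best is None or cost < best[0]:
            best = (cost, k, medoids)
    return best


villages = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
medoids, total_distance = k_medoids(villages, 2)
best_cost, best_k, best_medoids = choose_k(villages, [1, 2, 3], 5.0, 1.0)
```

Adding more medoids always shortens transport distances but raises installation cost, so the total cost has a minimum at some intermediate k; that trade-off is what the search over k exploits.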

Relevance: 20.00%

Abstract:

Mycobacterium tuberculosis owes its high pathogenic potential to its ability to evade host immune responses and thrive inside the macrophage. The outcome of infection is largely determined by the cellular response comprising a multitude of molecular events. The complexity and inter-relatedness of the processes make it essential to adopt systems approaches to study them. In this work, we construct a comprehensive network of infection-related processes in a human macrophage comprising 1888 proteins and 14,016 interactions. We then compute response networks based on available gene expression profiles corresponding to states of health, disease and drug treatment. We use a novel formulation for mining response networks that has led to identifying the highest activities in the cell. Highest activity paths provide mechanistic insights into pathogenesis and response to treatment. The approach used here serves as a generic framework for mining dynamic changes in genome-scale protein interaction networks.

Relevance: 20.00%

Abstract:

This paper aims at extending the universal erosive burning law developed by two of the present authors from axi-symmetric internally burning grains to partly symmetric burning grains. This extension revolves around three-dimensional flow calculations inside highly loaded grain geometry and benefits from an observation that the flow gradients normal to the surface in such geometries have a smooth behavior along the perimeter of the grain. These are used to help identify the diameter that gives the same perimeter as the characteristic dimension, rather than the mean hydraulic diameter chosen earlier. The predictions for highly loaded grains from the newly chosen dimension in the erosive burning law compare better with measured pressure-time curves, while those with the mean hydraulic diameter definitely over-predict the pressures. (c) 2013 IAA. Published by Elsevier Ltd. All rights reserved.

Relevance: 20.00%

Abstract:

The problem of designing good Space-Time Block Codes (STBCs) with low maximum-likelihood (ML) decoding complexity has gathered much attention in the literature. All the known low ML decoding complexity techniques utilize the same approach of exploiting either the multigroup decodable or the fast-decodable (conditionally multigroup decodable) structure of a code. We refer to this well-known technique of decoding STBCs as Conditional ML (CML) decoding. In [1], we introduced a framework to construct ML decoders for STBCs based on the Generalized Distributive Law (GDL) and the Factor-graph based Sum-Product Algorithm, and showed that for two specific families of STBCs, the Toeplitz codes and the Overlapped Alamouti Codes (OACs), the GDL based ML decoders have strictly less complexity than the CML decoders. In this paper, we introduce a 'traceback' step to the GDL decoding algorithm of STBCs, which enables roughly 4 times reduction in the complexity of the GDL decoders proposed in [1]. Utilizing this complexity reduction from 'traceback', we then show that for any STBC (not just the Toeplitz and Overlapped Alamouti Codes), the GDL decoding complexity is strictly less than the CML decoding complexity. For instance, for any STBC obtained from Cyclic Division Algebras that is not multigroup or conditionally multigroup decodable, the GDL decoder provides approximately 12 times reduction in complexity compared to the CML decoder. Similarly, for the Golden code, which is conditionally multigroup decodable, the GDL decoder is only about half as complex as the CML decoder.

Relevance: 20.00%

Abstract:

In this paper, sliding mode control based guidance laws to intercept stationary targets at a desired impact time are proposed. They are then extended to constant velocity targets using the notion of predicted interception. The desired impact time is achieved by selecting the interceptor's lateral acceleration to enforce a sliding mode on a switching surface designed using non-linear engagement dynamics. Numerical simulation results are presented to validate the proposed guidance law for different initial engagement geometries, impact times and salvo attack scenarios.

Relevance: 20.00%

Abstract:

In this paper, the cubic spline guidance law is presented for intercepting a stationary target at a desired impact angle. The guidance law is obtained from a cubic spline curve based trajectory using an inverse method. The cubic spline trajectory curve expresses the altitude as a cubic polynomial of the downrange. The guidance law is modified to achieve interception in the cases where the impact angle is greater than or equal to 90°. The guidance law is implemented in a feedback mode to maintain the desired impact angle and to reduce miss distance in the presence of lateral acceleration saturation and atmospheric disturbances. The simulation results show that the guidance law fulfills all the requirements.
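The statement that altitude is a cubic polynomial of downrange can be illustrated with a short Python sketch. The boundary conditions used here are an assumption, not taken from the abstract: the curve h(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 is pinned by the altitude and flight-path slope at launch (x = 0) and at the target (x = x_f). The guidance law itself, obtained in the paper by an inverse method, is not reproduced, and the sketch only applies to angles strictly between -90 and 90 degrees because it uses the tangent of the angle as the slope.

```python
import math
from typing import Tuple

Coeffs = Tuple[float, float, float, float]


def cubic_trajectory(x_f: float, h_0: float, h_f: float,
                     launch_angle_deg: float, impact_angle_deg: float) -> Coeffs:
    """Coefficients (a_0, a_1, a_2, a_3) of h(x) matching end altitudes and slopes."""
    s_0 = math.tan(math.radians(launch_angle_deg))  # slope dh/dx at x = 0
    s_f = math.tan(math.radians(impact_angle_deg))  # slope dh/dx at x = x_f
    dh = h_f - h_0
    a_0 = h_0
    a_1 = s_0
    a_2 = (3.0 * dh - (2.0 * s_0 + s_f) * x_f) / x_f ** 2
    a_3 = (-2.0 * dh + (s_0 + s_f) * x_f) / x_f ** 3
    return a_0, a_1, a_2, a_3


def altitude(c: Coeffs, x: float) -> float:
    return c[0] + c[1] * x + c[2] * x ** 2 + c[3] * x ** 3


def slope(c: Coeffs, x: float) -> float:
    return c[1] + 2.0 * c[2] * x + 3.0 * c[3] * x ** 2


# Hypothetical engagement: launch at 30 degrees, hit a ground target
# 5000 m downrange with the trajectory descending at 60 degrees.
coeffs = cubic_trajectory(5000.0, 0.0, 0.0, 30.0, -60.0)
```

Four boundary conditions determine the four coefficients uniquely, which is why a single cubic suffices to impose both the interception point and the impact angle.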

Relevance: 20.00%

Abstract:

This paper presents a novel, soft computing based solution to a complex optimal control or dynamic optimization problem that requires the solution to be available in real time. The complexities in this problem of optimal guidance of interceptors launched with high initial heading errors include the more involved physics of a three-dimensional missile-target engagement, and those posed by the assumption of a realistic dynamic model such as time-varying missile speed, thrust, drag and mass, besides gravity, and an upper bound on the lateral acceleration. The classic, pure proportional navigation law is augmented with a polynomial function of the heading error, and the values of the coefficients of the polynomial are determined using differential evolution (DE). The performance of the proposed DE-enhanced guidance law is compared against the existing conventional laws in the literature, on the criteria of time and energy optimality, peak lateral acceleration demanded, terminal speed and robustness to unanticipated target maneuvers, to illustrate the superiority of the proposed law. (C) 2013 Elsevier B.V. All rights reserved.

Relevance: 20.00%

Abstract:

Several time dependent fluorescence Stokes shift (TDFSS) experiments have reported a slow power law decay in the hydration dynamics of a DNA molecule. Such a power law has been observed neither in computer simulations nor in some other TDFSS experiments. Here we observe that a slow decay may originate from the collective ion contribution, because in experiments DNA is immersed in a buffer solution, and also from groove-bound water and, lastly, from DNA dynamics itself. In this work we first express the solvation time correlation function in terms of dynamic structure factors of the solution. We use mode coupling theory to calculate analytically the time dependence of the collective ionic contribution. A power law decay is seen to originate from an interplay between the long-range probe-ion direct correlation function and the ion-ion dynamic structure factor. Although the power law decay is reminiscent of the Debye-Falkenhagen effect, solvation dynamics is dominated by ion atmosphere relaxation times at longer length scales (small wave number) than in electrolyte friction. We further discuss why this power law may not originate from water motions, which have been computed by molecular dynamics simulations. Finally, we propose several experiments to check the prediction of the present theoretical work.

Relevance: 20.00%

Abstract:

We study models of interacting fermions in one dimension to investigate the crossover from integrability to nonintegrability, i.e., quantum chaos, as a function of system size. Using exact diagonalization of finite-sized systems, we study this crossover by obtaining the energy level statistics and Drude weight associated with transport. Our results reinforce the idea that for system size L -> infinity nonintegrability sets in for an arbitrarily small integrability-breaking perturbation. The crossover value of the perturbation scales as a power law ~ L^(-eta) when the integrable system is gapless. The exponent eta ≈ 3 appears to be robust to microscopic details and the precise form of the perturbation. We conjecture that the exponent in the power law is characteristic of the random matrix ensemble describing the nonintegrable system. For systems with a gap, the crossover scaling appears to be faster than a power law.

Relevance: 20.00%

Abstract:

In today's API-rich world, programmer productivity depends heavily on the programmer's ability to discover the required APIs. In this paper, we present a technique and tool, called MATHFINDER, to discover APIs for mathematical computations by mining unit tests of API methods. Given a math expression, MATHFINDER synthesizes pseudo-code to compute the expression by mapping its subexpressions to API method calls. For each subexpression, MATHFINDER searches for a method such that there is a mapping between method inputs and variables of the subexpression. The subexpression, when evaluated on the test inputs of the method under this mapping, should produce results that match the method output on a large number of tests. We implemented MATHFINDER as an Eclipse plugin for discovery of third-party Java APIs and performed a user study to evaluate its effectiveness. In the study, the use of MATHFINDER resulted in a 2x improvement in programmer productivity. In 96% of the subexpressions queried for in the study, MATHFINDER retrieved the desired API methods as the top-most result. The top-most pseudo-code snippet to implement the entire expression was correct in 93% of the cases. Since the number of methods and unit tests to mine could be large in practice, we also implement MATHFINDER in a MapReduce framework and evaluate its scalability and response time.
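The matching idea described in the abstract, evaluating a subexpression on a method's unit-test inputs under some mapping of inputs to variables, can be sketched in Python as follows. This is an illustration of the idea only, not MATHFINDER's implementation: the subexpression is assumed to be a plain Python callable, each unit test is assumed to be an (inputs, output) pair of numbers, and the mapping is found by brute force over permutations. The method `pow_swapped` in the example is hypothetical.

```python
import math
from itertools import permutations
from typing import Callable, Optional, Sequence, Tuple

UnitTest = Tuple[Tuple[float, ...], float]  # (method inputs, method output)


def match_score(expr: Callable[..., float], n_vars: int,
                tests: Sequence[UnitTest],
                tol: float = 1e-9) -> Tuple[float, Optional[Tuple[int, ...]]]:
    """Best fraction of tests on which expr reproduces the method output,
    together with the input-to-variable mapping that achieves it."""
    if not tests or len(tests[0][0]) != n_vars:
        return 0.0, None
    best_score, best_mapping = 0.0, None
    for mapping in permutations(range(n_vars)):
        hits = 0
        for inputs, output in tests:
            args = [inputs[i] for i in mapping]  # variable j takes input mapping[j]
            try:
                value = expr(*args)
            except (ArithmeticError, ValueError):
                continue  # expression undefined on this test input
            if math.isclose(value, output, rel_tol=tol, abs_tol=tol):
                hits += 1
        score = hits / len(tests)
        if score > best_score:
            best_score, best_mapping = score, mapping
    return best_score, best_mapping


# Unit tests of a hypothetical method pow_swapped(exponent, base) -> base ** exponent.
pow_swapped_tests = [((2, 3), 9), ((3, 2), 8), ((2, 2), 4)]
score, mapping = match_score(lambda x, y: x ** y, 2, pow_swapped_tests)
```

A candidate method would then be ranked by its score, so that methods agreeing with the subexpression on many tests appear first; the returned mapping tells the caller in which order to pass the variables.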