19 results for: data warehouse tuning aggregato business intelligence performance
Abstract:
Practical use of machine learning is gaining strategic importance in enterprises looking for business intelligence. However, most enterprise data is distributed across multiple relational databases with expert-designed schemas. Using traditional single-table machine learning techniques over such data not only incurs a computational penalty for converting it to a flat form (a mega-join); the human-specified semantic information present in the relations is also lost. In this paper, we present a practical, two-phase hierarchical meta-classification algorithm for relational databases with a semantic divide-and-conquer approach. We propose a recursive prediction-aggregation technique over heterogeneous classifiers applied to individual database tables. The proposed algorithm was evaluated on three diverse datasets, namely the TPCH, PKDD, and UCI benchmarks, and showed a considerable reduction in classification time without any loss of prediction accuracy. (C) 2012 Elsevier Ltd. All rights reserved.
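The per-table training and prediction-aggregation idea can be sketched briefly. The snippet below is a minimal illustration, assuming scikit-learn, with invented table names and a simple probability-averaging rule standing in for the paper's recursive aggregation.

```python
# Minimal sketch of two-phase meta-classification over relational tables.
# Phase 1: fit an independent base classifier on each table's features.
# Phase 2: aggregate per-table class probabilities instead of materializing
# a mega-join. Table layout and the averaging rule are illustrative only.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def fit_per_table(tables, y):
    """tables: dict of name -> (n_samples, n_features) array aligned on a key."""
    return {name: RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
            for name, X in tables.items()}

def aggregate_predictions(models, tables):
    # Heterogeneous classifiers could be mixed here; the recursive
    # aggregation collapses to a single averaging step in this flat example.
    probs = [models[name].predict_proba(X) for name, X in tables.items()]
    return np.mean(probs, axis=0).argmax(axis=1)

rng = np.random.default_rng(0)
y = rng.integers(0, 2, 200)
tables = {"customer": rng.normal(size=(200, 5)) + y[:, None],
          "orders": rng.normal(size=(200, 3)) + y[:, None]}
models = fit_per_table(tables, y)
print("training accuracy:", (aggregate_predictions(models, tables) == y).mean())
```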
Abstract:
The spatial error structure of daily precipitation derived from the latest version 7 (v7) Tropical Rainfall Measuring Mission (TRMM) level 2 data products is studied through comparison with the Asian Precipitation Highly Resolved Observational Data Integration Toward Evaluation of the Water Resources (APHRODITE) data over a subtropical region of the Indian subcontinent, for the seasonal rainfall over six years from June 2002 to September 2007. The data products examined include v7 data from the TRMM Microwave Imager (TMI) radiometer and the precipitation radar (PR), namely 2A12, 2A25, and 2B31 (combined data from PR and TMI). The spatial distribution of uncertainty in these data products was quantified based on performance metrics derived from the contingency table. For seasonal daily precipitation over a subtropical basin in India, the 2A12 data product showed greater skill in detecting and quantifying the volume of rainfall than the 2A25 and 2B31 data products. Error characterization using various error models revealed that random errors from multiplicative error models were homoscedastic and that they better represented rainfall estimates from the 2A12 algorithm. Error decomposition performed to disentangle systematic and random errors verified that the multiplicative error model representing rainfall from the 2A12 algorithm successfully estimated a greater percentage of the systematic error than for the 2A25 or 2B31 algorithms. Results verify that, although the radiometer-derived 2A12 rainfall data are known to suffer from many sources of uncertainty, spatial analysis over the case-study region of India shows that the 2A12 rainfall estimates are in very good agreement with the reference estimates for the data period considered.
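As an illustration of the contingency-table metrics such a comparison rests on, the sketch below computes probability of detection, false alarm ratio, and frequency bias for a satellite estimate against a gridded reference. The 1 mm/day detection threshold and the synthetic multiplicative-error data are assumptions made for the example only.

```python
# Sketch of contingency-table skill scores for daily rainfall detection.
import numpy as np

def contingency_scores(sat, ref, threshold=1.0):
    hit   = np.sum((sat >= threshold) & (ref >= threshold))
    miss  = np.sum((sat <  threshold) & (ref >= threshold))
    false = np.sum((sat >= threshold) & (ref <  threshold))
    pod  = hit / (hit + miss)            # probability of detection
    far  = false / (hit + false)         # false alarm ratio
    bias = (hit + false) / (hit + miss)  # frequency bias
    return pod, far, bias

rng = np.random.default_rng(1)
ref = rng.gamma(0.5, 4.0, size=10_000)          # synthetic "truth" (mm/day)
sat = ref * rng.lognormal(0.0, 0.3, ref.shape)  # multiplicative random error
print(contingency_scores(sat, ref))
```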
Abstract:
Gaussian processes (GPs) are promising Bayesian methods for classification and regression problems. Designing a GP classifier and making predictions with it are, however, computationally demanding, especially when the training set is large. Sparse GP classifiers are known to overcome this limitation. In this letter, we propose and study a validation-based method for sparse GP classifier design. The proposed method uses a negative log predictive (NLP) loss measure, which is easy to compute for GP models. We use this measure for both basis vector selection and hyperparameter adaptation. Experimental results on several real-world benchmark data sets show better or comparable generalization performance relative to existing methods.
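The NLP loss itself is simple to state: the average negative log of the predictive probability assigned to each held-out label. A minimal sketch, using scikit-learn's full GP classifier purely as a stand-in for the letter's sparse design:

```python
# Negative log predictive (NLP) loss on a validation split.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.model_selection import train_test_split

def nlp_loss(model, X_val, y_val, eps=1e-12):
    # predictive probability of the true class for each validation point
    p = model.predict_proba(X_val)[np.arange(len(y_val)), y_val]
    return -np.mean(np.log(p + eps))

X, y = make_classification(n_samples=300, n_features=4, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)
gpc = GaussianProcessClassifier(random_state=0).fit(X_tr, y_tr)
print("validation NLP:", nlp_loss(gpc, X_val, y_val))
```

In the validation-based scheme the abstract describes, a quantity like this would be recomputed for candidate basis vectors and hyperparameter settings, keeping whichever choice lowers it.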
Abstract:
A completely automated temperature-programmed reaction (TPR) system for carrying out gas-solid catalytic reactions under atmospheric flow conditions has been fabricated to study CO and hydrocarbon oxidation, and NO reduction. The system consists of an all-stainless-steel UHV system, a quadrupole mass spectrometer SX200 (VG Scientific), a tubular furnace and micro-reactor, a temperature controller, a versatile gas handling system, and a data acquisition and analysis system. The performance of the system has been tested under standard experimental conditions for CO oxidation over well-characterized Ce1-x-y(La/Y)yO2-delta catalysts. Three-way catalysis, converting CO, NO, and C2H2 to CO2, N2, and H2O, was tested with this catalyst, which shows complete removal of pollutants below 325 degrees C. Fixed oxide-ion defects in Pt-substituted Ce1-y(La/Y)yO2-y/2 show higher catalytic activity than Pt-ion-substituted CeO2.
Abstract:
Data flow computers are high-speed machines in which an instruction is executed as soon as all its operands are available. This paper describes the EXtended MANchester (EXMAN) data flow computer, which incorporates three major extensions to the basic Manchester machine: a multiple matching units scheme, an efficient implementation of the array data structure, and a facility to concurrently execute reentrant routines. A simulator for the EXMAN computer has been coded in the discrete event simulation language SIMULA 67 on the DEC 1090 system. Performance analysis studies have been conducted on the simulated EXMAN computer to evaluate the effectiveness of the proposed extensions. The performance experiments were carried out using three sample problems: matrix multiplication, Bresenham's line-drawing algorithm, and the polygon scan-conversion algorithm.
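The firing rule that drives such machines, an instruction executing once all its operand tokens have arrived, is easy to model in software. The toy scheduler below illustrates only that rule; the EXMAN extensions (multiple matching units, array structures, reentrant routines) are not modeled.

```python
# Toy illustration of the dataflow firing rule: a node fires as soon as
# all of its operand tokens are present, and its result becomes a token
# for downstream nodes.
from collections import defaultdict

class DataflowGraph:
    def __init__(self):
        self.need = {}                  # node -> number of operands required
        self.have = defaultdict(dict)   # node -> collected operand tokens
        self.op = {}                    # node -> function to apply
        self.out = defaultdict(list)    # node -> [(consumer, port), ...]

    def node(self, name, arity, fn):
        self.need[name], self.op[name] = arity, fn

    def wire(self, src, dst, port):
        self.out[src].append((dst, port))

    def send(self, node, port, value, ready):
        self.have[node][port] = value
        if len(self.have[node]) == self.need[node]:
            ready.append(node)          # firing rule: all operands present

    def run(self, inputs):
        ready = []
        for node, port, value in inputs:
            self.send(node, port, value, ready)
        while ready:
            n = ready.pop()
            result = self.op[n](*(self.have[n][p] for p in sorted(self.have[n])))
            for dst, port in self.out[n]:
                self.send(dst, port, result, ready)
            if not self.out[n]:
                return result

g = DataflowGraph()
g.node("add", 2, lambda a, b: a + b)
g.node("mul", 2, lambda a, b: a * b)
g.wire("add", "mul", 0)
print(g.run([("add", 0, 2), ("add", 1, 3), ("mul", 1, 10)]))  # (2+3)*10 = 50
```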
Abstract:
The swelling pressure of soil depends upon various soil parameters such as mineralogy, clay content, Atterberg limits, dry density, moisture content, and initial degree of saturation, along with structural and environmental factors. It is very difficult to model and analyze swelling pressure effectively taking all of these aspects into consideration. Various statistical/empirical methods have been attempted to predict swelling pressure based on index properties of soil. In this paper, the computational intelligence techniques of artificial neural networks and support vector machines have been used to develop models, based on the set of available experimental results, that predict swelling pressure from the inputs: natural moisture content, dry density, liquid limit, plasticity index, and clay fraction. The generalization of the models to data other than the training set, which is required for successful application of a model, is discussed. A detailed study of the relative performance of the computational intelligence techniques has been carried out based on different statistical performance criteria.
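The modeling setup translates directly into a few lines with scikit-learn. In this sketch the five inputs follow the abstract, but the data are synthetic stand-ins and the target relation is invented purely for illustration.

```python
# ANN-vs-SVM comparison on the five index-property inputs (synthetic data).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
n = 200
# columns: moisture %, dry density, liquid limit, plasticity index, clay %
X = rng.uniform([5, 1.2, 25, 5, 10], [40, 2.0, 90, 50, 70], size=(n, 5))
y = 0.05 * X[:, 3] * X[:, 4] / X[:, 0] + rng.normal(0, 0.5, n)  # invented target

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
for name, model in [("ANN", MLPRegressor(hidden_layer_sizes=(10,), max_iter=5000, random_state=0)),
                    ("SVM", SVR(C=10.0))]:
    pipe = make_pipeline(StandardScaler(), model).fit(X_tr, y_tr)
    print(name, "R^2 on held-out data:", round(pipe.score(X_te, y_te), 3))
```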
Abstract:
In the direction of arrival (DOA) estimation problem, we encounter both finite data and insufficient knowledge of array characterization. It is therefore important to study how subspace-based methods perform under such conditions. We analyze the finite-data performance of the multiple signal classification (MUSIC) and minimum-norm (min-norm) methods in the presence of sensor gain and phase errors, and derive expressions for the mean square error (MSE) in the DOA estimates. These expressions are first derived assuming an arbitrary array and then simplified for the special case of a uniform linear array with isotropic sensors. When they are further simplified for the cases of finite data only and sensor errors only, they reduce to the recent results given in [9-12]. Computer simulations are used to verify the closeness between the predicted and simulated values of the MSE.
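For readers who want to reproduce such simulations empirically, the sketch below runs MUSIC on a half-wavelength uniform linear array with small random gain and phase perturbations; the geometry, error sizes, and SNR are illustrative assumptions, not the paper's settings.

```python
# MUSIC on a uniform linear array with sensor gain/phase errors.
import numpy as np
from scipy.signal import find_peaks

def music_spectrum(X, n_src, scan_deg):
    R = X @ X.conj().T / X.shape[1]              # sample covariance
    _, vecs = np.linalg.eigh(R)                  # eigenvalues ascending
    En = vecs[:, :-n_src]                        # noise subspace
    m = np.arange(X.shape[0])
    a = np.exp(1j * np.pi * np.outer(m, np.sin(np.deg2rad(scan_deg))))
    return 1.0 / np.linalg.norm(En.conj().T @ a, axis=0) ** 2

rng = np.random.default_rng(0)
m, n_snap = 8, 200
doas = np.deg2rad([-10.0, 25.0])
A = np.exp(1j * np.pi * np.outer(np.arange(m), np.sin(doas)))
gain = 1 + 0.05 * rng.normal(size=m)             # sensor gain errors
phase = np.exp(1j * 0.05 * rng.normal(size=m))   # sensor phase errors
A = (gain * phase)[:, None] * A
S = (rng.normal(size=(2, n_snap)) + 1j * rng.normal(size=(2, n_snap))) / np.sqrt(2)
N = 0.1 * (rng.normal(size=(m, n_snap)) + 1j * rng.normal(size=(m, n_snap)))
X = A @ S + N

scan = np.linspace(-90, 90, 721)
spec = music_spectrum(X, 2, scan)
peaks, _ = find_peaks(spec)
top = peaks[np.argsort(spec[peaks])[-2:]]
print("estimated DOAs (deg):", np.sort(scan[top]))  # near -10 and 25
```

Averaging the squared estimation error over many such runs gives the simulated MSE that the derived expressions are checked against.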
Abstract:
The paper examines the suitability of the generalized delta rule for training artificial neural networks (ANNs) for damage identification in structures. Several multilayer perceptron architectures are investigated for a typical bridge truss structure with simulated damage states generated randomly. The training samples have been generated in terms of measurable structural parameters (displacements and strains) at suitably selected locations in the structure. Issues related to the performance of the network with respect to hidden layers and hidden neurons are examined. Some heuristics are proposed for the design of neural networks for damage identification in structures. These are further supported by an investigation conducted on five other bridge truss configurations.
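The hidden-layer study can be mimicked in a few lines: train perceptrons of increasing width on (measured response, damage state) pairs and compare cross-validated accuracy. The random linear "structure" below is a stand-in for illustration, not a truss analysis.

```python
# Compare hidden-layer widths for a damage-state classifier (synthetic data).
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
damage = rng.integers(0, 5, 400)                  # 5 simulated damage states
W = rng.normal(size=(5, 12))                      # state -> response mapping
X = W[damage] + 0.1 * rng.normal(size=(400, 12))  # 12 noisy "measurements"

for width in (4, 8, 16, 32):
    net = MLPClassifier(hidden_layer_sizes=(width,), max_iter=2000, random_state=0)
    print(width, "hidden neurons:", cross_val_score(net, X, damage, cv=3).mean())
```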
Abstract:
In this paper we propose a new method of data handling for web servers, which we call Network Aware Buffering and Caching (NABC for short). NABC reduces data copies in a web server's data-sending path by doing three things: (1) laying out the data in main memory so that protocol processing can be done without data copies, (2) keeping a unified cache of data in the kernel and ensuring safe access to it by various processes and the kernel, and (3) passing only the necessary metadata between processes so that the bulk data handling time spent during IPC is reduced. We realize NABC by implementing a set of system calls and a user library, yielding a set of APIs specifically designed for use by web servers. We port an in-house web server called SWEET to the NABC APIs and evaluate performance using a range of workloads, both simulated and real. The results show an impressive gain of 12% to 21% in throughput for static file serving, and a 1.6 to 4 times gain in throughput for lightweight dynamic content serving, for a server using the NABC APIs over one using the UNIX APIs.
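The copy-avoidance idea NABC targets can be tasted with the kernel's sendfile path, which moves file bytes from the page cache to a socket without transiting user space. The sketch below is only an analogy to point (1), not NABC's API, and it ignores request parsing and error handling.

```python
# Minimal static-file server using the zero-copy sendfile path.
import os
import socket

def serve_static(path, port=8080):
    srv = socket.socket()
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("", port))
    srv.listen(1)
    while True:
        conn, _ = srv.accept()
        conn.recv(4096)  # discard the request, for brevity
        size = os.path.getsize(path)
        conn.sendall(b"HTTP/1.0 200 OK\r\nContent-Length: %d\r\n\r\n" % size)
        with open(path, "rb") as f:
            # file bytes go kernel -> socket with no user-space copy
            os.sendfile(conn.fileno(), f.fileno(), 0, size)
        conn.close()
```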
Abstract:
This paper presents a novel approach for designing a fixed-gain robust power system stabilizer (PSS), with particular emphasis on achieving a minimum closed-loop performance over a wide range of operating and system conditions. The minimum performance requirements of the controller have been decided a priori and obtained by using a genetic algorithm (GA) based power system stabilizer. The proposed PSS is robust to changes in the plant parameters brought about by changes in system and operating conditions, guaranteeing a minimum performance. The efficacy of the proposed method has been tested on a multimachine system. The proposed method of tuning the PSS is an attractive alternative to conventional fixed-gain stabilizer design, as it retains the simplicity of the conventional PSS while still guaranteeing robust, acceptable performance over a wider range of operating and system conditions.
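A toy GA of the kind used for such tuning is sketched below: each individual is a gain vector scored across several operating conditions, and the worst-case score is what gets maximized, mirroring the guaranteed-minimum-performance idea. The fitness surrogate is invented; a real design would score the damping of the closed-loop modes under each condition.

```python
# Toy GA maximizing worst-case fitness across operating conditions.
import numpy as np

rng = np.random.default_rng(0)

def worst_case_fitness(params, conditions):
    # score every condition, keep the worst one (the guaranteed minimum)
    return min(-np.sum((params - c) ** 2) for c in conditions)

conditions = [np.array([1.0, 0.5]), np.array([1.2, 0.3]), np.array([0.8, 0.6])]
pop = rng.uniform(-2, 2, size=(30, 2))
for _ in range(100):
    fit = np.array([worst_case_fitness(p, conditions) for p in pop])
    parents = pop[np.argsort(fit)[-10:]]                     # truncation selection
    children = parents[rng.integers(0, 10, 20)] + 0.1 * rng.normal(size=(20, 2))
    pop = np.vstack([parents, children])                     # elitism + mutation
best = pop[np.argmax([worst_case_fitness(p, conditions) for p in pop])]
print(best)  # compromise gains balancing all operating conditions
```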
Abstract:
This paper deals with the solution to the problem of multisensor data fusion for a single-target scenario as detected by an airborne track-while-scan radar. The details of a neural network implementation, various training algorithms based on standard backpropagation, and the results of training and testing the neural network are presented. The promising capabilities of the RPROP algorithm for multisensor data fusion are shown, for various parameters, in comparison to other adaptive techniques.
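RPROP's distinguishing step, adapting a per-weight step size from the sign of successive gradients while ignoring their magnitude, is compact enough to sketch directly. Constants are the usual Riedmiller-Braun defaults, and the quadratic loss is a stand-in for the fusion network's error.

```python
# Sketch of the RPROP- update: per-weight step sizes grow while the
# gradient keeps its sign and shrink when it flips; only the sign of the
# gradient drives the weight change.
import numpy as np

def rprop_step(w, grad, prev_grad, step, eta_minus=0.5, eta_plus=1.2,
               step_min=1e-6, step_max=50.0):
    sign_change = np.sign(grad) * np.sign(prev_grad)
    step = np.where(sign_change > 0, np.minimum(step * eta_plus, step_max), step)
    step = np.where(sign_change < 0, np.maximum(step * eta_minus, step_min), step)
    grad = np.where(sign_change < 0, 0.0, grad)  # skip the update after a flip
    w = w - np.sign(grad) * step
    return w, grad, step

# Minimize f(w) = sum(w**2) as a stand-in for the fusion network's loss.
w = np.array([3.0, -2.0])
prev, step = np.zeros(2), np.full(2, 0.1)
for _ in range(50):
    g = 2 * w
    w, prev, step = rprop_step(w, g, prev, step)
print(w)  # close to the optimum at the origin
```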