13 resultados para data warehouse tuning aggregato business intelligence performance

em Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Geographic Data Warehouses (GDW) are one of the main technologies used in decision-making processes and spatial analysis, and the literature proposes several conceptual and logical data models for GDW. However, little effort has been focused on studying how spatial data redundancy affects SOLAP (Spatial On-Line Analytical Processing) query performance over GDW. In this paper, we investigate this issue. Firstly, we compare redundant and non-redundant GDW schemas and conclude that redundancy is related to high performance losses. We also analyze the issue of indexing, aiming at improving SOLAP query performance on a redundant GDW. Comparisons of the SB-index approach, the star-join aided by R-tree and the star-join aided by GiST indicate that the SB-index significantly improves the elapsed time in query processing from 25% up to 99% with regard to SOLAP queries defined over the spatial predicates of intersection, enclosure and containment and applied to roll-up and drill-down operations. We also investigate the impact of the increase in data volume on the performance. The increase did not impair the performance of the SB-index, which highly improved the elapsed time in query processing. Performance tests also show that the SB-index is far more compact than the star-join, requiring only a small fraction of at most 0.20% of the volume. Moreover, we propose a specific enhancement of the SB-index to deal with spatial data redundancy. This enhancement improved performance from 80 to 91% for redundant GDW schemas.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

During the last few years, the evolution of fieldbus and computers networks allowed the integration of different communication systems involving both production single cells and production cells, as well as other systems for business intelligence, supervision and control. Several well-adopted communication technologies exist today for public and non-public networks. Since most of the industrial applications are time-critical, the requirements of communication systems for remote control differ from common applications for computer networks accessing the Internet, such as Web, e-mail and file transfer. The solution proposed and outlined in this work is called CyberOPC. It includes the study and the implementation of a new open communication system for remote control of industrial CNC machines, making the transmission delay for time-critical control data shorter than other OPC-based solutions, and fulfilling cyber security requirements.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper proposes a method to locate and track people by combining evidence from multiple cameras using the homography constraint. The proposed method use foreground pixels from simple background subtraction to compute evidence of the location of people on a reference ground plane. The algorithm computes the amount of support that basically corresponds to the ""foreground mass"" above each pixel. Therefore, pixels that correspond to ground points have more support. The support is normalized to compensate for perspective effects and accumulated on the reference plane for all camera views. The detection of people on the reference plane becomes a search for regions of local maxima in the accumulator. Many false positives are filtered by checking the visibility consistency of the detected candidates against all camera views. The remaining candidates are tracked using Kalman filters and appearance models. Experimental results using challenging data from PETS`06 show good performance of the method in the presence of severe occlusion. Ground truth data also confirms the robustness of the method. (C) 2010 Elsevier B.V. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Since the 1990s several large companies have been publishing nonfinancial performance reports. Focusing initially on the physical environment, these reports evolved to consider social relations, as well as data on the firm`s economic performance. A few mining companies pioneered this trend, and in the last years some of them incorporated the three dimensions of sustainable development, publishing so-called sustainability reports. This article reviews 31 reports published between 2001 and 2006 by four major mining companies. A set of 62 assessment items organized in six categories (namely context and commitment, management, environmental, social and economic performance, and accessibility and assurance) were selected to guide the review. The items were derived from international literature and recommended best practices, including the Global Reporting Initiative G3 framework. A content analysis was performed using the report as a sampling unit, and using phrases, graphics, or tables containing certain information as data collection units. A basic rating scale (0 or 1) was used for noting the presence or absence of information and a final percentage score was obtained for each report. Results show that there is a clear evolution in report`s comprehensiveness and depth. Categories ""accessibility and assurance"" and ""economic performance"" featured the lowest scores and do not present a clear evolution trend in the period, whereas categories ""context and commitment"" and ""social performance"" presented the best results and regular improvement; the category ""environmental performance,"" despite it not reaching the biggest scores, also featured constant evolution. Description of data measurement techniques, besides more comprehensive third-party verification are the items most in need of improvement.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Eight different models to represent the effect of friction in control valves are presented: four models based on physical principles and four empirical ones. The physical models, both static and dynamic, have the same structure. The models are implemented in Simulink/Matlab (R) and compared, using different friction coefficients and input signals. Three of the models were able to reproduce the stick-slip phenomenon and passed all the tests, which were applied following ISA standards. (C) 2008 Elsevier Ltd. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Leaf wetness duration (LWD) is a key parameter in agricultural meteorology since it is related to epidemiology of many important crops, controlling pathogen infection and development rates. Because LWD is not widely measured, several methods have been developed to estimate it from weather data. Among the models used to estimate LWD, those that use physical principles of dew formation and dew and/or rain evaporation have shown good portability and sufficiently accurate results, but their complexity is a disadvantage for operational use. Alternatively, empirical models have been used despite their limitations. The simplest empirical models use only relative humidity data. The objective of this study was to evaluate the performance of three RH-based empirical models to estimate LWD in four regions around the world that have different climate conditions. Hourly LWD, air temperature, and relative humidity data were obtained from Ames, IA (USA), Elora, Ontario (Canada), Florence, Toscany (Italy), and Piracicaba, Sao Paulo State (Brazil). These data were used to evaluate the performance of the following empirical LWD estimation models: constant RH threshold (RH >= 90%); dew point depression (DPD); and extended RH threshold (EXT_RH). Different performance of the models was observed in the four locations. In Ames, Elora and Piracicaba, the RH >= 90% and DPD models underestimated LWD, whereas in Florence these methods overestimated LWD, especially for shorter wet periods. When the EXT_RH model was used, LWD was overestimated for all locations, with a significant increase in the errors. In general, the RH >= 90% model performed best, presenting the highest general fraction of correct estimates (F(C)), between 0.87 and 0.92, and the lowest false alarm ratio (F(AR)), between 0.02 and 0.31. The use of specific thresholds for each location improved accuracy of the RH model substantially, even when independent data were used; MAE ranged from 1.23 to 1.89 h, which is very similar to errors obtained with published physical models for LWD estimation. Based on these results, we concluded that, if calibrated locally, LWD can be estimated with acceptable accuracy by RH above a specific threshold, and that the EXT_RH method was unsuitable for estimating LWD at the locations used in this study. (C) 2007 Elsevier B.V. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Recently, we have built a classification model that is capable of assigning a given sesquiterpene lactone (STL) into exactly one tribe of the plant family Asteraceae from which the STL has been isolated. Although many plant species are able to biosynthesize a set of peculiar compounds, the occurrence of the same secondary metabolites in more than one tribe of Asteraceae is frequent. Building on our previous work, in this paper, we explore the possibility of assigning an STL to more than one tribe (class) simultaneously. When an object may belong to more than one class simultaneously, it is called multilabeled. In this work, we present a general overview of the techniques available to examine multilabeled data. The problem of evaluating the performance of a multilabeled classifier is discussed. Two particular multilabeled classification methods-cross-training with support vector machines (ct-SVM) and multilabeled k-nearest neighbors (M-L-kNN)were applied to the classification of the STLs into seven tribes from the plant family Asteraceae. The results are compared to a single-label classification and are analyzed from a chemotaxonomic point of view. The multilabeled approach allowed us to (1) model the reality as closely as possible, (2) improve our understanding of the relationship between the secondary metabolite profiles of different Asteraceae tribes, and (3) significantly decrease the number of plant sources to be considered for finding a certain STL. The presented classification models are useful for the targeted collection of plants with the objective of finding plant sources of natural compounds that are biologically active or possess other specific properties of interest.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

BACKGROUND: The profile of blood donors changed dramatically in Brazil over the past 20 years, from remunerated to nonremunerated and then from replacement to community donors. Donor demographic data from three major blood centers establish current donation profiles in Brazil, serving as baseline for future analyses and tracking longitudinal changes in donor characteristics. STUDY DESIGN AND METHODS: Data were extracted from the blood center, compiled in a data warehouse, and analyzed. Population data were obtained from the Brazilian census. RESULTS: During 2007 to 2008, there were 615,379 blood donations from 410,423 donors. A total of 426,142 (69.2%) were from repeat (Rpt) donors and 189,237 (30.8%) were from first-time (FT) donors. Twenty percent of FT donors returned to donate in the period. FT donors were more likely to be younger, and Rpt donors were more likely to be community donors. All were predominantly male. Replacement donors still represent 50% of FT and 30% of Rpt donors. The mean percentage of the potentially general population who were donors was approximately 1.2% for the three centers (0.7, 1.5, and 3.1%). Adjusting for the catchment`s area, the first two were 2.1 and 1.6%. CONCLUSIONS: Donors in the three Brazilian centers tended to be younger with a higher proportion of males than in the general population. Donation rates were lower than desirable. There were substantial differences in sex, age, and community/replacement status by center. Studies on the safety, donation frequencies, and motivations of donors are in progress to orient efforts to enhance the availability of blood.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The integration of nanostructured films containing biomolecules and silicon-based technologies is a promising direction for reaching miniaturized biosensors that exhibit high sensitivity and selectivity. A challenge, however, is to avoid cross talk among sensing units in an array with multiple sensors located on a small area. In this letter, we describe an array of 16 sensing units, of a light-addressable potentiometric sensor (LAPS), which was made with layer-by-Layer (LbL) films of a poly(amidomine) dendrimer (PAMAM) and single-walled carbon nanotubes (SWNTs), coated with a layer of the enzyme penicillinase. A visual inspection of the data from constant-current measurements with liquid samples containing distinct concentrations of penicillin, glucose, or a buffer indicated a possible cross talk between units that contained penicillinase and those that did not. With the use of multidimensional data projection techniques, normally employed in information Visualization methods, we managed to distinguish the results from the modified LAPS, even in cases where the units were adjacent to each other. Furthermore, the plots generated with the interactive document map (IDMAP) projection technique enabled the distinction of the different concentrations of penicillin, from 5 mmol L(-1) down to 0.5 mmol L(-1). Data visualization also confirmed the enhanced performance of the sensing units containing carbon nanotubes, consistent with the analysis of results for LAPS sensors. The use of visual analytics, as with projection methods, may be essential to handle a large amount of data generated in multiple sensor arrays to achieve high performance in miniaturized systems.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

We discuss the expectation propagation (EP) algorithm for approximate Bayesian inference using a factorizing posterior approximation. For neural network models, we use a central limit theorem argument to make EP tractable when the number of parameters is large. For two types of models, we show that EP can achieve optimal generalization performance when data are drawn from a simple distribution.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Electromagnetic induction (EMI) method results are shown for vertical magnetic dipole (VMD) configuration by using the EM38 equipment. Performance in the location of metallic pipes and electrical cables is compared as a function of instrumental drift correction by linear and quadratic adjusting under controlled conditions. Metallic pipes and electrical cables are buried at the IAG/USP shallow geophysical test site in Sao Paulo City. Brazil. Results show that apparent electrical conductivity and magnetic susceptibility data were affected by ambient temperature variation. In order to obtain better contrast between background and metallic targets it was necessary to correct the drift. This correction was accomplished by using linear and quadratic relation between conductivity/susceptibility and temperature intending comparative studies. The correction of temperature drift by using a quadratic relation was effective, showing that all metallic targets were located as well deeper targets were also improved. (C) 2010 Elsevier B.V. All rights reserved.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Predictive performance evaluation is a fundamental issue in design, development, and deployment of classification systems. As predictive performance evaluation is a multidimensional problem, single scalar summaries such as error rate, although quite convenient due to its simplicity, can seldom evaluate all the aspects that a complete and reliable evaluation must consider. Due to this, various graphical performance evaluation methods are increasingly drawing the attention of machine learning, data mining, and pattern recognition communities. The main advantage of these types of methods resides in their ability to depict the trade-offs between evaluation aspects in a multidimensional space rather than reducing these aspects to an arbitrarily chosen (and often biased) single scalar measure. Furthermore, to appropriately select a suitable graphical method for a given task, it is crucial to identify its strengths and weaknesses. This paper surveys various graphical methods often used for predictive performance evaluation. By presenting these methods in the same framework, we hope this paper may shed some light on deciding which methods are more suitable to use in different situations.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Support vector machines (SVMs) were originally formulated for the solution of binary classification problems. In multiclass problems, a decomposition approach is often employed, in which the multiclass problem is divided into multiple binary subproblems, whose results are combined. Generally, the performance of SVM classifiers is affected by the selection of values for their parameters. This paper investigates the use of genetic algorithms (GAs) to tune the parameters of the binary SVMs in common multiclass decompositions. The developed GA may search for a set of parameter values common to all binary classifiers or for differentiated values for each binary classifier. (C) 2008 Elsevier B.V. All rights reserved.