893 results for data analysis: algorithms and implementation
Abstract:
This work presents two schemes for measuring the linear and angular kinematics of a rigid body using a kinematically redundant array of triple-axis accelerometers, with potential applications in biomechanics. A novel angular velocity estimation algorithm is proposed and evaluated that can compensate for angular velocity errors using measurements of the direction of gravity. Analysis and discussion of optimal sensor array characteristics are provided. A damped two-axis pendulum was used to excite all 6 DoF of a suspended accelerometer array through determined complex motion and is the basis of both simulation and experimental studies. The relationship between accuracy and sensor redundancy is investigated for arrays of up to 100 triple-axis accelerometers (300 accelerometer axes) in simulation and 10 equivalent sensors (30 accelerometer axes) in the laboratory test rig. The paper also reports on the sensor calibration techniques and hardware implementation.
Abstract:
In this paper, we develop a method, termed the Interaction Distribution (ID) method, for the analysis of quantitative ecological network data. In many cases, quantitative network data sets are under-sampled, i.e. many interactions are poorly sampled or remain unobserved. Hence, the output of statistical analyses may fail to differentiate between patterns that are statistical artefacts and those that are real characteristics of ecological networks. The ID method can support assessment and inference of under-sampled ecological network data. In the current paper, we illustrate and discuss the ID method based on the properties of plant-animal pollination data sets of flower visitation frequencies. However, the ID method may be applied to other types of ecological networks. The method can supplement existing network analyses based on two definitions of the underlying probabilities for each combination of pollinator and plant species: (1) p_{i,j}, the probability that a visit made by the i-th pollinator species takes place on the j-th plant species; and (2) q_{i,j}, the probability that a visit received by the j-th plant species is made by the i-th pollinator species. The method applies the Dirichlet distribution to estimate these two probabilities, based on a given empirical data set. The estimated mean values for p_{i,j} and q_{i,j} reflect the relative differences between recorded numbers of visits for different pollinator and plant species, and the estimated uncertainty of p_{i,j} and q_{i,j} decreases with higher numbers of recorded visits.
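The Dirichlet-based estimation described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the uniform prior (`prior=1.0`), the function name `dirichlet_summary` and the visit counts are all assumptions made for illustration. The sketch computes the posterior mean and standard deviation of the two visit probabilities from a matrix of recorded visits.

```python
import numpy as np

def dirichlet_summary(counts, prior=1.0):
    """Posterior mean and standard deviation of Dirichlet-distributed
    probabilities, computed along the last axis of `counts`.
    The uniform prior value is an assumption, not taken from the paper."""
    alpha = np.asarray(counts, dtype=float) + prior   # posterior parameters
    alpha0 = alpha.sum(axis=-1, keepdims=True)
    mean = alpha / alpha0
    var = alpha * (alpha0 - alpha) / (alpha0**2 * (alpha0 + 1.0))
    return mean, np.sqrt(var)

# Hypothetical visit counts: rows = pollinator species i, columns = plant species j.
visits = np.array([[30, 5, 0],
                   [2, 1, 0]])

# First probability: where a visit made by pollinator i lands (each row sums to 1).
p_mean, p_sd = dirichlet_summary(visits)

# Second probability: which pollinator made a visit received by plant j
# (normalise per plant, so each column sums to 1).
q_mean_T, q_sd_T = dirichlet_summary(visits.T)
q_mean, q_sd = q_mean_T.T, q_sd_T.T
```

In this toy data the second pollinator has only three recorded visits, so its estimated probabilities carry a larger standard deviation than those of the first pollinator, which matches the behaviour the abstract describes.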
Abstract:
Global communication requirements and load imbalance of some parallel data mining algorithms are the major obstacles to exploiting the computational power of large-scale systems. This work investigates how non-uniform data distributions can be exploited to remove the global communication requirement and to reduce the communication cost in parallel data mining algorithms and, in particular, in the k-means algorithm for cluster analysis. In the straightforward parallel formulation of the k-means algorithm, data and computation loads are uniformly distributed over the processing nodes. This approach has excellent load balancing characteristics that may suggest it could scale up to large and extreme-scale parallel computing systems. However, at each iteration step the algorithm requires a global reduction operation which hinders the scalability of the approach. This work studies a different parallel formulation of the algorithm where the requirement of global communication is removed, while maintaining the same deterministic nature of the centralised algorithm. The proposed approach exploits a non-uniform data distribution which can be either found in real-world distributed applications or can be induced by means of multi-dimensional binary search trees. The approach can also be extended to accommodate an approximation error which allows a further reduction of the communication costs. The effectiveness of the exact and approximate methods has been tested in a parallel computing system with 64 processors and in simulations with 1024 processing elements.
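To make the global reduction step concrete, the following is a minimal single-process sketch of the straightforward parallel formulation that the abstract takes as its baseline, not the communication-free method the work proposes. Each "node" is simulated as a list entry, the summation over partitions stands in for an all-reduce, and the function names and toy data are assumptions.

```python
import numpy as np

def local_partial_sums(points, centroids):
    """Work done by one node: assign its points to the nearest centroid and
    return per-cluster coordinate sums and point counts."""
    d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    k, dim = centroids.shape
    sums = np.zeros((k, dim))
    counts = np.zeros(k)
    np.add.at(sums, labels, points)   # accumulate points per cluster
    np.add.at(counts, labels, 1)      # count points per cluster
    return sums, counts

def kmeans_step(partitions, centroids):
    """One iteration: local work on each partition, then a global reduction."""
    partials = [local_partial_sums(p, centroids) for p in partitions]
    total_sums = sum(s for s, _ in partials)     # stands in for an all-reduce
    total_counts = sum(c for _, c in partials)
    new_centroids = centroids.copy()
    nonempty = total_counts > 0                  # leave empty clusters unchanged
    new_centroids[nonempty] = total_sums[nonempty] / total_counts[nonempty, None]
    return new_centroids

# Toy data split uniformly over two simulated nodes.
data = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
start = np.array([[1.0, 1.0], [9.0, 9.0]])
distributed = kmeans_step(np.array_split(data, 2), start)
centralised = kmeans_step([data], start)
```

Because every node contributes to every centroid, the two summations in `kmeans_step` must combine data from all nodes at every iteration. That is the communication cost the abstract identifies as the scalability bottleneck.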
Abstract:
Strategic marketing planning is now widely adopted by business-to-business organizations. While marketing planning principles are well established, practitioners attempting to implement the process often find their progress impeded by a variety of barriers. These barriers are explored through a review of published evidence and case study analysis of several organizations. This analysis exposes three levels of barriers to effective business-to-business marketing planning, relating to (i) organizational infrastructure, (ii) the planning process and (iii) implementation. These barriers reflect the synoptic nature of planning in many organizations. The findings lead to the development of a practitioner-oriented diagnostic and treatment tool which guides managers through the marketing planning process. Although this diagnostic deals specifically with issues which are relevant to the marketing planner, its wider implications for strategic planning are also explored.
Abstract:
This work presents a novel approach to increasing the recognition power of Multiscale Fractal Dimension (MFD) techniques when applied to image classification. The proposal uses Functional Data Analysis (FDA) to enhance the precision of the MFD technique, yielding a more representative descriptor vector that can recognize and characterize objects in an image more precisely. FDA is applied to signatures extracted with the Bouligand-Minkowski MFD technique in order to generate a descriptor vector from them. To evaluate the improvement obtained, an experiment was carried out using two datasets of objects: a dataset of character shapes (26 characters of the Latin alphabet) carrying different levels of controlled noise, and a dataset of fish image contours. A comparison with the well-known Fourier and wavelet descriptor methods was performed to verify the performance of the FDA method. The descriptor vectors were submitted to the Linear Discriminant Analysis (LDA) classification method, and the correct classification rates of the descriptor methods were compared. The results demonstrate that FDA outperforms the methods from the literature (Fourier and wavelets) in processing the information extracted from the MFD signature. The proposed method can therefore be considered an interesting choice for pattern recognition and image classification using fractal analysis.
Abstract:
This paper presents groundwater favorability mapping on a fractured terrain in the eastern portion of São Paulo State, Brazil. Remote sensing, airborne geophysical data, photogeologic interpretation, geologic and geomorphologic maps and geographic information system (GIS) techniques have been used. The results of cross-tabulation between these maps and well yield data allowed groundwater prospective parameters to be defined for a fractured-bedrock aquifer. These prospective parameters are the basis for the favorability analysis, whose principle is based on the knowledge-driven method. A multicriteria analysis (weighted linear combination) was carried out to produce a groundwater favorability map, because the prospective parameters have different weights of importance and each parameter has different classes. The groundwater favorability map was tested by cross-tabulation with new well yield data and spring occurrences. The wells with the highest productivity values, as well as all the spring occurrences, are situated in the areas mapped as having excellent and good favorability. This shows good coherence between the prospective parameters and the well yield, and the importance of GIS techniques for defining target areas for detailed study and well location. (c) 2008 Elsevier B.V. All rights reserved.
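A weighted linear combination of this kind can be sketched in a few lines. The parameter names, class scores and weights below are hypothetical, since the abstract does not list them. The sketch only shows the arithmetic: each parameter raster of class scores is multiplied by its normalised weight, and the products are summed cell by cell.

```python
import numpy as np

def weighted_linear_combination(layers, weights):
    """Combine class-score rasters into one favorability raster.

    layers:  dict name -> 2-D array of class scores (same shape for all)
    weights: dict name -> relative importance of that parameter
    """
    total = sum(weights.values())   # normalise so the weights sum to 1
    return sum(weights[name] / total * layers[name] for name in layers)

# Hypothetical 2x2 rasters and weights, for illustration only.
layers = {
    "lineament_density": np.array([[3.0, 1.0], [2.0, 3.0]]),
    "lithology": np.array([[2.0, 2.0], [1.0, 3.0]]),
}
weights = {"lineament_density": 3.0, "lithology": 1.0}

favorability = weighted_linear_combination(layers, weights)
```

Cells with high scores on the heavily weighted parameter dominate the result, which is why a knowledge-driven choice of weights matters in this type of analysis.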
Abstract:
Optimization of photo-Fenton degradation of copper phthalocyanine blue was achieved by response surface methodology (RSM) constructed with the aid of a sequential injection analysis (SIA) system coupled to a homemade photo-reactor. The highest degradation percentage was obtained under the following conditions: [H2O2]/[phthalocyanine] = 7, [H2O2]/[FeSO4] = 10, pH = 2.5, and stopped-flow time in the photo-reactor = 30 s. The SIA system was designed to prepare a monosegment containing the reagents and sample, to pump it toward the photo-reactor for the specified time, and to send the products to a flow-through spectrophotometer for monitoring the color reduction of the dye. Changes in parameters such as reagent molar ratios, residence time and pH were made by modifications in the software commanding the SIA system, without the need for physical reconfiguration of reagents around the selection valve. The proposed procedure and system fed the statistical program with degradation data for fast construction of response surface plots. After optimization, 97% of the dye was degraded. (C) 2009 Elsevier B.V. All rights reserved.
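As a rough illustration of how a response surface is built from degradation data, the sketch below fits a second-order polynomial in two coded factors by least squares and locates its stationary point. The responses are synthetic and noise-free, and the two-factor design, function names and coefficients are assumptions. None of the numbers come from the paper.

```python
import numpy as np

def design_matrix(x1, x2):
    """Columns of the full second-order model in two factors."""
    return np.column_stack([np.ones_like(x1), x1, x2, x1**2, x2**2, x1 * x2])

def fit_quadratic_surface(x1, x2, y):
    """Least-squares estimate of the six model coefficients."""
    coef, *_ = np.linalg.lstsq(design_matrix(x1, x2), y, rcond=None)
    return coef

def stationary_point(coef):
    """Solve grad(y) = 0 for the fitted quadratic surface."""
    _, b1, b2, b11, b22, b12 = coef
    hessian = np.array([[2 * b11, b12], [b12, 2 * b22]])
    return np.linalg.solve(hessian, -np.array([b1, b2]))

# Synthetic responses on a 3x3 grid of coded factor levels (-1, 0, +1).
g1, g2 = np.meshgrid([-1.0, 0.0, 1.0], [-1.0, 0.0, 1.0])
x1, x2 = g1.ravel(), g2.ravel()
y = 90 - 5 * (x1 - 0.2) ** 2 - 3 * (x2 + 0.4) ** 2   # made-up surface

coef = fit_quadratic_surface(x1, x2, y)
optimum = stationary_point(coef)   # coded levels at the surface maximum
```

With real measurements the fit would include noise, and the stationary point would need to be checked against the experimental range before being reported as an optimum.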
Abstract:
This work presents the use of sequential injection analysis (SIA) and the response surface methodology as a tool for optimization of Fenton-based processes. Alizarin red S dye (C.I. 58005) was used as a model compound for the anthraquinone family, whose pigments are widely used in the coatings industry. The following factors were considered: the [H2O2]:[Alizarin] and [H2O2]:[FeSO4] ratios and pH. The SIA system was designed to add reagents to the reactor and to perform on-line sampling of the reaction medium, sending the samples to a flow-through spectrophotometer for monitoring the color reduction of the dye. The proposed system fed the statistical program with degradation data for fast construction of response surface plots. After optimization, 99.7% of the dye was degraded and the TOC content was reduced to 35% of the original value. Low reagent consumption and high sampling throughput were the remarkable features of the SIA system. (C) 2008 Published by Elsevier B.V.
Abstract:
This paper describes a chemotaxonomic analysis of a database of triterpenoid compounds from the Celastraceae family using principal component analysis (PCA). The numbers of occurrences of thirty types of triterpene skeleton in different tribes of the family were used as variables. The study shows that PCA applied to chemical data can contribute to an intrafamilial classification of Celastraceae, since some questionable taxon affinities were observed from chemotaxonomic inferences about genera, and these are in agreement with the previously proposed phylogeny. The inclusion of Hippocrateaceae within Celastraceae is supported by the triterpene chemistry.
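The kind of PCA described here can be sketched with a plain singular value decomposition. The occurrence matrix below is invented (four tribes, four skeleton types) and much smaller than the thirty variables mentioned in the abstract. The function name and the choice of mean-centring without scaling are likewise assumptions.

```python
import numpy as np

def pca(X, n_components=2):
    """PCA via SVD of the column-centred data matrix.
    Returns scores, loadings and the explained-variance ratios."""
    Xc = X - X.mean(axis=0)                       # centre each variable
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = S**2 / (S**2).sum()
    return scores, Vt[:n_components], explained[:n_components]

# Hypothetical occurrence counts: rows = tribes, columns = skeleton types.
counts = np.array([[12, 0, 3, 1],
                   [10, 1, 4, 0],
                   [0, 8, 1, 6],
                   [1, 9, 0, 7]], dtype=float)

scores, loadings, explained = pca(counts)
```

In this toy matrix the first component separates the first two rows from the last two, which is the sort of grouping a chemotaxonomic reading of the score plot would look for.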
Abstract:
This paper investigates which factors affect the destination choice of Jordanian tourists among eight countries (Oman, Saudi Arabia, Syria, Tunisia, Yemen, Egypt, Lebanon and Bahrain) using panel data analysis. The number of outbound tourists is the dependent variable, which is regressed on five explanatory variables using a fixed-effects model. The finding of this paper is that tourists from Jordan have weak demand for outbound tourism; the Jordanian decision to travel abroad is determined by the cost of traveling to different places and choosing the cheapest alternative.
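A fixed-effects regression on panel data can be sketched with the within estimator: demean every variable by destination, then run ordinary least squares. The example uses a single hypothetical regressor (a travel cost index) rather than the paper's five explanatory variables, and all numbers are synthetic.

```python
import numpy as np

def within_estimator(y, X, groups):
    """Fixed-effects (within) estimator: subtract group means from y and X,
    then estimate the slopes by ordinary least squares."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    y_dm, X_dm = y.copy(), X.copy()
    for g in np.unique(groups):
        mask = groups == g
        y_dm[mask] -= y[mask].mean()
        X_dm[mask] -= X[mask].mean(axis=0)
    beta, *_ = np.linalg.lstsq(X_dm, y_dm, rcond=None)
    return beta

# Synthetic panel: 3 destinations observed over 4 periods.
groups = np.repeat([0, 1, 2], 4)
cost = np.array([1, 2, 3, 4, 2, 4, 5, 7, 1, 3, 4, 6], dtype=float)
effects = np.array([10.0, 20.0, 30.0])     # made-up destination effects
tourists = effects[groups] - 2.0 * cost    # true slope is -2 by construction

beta = within_estimator(tourists, cost[:, None], groups)
```

Demeaning removes the destination-specific intercepts, so the slope is identified only from variation within each destination over time.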
Abstract:
Excessive labor turnover may be considered, to a great extent, an undesirable feature of a given economy. This follows from considerations such as underinvestment in human capital by firms. Understanding the determinants and the evolution of turnover in a particular labor market is therefore of paramount importance, including policy considerations. The present paper proposes an econometric analysis of turnover in the Brazilian labor market, based on a partial observability bivariate probit model. This model considers the interdependence of decisions taken by workers and firms, helping to elucidate the causes that lead each of them to end an employment relationship. The Employment and Unemployment Survey (PED) conducted by the State System of Data Analysis (SEADE) and by the Inter-Union Department of Statistics and Socioeconomic Studies (DIEESE) provides data at the individual worker level, allowing for the estimation of the joint probabilities of decisions to quit or stay on the job on the worker’s side, and to maintain or fire the employee on the firm’s side, during a given time period. The estimated parameters relate these estimated probabilities to the characteristics of workers, job contracts, and to the potential macroeconomic determinants in different time periods. The results confirm the theoretical prediction that the probability of termination of an employment relationship tends to be smaller as the worker acquires specific skills. The results also show that the establishment of a formal employment relationship reduces the probability of a quit decision by the worker, and also the firm’s firing decision in non-industrial sectors. With regard to the evolution of quit probability over time, the results show that an increase in the unemployment rate inhibits quitting, although this tends to wane as the unemployment rate rises.
Abstract:
Four different hypotheses are analyzed in the literature to explain deunionization, namely: the decrease in the demand for union representation by workers; the impact of globalization on unionization rates; technical change; and changes in the legal and political systems against unions. This paper aims to test all of them. We estimate a logistic regression using a panel data procedure with 35 industries from 1973 to 1999 and conclude that the four hypotheses cannot be rejected by the data. We also use a variance analysis decomposition to study the impact of these variables on the drop in unionization rates. In the model with no demographic variables, the results show that these economic (tested) variables can account for 10% to 12% of the drop in unionization. However, when we include demographic variables, these tested variables can account for 10% to 35% of the total variation of unionization rates. In this case the four hypotheses tested can explain up to 50% of the total drop in unionization rates explained by the model.
Abstract:
We investigate whether there was a stable money demand function for Japan in the 1990s using both aggregate and disaggregate time series data. The aggregate data appear to support the contention that there was no stable money demand function. The disaggregate data show that there was a stable money demand function. Neither was there any indication of the presence of a liquidity trap. Possible sources of discrepancy are explored, and the diametrically opposite results between the aggregate and disaggregate analyses are attributed to the neglected heterogeneity among micro units. We also conduct simulation analysis to show that when heterogeneity among micro units is present, the prediction of aggregate outcomes using aggregate data is less accurate than the prediction based on micro equations. Moreover, policy evaluation based on aggregate data can be grossly misleading.
Abstract:
Sharing sensor data between multiple devices and users can be challenging for naive users, and requires knowledge of programming and the use of different communication channels and/or development tools, leading to non-uniform solutions. This thesis proposes a system that allows users to access sensors, share sensor data and manage sensors. With this system we intend to manage devices, share sensor data, compare sensor data, and set policies to act based on rules. This thesis presents the design and implementation of the system, as well as three case studies of its use.
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)