925 resultados para Data clustering. Fuzzy C-Means. Cluster centers initialization. Validation indices
Resumo:
Cluster scheduling and collision avoidance are crucial issues in large-scale cluster-tree Wireless Sensor Networks (WSNs). The paper presents a methodology that provides a Time Division Cluster Scheduling (TDCS) mechanism based on the cyclic extension of RCPS/TC (Resource Constrained Project Scheduling with Temporal Constraints) problem for a cluster-tree WSN, assuming bounded communication errors. The objective is to meet all end-to-end deadlines of a predefined set of time-bounded data flows while minimizing the energy consumption of the nodes by setting the TDCS period as long as possible. Sinceeach cluster is active only once during the period, the end-to-end delay of a given flow may span over several periods when there are the flows with opposite direction. The scheduling tool enables system designers to efficiently configure all required parameters of the IEEE 802.15.4/ZigBee beaconenabled cluster-tree WSNs in the network design time. The performance evaluation of thescheduling tool shows that the problems with dozens of nodes can be solved while using optimal solvers.
Resumo:
The simulation analysis is important approach to developing and evaluating the systems in terms of development time and cost. This paper demonstrates the application of Time Division Cluster Scheduling (TDCS) tool for the configuration of IEEE 802.15.4/ZigBee beaconenabled cluster-tree WSNs using the simulation analysis, as an illustrative example that confirms the practical applicability of the tool. The simulation study analyses how the number of retransmissions impacts the reliability of data transmission, the energy consumption of the nodes and the end-to-end communication delay, based on the simulation model that was implemented in the Opnet Modeler. The configuration parameters of the network are obtained directly from the TDCS tool. The simulation results show that the number of retransmissions impacts the reliability, the energy consumption and the end-to-end delay, in a way that improving the one may degrade the others.
Resumo:
This paper focus on a demand response model analysis in a smart grid context considering a contingency scenario. A fuzzy clustering technique is applied on the developed demand response model and an analysis is performed for the contingency scenario. Model considerations and architecture are described. The demand response developed model aims to support consumers decisions regarding their consumption needs and possible economic benefits.
Resumo:
Dissertation presented at the Faculty of Science and Technology of the New University of Lisbon in fulfillment of the requirements for the Masters degree in Electrical Engineering and Computers
Resumo:
In cluster analysis, it can be useful to interpret the partition built from the data in the light of external categorical variables which are not directly involved to cluster the data. An approach is proposed in the model-based clustering context to select a number of clusters which both fits the data well and takes advantage of the potential illustrative ability of the external variables. This approach makes use of the integrated joint likelihood of the data and the partitions at hand, namely the model-based partition and the partitions associated to the external variables. It is noteworthy that each mixture model is fitted by the maximum likelihood methodology to the data, excluding the external variables which are used to select a relevant mixture model only. Numerical experiments illustrate the promising behaviour of the derived criterion. © 2014 Springer-Verlag Berlin Heidelberg.
Resumo:
Conferência: CONTROLO’2012 - 16-18 July 2012 - Funchal
Resumo:
Data analytic applications are characterized by large data sets that are subject to a series of processing phases. Some of these phases are executed sequentially but others can be executed concurrently or in parallel on clusters, grids or clouds. The MapReduce programming model has been applied to process large data sets in cluster and cloud environments. For developing an application using MapReduce there is a need to install/configure/access specific frameworks such as Apache Hadoop or Elastic MapReduce in Amazon Cloud. It would be desirable to provide more flexibility in adjusting such configurations according to the application characteristics. Furthermore the composition of the multiple phases of a data analytic application requires the specification of all the phases and their orchestration. The original MapReduce model and environment lacks flexible support for such configuration and composition. Recognizing that scientific workflows have been successfully applied to modeling complex applications, this paper describes our experiments on implementing MapReduce as subworkflows in the AWARD framework (Autonomic Workflow Activities Reconfigurable and Dynamic). A text mining data analytic application is modeled as a complex workflow with multiple phases, where individual workflow nodes support MapReduce computations. As in typical MapReduce environments, the end user only needs to define the application algorithms for input data processing and for the map and reduce functions. In the paper we present experimental results when using the AWARD framework to execute MapReduce workflows deployed over multiple Amazon EC2 (Elastic Compute Cloud) instances.
Resumo:
This paper presents the preliminary work of an approach where Fuzzy Boolean Nets (FBN) are being used to extract qualitative knowledge regarding the effect of prescribed fire burning on soil chemical physical properties. FBN were chosen due to the scarcity on available quantitative data.
Resumo:
Beyond the classical statistical approaches (determination of basic statistics, regression analysis, ANOVA, etc.) a new set of applications of different statistical techniques has increasingly gained relevance in the analysis, processing and interpretation of data concerning the characteristics of forest soils. This is possible to be seen in some of the recent publications in the context of Multivariate Statistics. These new methods require additional care that is not always included or refered in some approaches. In the particular case of geostatistical data applications it is necessary, besides to geo-reference all the data acquisition, to collect the samples in regular grids and in sufficient quantity so that the variograms can reflect the spatial distribution of soil properties in a representative manner. In the case of the great majority of Multivariate Statistics techniques (Principal Component Analysis, Correspondence Analysis, Cluster Analysis, etc.) despite the fact they do not require in most cases the assumption of normal distribution, they however need a proper and rigorous strategy for its utilization. In this work, some reflections about these methodologies and, in particular, about the main constraints that often occur during the information collecting process and about the various linking possibilities of these different techniques will be presented. At the end, illustrations of some particular cases of the applications of these statistical methods will also be presented.
Resumo:
The Portuguese northern forests are often and severely affected by wildfires during the Summer season. These occurrences significantly affect and negatively impact all ecosystems, namely soil, fauna and flora. In order to reduce the occurrences of natural wildfires, some measures to control the availability of fuel mass are regularly implemented. Those preventive actions concern mainly prescribed burnings and vegetation pruning. This work reports on the impact of a prescribed burning on several forest soil properties, namely pH, soil moisture, organic matter content and iron content, by monitoring the soil self-recovery capabilities during a one year span. The experiments were carried out in soil cover over a natural site of Andaluzitic schist, in Gramelas, Caminha, Portugal, which was kept intact from prescribed burnings during a period of four years. Soil samples were collected from five plots at three different layers (0–3, 3–6 and 6–18) 1 day before prescribed fire and at regular intervals after the prescribed fire. This paper presents an approach where Fuzzy Boolean Nets (FBN) and Fuzzy reasoning are used to extract qualitative knowledge regarding the effect of prescribed fire burning on soil properties. FBN were chosen due to the scarcity on available quantitative data. The results showed that soil properties were affected by prescribed burning practice and were unable to recover their initial values after one year.
Resumo:
This study aims to optimize the water quality monitoring of a polluted watercourse (Leça River, Portugal) through the principal component analysis (PCA) and cluster analysis (CA). These statistical methodologies were applied to physicochemical, bacteriological and ecotoxicological data (with the marine bacterium Vibrio fischeri and the green alga Chlorella vulgaris) obtained with the analysis of water samples monthly collected at seven monitoring sites and during five campaigns (February, May, June, August, and September 2006). The results of some variables were assigned to water quality classes according to national guidelines. Chemical and bacteriological quality data led to classify Leça River water quality as “bad” or “very bad”. PCA and CA identified monitoring sites with similar pollution pattern, giving to site 1 (located in the upstream stretch of the river) a distinct feature from all other sampling sites downstream. Ecotoxicity results corroborated this classification thus revealing differences in space and time. The present study includes not only physical, chemical and bacteriological but also ecotoxicological parameters, which broadens new perspectives in river water characterization. Moreover, the application of PCA and CA is very useful to optimize water quality monitoring networks, defining the minimum number of sites and their location. Thus, these tools can support appropriate management decisions.
Resumo:
In cluster analysis, it can be useful to interpret the partition built from the data in the light of external categorical variables which are not directly involved to cluster the data. An approach is proposed in the model-based clustering context to select a number of clusters which both fits the data well and takes advantage of the potential illustrative ability of the external variables. This approach makes use of the integrated joint likelihood of the data and the partitions at hand, namely the model-based partition and the partitions associated to the external variables. It is noteworthy that each mixture model is fitted by the maximum likelihood methodology to the data, excluding the external variables which are used to select a relevant mixture model only. Numerical experiments illustrate the promising behaviour of the derived criterion.
Resumo:
A new algorithm for the velocity vector estimation of moving ships using Single Look Complex (SLC) SAR data in strip map acquisition mode is proposed. The algorithm exploits both amplitude and phase information of the Doppler decompressed data spectrum, with the aim to estimate both the azimuth antenna pattern and the backscattering coefficient as function of the look angle. The antenna pattern estimation provides information about the target velocity; the backscattering coefficient can be used for vessel classification. The range velocity is retrieved in the slow time frequency domain by estimating the antenna pattern effects induced by the target motion, while the azimuth velocity is calculated by the estimated range velocity and the ship orientation. Finally, the algorithm is tested on simulated SAR SLC data.
Resumo:
The solubilities of two C-tetraalkylcalix[4]resorcinarenes, namely C-tetramethylcalix[4]resorcinarene and C-tetrapentylcalix[4]resorcinarene, in supercritical carbon dioxide (SCCO2) were measured in a flow-type apparatus at a temperature range from (313.2 to 333.2) K and at pressures from (12.0 to 35.0) MPa. The C-tetraalkylcalix[4]resorcinarenes were synthesized applying our optimized procedure and fully characterized by means of gel permeation chromatography, infrared and nuclear magnetic resonance spectroscopy. The solubilities of the C-tetraalkylcalix[4]resorcinarenes in SCCO2 were determined by analysis of the extracts obtained by HPLC with ultraviolet (UV) detection methodology adapted by our team. Four semiempirical density-based models, and the SoaveRedlichKwong cubic equation of state (SRK CEoS) with classical mixing rules, were applied to correlate the solubility of the calix[4]resorcinarenes in the SC CO2. The physical properties required for the modeling were estimated and reported.
Resumo:
The Evidence Accumulation Clustering (EAC) paradigm is a clustering ensemble method which derives a consensus partition from a collection of base clusterings obtained using different algorithms. It collects from the partitions in the ensemble a set of pairwise observations about the co-occurrence of objects in a same cluster and it uses these co-occurrence statistics to derive a similarity matrix, referred to as co-association matrix. The Probabilistic Evidence Accumulation for Clustering Ensembles (PEACE) algorithm is a principled approach for the extraction of a consensus clustering from the observations encoded in the co-association matrix based on a probabilistic model for the co-association matrix parameterized by the unknown assignments of objects to clusters. In this paper we extend the PEACE algorithm by deriving a consensus solution according to a MAP approach with Dirichlet priors defined for the unknown probabilistic cluster assignments. In particular, we study the positive regularization effect of Dirichlet priors on the final consensus solution with both synthetic and real benchmark data.