Abstract:
Fitting statistical models is computationally challenging when the sample size or the dimension of the dataset is huge. An attractive approach for down-scaling the problem size is to first partition the dataset into subsets and then fit using distributed algorithms. The dataset can be partitioned either horizontally (in the sample space) or vertically (in the feature space), and the challenge lies in defining an algorithm with low communication, theoretical guarantees and excellent practical performance in general settings. For sample space partitioning, I propose a MEdian Selection Subset AGgregation Estimator ({\em message}) algorithm for solving these issues. The algorithm applies feature selection in parallel to each subset using regularized regression or a Bayesian variable selection method, calculates the `median' feature inclusion index, estimates coefficients for the selected features in parallel for each subset, and then averages these estimates. The algorithm is simple, involves minimal communication, scales efficiently in sample size, and has theoretical guarantees. I provide extensive experiments showing excellent performance in feature selection, estimation, prediction, and computation time relative to the usual competitors.
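To make the pipeline concrete, here is a minimal sketch of the {\em message} idea, assuming the lasso for the per-subset selection step (the thesis also allows Bayesian variable selection); the function name message_fit, the subset count, and the majority-vote threshold (the median of binary inclusion indicators) are illustrative, not taken from the thesis.

```python
# A minimal sketch of message, assuming lasso-based selection per subset;
# message_fit and its defaults are illustrative, not from the thesis.
import numpy as np
from sklearn.linear_model import LassoCV

def message_fit(X, y, m=4, seed=0):
    n, p = X.shape
    rng = np.random.default_rng(seed)
    subsets = np.array_split(rng.permutation(n), m)
    # 1) Feature selection in parallel on each subset (here: lasso).
    masks = [LassoCV(cv=5).fit(X[i], y[i]).coef_ != 0 for i in subsets]
    # 2) 'Median' inclusion index: keep features selected by a majority.
    selected = np.mean(masks, axis=0) >= 0.5
    # 3) Re-estimate on the selected features in each subset, then average.
    betas = [np.linalg.lstsq(X[i][:, selected], y[i], rcond=None)[0]
             for i in subsets]
    beta = np.zeros(p)
    beta[selected] = np.mean(betas, axis=0)
    return beta
```

Only the p-dimensional inclusion vectors and coefficient estimates leave the workers, which is where the low communication cost comes from.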
While sample space partitioning is useful in handling datasets with large sample sizes, feature space partitioning is more effective when the data dimension is high. Existing methods for partitioning features, however, are either vulnerable to high correlations or inefficient in reducing the model dimension. In the thesis, I propose a new embarrassingly parallel framework named {\em DECO} for distributed variable selection and parameter estimation. In {\em DECO}, variables are first partitioned and allocated to m distributed workers. The decorrelated subset data within each worker are then fitted via any algorithm designed for high-dimensional problems. We show that by incorporating the decorrelation step, DECO can achieve consistent variable selection and parameter estimation on each subset with (almost) no assumptions. In addition, the convergence rate is nearly minimax optimal for both sparse and weakly sparse models and does not depend on the partition number m. Extensive numerical experiments are provided to illustrate the performance of the new framework.
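The decorrelation step is the heart of DECO; below is a hedged sketch assuming the $\sqrt{p}\,(XX^T)^{-1/2}$ standardization and the lasso as the per-worker fitter (any refinement stage described in the thesis is omitted, and deco_fit is an illustrative name).

```python
# A minimal sketch of DECO's decorrelation plus per-block fitting.
import numpy as np
from sklearn.linear_model import LassoCV

def deco_fit(X, y, m=4):
    n, p = X.shape
    # Decorrelate: afterwards X_t @ X_t.T = p * I, so arbitrary column
    # blocks behave as if they were nearly uncorrelated.
    w, V = np.linalg.eigh(X @ X.T)
    inv_sqrt = V @ np.diag(1.0 / np.sqrt(np.maximum(w, 1e-10))) @ V.T
    X_t = np.sqrt(p) * inv_sqrt @ X
    y_t = np.sqrt(p) * inv_sqrt @ y
    # Partition the features across m workers; fit each block independently.
    beta = np.zeros(p)
    for cols in np.array_split(np.arange(p), m):
        beta[cols] = LassoCV(cv=5).fit(X_t[:, cols], y_t).coef_
    return beta
```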
For datasets with both large sample sizes and high dimensionality, I propose a new "divide-and-conquer" framework {\em DEME} (DECO-message) that leverages both the {\em DECO} and the {\em message} algorithms. The new framework first partitions the dataset in the sample space into row cubes using {\em message} and then partitions the feature space of each cube using {\em DECO}. This procedure is equivalent to partitioning the original data matrix into multiple small blocks, each small enough to be stored and fitted on a single machine in parallel. The results are then synthesized via the {\em DECO} and {\em message} algorithms in reverse order to produce the final output. The whole framework is extremely scalable.
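A toy sketch of the resulting blocking, with illustrative sizes and split counts:

```python
# Hypothetical sketch of DEME's blocking step: rows are split as in message,
# then each row cube's columns are split as in DECO.
import numpy as np

X = np.random.randn(10_000, 5_000)          # toy design matrix
row_cubes = np.array_split(X, 4, axis=0)    # sample-space split (message)
blocks = [np.array_split(cube, 5, axis=1)   # feature-space split (DECO)
          for cube in row_cubes]            # 4 x 5 grid of small blocks
# Each blocks[i][j] is fitted in parallel; results are synthesized
# feature-wise (DECO) within each row cube, then aggregated across cubes
# (message's median/averaging step), i.e., in reverse partition order.
```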
Abstract:
Ocean acidification (OA), induced by the rapid anthropogenic rise in CO2 and its dissolution in seawater, is known to have consequences for marine organisms. However, the evolutionary responses of phytoplankton to OA remain poorly studied. Here we examined the coccolithophore Gephyrocapsa oceanica while growing it for 2000 generations under ambient and elevated CO2 levels. While OA stimulated growth in the earlier selection period (generations 700 to 1550), it reduced growth in the later selection period, up to generation 2000. Similarly, the stimulated production of particulate organic carbon and nitrogen declined with increasing selection time and decreased under OA by generation 2000. The specific adaptation of growth to OA seen at generation 1000 had disappeared by generations 1700 to 2000. Both phenotypic plasticity and fitness decreased with selection time, suggesting that the species' resilience to OA declined after 2000 generations under high-CO2 selection.
Abstract:
In this paper, we describe a decentralized privacy-preserving protocol for securely casting trust ratings in distributed reputation systems. Our protocol allows n participants to cast their votes in a way that preserves the privacy of individual values against both internal and external attacks. The protocol is coupled with an extensive theoretical analysis in which we formally prove that our protocol is resistant to collusion against as many as n-1 corrupted nodes in the semi-honest model. The behavior of our protocol is tested in a real P2P network by measuring its communication delay and processing overhead. The experimental results uncover the advantages of our protocol over previous works in the area; without sacrificing security, our decentralized protocol is shown to be almost one order of magnitude faster than the previous best protocol for providing anonymous feedback.
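The abstract does not spell out the construction; as a point of reference, here is a minimal sketch of additive secret sharing over Z_M, the standard primitive for privately summing ratings with resistance to up to n-1 semi-honest colluders (the names, modulus, and single-round structure are illustrative, not the paper's exact protocol).

```python
# Sketch of additive secret sharing for a private sum of votes.
import secrets

M = 2**64  # group modulus, assumed larger than any possible sum of votes

def share(vote: int, n: int) -> list[int]:
    """Split a vote into n additive shares that sum to it modulo M."""
    parts = [secrets.randbelow(M) for _ in range(n - 1)]
    parts.append((vote - sum(parts)) % M)
    return parts

def secure_sum(votes: list[int]) -> int:
    n = len(votes)
    all_shares = [share(v, n) for v in votes]   # row i: node i's shares
    # Node j locally sums the shares it received (one from every node) ...
    partials = [sum(row[j] for row in all_shares) % M for j in range(n)]
    # ... and only these partial sums are published and combined.
    return sum(partials) % M

print(secure_sum([5, 3, 7, 1]))  # -> 16, with no individual vote revealed
```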
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-08
Abstract:
As the complexity of parallel applications increases, the performance limitations resulting from computational load imbalance become dominant. Mapping the problem space to the processors of a parallel machine so as to balance the workload of each processor will typically reduce the run-time. In many cases the computation time required for a given calculation cannot be predetermined, even at run-time, so a static partition of the problem yields poor performance. For problems in which the computational load across the discretisation is dynamic and inhomogeneous, for example multi-physics problems involving fluid and solid mechanics with phase changes, the workload of a static subdomain will change over the course of the computation and cannot be estimated beforehand. For such applications the mapping of loads to processors must change dynamically at run-time in order to maintain reasonable efficiency. The issue of dynamic load balancing is examined in the context of PHYSICA, a three-dimensional unstructured-mesh multi-physics continuum mechanics modelling code.
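A toy illustration (unrelated to the PHYSICA code itself) of why dynamic, pull-based mapping beats a static partition when per-task costs cannot be predicted:

```python
# Dynamic load balancing via a shared task queue: idle workers pull work.
import random
import time
from multiprocessing import Pool

def work(cell):
    time.sleep(random.random() * 0.01)   # unpredictable per-cell cost
    return cell

if __name__ == "__main__":
    cells = range(1000)
    with Pool(4) as pool:
        # chunksize=1 is fully dynamic: each idle worker pulls the next task.
        # A large chunksize approximates a static partition and suffers
        # whenever one chunk happens to contain the expensive cells.
        for _ in pool.imap_unordered(work, cells, chunksize=1):
            pass
```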
Abstract:
Several numerical methods for boundary value problems use integral and differential operational matrices, expressed in polynomial bases in a Hilbert space of functions. This work presents a sequence of matrix operations allowing direct computation of operational matrices for polynomial bases, orthogonal or not, starting from any previously known reference matrix. Furthermore, it shows how to obtain the reference matrix for a chosen polynomial basis. The results presented here can be applied not only to integration and differentiation, but to any linear operation.
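As an illustration of the change-of-basis idea, here is a hedged sketch assuming differentiation as the linear operation, the monomial basis as the reference, and Legendre polynomials as the target basis; the similarity transform M^{-1} D M is standard linear algebra, not the paper's specific construction.

```python
# Operational matrix of differentiation, moved from the monomial basis
# to the Legendre basis by a change-of-basis similarity transform.
import numpy as np
from numpy.polynomial import legendre

n = 5
# Reference matrix: differentiation in the monomial basis {1, x, ..., x^4}.
D_mono = np.zeros((n, n))
for i in range(1, n):
    D_mono[i - 1, i] = i
# Change-of-basis matrix: column j = monomial coefficients of P_j(x).
M = np.zeros((n, n))
for j in range(n):
    c = np.zeros(j + 1); c[j] = 1.0          # Legendre series for P_j
    M[: j + 1, j] = legendre.leg2poly(c)
# Operational matrix in the Legendre basis, via a similarity transform.
D_leg = np.linalg.solve(M, D_mono @ M)
assert np.allclose(D_leg[:, 2], [0, 3, 0, 0, 0])  # P_2' = 3 * P_1
```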
Abstract:
Background: Recent studies have reported the clinical importance of CYP2C19 and ABCB1 polymorphisms in an individualized approach to clopidogrel treatment. The aims of this study were to evaluate the frequencies of CYP2C19 and ABCB1 polymorphisms and to identify the clopidogrel-predicted metabolic phenotypes according to ethnic group in a sample representative of a highly admixed population. Methods: One hundred and eighty-three Amerindians and 1,029 subjects from the general population of four regions of the country were included. Genotypes for the ABCB1 c.C3435T (rs1045642), CYP2C19*2 (rs4244285), CYP2C19*3 (rs4986893), CYP2C19*4 (rs28399504), CYP2C19*5 (rs56337013), and CYP2C19*17 (rs12248560) polymorphisms were detected by polymerase chain reaction followed by high-resolution melting analysis. The CYP2C19*3, CYP2C19*4 and CYP2C19*5 variants were genotyped in a subsample (300 randomly selected samples). Results: The CYP2C19*3 and CYP2C19*5 variant alleles were not detected, and the CYP2C19*4 variant allele had a frequency of 0.3%. The allelic frequencies of the ABCB1 c.C3435T, CYP2C19*2 and CYP2C19*17 polymorphisms were distributed differently according to ethnicity: Amerindian (51.4%, 10.4%, 15.8%); Caucasian descent (43.2%, 16.9%, 18.0%); Mulatto (35.9%, 16.5%, 21.3%); and African descent (32.8%, 20.2%, 26.3%) individuals, respectively. As a result, self-reported ethnicity predicted significantly different prevalences of clopidogrel-predicted metabolic phenotypes even in a highly admixed population. Conclusion: Our findings indicate inter-ethnic differences in ABCB1 and CYP2C19 variant allele frequencies in the Brazilian general population and in Amerindians. This information could help in stratifying individuals from this population by clopidogrel-predicted metabolic phenotype and in designing more cost-effective programs for the individualization of clopidogrel therapy.
Abstract:
Background: Various neuroimaging studies, both structural and functional, have supported the proposal that a distributed brain network is likely to be the neural basis of intelligence. The theory of Distributed Intelligent Processing Systems (DIPS), first developed in the field of Artificial Intelligence, was proposed to adequately model distributed neural intelligent processing. In addition, the neural efficiency hypothesis suggests that individuals with higher intelligence display more focused cortical activation during cognitive performance, resulting in lower total brain activation than in individuals with lower intelligence. This may be understood as a property of a DIPS. Methodology and Principal Findings: In our study, a new EEG brain mapping technique, based on the neural efficiency hypothesis and on the notion of the brain as a Distributed Intelligent Processing System, was used to investigate the correlations between IQ, evaluated with the WAIS (Wechsler Adult Intelligence Scale) and WISC (Wechsler Intelligence Scale for Children), and the brain activity associated with visual and verbal processing, in order to test the validity of a distributed neural basis for intelligence. Conclusion: The present results support these claims and the neural efficiency hypothesis.
Abstract:
Recent advances in energy generation technology and new directions in electricity regulation have made distributed generation (DG) more widespread, with significant consequences for the operational characteristics of distribution networks. For this reason, new methods for identifying such impacts are needed, together with research and development of new tools and resources to maintain and facilitate continued expansion towards DG. This paper presents a study aimed at determining appropriate DG sites in distribution systems. The main considerations that determine DG sites are presented, together with an account of the advantages gained from correct DG placement. The paper defines quantitative and qualitative parameters evaluated with the Digsilent(R), GARP3(R) and DSA-GD software tools. A multi-objective approach based on the Bellman-Zadeh algorithm and fuzzy logic is used to determine appropriate DG sites. The study also aims to find acceptable DG locations both for distribution system feeders and for nodes within a given feeder.
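For reference, a minimal sketch of the Bellman-Zadeh max-min rule assumed here, with illustrative membership values (not from the study):

```python
# Bellman-Zadeh fuzzy multi-objective choice among candidate DG sites.
# Illustrative membership values per criterion; not from the study.
candidate_sites = {
    "node_12": {"loss_reduction": 0.8, "voltage_profile": 0.6, "cost": 0.70},
    "node_27": {"loss_reduction": 0.9, "voltage_profile": 0.4, "cost": 0.90},
    "node_33": {"loss_reduction": 0.7, "voltage_profile": 0.7, "cost": 0.65},
}

def bellman_zadeh(sites):
    # Fuzzy decision = intersection (min) over the objectives; the chosen
    # alternative maximizes that minimum membership (max-min rule).
    return max(sites, key=lambda s: min(sites[s].values()))

print(bellman_zadeh(candidate_sites))  # -> node_33 (min membership 0.65)
```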
Abstract:
This paper presents a novel graphical approach to adjusting and evaluating frequency-based relays employed in anti-islanding protection schemes of distributed synchronous generators, so as to meet the anti-islanding and abnormal frequency variation requirements simultaneously. The proposed method defines a region in the power mismatch space inside which the relay's non-detection zone should be located if the above-mentioned requirements are to be met. This region is called the power imbalance application region. Results show that the method can help protection engineers adjust frequency-based relays to improve anti-islanding capability and minimize false operation, while keeping the utility's abnormal frequency variation requirements satisfied. Moreover, the proposed method can be employed to coordinate different types of frequency-based relays, aiming to improve the overall performance of the distributed generator frequency protection scheme.
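A toy sketch of the graphical criterion, assuming rectangular approximations of both regions in the (dP, dQ) power mismatch space; actual zones have curved boundaries derived from the relay settings and frequency limits.

```python
# Containment check: does the relay's non-detection zone (NDZ) lie inside
# the power imbalance application region? Rectangles are a toy stand-in.
def inside(ndz, region):
    """True if the NDZ rectangle lies entirely within the application region."""
    (p1, p2, q1, q2), (P1, P2, Q1, Q2) = ndz, region
    return P1 <= p1 and p2 <= P2 and Q1 <= q1 and q2 <= Q2

ndz = (-0.05, 0.05, -0.03, 0.03)       # illustrative per-unit mismatches
region = (-0.15, 0.15, -0.10, 0.10)
print(inside(ndz, region))              # relay settings acceptable if True
```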
Abstract:
Wireless Sensor Networks (WSNs) have a vast field of applications, including deployment in hostile environments, so the adoption of security mechanisms is fundamental. However, the extremely constrained nature of sensors and the potentially dynamic behavior of WSNs hinder the use of the key management mechanisms commonly applied in modern networks. For this reason, many lightweight key management solutions have been proposed to overcome these constraints. In this paper, we review the state of the art of these solutions and evaluate them based on metrics adequate for WSNs. We focus on pre-distribution schemes well adapted to homogeneous networks (since this is the more general network organization), identifying generic features that can improve some of these metrics. We also discuss some challenges in the area and future research directions.
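As a concrete instance of the surveyed pre-distribution schemes, here is a sketch of random key pre-distribution in the style of Eschenauer and Gligor (named here as a well-known example; the survey covers many variants), with illustrative pool and ring sizes:

```python
# Random key pre-distribution: each node is loaded with a random key ring
# drawn from a common pool before deployment; neighbors that share at least
# one key can establish a secure link.
import random

POOL_SIZE, RING_SIZE, NUM_NODES = 10_000, 100, 50
pool = range(POOL_SIZE)
rings = [set(random.sample(pool, RING_SIZE)) for _ in range(NUM_NODES)]

def can_link(a: int, b: int) -> bool:
    """Two nodes establish a secure link iff their key rings intersect."""
    return bool(rings[a] & rings[b])

connected = sum(can_link(0, j) for j in range(1, NUM_NODES))
print(f"node 0 shares a key with {connected}/{NUM_NODES - 1} neighbors")
```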
Abstract:
Distributed control systems consist of sensors, actuators and controllers interconnected by communication networks, and are characterized by a high number of concurrent processes. This work presents a procedure for modelling and analyzing communication networks for distributed control systems in intelligent buildings. The approach is based on characterizing the control system as a discrete event system and applying coloured Petri nets as a formal method for the specification, analysis and verification of control solutions. With this approach, we develop the models that compose the communication networks for the control systems of intelligent buildings, taking into account the relationships between the various building systems. The procedure provides a structured development of models, facilitating the specification of the control algorithm. An application example is presented to illustrate the main features of the approach.
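A minimal sketch of the discrete-event view, using a plain place/transition net rather than the coloured Petri nets of the paper (coloured nets add typed tokens); the places and transitions are illustrative:

```python
# Tiny place/transition net: a transition fires only when every input
# place holds enough tokens (the discrete-event firing rule).
marking = {"sensor_idle": 1, "msg_queued": 0, "controller_ready": 1,
           "actuator_cmd": 0}
transitions = {
    "send_reading": ({"sensor_idle": 1}, {"msg_queued": 1, "sensor_idle": 1}),
    "process_msg": ({"msg_queued": 1, "controller_ready": 1},
                    {"actuator_cmd": 1, "controller_ready": 1}),
}

def fire(name):
    pre, post = transitions[name]
    if all(marking[p] >= k for p, k in pre.items()):   # enabled?
        for p, k in pre.items():
            marking[p] -= k                            # consume tokens
        for p, k in post.items():
            marking[p] = marking.get(p, 0) + k         # produce tokens
        return True
    return False

fire("send_reading"); fire("process_msg")
print(marking)  # actuator_cmd now holds a token
```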
Distributed Estimation Over an Adaptive Incremental Network Based on the Affine Projection Algorithm
Abstract:
We study the problem of distributed estimation based on the affine projection algorithm (APA), which is developed from Newton's method for minimizing a cost function. The proposed solution is formulated to ameliorate the limited convergence properties of least-mean-square (LMS) type distributed adaptive filters with colored inputs. The analysis of transient and steady-state performances at each individual node within the network is developed by using a weighted spatial-temporal energy conservation relation and confirmed by computer simulations. The simulation results also verify that the proposed algorithm provides not only a faster convergence rate but also an improved steady-state performance as compared to an LMS-based scheme. In addition, the new approach attains an acceptable misadjustment performance with lower computational and memory cost, provided the number of regressor vectors and filter length parameters are appropriately chosen, as compared to a distributed recursive-least-squares (RLS) based method.
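For reference, a sketch of one incremental pass with the standard APA update, under assumed notation (the step size mu, regularization eps, and the toy data are illustrative, not the paper's setup):

```python
# Incremental APA: the estimate circulates around a ring of nodes; each
# node refines it with its K most recent regressor rows and responses.
import numpy as np

def apa_update(w, U, d, mu=0.5, eps=1e-4):
    """One affine-projection step toward the K most recent data hyperplanes."""
    e = d - U @ w
    return w + mu * U.T @ np.linalg.solve(U @ U.T + eps * np.eye(len(d)), e)

rng = np.random.default_rng(0)
M, K, nodes, iters = 8, 4, 10, 200
w_true = rng.standard_normal(M)
w = np.zeros(M)
for _ in range(iters):
    for k in range(nodes):                 # one incremental (ring) cycle
        U = rng.standard_normal((K, M))    # node k's last K regressors
        d = U @ w_true + 0.01 * rng.standard_normal(K)
        w = apa_update(w, U, d)            # update, pass to next node
print(np.linalg.norm(w - w_true))          # small after convergence
```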
Abstract:
Scheduling parallel and distributed applications efficiently onto grid environments is a difficult task, and a great variety of scheduling heuristics have been developed to address this issue. A successful grid resource allocation depends, among other things, on the quality of the available information about software artifacts and grid resources. In this article, we propose a semantic approach that integrates the selection of equivalent resources and the selection of equivalent software artifacts to improve the scheduling of resources suitable for a given set of application execution requirements. We also describe a prototype implementation of our approach based on the Integrade grid middleware, and experimental results that illustrate its benefits.
Abstract:
In this paper the continuous Verhulst dynamic model is used to synthesize a new distributed power control algorithm (DPCA) for use in direct-sequence code-division multiple access (DS-CDMA) systems. The Verhulst model was originally designed to describe the population growth of biological species under food and physical space restrictions. The discretization of the corresponding differential equation is accomplished via the Euler numeric integration (ENI) method. Analytical convergence conditions for the proposed DPCA are also established. Several properties of the proposed recursive algorithm, such as the Euclidean distance from the optimum vector after convergence, convergence speed, normalized mean squared error (NSE), average power consumption per user, performance under dynamic channels, and implementation complexity, are analyzed through simulations. The simulation results are compared with two other DPCAs: the classic algorithm derived by Foschini and Miljanic and the sigmoidal algorithm of Uykan and Koivo. Under estimation error conditions, the proposed DPCA exhibits a smaller discrepancy from the optimum power vector and better convergence (under both fixed and adaptive convergence factors) than the classic and sigmoidal DPCAs.
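A hedged sketch of a Verhulst-type update consistent with this description (the exact recursion in the paper may differ): the Euler-discretized logistic dynamics have their fixed point where each user's SINR equals its target.

```python
# Logistic (Verhulst-style) distributed power control sketch; the gain
# matrix, noise level, and convergence factor alpha are illustrative.
import numpy as np

def verhulst_dpca(G, gamma_target, sigma2=1e-3, alpha=0.5, iters=100):
    p = np.full(len(gamma_target), 1e-3)           # initial powers
    for _ in range(iters):
        interference = G @ p - np.diag(G) * p + sigma2
        gamma = np.diag(G) * p / interference      # per-user SINR
        # Euler-discretized logistic update; fixed point: gamma == target.
        p = (1.0 + alpha) * p - alpha * (gamma / gamma_target) * p
    return p

rng = np.random.default_rng(0)
G = 0.05 * rng.random((4, 4)) + np.eye(4)          # toy link-gain matrix
print(verhulst_dpca(G, gamma_target=np.full(4, 2.0)))
```

Each user needs only its own SINR estimate to run the update, which is what makes the algorithm distributed.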