76 resultados para Virtualizzazione, Nested Virtualization, IaaS, Virtualbox, Okeanos
em CentAUR: Central Archive University of Reading - UK
Resumo:
The variogram is essential for local estimation and mapping of any variable by kriging. The variogram itself must usually be estimated from sample data. The sampling density is a compromise between precision and cost, but it must be sufficiently dense to encompass the principal spatial sources of variance. A nested, multi-stage, sampling with separating distances increasing in geometric progression from stage to stage will do that. The data may then be analyzed by a hierarchical analysis of variance to estimate the components of variance for every stage, and hence lag. By accumulating the components starting from the shortest lag one obtains a rough variogram for modest effort. For balanced designs the analysis of variance is optimal; for unbalanced ones, however, these estimators are not necessarily the best, and the analysis by residual maximum likelihood (REML) will usually be preferable. The paper summarizes the underlying theory and illustrates its application with data from three surveys, one in which the design had four stages and was balanced and two implemented with unbalanced designs to economize when there were more stages. A Fortran program is available for the analysis of variance, and code for the REML analysis is listed in the paper. (c) 2005 Elsevier Ltd. All rights reserved.
Resumo:
An unbalanced nested sampling design was used to investigate the spatial scale of soil and herbicide interactions at the field scale. A hierarchical analysis of variance based on residual maximum likelihood (REML) was used to analyse the data and provide a first estimate of the variogram. Soil samples were taken at 108 locations at a range of separating distances in a 9 ha field to explore small and medium scale spatial variation. Soil organic matter content, pH, particle size distribution, microbial biomass and the degradation and sorption of the herbicide, isoproturon, were determined for each soil sample. A large proportion of the spatial variation in isoproturon degradation and sorption occurred at sampling intervals less than 60 m, however, the sampling design did not resolve the variation present at scales greater than this. A sampling interval of 20-25 m should ensure that the main spatial structures are identified for isoproturon degradation rate and sorption without too great a loss of information in this field.
Resumo:
The variogram is essential for local estimation and mapping of any variable by kriging. The variogram itself must usually be estimated from sample data. The sampling density is a compromise between precision and cost, but it must be sufficiently dense to encompass the principal spatial sources of variance. A nested, multi-stage, sampling with separating distances increasing in geometric progression from stage to stage will do that. The data may then be analyzed by a hierarchical analysis of variance to estimate the components of variance for every stage, and hence lag. By accumulating the components starting from the shortest lag one obtains a rough variogram for modest effort. For balanced designs the analysis of variance is optimal; for unbalanced ones, however, these estimators are not necessarily the best, and the analysis by residual maximum likelihood (REML) will usually be preferable. The paper summarizes the underlying theory and illustrates its application with data from three surveys, one in which the design had four stages and was balanced and two implemented with unbalanced designs to economize when there were more stages. A Fortran program is available for the analysis of variance, and code for the REML analysis is listed in the paper. (c) 2005 Elsevier Ltd. All rights reserved.
Resumo:
A full assessment of para-virtualization is important, because without knowledge about the various overheads, users can not understand whether using virtualization is a good idea or not. In this paper we are very interested in assessing the overheads of running various benchmarks on bare-‐metal, as well as on para-‐virtualization. The idea is to see what the overheads of para-‐ virtualization are, as well as looking at the overheads of turning on monitoring and logging. The knowledge from assessing various benchmarks on these different systems will help a range of users understand the use of virtualization systems. In this paper we assess the overheads of using Xen, VMware, KVM and Citrix, see Table 1. These different virtualization systems are used extensively by cloud-‐users. We are using various Netlib1 benchmarks, which have been developed by the University of Tennessee at Knoxville (UTK), and Oak Ridge National Laboratory (ORNL). In order to assess these virtualization systems, we run the benchmarks on bare-‐metal, then on the para-‐virtualization, and finally we turn on monitoring and logging. The later is important as users are interested in Service Level Agreements (SLAs) used by the Cloud providers, and the use of logging is a means of assessing the services bought and used from commercial providers. In this paper we assess the virtualization systems on three different systems. We use the Thamesblue supercomputer, the Hactar cluster and IBM JS20 blade server (see Table 2), which are all servers available at the University of Reading. A functional virtualization system is multi-‐layered and is driven by the privileged components. Virtualization systems can host multiple guest operating systems, which run on its own domain, and the system schedules virtual CPUs and memory within each Virtual Machines (VM) to make the best use of the available resources. The guest-‐operating system schedules each application accordingly. You can deploy virtualization as full virtualization or para-‐virtualization. Full virtualization provides a total abstraction of the underlying physical system and creates a new virtual system, where the guest operating systems can run. No modifications are needed in the guest OS or application, e.g. the guest OS or application is not aware of the virtualized environment and runs normally. Para-‐virualization requires user modification of the guest operating systems, which runs on the virtual machines, e.g. these guest operating systems are aware that they are running on a virtual machine, and provide near-‐native performance. You can deploy both para-‐virtualization and full virtualization across various virtualized systems. Para-‐virtualization is an OS-‐assisted virtualization; where some modifications are made in the guest operating system to enable better performance. In this kind of virtualization, the guest operating system is aware of the fact that it is running on the virtualized hardware and not on the bare hardware. In para-‐virtualization, the device drivers in the guest operating system coordinate the device drivers of host operating system and reduce the performance overheads. The use of para-‐virtualization [0] is intended to avoid the bottleneck associated with slow hardware interrupts that exist when full virtualization is employed. It has revealed [0] that para-‐ virtualization does not impose significant performance overhead in high performance computing, and this in turn this has implications for the use of cloud computing for hosting HPC applications. The “apparent” improvement in virtualization has led us to formulate the hypothesis that certain classes of HPC applications should be able to execute in a cloud environment, with minimal performance degradation. In order to support this hypothesis, first it is necessary to define exactly what is meant by a “class” of application, and secondly it will be necessary to observe application performance, both within a virtual machine and when executing on bare hardware. A further potential complication is associated with the need for Cloud service providers to support Service Level Agreements (SLA), so that system utilisation can be audited.
Resumo:
Nested clade phylogeographic analysis (NCPA) is a popular method for reconstructing the demographic history of spatially distributed populations from genetic data. Although some parts of the analysis are automated, there is no unique and widely followed algorithm for doing this in its entirety, beginning with the data, and ending with the inferences drawn from the data. This article describes a method that automates NCPA, thereby providing a framework for replicating analyses in an objective way. To do so, a number of decisions need to be made so that the automated implementation is representative of previous analyses. We review how the NCPA procedure has evolved since its inception and conclude that there is scope for some variability in the manual application of NCPA. We apply the automated software to three published datasets previously analyzed manually and replicate many details of the manual analyses, suggesting that the current algorithm is representative of how a typical user will perform NCPA. We simulate a large number of replicate datasets for geographically distributed, but entirely random-mating, populations. These are then analyzed using the automated NCPA algorithm. Results indicate that NCPA tends to give a high frequency of false positives. In our simulations we observe that 14% of the clades give a conclusive inference that a demographic event has occurred, and that 75% of the datasets have at least one clade that gives such an inference. This is mainly due to the generation of multiple statistics per clade, of which only one is required to be significant to apply the inference key. We survey the inferences that have been made in recent publications and show that the most commonly inferred processes (restricted gene flow with isolation by distance and contiguous range expansion) are those that are commonly inferred in our simulations. However, published datasets typically yield a richer set of inferences with NCPA than obtained in our random-mating simulations, and further testing of NCPA with models of structured populations is necessary to examine its accuracy.
Resumo:
ANeCA is a fully automated implementation of Nested Clade Phylogeographic Analysis. This was originally developed by Templeton and colleagues, and has been used to infer, from the pattern of gene sequence polymorphisms in a geographically structured population, the historical demographic processes that have shaped its evolution. Until now it has been necessary to perform large parts of the procedure manually. We provide a program that will take data in Nexus sequential format, and directly output a set of inferences. The software also includes TCS v1.18 and GeoDis v2.2 as part of automation.
Resumo:
Since its introduction in 1993, the Message Passing Interface (MPI) has become a de facto standard for writing High Performance Computing (HPC) applications on clusters and Massively Parallel Processors (MPPs). The recent emergence of multi-core processor systems presents a new challenge for established parallel programming paradigms, including those based on MPI. This paper presents a new Java messaging system called MPJ Express. Using this system, we exploit multiple levels of parallelism - messaging and threading - to improve application performance on multi-core processors. We refer to our approach as nested parallelism. This MPI-like Java library can support nested parallelism by using Java or Java OpenMP (JOMP) threads within an MPJ Express process. Practicality of this approach is assessed by porting to Java a massively parallel structure formation code from Cosmology called Gadget-2. We introduce nested parallelism in the Java version of the simulation code and report good speed-ups. To the best of our knowledge it is the first time this kind of hybrid parallelism is demonstrated in a high performance Java application. (C) 2009 Elsevier Inc. All rights reserved.
Resumo:
Processor virtualization for process migration in distributed parallel computing systems has formed a significant component of research on load balancing. In contrast, the potential of processor virtualization for fault tolerance has been addressed minimally. The work reported in this paper is motivated towards extending concepts of processor virtualization towards ‘intelligent cores’ as a means to achieve fault tolerance in distributed parallel computing systems. Intelligent cores are an abstraction of the hardware processing cores, with the incorporation of cognitive capabilities, on which parallel tasks can be executed and migrated. When a processing core executing a task is predicted to fail the task being executed is proactively transferred onto another core. A parallel reduction algorithm incorporating concepts of intelligent cores is implemented on a computer cluster using Adaptive MPI and Charm ++. Preliminary results confirm the feasibility of the approach.
Resumo:
Four-dimensional variational data assimilation (4D-Var) is used in environmental prediction to estimate the state of a system from measurements. When 4D-Var is applied in the context of high resolution nested models, problems may arise in the representation of spatial scales longer than the domain of the model. In this paper we study how well 4D-Var is able to estimate the whole range of spatial scales present in one-way nested models. Using a model of the one-dimensional advection–diffusion equation we show that small spatial scales that are observed can be captured by a 4D-Var assimilation, but that information in the larger scales may be degraded. We propose a modification to 4D-Var which allows a better representation of these larger scales.
Resumo:
The temporal variability of the atmosphere through which radio waves pass in the technique of differential radar interferometry can seriously limit the accuracy with which the method can measure surface motion. A forward, nested mesoscale model of the atmosphere can be used to simulate the variable water content along the radar path and the resultant phase delays. Using this approach we demonstrate how to correct an interferogram of Mount Etna in Sicily associated with an eruption in 2004-5. The regional mesoscale model (Unified Model) used to simulate the atmosphere at higher resolutions consists of four nested domains increasing in resolution (12, 4, 1, 0.3 km), sitting within the analysis version of a global numerical model that is used to initiate the simulation. Using the high resolution 3D model output we compute the surface pressure, temperature and the water vapour, liquid and solid water contents, enabling the dominant hydrostatic and wet delays to be calculated at specific times corresponding to the acquisition of the radar data. We can also simulate the second-order delay effects due to liquid water and ice.