743 resultados para Grid Computing
Resumo:
The arbitrarily structured C-grid, TRiSK (Thuburn, Ringler, Skamarock and Klemp, 2009, 2010) is being used in the ``Model for Prediction Across Scales'' (MPAS) and is being considered by the UK Met Office for their next dynamical core. However the hexagonal C-grid supports a branch of spurious Rossby modes which lead to erroneous grid-scale oscillations of potential vorticity (PV). It is shown how these modes can be harmlessly controlled by using upwind-biased interpolation schemes for PV. A number of existing advection schemes for PV are tested, including that used in MPAS, and none are found to give adequate results for all grids and all cases. Therefore a new scheme is proposed; continuous, linear-upwind stabilised transport (CLUST), a blend between centred and linear-upwind with the blend dependent on the flow direction with respect to the cell edge. A diagnostic of grid-scale oscillations is proposed which gives further discrimination between schemes than using potential enstrophy alone and indeed some schemes are found to destroy potential enstrophy while grid-scale oscillations grow. CLUST performs well on hexagonal-icosahedral grids and unrotated skipped latitude-longitude grids of the sphere for various shallow water test cases. Despite the computational modes, the hexagonal icosahedral grid performs well since these modes are easy and harmless to filter. As a result TRiSK appears to perform better than a spectral shallow water model.
Resumo:
The impending threat of global climate change and its regional manifestations is among the most important and urgent problems facing humanity. Society needs accurate and reliable estimates of changes in the probability of regional weather variations to develop science-based adaptation and mitigation strategies. Recent advances in weather prediction and in our understanding and ability to model the climate system suggest that it is both necessary and possible to revolutionize climate prediction to meet these societal needs. However, the scientific workforce and the computational capability required to bring about such a revolution is not available in any single nation. Motivated by the success of internationally funded infrastructure in other areas of science, this paper argues that, because of the complexity of the climate system, and because the regional manifestations of climate change are mainly through changes in the statistics of regional weather variations, the scientific and computational requirements to predict its behavior reliably are so enormous that the nations of the world should create a small number of multinational high-performance computing facilities dedicated to the grand challenges of developing the capabilities to predict climate variability and change on both global and regional scales over the coming decades. Such facilities will play a key role in the development of next-generation climate models, build global capacity in climate research, nurture a highly trained workforce, and engage the global user community, policy-makers, and stakeholders. We recommend the creation of a small number of multinational facilities with computer capability at each facility of about 20 peta-flops in the near term, about 200 petaflops within five years, and 1 exaflop by the end of the next decade. Each facility should have sufficient scientific workforce to develop and maintain the software and data analysis infrastructure. Such facilities will enable questions of what resolution, both horizontal and vertical, in atmospheric and ocean models, is necessary for more confident predictions at the regional and local level. Current limitations in computing power have placed severe limitations on such an investigation, which is now badly needed. These facilities will also provide the world's scientists with the computational laboratories for fundamental research on weather–climate interactions using 1-km resolution models and on atmospheric, terrestrial, cryospheric, and oceanic processes at even finer scales. Each facility should have enabling infrastructure including hardware, software, and data analysis support, and scientific capacity to interact with the national centers and other visitors. This will accelerate our understanding of how the climate system works and how to model it. It will ultimately enable the climate community to provide society with climate predictions, which are based on our best knowledge of science and the most advanced technology.
Resumo:
OBJECTIVES: The prediction of protein structure and the precise understanding of protein folding and unfolding processes remains one of the greatest challenges in structural biology and bioinformatics. Computer simulations based on molecular dynamics (MD) are at the forefront of the effort to gain a deeper understanding of these complex processes. Currently, these MD simulations are usually on the order of tens of nanoseconds, generate a large amount of conformational data and are computationally expensive. More and more groups run such simulations and generate a myriad of data, which raises new challenges in managing and analyzing these data. Because the vast range of proteins researchers want to study and simulate, the computational effort needed to generate data, the large data volumes involved, and the different types of analyses scientists need to perform, it is desirable to provide a public repository allowing researchers to pool and share protein unfolding data. METHODS: To adequately organize, manage, and analyze the data generated by unfolding simulation studies, we designed a data warehouse system that is embedded in a grid environment to facilitate the seamless sharing of available computer resources and thus enable many groups to share complex molecular dynamics simulations on a more regular basis. RESULTS: To gain insight into the conformational fluctuations and stability of the monomeric forms of the amyloidogenic protein transthyretin (TTR), molecular dynamics unfolding simulations of the monomer of human TTR have been conducted. Trajectory data and meta-data of the wild-type (WT) protein and the highly amyloidogenic variant L55P-TTR represent the test case for the data warehouse. CONCLUSIONS: Web and grid services, especially pre-defined data mining services that can run on or 'near' the data repository of the data warehouse, are likely to play a pivotal role in the analysis of molecular dynamics unfolding data.
Resumo:
Pocket Data Mining (PDM) is our new term describing collaborative mining of streaming data in mobile and distributed computing environments. With sheer amounts of data streams are now available for subscription on our smart mobile phones, the potential of using this data for decision making using data stream mining techniques has now been achievable owing to the increasing power of these handheld devices. Wireless communication among these devices using Bluetooth and WiFi technologies has opened the door wide for collaborative mining among the mobile devices within the same range that are running data mining techniques targeting the same application. This paper proposes a new architecture that we have prototyped for realizing the significant applications in this area. We have proposed using mobile software agents in this application for several reasons. Most importantly the autonomic intelligent behaviour of the agent technology has been the driving force for using it in this application. Other efficiency reasons are discussed in details in this paper. Experimental results showing the feasibility of the proposed architecture are presented and discussed.
Resumo:
The P-found protein folding and unfolding simulation repository is designed to allow scientists to perform data mining and other analyses across large, distributed simulation data sets. There are two storage components in P-found: a primary repository of simulation data that is used to populate the second component, and a data warehouse that contains important molecular properties. These properties may be used for data mining studies. Here we demonstrate how grid technologies can support multiple, distributed P-found installations. In particular, we look at two aspects: firstly, how grid data management technologies can be used to access the distributed data warehouses; and secondly, how the grid can be used to transfer analysis programs to the primary repositories — this is an important and challenging aspect of P-found, due to the large data volumes involved and the desire of scientists to maintain control of their own data. The grid technologies we are developing with the P-found system will allow new large data sets of protein folding simulations to be accessed and analysed in novel ways, with significant potential for enabling scientific discovery.
Resumo:
High spatial resolution environmental data gives us a better understanding of the environmental factors affecting plant distributions at fine spatial scales. However, large environmental datasets dramatically increase compute times and output species model size stimulating the need for an alternative computing solution. Cluster computing offers such a solution, by allowing both multiple plant species Environmental Niche Models (ENMs) and individual tiles of high spatial resolution models to be computed concurrently on the same compute cluster. We apply our methodology to a case study of 4,209 species of Mediterranean flora (around 17% of species believed present in the biome). We demonstrate a 16 times speed-up of ENM computation time when 16 CPUs were used on the compute cluster. Our custom Java ‘Merge’ and ‘Downsize’ programs reduce ENM output files sizes by 94%. The median 0.98 test AUC score of species ENMs is aided by various species occurrence data filtering techniques. Finally, by calculating the percentage change of individual grid cell values, we map the projected percentages of plant species vulnerable to climate change in the Mediterranean region between 1950–2000 and 2020.
Resumo:
Advances in hardware and software technology enable us to collect, store and distribute large quantities of data on a very large scale. Automatically discovering and extracting hidden knowledge in the form of patterns from these large data volumes is known as data mining. Data mining technology is not only a part of business intelligence, but is also used in many other application areas such as research, marketing and financial analytics. For example medical scientists can use patterns extracted from historic patient data in order to determine if a new patient is likely to respond positively to a particular treatment or not; marketing analysts can use extracted patterns from customer data for future advertisement campaigns; finance experts have an interest in patterns that forecast the development of certain stock market shares for investment recommendations. However, extracting knowledge in the form of patterns from massive data volumes imposes a number of computational challenges in terms of processing time, memory, bandwidth and power consumption. These challenges have led to the development of parallel and distributed data analysis approaches and the utilisation of Grid and Cloud computing. This chapter gives an overview of parallel and distributed computing approaches and how they can be used to scale up data mining to large datasets.
Resumo:
Although the tube theory is successful in describing entangled polymers qualitatively, a more quantitative description requires precise and consistent definitions of its parameters. Here we investigate the simplest model of entangled polymers, namely a single Rouse chain in a cubic lattice of line obstacles, and illustrate the typical problems and uncertainties of the tube theory. In particular we show that in general one needs 3 entanglement related parameters, but only 2 combinations of them are relevant for the long-time dynamics. Conversely, the plateau modulus can not be determined from these two parameters and requires a more detailed model of entanglements with explicit entanglement forces, such as the slipsprings model. It is shown that for the grid model the Rouse time within the tube is larger than the Rouse time of the free chain, in contrast to what the standard tube theory assumes.
Resumo:
Purpose: This paper aims to design an evaluation method that enables an organization to assess its current IT landscape and provide readiness assessment prior to Software as a Service (SaaS) adoption. Design/methodology/approach: The research employs a mixed of quantitative and qualitative approaches for conducting an IT application assessment. Quantitative data such as end user’s feedback on the IT applications contribute to the technical impact on efficiency and productivity. Qualitative data such as business domain, business services and IT application cost drivers are used to determine the business value of the IT applications in an organization. Findings: The assessment of IT applications leads to decisions on suitability of each IT application that can be migrated to cloud environment. Research limitations/implications: The evaluation of how a particular IT application impacts on a business service is done based on the logical interpretation. Data mining method is suggested in order to derive the patterns of the IT application capabilities. Practical implications: This method has been applied in a local council in UK. This helps the council to decide the future status of the IT applications for cost saving purpose.
Resumo:
Stakeholder analysis plays a critical role in business analysis. However, the majority of the stakeholder identification and analysis methods focus on the activities and processes and ignore the artefacts being processed by human beings. By focusing on the outputs of the organisation, an artefact-centric view helps create a network of artefacts, and a component-based structure of the organisation and its supply chain participants. Since the relationship is based on the components, i.e. after the stakeholders are identified, the interdependency between stakeholders and the focal organisation can be measured. Each stakeholder is associated with two types of dependency, namely the stakeholder’s dependency on the focal organisation and the focal organisation’s dependency on the stakeholder. We identify three factors for each type of dependency and propose the equations that calculate the dependency indexes. Once both types of the dependency indexes are calculated, each stakeholder can be placed and categorised into one of the four groups, namely critical stakeholder, mutual benefits stakeholder, replaceable stakeholder, and easy care stakeholder. The mutual dependency grid and the dependency gap analysis, which further investigates the priority of each stakeholder by calculating the weighted dependency gap between the focal organisation and the stakeholder, subsequently help the focal organisation to better understand its stakeholders and manage its stakeholder relationships.
Resumo:
We consider the numerical treatment of second kind integral equations on the real line of the form ∅(s) = ∫_(-∞)^(+∞)▒〖κ(s-t)z(t)ϕ(t)dt,s=R〗 (abbreviated ϕ= ψ+K_z ϕ) in which K ϵ L_1 (R), z ϵ L_∞ (R) and ψ ϵ BC(R), the space of bounded continuous functions on R, are assumed known and ϕ ϵ BC(R) is to be determined. We first derive sharp error estimates for the finite section approximation (reducing the range of integration to [-A, A]) via bounds on (1-K_z )^(-1)as an operator on spaces of weighted continuous functions. Numerical solution by a simple discrete collocation method on a uniform grid on R is then analysed: in the case when z is compactly supported this leads to a coefficient matrix which allows a rapid matrix-vector multiply via the FFT. To utilise this possibility we propose a modified two-grid iteration, a feature of which is that the coarse grid matrix is approximated by a banded matrix, and analyse convergence and computational cost. In cases where z is not compactly supported a combined finite section and two-grid algorithm can be applied and we extend the analysis to this case. As an application we consider acoustic scattering in the half-plane with a Robin or impedance boundary condition which we formulate as a boundary integral equation of the class studied. Our final result is that if z (related to the boundary impedance in the application) takes values in an appropriate compact subset Q of the complex plane, then the difference between ϕ(s)and its finite section approximation computed numerically using the iterative scheme proposed is ≤C_1 [kh log〖(1⁄kh)+(1-Θ)^((-1)⁄2) (kA)^((-1)⁄2) 〗 ] in the interval [-ΘA,ΘA](Θ<1) for kh sufficiently small, where k is the wavenumber and h the grid spacing. Moreover this numerical approximation can be computed in ≤C_2 N logN operations, where N = 2A/h is the number of degrees of freedom. The values of the constants C1 and C2 depend only on the set Q and not on the wavenumber k or the support of z.
Resumo:
Details are given of the development and application of a 2D depth-integrated, conformal boundary-fitted, curvilinear model for predicting the depth-mean velocity field and the spatial concentration distribution in estuarine and coastal waters. A numerical method for conformal mesh generation, based on a boundary integral equation formulation, has been developed. By this method a general polygonal region with curved edges can be mapped onto a regular polygonal region with the same number of horizontal and vertical straight edges and a multiply connected region can be mapped onto a regular region with the same connectivity. A stretching transformation on the conformally generated mesh has also been used to provide greater detail where it is needed close to the coast, with larger mesh sizes further offshore, thereby minimizing the computing effort whilst maximizing accuracy. The curvilinear hydrodynamic and solute model has been developed based on a robust rectilinear model. The hydrodynamic equations are approximated using the ADI finite difference scheme with a staggered grid and the solute transport equation is approximated using a modified QUICK scheme. Three numerical examples have been chosen to test the curvilinear model, with an emphasis placed on complex practical applications