681 results for cluster computing
Abstract:
Extending previous studies, a full-circle investigation of the ring current has been made using Cluster 4-spacecraft observations near perigee, at times when the Cluster array had relatively small separations and nearly regular tetrahedral configurations, and when the Dst index was greater than −30 nT (non-storm conditions). These observations provide, for the first time, direct estimates of the near-equatorial current density at all magnetic local times (MLT), with sufficient accuracy to support the following observations. The results confirm that the ring current flows westward and show that the in situ average measured current density (sampled in the radial range accessed by Cluster, 4–4.5RE) is asymmetric in MLT, ranging from 9 to 27 nA m−2. The direction of current is shown to be very well ordered for the whole range of MLT. Both of these results are in line with previous studies on partial ring extent. The magnitude of the current density, however, reveals a distinct asymmetry: growing from 10 to 27 nA m−2 as azimuth reduces from about 12:00MLT to 03:00MLT and falling from 20 to 10 nA m−2 less steadily as azimuth reduces from 24:00MLT to 12:00MLT. This result has not been reported before and we suggest it could reflect a number of effects. Firstly, we argue it is consistent with the operation of region-2 field-aligned currents (FACs), which are expected to flow upward into the ring current around 09:00MLT and downward out of the ring current around 14:00MLT. Secondly, we note that it is also consistent with a possible asymmetry in the radial distribution profile of current density (resulting in a higher peak at 4–4.5RE). We note that part of the enhanced current could reflect an increase in the mean AE activity (during the periods in which Cluster samples those MLT).
Abstract:
Background: Medication errors are common in primary care and are associated with considerable risk of patient harm. We tested whether a pharmacist-led, information technology-based intervention was more effective than simple feedback in reducing the number of patients at risk of measures related to hazardous prescribing and inadequate blood-test monitoring of medicines 6 months after the intervention. Methods: In this pragmatic, cluster randomised trial, general practices in the UK were stratified by research site and list size, and randomly assigned by a web-based randomisation service in block sizes of two or four to one of two groups. The practices were allocated to either computer-generated simple feedback for at-risk patients (control) or a pharmacist-led information technology intervention (PINCER), composed of feedback, educational outreach, and dedicated support. The allocation was masked to general practices, patients, pharmacists, researchers, and statisticians. Primary outcomes were the proportions of patients at 6 months after the intervention who had had any of three clinically important errors: non-selective non-steroidal anti-inflammatory drugs (NSAIDs) prescribed to those with a history of peptic ulcer without co-prescription of a proton-pump inhibitor; β blockers prescribed to those with a history of asthma; long-term prescription of angiotensin converting enzyme (ACE) inhibitor or loop diuretics to those 75 years or older without assessment of urea and electrolytes in the preceding 15 months. The cost per error avoided was estimated by incremental cost-effectiveness analysis. This study is registered with Controlled-Trials.com, number ISRCTN21785299. Findings: 72 general practices with a combined list size of 480 942 patients were randomised.
At 6 months’ follow-up, patients in the PINCER group were significantly less likely to have been prescribed a non-selective NSAID if they had a history of peptic ulcer without gastroprotection (OR 0.58, 95% CI 0.38–0.89); a β blocker if they had asthma (0.73, 0.58–0.91); or an ACE inhibitor or loop diuretic without appropriate monitoring (0.51, 0.34–0.78). PINCER has a 95% probability of being cost effective if the decision-maker’s ceiling willingness to pay reaches £75 per error avoided at 6 months. Interpretation: The PINCER intervention is an effective method for reducing a range of medication errors in general practices with computerised clinical records. Funding: Patient Safety Research Portfolio, Department of Health, England.
Abstract:
Markowitz showed that assets can be combined to produce an 'Efficient' portfolio that will give the highest level of portfolio return for any level of portfolio risk, as measured by the variance or standard deviation. These portfolios can then be connected to generate what is termed an 'Efficient Frontier' (EF). In this paper we discuss the calculation of the Efficient Frontier for combinations of assets, again using the spreadsheet Optimiser. To illustrate the derivation of the Efficient Frontier, we use the data from the Investment Property Databank Long Term Index of Investment Returns for the period 1971 to 1993. Many investors might require a certain specific level of holding or a restriction on holdings in at least some of the assets. Such additional constraints may be readily incorporated into the model to generate a constrained EF with upper and/or lower bounds. This can then be compared with the unconstrained EF to see whether the reduction in return is acceptable. To see the effect that these additional constraints may have, we adopt a fairly typical pension fund profile, with no more than 20% of the total held in Property. The paper shows that it is now relatively easy to use the Optimiser available in at least one spreadsheet (EXCEL) to calculate efficient portfolios for various levels of risk and return, both constrained and unconstrained, so as to be able to generate any number of Efficient Frontiers.
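As a rough illustration of the constrained versus unconstrained Efficient Frontier described above, the following sketch replaces the spreadsheet Optimiser with a coarse long-only grid search over three assets. The asset names, mean returns and covariance matrix are hypothetical placeholders, not figures from the Investment Property Databank index, and the function names are invented for this example.

```python
# Hypothetical annual figures for illustration only (NOT taken from the IPD index).
ASSETS = ("Equities", "Gilts", "Property")
MEAN = (0.12, 0.08, 0.10)
COV = (
    (0.0400, 0.0060, 0.0080),
    (0.0060, 0.0100, 0.0030),
    (0.0080, 0.0030, 0.0225),
)


def portfolio_stats(weights):
    """Return (expected return, standard deviation) of a weighted portfolio."""
    ret = sum(w * m for w, m in zip(weights, MEAN))
    var = sum(
        weights[i] * COV[i][j] * weights[j]
        for i in range(len(weights))
        for j in range(len(weights))
    )
    return ret, var ** 0.5


def efficient_frontier(max_property=1.0, steps=20):
    """Grid-search the long-only portfolios and keep the non-dominated ones.

    A portfolio is kept only if it offers a higher return than every
    portfolio with lower (or equal) risk. `max_property` is an upper
    bound on the Property weight.
    """
    candidates = []
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            k = steps - i - j
            weights = (i / steps, j / steps, k / steps)
            if weights[2] <= max_property + 1e-12:
                ret, risk = portfolio_stats(weights)
                candidates.append((risk, ret, weights))
    # Sort by risk, breaking ties in favour of the higher return.
    candidates.sort(key=lambda c: (c[0], -c[1]))
    frontier, best_ret = [], float("-inf")
    for risk, ret, weights in candidates:
        if ret > best_ret:
            frontier.append((risk, ret, weights))
            best_ret = ret
    return frontier


unconstrained = efficient_frontier()
constrained = efficient_frontier(max_property=0.20)  # pension-fund style cap
```

Comparing `constrained` with `unconstrained` point by point shows how much return, if any, is given up at each risk level by capping Property at 20%. A real study would use an optimiser rather than a grid, and would estimate the inputs from the return series.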
Abstract:
The K-Means algorithm for cluster analysis is one of the most influential and popular data mining methods. Its straightforward parallel formulation is well suited for distributed memory systems with reliable interconnection networks, such as massively parallel processors and clusters of workstations. However, in large-scale geographically distributed systems the straightforward parallel algorithm can be rendered useless by a single communication failure or high latency in communication paths. The lack of scalable and fault-tolerant global communication and synchronisation methods in large-scale systems has hindered the adoption of the K-Means algorithm for applications in large networked systems such as wireless sensor networks, peer-to-peer systems and mobile ad hoc networks. This work proposes a fully distributed K-Means algorithm (Epidemic K-Means) which does not require global communication and is intrinsically fault tolerant. The proposed distributed K-Means algorithm provides a clustering solution which can approximate the solution of an ideal centralised algorithm over the aggregated data as closely as desired. A comparative performance analysis is carried out against state-of-the-art sampling methods and shows that the proposed method overcomes the limitations of the sampling-based approaches for skewed cluster distributions. The experimental analysis confirms that the proposed algorithm is very accurate and fault tolerant under unreliable network conditions (message loss and node failures) and is suitable for asynchronous networks of very large and extreme scale.
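The idea of approximating a centralised result without global communication can be illustrated with pairwise gossip averaging. This is only a minimal sketch of a generic epidemic aggregation step, not the authors' Epidemic K-Means protocol: it assumes a fully connected, loss-free network, a single one-dimensional cluster, and hypothetical function names and data.

```python
import random


def gossip_average(values, rounds, seed=0):
    """Pairwise gossip: two random nodes repeatedly replace their values
    with the mean of the pair. The network-wide sum is conserved, so every
    node's value drifts towards the global average."""
    rng = random.Random(seed)
    state = list(values)
    for _ in range(rounds):
        a, b = rng.sample(range(len(state)), 2)
        mean = (state[a] + state[b]) / 2.0
        state[a] = state[b] = mean
    return state


def gossip_centroid(local_sums, local_counts, rounds=2000, seed=0):
    """Each node estimates the global centroid of one 1-D cluster as
    average(sum) / average(count), using only pairwise exchanges."""
    sums = gossip_average(local_sums, rounds, seed)
    counts = gossip_average(local_counts, rounds, seed)
    return [s / c for s, c in zip(sums, counts)]


# Hypothetical per-node partial results for a single cluster.
local_sums = [10.0, 4.0, 7.5, 0.5, 12.0, 3.0, 9.0, 2.0]
local_counts = [5, 2, 3, 1, 6, 2, 4, 1]
estimates = gossip_centroid(local_sums, local_counts)
exact = sum(local_sums) / sum(local_counts)  # what a central node would compute
```

After enough rounds every entry of `estimates` is close to `exact`, and the number of rounds controls how close. Handling message loss and node failures, which the abstract emphasises, would need additional machinery that this sketch leaves out.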
Abstract:
The impending threat of global climate change and its regional manifestations is among the most important and urgent problems facing humanity. Society needs accurate and reliable estimates of changes in the probability of regional weather variations to develop science-based adaptation and mitigation strategies. Recent advances in weather prediction and in our understanding and ability to model the climate system suggest that it is both necessary and possible to revolutionize climate prediction to meet these societal needs. However, the scientific workforce and the computational capability required to bring about such a revolution are not available in any single nation. Motivated by the success of internationally funded infrastructure in other areas of science, this paper argues that, because of the complexity of the climate system, and because the regional manifestations of climate change are mainly through changes in the statistics of regional weather variations, the scientific and computational requirements to predict its behavior reliably are so enormous that the nations of the world should create a small number of multinational high-performance computing facilities dedicated to the grand challenges of developing the capabilities to predict climate variability and change on both global and regional scales over the coming decades. Such facilities will play a key role in the development of next-generation climate models, build global capacity in climate research, nurture a highly trained workforce, and engage the global user community, policy-makers, and stakeholders. We recommend the creation of a small number of multinational facilities with computer capability at each facility of about 20 petaflops in the near term, about 200 petaflops within five years, and 1 exaflop by the end of the next decade. Each facility should have sufficient scientific workforce to develop and maintain the software and data analysis infrastructure.
Such facilities will enable questions of what resolution, both horizontal and vertical, in atmospheric and ocean models, is necessary for more confident predictions at the regional and local level. Current limitations in computing power have placed severe limitations on such an investigation, which is now badly needed. These facilities will also provide the world's scientists with the computational laboratories for fundamental research on weather–climate interactions using 1-km resolution models and on atmospheric, terrestrial, cryospheric, and oceanic processes at even finer scales. Each facility should have enabling infrastructure including hardware, software, and data analysis support, and scientific capacity to interact with the national centers and other visitors. This will accelerate our understanding of how the climate system works and how to model it. It will ultimately enable the climate community to provide society with climate predictions, which are based on our best knowledge of science and the most advanced technology.
Abstract:
Pocket Data Mining (PDM) is our new term describing collaborative mining of streaming data in mobile and distributed computing environments. With large volumes of data streams now available for subscription on our smart mobile phones, using this data for decision making with data stream mining techniques has become achievable owing to the increasing power of these handheld devices. Wireless communication among these devices using Bluetooth and WiFi technologies has opened the door wide for collaborative mining among mobile devices within the same range that are running data mining techniques targeting the same application. This paper proposes a new architecture that we have prototyped for realizing the significant applications in this area. We propose using mobile software agents in this application for several reasons. Most importantly, the autonomic intelligent behaviour of agent technology has been the driving force for using it in this application. Other efficiency reasons are discussed in detail in this paper. Experimental results showing the feasibility of the proposed architecture are presented and discussed.
Abstract:
The P-found protein folding and unfolding simulation repository is designed to allow scientists to perform analyses across large, distributed simulation data sets. There are two storage components in P-found: a primary repository of simulation data and a data warehouse. Here we demonstrate how grid technologies can support multiple, distributed P-found installations. In particular we look at two aspects: first, how grid data management technologies can be used to access the distributed data warehouses; and second, how the grid can be used to transfer analysis programs to the primary repositories. The latter is an important and challenging aspect of P-found because the data volumes involved are too large to be centralised. The grid technologies we are developing with the P-found system will allow new large data sets of protein folding simulations to be accessed and analysed in novel ways, with significant potential for enabling new scientific discoveries.
Abstract:
Generally, classifiers tend to overfit if there is noise in the training data or there are missing values. Ensemble learning methods are often used to improve a classifier's classification accuracy. Most ensemble learning approaches aim to improve the classification accuracy of decision trees. However, alternative classifiers to decision trees exist. The recently developed Random Prism ensemble learner for classification aims to improve an alternative classification rule induction approach, the Prism family of algorithms, which addresses some of the limitations of decision trees. However, Random Prism suffers, like any ensemble learner, from a high computational overhead due to replication of the data and the induction of multiple base classifiers. Hence even modest-sized datasets may impose a computational challenge to ensemble learners such as Random Prism. Parallelism is often used to scale up algorithms to deal with large datasets. This paper investigates parallelisation for Random Prism, implements a prototype and evaluates it empirically using a Hadoop computing cluster.
Abstract:
Purpose: This paper aims to design an evaluation method that enables an organization to assess its current IT landscape and provide readiness assessment prior to Software as a Service (SaaS) adoption. Design/methodology/approach: The research employs a mix of quantitative and qualitative approaches for conducting an IT application assessment. Quantitative data such as end users' feedback on the IT applications contribute to the technical impact on efficiency and productivity. Qualitative data such as business domain, business services and IT application cost drivers are used to determine the business value of the IT applications in an organization. Findings: The assessment of IT applications leads to decisions on the suitability of each IT application for migration to a cloud environment. Research limitations/implications: The evaluation of how a particular IT application impacts on a business service is based on logical interpretation. A data mining method is suggested in order to derive the patterns of the IT application capabilities. Practical implications: This method has been applied in a local council in the UK. This helps the council to decide the future status of the IT applications for cost-saving purposes.
Abstract:
Global communication requirements and load imbalance of some parallel data mining algorithms are the major obstacles to exploiting the computational power of large-scale systems. This work investigates how non-uniform data distributions can be exploited to remove the global communication requirement and to reduce the communication cost in parallel data mining algorithms and, in particular, in the k-means algorithm for cluster analysis. In the straightforward parallel formulation of the k-means algorithm, data and computation loads are uniformly distributed over the processing nodes. This approach has excellent load balancing characteristics that may suggest it could scale up to large and extreme-scale parallel computing systems. However, at each iteration step the algorithm requires a global reduction operation which hinders the scalability of the approach. This work studies a different parallel formulation of the algorithm where the requirement of global communication is removed, while maintaining the same deterministic nature of the centralised algorithm. The proposed approach exploits a non-uniform data distribution which can be either found in real-world distributed applications or can be induced by means of multi-dimensional binary search trees. The approach can also be extended to accommodate an approximation error which allows a further reduction of the communication costs. The effectiveness of the exact and approximate methods has been tested in a parallel computing system with 64 processors and in simulations with 1024 processing element
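The straightforward parallel formulation mentioned above, with its per-iteration global reduction, can be sketched as follows. This is a single-process simulation under stated assumptions: the "nodes" are plain Python lists, the reduction is an ordinary sum standing in for an all-reduce over a real interconnect, and all function names and data are hypothetical. It shows the baseline the abstract criticises, not the proposed non-uniform method.

```python
def nearest(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
    )


def local_step(partition, centroids):
    """Work done independently on one node: per-cluster partial sums and counts."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for point in partition:
        c = nearest(point, centroids)
        counts[c] += 1
        for d in range(dim):
            sums[c][d] += point[d]
    return sums, counts


def global_reduce(partials):
    """Stand-in for the global reduction: element-wise sum over all nodes."""
    k, dim = len(partials[0][1]), len(partials[0][0][0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for node_sums, node_counts in partials:
        for c in range(k):
            counts[c] += node_counts[c]
            for d in range(dim):
                sums[c][d] += node_sums[c][d]
    return sums, counts


def parallel_kmeans(partitions, centroids, iterations=10):
    """Every iteration ends with one global reduction, the step that limits scalability."""
    for _ in range(iterations):
        partials = [local_step(part, centroids) for part in partitions]
        sums, counts = global_reduce(partials)
        centroids = [
            [s / n for s in row] if n else old  # keep an empty cluster's centroid
            for row, n, old in zip(sums, counts, centroids)
        ]
    return centroids


# Hypothetical data split across two "nodes".
node_a = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0)]
node_b = [(2.0, 0.0), (2.0, 2.0), (10.0, 12.0), (12.0, 10.0), (12.0, 12.0)]
result = parallel_kmeans([node_a, node_b], [[1.0, 1.0], [9.0, 9.0]])
```

Because the reduction sums exactly the same quantities a central node would, the result matches running k-means on the concatenated data, which is why this formulation is deterministic but communication-bound.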
Abstract:
Objective: To undertake a process evaluation of pharmacists' recommendations arising in the context of a complex IT-enabled pharmacist-delivered randomised controlled trial (PINCER trial) to reduce the risk of hazardous medicines management in general practices. Methods: PINCER pharmacists manually recorded patients’ demographics, details of interventions recommended, actions undertaken by practice staff and time taken to manage individual cases of hazardous medicines management. Data were coded and double entered into SPSS v15, and then summarised using percentages for categorical data (with 95% CI) and, as appropriate, means (SD) or medians (IQR) for continuous data. Key findings: Pharmacists spent a median of 20 minutes (IQR 10, 30) reviewing medical records, recommending interventions and completing actions in each case of hazardous medicines management. Pharmacists judged 72% (95% CI 70, 74) (1463/2026) of cases of hazardous medicines management to be clinically relevant. Pharmacists recommended 2105 interventions in 74% (95% CI 73, 76) (1516/2038) of cases and 1685 actions were taken in 61% (95% CI 59, 63) (1246/2038) of cases; 66% (95% CI 64, 68) (1383/2105) of interventions recommended by pharmacists were completed and 5% (95% CI 4, 6) (104/2105) of recommendations were accepted by general practitioners (GPs), but not completed at the end of the pharmacists’ placement; the remaining recommendations were rejected or considered not relevant by GPs. Conclusions: The outcome measures were used to target pharmacist activity in general practice towards patients at risk from hazardous medicines management. Recommendations from trained PINCER pharmacists were found to be broadly acceptable to GPs and led to ameliorative action in the majority of cases. It seems likely that the approach used by the PINCER pharmacists could be employed by other practice pharmacists following appropriate training.
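The percentages with 95% CI reported above can be reproduced approximately from the raw counts. The abstract does not say which interval method was used, so the sketch below assumes the simple normal-approximation (Wald) interval; the function name is hypothetical.

```python
import math


def proportion_ci(successes, total, z=1.96):
    """Proportion with a normal-approximation (Wald) confidence interval.

    z=1.96 corresponds to a 95% interval. This is an assumed method;
    the abstract does not state how its intervals were computed.
    """
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Cases judged clinically relevant: 1463 of 2026.
p, lower, upper = proportion_ci(1463, 2026)
print(f"{100 * p:.0f}% (95% CI {100 * lower:.0f}, {100 * upper:.0f})")
# prints "72% (95% CI 70, 74)"
```

Other interval methods, such as the Wilson score interval, give slightly different bounds near rounding thresholds, so small discrepancies with a published figure do not necessarily indicate an error.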