Abstract:
Recently, major processor manufacturers have announced a dramatic paradigm shift in how computing power will increase over the coming years. Instead of focusing on faster clock speeds and more powerful single-core CPUs, the trend clearly goes towards multi-core systems. This will also result in a paradigm shift in the development of algorithms for computationally expensive tasks, such as data mining applications. Work on parallel algorithms is of course not new per se, but concentrated efforts in the many application domains are still missing. Multi-core systems, but also clusters of workstations and even large-scale distributed computing infrastructures, provide new opportunities and pose new challenges for the design of parallel and distributed algorithms. Since data mining and machine learning systems rely on high-performance computing, research on the corresponding algorithms must be at the forefront of parallel algorithm research in order to keep pushing data mining and machine learning applications to be more powerful and, especially for the former, interactive. To bring together researchers and practitioners working in this exciting field, a workshop on parallel data mining was organized as part of PKDD/ECML 2006 (Berlin, Germany). The six contributions selected for the program describe various aspects of data mining and machine learning approaches featuring low to high degrees of parallelism: The first contribution addresses the classic problem of distributed association rule mining, focusing on communication efficiency to improve the state of the art. After this, a parallelization technique for speeding up decision tree construction by means of thread-level parallelism for shared-memory systems is presented. The next paper discusses the design of a parallel approach to the frequent subgraph mining problem for distributed-memory systems. This approach is based on a hierarchical communication topology to address issues arising in multi-domain computational environments. The fourth paper describes the combined use and customization of software packages to facilitate top-down parallelism in the tuning of Support Vector Machines (SVM), and the next contribution presents an interesting idea concerning parallel training of Conditional Random Fields (CRFs) and motivates their use in labeling sequential data. The last contribution focuses on highly efficient feature selection, describing a parallel algorithm for feature selection from random subsets. Selecting the papers included in this volume would not have been possible without the help of an international Program Committee that provided detailed reviews for each paper. We would also like to thank Matthew Otey, who helped with publicity for the workshop.
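To make the flavour of the last contribution concrete, here is a minimal Python sketch of feature selection from random subsets, parallelized across processes. It illustrates the general technique only, not the workshop paper's algorithm; the correlation-based relevance score and all names are hypothetical.

```python
from concurrent.futures import ProcessPoolExecutor

import numpy as np

def score_subset(args):
    """Score one random subset of features; the (toy) relevance score
    is each feature's absolute correlation with the label."""
    X, y, subset = args
    return [(j, abs(np.corrcoef(X[:, j], y)[0, 1])) for j in subset]

def parallel_feature_selection(X, y, n_subsets=8, subset_size=5, top_k=5):
    rng = np.random.default_rng(42)
    subsets = [rng.choice(X.shape[1], subset_size, replace=False)
               for _ in range(n_subsets)]
    # Each random subset is scored in its own process, mirroring the
    # idea of distributing independent subsets across parallel workers.
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(score_subset, [(X, y, s) for s in subsets]))
    best = {}
    for scores in results:
        for j, s in scores:
            best[j] = max(best.get(j, 0.0), s)
    return sorted(best, key=best.get, reverse=True)[:top_k]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 30))
    y = (X[:, 3] + X[:, 7] > 0).astype(float)
    print(parallel_feature_selection(X, y))  # features 3 and 7 should rank highly
```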
Abstract:
Markowitz showed that assets can be combined to produce an 'Efficient' portfolio that will give the highest level of portfolio return for any level of portfolio risk, as measured by the variance or standard deviation. These portfolios can then be connected to generate what is termed an 'Efficient Frontier' (EF). In this paper we discuss the calculation of the Efficient Frontier for combinations of assets, again using the spreadsheet Optimiser. To illustrate the derivation of the Efficient Frontier, we use the data from the Investment Property Databank Long Term Index of Investment Returns for the period 1971 to 1993. Many investors might require a specific level of holding, or a restriction on holdings, in at least some of the assets. Such additional constraints may be readily incorporated into the model to generate a constrained EF with upper and/or lower bounds. This can then be compared with the unconstrained EF to see whether the reduction in return is acceptable. To see the effect that these additional constraints may have, we adopt a fairly typical pension fund profile, with no more than 20% of the total held in Property. The paper shows that it is now relatively easy to use the Optimiser available in at least one spreadsheet (Excel) to calculate efficient portfolios for various levels of risk and return, both constrained and unconstrained, and thus to generate any number of Efficient Frontiers.
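The same mean-variance calculation performed in the paper with the spreadsheet Optimiser can be sketched programmatically. Below is a minimal Python illustration, not the paper's spreadsheet model: the expected returns and covariances are invented stand-ins for the IPD data, and the 20% property cap mirrors the constrained case described above.

```python
import numpy as np
from scipy.optimize import minimize

# Invented inputs standing in for the IPD 1971-1993 series;
# asset order is equities, gilts, property.
mu = np.array([0.11, 0.08, 0.10])              # expected annual returns
cov = np.array([[0.0256, 0.0040, 0.0060],
                [0.0040, 0.0081, 0.0030],
                [0.0060, 0.0030, 0.0121]])     # annual covariances

def frontier_point(target, property_cap=None):
    """Minimum-variance, fully invested, long-only weights achieving
    `target` return; `property_cap` optionally bounds asset 2."""
    n = len(mu)
    bounds = [(0.0, 1.0)] * n
    if property_cap is not None:
        bounds[2] = (0.0, property_cap)        # e.g. at most 20% in property
    cons = ({"type": "eq", "fun": lambda w: w.sum() - 1.0},
            {"type": "eq", "fun": lambda w: w @ mu - target})
    res = minimize(lambda w: w @ cov @ w, np.full(n, 1.0 / n),
                   method="SLSQP", bounds=bounds, constraints=cons)
    return res.x, np.sqrt(res.fun)

# Tracing both frontiers shows the cost, in extra risk, of the constraint.
for target in np.linspace(mu.min(), mu.max(), 5):
    _, free = frontier_point(target)
    _, capped = frontier_point(target, property_cap=0.20)
    print(f"return {target:.3f}: risk {free:.3f} unconstrained, {capped:.3f} capped")
```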
Abstract:
Dynamic multi-user interactions in a single networked virtual environment suffer from abrupt state transition problems due to communication delays arising from network latency: an action by one user only becomes apparent to another user after the communication delay. This results in a temporal suspension of the environment for the duration of the delay (the virtual world 'hangs'), followed by an abrupt jump to make up for the lost time so that the current state of the virtual world is displayed. These discontinuities appear unnatural and disconcerting to the users. This paper proposes a novel method of warping times associated with users to ensure that each user views a continuous version of the virtual world, such that no hangs or jumps occur despite other user interactions. Objects passed between users within the environment are parameterized, not by real time, but by a virtual local time, generated by continuously warping real time. This virtual time periodically realigns itself with real time as the virtual environment evolves. The concept of a local user dynamically warping the local time is also introduced. As a result, the users are shielded from viewing discontinuities within their virtual worlds, consequently enhancing the realism of the virtual environment.
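As a rough illustration of the time-warping idea (the mechanics below are assumed for illustration, not taken from the paper), a local virtual clock can run slightly fast after a delay until it realigns with real time, so the viewer sees a continuous, briefly accelerated world rather than a hang followed by a jump:

```python
import time

class WarpedClock:
    """Sketch of a virtual local clock that absorbs a network delay by
    warping time rather than jumping (hypothetical API, for illustration)."""

    def __init__(self, catchup_rate=1.25):
        self.rate = 1.0                   # 1.0 = running at real-time speed
        self.catchup_rate = catchup_rate  # > 1 while recovering lost time
        self.last_real = time.monotonic()
        self.virtual = self.last_real

    def on_delay(self, lag):
        # A communication delay of `lag` seconds: fall behind real time,
        # then start catching up smoothly instead of hanging and jumping.
        self.virtual -= lag
        self.rate = self.catchup_rate

    def now(self):
        real = time.monotonic()
        self.virtual += self.rate * (real - self.last_real)
        self.last_real = real
        if self.virtual >= real:          # virtual time has realigned
            self.virtual = real
            self.rate = 1.0
        return self.virtual               # parameterize object state by this
```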
Abstract:
The impending threat of global climate change and its regional manifestations is among the most important and urgent problems facing humanity. Society needs accurate and reliable estimates of changes in the probability of regional weather variations to develop science-based adaptation and mitigation strategies. Recent advances in weather prediction and in our understanding and ability to model the climate system suggest that it is both necessary and possible to revolutionize climate prediction to meet these societal needs. However, the scientific workforce and the computational capability required to bring about such a revolution are not available in any single nation. Motivated by the success of internationally funded infrastructure in other areas of science, this paper argues that, because of the complexity of the climate system, and because the regional manifestations of climate change appear mainly through changes in the statistics of regional weather variations, the scientific and computational requirements to predict its behavior reliably are so enormous that the nations of the world should create a small number of multinational high-performance computing facilities dedicated to the grand challenges of developing the capabilities to predict climate variability and change on both global and regional scales over the coming decades. Such facilities will play a key role in the development of next-generation climate models, build global capacity in climate research, nurture a highly trained workforce, and engage the global user community, policy-makers, and stakeholders. We recommend the creation of a small number of multinational facilities, with computing capability at each facility of about 20 petaflops in the near term, about 200 petaflops within five years, and 1 exaflop by the end of the next decade. Each facility should have a sufficient scientific workforce to develop and maintain the software and data analysis infrastructure. Such facilities will make it possible to determine what horizontal and vertical resolution is necessary in atmospheric and ocean models for more confident predictions at the regional and local level; current limits on computing power have severely constrained such investigation, which is now badly needed. These facilities will also provide the world's scientists with the computational laboratories for fundamental research on weather–climate interactions using 1-km resolution models and on atmospheric, terrestrial, cryospheric, and oceanic processes at even finer scales. Each facility should have enabling infrastructure, including hardware, software, and data analysis support, and the scientific capacity to interact with the national centers and other visitors. This will accelerate our understanding of how the climate system works and how to model it, and will ultimately enable the climate community to provide society with climate predictions based on our best scientific knowledge and the most advanced technology.
Abstract:
Pocket Data Mining (PDM) is our new term describing collaborative mining of streaming data in mobile and distributed computing environments. With vast numbers of data streams now available for subscription on our smart mobile phones, using these data for decision making with data stream mining techniques has become achievable owing to the increasing power of these handheld devices. Wireless communication among these devices using Bluetooth and WiFi technologies has opened the door wide to collaborative mining among mobile devices within the same range that are running data mining techniques targeting the same application. This paper proposes a new architecture, which we have prototyped, for realizing applications in this area. We propose using mobile software agents in this application for several reasons. Most importantly, the autonomic, intelligent behaviour of agent technology has been the driving force for using it in this application. Other efficiency reasons are discussed in detail in this paper. Experimental results showing the feasibility of the proposed architecture are presented and discussed.
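A toy sketch of the collaborative idea follows (an assumed design for illustration; the paper's prototyped architecture is not reproduced here): an agent visits nearby devices, collects each local stream classifier's vote on a shared task, and returns a majority decision.

```python
from collections import Counter

class Device:
    """A handheld device running a local data stream mining model."""
    def __init__(self, name, classify):
        self.name = name
        self.classify = classify

class MobileAgent:
    """Illustrative stand-in for a mobile software agent; a method call
    substitutes for actual code migration over Bluetooth/WiFi."""
    def __init__(self, task):
        self.task = task          # e.g. a feature vector to label
        self.votes = []

    def visit(self, device):
        self.votes.append(device.classify(self.task))

    def decision(self):
        return Counter(self.votes).most_common(1)[0][0]

devices = [Device("phone-a", lambda x: "spam" if x["caps"] > 5 else "ham"),
           Device("phone-b", lambda x: "spam" if x["links"] > 2 else "ham"),
           Device("phone-c", lambda x: "ham")]

agent = MobileAgent({"caps": 7, "links": 1})
for d in devices:
    agent.visit(d)
print(agent.decision())           # majority vote across the ad hoc network
```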
Abstract:
The P-found protein folding and unfolding simulation repository is designed to allow scientists to perform analyses across large, distributed simulation data sets. There are two storage components in P-found: a primary repository of simulation data and a data warehouse. Here we demonstrate how grid technologies can support multiple, distributed P-found installations. In particular, we look at two aspects: first, how grid data management technologies can be used to access the distributed data warehouses; and second, how the grid can be used to transfer analysis programs to the primary repositories. This is an important and challenging aspect of P-found because the data volumes involved are too large to be centralised. The grid technologies we are developing with the P-found system will allow new large data sets of protein folding simulations to be accessed and analysed in novel ways, with significant potential for enabling new scientific discoveries.
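The second aspect, shipping analysis programs to the primary repositories, follows the general "move the computation to the data" pattern. The sketch below is purely illustrative (all names are hypothetical, not the P-found API): each site runs the supplied analysis locally and only small summaries travel back.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-ins for remote primary repositories; in P-found these would be
# grid sites holding very large folding/unfolding trajectory sets.
repositories = {
    "site-A": {"sim1": [0.9, 0.6, 0.3], "sim2": [0.9, 0.7, 0.5]},
    "site-B": {"sim3": [0.8, 0.4, 0.2]},
}

def run_at_site(site, analysis):
    """Stand-in for grid job submission: run `analysis` where the data
    lives and return only the (small) per-simulation summaries."""
    return {name: analysis(traj) for name, traj in repositories[site].items()}

# The analysis shipped to the data: final value of a (toy) per-frame
# observable, e.g. fraction of native contacts at the end of each run.
final_value = lambda traj: traj[-1]

with ThreadPoolExecutor() as pool:
    for summary in pool.map(lambda s: run_at_site(s, final_value), repositories):
        print(summary)
```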
Abstract:
Purpose: This paper aims to design an evaluation method that enables an organization to assess its current IT landscape and provide a readiness assessment prior to Software as a Service (SaaS) adoption.
Design/methodology/approach: The research employs a mix of quantitative and qualitative approaches for conducting an IT application assessment. Quantitative data, such as end users' feedback on the IT applications, contribute to the technical impact on efficiency and productivity. Qualitative data, such as business domain, business services and IT application cost drivers, are used to determine the business value of the IT applications in an organization.
Findings: The assessment of IT applications leads to decisions on the suitability of each IT application for migration to a cloud environment.
Research limitations/implications: The evaluation of how a particular IT application impacts a business service is based on logical interpretation. A data mining method is suggested in order to derive patterns of IT application capabilities.
Practical implications: This method has been applied in a local council in the UK, helping the council to decide the future status of its IT applications for cost-saving purposes.
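A minimal sketch of how such an assessment might be scored is given below; the weights, thresholds, and field names are invented for illustration and are not the paper's published method.

```python
from dataclasses import dataclass

@dataclass
class ITApplication:
    name: str
    user_satisfaction: float   # quantitative: end-user survey score, 0-1
    business_value: float      # qualitative: analyst rating, 0-1
    annual_cost: float         # cost driver, in GBP

def migration_candidates(apps, value_cutoff=0.5, satisfaction_cutoff=0.6):
    """Low technical satisfaction plus low business value, ranked by
    cost, suggests applications worth migrating or retiring first."""
    ranked = sorted(apps, key=lambda a: a.annual_cost, reverse=True)
    return [a.name for a in ranked
            if a.user_satisfaction < satisfaction_cutoff
            and a.business_value < value_cutoff]

apps = [ITApplication("payroll", 0.8, 0.9, 120_000),
        ITApplication("legacy-gis", 0.4, 0.3, 90_000),
        ITApplication("intranet", 0.5, 0.4, 30_000)]
print(migration_candidates(apps))   # ['legacy-gis', 'intranet']
```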
Abstract:
This chapter introduces the latest practices and technologies in the interactive interpretation of environmental data. With environmental data becoming ever larger, more diverse and more complex, there is a need for a new generation of tools that provide new capabilities over and above those of the standard workhorses of science. These new tools aid the scientist in discovering interesting new features (and also problems) in large datasets by allowing the data to be explored interactively using simple, intuitive graphical tools. In this way, new discoveries are made that are commonly missed by automated batch data processing. This chapter discusses the characteristics of environmental science data, common current practice in data analysis and the supporting tools and infrastructure. New approaches are introduced and illustrated from the points of view of both the end user and the underlying technology. We conclude by speculating on future developments in the field and what must be achieved to fulfil this vision.
Abstract:
Soluble reactive phosphorus (SRP) plays a key role in eutrophication, a global problem decreasing habitat quality and in-stream biodiversity. Mitigation strategies are required to prevent SRP fluxes from exceeding critical levels, and must be robust in the face of potential changes in climate, land use and a myriad of other influences. To establish the longevity of these strategies it is therefore crucial to consider the sensitivity of catchments to multiple future stressors. This study evaluates how the water quality and hydrology of a major river system in the UK (the River Thames) respond to alterations in climate, land use and water resource allocations, and investigates how these changes impact the relative performance of management strategies over an 80-year period. In the River Thames, the relative contributions of SRP from diffuse and point sources vary seasonally. Diffuse sources of SRP from agriculture dominate during periods of high runoff, and point sources during low flow periods. SRP concentrations rose under any future scenario that increased either (a) surface runoff or (b) the area of cultivated land. Under these conditions, SRP was sourced from agriculture, and the most effective single mitigation measures were those that addressed diffuse SRP sources. Conversely, where future scenarios reduced flow, e.g. during winters of reservoir construction, the significance of point-source inputs increased, and mitigation measures addressing these inputs became more effective. In catchments with multiple point and diffuse sources of SRP, an all-encompassing effective mitigation approach is difficult to achieve with a single strategy. In order to attain maximum efficiency, multiple strategies might therefore be employed at different times and locations, to target the variable nature of dominant SRP sources and pathways.
Abstract:
The climatology of ozone produced by the Canadian Middle Atmosphere Model (CMAM) is presented. This three-dimensional global model incorporates the radiative feedbacks of ozone and water vapor calculated on-line with a photochemical module. This module includes a comprehensive gas-phase reaction set and a limited set of heterogeneous reactions to account for processes occurring on background sulphate aerosols. While transport is global, photochemistry is solved from about 400 hPa to the top of the model at ∼95 km. This approach provides a comprehensive representation of transport, emission, and photochemistry of various constituents from the surface to the mesopause region. A comparison of model results with observations indicates that the ozone distribution and variability are in agreement with observations throughout most of the model domain. Column ozone annual variation is represented to within 5–10% of the observations except at springtime high latitudes in the Southern Hemisphere. The vertical ozone distribution is generally well represented by the model up to the mesopause region. Nevertheless, in the upper stratosphere, the model generally underestimates the amount of ozone as well as the latitudinal tilting of ozone isopleths at high latitudes. Ozone variability is analyzed and compared with measurements. The comparison shows that the phase and amplitude of the seasonal variation, as well as shorter timescale variations, are well represented by the model at various latitudes and heights. Finally, the impact of incorporating ozone radiative feedback on the model climatology is isolated. It is found that the incorporation of ozone radiative feedback results in a cooling of ∼8 K in the summer stratopause region, which corrects a warm bias that results when climatological ozone is used.