811 results for Recommender Systems, Collaborative Filtering, Customization, Distributed Recommender
Abstract:
High rates of nutrient loading from agricultural and urban development have resulted in surface water eutrophication and groundwater contamination in regions of Ontario. In Lake Simcoe (Ontario, Canada), anthropogenic nutrient inputs have contributed to increased algal growth, low hypolimnetic oxygen concentrations, and impaired fish reproduction. An ambitious programme has been initiated to reduce phosphorus loads to the lake, aiming to achieve at least a 40% reduction in phosphorus loads by 2045. Achieving this target requires effective remediation strategies, which in turn rely on an improved understanding both of the controls on nutrient export from tributaries of Lake Simcoe and of phosphorus cycling within the lake itself. In this paper, we describe a new model structure for the integrated, dynamic, process-based model INCA-P that allows fully distributed applications suited to branched river networks. We demonstrate an application of this model to the Black River, a tributary of Lake Simcoe, and use INCA-P to simulate the fluxes of phosphorus entering the lake system, apportion phosphorus among different sources in the catchment, and explore future scenarios of land-use change and nutrient management to identify high-priority sites for implementing watershed best management practices.
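The branched, fully distributed network structure described above can be pictured as a tree of river reaches, each accumulating the phosphorus exported from its own sub-catchment plus everything arriving from upstream. The sketch below illustrates only that accumulation; it is a hypothetical simplification, not the INCA-P implementation, which simulates dynamic daily process kinetics within each reach.

```python
# Illustrative sketch (not the actual INCA-P code): routing phosphorus
# loads through a branched network of river reaches, each reach summing
# its upstream inputs plus its own local (diffuse + point-source) export.

from dataclasses import dataclass, field

@dataclass
class Reach:
    name: str
    local_load_kg: float  # P exported to this reach from its own sub-catchment
    upstream: list = field(default_factory=list)  # reaches draining into this one

    def outflow_load_kg(self) -> float:
        """Total P load leaving the reach: local export plus all upstream inputs."""
        return self.local_load_kg + sum(r.outflow_load_kg() for r in self.upstream)

# Hypothetical three-branch network draining to a lake outlet reach.
headwater_a = Reach("headwater_a", local_load_kg=120.0)
headwater_b = Reach("headwater_b", local_load_kg=80.0)
outlet = Reach("outlet", local_load_kg=40.0, upstream=[headwater_a, headwater_b])

print(outlet.outflow_load_kg())  # 240.0 kg of P delivered to the lake
```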
Abstract:
Oxford University Press’s response to technological change in printing and publishing processes in this period can be considered in three phases: an initial period when the computerization of typesetting was seen as offering both cost savings and the ability to produce new editions of existing works more quickly; an intermediate phase when the emergence of standards in desktop computing allowed experiments with the sale of software as well as packaged electronic publications; and a third phase when the availability of the World Wide Web as a means of distribution allowed OUP to return to publishing in its traditional areas of strength, albeit in new formats. Each of these phases demonstrates a tension between a desire to develop centralized systems and expertise, and a recognition that dynamic publishing depends on distributed decision-making and innovation. Alongside these developments in production and distribution lay developments in computer support for managerial and collaborative publishing processes, often involving the same personnel and sometimes the same equipment.
Abstract:
Recently, major processor manufacturers have announced a dramatic shift in their paradigm for increasing computing power over the coming years. Instead of focusing on faster clock speeds and more powerful single-core CPUs, the trend clearly goes towards multi-core systems. This will also result in a paradigm shift in the development of algorithms for computationally expensive tasks, such as data mining applications. Obviously, work on parallel algorithms is not new per se, but concentrated efforts in the many application domains are still missing. Multi-core systems, but also clusters of workstations and even large-scale distributed computing infrastructures, provide new opportunities and pose new challenges for the design of parallel and distributed algorithms. Since data mining and machine learning systems rely on high-performance computing, research on the corresponding algorithms must be at the forefront of parallel algorithm research in order to keep pushing data mining and machine learning applications to be more powerful and, especially for the former, interactive. To bring together researchers and practitioners working in this exciting field, a workshop on parallel data mining was organized as part of PKDD/ECML 2006 (Berlin, Germany). The six contributions selected for the program describe various aspects of data mining and machine learning approaches featuring low to high degrees of parallelism. The first contribution addresses the classic problem of distributed association rule mining and focuses on communication efficiency to improve the state of the art. After this, a parallelization technique for speeding up decision tree construction by means of thread-level parallelism for shared-memory systems is presented. The next paper discusses the design of a parallel approach to the frequent subgraph mining problem for distributed-memory systems; this approach is based on a hierarchical communication topology to solve issues related to multi-domain computational environments. The fourth paper describes the combined use and customization of software packages to facilitate top-down parallelism in the tuning of Support Vector Machines (SVMs), and the next contribution presents an interesting idea concerning parallel training of Conditional Random Fields (CRFs) and motivates their use in labeling sequential data. The last contribution focuses on very efficient feature selection: it describes a parallel algorithm for feature selection from random subsets. Selecting the papers included in this volume would not have been possible without the help of an international Program Committee that provided detailed reviews for each paper. We would also like to thank Matthew Otey, who helped with publicity for the workshop.
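As a rough illustration of the thread-level parallelism applied to decision tree construction in the second contribution, candidate split attributes can be scored concurrently on a shared-memory machine. The sketch below is a hypothetical re-implementation of that general idea, not the workshop paper's code; in CPython, genuinely CPU-bound work would need processes or a GIL-free runtime, but the structure of the approach is the same.

```python
# Sketch: score candidate split attributes in parallel threads, then pick
# the split with the lowest weighted Gini impurity. Binary features assumed.

from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def gini(labels):
    """Gini impurity of a label list."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def split_score(rows, labels, feature):
    """Weighted Gini impurity after splitting on one binary feature."""
    left = [y for x, y in zip(rows, labels) if x[feature] == 0]
    right = [y for x, y in zip(rows, labels) if x[feature] == 1]
    n = len(labels)
    score = sum(len(part) / n * gini(part) for part in (left, right) if part)
    return feature, score

def best_split(rows, labels, n_features, n_threads=4):
    # Each thread evaluates a different candidate attribute.
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        results = pool.map(lambda f: split_score(rows, labels, f), range(n_features))
    return min(results, key=lambda fs: fs[1])

rows = [(0, 1), (1, 1), (1, 0), (0, 0)]
labels = [0, 1, 1, 0]
print(best_split(rows, labels, n_features=2))  # feature 0 separates perfectly: (0, 0.0)
```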
Abstract:
One goal in the development of distributed virtual environments (DVEs) is to create a system in which users are unaware of the distribution; that is, the distribution should be transparent. The paper begins by discussing the general issues in DVEs that might make this possible, and a system that allows some level of distribution transparency is described. The system described suffers from effects of inconsistency, which in turn cause undesirable visual effects. The causal surface is introduced as a solution that removes these visual effects. The paper then introduces two determining factors of distribution transparency, relating to user perception and performance. With regard to these factors, two hypotheses are stated relating to the causal surface. A user trial with forty-five subjects is used to validate the hypotheses. A discussion of the results of the trial concludes that the causal surface solution does significantly improve distribution transparency in a DVE.
Abstract:
Research to date has tended to concentrate on bandwidth considerations to increase scalability in distributed interactive simulation and virtual reality systems. This paper proposes that the major concern for latency in user interaction is the fundamental limit on communication rate imposed by the speed of light. Causal volumes and surfaces are introduced as a model of the limits that this fundamental delay places on causality. The concept of a virtual-world critical speed is introduced, which can be determined from the causal surface. The implications of the critical speed are discussed, and relativistic dynamics are used to constrain object speed, in the same way that speeds are bounded in the real world.
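One way to picture the final point: if objects must never outrun the causal limit, their dynamics can borrow the relativistic momentum update, so applied forces grow momentum without bound while the recovered speed only asymptotically approaches the critical speed. The following is an illustrative sketch of that idea under assumed parameters, not the paper's implementation; v_c stands in for the virtual-world critical speed.

```python
import math

def step_velocity(v, force, mass, dt, v_c):
    """Advance speed with relativistic-style dynamics so that |v| < v_c always.

    v_c plays the role the speed of light plays in special relativity: here
    it is the virtual-world critical speed implied by network latency.
    Illustrative sketch only.
    """
    gamma = 1.0 / math.sqrt(1.0 - (v / v_c) ** 2)
    p = gamma * mass * v              # "relativistic" momentum
    p += force * dt                   # momentum grows without bound...
    # ...but the recovered speed asymptotes to v_c:
    return p / math.sqrt(mass ** 2 + (p / v_c) ** 2)

v = 0.0
for _ in range(1000):
    v = step_velocity(v, force=10.0, mass=1.0, dt=0.1, v_c=50.0)
print(v)  # approaches, but never reaches, the critical speed of 50
```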
Abstract:
User interaction within a virtual environment may take various forms: a teleconferencing application will require users to speak to each other (Geak, 1993); in computer-supported co-operative working, an engineer may wish to pass an object to another user for examination; in a battlefield simulation (McDonough, 1992), users might exchange fire. In all cases it is necessary for the actions of one user to be presented to the others sufficiently quickly to allow realistic interaction. In this paper we take a fresh look at virtual reality operating systems by tackling the underlying issues of creating real-time multi-user environments.
Abstract:
Pocket Data Mining (PDM) is our new term describing collaborative mining of streaming data in mobile and distributed computing environments. With sheer amounts of data streams now available for subscription on our smart mobile phones, the potential of using these data for decision making with data stream mining techniques has become achievable owing to the increasing power of these handheld devices. Wireless communication among these devices using Bluetooth and WiFi technologies has opened the door wide to collaborative mining among mobile devices that are within range of one another and are running data mining techniques targeting the same application. This paper proposes a new architecture, which we have prototyped, for realizing significant applications in this area. We propose using mobile software agents in this setting for several reasons. Most importantly, the autonomous, intelligent behaviour of agent technology has been the driving force for using it in this application. Other efficiency reasons are discussed in detail in this paper. Experimental results showing the feasibility of the proposed architecture are presented and discussed.
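The core of the mobile-agent argument is that it is cheaper to move the mining logic to the data than to ship raw streams between phones. The toy sketch below illustrates only that pattern; all names are hypothetical, and the actual prototype runs real mobile-agent middleware over Bluetooth/WiFi rather than in-process "visits".

```python
# Toy sketch of the mobile-agent idea behind PDM (names are hypothetical).
# An agent carries its mining logic from device to device and accumulates
# results locally, instead of shipping raw stream data around.

class MinerAgent:
    def __init__(self, task):
        self.task = task      # a function applied to each device's local stream
        self.results = []

    def visit(self, device):
        """'Migrate' to a device and mine its stream in place."""
        self.results.append((device.name, self.task(device.stream)))

class Device:
    def __init__(self, name, stream):
        self.name, self.stream = name, stream

# Hypothetical usage: compute the positive-label rate on each phone's stream.
phones = [Device("phone_a", [1, 0, 1, 1]), Device("phone_b", [0, 0, 1])]
agent = MinerAgent(task=lambda s: sum(s) / len(s))
for p in phones:
    agent.visit(p)
print(agent.results)  # [('phone_a', 0.75), ('phone_b', 0.333...)]
```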
Abstract:
Pocket Data Mining (PDM) describes the full process of analysing data streams in mobile ad hoc distributed environments. Advances in mobile devices such as smart phones and tablet computers have made it possible for a wide range of applications to run in such an environment. In this paper, we propose the adoption of data stream classification techniques for PDM. A thorough experimental study provides evidence that running heterogeneous (different) or homogeneous (similar) data stream classification techniques over vertically partitioned data (data partitioned according to the feature space) yields performance comparable to batch and centralised learning techniques.
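Vertical partitioning means each device trains only on its own subset of the feature space, and the devices' predictions are then combined. The sketch below shows that scheme in miniature, with scikit-learn decision trees as stand-ins; the feature split, data, and models are illustrative assumptions, whereas the paper evaluates genuine data stream classifiers.

```python
# Hedged sketch of classification over vertically partitioned data: each
# "device" sees only a subset of the features, trains its own model, and
# the devices' votes are combined by simple majority.

import numpy as np
from collections import Counter
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = (X[:, 0] + X[:, 3] > 0).astype(int)        # synthetic labels

partitions = [[0, 1], [2, 3], [4, 5]]          # disjoint feature subsets
models = [DecisionTreeClassifier(max_depth=3).fit(X[:, p], y) for p in partitions]

def vote(x):
    """Majority vote over the per-partition predictions for one instance."""
    preds = [m.predict(x[p].reshape(1, -1))[0] for m, p in zip(models, partitions)]
    return Counter(preds).most_common(1)[0][0]

print(vote(X[0]))  # ensemble prediction for the first instance
```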
Abstract:
The P-found protein folding and unfolding simulation repository is designed to allow scientists to perform data mining and other analyses across large, distributed simulation data sets. There are two storage components in P-found: a primary repository of simulation data that is used to populate the second component, and a data warehouse that contains important molecular properties. These properties may be used for data mining studies. Here we demonstrate how grid technologies can support multiple, distributed P-found installations. In particular, we look at two aspects: firstly, how grid data management technologies can be used to access the distributed data warehouses; and secondly, how the grid can be used to transfer analysis programs to the primary repositories — this is an important and challenging aspect of P-found, due to the large data volumes involved and the desire of scientists to maintain control of their own data. The grid technologies we are developing with the P-found system will allow new large data sets of protein folding simulations to be accessed and analysed in novel ways, with significant potential for enabling scientific discovery.
Abstract:
Reduced flexibility of low-carbon generation could pose new challenges for future energy systems. Both demand response and distributed storage may have a role to play in supporting future system balancing. This paper reviews how these technically different, but functionally similar, approaches compare and compete with one another. Household survey data are used to test the effectiveness of price signals in delivering demand responses for appliances with a high degree of agency. The underlying unit of storage for different demand response options is discussed, with particular focus on the ability to enhance demand-side flexibility in the residential sector. We conclude that a broad range of options, with different modes of storage, may need to be considered if residential demand flexibility is to be maximised.
Abstract:
We consider the problem of discrete time filtering (intermittent data assimilation) for differential equation models and discuss methods for its numerical approximation. The focus is on methods based on ensemble/particle techniques and on the ensemble Kalman filter technique in particular. We summarize as well as extend recent work on continuous ensemble Kalman filter formulations, which provide a concise dynamical systems formulation of the combined dynamics-assimilation problem. Possible extensions to fully nonlinear ensemble/particle based filters are also outlined using the framework of optimal transportation theory.
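For concreteness, the discrete analysis step that the continuous formulations generalize can be written in a few lines. The following is a minimal sketch of a stochastic (perturbed-observation) ensemble Kalman filter update with a linear observation operator; the dimensions, prior, and noise levels are illustrative assumptions, and the paper's continuous formulations recast this update as a dynamical system rather than a single discrete step.

```python
import numpy as np

def enkf_analysis(ensemble, y_obs, H, obs_cov, rng):
    """One stochastic EnKF analysis step.

    ensemble: (n_members, n_state); y_obs: (n_obs,); H: (n_obs, n_state).
    """
    n, _ = ensemble.shape
    X = ensemble - ensemble.mean(axis=0)           # state anomalies
    Y = X @ H.T                                    # observation-space anomalies
    P_yy = Y.T @ Y / (n - 1) + obs_cov             # innovation covariance
    P_xy = X.T @ Y / (n - 1)                       # cross covariance
    K = P_xy @ np.linalg.inv(P_yy)                 # Kalman gain
    # Perturb the observation independently for each ensemble member.
    y_pert = y_obs + rng.multivariate_normal(np.zeros(len(y_obs)), obs_cov, size=n)
    innovations = y_pert - ensemble @ H.T
    return ensemble + innovations @ K.T

rng = np.random.default_rng(1)
ens = rng.normal(loc=2.0, size=(50, 2))            # prior ensemble
H = np.array([[1.0, 0.0]])                         # observe the first component
updated = enkf_analysis(ens, np.array([0.5]), H, np.array([[0.1]]), rng)
print(updated.mean(axis=0))                        # mean pulled toward the observation
```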
Abstract:
BACKGROUND: Reduction of vegetation height is recommended as a management strategy for controlling rodent pests of rice in South-east Asia, but there are limited field data to assess its effectiveness. The breeding biology of the main pest species of rodent in the Philippines, Rattus tanezumi, suggests that habitat manipulation in irrigated rice–coconut cropping systems may be an effective strategy to limit the quality and availability of their nesting habitat. The authors imposed a replicated manipulation of vegetation cover in adjacent coconut groves during a single rice-cropping season, and added artificial nest sites to facilitate capture and culling of young. RESULTS: Three trapping sessions in four rice fields (two treatments, two controls) adjacent to coconut groves led to the capture of 176 R. tanezumi, 12 Rattus exulans and seven Chrotomys mindorensis individuals. There was no significant difference in overall abundance between crop stages or between treatments, and there was no treatment effect on damage to tillers or rice yield. Only two R. tanezumi were caught at the artificial nest sites. CONCLUSION: Habitat manipulation to reduce the quality of R. tanezumi nesting habitat adjacent to rice fields is not effective as a lone rodent management tool in rice–coconut cropping systems.
Abstract:
A new model has been developed for assessing multiple sources of nitrogen in catchments. The model (INCA) is process based and uses reaction kinetic equations to simulate the principal mechanisms operating. The model allows for plant uptake and surface and sub-surface pathways, and can simulate up to six land uses simultaneously. The model can be applied to a catchment as a semi-distributed simulation and has an inbuilt multi-reach structure for river systems. Sources of nitrogen can be from atmospheric deposition, from the terrestrial environment (e.g. agriculture, leakage from forest systems, etc.), from urban areas, or from direct discharges via sewage or intensive farm units. The model runs on a daily time step and can provide information as time series at key sites, as profiles down river systems, or as statistical distributions. The process model is described here; in a companion paper the model is applied to the River Tywi catchment in South Wales and the Great Ouse in Bedfordshire.
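The phrase "reaction kinetic equations" here means coupled first-order transformations between nitrogen stores, integrated on the model's daily time step. The sketch below shows that style of calculation for two pools (nitrate and ammonium) with plant uptake and atmospheric deposition; the pools, rate constants, and units are illustrative assumptions, not INCA's actual parameterisation.

```python
# Hedged sketch of first-order land-phase nitrogen kinetics on a daily
# time step, in the general style of a model like INCA. All rate constants
# and initial stores are illustrative.

def step_nitrogen(no3, nh4, k_nitrif=0.05, k_uptake_no3=0.02,
                  k_uptake_nh4=0.01, deposition=0.3, dt=1.0):
    """Advance the NO3 and NH4 stores (kg N / ha) by one day."""
    nitrification = k_nitrif * nh4               # NH4 -> NO3 (first order)
    uptake_no3 = k_uptake_no3 * no3              # plant uptake of nitrate
    uptake_nh4 = k_uptake_nh4 * nh4              # plant uptake of ammonium
    no3 += dt * (nitrification - uptake_no3 + deposition)
    nh4 += dt * (-nitrification - uptake_nh4)
    return no3, nh4

no3, nh4 = 10.0, 5.0                             # hypothetical initial stores
for day in range(365):                           # one simulated year
    no3, nh4 = step_nitrogen(no3, nh4)
print(round(no3, 1), round(nh4, 1))              # stores after one year
```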
Abstract:
Tagging supports the retrieval and categorization of online content based on the tags users choose. A number of models of tagging behaviour have been proposed to identify factors considered to affect taggers, such as users' tagging history. In this paper, we use semiotic analysis and activity theory to study the effect the system designer has on tagging behaviour. The framework we use shows the components that comprise a tagging system and how they interact to direct tagging behaviour. We analysed two collaborative tagging systems, CiteULike and Delicious, by applying our framework to their components. Using datasets from both systems, we found that 35% of CiteULike users did not provide tags, compared with only 0.1% of Delicious users. This difference was directly linked to the type of tools the system designer provides to support tagging.
Abstract:
Surface temperature is a key aspect of weather and climate, but the term may refer to different quantities that play interconnected roles and are observed by different means. In a community-based activity in June 2012, the EarthTemp Network brought together 55 researchers from five continents to improve the interaction between scientific communities that focus on surface temperature in particular domains, to exploit the strengths of different observing systems, and to better meet the needs of different communities. The workshop identified key needs for progress towards meeting scientific and societal requirements for surface temperature understanding and information, which are presented in this community paper. A "whole-Earth" perspective is required, with more integrated, collaborative approaches to observing and understanding Earth's various surface temperatures. It is necessary to build understanding of the relationships between different surface temperatures, where presently inadequate, and to undertake large-scale systematic intercomparisons. Datasets need to be easier to obtain and exploit for a wide constituency of users, with the differences and complementarities communicated in readily understood terms, and realistic and consistent uncertainty information provided. Steps were also recommended to curate and make available data that are presently inaccessible, to develop new observing systems, and to build capacities to accelerate progress in the accuracy and usability of surface temperature datasets.