874 resultados para Distributed data access
Resumo:
This paper argues for the importance of retaining a map library presence on UK university campuses at a time when many are under threat of closure, and access to geospatial data is increasingly moving to web-based services. It is suggested that the need for local expertise is undiminished and map curators need to redefine themselves as geoinformation specialists, preserving their paper map collections, but also meeting some of the challenges of GIS, and contributing to national developments in the construction of distributed geolibraries and the provision of metadata, especially with regard to local data sets.
Resumo:
In molecular biology, it is often desirable to find common properties in large numbers of drug candidates. One family of methods stems from the data mining community, where algorithms to find frequent graphs have received increasing attention over the past years. However, the computational complexity of the underlying problem and the large amount of data to be explored essentially render sequential algorithms useless. In this paper, we present a distributed approach to the frequent subgraph mining problem to discover interesting patterns in molecular compounds. This problem is characterized by a highly irregular search tree, whereby no reliable workload prediction is available. We describe the three main aspects of the proposed distributed algorithm, namely, a dynamic partitioning of the search space, a distribution process based on a peer-to-peer communication framework, and a novel receiverinitiated load balancing algorithm. The effectiveness of the distributed method has been evaluated on the well-known National Cancer Institute’s HIV-screening data set, where we were able to show close-to linear speedup in a network of workstations. The proposed approach also allows for dynamic resource aggregation in a non dedicated computational environment. These features make it suitable for large-scale, multi-domain, heterogeneous environments, such as computational grids.
Resumo:
Recently, two approaches have been introduced that distribute the molecular fragment mining problem. The first approach applies a master/worker topology, the second approach, a completely distributed peer-to-peer system, solves the scalability problem due to the bottleneck at the master node. However, in many real world scenarios the participating computing nodes cannot communicate directly due to administrative policies such as security restrictions. Thus, potential computing power is not accessible to accelerate the mining run. To solve this shortcoming, this work introduces a hierarchical topology of computing resources, which distributes the management over several levels and adapts to the natural structure of those multi-domain architectures. The most important aspect is the load balancing scheme, which has been designed and optimized for the hierarchical structure. The approach allows dynamic aggregation of heterogenous computing resources and is applied to wide area network scenarios.
Resumo:
Routine milk recording data, often covering many years, are available for approximately half the dairy herds of England and Wales. In addition to milk yield and quality, these data include production events that can be used to derive objective Key Performance Indicators (KPI) describing a herd's fertility and production. Recent developments in information systems give veterinarians and other technical advisers access to these KPIs on-line. In addition to reviewing individual herd performance, advisers can establish local benchmark groups to demonstrate the relative performance of similar herds in the vicinity. The use of existing milk recording data places no additional demands on farmer's time or resources. These developments could also readily be exploited by universities to introduce veterinary undergraduates to the realities of commercial dairy production.
Resumo:
This paper presents a simple Bayesian approach to sample size determination in clinical trials. It is required that the trial should be large enough to ensure that the data collected will provide convincing evidence either that an experimental treatment is better than a control or that it fails to improve upon control by some clinically relevant difference. The method resembles standard frequentist formulations of the problem, and indeed in certain circumstances involving 'non-informative' prior information it leads to identical answers. In particular, unlike many Bayesian approaches to sample size determination, use is made of an alternative hypothesis that an experimental treatment is better than a control treatment by some specified magnitude. The approach is introduced in the context of testing whether a single stream of binary observations are consistent with a given success rate p(0). Next the case of comparing two independent streams of normally distributed responses is considered, first under the assumption that their common variance is known and then for unknown variance. Finally, the more general situation in which a large sample is to be collected and analysed according to the asymptotic properties of the score statistic is explored. Copyright (C) 2007 John Wiley & Sons, Ltd.
Resumo:
The monophyly of the Peltophorum group, one of nine informal groups recognized by Polhill in the Caesalpinieae, was tested using sequence data from the trnL-F, rbcL, and rps16 regions of the chloroplast genome. Exemplars were included from all 16 genera of the Peltophorum group, and from 15 genera representing seven of the other eight informal groups in the tribe. The data were analyzed separately and in combined analyses using parsimony and Bayesian methods. The analysis method had little effect on the topology of well-supported relationships. The molecular data recovered a generally well-supported phylogeny with many intergeneric relationships resolved. Results show that the Peltophorum group as currently delimited is polyphyletic, but that eight genera plus one undescribed genus form a core Peltophorum group, which is referred to here as the Peltophorum group sensu stricto. These genera are Bussea, Conzattia, Colvillea, Delonix, Heteroflorum (inedit.), Lemuropisum, Parkinsonia, Peltophorum, and Schizolobium. The remaining eight genera of the Peltophorum group s.l. are distributed across the Caesalpinieae. Morphological support for the redelimited Peltophorum group and the other recovered clades was assessed, and no unique synapomorphy was found for the Peltophorum group s.s. A proposal for the reclassification of the Peltophorum group s.l. is presented.
Resumo:
Heterogeneity in lifetime data may be modelled by multiplying an individual's hazard by an unobserved frailty. We test for the presence of frailty of this kind in univariate and bivariate data with Weibull distributed lifetimes, using statistics based on the ordered Cox-Snell residuals from the null model of no frailty. The form of the statistics is suggested by outlier testing in the gamma distribution. We find through simulation that the sum of the k largest or k smallest order statistics, for suitably chosen k , provides a powerful test when the frailty distribution is assumed to be gamma or positive stable, respectively. We provide recommended values of k for sample sizes up to 100 and simple formulae for estimated critical values for tests at the 5% level.
Resumo:
Physiological parameters measured by an embedded body sensor system were demonstrated to respond to changes of the air temperature in an office environment. The thermal parameters were monitored with the use of a wireless sensor system that made possible to turn any existing room into a field laboratory. Two human subjects were monitored over daily activities and at various steady-state thermal conditions when the air temperature of the room was altered from 22-23°C to 25-28°C. The subjects indicated their thermal feeling on questionnaires. The measured skin temperature was distributed close to the calculated mean skin temperature corresponding to the given activity level. The variation of Galvanic Skin Response (GSR) reflected the evaporative heat loss through the body surfaces and indicated whether sweating occurred on the subjects. Further investigations are needed to fully evaluate the influence of thermal and other factors on the output given by the investigated body sensor system.
Resumo:
A wireless sensor network (WSN) is a group of sensors linked by wireless medium to perform distributed sensing tasks. WSNs have attracted a wide interest from academia and industry alike due to their diversity of applications, including home automation, smart environment, and emergency services, in various buildings. The primary goal of a WSN is to collect data sensed by sensors. These data are characteristic of being heavily noisy, exhibiting temporal and spatial correlation. In order to extract useful information from such data, as this paper will demonstrate, people need to utilise various techniques to analyse the data. Data mining is a process in which a wide spectrum of data analysis methods is used. It is applied in the paper to analyse data collected from WSNs monitoring an indoor environment in a building. A case study is given to demonstrate how data mining can be used to optimise the use of the office space in a building.
Resumo:
Distributed computing paradigms for sharing resources such as Clouds, Grids, Peer-to-Peer systems, or voluntary computing are becoming increasingly popular. While there are some success stories such as PlanetLab, OneLab, BOINC, BitTorrent, and SETI@home, a widespread use of these technologies for business applications has not yet been achieved. In a business environment, mechanisms are needed to provide incentives to potential users for participating in such networks. These mechanisms may range from simple non-monetary access rights, monetary payments to specific policies for sharing. Although a few models for a framework have been discussed (in the general area of a "Grid Economy"), none of these models has yet been realised in practice. This book attempts to fill this gap by discussing the reasons for such limited take-up and exploring incentive mechanisms for resource sharing in distributed systems. The purpose of this book is to identify research challenges in successfully using and deploying resource sharing strategies in open-source and commercial distributed systems.
Resumo:
In order to organize distributed educational resources efficiently, to provide active learners an integrated, extendible and cohesive interface to share the dynamically growing multimedia learning materials on the Internet, this paper proposes a generic resource organization model with semantic structures to improve expressiveness, scalability and cohesiveness. We developed an active learning system with semantic support for learners to access and navigate through efficient and flexible manner. We learning resources in an efficient and flexible manner. We provide facilities for instructors to manipulate the structured educational resources via a convenient visual interface. We also developed a resource discovering and gathering engine based on complex semantic associations for several specific topics.
Resumo:
This paper analyzes the performance of Enhanced relay-enabled Distributed Coordination Function (ErDCF) for wireless ad hoc networks under transmission errors. The idea of ErDCF is to use high data rate nodes to work as relays for the low data rate nodes. ErDCF achieves higher throughput and reduces energy consumption compared to IEEE 802.11 Distributed Coordination Function (DCF) in an ideal channel environment. However, there is a possibility that this expected gain may decrease in the presence of transmission errors. In this work, we modify the saturation throughput model of ErDCF to accurately reflect the impact of transmission errors under different rate combinations. It turns out that the throughput gain of ErDCF can still be maintained under reasonable link quality and distance.
Resumo:
In this paper we evaluate the performance of our earlier proposed enhanced relay-enabled distributed coordination function (ErDCF) for wireless ad hoc networks. The idea of ErDCF is to use high data rate nodes to work as relays for the low data rate nodes. ErDCF achieves higher throughput and reduced energy consumption compared to IEEE 802.11 distributed coordination function (DCF). This is a result of. 1) using relay which helps to increase the throughput and lower overall blocking time of nodes due to faster dual-hop transmission, 2) using dynamic preamble (i.e. using short preamble for the relay transmission) which further increases the throughput and lower overall blocking time and also by 3) reducing unnecessary overhearing (by other nodes not involved in transmission). We evaluate the throughput and energy performance of the ErDCF with different rate combinations. ErDCF (11,11) (ie. R1=R2=11 Mbps) yields a throughput improvement of 92.9% (at the packet length of 1000 bytes) and an energy saving of 72.2% at 50 nodes.
Resumo:
This paper analyzes the performance of enhanced relay-enabled distributed coordination function (ErDCF) for wireless ad hoc networks under transmission errors. The idea of ErDCF is to use high data rate nodes to work as relays for the low data rate nodes. ErDCF achieves higher throughput and reduces energy consumption compared to IEEE 802.11 distributed coordination function (DCF) in an ideal channel environment. However, there is a possibility that this expected gain may decrease in the presence of transmission errors. In this work, we modify the saturation throughput model of ErDCF to accurately reflect the impact of transmission errors under different rate combinations. It turns out that the throughput gain of ErDCF can still be maintained under reasonable link quality and distance.
Resumo:
Wireless local area networks (WLANs) have changed the way many of us communicate, work, play and live. Due to its popularity, dense deployments are becoming a norm in many cities around the world. However, increased interference and traffic demands can severely limit the aggregate throughput achievable if an effective channel assignment scheme is not used. In this paper, we propose an enhanced asynchronous distributed and dynamic channel assignment scheme that is simple to implement, does not require any knowledge of the throughput function, allows asynchronous channel switching by each access point (AP) and is superior in performance. Simulation results show that our proposed scheme converges much faster than previously reported synchronous schemes, with a reduction in convergence time and channel switches by tip to 73.8% and 30.0% respectively.