Digital Library

146 results for Client-server distributed databases

Dietary intakes and food sources of phytoestrogens in the European Prospective Investigation into Cancer and Nutrition (EPIC) 24-hour dietary recall cohort

Abstract:

BACKGROUND/OBJECTIVES: Phytoestrogens are estradiol-like natural compounds found in plants that have been associated with protective effects against chronic diseases, including some cancers, cardiovascular diseases and osteoporosis. The purpose of this study was to estimate the dietary intake of phytoestrogens and to identify their food sources and their association with lifestyle factors in the European Prospective Investigation into Cancer and Nutrition (EPIC) cohort. SUBJECTS/METHODS: Single 24-hour dietary recalls were collected from 36 037 individuals from 10 European countries, aged 35–74 years, using a standardized computerized interview program (EPIC-Soft). An ad hoc food composition database on phytoestrogens (isoflavones, lignans, coumestans, enterolignans and equol) was compiled using data from available databases, in order to obtain and describe phytoestrogen intakes and their food sources across 27 redefined EPIC centres. RESULTS: Mean total phytoestrogen intake was highest in the UK health-conscious group (24.9 mg/day in men and 21.1 mg/day in women) and lowest in Greece for men (1.3 mg/day) and in Spain-Granada for women (1.0 mg/day). Northern European countries had higher intakes than southern countries. The main phytoestrogen contributors were isoflavones in both UK centres and lignans in the other EPIC cohorts. Age, body mass index, educational level, smoking status and physical activity were related to increased intakes of lignans, enterolignans and equol, but not to total phytoestrogen, isoflavone or coumestan intakes. In the UK cohorts, the major food sources of phytoestrogens were soy products. In the other EPIC cohorts, the dietary sources were more widely distributed among fruits, vegetables, soy products, cereal products, and non-alcoholic and alcoholic beverages. CONCLUSIONS: There was high variability in the dietary intake of total phytoestrogens and phytoestrogen subclasses, and in their food sources, across European regions.

The Climate-G testbed: towards large scale distributed data management for climate change

Abstract:

Climate-G is a large-scale distributed testbed devoted to climate change research. It is an unfunded effort that started in 2008 and involves a wide community in both Europe and the US. The testbed is an interdisciplinary effort involving partners from several institutions and combining expertise in climate change and computational science. Its main goal is to allow scientists to carry out geographical and cross-institutional data discovery, access, analysis, visualization and sharing of climate data. It represents an attempt to address, in a real environment, challenging data and metadata management issues. This paper presents a complete overview of the Climate-G testbed, highlighting the most important results achieved since the beginning of the project.

P-Prism: a computationally efficient approach to scaling up classification rule induction

Abstract:

Top Down Induction of Decision Trees (TDIDT) is the most commonly used method of constructing a model from a dataset in the form of classification rules to classify previously unseen data. Alternative algorithms have been developed, such as the Prism algorithm. Prism constructs modular rules that are qualitatively better than the rules induced by TDIDT. However, with the increasing size of databases, many existing rule learning algorithms have proved to be computationally expensive on large datasets. To tackle the problem of scalability, parallel classification rule induction algorithms have been introduced. As TDIDT is the most popular classifier, even though there are strongly competitive alternative algorithms, most parallel approaches to inducing classification rules are based on TDIDT. In this paper we describe work on a distributed classifier that induces classification rules in a parallel manner based on Prism.
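
For readers unfamiliar with Prism, the following is a minimal sketch of the basic serial covering strategy as it is commonly described, not the parallel P-Prism method of the paper, whose details the abstract does not give. The function name `prism`, the dictionary-based row format, the tie-breaking by coverage and the handling of clashes are all assumptions made for illustration.

```python
def prism(rows, target):
    """Induce modular rules as (conditions, class) pairs.

    rows: list of dicts mapping attribute names to categorical values.
    target: name of the class attribute.
    """
    rules = []
    for cls in sorted({r[target] for r in rows}):
        remaining = list(rows)  # each class starts again from the full training set
        while any(r[target] == cls for r in remaining):
            covered, conditions = remaining, {}
            # Specialise the rule until it covers only instances of cls.
            while any(r[target] != cls for r in covered):
                candidates = sorted({(a, r[a]) for r in covered for a in r
                                     if a != target and a not in conditions})
                if not candidates:
                    break  # clash: no attribute left to separate the classes

                def score(pair):
                    # Probability of cls given the attribute-value pair;
                    # ties are broken by the number of matching instances.
                    subset = [r for r in covered if r[pair[0]] == pair[1]]
                    hits = sum(r[target] == cls for r in subset)
                    return (hits / len(subset), hits)

                attr, value = max(candidates, key=score)
                conditions[attr] = value
                covered = [r for r in covered if r[attr] == value]
            rules.append((conditions, cls))
            # Remove the instances covered by the finished rule.
            covered_ids = {id(r) for r in covered}
            remaining = [r for r in remaining if id(r) not in covered_ids]
    return rules
```

Each rule is built independently of the others, which is what makes the output modular: a rule such as "windy = yes implies play = no" can be read and applied without reference to a tree structure.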

Grid computing solutions for distributed repositories of protein folding and unfolding simulations

Abstract:

The P-found protein folding and unfolding simulation repository is designed to allow scientists to perform analyses across large, distributed simulation data sets. There are two storage components in P-found: a primary repository of simulation data and a data warehouse. Here we demonstrate how grid technologies can support multiple, distributed P-found installations. In particular, we look at two aspects: first, how grid data management technologies can be used to access the distributed data warehouses; and second, how the grid can be used to transfer analysis programs to the primary repositories. The latter is an important and challenging aspect of P-found because the data volumes involved are too large to be centralised. The grid technologies we are developing with the P-found system will allow new large data sets of protein folding simulations to be accessed and analysed in novel ways, with significant potential for enabling new scientific discoveries.

Distributed Hoeffding trees for pocket data mining

Abstract:

Collaborative mining of distributed data streams in a mobile computing environment is referred to as Pocket Data Mining (PDM). Hoeffding tree techniques have been experimentally and analytically validated for data stream classification. In this paper, we have proposed, developed and evaluated the adoption of distributed Hoeffding trees for classifying streaming data in PDM applications. We have identified a realistic scenario in which different users equipped with smart mobile devices run a local Hoeffding tree classifier on a subset of the attributes. Thus, we have investigated the mining of vertically partitioned datasets with possible overlap of attributes, which is the more likely case. Our experimental results have validated the efficiency of our proposed model, which achieves promising accuracy for real deployment.
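
As background, Hoeffding trees take their name from the Hoeffding bound, which is used to decide when enough stream instances have been seen to commit to a split. The sketch below shows the standard form of that bound and the usual split test; it is not taken from the paper, and the function names and the example parameter values are assumptions.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon such that, with probability 1 - delta, the true mean of a
    variable with the given range lies within epsilon of the mean observed
    over n independent samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 value_range: float, delta: float, n: int) -> bool:
    """Split when the best attribute beats the runner-up by more than epsilon."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)
```

Because epsilon shrinks as n grows, a node that cannot yet separate its two best attributes simply waits for more instances, which is what allows the tree to be built incrementally over a stream.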

Distributed classification for pocket data mining

Abstract:

Distributed and collaborative data stream mining in a mobile computing environment is referred to as Pocket Data Mining (PDM). The large number of available data streams that smart phones can subscribe to or sense, coupled with the increasing computational power of handheld devices, motivates the development of PDM as a decision-making system. This emerging area of study was shown to be feasible in an earlier study using the technological enablers of mobile software agents and stream mining techniques [1]. A typical PDM process would start by having mobile agents roam the network to discover relevant data streams and resources. Then other (mobile) agents encapsulating stream mining techniques visit the relevant nodes in the network in order to build evolving data mining models. Finally, a third type of mobile agent roams the network, consulting the mining agents for a final collaborative decision when required by one or more users. In this paper, we propose the use of distributed Hoeffding trees and Naive Bayes classifiers in the PDM framework over vertically partitioned data streams. Mobile policing, health monitoring and stock market analysis are among the possible applications of PDM. An extensive experimental study is reported, showing the effectiveness of collaborative data mining with the two classifiers.

Homogeneous and heterogeneous distributed classification for pocket data mining

Abstract:

Pocket Data Mining (PDM) describes the full process of analysing data streams in mobile ad hoc distributed environments. Advances in mobile devices like smart phones and tablet computers have made it possible for a wide range of applications to run in such an environment. In this paper, we propose the adoption of data stream classification techniques for PDM. A thorough experimental study shows that running heterogeneous (different) or homogeneous (similar) data stream classification techniques over vertically partitioned data (data partitioned according to the feature space) results in performance comparable to batch and centralised learning techniques.
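
To make the idea of vertical partitioning concrete, here is a minimal sketch of a homogeneous setup in which each device trains an incremental categorical Naive Bayes model on its own subset of attributes, and the local predictions are combined. The class `LocalNaiveBayes`, the function `majority_vote` and the use of simple majority voting as the collaborative step are assumptions for illustration; the abstract does not specify how the decisions are combined.

```python
import math
from collections import Counter, defaultdict


class LocalNaiveBayes:
    """Categorical Naive Bayes that sees only its own subset of attributes."""

    def __init__(self, attributes):
        self.attributes = list(attributes)
        self.class_counts = Counter()
        self.value_counts = defaultdict(Counter)  # (attribute, class) -> value counts
        self.values = defaultdict(set)            # attribute -> values seen so far

    def learn(self, row, label):
        """Update the counts with one labelled stream instance."""
        self.class_counts[label] += 1
        for a in self.attributes:
            self.value_counts[(a, label)][row[a]] += 1
            self.values[a].add(row[a])

    def predict(self, row):
        """Return the most probable class (assumes learn was called at least once)."""
        total = sum(self.class_counts.values())

        def log_score(label):
            score = math.log(self.class_counts[label] / total)
            for a in self.attributes:
                # Laplace smoothing avoids zero probabilities for unseen values.
                count = self.value_counts[(a, label)][row[a]] + 1
                denom = self.class_counts[label] + len(self.values[a])
                score += math.log(count / denom)
            return score

        return max(sorted(self.class_counts), key=log_score)


def majority_vote(classifiers, row):
    """Combine the local decisions; each classifier reads only its own attributes."""
    votes = Counter(c.predict(row) for c in classifiers)
    return votes.most_common(1)[0][0]
```

A heterogeneous variant would mix different classifier types behind the same `learn` and `predict` interface, with the voting step unchanged.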

P-found: grid-enabling distributed repositories of protein folding and unfolding simulations for data mining

Abstract:

The P-found protein folding and unfolding simulation repository is designed to allow scientists to perform data mining and other analyses across large, distributed simulation data sets. There are two storage components in P-found: a primary repository of simulation data that is used to populate the second component, and a data warehouse that contains important molecular properties. These properties may be used for data mining studies. Here we demonstrate how grid technologies can support multiple, distributed P-found installations. In particular, we look at two aspects: firstly, how grid data management technologies can be used to access the distributed data warehouses; and secondly, how the grid can be used to transfer analysis programs to the primary repositories — this is an important and challenging aspect of P-found, due to the large data volumes involved and the desire of scientists to maintain control of their own data. The grid technologies we are developing with the P-found system will allow new large data sets of protein folding simulations to be accessed and analysed in novel ways, with significant potential for enabling scientific discovery.

Demand response — a different form of distributed storage?

Abstract:

Reduced flexibility of low carbon generation could pose new challenges for future energy systems. Both demand response and distributed storage may have a role to play in supporting future system balancing. This paper reviews how these technically different, but functionally similar approaches compare and compete with one another. Household survey data is used to test the effectiveness of price signals to deliver demand responses for appliances with a high degree of agency. The underlying unit of storage for different demand response options is discussed, with particular focus on the ability to enhance demand side flexibility in the residential sector. We conclude that a broad range of options, with different modes of storage, may need to be considered, if residential demand flexibility is to be maximised.

Scaling up data mining techniques to large datasets using parallel and distributed processing

Abstract:

Advances in hardware and software technology enable us to collect, store and distribute data on a very large scale. Automatically discovering and extracting hidden knowledge in the form of patterns from these large data volumes is known as data mining. Data mining technology is not only a part of business intelligence, but is also used in many other application areas such as research, marketing and financial analytics. For example, medical scientists can use patterns extracted from historic patient data to determine whether a new patient is likely to respond positively to a particular treatment; marketing analysts can use patterns extracted from customer data for future advertisement campaigns; and finance experts have an interest in patterns that forecast the development of certain stock market shares for investment recommendations. However, extracting knowledge in the form of patterns from massive data volumes imposes a number of computational challenges in terms of processing time, memory, bandwidth and power consumption. These challenges have led to the development of parallel and distributed data analysis approaches and the utilisation of Grid and Cloud computing. This chapter gives an overview of parallel and distributed computing approaches and how they can be used to scale up data mining to large datasets.

Power allocation strategies for distributed space-time codes in two-way relay networks

Abstract:

We study a two-way relay network (TWRN), where distributed space-time codes are constructed across multiple relay terminals in an amplify-and-forward mode. Each relay transmits a scaled linear combination of its received symbols and their conjugates, with the scaling factor chosen based on automatic gain control. We consider equal power allocation (EPA) across the relays, as well as the optimal power allocation (OPA) strategy given access to instantaneous channel state information (CSI). For EPA, we derive an upper bound on the pairwise-error-probability (PEP), from which we prove that full diversity is achieved in TWRNs. This result is in contrast to one-way relay networks, in which case a maximum diversity order of only unity can be obtained. When instantaneous CSI is available at the relays, we show that the OPA which minimizes the conditional PEP of the worse link can be cast as a generalized linear fractional program, which can be solved efficiently using a Dinkelbach-type procedure. We also prove that, if the sum-power of the relay terminals is constrained, then the OPA will activate at most two relays.

MICPA: a client-assisted channel assignment scheme for throughput enhancement in WLANs

Abstract:

The emergence of high-density wireless local area network (WLAN) deployments in recent years is a testament to the insatiable demand for wireless broadband services. The increased density of WLAN deployments brings with it the potential of increased capacity, extended coverage, and exciting new applications. However, the corresponding increase in contention and interference can significantly degrade throughput, unless new challenges in channel assignment are effectively addressed. In this paper, a client-assisted channel assignment scheme that can provide enhanced throughput is proposed. A study on the impact of interference on throughput with multiple access points (APs) is first undertaken using a novel approach that determines the possibility of parallel transmissions. A metric with a good correlation to the throughput, i.e., the number of conflict pairs, is used in the client-assisted minimum conflict pairs (MICPA) scheme. In this scheme, measurements from clients are used to assist the AP in determining the channel with the minimum number of conflict pairs to maximize its expected throughput. Simulation results show that the client-assisted MICPA scheme can provide meaningful throughput improvements over other schemes that only utilize the AP’s measurements.

Real-time identification and modelling in pervasive mining

Abstract:

This paper introduces an architecture for real-time identification and modelling at a copper mine using new technologies such as M2M and cloud computing, with a server in the cloud and an Android client inside the mine. The proposed design introduces pervasive mining, a system with wider coverage, higher communication efficiency, better fault tolerance, and anytime, anywhere availability. This solution was designed for a plant inside the mine that cannot tolerate interruption, and for which in situ, real-time identification is an essential part of the system, so that aspects such as instability can be controlled by adjusting the corresponding parameters without stopping the process.

The ModFOLD4 server for the quality assessment of 3D protein models

Abstract:

Once you have generated a 3D model of a protein, how do you know whether it bears any resemblance to the actual structure? To determine the usefulness of 3D models of proteins, they must be assessed in terms of their quality by methods that predict their similarity to the native structure. The ModFOLD4 server is the latest version of our leading independent server for the estimation of both the global and local (per-residue) quality of 3D protein models. The server produces both machine readable and graphical output, providing users with intuitive visual reports on the quality of predicted protein tertiary structures. The ModFOLD4 server is freely available to all at: http://www.reading.ac.uk/bioinf/ModFOLD/.

Large databases in economic history: research methods and case studies
