Abstract:
The P-found protein folding and unfolding simulation repository is designed to allow scientists to perform data mining and other analyses across large, distributed simulation data sets. P-found has two storage components: a primary repository of simulation data, which is used to populate the second component, a data warehouse containing important molecular properties that may be used for data mining studies. Here we demonstrate how grid technologies can support multiple, distributed P-found installations. In particular, we look at two aspects: firstly, how grid data management technologies can be used to access the distributed data warehouses; and secondly, how the grid can be used to transfer analysis programs to the primary repositories. The latter is an important and challenging aspect of P-found, due to the large data volumes involved and the desire of scientists to maintain control of their own data. The grid technologies we are developing with the P-found system will allow new large data sets of protein folding simulations to be accessed and analysed in novel ways, with significant potential for enabling scientific discovery.
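The "move the analysis to the data" idea described here can be illustrated with a minimal sketch. Everything below is hypothetical: `Repository` and `ship_analysis` are illustrative stand-ins, not part of the P-found system; the point is only that the analysis function travels to the repository while just the small result set travels back.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Repository:
    """Stand-in for one primary P-found repository holding simulation data."""
    name: str
    trajectories: Dict[str, List[float]]  # trajectory id -> per-frame property

def ship_analysis(repo: Repository, analysis: Callable[[List[float]], float]) -> Dict[str, float]:
    """Run the supplied analysis at the repository; return only the small
    per-trajectory results, never the raw simulation data."""
    return {tid: analysis(series) for tid, series in repo.trajectories.items()}

if __name__ == "__main__":
    repo = Repository("site-A", {"traj1": [0.1, 0.4, 0.3], "traj2": [0.9, 0.8]})
    # Example analysis: mean of a per-frame property such as an RMSD series.
    print(ship_analysis(repo, lambda xs: sum(xs) / len(xs)))
```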
Abstract:
Advances in hardware and software in the past decade have made it possible to capture, record and process fast data streams at a large scale. The research area of data stream mining has emerged as a consequence of these advances, in order to cope with the real-time analysis of potentially large and changing data streams. Examples of data streams include Google searches, credit card transactions, telemetric data and data from continuous chemical production processes. In some cases the data can be processed in batches by traditional data mining approaches. However, some applications require the data to be analysed in real time, as soon as it is captured: for example, when the data stream is infinite, fast changing, or simply too large to be stored. One of the most important data mining techniques on data streams is classification. This involves training the classifier on the data stream in real time and adapting it to concept drift. Most data stream classifiers are based on decision trees. However, it is well known in the data mining community that there is no single optimal algorithm: an algorithm may work well on one or several datasets but badly on others. This paper introduces eRules, a new rule-based adaptive classifier for data streams, based on an evolving set of rules. eRules induces a set of rules that is constantly evaluated and adapted to changes in the data stream by adding new rules and removing old ones. It differs from the more popular decision tree based classifiers in that it tends to leave data instances unclassified rather than forcing a classification that could be wrong. The ongoing development of eRules aims to improve its accuracy further through dynamic parameter setting, which will also address the problem of changing feature domain values.
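As a rough illustration of the kind of evolving, abstaining rule set described above, the toy classifier below keeps a dictionary of single-feature rules, abstains when no rule fires, prunes rules whose recent accuracy drops, and induces new rules from buffered evidence. This is not the authors' eRules implementation; the one-feature rule form and all thresholds are assumptions made for brevity.

```python
from collections import defaultdict, deque

class TinyRuleStreamClassifier:
    """Toy abstaining rule learner in the spirit of eRules (illustrative only).
    Rules map one categorical feature value to a label; weak rules are pruned."""

    def __init__(self, min_acc=0.6, window=50):
        self.rules = {}                                          # value -> label
        self.stats = defaultdict(lambda: deque(maxlen=window))   # recent outcomes
        self.buffer = defaultdict(lambda: defaultdict(int))      # value -> label counts
        self.min_acc = min_acc

    def predict(self, x):
        return self.rules.get(x)        # None = abstain rather than guess

    def learn(self, x, y):
        pred = self.predict(x)
        if pred is not None:
            self.stats[x].append(pred == y)
            hits = self.stats[x]
            if len(hits) >= 10 and sum(hits) / len(hits) < self.min_acc:
                del self.rules[x]       # remove a rule that stopped working
                self.stats.pop(x)
        else:
            self.buffer[x][y] += 1      # collect evidence for a new rule
            if sum(self.buffer[x].values()) >= 10:
                self.rules[x] = max(self.buffer[x], key=self.buffer[x].get)
                self.buffer.pop(x)

clf = TinyRuleStreamClassifier()
for x, y in [("red", "stop"), ("green", "go")] * 10:
    clf.learn(x, y)                                   # stream in labelled instances
print(clf.predict("red"), clf.predict("blue"))        # 'stop' None (abstains)
```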
Abstract:
Classifiers generally tend to overfit if there is noise in the training data or if there are missing values. Ensemble learning methods are often used to improve a classifier's classification accuracy, and most ensemble learning approaches aim to improve the accuracy of decision trees. However, alternative classifiers to decision trees exist. The recently developed Random Prism ensemble learner aims to improve an alternative classification rule induction approach, the Prism family of algorithms, which addresses some of the limitations of decision trees. Like any ensemble learner, however, Random Prism suffers from a high computational overhead due to replication of the data and the induction of multiple base classifiers. Hence even modest-sized datasets may pose a computational challenge to ensemble learners such as Random Prism. Parallelism is often used to scale up algorithms to deal with large datasets. This paper investigates parallelisation for Random Prism, implements a prototype and evaluates it empirically using a Hadoop computing cluster.
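The data-parallel pattern behind such a parallelisation can be sketched as a map step that trains base classifiers on bagged samples and a reduce step that combines their votes. The sketch below is illustrative only: it uses Python's multiprocessing in place of Hadoop, and a trivial majority-class learner as a stand-in for a real Prism base classifier.

```python
import random
from collections import Counter
from multiprocessing import Pool

def train_base(args):
    """Map step: train one base classifier on a bootstrap sample.
    Stand-in learner: majority class of the sample (not real Prism)."""
    data, seed = args
    rng = random.Random(seed)
    sample = [rng.choice(data) for _ in data]       # bagging with replacement
    return Counter(label for _, label in sample).most_common(1)[0][0]

def predict(ensemble):
    """Reduce step: majority vote across the trained base classifiers."""
    return Counter(ensemble).most_common(1)[0][0]

if __name__ == "__main__":
    data = [({"f": i}, "pos" if i % 3 else "neg") for i in range(100)]
    with Pool(4) as pool:                           # base learners train in parallel
        ensemble = pool.map(train_base, [(data, s) for s in range(10)])
    print(predict(ensemble))
```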
Abstract:
Although neurokinin 1 receptor antagonists prevent ethanol (EtOH)-induced gastric lesions, the mechanisms by which EtOH releases substance P (SP) and SP damages the mucosa are unknown. We hypothesized that EtOH activates transient receptor potential vanilloid 1 (TRPV1) on sensory nerves to release SP, which stimulates epithelial neurokinin 1 receptors to generate damaging reactive oxygen species (ROS). SP release was assayed in the mouse stomach, ROS were detected using dichlorofluorescein diacetate, and neurokinin 1 receptors were localized by immunofluorescence. EtOH-induced SP release was prevented by TRPV1 antagonism. High-dose EtOH caused lesions, and TRPV1 or neurokinin 1 receptor antagonism, as well as neurokinin 1 receptor deletion, inhibited lesion formation. Coadministration of low, innocuous doses of EtOH and SP caused lesions by a TRPV1-independent but neurokinin 1 receptor-dependent process. EtOH, capsaicin, and SP stimulated generation of ROS by superficial gastric epithelial cells expressing neurokinin 1 receptors, by a neurokinin 1 receptor-dependent mechanism. ROS scavengers prevented lesions induced by a high dose of EtOH or by a low dose of EtOH plus SP. Thus, gastric lesions are caused by an initial detrimental effect of EtOH, which is damaging only if associated with TRPV1 activation, SP release from sensory nerves, stimulation of neurokinin 1 receptors on epithelial cells, and ROS generation.
Abstract:
Soils most obviously contribute to food security through their essential role in crop and fodder production, thereby affecting the local availability of particular foods. They also have a direct influence on the ability to distribute food, on the nutritional value of some foods and, in some societies, on access to certain foods through local processes of allocation and preference. The inherent fertility of some soils is greater than that of others, so crop yields vary greatly under semi-natural conditions. Husbandry practices, including the use of manures and fertilisers, have evolved to improve the biological, chemical and physical components of soil fertility and thereby increase crop production. The challenge for the future is to sustain soil fertility in ways that increase the yield per unit area while avoiding other detrimental environmental consequences. This will require increased effort to develop practices that use inputs such as nutrients, water and energy more efficiently. Opportunities to achieve this include adopting more effective ways to apply water and nutrients, adopting tillage practices that promote water infiltration and the accumulation of organic matter, and breeding to improve the effectiveness of root systems in utilising soil-based resources.
Abstract:
Climate is an important control on biomass burning, but the sensitivity of fire to changes in temperature and moisture balance has not been quantified. We analyze sedimentary charcoal records to show that changes in fire regime over the past 21,000 years are predictable from changes in regional climates. Analyses of paleofire data show that fire increases monotonically with temperature and peaks at intermediate moisture levels, and that temperature is quantitatively the most important driver of changes in biomass burning over this period. Given that a similar relationship between climate drivers and fire emerges from analyses of the interannual variability in biomass burning shown by remote-sensing observations of month-by-month burnt area between 1996 and 2008, our results signal a serious cause for concern in the face of continuing global warming.
Abstract:
Advances in hardware and software technology enable us to collect, store and distribute large quantities of data on a very large scale. Automatically discovering and extracting hidden knowledge in the form of patterns from these large data volumes is known as data mining. Data mining technology is not only a part of business intelligence; it is also used in many other application areas, such as research, marketing and financial analytics. For example, medical scientists can use patterns extracted from historic patient data to determine whether a new patient is likely to respond positively to a particular treatment; marketing analysts can use patterns extracted from customer data for future advertisement campaigns; and finance experts are interested in patterns that forecast the development of certain stock market shares for investment recommendations. However, extracting knowledge in the form of patterns from massive data volumes imposes a number of computational challenges in terms of processing time, memory, bandwidth and power consumption. These challenges have led to the development of parallel and distributed data analysis approaches and the utilisation of Grid and Cloud computing. This chapter gives an overview of parallel and distributed computing approaches and how they can be used to scale up data mining to large datasets.
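A common way such approaches scale up pattern extraction is horizontal data partitioning: each worker counts patterns in its own partition, and the partial counts are then merged. The sketch below is an assumed-for-illustration example of that partition-count-merge idea, not code from the chapter.

```python
from collections import Counter
from functools import reduce
from multiprocessing import Pool

def count_items(partition):
    """'Map': count item occurrences in one horizontal partition of the data."""
    return Counter(item for transaction in partition for item in transaction)

if __name__ == "__main__":
    transactions = [["milk", "bread"], ["bread", "beer"], ["milk", "beer", "bread"]] * 1000
    parts = [transactions[i::4] for i in range(4)]      # horizontal partitioning
    with Pool(4) as pool:
        partials = pool.map(count_items, parts)         # count in parallel
    total = reduce(lambda a, b: a + b, partials)        # 'Reduce': merge partial counts
    min_support = 1500                                  # absolute support threshold
    print([item for item, n in total.items() if n >= min_support])
```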
Abstract:
Layer clouds are globally extensive. Their lower edges are charged negatively by the fair weather atmospheric electricity current flowing vertically through them. Using polar winter surface meteorological data from Sodankylä (Finland) and Halley (Antarctica), we find that when meteorological diurnal variations are weak, an appreciable diurnal cycle, on average, persists in the cloud base heights, detected using a laser ceilometer. The diurnal cloud base heights from both sites correlate more closely with the Carnegie curve of global atmospheric electricity than with local meteorological measurements. The cloud base sensitivities are indistinguishable between the northern and southern hemispheres, averaging a (4.0 ± 0.5) m rise for a 1% change in the fair weather electric current density. This suggests that the global fair weather current, which is affected by space weather, cosmic rays and the El Niño Southern Oscillation, is linked with layer cloud properties.
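For a quick sense of the reported sensitivity, the figure can be applied as a linear estimate. The helper below is purely illustrative and assumes the relation stays linear over the chosen range.

```python
SENSITIVITY_M_PER_PERCENT = 4.0   # (4.0 ± 0.5) m rise per 1% current density change

def cloud_base_rise(percent_change_in_current: float) -> float:
    """Linear estimate of the mean cloud base rise (m) for a given percentage
    change in fair weather current density, using the reported sensitivity."""
    return SENSITIVITY_M_PER_PERCENT * percent_change_in_current

print(cloud_base_rise(5.0))  # a 5% increase implies roughly a 20 m mean rise
```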
Abstract:
The global atmospheric electric circuit is driven by thunderstorms and electrified rain/shower clouds and is also influenced by energetic charged particles from space. The global circuit maintains the ionosphere as an equipotential at ~+250 kV with respect to the good conducting Earth (both land and oceans). Its "load" is the fair weather atmosphere, and the semi-fair weather atmosphere at large distances from the disturbed weather "generator" regions. The main solar-terrestrial (or space weather) influence on the global circuit arises from spatially and temporally varying fluxes of galactic cosmic rays (GCRs) and energetic electrons precipitating from the magnetosphere. All components of the circuit exhibit considerable variability in both space and time. Global circuit variations between solar maximum and solar minimum are considered, together with Forbush decrease and solar flare effects. The variability in ion concentration and vertical current flow is considered in terms of radiative effects in the troposphere, through infra-red absorption, and cloud effects, in particular possible cloud microphysical effects from charging at layer cloud edges. The paper identifies future research areas in relation to Task Group 4 of the Climate and Weather of the Sun-Earth System (CAWSES-II) programme.
Abstract:
This work presents two schemes for measuring the linear and angular kinematics of a rigid body using a kinematically redundant array of triple-axis accelerometers, with potential applications in biomechanics. A novel angular velocity estimation algorithm is proposed and evaluated that can compensate for angular velocity errors using measurements of the direction of gravity. Analysis and discussion of optimal sensor array characteristics are provided. A damped two-axis pendulum was used to excite all six degrees of freedom of a suspended accelerometer array through a determined complex motion, and forms the basis of both the simulation and experimental studies. The relationship between accuracy and sensor redundancy is investigated for arrays of up to 100 triple-axis accelerometers (300 accelerometer axes) in simulation, and 10 equivalent sensors (30 accelerometer axes) in the laboratory test rig. The paper also reports on the sensor calibration techniques and the hardware implementation.
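The linear least-squares core of processing such a redundant array can be sketched as follows. This is not the paper's novel gravity-compensated estimator; it is a standard fit of the rigid-body relation a_i = a_0 + ([alpha]_x + [omega]_x^2) r_i over redundant sensors, with the function name and synthetic test data assumed for illustration.

```python
import numpy as np

def estimate_rigid_body_terms(positions, accels):
    """Least-squares fit of a_i = a0 + W r_i over a redundant accelerometer
    array, where W = [alpha]_x + [omega]_x^2 bundles the angular terms.
    positions, accels: (n, 3) arrays of sensor locations and readings."""
    n = len(positions)
    A = np.zeros((3 * n, 12))
    for i, r in enumerate(positions):
        A[3*i:3*i+3, 0:3] = np.eye(3)               # common-point acceleration a0
        A[3*i:3*i+3, 3:12] = np.kron(np.eye(3), r)  # rows of W acting on r_i
    x, *_ = np.linalg.lstsq(A, accels.reshape(-1), rcond=None)
    return x[:3], x[3:].reshape(3, 3)               # a0, W

# Synthetic check: recover known a0 and W from 10 noisy sensor readings.
rng = np.random.default_rng(0)
r = rng.uniform(-0.1, 0.1, (10, 3))                 # sensor positions (m)
a0, W = np.array([0.0, 0.0, -9.81]), rng.normal(size=(3, 3))
meas = (a0 + r @ W.T) + rng.normal(scale=1e-3, size=(10, 3))
print(estimate_rigid_body_terms(r, meas))
```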
Abstract:
Sustainable intensification is seen as the main route for meeting the world’s increasing demands for food and fibre. As demands mount for greater efficiency in the use of resources to achieve this goal, so the focus on roots and rootstocks and their role in acquiring water and nutrients, and overcoming pests and pathogens, is increasing. The purpose of this review is to explore some of the ways in which understanding root systems and their interactions with soils could contribute to the development of more sustainable systems of intensive production. Physical interactions with soil particles limit root growth if soils are dense, but root–soil contact is essential for optimal growth and uptake of water and nutrients. X-ray microtomography demonstrated that maize roots elongated more rapidly with increasing root–soil contact, as long as mechanical impedance was not limiting root elongation, while lupin was less sensitive to changes in root–soil contact. In addition to selecting for root architecture and rhizosphere properties, the growth of many plants in cultivated systems is profoundly affected by selection of an appropriate rootstock. Several mechanisms for scion control by rootstocks have been suggested, but the causal signals are still uncertain and may differ between crop species. Linkage map locations for quantitative trait loci for disease resistance and other traits of interest in rootstock breeding are becoming available. Designing root systems and rootstocks for specific environments is becoming a feasible target.
Abstract:
The Twitter network has been labelled the most commonly used microblogging application today. With an estimated 500 million registered users as of June 2012, Twitter has become a credible medium for the expression of sentiment and opinion. It is also a notable medium for information dissemination, including breaking news on diverse issues, since it was launched in 2006. Many organisations, individuals and even government bodies follow activities on the network in order to learn how their audience reacts to tweets that affect them. Postings on Twitter (known as tweets) can be used to analyse patterns associated with events by detecting the dynamics of the tweets. A common way of labelling a tweet is to include a number of hashtags that describe its contents. Association Rule Mining can find the likelihood of co-occurrence of hashtags. In this paper, we propose the use of temporal Association Rule Mining to detect rule dynamics, and consequently the dynamics of tweets. We call our methodology Transaction-based Rule Change Mining (TRCM). A number of patterns are identifiable in these rule dynamics, including new rules, emerging rules, unexpected rules and 'dead' rules. The linkage between the different types of rule dynamics is also investigated experimentally in this paper.
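A minimal sketch of how rule dynamics between two tweet windows might be classified is given below. It is illustrative only: TRCM's actual rule-matching measures are defined in the paper, and the hashtag rules and the simple confidence comparison used for "emerging" rules here are assumptions.

```python
def rule_dynamics(old_rules, new_rules):
    """Toy comparison of association rules between two tweet windows, in the
    spirit of TRCM (illustrative, not the authors' measures). Rules are dicts
    mapping (antecedent, consequent) hashtag tuples -> confidence."""
    old_keys, new_keys = set(old_rules), set(new_rules)
    return {
        "new":  new_keys - old_keys,                   # appear in window t+1 only
        "dead": old_keys - new_keys,                   # vanish after window t
        "emerging": {k for k in old_keys & new_keys
                     if new_rules[k] > old_rules[k]},  # confidence grew
    }

old = {(("#win",), ("#gold",)): 0.4, (("#rain",), ("#flood",)): 0.7}
new = {(("#win",), ("#gold",)): 0.8, (("#quake",), ("#help",)): 0.6}
print(rule_dynamics(old, new))
```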