32 results for Adaptive Expandable Data-Pump

in CentAUR: Central Archive University of Reading - UK


Relevance:

100.00% 100.00%

Publisher:

Abstract:

An important application of Big Data Analytics is the real-time analysis of streaming data. Streaming data poses unique challenges for data mining algorithms, such as concept drifts, the need to analyse the data on the fly because data streams are unbounded, and the need for scalable algorithms because of potentially high data throughput. Real-time classification algorithms that are fast and adaptive to concept drifts exist; however, most approaches are not naturally parallel and are thus limited in their scalability. This paper presents work on the Micro-Cluster Nearest Neighbour (MC-NN) classifier. MC-NN is based on an adaptive statistical data summary built from Micro-Clusters. MC-NN is very fast and adaptive to concept drift whilst maintaining the parallel properties of the base KNN classifier. MC-NN is also competitive with existing data stream classifiers in terms of accuracy and speed.
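To make the idea of classifying against a statistical summary concrete, here is a minimal sketch of a nearest-centroid classifier over per-class micro-cluster summaries. It is not the MC-NN algorithm itself: the abstract does not give the update rules, so the class names `MicroCluster` and `MicroClusterNN`, the single-cluster-per-class simplification, and the omission of any drift handling (cluster splitting or removal) are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class MicroCluster:
    """Hypothetical summary: instance count and per-feature linear sum."""
    label: str
    n: int
    linear_sum: list

    def centroid(self):
        # The centroid is recoverable from the summary alone.
        return [s / self.n for s in self.linear_sum]

    def absorb(self, x):
        # Incremental update: no raw instances are stored.
        self.n += 1
        self.linear_sum = [s + v for s, v in zip(self.linear_sum, x)]


class MicroClusterNN:
    """Simplified sketch: one micro-cluster per class, no drift handling."""

    def __init__(self):
        self.clusters = []

    def _nearest(self, x, label=None):
        candidates = [c for c in self.clusters
                      if label is None or c.label == label]
        if not candidates:
            return None
        return min(candidates, key=lambda c: math.dist(c.centroid(), x))

    def predict(self, x):
        nearest = self._nearest(x)
        return nearest.label if nearest else None

    def learn(self, x, label):
        own = self._nearest(x, label)
        if own is None:
            self.clusters.append(MicroCluster(label, 1, list(x)))
        else:
            own.absorb(x)
```

Because each summary is updated independently, the distance computations in `_nearest` could be distributed across workers, which is the kind of parallel property the abstract attributes to the KNN base classifier.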

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The Bollène-2002 Experiment was aimed at developing the use of a radar volume-scanning strategy for conducting radar rainfall estimations in the mountainous regions of France. A developmental radar processing system, called Traitements Régionalisés et Adaptatifs de Données Radar pour l’Hydrologie (Regionalized and Adaptive Radar Data Processing for Hydrological Applications), has been built and several algorithms were specifically produced as part of this project. These algorithms include 1) a clutter identification technique based on the pulse-to-pulse variability of reflectivity Z for noncoherent radar, 2) a coupled procedure for determining a rain partition between convective and widespread rainfall R and the associated normalized vertical profiles of reflectivity, and 3) a method for calculating reflectivity at ground level from reflectivities measured aloft. Several radar processing strategies, including nonadaptive, time-adaptive, and space–time-adaptive variants, have been implemented to assess the performance of these new algorithms. Reference rainfall data were derived from a careful analysis of rain gauge datasets furnished by the Cévennes–Vivarais Mediterranean Hydrometeorological Observatory. The assessment criteria for five intense and long-lasting Mediterranean rain events have proven that good quantitative precipitation estimates can be obtained from radar data alone within 100-km range by using well-sited, well-maintained radar systems and sophisticated, physically based data-processing systems. The basic requirements entail performing accurate electronic calibration and stability verification, determining the radar detection domain, achieving efficient clutter elimination, and capturing the vertical structure(s) of reflectivity for the target event. 
Radar performance was shown to depend on type of rainfall, with better results obtained with deep convective rain systems (Nash coefficients of roughly 0.90 for point radar–rain gauge comparisons at the event time step), as opposed to shallow convective and frontal rain systems (Nash coefficients in the 0.6–0.8 range). In comparison with time-adaptive strategies, the space–time-adaptive strategy yields a very significant reduction in the radar–rain gauge bias while the level of scatter remains basically unchanged. Because the Z–R relationships have not been optimized in this study, results are attributed to an improved processing of spatial variations in the vertical profile of reflectivity. The two main recommendations for future work consist of adapting the rain separation method for radar network operations and documenting Z–R relationships conditional on rainfall type.
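The Nash coefficients quoted above are a goodness-of-fit score for radar estimates against rain gauge values. A minimal sketch follows, assuming the standard Nash-Sutcliffe efficiency definition (one minus the ratio of squared error to the variance of the observations); the abstract does not spell out the formula, and the function and argument names are illustrative.

```python
def nash_sutcliffe(observed, estimated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better
    than always predicting the mean of the observations."""
    if not observed or len(observed) != len(estimated):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("observations have zero variance")
    return 1.0 - squared_error / spread
```

Here `observed` would hold the rain gauge values and `estimated` the radar-derived rainfall at the same points and time step.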

Relevance:

40.00% 40.00%

Publisher:

Abstract:

In a world of almost permanent and rapidly increasing electronic data availability, techniques for filtering, compressing, and interpreting this data to transform it into valuable and easily comprehensible information are of utmost importance. One key topic in this area is the capability to deduce future system behavior from a given data input. This book brings together for the first time the complete theory of data-based neurofuzzy modelling and the linguistic attributes of fuzzy logic in a single cohesive mathematical framework. After introducing the basic theory of data-based modelling, new concepts including extended additive and multiplicative submodels are developed and their extensions to state estimation and data fusion are derived. All these algorithms are illustrated with benchmark and real-life examples to demonstrate their efficiency. Chris Harris and his group have carried out pioneering work which has tied together the fields of neural networks and linguistic rule-based algorithms. This book is aimed at researchers and scientists in time series modeling, empirical data modeling, knowledge discovery, data mining, and data fusion.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

The simulation and development work undertaken to produce a signal equaliser that improves the data rates from oil well logging instruments is presented. The instruments are lowered into the drill bore hole suspended by a cable which has poor electrical characteristics. The equaliser described in the paper corrects for the distortions introduced by the cable (dispersion and attenuation), with the result that the instrument can send data at 100 kbit/s down its own suspension cable of 12 km in length. The use of simulation techniques and tools was invaluable in generating a model for the distortions and proved useful when site testing was not available.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Advances in hardware and software in the past decade make it possible to capture, record and process fast data streams at a large scale. The research area of data stream mining has emerged as a consequence of these advances in order to cope with the real-time analysis of potentially large and changing data streams. Examples of data streams include Google searches, credit card transactions, telemetric data and data from continuous chemical production processes. In some cases the data can be processed in batches by traditional data mining approaches. However, some applications require the data to be analysed in real time as soon as it is captured, for example when the data stream is infinite, fast changing, or simply too large to be stored. One of the most important data mining techniques on data streams is classification. This involves training the classifier on the data stream in real time and adapting it to concept drifts. Most data stream classifiers are based on decision trees. However, it is well known in the data mining community that there is no single optimal algorithm. An algorithm may work well on one or several datasets but badly on others. This paper introduces eRules, a new rule-based adaptive classifier for data streams, based on an evolving set of Rules. eRules induces a set of rules that is constantly evaluated and adapted to changes in the data stream by adding new rules and removing old ones. It differs from the more popular decision tree based classifiers in that it tends to leave data instances unclassified rather than forcing a classification that could be wrong. The ongoing development of eRules aims to improve its accuracy further through dynamic parameter setting, which will also address the problem of changing feature domain values.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In this paper extensions to an existing tracking algorithm are described. These extensions implement adaptive tracking constraints in the form of regional upper-bound displacements and an adaptive track smoothness constraint. Together, these constraints make the tracking algorithm more flexible than the original algorithm (which used fixed tracking parameters) and provide greater confidence in the tracking results. The result of applying the new algorithm to high-resolution ECMWF reanalysis data is shown as an example of its effectiveness.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In this paper a cell by cell anisotropic adaptive mesh technique is added to an existing staggered mesh Lagrange plus remap finite element ALE code for the solution of the Euler equations. The quadrilateral finite elements may be subdivided isotropically or anisotropically and a hierarchical data structure is employed. An efficient computational method is proposed, which only solves on the finest level of resolution that exists for each part of the domain with disjoint or hanging nodes being used at resolution transitions. The Lagrangian, equipotential mesh relaxation and advection (solution remapping) steps are generalised so that they may be applied on the dynamic mesh. It is shown that for a radial Sod problem and a two-dimensional Riemann problem the anisotropic adaptive mesh method runs over eight times faster.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Planning a project with proper consideration of all necessary factors, and managing it to ensure its successful implementation, presents many challenges. The initial stage of planning a project for bidding is costly and time consuming, and usually yields cost and effort predictions of poor accuracy. On the other hand, detailed information on previous projects may be buried in piles of archived documents, which makes it increasingly difficult to learn from previous experience. Project portfolios have been brought into this field with the aim of improving information sharing and management among different projects. However, the amount of information that can be shared is still limited to generic information. In this paper, we report a recently developed software system, COBRA, which automatically generates a project plan with effort estimates of time and cost based on data collected from previously completed projects. To maximise data sharing and management among different projects, we propose a method that uses product based planning from the PRINCE2 methodology. (Automated Project Information Sharing and Management System - COBRA) Keywords: project management, product based planning, best practice, PRINCE2

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In clinical trials, situations often arise where more than one response from each patient is of interest, and it is required that any decision to stop the study be based upon some or all of these measures simultaneously. Theory for the design of sequential experiments with simultaneous bivariate responses is described by Jennison and Turnbull (Jennison, C., Turnbull, B. W. (1993). Group sequential tests for bivariate response: interim analyses of clinical trials with both efficacy and safety endpoints. Biometrics 49:741-752) and Cook and Farewell (Cook, R. J., Farewell, V. T. (1994). Guidelines for monitoring efficacy and toxicity responses in clinical trials. Biometrics 50:1146-1152) in the context of one efficacy and one safety response. These expositions are in terms of normally distributed data with known covariance. The methods proposed require specification of the correlation, ρ, between test statistics monitored as part of the sequential test. It can be difficult to quantify ρ, and previous authors have suggested simply taking the lowest plausible value, as this will guarantee power. This paper begins with an illustration of the effect that inappropriate specification of ρ can have on the preservation of trial error rates. It is shown that both the type I error and the power can be adversely affected. As a possible solution to this problem, formulas are provided for the calculation of correlation from data collected as part of the trial. An adaptive approach is proposed and evaluated that makes use of these formulas and an example is provided to illustrate the method. Attention is restricted to the bivariate case for ease of computation, although the formulas derived are applicable in the general multivariate case.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Sequential methods provide a formal framework by which clinical trial data can be monitored as they accumulate. The results from interim analyses can be used either to modify the design of the remainder of the trial or to stop the trial as soon as sufficient evidence of either the presence or absence of a treatment effect is available. The circumstances under which the trial will be stopped with a claim of superiority for the experimental treatment, must, however, be determined in advance so as to control the overall type I error rate. One approach to calculating the stopping rule is the group-sequential method. A relatively recent alternative to group-sequential approaches is the adaptive design method. This latter approach provides considerable flexibility in changes to the design of a clinical trial at an interim point. However, a criticism is that the method by which evidence from different parts of the trial is combined means that a final comparison of treatments is not based on a sufficient statistic for the treatment difference, suggesting that the method may lack power. The aim of this paper is to compare two adaptive design approaches with the group-sequential approach. We first compare the form of the stopping boundaries obtained using the different methods. We then focus on a comparison of the power of the different trials when they are designed so as to be as similar as possible. We conclude that all methods acceptably control type I error rate and power when the sample size is modified based on a variance estimate, provided no interim analysis is so small that the asymptotic properties of the test statistic no longer hold. In the latter case, the group-sequential approach is to be preferred. 
Provided that asymptotic assumptions hold, the adaptive design approaches control the type I error rate even if the sample size is adjusted on the basis of an estimate of the treatment effect, showing that the adaptive designs allow more modifications than the group-sequential method.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The identification of signatures of natural selection in genomic surveys has become an area of intense research, stimulated by the increasing ease with which genetic markers can be typed. Loci identified as subject to selection may be functionally important, and hence (weak) candidates for involvement in disease causation. They can also be useful in determining the adaptive differentiation of populations, and exploring hypotheses about speciation. Adaptive differentiation has traditionally been identified from differences in allele frequencies among different populations, summarised by an estimate of F-ST. Low outliers relative to an appropriate neutral population-genetics model indicate loci subject to balancing selection, whereas high outliers suggest adaptive (directional) selection. However, the problem of identifying statistically significant departures from neutrality is complicated by confounding effects on the distribution of F-ST estimates, and current methods have not yet been tested in large-scale simulation experiments. Here, we simulate data from a structured population at many unlinked, diallelic loci that are predominantly neutral but with some loci subject to adaptive or balancing selection. We develop a hierarchical-Bayesian method, implemented via Markov chain Monte Carlo (MCMC), and assess its performance in distinguishing the loci simulated under selection from the neutral loci. We also compare this performance with that of a frequentist method, based on moment-based estimates of F-ST. We find that both methods can identify loci subject to adaptive selection when the selection coefficient is at least five times the migration rate. Neither method could reliably distinguish loci under balancing selection in our simulations, even when the selection coefficient is twenty times the migration rate.
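As a reminder of what an F-ST value measures at a single diallelic locus, the sketch below computes the textbook heterozygosity-based definition, F-ST = (H_T - H_S) / H_T, from subpopulation allele frequencies. This is an assumption-laden illustration only: it treats the frequencies as known and the subpopulations as equally weighted, and it is neither the moment-based estimator nor the hierarchical-Bayesian method evaluated in the paper. The function name is hypothetical.

```python
def fst_diallelic(freqs):
    """F-ST for one diallelic locus from per-subpopulation frequencies
    of one allele, with equal weight for every subpopulation."""
    if len(freqs) < 2:
        raise ValueError("need at least two subpopulations")
    p_bar = sum(freqs) / len(freqs)
    # Expected heterozygosity in the pooled population.
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        raise ValueError("locus is monomorphic across all subpopulations")
    # Mean expected heterozygosity within subpopulations.
    h_within = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    return (h_total - h_within) / h_total
```

Identical frequencies in every subpopulation give a value of zero, while strongly diverged frequencies push the value towards one, which is why high outliers are read as candidates for adaptive selection.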

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper presents in detail a theoretical adaptive model of thermal comfort based on the “Black Box” theory, taking into account factors such as culture, climate, social, psychological and behavioural adaptations, which have an impact on the senses used to detect thermal comfort. The model is called the Adaptive Predicted Mean Vote (aPMV) model. The aPMV model explains, by applying the cybernetics concept, the phenomenon that the Predicted Mean Vote (PMV) is greater than the Actual Mean Vote (AMV) in free-running buildings, which has been revealed by many researchers in field studies. An Adaptive coefficient (λ) representing the adaptive factors that affect the sense of thermal comfort has been proposed. The empirical coefficients in warm and cool conditions for the Chongqing area in China have been derived by applying the least squares method to the monitored onsite environmental data and the thermal comfort survey results.
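The role of the adaptive coefficient can be illustrated with a short sketch. It assumes the commonly cited form of the model, aPMV = PMV / (1 + λ·PMV), which the abstract itself does not state, and it fits λ by least squares on the reciprocals (1/AMV = 1/PMV + λ), treating surveyed AMV values as observations of aPMV. The function names and the fitting shortcut are illustrative assumptions, not the authors' published procedure.

```python
def apmv(pmv, lam):
    """Adaptive PMV under the assumed form PMV / (1 + lam * PMV)."""
    return pmv / (1 + lam * pmv)


def fit_lambda(pmv_values, amv_values):
    """Least-squares estimate of lam for the model 1/AMV = 1/PMV + lam.

    With a slope fixed at one, the least-squares intercept is simply
    the mean of the differences. All votes must be non-zero.
    """
    if not pmv_values or len(pmv_values) != len(amv_values):
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [1 / a - 1 / p for p, a in zip(pmv_values, amv_values)]
    return sum(diffs) / len(diffs)
```

Under this form, a positive λ in warm conditions (PMV > 0) pulls the adaptive vote below the PMV, matching the observation that PMV exceeds AMV in free-running buildings.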

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Self-organizing neural networks have been implemented in a wide range of application areas such as speech processing, image processing, optimization and robotics. Recent variations to the basic model proposed by the authors enable it to order state space using a subset of the input vector and to apply a local adaptation procedure that does not rely on a predefined test duration limit. Both these variations have been incorporated into a new feature map architecture that forms an integral part of a Hybrid Learning System (HLS) based on a genetic-based classifier system. Problems are represented within HLS as objects characterized by environmental features. Objects controlled by the system have preset targets set against a subset of their features. The system's objective is to achieve these targets by evolving a behavioural repertoire that efficiently explores and exploits the problem environment. Feature maps encode two types of knowledge within HLS: long-term memory traces of useful regularities within the environment, and the classifier performance data calibrated against an object's feature states and targets. Self-organization of these networks constitutes non-genetic-based (experience-driven) learning within HLS. This paper presents a description of the HLS architecture and an analysis of the modified feature map implementing associative memory. Initial results are presented that demonstrate the behaviour of the system on a simple control task.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Background: Autism spectrum disorder (ASD) was once considered to be highly associated with intellectual disability and to show a characteristic IQ profile, with strengths in performance over verbal abilities and a distinctive pattern of ‘peaks’ and ‘troughs’ at the subtest level. However, there are few data from epidemiological studies. Method: Comprehensive clinical assessments were conducted with 156 children aged 10–14 years [mean (s.d.)=11.7 (0.9)], seen as part of an epidemiological study (81 childhood autism, 75 other ASD). A sample weighting procedure enabled us to estimate characteristics of the total ASD population. Results: Of the 75 children with ASD, 55% had an intellectual disability (IQ<70) but only 16% had moderate to severe intellectual disability (IQ<50); 28% had average intelligence (115>IQ>85) but only 3% were of above average intelligence (IQ>115). There was some evidence for a clinically significant Performance/Verbal IQ (PIQ/VIQ) discrepancy but discrepant verbal versus performance skills were not associated with a particular pattern of symptoms, as has been reported previously. There was mixed evidence of a characteristic subtest profile: whereas some previously reported patterns were supported (e.g. poor Comprehension), others were not (e.g. no ‘peak’ in Block Design). Adaptive skills were significantly lower than IQ and were associated with severity of early social impairment and also IQ. Conclusions: In this epidemiological sample, ASD was less strongly associated with intellectual disability than traditionally held and there was only limited evidence of a distinctive IQ profile. Adaptive outcome was significantly impaired even for those children of average intelligence.