891 results for Application of Data-driven Modelling in Water Sciences
Abstract:
As the service life of water supply networks (WSN) lengthens, the aging of pipe networks has become an increasingly serious problem. Because an urban water supply network is a hidden underground asset, it is difficult for monitoring staff to classify pipe network faults directly by means of modern detection technology. In this paper, based on basic property data of the water supply network (e.g. diameter, material, pressure, distance to pump, distance to tank, load, etc.), the decision tree algorithm C4.5 was applied to classify the condition of water supply pipelines. Part of the historical data was used to establish a decision tree classification model, and the remaining historical data was used to validate the established model. Statistical methods were used to assess the decision tree model, including basic statistical methods, Receiver Operating Characteristic (ROC) and Recall-Precision Curves (RPC). These methods have been successfully used to assess the accuracy of the established classification model of the water pipe network. The purpose of the classification model was to classify the specific condition of the water pipe network. It is important to maintain the pipeline according to the classification results, which include asset unserviceable (AU), near perfect condition (NPC) and serious deterioration (SD). Finally, this research focused on pipe classification, which plays a significant role in maintaining water supply networks in the future.
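The ROC and recall-precision assessment mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' code: the labels and scores are hypothetical, the function names are invented for illustration, and the evaluation is reduced to one class (SD) against the rest.

```python
def confusion_counts(y_true, y_score, threshold):
    """Count TP, FP, TN, FN for a one-vs-rest target at a score threshold."""
    tp = fp = tn = fn = 0
    for truth, score in zip(y_true, y_score):
        predicted = score >= threshold
        if predicted and truth:
            tp += 1
        elif predicted and not truth:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def roc_and_pr_points(y_true, y_score):
    """Return ROC points (FPR, TPR) and (recall, precision) points.

    Assumes y_true contains at least one positive and one negative case.
    """
    roc, pr = [(0.0, 0.0)], []
    for threshold in sorted(set(y_score), reverse=True):
        tp, fp, tn, fn = confusion_counts(y_true, y_score, threshold)
        tpr = tp / (tp + fn)          # recall, true positive rate
        fpr = fp / (fp + tn)          # false positive rate
        roc.append((fpr, tpr))
        pr.append((tpr, tp / (tp + fp)))
    return roc, pr


def auc(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical held-out pipes: 1 = serious deterioration (SD), 0 = other class.
y_true = [1, 1, 0, 1, 0, 0, 1, 0]
# Hypothetical classifier scores for the SD class (e.g. leaf class fractions).
y_score = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

roc_points, pr_points = roc_and_pr_points(y_true, y_score)
roc_auc = auc(roc_points)
print(f"ROC AUC = {roc_auc:.2f}")
```

An AUC near 0.5 would indicate a ranking no better than chance, while values closer to 1 indicate that deteriorated pipes tend to receive higher scores than the rest.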
Abstract:
In this research the 3DVAR data assimilation scheme is implemented in the numerical model DIVAST in order to optimize the performance of the numerical model by selecting an appropriate turbulence scheme and tuning its parameters. Two turbulence closure schemes, the Prandtl mixing length model and the two-equation k-ε model, were incorporated into DIVAST and examined with respect to their universality of application, complexity of solutions, computational efficiency and numerical stability. A square harbour with one symmetrical entrance subject to tide-induced flows was selected to investigate the structure of turbulent flows. The experimental part of the research was conducted in a tidal basin. A significant advantage of such a laboratory experiment is a fully controlled environment where domain setup and forcing are user-defined. The research shows that the Prandtl mixing length model and the two-equation k-ε model, with default parameterization predefined according to literature recommendations, overestimate eddy viscosity, which in turn results in a significant underestimation of velocity magnitudes in the harbour. The data assimilation of the model-predicted velocity and laboratory observations significantly improves model predictions for both turbulence models by adjusting modelled flows in the harbour to match de-errored observations. 3DVAR also makes it possible to identify and quantify shortcomings of the numerical model. Such a comprehensive analysis gives an optimal solution based on which numerical model parameters can be estimated. The process of turbulence model optimization by reparameterization and tuning towards an optimal state led to new constants that may potentially be applied to complex turbulent flows, such as rapidly developing flows or recirculating flows.
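For readers unfamiliar with 3DVAR, the standard textbook form of its cost function is sketched below. The abstract does not state the exact formulation used with DIVAST, so the symbols are assumptions: x is the model state (here, velocities in the harbour), x_b the model-predicted background state, y the laboratory observations, H the operator mapping the state to observation locations, and B and R the background and observation error covariance matrices.

```latex
J(\mathbf{x}) = \tfrac{1}{2}\,(\mathbf{x}-\mathbf{x}_b)^{\mathsf{T}}\,\mathbf{B}^{-1}\,(\mathbf{x}-\mathbf{x}_b)
              + \tfrac{1}{2}\,\bigl(\mathbf{y}-H(\mathbf{x})\bigr)^{\mathsf{T}}\,\mathbf{R}^{-1}\,\bigl(\mathbf{y}-H(\mathbf{x})\bigr)
```

The analysis state is the x that minimizes J, that is, the state that balances closeness to the model prediction against closeness to the observations according to their respective error covariances.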
Abstract:
This study presents an approach to combining the uncertainties of hydrological model outputs predicted by a number of machine learning models. The machine learning based uncertainty prediction approach is very useful for estimating hydrological models' uncertainty in a particular hydrometeorological situation in real-time applications [1]. In this approach the hydrological model realizations from Monte Carlo simulations are used to build different machine learning uncertainty models that predict the uncertainty (quantiles of the pdf) of the deterministic output of a hydrological model. Uncertainty models are trained using antecedent precipitation and streamflows as inputs. The trained models are then employed to predict the model output uncertainty which is specific for the new input data. We used three machine learning models, namely artificial neural networks, model trees and locally weighted regression, to predict output uncertainties. These three models produce similar verification results, which can be improved by merging their outputs dynamically. We propose an approach to form a committee of the three models to combine their outputs. The approach is applied to estimate the uncertainty of streamflow simulations from a conceptual hydrological model in the Brue catchment in the UK and the Bagmati catchment in Nepal. The verification results show that the merged output is better than an individual model output. [1] D. L. Shrestha, N. Kayastha, D. P. Solomatine, and R. Price. Encapsulation of parametric uncertainty statistics by various predictive machine learning models: MLUE method, Journal of Hydroinformatics, in press, 2013.
Abstract:
When an accurate hydraulic network model is available, direct modeling techniques are very straightforward and reliable for on-line leakage detection and localization applied to a large class of water distribution networks. In general, techniques of this type, based on analytical models, can be seen as an application of the well-known fault detection and isolation theory for complex industrial systems. Nonetheless, the assumption of single-leak scenarios is usually made, considering a certain leak size pattern, which may not hold in real applications. Upgrading a leak detection and localization method based on a direct modeling approach to handle multiple-leak scenarios can be, on one hand, quite straightforward but, on the other hand, highly computationally demanding for a large class of water distribution networks, given the huge number of potential water loss hotspots. This paper presents a leakage detection and localization method suitable for multiple-leak scenarios and a large class of water distribution networks. This method can be seen as an upgrade of the above-mentioned method based on a direct modeling approach, in which a global search method based on genetic algorithms has been integrated in order to estimate the network water loss hotspots and the size of the leaks. This is an inverse/direct modeling method which tries to benefit from both approaches: on one hand, the exploration capability of genetic algorithms to estimate network water loss hotspots and the size of the leaks and, on the other hand, the straightforwardness and reliability offered by the availability of an accurate hydraulic model to assess the network areas close to the estimated hotspots. The application of the resulting method in a DMA of the Barcelona water distribution network is provided and discussed. The obtained results show that leakage detection and localization under multiple-leak scenarios may be performed efficiently following an easy procedure.
Abstract:
Distributed energy and water balance models require time-series surfaces of the meteorological variables involved in hydrological processes. Most hydrological GIS-based models apply simple interpolation techniques to extrapolate the point-scale values registered at weather stations to the watershed scale. In mountainous areas, where the monitoring network ineffectively covers the complex terrain heterogeneity, simple geostatistical methods for spatial interpolation are not always representative enough, and algorithms that explicitly or implicitly account for the features creating strong local gradients in the meteorological variables must be applied. Originally developed as a meteorological pre-processing tool for a complete hydrological model (WiMMed), MeteoMap has become an independent software package. The individual interpolation algorithms used to approximate the spatial distribution of each meteorological variable were carefully selected, taking into account both the specific variable being mapped and the common lack of input data from Mediterranean mountainous areas. They include corrections with height for both rainfall and temperature (Herrero et al., 2007), and topographic corrections for solar radiation (Aguilar et al., 2010). MeteoMap is GIS-based freeware available upon registration. Input data include weather station records and topographic data, and the output consists of tables and maps of the meteorological variables at hourly, daily, predefined rainfall event duration or annual scales. It offers its own pre- and post-processing tools, including video outlook, map printing and the possibility of exporting the maps to images or ASCII ArcGIS formats. This study presents the user-friendly interface of the software and shows some case studies with applications to hydrological modeling.
Abstract:
The high morbidity and mortality associated with atherosclerotic coronary vascular disease (CVD) and its complications are being lessened by the increased knowledge of risk factors, effective preventative measures and proven therapeutic interventions. However, significant CVD morbidity remains and sudden cardiac death continues to be a presenting feature for some subsequently diagnosed with CVD. Coronary vascular disease is also the leading cause of anaesthesia related complications. Stress electrocardiography/exercise testing is predictive of 10 year risk of CVD events and the cardiovascular variables used to score this test are monitored peri-operatively. Similar physiological time-series datasets are being subjected to data mining methods for the prediction of medical diagnoses and outcomes. This study aims to find predictors of CVD using anaesthesia time-series data and patient risk factor data. Several pre-processing and predictive data mining methods are applied to this data. Physiological time-series data related to anaesthetic procedures are subjected to pre-processing methods for removal of outliers, calculation of moving averages as well as data summarisation and data abstraction methods. Feature selection methods of both wrapper and filter types are applied to derived physiological time-series variable sets alone and to the same variables combined with risk factor variables. The ability of these methods to identify subsets of highly correlated but non-redundant variables is assessed. The major dataset is derived from the entire anaesthesia population and subsets of this population are considered to be at increased anaesthesia risk based on their need for more intensive monitoring (invasive haemodynamic monitoring and additional ECG leads). 
Because of the unbalanced class distribution in the data, majority class under-sampling and the Kappa statistic, together with misclassification rate and area under the ROC curve (AUC), are used for evaluation of models generated using different prediction algorithms. The performance of models derived from feature-reduced datasets reveals the filter method, Cfs subset evaluation, to be most consistently effective, although Consistency-derived subsets tended to give slightly increased accuracy but markedly increased complexity. The use of misclassification rate (MR) for model performance evaluation is influenced by class distribution. This could be eliminated by consideration of the AUC or Kappa statistic as well as by evaluation of subsets with an under-sampled majority class. The noise and outlier removal pre-processing methods produced models with MR ranging from 10.69 to 12.62, with the lowest value being for data from which both outliers and noise were removed (MR 10.69). For the raw time-series dataset, MR is 12.34. Feature selection results in a reduction in MR to 9.8 to 10.16, with time segmented summary data (dataset F) MR being 9.8 and raw time-series summary data (dataset A) being 9.92. However, for all time-series only based datasets, the complexity is high. For most pre-processing methods, Cfs could identify a subset of correlated and non-redundant variables from the time-series alone datasets, but models derived from these subsets are of one leaf only. MR values are consistent with class distribution in the subset folds evaluated in the n-fold cross-validation method. For models based on Cfs selected time-series derived and risk factor (RF) variables, the MR ranges from 8.83 to 10.36, with dataset RF_A (raw time-series data and RF) being 8.85 and dataset RF_F (time segmented time-series variables and RF) being 9.09. 
The models based on counts of outliers and counts of data points outside normal range (Dataset RF_E) and derived variables based on time series transformed using Symbolic Aggregate Approximation (SAX) with associated time-series pattern cluster membership (Dataset RF_G) perform the least well, with MR of 10.25 and 10.36 respectively. For coronary vascular disease prediction, nearest neighbour (NNge) and the support vector machine based method, SMO, have the highest MR of 10.1 and 10.28, while logistic regression (LR) and the decision tree (DT) method, J48, have MR of 8.85 and 9.0 respectively. DT rules are most comprehensible and clinically relevant. The predictive accuracy increase achieved by addition of risk factor variables to time-series variable based models is significant. The addition of time-series derived variables to models based on risk factor variables alone is associated with a trend to improved performance. Data mining of feature-reduced anaesthesia time-series variables together with risk factor variables can produce compact and moderately accurate models able to predict coronary vascular disease. Decision tree analysis of time-series data combined with risk factor variables yields rules which are more accurate than models based on time-series data alone. The limited additional value provided by electrocardiographic variables when compared to use of risk factors alone is similar to recent suggestions that exercise electrocardiography (exECG) under standardised conditions has limited additional diagnostic value over risk factor analysis and symptom pattern. The pre-processing used in this study had limited effect when time-series variables and risk factor variables are used as model input. 
In the absence of risk factor input, the use of time-series variables after outlier removal and of time-series variables based on physiological variable values being outside the accepted normal range is associated with some improvement in model performance.
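The abstract above notes that misclassification rate (MR) is influenced by class distribution and that one-leaf models simply reproduce that distribution, which is why the Kappa statistic is reported alongside it. The sketch below illustrates the point; the confusion matrices are hypothetical counts chosen for illustration and are not taken from the study.

```python
def misclassification_rate(matrix):
    """Percentage of off-diagonal cases in a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return 100.0 * (total - correct) / total


def cohen_kappa(matrix):
    """Cohen's kappa: agreement beyond what the class marginals give by chance."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(n)) / total
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(n)
    ) / total ** 2
    if expected == 1.0:
        return 0.0  # degenerate case: a single class in both truth and prediction
    return (observed - expected) / (1.0 - expected)


# Rows = actual class, columns = predicted class; all counts are hypothetical.
# A one-leaf model that always predicts the majority class (no CVD):
majority_only = [[900, 0],
                 [100, 0]]
# A model that accepts a few false alarms in exchange for detecting some cases:
informative = [[880, 20],
               [70, 30]]

for name, cm in [("majority only", majority_only), ("informative", informative)]:
    print(f"{name}: MR = {misclassification_rate(cm):.1f}, "
          f"kappa = {cohen_kappa(cm):.3f}")
```

In this example the two models differ by only one point of MR, but the majority-only model has a kappa of zero, which shows why MR alone can be misleading on unbalanced data.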
Abstract:
The development of new, health-supporting, high-quality food and the optimization of food technological processes today require the application of statistical methods of experimental design. The principles and steps of statistical planning and evaluation of experiments will be explained. The application of a simplex-centroid mixture design will be shown using the example of the development of a gluten-free rusk (zwieback) enriched with roughage compounds. The results will be illustrated with different graphics.
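As a brief illustration of the simplex-centroid mixture design named in the abstract above: for q components, the design consists of the 2^q − 1 blends in which every non-empty subset of components is mixed in equal proportions. The sketch below enumerates these runs; the choice of three components is an assumption for illustration, since the abstract does not state how many ingredients were varied.

```python
from fractions import Fraction
from itertools import combinations


def simplex_centroid_design(n_components):
    """All 2**q - 1 blends: each non-empty subset of components in equal parts."""
    runs = []
    for size in range(1, n_components + 1):
        for subset in combinations(range(n_components), size):
            share = Fraction(1, size)
            runs.append(tuple(share if i in subset else Fraction(0)
                              for i in range(n_components)))
    return runs


# Hypothetical three-component mixture (component identities not from the source).
design = simplex_centroid_design(3)
for run in design:
    print(" ".join(str(x) for x in run))
```

For three components this gives seven runs: three pure components, three 1:1 binary blends and the overall centroid with one third of each.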
Abstract:
Streamflow forecasts at a daily time scale are necessary for effective management of water resources systems. Typical applications include flood control, water quality management, water supply to multiple stakeholders, hydropower and irrigation systems. Conventionally, physically based conceptual models and data-driven models are used for forecasting streamflows. Conceptual models require detailed understanding of the physical processes governing the system being modeled. Major constraints in developing effective conceptual models are sparse hydrometric gauge networks and short historical records that limit our understanding of physical processes. On the other hand, data-driven models rely solely on previous hydrological and meteorological data without directly taking into account the underlying physical processes. Among the various data-driven models, Auto Regressive Integrated Moving Average (ARIMA) and Artificial Neural Networks (ANNs) are the most widely used techniques. The present study assesses the performance of ARIMA and ANN methods in arriving at one- to seven-day-ahead forecasts of daily streamflows at the Basantpur streamgauge site, which is situated upstream of Hirakud Dam in the Mahanadi river basin, India. The ANNs considered include the Feed-Forward back propagation Neural Network (FFNN) and the Radial Basis Neural Network (RBNN). Daily streamflow forecasts at the Basantpur site find use in the management of water from the Hirakud reservoir. (C) 2015 The Authors. Published by Elsevier B.V.
Abstract:
25 p.
Abstract:
Simulations of forest stand dynamics in a modelling framework that includes the Forest Vegetation Simulator (FVS) are diameter driven, thus the diameter or basal area increment model needs special attention. This dissertation critically evaluates diameter or basal area increment models and modelling approaches in the context of the Great Lakes region of the United States and Canada. A set of related studies is presented that critically evaluates the sub-model for change in individual tree basal diameter used in the Forest Vegetation Simulator (FVS), a dominant forestry model in the Great Lakes region. Various historical implementations of the STEMS (Stand and Tree Evaluation and Modeling System) family of diameter increment models, including the current public release of the Lake States variant of FVS (LS-FVS), were tested for the 30 most common tree species using data from the Michigan Forest Inventory and Analysis (FIA) program. The results showed that the current public release of the LS-FVS diameter increment model over-predicts 10-year diameter increment by 17% on average. The study also affirms that a simple adjustment factor as a function of a single predictor, dbh (diameter at breast height), used in past versions, provides an inadequate correction of model prediction bias. In order to re-engineer the basal diameter increment model, the historical, conceptual and philosophical differences among the individual tree increment model families and their modelling approaches were analyzed and discussed. Two underlying conceptual approaches toward diameter or basal area increment modelling have often been used: the potential-modifier (POTMOD) and composite (COMP) approaches, which are exemplified by the STEMS/TWIGS and Prognosis models, respectively. It is argued that both approaches essentially use a similar base function and neither is conceptually different from a biological perspective, even though they look different in their model forms. 
No matter what modelling approach is used, the base function is the foundation of an increment model. Two base functions – gamma and Box-Lucas – were identified as candidate base functions for forestry applications. The results of a comparative analysis of empirical fits showed that the quality of fit is essentially similar, and both are sufficiently detailed and flexible for forestry applications. The choice of either base function in order to model diameter or basal area increment is dependent upon personal preference; however, the gamma base function may be preferred over the Box-Lucas, as it fits the periodic increment data in both a linear and nonlinear composite model form. Finally, the utility of site index as a predictor variable has been criticized, as it has been widely used in models for complex, mixed species forest stands although it is not well suited for this purpose. An alternative to site index in an increment model was explored, using site index and a combination of climate variables and Forest Ecosystem Classification (FEC) ecosites and data from the Province of Ontario, Canada. The results showed that a combination of climate and FEC ecosite variables can replace site index in the diameter increment model.
Abstract:
Building Information Modelling (BIM) has been regarded as a one-stop shop capable of addressing the ills of the construction industry. Yet, while some firms have accepted BIM as a new way to work and gone on to record success, others (which have not done so) have raised such questions as: ‘How is BIM defined? Is it a tool or a process? Which kinds and sizes of organisations stand to benefit from BIM?’ These questions form the basis of this research. Hence, having explored the relevant body of literature, this research investigates three organisations within the UK – described as the earliest adopters of BIM – and considers how they have fared in terms of project performance in the years since adopting BIM, focusing on project cost, delivery time and quality achievement. This investigation also probed two of the leading voices in BIM in the UK in search of the much-needed answers. The findings of the research show that project success in the organisations that have used BIM is predicated on its adoption as a process, rather than as a tool of technology; a process that changes the way work in the construction industry is typically done. Moreover, the successes recorded in the firms researched give credence to project success consequent upon adopting BIM. Nevertheless, the findings of this research show that the cornerstone of this success is leadership-driven innovation.
Abstract:
Clean drinking water is vital to any community if we, as consumers, are to sustain a life of health and wellbeing. Suspended particles in surface waters not only provide the means to transport micro-organisms which can cause serious infections and diseases, they can also affect the performance capacity of a water treatment plant. In such situations pre-treatment ahead of the main plant is recommended. Previous research using non-woven synthetic fabrics as pre-filter materials for protecting slow sand filters from high turbidity showed that filter run times can be extended several times over and that filters can be regenerated by simply removing and washing the fabric (Mbwette and Graham, 1987; Mbwette, 1991). Geosynthetic materials have been extensively used for soil retention and dewatering in geotechnical applications, but little research exists on their application to turbidity reduction in water treatment. With the development of new geosynthetic materials, it was hypothesized that the turbidity removal efficiency can be improved further by selecting appropriate materials. Two different geosynthetic materials (75 micron) tested at a filtration rate of 0.7 m/h yielded a 30-45% reduction in turbidity with relatively minor head loss. It was found that the non-woven geotextile Propex 1701 retained the highest performance in both filtration efficiency and head loss across the varying turbidity ranges in comparison to the other geotextiles tested. With 5 layers of the Propex 1701, an average percent reduction of approximately 67% was achieved with an average head loss of 4 mm over the two-and-a-half-hour testing period. Using the data collected for the Propex 1701, a mathematical model was developed for predicting the expected percent reduction, given the ability to control the cost and, as a result, the number of layers to be used in a given filtration scenario.