69 results for Data-driven energy efficiency


Relevance: 100.00%

Abstract:

In big-data-driven traffic flow prediction systems, the robustness of prediction performance depends on accuracy and timeliness. This paper presents a new MapReduce-based nearest neighbor (NN) approach for traffic flow prediction using correlation analysis (TFPC) on a Hadoop platform. In particular, we develop a real-time prediction system comprising two key modules, i.e., offline distributed training (ODT) and online parallel prediction (OPP). Moreover, we build a parallel k-nearest neighbor optimization classifier, which incorporates correlation information among traffic flows into the classification process. Finally, we propose a novel prediction calculation method, combining the current data observed in OPP with the classification results obtained from large-scale historical data in ODT, to generate traffic flow predictions in real time. An empirical study on real-world traffic flow big data, using leave-one-out cross validation, shows that TFPC significantly outperforms four state-of-the-art prediction approaches, i.e., autoregressive integrated moving average, Naïve Bayes, multilayer perceptron neural networks, and NN regression, in terms of accuracy, improving it by up to 90.07% in the best case, with an average mean absolute percent error of 5.53%. In addition, it displays excellent speedup, scaleup, and sizeup.
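As a rough single-machine illustration of the idea (not the paper's Hadoop/MapReduce implementation), the sketch below predicts the next flow value with a correlation-weighted k-nearest-neighbour rule over historical windows; all names and the synthetic data are assumptions.

```python
import numpy as np

def predict_flow(history, current, k=5):
    """Predict the next flow value from `current` (a window of recent
    observations) using the k most similar windows in `history`.

    history : 2D array, each row is [window values..., next value]
    current : 1D array, the window observed now
    """
    windows, targets = history[:, :-1], history[:, -1]
    # Euclidean distance picks the k nearest historical windows.
    nearest = np.argsort(np.linalg.norm(windows - current, axis=1))[:k]
    # Pearson correlation with the current window weights each neighbour,
    # echoing the correlation analysis used in TFPC.
    weights = np.array([np.corrcoef(windows[i], current)[0, 1] for i in nearest])
    weights = np.clip(weights, 0.0, None) + 1e-9   # keep weights non-negative
    return float(weights @ targets[nearest] / weights.sum())

# Toy usage on a synthetic sinusoidal "traffic" series.
rng = np.random.default_rng(0)
series = 50 + 20 * np.sin(np.linspace(0, 20, 500)) + rng.normal(0, 2, 500)
history = np.array([series[i:i + 7] for i in range(480)])  # 6-step window + target
print(predict_flow(history, series[480:486]))
```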

Relevance: 100.00%

Abstract:

Cloud computing is offering utility-oriented IT services to users worldwide. Based on a pay-as-you-go model, it enables hosting of pervasive applications from consumer, scientific, and business domains. However, data centers hosting Cloud applications consume huge amounts of energy, contributing to high operational costs and a large carbon footprint. Therefore, we need Green Cloud computing solutions that not only save energy for the environment but also reduce operational costs. This paper presents the vision, challenges, and architectural elements for energy-efficient management of Cloud computing environments. We focus on the development of dynamic resource provisioning and allocation algorithms that consider the synergy between the various data center infrastructures (i.e., hardware, power units, cooling, and software) and work holistically to boost data center energy efficiency and performance. In particular, this paper proposes (a) architectural principles for energy-efficient management of Clouds; (b) energy-efficient resource allocation policies and scheduling algorithms that consider quality-of-service expectations and devices' power usage characteristics; and (c) a novel software technology for energy-efficient management of Clouds. We have validated our approach by conducting a set of rigorous performance evaluation studies using the CloudSim toolkit. The results demonstrate that the Cloud computing model has immense potential, offering significant gains in response time and cost savings under dynamic workload scenarios.
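To illustrate one building block of such energy-efficient allocation policies, the sketch below places each VM on the host whose power draw would rise the least under the common linear utilisation-to-power model; the host figures and the heuristic itself are illustrative assumptions, and the paper's own evaluation uses the CloudSim toolkit rather than this code.

```python
# Linear power model: P(u) = P_idle + (P_peak - P_idle) * u.
def power(host, util):
    return host["p_idle"] + (host["p_peak"] - host["p_idle"]) * util

def place_vm(hosts, vm_util):
    """Return the index of the host with the smallest power increase
    that can still fit the VM, or None if the VM fits nowhere."""
    best, best_delta = None, float("inf")
    for i, h in enumerate(hosts):
        if h["util"] + vm_util <= 1.0:  # capacity check
            delta = power(h, h["util"] + vm_util) - power(h, h["util"])
            if delta < best_delta:
                best, best_delta = i, delta
    return best

# Two heterogeneous hosts: the more power-efficient one attracts the VM.
hosts = [{"p_idle": 100.0, "p_peak": 250.0, "util": 0.30},
         {"p_idle": 70.0,  "p_peak": 180.0, "util": 0.30}]
chosen = place_vm(hosts, vm_util=0.25)
hosts[chosen]["util"] += 0.25
print("placed on host", chosen)  # host 1, the lower-power machine
```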

Relevance: 100.00%

Abstract:

The Intergovernmental Panel on Climate Change and the McKinsey Greenhouse Gas abatement studies have highlighted the reduction of building energy consumption as a primary cost-effective element in the abatement of global warming. Nevertheless, energy investigation in most of our existing building stock remains rudimentary at best. Building sub-metering, by which we mean any secondary hourly metering (after the main meter) of various circuits, provides substantial information on when and where energy is used in specific buildings. Furthermore, combining this information with external weather data yields information beyond basic metering results. This paper discusses three case studies and explains how sub-metering, augmented by external solar and temperature data, benefits energy management and identifies problems. It explains how different methods of analysing energy usage allowed justifiable sizing of a solar photovoltaic system; how a calculated Cooling Degree Unit identified the absence of savings from a proprietary chiller controller; and how energy variation due to user schedules and external conditions indicated anomalies in energy use. The advantages of wireless access are noted. Extracting information in graphical formats suggests better strategies to understand and control energy use.
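As a small illustration of combining sub-metered energy with external temperature data, the sketch below computes a daily cooling-degree value from hourly temperatures; the 18 °C base temperature and all names are assumptions, and the paper's Cooling Degree Unit may be defined differently.

```python
BASE_C = 18.0  # assumed balance-point temperature

def cooling_degrees(hourly_temps_c):
    """Sum of positive (T - base) excursions over one day's 24 hourly
    readings, expressed in degree-days."""
    return sum(max(t - BASE_C, 0.0) for t in hourly_temps_c) / 24.0

# Toy comparison: if chiller kWh does not track cooling degrees, the
# discrepancy flags a scheduling or control anomaly worth investigating.
day_temps = [15, 14, 14, 13, 13, 14, 16, 19, 22, 25, 27, 29,
             30, 30, 29, 28, 26, 24, 22, 20, 19, 18, 17, 16]
print(f"cooling degree-days: {cooling_degrees(day_temps):.2f}")
```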

Relevance: 100.00%

Abstract:

The assessment of the direct and indirect requirements for energy is known as embodied energy analysis. For buildings, the direct energy includes that used primarily on site, while the indirect energy includes primarily the energy required for the manufacture of building materials. This thesis is concerned with the completeness and reliability of embodied energy analysis methods. Previous methods tend to address either one of these issues, but not both at the same time. Industry-based methods are incomplete. National statistical methods, while comprehensive, are a ‘black box’ and are subject to errors. A new hybrid embodied energy analysis method is derived to optimise the benefits of previous methods while minimising their flaws.

In industry-based studies, known as ‘process analyses’, the energy embodied in a product is traced laboriously upstream by examining the inputs to each preceding process towards raw materials. Process analyses can be significantly incomplete, due to increasing complexity. The other major embodied energy analysis method, ‘input-output analysis’, comprises the use of national statistics. While the input-output framework is comprehensive, many inherent assumptions make the results unreliable.

Hybrid analysis methods involve the combination of the two major embodied energy analysis methods discussed above, based either on process analysis or on input-output analysis. The intention in both hybrid analysis methods is to reduce the errors associated with the two major methods on which they are based. However, the problems inherent to each of the original methods tend to remain, to some degree, in the associated hybrid versions. Process-based hybrid analyses tend to be incomplete, due to the exclusions associated with the process analysis framework. Input-output-based hybrid analyses, on the other hand, tend to be unreliable because the substitution of process analysis data into the input-output framework causes unwanted indirect effects.

A key deficiency in previous input-output-based hybrid analysis methods is that the input-output model is a ‘black box’, since important flows of goods and services with respect to the embodied energy of a sector cannot be readily identified. A new input-output-based hybrid analysis method was therefore developed, requiring the decomposition of the input-output model into mutually exclusive components (i.e., ‘direct energy paths’). A direct energy path represents a discrete energy requirement, possibly occurring one or more transactions upstream from the process under consideration. For example, the energy required directly to manufacture the steel used in the construction of a building would represent a direct energy path of one non-energy transaction in length. A direct energy path comprises a ‘product quantity’ (for example, the total tonnes of cement used) and a ‘direct energy intensity’ (for example, the energy required directly for cement manufacture, per tonne). The input-output model was decomposed into direct energy paths for the ‘residential building construction’ sector. It was shown that 592 direct energy paths were required to describe 90% of the overall total energy intensity for ‘residential building construction’. By extracting direct energy paths using yet smaller threshold values, they were shown to be mutually exclusive. Consequently, the modification of direct energy paths using process analysis data does not cause unwanted indirect effects.
A non-standard individual residential building was then selected to demonstrate the benefits of the new input-output-based hybrid analysis method in cases where the products of a sector may not be similar. Particular direct energy paths were modified with case-specific process analysis data. Product quantities and direct energy intensities were derived and used to modify some of the direct energy paths. The intention of this demonstration was to determine whether 90% of the total embodied energy calculated for the building could comprise the process analysis data normally collected for the building. However, it was found that only 51% of the total comprised normally collected process analysis data. The integration of process analysis data with 90% of the direct energy paths by value was unsuccessful because:

• typically only one of the direct energy path components (i.e., either the product quantity or the direct energy intensity) was modified using process analysis data;
• the direct energy paths derived for ‘residential building construction’ were complex; and
• reliable and consistent process analysis data from industry, for both product quantities and direct energy intensities, were lacking.

While the input-output model used was the best available for Australia, many errors were likely to be carried through to the direct energy paths for ‘residential building construction’. Consequently, both the value and the relative importance of the direct energy paths for ‘residential building construction’ were generally found to be a poor model for the demonstration building. This was expected. Nevertheless, in the absence of better data from industry, the input-output data is likely to remain the most appropriate for completing the framework of embodied energy analyses of many types of products, even in non-standard cases.

‘Residential building construction’ was one of the 22 most complex Australian economic sectors (i.e., those requiring between 592 and 3215 direct energy paths to describe 90% of their total energy intensities). Consequently, for the other 87 non-energy sectors of the Australian economy, the input-output-based hybrid analysis method is likely to produce more reliable results than those calculated for the demonstration building using the direct energy paths for ‘residential building construction’. For sectors more complex than ‘residential building construction’, the new input-output-based hybrid analysis method derived here allows available process analysis data to be integrated with the input-output data in a comprehensive framework. The proportion of the result comprising the more reliable process analysis data can be calculated and used as a measure of the reliability of the result for the product, or part of the product, being analysed (for example, a building material or component). To ensure that future applications of the new input-output-based hybrid analysis method produce reliable results, new sources of process analysis data are required, including for services (for example, ‘banking’) and for processes involving the transformation of basic materials into complex products (for example, steel and copper into an electric motor).
However, even considering the limitations of the demonstration described above, the new input-output-based hybrid analysis method achieved the aim of the thesis: to develop a new embodied energy analysis method that allows reliable process analysis data to be integrated into the comprehensive, yet unreliable, input-output framework.

Plain language summary

Embodied energy analysis comprises the assessment of the direct and indirect energy requirements associated with a process. For example, the construction of a building requires the manufacture of steel structural members, and thus indirectly requires the energy used directly and indirectly in their manufacture. Embodied energy is an important measure of ecological sustainability because energy is used in virtually every human activity and many of these activities are interrelated. This thesis is concerned with the relationship between the completeness of embodied energy analysis methods and their reliability. Previous industry-based methods, while reliable, are incomplete. Previous national statistical methods, while comprehensive, are a ‘black box’ subject to errors. A new method is derived, involving the decomposition of the comprehensive national statistical model into components that can be modified discretely using the more reliable industry data; it is demonstrated for an individual building. The demonstration failed to integrate enough industry data into the national statistical model, due to the unexpected complexity of the national statistical data and the lack of available industry data on energy and non-energy product requirements. These unique findings highlight the flaws in previous methods. Reliable process analysis and input-output data are required, particularly for those processes that could not be examined in the demonstration of the new embodied energy analysis method. This includes the energy requirements of services sectors, such as banking, and of processes involving the transformation of basic materials into complex products, such as refrigerators. The application of the new method to less complex products, such as individual building materials or components, is likely to be more successful than it was for the residential building demonstration.
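To make the path decomposition concrete, here is a minimal numerical sketch, with an invented three-sector economy, of how total energy intensities follow from the input-output model, epsilon = e (I - A)^-1, and how that total can be unpacked into discrete direct energy paths by walking transactions upstream. The matrix, intensities, threshold, and function names are illustrative assumptions, not data or code from the thesis.

```python
import numpy as np

# Invented 3-sector economy: A holds technical coefficients (inputs per
# unit of each sector's output; columns = purchasing sector), e holds
# direct energy intensities.
A = np.array([[0.10, 0.05, 0.02],
              [0.20, 0.10, 0.15],
              [0.05, 0.30, 0.10]])
e = np.array([2.0, 0.5, 0.8])

# Comprehensive totals from the input-output model: epsilon = e (I - A)^-1.
total = e @ np.linalg.inv(np.eye(3) - A)

def enumerate_paths(chain, multiplier, threshold, out, max_len=5):
    """Enumerate direct energy paths ending at chain[-1]. `chain` lists
    sectors from the most upstream process down to the sector under
    consideration; `multiplier` is the product of A coefficients along it."""
    head = chain[0]
    value = e[head] * multiplier          # the path's discrete energy value
    if value >= threshold:
        out.append((tuple(chain), value))
    if len(chain) >= max_len or multiplier * e.max() < threshold:
        return                            # no deeper path can still qualify
    for j in range(len(e)):               # one more transaction upstream
        enumerate_paths([j] + chain, multiplier * A[j, head], threshold, out)

paths = []
enumerate_paths([0], 1.0, threshold=0.01, out=paths)
paths.sort(key=lambda p: -p[1])
print(f"total energy intensity of sector 0: {total[0]:.3f}")
for chain, value in paths[:5]:            # the largest direct energy paths
    print(" -> ".join(map(str, chain)), f"{value:.3f}")
```

Because the paths are mutually exclusive terms of the series e + eA + eA^2 + ..., any one of them can be replaced by case-specific process analysis data without disturbing the others, which is the property the thesis exploits.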

Relevance: 100.00%

Abstract:

Autism Spectrum Disorder (ASD) is growing at a staggering rate, but little is known about the cause of this condition. Inferring learning patterns from therapeutic performance data, and subsequently clustering ASD children into subgroups, is important for understanding this domain and, more importantly, for informing evidence-based intervention. However, this data-driven task was difficult in the past due to the insufficiency of data for reliable analysis. For the first time, using data from a recent application for early intervention in autism (TOBY Play pad), whose download count now exceeds 4500, we present in this paper the automatic discovery of learning patterns across 32 skills in sensory, imitation and language. We use unsupervised learning methods for this task, but a notorious problem with existing methods is the need to specify the number of patterns in advance, which in our case is even more difficult due to the complexity of the data. To this end, we appeal to recent Bayesian nonparametric methods, in particular Bayesian Nonparametric Factor Analysis. This model uses the Indian Buffet Process (IBP) as a prior on a binary matrix of infinite columns to allocate groups of intervention skills to children. The optimal number of learning patterns, as well as the subgroup assignments, are inferred automatically from the data. Our experimental results follow an exploratory approach, presenting the newly discovered learning patterns. To provide quantitative results, we also report a clustering evaluation against K-means and Nonnegative Matrix Factorization (NMF). In addition to the novelty of this new problem, we were able to demonstrate the suitability of Bayesian nonparametric models over their parametric rivals.
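A minimal sketch of the IBP prior at the heart of this model follows: each child ("customer") adopts existing patterns ("dishes") in proportion to their popularity, then opens Poisson(alpha/n) new ones, so the number of columns grows with the data rather than being fixed in advance. This only illustrates the prior over binary allocation matrices; the full Bayesian nonparametric factor analysis with inference over the TOBY data is not reproduced.

```python
import numpy as np

def sample_ibp(n_customers, alpha, rng):
    """Draw a binary matrix (customers x dishes) from an IBP(alpha) prior."""
    counts = []            # how many customers have taken each dish so far
    rows = []
    for n in range(1, n_customers + 1):
        row = [int(rng.random() < c / n) for c in counts]  # popular dishes
        new = rng.poisson(alpha / n)                       # brand-new dishes
        counts = [c + z for c, z in zip(counts, row)] + [1] * new
        row += [1] * new
        rows.append(row)
    width = len(counts)
    return np.array([r + [0] * (width - len(r)) for r in rows])

Z = sample_ibp(n_customers=10, alpha=2.0, rng=np.random.default_rng(1))
print(Z.shape)  # the number of columns (patterns) is inferred, not fixed
print(Z)
```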

Relevance: 100.00%

Abstract:

Extracting knowledge from the transaction records and personal data of credit card holders has great profit potential for the banking industry. The challenge is to detect and predict bankruptcy, and to keep and recruit profitable customers. However, grouping and targeting credit card customers by traditional data-driven mining often does not directly meet the needs of the banking industry, because data-driven mining automatically generates classification outputs that are imprecise, meaningless, and beyond users' control. In this paper, we provide a novel domain-driven classification method that takes advantage of multiple-criteria and multiple-constraint-level programming for intelligent credit scoring. The method produces a set of customer scores that makes the classification results actionable and controllable by human interaction during the scoring process. Domain knowledge and experts' experience parameters are built into the criteria and constraint functions of the mathematical programming, and human-machine interaction is employed to generate an efficient and precise solution. Experiments based on various data sets validated the effectiveness and efficiency of the proposed method. © 2006 IEEE.
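As a rough illustration of scoring via mathematical programming, the sketch below fits a linear scoring function by minimising total boundary deviations with scipy's linprog. This is a single-objective simplification of the paper's multiple-criteria, multiple-constraint-level model; the data and every name here are invented.

```python
import numpy as np
from scipy.optimize import linprog

def fit_scores(X_good, X_bad, b=1.0):
    """Find weights w so that good customers score >= b and bad customers
    score <= b, up to non-negative deviations alpha, minimising sum(alpha)."""
    p = X_good.shape[1]
    n = len(X_good) + len(X_bad)
    c = np.concatenate([np.zeros(p), np.ones(n)])   # objective: sum of alphas
    A_ub = np.zeros((n, p + n))
    b_ub = np.empty(n)
    for i, x in enumerate(X_good):                  # -w.x - alpha_i <= -b
        A_ub[i, :p], A_ub[i, p + i], b_ub[i] = -x, -1.0, -b
    for j, x in enumerate(X_bad):                   #  w.x - alpha_i <=  b
        i = len(X_good) + j
        A_ub[i, :p], A_ub[i, p + i], b_ub[i] = x, -1.0, b
    bounds = [(None, None)] * p + [(0, None)] * n   # w free, alpha >= 0
    return linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds).x[:p]

rng = np.random.default_rng(0)
good = rng.normal([2.0, 2.0], 0.5, (20, 2))   # toy applicant features
bad = rng.normal([0.0, 0.0], 0.5, (20, 2))
w = fit_scores(good, bad)
print("good below cutoff:", int((good @ w < 1.0).sum()),
      "| bad above cutoff:", int((bad @ w > 1.0).sum()))
```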

Relevance: 100.00%

Abstract:

In this paper we develop a data-driven weight learning method for weighted quasi-arithmetic means where the observed data may vary in dimension.
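For reference, a weighted quasi-arithmetic mean aggregates inputs as g^-1(sum_i w_i g(x_i)) for a generator g. The sketch below illustrates the aggregation function itself, with g(t) = t^p (power means); the truncate-and-renormalise handling of varying dimension and all names are illustrative assumptions, since the paper's weight-learning method is not reproduced here.

```python
import numpy as np

def quasi_arithmetic_mean(x, weights, p=2.0):
    """g^-1( sum_i w_i g(x_i) ) with generator g(t) = t**p."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(weights[:len(x)], dtype=float)
    w = w / w.sum()                      # renormalise to this dimension
    return (w @ x**p) ** (1.0 / p)

w = [0.5, 0.3, 0.1, 0.1]                 # illustrative learned weights
print(quasi_arithmetic_mean([3.0, 1.0, 2.0, 4.0], w))  # full dimension
print(quasi_arithmetic_mean([3.0, 1.0], w))            # shorter observation
```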

Relevance: 100.00%

Abstract:

Due to the increasing energy consumption in cloud data centers, energy saving has become a vital objective in the design of the underlying cloud infrastructure. A precise energy consumption model is the foundation of many energy-saving strategies. This paper explores the energy consumption of virtual machines running various CPU-intensive activities on a cloud server using two types of models: traditional time-series models, such as ARMA and ES, and time-series segmentation models, such as the sliding window model and the bottom-up model. We built a cloud environment using OpenStack and conducted extensive experiments to analyze and compare the prediction accuracy of these strategies. The results indicate that the ES model performs better than the ARMA model in predicting the energy consumption of known activities. When predicting the energy consumption of unknown activities, both the sliding window segmentation model and the bottom-up segmentation model achieve satisfactory performance, but the former is slightly better than the latter.
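For concreteness, a minimal sketch of the simple exponential smoothing (ES) predictor follows; the smoothing factor and the toy power trace are assumptions.

```python
def es_forecast(series, alpha=0.3):
    """One-step-ahead simple exponential smoothing forecast."""
    s = series[0]
    for x in series[1:]:
        s = alpha * x + (1 - alpha) * s   # s_t = a*x_t + (1-a)*s_{t-1}
    return s

watts = [118, 121, 125, 123, 130, 128, 131]  # per-interval VM power draw
print(f"next-interval forecast: {es_forecast(watts):.1f} W")
```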

Relevance: 100.00%

Abstract:

Cloud computing, as the latest computing paradigm, has shown a promising future for business workflow systems facing massive concurrent user requests and complicated computing tasks. With the fast growth of cloud data centers, energy management, especially energy monitoring and saving, in cloud workflow systems has been attracting increasing attention. Clearly, the energy for running a cloud workflow instance depends mainly on the energy for executing its workflow activities. However, existing energy management strategies mainly monitor the virtual machines instead of the workflow activities running on them, and hence it is difficult to directly monitor and optimize the energy consumption of cloud workflows. To address this issue, we propose an effective energy testing framework for cloud workflow activities. The framework can accurately test and analyze the baseline energy of physical and virtual machines in the cloud environment, and then obtain energy consumption data for cloud workflow activities. Based on these data, we can further build energy consumption models and apply energy prediction strategies. Our experiments are conducted in an OpenStack-based cloud computing environment. The effectiveness of our framework has been verified through a detailed case study and a set of energy modelling and prediction experiments based on representative time-series models.
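A minimal sketch of the baseline-subtraction idea follows: an activity's energy is estimated by integrating the measured machine power minus the baseline (idle) power over the activity's run. The sampling interval, baseline figure, and names are assumptions.

```python
def activity_energy_joules(power_samples_w, baseline_w, interval_s=1.0):
    """Energy attributed to a workflow activity, in joules."""
    return sum(max(p - baseline_w, 0.0) * interval_s for p in power_samples_w)

samples = [112, 140, 155, 160, 158, 149, 120]  # watts while activity runs
print(activity_energy_joules(samples, baseline_w=110.0))  # 224 J here
```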

Relevance: 100.00%

Abstract:

Although the development of geographic information system (GIS) technology and digital data manipulation techniques has enabled practitioners in the geographical and geophysical sciences to make more efficient use of resource information, many of the methods used in forming spatial prediction models are still inherently based on traditional techniques of map stacking in which layers of data are combined under the guidance of a theoretical domain model. This paper describes a data-driven approach by which Artificial Neural Networks (ANNs) can be trained to represent a function characterising the probability that an instance of a discrete event, such as the presence of a mineral deposit or the sighting of an endangered animal species, will occur over some grid element of the spatial area under consideration. A case study describes the application of the technique to the task of mineral prospectivity mapping in the Castlemaine region of Victoria using a range of geological, geophysical and geochemical input variables. Comparison of the maps produced using neural networks with maps produced using a density estimation-based technique demonstrates that the maps can reliably be interpreted as representing probabilities. However, while the neural network model and the density estimation-based model yield similar results under an appropriate choice of values for the respective parameters, the neural network approach has several advantages, especially in high dimensional input spaces.
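As one concrete reading of this approach, the sketch below trains a small MLP on synthetic per-cell variables and outputs an event probability for each grid cell. The synthetic data and the scikit-learn model stand in for the real geological, geophysical and geochemical layers of the Castlemaine case study and for whatever network the authors used.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3))            # 3 input layers per grid cell
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.5, 2000) > 1.0).astype(int)

net = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
net.fit(X, y)
prob_map = net.predict_proba(X)[:, 1]     # per-cell event probability
print(prob_map[:5].round(3))
```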

Relevance: 100.00%

Abstract:

Mineral potential mapping is the process of combining a set of input maps, each representing a distinct geo-scientific variable, to produce a single map which ranks areas according to their potential to host deposits of a particular type. The maps are combined using a mapping function which must be either provided by an expert (knowledge-driven approach) or induced from sample data (data-driven approach). Current data-driven approaches using multilayer perceptrons (MLPs) to represent the mapping function have several inherent problems: they rely heavily on subjective judgment in selecting training data and are highly sensitive to this selection; they do not utilize the contextual information provided by unlabeled data; and there is no objective interpretation of the values output by the MLP. This paper presents a novel approach which overcomes these three problems.

Relevance: 100.00%

Abstract:

As part of an ongoing project, a life cycle inventory (LCI) of aluminium high pressure die casting (HPDC) has been compiled, both from the viewpoint of an individual product and of the entire process. The objective of the study was to analyse the process and suggest changes to reduce environmental impacts. One modern aluminium high pressure die casting plant located in Victoria, Australia was evaluated and modelled. Site-specific data on energy and materials were gathered, and the process was modelled using a typical automotive component. The paper also presents our experience and the methodology used in collecting this inventory data from industry for LCA purposes. The inventory data reveal that the HPDC process is energy intensive; as such, the major emissions arose from the use of natural-gas-fired furnaces and from brown-coal-derived electricity. Large environmental benefits were also found from using secondary aluminium rather than primary aluminium in the HPDC process. A detailed LCA based on the inventory obtained is being carried out.

Relevance: 100.00%

Abstract:

We consider a random design model based on independent and identically distributed (iid) pairs of observations (X_i, Y_i), where the regression function m(x) is given by m(x) = E(Y_i | X_i = x) with one independent variable. In a nonparametric setting, the aim is to produce a reasonable approximation to the unknown function m(x) when we have no precise information about the form of the true density f(x) of X. We describe a procedure for estimating the nonparametric regression model at a given point by an appropriately constructed fixed-width (2d) confidence interval with confidence coefficient of at least 1 − α. Here, d (> 0) and α ∈ (0, 1) are two preassigned values. Fixed-width confidence intervals are developed using both the Nadaraya-Watson and local linear kernel estimators of nonparametric regression with data-driven bandwidths.

The sample size was optimized using purely sequential and two-stage sequential procedures, together with the asymptotic properties of the Nadaraya-Watson and local linear estimators. A large-scale simulation study was performed to compare their coverage accuracy. The numerical results indicate that the confidence bands based on the local linear estimator perform better than those constructed using the Nadaraya-Watson estimator; however, both estimators are shown to have asymptotically correct coverage properties.
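For reference, a minimal sketch of the Nadaraya-Watson estimator with a simple data-driven bandwidth follows; the sequential fixed-width interval construction itself is not reproduced, and the rule-of-thumb bandwidth is an assumption.

```python
import numpy as np

def nadaraya_watson(x0, X, Y, h=None):
    """m_hat(x0) = sum_i K((x0-X_i)/h) Y_i / sum_i K((x0-X_i)/h),
    with a Gaussian kernel K and Silverman's rule-of-thumb bandwidth."""
    if h is None:
        h = 1.06 * np.std(X) * len(X) ** (-1 / 5)   # data-driven choice
    k = np.exp(-0.5 * ((x0 - X) / h) ** 2)
    return float(k @ Y / k.sum())

rng = np.random.default_rng(0)
X = rng.uniform(0, np.pi, 300)
Y = np.sin(X) + rng.normal(0, 0.2, 300)
print(nadaraya_watson(1.0, X, Y))   # close to sin(1.0), about 0.841
```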