14 resultados para Data mining, Business intelligence, Previsioni di mercato
em Aston University Research Archive
Resumo:
The purpose of this research is to propose a procurement system across other disciplines and retrieved information with relevant parties so as to have a better co-ordination between supply and demand sides. This paper demonstrates how to analyze the data with an agent-based procurement system (APS) to re-engineer and improve the existing procurement process. The intelligence agents take the responsibility of searching the potential suppliers, negotiation with the short-listed suppliers and evaluating the performance of suppliers based on the selection criteria with mathematical model. Manufacturing firms and trading companies spend more than half of their sales dollar in the purchase of raw material and components. Efficient data collection with high accuracy is one of the key success factors to generate quality procurement which is to purchasing right material at right quality from right suppliers. In general, the enterprises spend a significant amount of resources on data collection and storage, but too little on facilitating data analysis and sharing. To validate the feasibility of the approach, a case study on a manufacturing small and medium-sized enterprise (SME) has been conducted. APS supports the data and information analyzing technique to facilitate the decision making such that the agent can enhance the negotiation and suppler evaluation efficiency by saving time and cost.
Resumo:
Today, the data available to tackle many scientific challenges is vast in quantity and diverse in nature. The exploration of heterogeneous information spaces requires suitable mining algorithms as well as effective visual interfaces. Most existing systems concentrate either on mining algorithms or on visualization techniques. Though visual methods developed in information visualization have been helpful, for improved understanding of a complex large high-dimensional dataset, there is a need for an effective projection of such a dataset onto a lower-dimension (2D or 3D) manifold. This paper introduces a flexible visual data mining framework which combines advanced projection algorithms developed in the machine learning domain and visual techniques developed in the information visualization domain. The framework follows Shneiderman’s mantra to provide an effective user interface. The advantage of such an interface is that the user is directly involved in the data mining process. We integrate principled projection methods, such as Generative Topographic Mapping (GTM) and Hierarchical GTM (HGTM), with powerful visual techniques, such as magnification factors, directional curvatures, parallel coordinates, billboarding, and user interaction facilities, to provide an integrated visual data mining framework. Results on a real life high-dimensional dataset from the chemoinformatics domain are also reported and discussed. Projection results of GTM are analytically compared with the projection results from other traditional projection methods, and it is also shown that the HGTM algorithm provides additional value for large datasets. The computational complexity of these algorithms is discussed to demonstrate their suitability for the visual data mining framework.
Resumo:
We introduce a flexible visual data mining framework which combines advanced projection algorithms from the machine learning domain and visual techniques developed in the information visualization domain. The advantage of such an interface is that the user is directly involved in the data mining process. We integrate principled projection algorithms, such as generative topographic mapping (GTM) and hierarchical GTM (HGTM), with powerful visual techniques, such as magnification factors, directional curvatures, parallel coordinates and billboarding, to provide a visual data mining framework. Results on a real-life chemoinformatics dataset using GTM are promising and have been analytically compared with the results from the traditional projection methods. It is also shown that the HGTM algorithm provides additional value for large datasets. The computational complexity of these algorithms is discussed to demonstrate their suitability for the visual data mining framework. Copyright 2006 ACM.
Resumo:
Objective: Recently, much research has been proposed using nature inspired algorithms to perform complex machine learning tasks. Ant colony optimization (ACO) is one such algorithm based on swarm intelligence and is derived from a model inspired by the collective foraging behavior of ants. Taking advantage of the ACO in traits such as self-organization and robustness, this paper investigates ant-based algorithms for gene expression data clustering and associative classification. Methods and material: An ant-based clustering (Ant-C) and an ant-based association rule mining (Ant-ARM) algorithms are proposed for gene expression data analysis. The proposed algorithms make use of the natural behavior of ants such as cooperation and adaptation to allow for a flexible robust search for a good candidate solution. Results: Ant-C has been tested on the three datasets selected from the Stanford Genomic Resource Database and achieved relatively high accuracy compared to other classical clustering methods. Ant-ARM has been tested on the acute lymphoblastic leukemia (ALL)/acute myeloid leukemia (AML) dataset and generated about 30 classification rules with high accuracy. Conclusions: Ant-C can generate optimal number of clusters without incorporating any other algorithms such as K-means or agglomerative hierarchical clustering. For associative classification, while a few of the well-known algorithms such as Apriori, FP-growth and Magnum Opus are unable to mine any association rules from the ALL/AML dataset within a reasonable period of time, Ant-ARM is able to extract associative classification rules.
Resumo:
Guest editorial Ali Emrouznejad is a Senior Lecturer at the Aston Business School in Birmingham, UK. His areas of research interest include performance measurement and management, efficiency and productivity analysis as well as data mining. He has published widely in various international journals. He is an Associate Editor of IMA Journal of Management Mathematics and Guest Editor to several special issues of journals including Journal of Operational Research Society, Annals of Operations Research, Journal of Medical Systems, and International Journal of Energy Management Sector. He is in the editorial board of several international journals and co-founder of Performance Improvement Management Software. William Ho is a Senior Lecturer at the Aston University Business School. Before joining Aston in 2005, he had worked as a Research Associate in the Department of Industrial and Systems Engineering at the Hong Kong Polytechnic University. His research interests include supply chain management, production and operations management, and operations research. He has published extensively in various international journals like Computers & Operations Research, Engineering Applications of Artificial Intelligence, European Journal of Operational Research, Expert Systems with Applications, International Journal of Production Economics, International Journal of Production Research, Supply Chain Management: An International Journal, and so on. His first authored book was published in 2006. He is an Editorial Board member of the International Journal of Advanced Manufacturing Technology and an Associate Editor of the OR Insight Journal. Currently, he is a Scholar of the Advanced Institute of Management Research. Uses of frontier efficiency methodologies and multi-criteria decision making for performance measurement in the energy sector This special issue aims to focus on holistic, applied research on performance measurement in energy sector management and for publication of relevant applied research to bridge the gap between industry and academia. After a rigorous refereeing process, seven papers were included in this special issue. The volume opens with five data envelopment analysis (DEA)-based papers. Wu et al. apply the DEA-based Malmquist index to evaluate the changes in relative efficiency and the total factor productivity of coal-fired electricity generation of 30 Chinese administrative regions from 1999 to 2007. Factors considered in the model include fuel consumption, labor, capital, sulphur dioxide emissions, and electricity generated. The authors reveal that the east provinces were relatively and technically more efficient, whereas the west provinces had the highest growth rate in the period studied. Ioannis E. Tsolas applies the DEA approach to assess the performance of Greek fossil fuel-fired power stations taking undesirable outputs into consideration, such as carbon dioxide and sulphur dioxide emissions. In addition, the bootstrapping approach is deployed to address the uncertainty surrounding DEA point estimates, and provide bias-corrected estimations and confidence intervals for the point estimates. The author revealed from the sample that the non-lignite-fired stations are on an average more efficient than the lignite-fired stations. Maethee Mekaroonreung and Andrew L. Johnson compare the relative performance of three DEA-based measures, which estimate production frontiers and evaluate the relative efficiency of 113 US petroleum refineries while considering undesirable outputs. Three inputs (capital, energy consumption, and crude oil consumption), two desirable outputs (gasoline and distillate generation), and an undesirable output (toxic release) are considered in the DEA models. The authors discover that refineries in the Rocky Mountain region performed the best, and about 60 percent of oil refineries in the sample could improve their efficiencies further. H. Omrani, A. Azadeh, S. F. Ghaderi, and S. Abdollahzadeh presented an integrated approach, combining DEA, corrected ordinary least squares (COLS), and principal component analysis (PCA) methods, to calculate the relative efficiency scores of 26 Iranian electricity distribution units from 2003 to 2006. Specifically, both DEA and COLS are used to check three internal consistency conditions, whereas PCA is used to verify and validate the final ranking results of either DEA (consistency) or DEA-COLS (non-consistency). Three inputs (network length, transformer capacity, and number of employees) and two outputs (number of customers and total electricity sales) are considered in the model. Virendra Ajodhia applied three DEA-based models to evaluate the relative performance of 20 electricity distribution firms from the UK and the Netherlands. The first model is a traditional DEA model for analyzing cost-only efficiency. The second model includes (inverse) quality by modelling total customer minutes lost as an input data. The third model is based on the idea of using total social costs, including the firm’s private costs and the interruption costs incurred by consumers, as an input. Both energy-delivered and number of consumers are treated as the outputs in the models. After five DEA papers, Stelios Grafakos, Alexandros Flamos, Vlasis Oikonomou, and D. Zevgolis presented a multiple criteria analysis weighting approach to evaluate the energy and climate policy. The proposed approach is akin to the analytic hierarchy process, which consists of pairwise comparisons, consistency verification, and criteria prioritization. In the approach, stakeholders and experts in the energy policy field are incorporated in the evaluation process by providing an interactive mean with verbal, numerical, and visual representation of their preferences. A total of 14 evaluation criteria were considered and classified into four objectives, such as climate change mitigation, energy effectiveness, socioeconomic, and competitiveness and technology. Finally, Borge Hess applied the stochastic frontier analysis approach to analyze the impact of various business strategies, including acquisition, holding structures, and joint ventures, on a firm’s efficiency within a sample of 47 natural gas transmission pipelines in the USA from 1996 to 2005. The author finds that there were no significant changes in the firm’s efficiency by an acquisition, and there is a weak evidence for efficiency improvements caused by the new shareholder. Besides, the author discovers that parent companies appear not to influence a subsidiary’s efficiency positively. In addition, the analysis shows a negative impact of a joint venture on technical efficiency of the pipeline company. To conclude, we are grateful to all the authors for their contribution, and all the reviewers for their constructive comments, which made this special issue possible. We hope that this issue would contribute significantly to performance improvement of the energy sector.
Resumo:
Hierarchical visualization systems are desirable because a single two-dimensional visualization plot may not be sufficient to capture all of the interesting aspects of complex high-dimensional data sets. We extend an existing locally linear hierarchical visualization system PhiVis [1] in several directions: bf(1) we allow for em non-linear projection manifolds (the basic building block is the Generative Topographic Mapping -- GTM), bf(2) we introduce a general formulation of hierarchical probabilistic models consisting of local probabilistic models organized in a hierarchical tree, bf(3) we describe folding patterns of low-dimensional projection manifold in high-dimensional data space by computing and visualizing the manifold's local directional curvatures. Quantities such as magnification factors [3] and directional curvatures are helpful for understanding the layout of the nonlinear projection manifold in the data space and for further refinement of the hierarchical visualization plot. Like PhiVis, our system is statistically principled and is built interactively in a top-down fashion using the EM algorithm. We demonstrate the visualization system principle of the approach on a complex 12-dimensional data set and mention possible applications in the pharmaceutical industry.
Resumo:
Today, the data available to tackle many scientific challenges is vast in quantity and diverse in nature. The exploration of heterogeneous information spaces requires suitable mining algorithms as well as effective visual interfaces. miniDVMS v1.8 provides a flexible visual data mining framework which combines advanced projection algorithms developed in the machine learning domain and visual techniques developed in the information visualisation domain. The advantage of this interface is that the user is directly involved in the data mining process. Principled projection methods, such as generative topographic mapping (GTM) and hierarchical GTM (HGTM), are integrated with powerful visual techniques, such as magnification factors, directional curvatures, parallel coordinates, and user interaction facilities, to provide this integrated visual data mining framework. The software also supports conventional visualisation techniques such as principal component analysis (PCA), Neuroscale, and PhiVis. This user manual gives an overview of the purpose of the software tool, highlights some of the issues to be taken care while creating a new model, and provides information about how to install and use the tool. The user manual does not require the readers to have familiarity with the algorithms it implements. Basic computing skills are enough to operate the software.
Resumo:
In the last two decades there have been substantial developments in the mathematical theory of inverse optimization problems, and their applications have expanded greatly. In parallel, time series analysis and forecasting have become increasingly important in various fields of research such as data mining, economics, business, engineering, medicine, politics, and many others. Despite the large uses of linear programming in forecasting models there is no a single application of inverse optimization reported in the forecasting literature when the time series data is available. Thus the goal of this paper is to introduce inverse optimization into forecasting field, and to provide a streamlined approach to time series analysis and forecasting using inverse linear programming. An application has been used to demonstrate the use of inverse forecasting developed in this study. © 2007 Elsevier Ltd. All rights reserved.
Resumo:
In this paper, a co-operative distributed process mining system (CDPMS) is developed to streamline the workflow along the supply chain in order to offer shorter delivery times, more flexibility and higher customer satisfaction with learning ability. The proposed system is equipped with the ‘distributed process mining’ feature which is used to discover the hidden relationships among each working decision in distributed manner. This method incorporates the concept of data mining and knowledge refinement into decision making process for ensuring ‘doing the right things’ within the workflow. An example of implementation is given, based on the case of slider manufacturer.
Resumo:
When applying multivariate analysis techniques in information systems and social science disciplines, such as management information systems (MIS) and marketing, the assumption that the empirical data originate from a single homogeneous population is often unrealistic. When applying a causal modeling approach, such as partial least squares (PLS) path modeling, segmentation is a key issue in coping with the problem of heterogeneity in estimated cause-and-effect relationships. This chapter presents a new PLS path modeling approach which classifies units on the basis of the heterogeneity of the estimates in the inner model. If unobserved heterogeneity significantly affects the estimated path model relationships on the aggregate data level, the methodology will allow homogenous groups of observations to be created that exhibit distinctive path model estimates. The approach will, thus, provide differentiated analytical outcomes that permit more precise interpretations of each segment formed. An application on a large data set in an example of the American customer satisfaction index (ACSI) substantiates the methodology’s effectiveness in evaluating PLS path modeling results.
Resumo:
Retrospective clinical data presents many challenges for data mining and machine learning. The transcription of patient records from paper charts and subsequent manipulation of data often results in high volumes of noise as well as a loss of other important information. In addition, such datasets often fail to represent expert medical knowledge and reasoning in any explicit manner. In this research we describe applying data mining methods to retrospective clinical data to build a prediction model for asthma exacerbation severity for pediatric patients in the emergency department. Difficulties in building such a model forced us to investigate alternative strategies for analyzing and processing retrospective data. This paper describes this process together with an approach to mining retrospective clinical data by incorporating formalized external expert knowledge (secondary knowledge sources) into the classification task. This knowledge is used to partition the data into a number of coherent sets, where each set is explicitly described in terms of the secondary knowledge source. Instances from each set are then classified in a manner appropriate for the characteristics of the particular set. We present our methodology and outline a set of experiential results that demonstrate some advantages and some limitations of our approach. © 2008 Springer-Verlag Berlin Heidelberg.
Resumo:
We address the important bioinformatics problem of predicting protein function from a protein's primary sequence. We consider the functional classification of G-Protein-Coupled Receptors (GPCRs), whose functions are specified in a class hierarchy. We tackle this task using a novel top-down hierarchical classification system where, for each node in the class hierarchy, the predictor attributes to be used in that node and the classifier to be applied to the selected attributes are chosen in a data-driven manner. Compared with a previous hierarchical classification system selecting classifiers only, our new system significantly reduced processing time without significantly sacrificing predictive accuracy.
Resumo:
Product recommender systems are often deployed by e-commerce websites to improve user experience and increase sales. However, recommendation is limited by the product information hosted in those e-commerce sites and is only triggered when users are performing e-commerce activities. In this paper, we develop a novel product recommender system called METIS, a MErchanT Intelligence recommender System, which detects users' purchase intents from their microblogs in near real-time and makes product recommendation based on matching the users' demographic information extracted from their public profiles with product demographics learned from microblogs and online reviews. METIS distinguishes itself from traditional product recommender systems in the following aspects: 1) METIS was developed based on a microblogging service platform. As such, it is not limited by the information available in any specific e-commerce website. In addition, METIS is able to track users' purchase intents in near real-time and make recommendations accordingly. 2) In METIS, product recommendation is framed as a learning to rank problem. Users' characteristics extracted from their public profiles in microblogs and products' demographics learned from both online product reviews and microblogs are fed into learning to rank algorithms for product recommendation. We have evaluated our system in a large dataset crawled from Sina Weibo. The experimental results have verified the feasibility and effectiveness of our system. We have also made a demo version of our system publicly available and have implemented a live system which allows registered users to receive recommendations in real time. © 2014 ACM.