18 results for Customer feature selection
in Aston University Research Archive
Abstract:
A practical Bayesian approach for inference in neural network models has been available for ten years, and yet it is not used frequently in medical applications. In this chapter we show how both regularisation and feature selection can bring significant benefits in diagnostic tasks through two case studies: heart arrhythmia classification based on ECG data and the prognosis of lupus. In the first of these, the number of variables was reduced by two thirds without significantly affecting performance, while in the second, only the Bayesian models had an acceptable accuracy. In both tasks, neural networks outperformed other pattern recognition approaches.
Abstract:
Data visualization algorithms and feature selection techniques are both widely used in bioinformatics but as distinct analytical approaches. Until now there has been no method of measuring feature saliency while training a data visualization model. We derive a generative topographic mapping (GTM) based data visualization approach which estimates feature saliency simultaneously with the training of the visualization model. The approach not only provides a better projection by modeling irrelevant features with a separate noise model but also gives feature saliency values which help the user to assess the significance of each feature. We compare the quality of projection obtained using the new approach with the projections from traditional GTM and self-organizing maps (SOM) algorithms. The results obtained on a synthetic and a real-life chemoinformatics dataset demonstrate that the proposed approach successfully identifies feature significance and provides coherent (compact) projections. © 2006 IEEE.
Abstract:
The main objective of the project is to enhance the already effective health-monitoring system (HUMS) for helicopters by analysing structural vibrations to recognise different flight conditions directly from sensor information. The goal of this paper is to develop a new method to select those sensors and frequency bands that are best for detecting changes in flight conditions. We projected frequency information to a 2-dimensional space in order to visualise flight-condition transitions using the Generative Topographic Mapping (GTM) and a variant which supports simultaneous feature selection. We created an objective measure of the separation between different flight conditions in the visualisation space by calculating the Kullback-Leibler (KL) divergence between Gaussian mixture models (GMMs) fitted to each class: the higher the KL-divergence, the better the interclass separation. To find the optimal combination of sensors, they were considered in pairs, triples and groups of four sensors. The sensor triples provided the best result in terms of KL-divergence. We also found that the use of a variational training algorithm for the GMMs gave more reliable results.
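The separation measure described above can be illustrated with a minimal sketch. The KL divergence between two Gaussian mixtures has no closed form, so one common option is a Monte Carlo estimate: draw samples from the first mixture and average the log-density ratio. The abstract does not say how the divergence was computed, so the estimator, the one-dimensional setting, and all names and parameter values below are assumptions for illustration only (the paper works in a 2-dimensional visualisation space).

```python
import math
import random

def gmm_pdf(x, weights, means, stds):
    """Density of a 1-D Gaussian mixture evaluated at x."""
    return sum(
        w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        for w, m, s in zip(weights, means, stds)
    )

def gmm_sample(rng, weights, means, stds):
    """Draw one sample: choose a component by weight, then sample from it."""
    k = rng.choices(range(len(weights)), weights=weights)[0]
    return rng.gauss(means[k], stds[k])

def kl_monte_carlo(p, q, n_samples=20000, seed=0):
    """Estimate KL(p || q) as the mean of log p(x) - log q(x) for x drawn from p."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = gmm_sample(rng, *p)
        total += math.log(gmm_pdf(x, *p)) - math.log(gmm_pdf(x, *q))
    return total / n_samples

# Hypothetical class models, each given as (weights, means, stds).
class_a = ([1.0], [0.0], [1.0])
class_b_near = ([1.0], [1.0], [1.0])  # overlaps heavily with class_a
class_b_far = ([1.0], [4.0], [1.0])   # well separated from class_a

kl_near = kl_monte_carlo(class_a, class_b_near)
kl_far = kl_monte_carlo(class_a, class_b_far)
```

In this toy setting the well-separated pair yields the larger estimate, matching the reading that a higher KL divergence indicates better interclass separation. Note that the KL divergence is asymmetric, so KL(p || q) and KL(q || p) generally differ.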
Abstract:
We present results that compare the performance of neural networks trained with two Bayesian methods, (i) the Evidence Framework of MacKay (1992) and (ii) a Markov Chain Monte Carlo method due to Neal (1996) on a task of classifying segmented outdoor images. We also investigate the use of the Automatic Relevance Determination method for input feature selection.
Abstract:
We propose a generative topographic mapping (GTM) based data visualization with simultaneous feature selection (GTM-FS) approach which not only provides a better visualization by modeling irrelevant features ("noise") using a separate shared distribution but also gives a saliency value for each feature which helps the user to assess their significance. This technical report presents a variant of the Expectation-Maximization (EM) algorithm for GTM-FS.
Abstract:
This thesis describes the development of an inexpensive and efficient clustering technique for multivariate data analysis. The technique starts from a multivariate data matrix and ends with a graphical representation of the data and a pattern recognition discriminant function. The technique also results in a distances frequency distribution that might be useful in detecting clustering in the data or for the estimation of parameters useful in the discrimination between the different populations in the data. The technique can also be used in feature selection. The technique is essentially for the discovery of data structure by revealing the component parts of the data. The thesis offers three distinct contributions to cluster analysis and pattern recognition techniques. The first contribution is the introduction of a transformation function in the technique of nonlinear mapping. The second contribution is the use of a distances frequency distribution instead of a distances time-sequence in nonlinear mapping. The third contribution is the formulation of a new generalised and normalised error function together with its optimal step size formula for gradient method minimisation. The thesis consists of five chapters. The first chapter is the introduction. The second chapter describes multidimensional scaling as the origin of the nonlinear mapping technique. The third chapter describes the first development step in the technique of nonlinear mapping, that is, the introduction of the "transformation function". The fourth chapter describes the second development step of the nonlinear mapping technique: the use of a distances frequency distribution instead of a distances time-sequence. The chapter also includes the new generalised and normalised error function formulation. Finally, the fifth chapter, the conclusion, evaluates all developments and proposes a new program for cluster analysis and pattern recognition by integrating all the new features.
Abstract:
There has been considerable recent research into the connection between Parkinson's disease (PD) and speech impairment. Recently, a wide range of speech signal processing algorithms (dysphonia measures) aiming to predict PD symptom severity using speech signals have been introduced. In this paper, we test how accurately these novel algorithms can be used to discriminate PD subjects from healthy controls. In total, we compute 132 dysphonia measures from sustained vowels. Then, we select four parsimonious subsets of these dysphonia measures using four feature selection algorithms, and map these feature subsets to a binary classification response using two statistical classifiers: random forests and support vector machines. We use an existing database consisting of 263 samples from 43 subjects, and demonstrate that these new dysphonia measures can outperform state-of-the-art results, reaching almost 99% overall classification accuracy using only ten dysphonia features. We find that some of the recently proposed dysphonia measures complement existing algorithms in maximizing the ability of the classifiers to discriminate healthy controls from PD subjects. We see these results as an important step toward noninvasive diagnostic decision support in PD.
Abstract:
The standard reference clinical score quantifying average Parkinson's disease (PD) symptom severity is the Unified Parkinson's Disease Rating Scale (UPDRS). At present, UPDRS is determined by the subjective clinical evaluation of the patient's ability to adequately cope with a range of tasks. In this study, we extend recent findings that UPDRS can be objectively assessed to clinically useful accuracy using simple, self-administered speech tests, without requiring the patient's physical presence in the clinic. We apply a wide range of known speech signal processing algorithms to a large database (approx. 6000 recordings from 42 PD patients, recruited to a six-month, multi-centre trial) and propose a number of novel, nonlinear signal processing algorithms which reveal pathological characteristics in PD more accurately than existing approaches. Robust feature selection algorithms select the optimal subset of these algorithms, which is fed into non-parametric regression and classification algorithms, mapping the signal processing algorithm outputs to UPDRS. We demonstrate rapid, accurate replication of the UPDRS assessment with clinically useful accuracy (about 2 UPDRS points difference from the clinicians' estimates, p < 0.001). This study supports the viability of frequent, remote, cost-effective, objective, accurate UPDRS telemonitoring based on self-administered speech tests. This technology could facilitate large-scale clinical trials into novel PD treatments.
Abstract:
One of the main challenges of classifying clinical data is determining how to handle missing features. Most research favours imputing of missing values or neglecting records that include missing data, both of which can degrade accuracy when missing values exceed a certain level. In this research we propose a methodology to handle data sets with a large percentage of missing values and with high variability in which particular data are missing. Feature selection is effected by picking variables sequentially in order of maximum correlation with the dependent variable and minimum correlation with variables already selected. Classification models are generated individually for each test case based on its particular feature set and the matching data values available in the training population. The method was applied to real patients' anonymous mental-health data where the task was to predict the suicide risk judgement clinicians would give for each patient's data, with eleven possible outcome classes: zero to ten, representing no risk to maximum risk. The results compare favourably with alternative methods and have the advantage of ensuring explanations of risk are based only on the data given, not imputed data. This is important for clinical decision support systems using human expertise for modelling and explaining predictions.
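The sequential selection rule described above (maximum correlation with the dependent variable, minimum correlation with variables already selected) can be sketched as a greedy loop. The abstract does not state how the two criteria are combined, so the score used here, absolute correlation with the target minus mean absolute correlation with the selected set, is an assumption, as are the function names and the toy data.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def greedy_select(features, target, k):
    """Pick up to k feature names, trading relevance against redundancy.

    features: dict mapping feature name -> list of values.
    Assumed score: |corr(feature, target)| - mean |corr(feature, selected)|.
    """
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        def score(name):
            relevance = abs(pearson(features[name], target))
            if not selected:
                return relevance
            redundancy = sum(
                abs(pearson(features[name], features[s])) for s in selected
            ) / len(selected)
            return relevance - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy data: x1_noisy nearly duplicates x1, while x2 is uncorrelated with x1.
features = {
    "x1": [1, 2, 3, 4, 5, 6],
    "x1_noisy": [1, 2, 3, 4, 6, 5],
    "x2": [1, -1, 0, 0, -1, 1],
}
target = [3, 3, 6, 8, 9, 13]  # constructed as 2 * x1 + x2

chosen = greedy_select(features, target, 2)
```

With this toy data the second pick is `x2` rather than `x1_noisy`: although `x1_noisy` is more strongly correlated with the target, it is almost entirely redundant with `x1`.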
Abstract:
Feature selection is important in the medical field for many reasons. However, selecting important variables is a difficult task in the presence of censoring, which is a unique feature of survival data analysis. This paper proposed an approach to deal with the censoring problem in endovascular aortic repair survival data through Bayesian networks. It was merged and embedded with a hybrid feature selection process that combines Cox's univariate analysis with machine learning approaches such as ensemble artificial neural networks to select the most relevant predictive variables. The proposed algorithm was compared with common survival variable selection approaches such as the least absolute shrinkage and selection operator (LASSO) and Akaike information criterion (AIC) methods. The results showed that it was capable of dealing with high censoring in the datasets. Moreover, ensemble classifiers increased the area under the ROC curves of the two datasets collected from two centers located in the United Kingdom separately. Furthermore, ensembles constructed with center 1 enhanced the concordance index of center 2 prediction compared to the model built with a single network. Although the size of the final reduced model using the neural networks and its ensembles is greater than other methods, the model outperformed the others in both concordance index and sensitivity for center 2 prediction. This indicates the reduced model is more powerful for cross-center prediction.
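For readers unfamiliar with the concordance index mentioned above, the following is a minimal sketch of the standard Harrell-style definition for right-censored data. It is not taken from the paper: the function name and the toy values are hypothetical, and ties in survival time are simply treated as not comparable.

```python
def concordance_index(times, events, risks):
    """Fraction of comparable pairs ranked correctly by the risk score.

    times:  observed follow-up time per patient
    events: 1 if the event was observed, 0 if the patient was censored
    risks:  predicted risk score (higher means the event is expected sooner)
    A pair (i, j) is comparable when patient i had an observed event
    strictly before patient j's follow-up time. Tied risks count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy example: the second patient is censored at time 4.
times = [2, 4, 6, 8]
events = [1, 0, 1, 1]
risks = [0.9, 0.5, 0.05, 0.1]

c_index = concordance_index(times, events, risks)
```

A censored patient never starts a comparable pair, but can still appear as the longer-surviving member of one. This is why heavy censoring shrinks the set of pairs the index is computed on.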
Abstract:
This thesis studies survival analysis techniques dealing with censoring to produce predictive tools that predict the risk of endovascular aortic aneurysm repair (EVAR) re-intervention. Censoring indicates that some patients do not continue follow-up, so their outcome class is unknown. Methods dealing with censoring have drawbacks and cannot handle the high censoring of the two EVAR datasets collected. Therefore, this thesis presents a new solution to high censoring by modifying an approach that was incapable of differentiating between risk groups of aortic complications. Feature selection (FS) becomes complicated with censoring. Most survival FS methods depend on Cox's model; however, machine learning classifiers (MLC) are preferred. Few methods have adopted MLC to perform survival FS, and those that have cannot be used with high censoring. This thesis proposes two FS methods which use MLC to evaluate features. The two FS methods use the new solution to deal with censoring. They combine factor analysis with a greedy stepwise FS search which allows eliminated features to enter the FS process. The first FS method searches for the best neural networks' configuration and subset of features. The second approach combines support vector machines, neural networks, and k-nearest neighbor classifiers using simple and weighted majority voting to construct a multiple classifier system (MCS) for improving the performance of individual classifiers. It presents a new hybrid FS process by using MCS as a wrapper method and merging it with the iterated feature ranking filter method to further reduce the features. The proposed techniques outperformed FS methods based on Cox's model, such as the Akaike and Bayesian information criteria and the least absolute shrinkage and selection operator, in terms of the log-rank test's p-values, sensitivity, and concordance. This proves that the proposed techniques are more powerful in correctly predicting the risk of re-intervention. Consequently, they enable doctors to set an appropriate future observation plan for each patient.
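The simple and weighted majority voting used to combine the classifiers can be sketched as follows. The abstract does not say how the weights are derived, so the weights below are hypothetical placeholders (for instance, they might stand for each classifier's validation accuracy), and the function name is an assumption.

```python
from collections import defaultdict

def weighted_majority_vote(predictions, weights=None):
    """Return the label with the largest total weight.

    predictions: one predicted label per classifier
    weights:     one non-negative weight per classifier; if omitted,
                 every classifier counts equally (simple majority voting)
    """
    if weights is None:
        weights = [1.0] * len(predictions)
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)

# Hypothetical outputs of three classifiers (SVM, neural network, k-NN).
predictions = ["high", "low", "low"]

simple_result = weighted_majority_vote(predictions)
weighted_result = weighted_majority_vote(predictions, weights=[0.9, 0.3, 0.4])
```

In this toy case simple voting returns "low" (two votes against one), while the weighted vote returns "high" because the single dissenting classifier carries more weight than the other two combined.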
Abstract:
Developing a strategy for online channels requires knowledge of the effects of customers' online use on their revenue and cost to serve, which ultimately influence customer profitability. The authors theoretically discuss and empirically examine these effects. An empirical study of retail banking customers reveals that online use improves customer profitability by increasing customer revenue and decreasing cost to serve. Moreover, the revenue effects of online use are substantially larger than the cost-to-serve effects, although the effects of online use on customer revenue and cost to serve vary by product portfolio. Self-selection effects also emerge and can be even greater than online use effects. Ignoring self-selection effects thus can lead to poor managerial decision-making.
Abstract:
Aircraft manufacturing industries are looking for solutions to increase their productivity. One solution is to apply metrology systems during the production and assembly processes. The Metrology Process Model (MPM) (Maropoulos et al., 2007) has been introduced, which emphasises metrology applications in assembly planning, manufacturing processes and product design. Measurability analysis is part of the MPM, and the aim of this analysis is to check the feasibility of measuring the designed large-scale components. Measurability analysis has been integrated in order to provide an efficient matching system. The metrology database is structured by developing the Metrology Classification Model. Furthermore, the feature-based selection model is also explained. By combining the two classification models, a novel approach and selection processes for an integrated measurability analysis system (MAS) are introduced; such an integrated MAS could provide much more meaningful matching results for the operators. © Springer-Verlag Berlin Heidelberg 2010.
Abstract:
Anyone who looks at the title of this special issue will agree that the intent behind the preparation of this volume was ambitious: to predict and discuss “The Future of Manufacturing”. Will manufacturing be important in the future? Even though some sceptics might say not, and put on the table some old familiar arguments, we would strongly disagree. To bring subsidies for the argument we issued the call-for-papers for this special issue of Journal of Manufacturing Technology Management, fully aware of the size of the challenge in our hands. But we strongly believed that the enterprise would be worthwhile. The point of departure is the ongoing debate concerning the meaning and content of manufacturing. The easily visualised internal activity of using tangible resources to make physical products in factories is no longer a viable way to characterise manufacturing. It is now a more loosely defined concept concerning the organisation and management of open, interdependent, systems for delivering goods and services, tangible and intangible, to diverse types of markets. Interestingly, Wickham Skinner is the most cited author in this special issue of JMTM. He provides the departure point of several articles because his vision and insights have guided and inspired researchers in production and operations management from the late 1960s until today. However, the picture that we draw after looking at the contributions in this special issue is intrinsically distinct, much more dynamic, and complex. 
Seven articles address the following research themes: 1. new patterns of organisation, where the boundaries of firms become blurred and the role of the firm in the production system as well as that of manufacturing within the firm become contingent; 2. new approaches to strategic decision-making in markets characterised by turbulence and weak signals at the customer interface; 3. new challenges in strategic and operational decisions due to changes in the profile of the workforce; 4. new global players, especially China, modifying the manufacturing landscape; and 5. new techniques, methods and tools that are being made feasible through progress in new technological domains. Of course, many other important dimensions could be studied, but these themes are representative of current changes and future challenges. Three articles look at the first theme: organisational evolution of production and operations in firms and networks. Karlsson and Skold's article represents one further step in their efforts to characterise “the extraprise”. In the article, they advance the construction of a new framework, based on “the network perspective”, by defining the formal elements which compose it and exploring the meaning of different types of relationships. The way in which “actors, resources and activities” are conceptualised extends the existing boundaries of analytical thinking in operations management and opens new avenues for research, teaching and practice. The higher level of abstraction, an intrinsic feature of the framework, is associated with the increasing degree of complexity that characterises decisions related to strategy and implementation in the manufacturing and operations area, a feature that is expected to become more and more pervasive as time proceeds. Riis, Johansen, Englyst and Sorensen have also based their article on their previous work, which in this case is on “the interactive firm”. 
They advance new propositions on strategic roles of manufacturing and discuss why the configuration of strategic manufacturing roles, at the level of the network, will become a key issue and how the indirect strategic roles of manufacturing will become increasingly important. Additionally, by considering that value chains will become value webs, they predict that shifts in strategic manufacturing roles will look like a sequence of moves similar to a game of chess. Then, lastly under the first theme, Fleury and Fleury develop a conceptual framework for the study of production systems in general derived from field research in the telecommunications industry, here considered a prototype of the coming information society and knowledge economy. They propose a new typology of firms which, on certain dimensions, complements the propositions found in the other two articles. Their telecoms-based framework (TbF) comprises six types of companies characterised by distinct profiles of organisational competences, which interact according to specific patterns of relationships, thus creating distinct configurations of production networks. The second theme is addressed by Kyläheiko and Sandström in their article “Strategic options based framework for management of dynamic capabilities in manufacturing firms”. They propose a new approach to strategic decision-making in markets characterised by turbulence and weak signals at the customer interface. Their framework for a manufacturing firm in the digital age leads to active asset selection (strategic investments in both tangible and intangible assets) and efficient orchestrating of the global value net in “thin” intangible asset markets. The framework consists of five steps based on Porter's five-forces model and the resource-based view, complemented by the concepts of strategic options and related flexibility issues. 
Thun, Grössler and Miczka's contribution to the third theme brings the human dimension to the debate regarding the future of manufacturing. Their article focuses on the challenges brought to management by the ageing of workers in Germany but, in the arguments that are raised, the future challenges associated with workers and work organisation in every production system become visible and relevant. An interesting point in the approach adopted by the authors is that not only the factual problems and solutions are taken into account but the perception of the managers is brought into the picture. China cannot be absent from the discussion of the future of manufacturing. Therefore, within the fourth theme, Vaidya, Bennett and Liu provide evidence of the gradual improvement of Chinese companies in the medium and high-tech sectors, using revealed comparative advantage (RCA) analysis. The Chinese evolution is shown to be based on capabilities developed through combining international technology transfer and indigenous learning. The main implication for Western companies is the need to take account of the accelerated rhythm of capability development in China. For other developing countries, China's case provides lessons of great importance. Finally, under the fifth theme, Kuehnle's article “Post mass production paradigm (PMPP) trajectories” provides a futuristic scenario of what is already around us and might become prevalent in the future. It takes a very intensive look at a whole set of dimensions that are affecting manufacturing now, and will influence manufacturing in the future, ranging from the application of ICT to the need for social transparency. In summary, this special issue of JMTM presents a brief, but indisputable, demonstration of the possible richness of manufacturing in the future. Indeed, we could even say that manufacturing has no future if we only stick to the past perspectives. Embracing the new is not easy. 
The new configurations of production systems, the distributed and complementary roles to be performed by distinct types of companies in diversified networked structures, leveraged by the new emergent technologies, and the associated new challenges for managing people are all themes that are carriers of the future. The Guest Editors of this special issue on the future of manufacturing are strongly convinced that their undertaking has been worthwhile.
Abstract:
Market entry decisions are some of a firm's most important long-term strategic choices. Still, the international marketing literature has not yet fully incorporated the idea of relationship marketing in general, and the customer value concept in particular, as a basis for market entry decisions. This article presents some conceptual ideas about a customer-value-based market selection model. The metric International Added Customer Equity (IACE), a straightforward decision criterion derived from the customer equity concept, is presented as an additional decision criterion for export market selection and ultimately market entry.