6 results for Feasible set mapping

in Aston University Research Archive


Relevance:

30.00%

Publisher:

Abstract:

This paper addresses the problem of novelty detection in the case where the observed data are a mixture of a known 'background' process and an unknown other process, which generates the outliers, or novel observations. The framework we describe here is quite general, employing univariate classification with incomplete information, based on knowledge of the distribution (the probability density function, 'pdf') of the data generated by the 'background' process. The relative proportion of this 'background' component (the 'prior' 'background' probability), and the pdf and 'prior' probabilities of all other components, are assumed unknown. The main contribution is a new classification scheme that identifies the maximum proportion of observed data following the known 'background' distribution. The method exploits the Kolmogorov-Smirnov test to estimate the proportions, after which the data are separated Bayes-optimally. Results, demonstrated with synthetic data, show that this approach can produce more reliable results than a standard novelty detection scheme. The classification algorithm is then applied to the problem of identifying outliers in the SIC2004 data set, in order to detect the radioactive release simulated in the 'joker' data set. We propose this method as a reliable means of novelty detection in emergency situations, which can also be used to identify outliers prior to the application of a more general automatic mapping algorithm. © Springer-Verlag 2007.
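
As a rough illustration of the building block mentioned in this abstract, the sketch below computes a one-sample Kolmogorov-Smirnov statistic against a known 'background' distribution. It is not the paper's proportion estimator or its Bayes-optimal separation step. The standard normal background, the contaminating component, the sample sizes and all function names are assumptions chosen for the example.

```python
import math
import random

def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of the assumed 'background' process (a normal distribution here)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def ks_statistic(sample, cdf):
    """One-sample KS statistic: largest gap between the empirical CDF and cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

rng = random.Random(0)

# Hypothetical data: a pure background sample and a sample with 20% outliers.
background = [rng.gauss(0.0, 1.0) for _ in range(2000)]
contaminated = ([rng.gauss(0.0, 1.0) for _ in range(1600)]
                + [rng.gauss(4.0, 0.5) for _ in range(400)])

d_clean = ks_statistic(background, normal_cdf)
d_mixed = ks_statistic(contaminated, normal_cdf)
print(f"KS statistic, background only: {d_clean:.3f}")
print(f"KS statistic, contaminated:    {d_mixed:.3f}")
```

The contaminated sample should give a clearly larger statistic, which is the kind of signal a proportion estimate can be built on.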

Relevance:

30.00%

Publisher:

Abstract:

This study examines the issues of 'integration' of human resource management (HRM) into the corporate strategy, 'devolvement' of HRM to line managers and the perceived influence of national culture on HRM in a cross-national comparative context. To this end, the cognition of personnel specialists from a matched sample of 48 Indian and British firms in the manufacturing sector is analyzed using the 'Visual Cards Sorting' and 'CMAP2' methodologies. The findings show that even where there is an apparent convergence of strategy (e.g., the desire of both Indian and British personnel managers to increase integration between HRM and business strategy, and to increase the level of devolvement to line managers), the two sets of specialists clearly follow a different logic of action, which is subject to a different set of cross-cultural influences. The implications of pursuing apparently similar HRM solutions in different cross-national contexts are considered. The analysis shows that HRM strategies, when considered in a cross-national context, vary considerably. Different logic leads to the adoption of similar HR strategies, and similar strategies in turn are perceived as producing different outcomes. This variance centres around the existence and perceived influence of several contextual variables such as industrial relations systems, operation of labour markets, and changes in business systems. Specific cross-cultural influences, along with different aspects of the competitive business environment associated with the generic HR strategies of integration and devolvement in the two countries, are highlighted. This research contributes to the fields of cross-cultural management research, international HRM, and managerial and organizational cognition. It also has important messages for policy makers.

Relevance:

30.00%

Publisher:

Abstract:

It has been argued that a single two-dimensional visualization plot may not be sufficient to capture all of the interesting aspects of complex data sets, and therefore a hierarchical visualization system is desirable. In this paper we extend an existing locally linear hierarchical visualization system PhiVis (Bishop98a) in several directions: 1. We allow for non-linear projection manifolds. The basic building block is the Generative Topographic Mapping. 2. We introduce a general formulation of hierarchical probabilistic models consisting of local probabilistic models organized in a hierarchical tree. General training equations are derived, regardless of the position of the model in the tree. 3. Using tools from differential geometry we derive expressions for local directional curvatures of the projection manifold. Like PhiVis, our system is statistically principled and is built interactively in a top-down fashion using the EM algorithm. It enables the user to interactively highlight those data in the parent visualization plot which are captured by a child model. We also incorporate into our system a hierarchical, locally selective representation of magnification factors and directional curvatures of the projection manifolds. Such information is important for further refinement of the hierarchical visualization plot, as well as for controlling the amount of regularization imposed on the local models. We demonstrate the principle of the approach on a toy data set and apply our system to two more complex 12- and 19-dimensional data sets.

Relevance:

30.00%

Publisher:

Abstract:

This thesis applies a hierarchical latent trait model system to a large quantity of data. It was motivated by the lack of viable approaches for analysing High Throughput Screening (HTS) datasets, which may include thousands of data points with high dimensions. HTS is an important tool in the pharmaceutical industry for discovering leads which can be optimised and further developed into candidate drugs. Since the development of new robotic technologies, the ability to test the activities of compounds has considerably increased in recent years. Traditional methods, which rely on inspecting tables and graphical plots to analyse relationships between measured activities and the structure of compounds, are not feasible for a large HTS dataset. Instead, data visualisation provides a method for analysing such large datasets, especially those with high dimensions. So far, a few visualisation techniques for drug design have been developed, but most of them cope with only a few properties of compounds at one time. We believe that a latent variable model with a non-linear mapping from the latent space to the data space is a preferred choice for visualising a complex high-dimensional data set. As a type of latent variable model, the latent trait model (LTM) can deal with either continuous data or discrete data, which makes it particularly useful in this domain. In addition, with the aid of differential geometry, we can imagine the distribution of data from magnification factor and curvature plots. Rather than obtaining the useful information from a single plot, a hierarchical LTM arranges a set of LTMs and their corresponding plots in a tree structure. We model the whole data set with an LTM at the top level, which is broken down into clusters at deeper levels of the hierarchy. In this manner, refined visualisation plots can be displayed at deeper levels and sub-clusters may be found.
The hierarchy of LTMs is trained using the expectation-maximisation (EM) algorithm to maximise its likelihood with respect to the data sample. Training proceeds interactively in a recursive, top-down fashion. The user subjectively identifies interesting regions on the visualisation plot that they would like to model in greater detail. At each stage of hierarchical LTM construction, the EM algorithm alternates between the E- and M-step. Another problem that can occur when visualising a large data set is that there may be significant overlaps of data clusters, which makes it very difficult for the user to judge where the centres of regions of interest should be put. We address this problem by employing the minimum message length technique, which can help the user to decide the optimal structure of the model. In this thesis we also demonstrate the applicability of the hierarchy of latent trait models in the field of document data mining.
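
To make the E-step/M-step alternation concrete, here is a minimal sketch of EM on a much simpler model than the one in the thesis: a two-component one-dimensional Gaussian mixture. It is not a latent trait model and not the hierarchical training procedure described above. The synthetic data, the initialisation and all names are assumptions made for the example.

```python
import math
import random

def normal_pdf(x, mu, var):
    """Density of a normal distribution with mean mu and variance var."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def em_two_gaussians(xs, n_iter=50):
    """Fit a two-component 1-D Gaussian mixture by alternating E- and M-steps."""
    # Assumed initialisation: means at the data extremes, unit variances.
    mu = [min(xs), max(xs)]
    var = [1.0, 1.0]
    weight = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in xs:
            joint = [weight[k] * normal_pdf(x, mu[k], var[k]) for k in range(2)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: re-estimate the parameters from the weighted data.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weight[k] = nk / len(xs)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            spread = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            var[k] = max(spread, 1e-6)  # guard against a collapsing component
    return weight, mu, var

rng = random.Random(1)
# Hypothetical sample: two clusters centred at -2 and 3.
data = ([rng.gauss(-2.0, 1.0) for _ in range(300)]
        + [rng.gauss(3.0, 1.0) for _ in range(300)])
weight, mu, var = em_two_gaussians(data)
print("weights:", weight)
print("means:  ", mu)
```

In a hierarchical setting, each child model would be fitted in a similar alternating fashion, restricted to the data region selected in its parent plot.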

Relevance:

30.00%

Publisher:

Abstract:

In recent years, issues of childhood obesity, unsafe toys, and child labor have raised the question of corporate responsibilities to children. However, business impacts on children are complex, multi-faceted, and frequently overlooked by senior managers. This article reports on a systematic analysis of the reputational landscape constructed by the media, corporations, and non-government organizations around business responsibilities to children. A content analysis methodology is applied to a sample of more than 350 relevant accounts during a 5-year period. We identify seven core responsibilities that are then used to provide a framework for enabling businesses to map their range of impacts on children. We set out guidelines for how to identify and manage the firm’s strategic responsibilities in this arena, and identify the constraints that corporations face in meeting such responsibilities.

Relevance:

30.00%

Publisher:

Abstract:

The accuracy of a map is dependent on the reference dataset used in its construction. Classification analyses used in thematic mapping can, for example, be sensitive to a range of sampling and data quality concerns. With particular focus on the latter, the effects of reference data quality on land cover classifications from airborne thematic mapper data are explored. Variations in sampling intensity and effort are highlighted in a dataset that is widely used in mapping and modelling studies; these may need accounting for in analyses. The quality of the labelling in the reference dataset was also a key variable influencing mapping accuracy. Accuracy varied with the amount and nature of mislabelled training cases, with the nature of the effects varying between classifiers. The largest impacts on accuracy occurred when mislabelling involved confusion between similar classes. Accuracy was also typically negatively related to the magnitude of mislabelled cases, and the support vector machine (SVM), which has been claimed to be relatively insensitive to training data error, was the most sensitive of the set of classifiers investigated, with overall classification accuracy declining by 8% (significant at the 95% level of confidence) with the use of a training set containing 20% mislabelled cases.
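
The kind of experiment described in this abstract can be sketched as follows: corrupt a fraction of the training labels, retrain, and compare accuracy on clean test data. This is only an illustration and does not reproduce the study. The classifier is a 1-nearest-neighbour rule swapped in for simplicity (not the SVM or the other classifiers investigated), the data are synthetic two-class points rather than airborne thematic mapper data, and all names and settings are assumptions.

```python
import random

def make_data(rng, n_per_class):
    """Synthetic two-class 2-D data; the class centres are assumed values."""
    data = []
    for label, centre in enumerate([(0.0, 0.0), (6.0, 6.0)]):
        for _ in range(n_per_class):
            point = (rng.gauss(centre[0], 1.0), rng.gauss(centre[1], 1.0))
            data.append((point, label))
    return data

def flip_labels(rng, data, fraction):
    """Return a copy of data with the given fraction of labels flipped."""
    noisy = list(data)
    n_flip = int(round(fraction * len(noisy)))
    for i in rng.sample(range(len(noisy)), n_flip):
        point, label = noisy[i]
        noisy[i] = (point, 1 - label)
    return noisy

def predict_1nn(train, point):
    """Label of the training case closest to point (squared Euclidean distance)."""
    def sq_dist(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    nearest = min(train, key=lambda item: sq_dist(item[0], point))
    return nearest[1]

def accuracy(train, test):
    """Fraction of test cases whose predicted label matches the true label."""
    hits = sum(predict_1nn(train, p) == y for p, y in test)
    return hits / len(test)

rng = random.Random(42)
train = make_data(rng, 100)
test = make_data(rng, 100)

acc_clean = accuracy(train, test)
acc_noisy = accuracy(flip_labels(rng, train, 0.20), test)
print(f"accuracy with clean training labels: {acc_clean:.3f}")
print(f"accuracy with 20% mislabelled cases: {acc_noisy:.3f}")
```

Flipping labels uniformly at random is the simplest noise model; the abstract notes that confusion between similar classes had the largest impact, which would call for a class-dependent flipping rule instead.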