10 results for Incomplete Data
in the Aston University Research Archive
Abstract:
We analyse how the Generative Topographic Mapping (GTM) can be modified to cope with missing values in the training data. Our approach is based on an Expectation-Maximisation (EM) method which estimates the parameters of the mixture components and, at the same time, deals with the missing values. We incorporate this algorithm into a hierarchical GTM. We verify the method on a toy data set (using a single GTM) and a realistic data set (using a hierarchical GTM). The results show that our algorithm can help to construct informative visualisation plots, even when some of the training points are corrupted with missing values.
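As a hedged illustration of the idea (not the authors' code): GTM's mixture components are spherical Gaussians with a shared inverse variance beta and uniform priors, so marginalising over missing dimensions in the E-step amounts to dropping them from the squared distance. A minimal sketch, with illustrative names:

```python
import numpy as np

def gtm_responsibilities(X, centres, beta):
    """E-step responsibilities for a spherical Gaussian mixture
    (GTM's components: shared inverse variance beta, uniform priors)
    when X may contain NaNs for missing values.  Marginalising a
    spherical Gaussian over missing dimensions simply drops those
    dimensions from the squared distance.
    X: (N, D) data, NaN marks a missing entry.
    centres: (K, D) component centres (GTM's mapped latent grid).
    """
    N, K = X.shape[0], centres.shape[0]
    log_r = np.zeros((N, K))
    for n in range(N):
        obs = ~np.isnan(X[n])                      # observed dimensions of x_n
        d2 = ((X[n, obs] - centres[:, obs]) ** 2).sum(axis=1)
        log_r[n] = -0.5 * beta * d2                # uniform prior cancels
    log_r -= log_r.max(axis=1, keepdims=True)      # stabilise the softmax
    R = np.exp(log_r)
    return R / R.sum(axis=1, keepdims=True)        # p(component k | x_n)
```

The corresponding M-step would then re-estimate the GTM mapping and beta from these responsibilities, filling each missing entry with its expectation under the responsible components.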
Abstract:
Jaccard has been the similarity metric of choice in ecology and forensic psychology for comparing sites or offences by species or behaviour. This paper applies a more powerful hierarchical measure, taxonomic similarity (s), recently developed in marine ecology, to the task of behaviourally linking serial crime. Forensic case linkage attempts to identify behaviourally similar offences committed by the same unknown perpetrator (called linked offences). s considers progressively higher-level taxa, such that two sites show some similarity even without shared species. We apply this index by analysing 55 specific offence behaviours classified hierarchically. The behaviours are taken from 16 sexual offences by seven juveniles, where each offender committed two or more offences. We demonstrate that both Jaccard and s show linked offences to be significantly more similar than unlinked offences. In simulations with up to 20% of the specific behaviours removed, s is at least as effective at distinguishing linked offences as Jaccard applied to the full data set. Moreover, s retains a significant difference between linked and unlinked pairs with up to 50% of the specific behaviours removed. As police decision-making often depends upon incomplete data, s has clear advantages, and its application may extend to other crime types. Copyright © 2007 John Wiley & Sons, Ltd.
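For concreteness: the Jaccard index between two offences coded as sets of behaviours is the shared behaviours over the union; the taxonomic measure additionally credits behaviours that share only a higher-level category. A minimal sketch of the Jaccard part, with hypothetical behaviour codes:

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: shared items / all items in either set."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical offence profiles coded as sets of specific behaviours
offence_1 = {"approach_con", "weapon_knife", "victim_bound"}
offence_2 = {"approach_con", "weapon_knife", "gag_used"}
print(jaccard(offence_1, offence_2))  # 2 shared / 4 total = 0.5
```

Under Jaccard, "victim_bound" and "gag_used" contribute nothing to the score; the taxonomic index would grant partial credit if both fall under the same higher-level category (e.g. restraint behaviours).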
Abstract:
This dissertation investigates the important and current problem of modelling human expertise, an issue that arises in any computer system emulating human decision making. It is especially prominent in Clinical Decision Support Systems (CDSS) because of the complexity of the induction process and the vast number of parameters involved in most cases. Other issues, such as human error and missing or incomplete data, present further challenges. In this thesis, the Galatean Risk Screening Tool (GRiST) is used as an example of modelling clinical expertise and parameter elicitation. The tool is a mental health clinical record management system with a top layer of decision support capabilities, currently being deployed by several NHS mental health trusts across the UK. The aim of the research is to investigate the problem of parameter elicitation by inducing the parameters from real clinical data rather than from the human experts who provided the decision model. The induced parameters provide an insight into both the data relationships and how experts make decisions. The outcomes help further the understanding of human decision making and, in particular, help GRiST provide more accurate emulations of risk judgements. Although the algorithms and methods presented in this dissertation are applied to GRiST, they can be adopted in other human knowledge engineering domains.
Abstract:
One dominant feature of modern manufacturing chains is the movement of goods. Manufacturing companies would remain an unprofitable investment if the supply and logistics of raw materials, semi-finished products or final goods were not handled effectively. Both levels of a modern manufacturing chain, actual production and logistics, are characterized by continuous data creation at a much faster rate than the data can be meaningfully analyzed and acted upon manually. Often, instant and reliable decisions need to be taken based on huge, previously inconceivable amounts of heterogeneous, contradictory or incomplete data. The paper highlights aspects of information flows related to business process data visibility and observability in modern manufacturing networks, and presents an information management platform developed in the framework of the EU FP7 project ADVANCE.
Abstract:
Purpose – The purpose of this paper is to examine the challenges and potential of big data in heterogeneous business networks and relate these to an implemented logistics solution. Design/methodology/approach – The paper establishes an overview of challenges and opportunities of current significance in the area of big data, specifically in the context of transparency and processes in heterogeneous enterprise networks. Within this context, the paper presents how existing components and purpose-driven research were combined for a solution implemented in a nationwide network for less-than-truckload consignments. Findings – Aside from providing an extended overview of today's big data situation, the findings show that technical means and methods available today can constitute a feasible process transparency solution in a large heterogeneous network where legacy practices, reporting lags and incomplete data exist, yet where processes are sensitive to inadequate policy changes. Practical implications – The means introduced in the paper proved useful for improving process efficiency, transparency and planning in logistics networks. The particular system design choices in the presented solution allow an incremental introduction or evolution of resource handling practices, incorporating the existing fragmentary, unstructured or tacit knowledge of experienced personnel into the theoretically founded overall concept. Originality/value – The paper extends the previous high-level view on the potential of big data and presents new applied research and development results in a logistics application.
Abstract:
Purpose – To propose and investigate a stable numerical procedure for the reconstruction of the velocity of a viscous incompressible fluid flow in linear hydrodynamics from knowledge of the velocity and fluid stress force given on a part of the boundary of a bounded domain. Design/methodology/approach – Earlier works have treated the similar problem in the stationary case (time-independent fluid flow). Extending these ideas, a procedure is proposed and investigated for the time-dependent case as well. Findings – The paper presents a novel variational method for the Cauchy problem, proves its convergence and also proposes a new boundary element method. Research limitations/implications – The fluid flow domain is limited to annular domains; this restriction can be removed by undertaking analyses in appropriate weighted spaces to incorporate singularities that can occur on general bounded domains. Future work involves numerical investigations and the consideration of Oseen-type flow. A challenging problem is to consider the non-linear Navier-Stokes equations. Practical implications – Fluid flow problems where data are known only on a part of the boundary occur in a range of engineering situations, such as colloidal suspensions and the swimming of microorganisms. For example, the solution domain can be the region between two spheres where only the outer sphere is accessible for measurements. Originality/value – A novel variational method for the Cauchy problem is proposed which preserves the unsteady Stokes operator; convergence is proved and, using recent results on the fundamental solution for the unsteady Stokes system, a new boundary element method for this system is also proposed.
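In a typical formulation of such a problem (notation assumed here, not necessarily the paper's), the task is to recover the velocity on the inaccessible boundary part Γ₁ = ∂Ω \ Γ₀ from over-determined data on the accessible part Γ₀:

```latex
% Lateral Cauchy problem for the unsteady Stokes system (sketch;
% viscosity normalised, \sigma is the fluid stress tensor, n the
% outward unit normal)
\begin{aligned}
\partial_t u - \Delta u + \nabla p &= 0, \qquad \nabla\cdot u = 0
  && \text{in } \Omega \times (0,T),\\
u(\cdot,0) &= 0 && \text{in } \Omega,\\
u = f, \qquad \sigma(u,p)\,n &= g && \text{on } \Gamma_0 \times (0,T).
\end{aligned}
```

Prescribing both the velocity f and the stress force g on Γ₀ makes the data over-determined there and the overall problem ill-posed, which is why a regularising variational procedure is needed.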
Abstract:
An iterative procedure for determining temperature fields from Cauchy data given on a part of the boundary is presented. At each iteration step, a series of mixed well-posed boundary value problems are solved for the heat operator and its adjoint. A convergence proof of this method in a weighted L2-space is included, as well as a stopping criterion for the case of noisy data. Moreover, a solvability result in a weighted Sobolev space for a parabolic initial boundary value problem of second order with mixed boundary conditions is presented, and regularity of the solution is proved. (© 2007 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)
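The underlying ill-posed problem is, in a typical formulation (notation assumed, not necessarily the paper's), the lateral Cauchy problem for the heat equation, with both Dirichlet and Neumann data given on the accessible boundary part Γ₀ and the temperature sought on Γ₁ = ∂Ω \ Γ₀:

```latex
% Lateral Cauchy problem for the heat equation (sketch)
\begin{aligned}
u_t - \Delta u &= 0 && \text{in } \Omega \times (0,T),\\
u(\cdot,0) &= 0 && \text{in } \Omega,\\
u = f, \quad \partial_\nu u &= g && \text{on } \Gamma_0 \times (0,T).
\end{aligned}
```

Each sweep of the iteration then solves a well-posed mixed problem, combining the current guess on Γ₁ with one of the measured data on Γ₀, for the heat operator and the corresponding problem for its adjoint, and uses the resulting trace on Γ₁ as the next guess, in the spirit of alternating schemes for Cauchy problems.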
Abstract:
With an increased emphasis on outsourcing and shortening business cycles, contracts between firms have become more important. Carefully written contracts contribute to the efficiency and longevity of inter-firm relationships, as they may constrain opportunism and are often a less costly governance mechanism than maintaining complex social relationships (Larson 1992). This exploratory examination adds to our understanding of how incomplete contracts affect interorganizational exchange. First, we consider the multiple dimensions of contract constraints (safeguards). We also investigate the extent to which constraints affect decisions to enforce the relationship by delaying payments, and whether the decision is efficient. Finally, we examine the extent to which the constraints are effective (and ineffective) at reducing transaction problems associated with enforcement. We test our research propositions on 971 observations of transactions using explicit, written terms and other secondary data, in the context of IT transactions in the Netherlands.
Abstract:
Due to incomplete paperwork, this item is only available for consultation at Aston University Library and Information Services with prior arrangement.
Abstract:
Heterogeneous and incomplete datasets are common in many real-world visualisation applications. The probabilistic nature of the Generative Topographic Mapping (GTM), which was originally developed for complete continuous data, can be extended to model heterogeneous (i.e. containing both continuous and discrete values) and missing data. This paper describes and assesses the resulting model on both synthetic and real-world heterogeneous data with missing values.
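One common way to build such a model (a sketch under assumed notation, not necessarily this paper's exact formulation) is to let each GTM component's density factorise over dimensions, with Gaussian factors for continuous variables and Bernoulli factors for binary ones; a missing entry is then marginalised out simply by dropping its factor:

```latex
% Per-component density over the observed entries only (sketch)
p(\mathbf{x}_{o} \mid k) \;=\;
  \prod_{d \in \mathcal{C} \cap o} \mathcal{N}\!\bigl(x_d \mid m_{kd}, \beta^{-1}\bigr)
  \;\prod_{d \in \mathcal{D} \cap o} \mu_{kd}^{\,x_d}\,(1 - \mu_{kd})^{\,1 - x_d}
```

Here o indexes the observed entries of x, 𝒞 the continuous and 𝒟 the binary dimensions, m_{kd} and μ_{kd} are the component parameters, and β is the inverse variance shared by the continuous factors.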