168 results for Correlation (Statistics)
Abstract:
This paper introduces two new techniques for determining nonlinear canonical correlation coefficients between two variable sets, incorporating a genetic strategy to determine the coefficients. Compared to existing methods for nonlinear canonical correlation analysis (NLCCA), the benefit is that the nonlinear mapping requires fewer parameters to be determined; consequently, a more parsimonious NLCCA model can be established, which is simpler to interpret. A further contribution of the paper is the investigation of a variety of nonlinear deflation procedures for determining the subsequent nonlinear canonical coefficients. The benefits of the new approaches are demonstrated by application to an example from the literature and to recorded data from an industrial melter process. These studies show the advantages of the new NLCCA techniques and suggest that a nonlinear deflation procedure should be considered.
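The genetic-algorithm details are the paper's own, but the linear CCA core that NLCCA generalises, and the role of deflation in extracting subsequent coefficient pairs, can be sketched. The minimal NumPy example below is illustrative only: the whitening-plus-SVD formulation and the projection-based deflation are assumptions, not the authors' method.

```python
# Minimal sketch of linear CCA (the core that NLCCA generalises) with one
# projection-based deflation step. Illustrative only; not the paper's
# genetic-algorithm NLCCA.
import numpy as np

def cca_first_pair(X, Y):
    """First canonical weights and correlation for X (n,p) and Y (n,q)."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    Ux, Sx, Vhx = np.linalg.svd(X, full_matrices=False)
    Uy, Sy, Vhy = np.linalg.svd(Y, full_matrices=False)
    U, S, Vh = np.linalg.svd(Ux.T @ Uy)   # singular values = canonical corrs
    a = Vhx.T @ (U[:, 0] / Sx)            # weight vector for the X block
    b = Vhy.T @ (Vh[0, :] / Sy)           # weight vector for the Y block
    return a, b, S[0]

def deflate(X, score):
    """Remove the variation explained by a canonical score before
    extracting the next coefficient pair (one possible deflation choice)."""
    t = score / np.linalg.norm(score)
    return X - np.outer(t, t @ X)

rng = np.random.default_rng(0)
X, Y = rng.normal(size=(100, 4)), rng.normal(size=(100, 3))
a, b, rho = cca_first_pair(X, Y)
X2 = deflate(X, X @ a)                    # then repeat cca_first_pair(X2, Y)
```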
Abstract:
Biosignal measurement and processing are increasingly being deployed in ambulatory situations, particularly in connected health applications. Such an environment dramatically increases the likelihood of artifacts, which can occlude features of interest and reduce the quality of information available in the signal. If multichannel recordings are available for a given signal source, a considerable range of methods currently exists that can suppress or, in some cases, remove the distorting effect of such artifacts. Considerably fewer techniques are available when only a single-channel measurement can be made, yet single-channel measurements are important where minimal instrumentation complexity is required. This paper describes a novel artifact removal technique for use in such a context. The technique, known as ensemble empirical mode decomposition with canonical correlation analysis (EEMD-CCA), is capable of operating on single-channel measurements. The EEMD technique is first used to decompose the single-channel signal into a multidimensional signal. The CCA technique is then employed to isolate the artifact components from the underlying signal using second-order statistics. The new technique is tested against the currently available wavelet denoising and EEMD-ICA techniques using both electroencephalography and functional near-infrared spectroscopy data and is shown to produce significantly improved results.
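As a rough illustration of the two-stage algorithm described, the sketch below decomposes a single channel with EEMD and then applies CCA-based source separation between the decomposition and a one-sample-delayed copy of itself, the usual way CCA exploits second-order statistics for this task. The PyEMD dependency, data, and names are assumptions; this is not the authors' implementation.

```python
# Sketch of the EEMD-CCA idea: EEMD turns one channel into a multichannel
# signal, then CCA against a one-sample-delayed copy orders sources by
# autocorrelation so artifact components can be identified and zeroed.
import numpy as np
from PyEMD import EEMD  # assumed dependency: pip install EMD-signal

def cca_bss(X):
    """CCA-based source separation for X (channels, samples)."""
    Xt, Y = X[:, :-1], X[:, 1:]           # signal and its delayed copy
    Xt = Xt - Xt.mean(axis=1, keepdims=True)
    Y = Y - Y.mean(axis=1, keepdims=True)
    Cxx, Cyy, Cxy = Xt @ Xt.T, Y @ Y.T, Xt @ Y.T
    # Generalized eigenproblem for the CCA weights on X.
    M = np.linalg.solve(Cxx, Cxy) @ np.linalg.solve(Cyy, Cxy.T)
    vals, W = np.linalg.eig(M)
    W = W[:, np.argsort(-vals.real)].real  # high autocorrelation first
    return W.T @ X, W

signal = np.random.default_rng(0).normal(size=2000)  # placeholder recording
imfs = EEMD().eemd(signal)              # multidimensional signal from EEMD
sources, W = cca_bss(imfs)
# Artifact suppression would zero the low-autocorrelation sources, then
# reconstruct the IMFs via np.linalg.pinv(W.T) @ sources and sum them.
```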
Abstract:
Soil carbon stores are a major component of the annual returns that EU governments must submit to the Intergovernmental Panel on Climate Change. Peat accounts for a high proportion of soil carbon due to the relatively high carbon density of peat and organic-rich soils. For this reason it has become increasingly important to measure and model soil carbon stores and changes in peat stocks to facilitate the management of carbon changes over time. The approach investigated in this research evaluates the use of airborne geophysical (radiometric) data to estimate peat thickness using the attenuation of bedrock geology radioactivity by superficial peat cover. Remotely sensed radiometric data are validated with ground peat depth measurements combined with non-invasive geophysical surveys. Two field-based case studies exemplify and validate the results. Variography and kriging are used to predict peat thickness from point measurements of peat depth and airborne radiometric data, and provide an estimate of uncertainty in the predictions. Cokriging, by assessing the degree of spatial correlation between recent remotely sensed geophysical monitoring and previous peat depth models, is used to examine changes in peat stocks over time. The significance of the coregionalisation is that the spatial cross-correlation between the remote and ground-based data can be used to update the model of peat depth. The result is that, by integrating remotely sensed data with ground geophysics, the need for extensive ground-based monitoring and invasive peat depth measurements is reduced. The overall goal is to provide robust estimates of peat thickness to improve estimates of carbon stocks. The implications of the research have a broader significance: they promote a reduction in the need for damaging on-site peat thickness measurement and an increase in the use of remotely sensed data for carbon stock estimation.
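As a hedged illustration of the geostatistical step, the sketch below krigs point peat-depth measurements onto a grid and returns a kriging variance as the uncertainty estimate. The PyKrige dependency, the variogram model, and the synthetic data are assumptions; the paper's cokriging with airborne radiometric data additionally requires a cross-variogram, which this sketch omits.

```python
# Ordinary kriging of point peat-depth measurements onto a grid, with the
# kriging variance as an uncertainty estimate. Synthetic data; the cokriging
# step with radiometric covariates is not shown.
import numpy as np
from pykrige.ok import OrdinaryKriging  # assumed dependency: pip install pykrige

# Hypothetical point measurements: easting (m), northing (m), peat depth (m).
rng = np.random.default_rng(1)
x, y = rng.uniform(0, 1000, 50), rng.uniform(0, 1000, 50)
depth = 1.5 + 0.002 * x + rng.normal(0, 0.3, 50)

ok = OrdinaryKriging(x, y, depth, variogram_model="spherical")
gridx = np.linspace(0, 1000, 40)
gridy = np.linspace(0, 1000, 40)
z_pred, z_var = ok.execute("grid", gridx, gridy)  # prediction + kriging variance
```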
Abstract:
In this paper, novel closed-form expressions for the level crossing rate and average fade duration of κ − μ shadowed fading channels are derived. The new equations provide the capability of modeling the correlation between the time derivative of the shadowed dominant and multipath components of the κ − μ shadowed fading envelope. Verification of the new equations is performed by reduction to a number of known special cases. It is shown that as the shadowing of the resultant dominant component decreases, the signal crosses lower threshold levels at a reduced rate. Furthermore, the impact of increasing correlation between the slope of the shadowed dominant and multipath components similarly acts to reduce crossings at lower signal levels. The new expressions for the second-order statistics are also compared with field measurements obtained for cellular device-to-device and body-centric communication channels, which are known to be susceptible to shadowed fading.
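The closed forms themselves are not reproduced in the abstract, but the two second-order statistics being derived have standard definitions. With $R$ the fading envelope, $\dot{R}$ its time derivative, and $r$ the threshold level:

```latex
% Level crossing rate and average fade duration (standard definitions;
% the paper's kappa-mu shadowed closed forms are not reproduced here).
% F_R is the envelope CDF, \Pr[R \le r].
N_R(r) = \int_0^{\infty} \dot{r}\, p_{R,\dot{R}}(r, \dot{r})\, \mathrm{d}\dot{r},
\qquad
T_R(r) = \frac{F_R(r)}{N_R(r)}
```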
Abstract:
Here, we describe gene expression compositional assignment (GECA), a powerful yet simple method based on compositional statistics that can validate the transfer of prior knowledge, such as gene lists, into independent data sets, platforms and technologies. Transcriptional profiling has been used to derive gene lists that stratify patients into prognostic molecular subgroups and assess biomarker performance in the pre-clinical setting. Archived public data sets are an invaluable resource for subsequent in silico validation, though their use can lead to data integration issues. We show that GECA can be used without the need for normalising expression levels between data sets and can outperform rank-based correlation methods. To validate GECA, we demonstrate its success in the cross-platform transfer of gene lists in different domains including: bladder cancer staging, tumour site of origin and mislabelled cell lines. We also show its effectiveness in transferring an epithelial ovarian cancer prognostic gene signature across technologies, from a microarray to a next-generation sequencing setting. In a final case study, we predict the tumour site of origin and histopathology of epithelial ovarian cancer cell lines. In particular, we identify and validate the commonly used cell line OVCAR-5 as non-ovarian, being gastrointestinal in origin. GECA is available as an open-source R package.
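The abstract does not specify GECA's internals, so as background only, here is a minimal sketch of the compositional-statistics machinery such methods build on: closing an expression profile to a composition and applying the centred log-ratio (Aitchison) transform, which makes profiles comparable without between-data-set normalisation. Everything here is illustrative; it is not the GECA algorithm.

```python
# Background sketch only: closure and the centred log-ratio (CLR) transform
# from compositional statistics. Not the GECA algorithm itself.
import numpy as np

def closure(x):
    """Scale a nonnegative expression profile so its parts sum to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum()

def clr(x, eps=1e-9):
    """Centred log-ratio transform; eps guards against zero counts."""
    c = closure(x) + eps
    logc = np.log(c)
    return logc - logc.mean()

# Two hypothetical profiles of the same gene list from different platforms:
# CLR coordinates can be compared directly, without cross-platform scaling.
a = clr([120.0, 30.0, 5.0, 460.0])
b = clr([2400.0, 610.0, 95.0, 9100.0])  # same shape, ~20x different scale
print(np.corrcoef(a, b)[0, 1])          # close to 1: the compositions agree
```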
Correlation of simulated and measured noise emissions using a combined 1D/3D computational technique
Abstract:
This paper presents a statistics-based fault diagnosis scheme for application to internal combustion engines. The scheme relies on an identified model that describes the relationships between a set of recorded engine variables using principal component analysis (PCA). Since combustion cycles are complex in nature and produce nonlinear relationships between the recorded engine variables, the paper proposes the use of nonlinear PCA (NLPCA). The paper further justifies the use of NLPCA by comparing the model accuracy of the NLPCA model with that of a linear PCA model. A new nonlinear variable reconstruction algorithm and bivariate scatter plots are proposed for fault isolation, following the application of NLPCA. The proposed technique allows the diagnosis of different fault types under steady-state operating conditions. More precisely, nonlinear variable reconstruction can remove the fault signature from the recorded engine data, which allows the identification and isolation of the root cause of abnormal engine behaviour. The paper shows that this can lead to (i) an enhanced identification of potential root causes of abnormal events and (ii) the masking of faulty sensor readings. The effectiveness of the enhanced NLPCA-based monitoring scheme is illustrated by its application to a sensor fault and a process fault. The sensor fault relates to a drift in the fuel flow reading, whilst the process fault relates to a partial blockage of the intercooler. These faults are introduced to a Volkswagen TDI 1.9 Litre diesel engine mounted on an experimental engine test bench facility.
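The variable-reconstruction idea is easiest to see in its linear PCA form (the paper's version is nonlinear). The sketch below trains a PCA model on synthetic "normal" data, flags a fault via the squared prediction error (SPE), and iteratively reconstructs one variable; if reconstructing that variable drives the SPE back to normal, it is implicated as the root cause. The data, thresholds, and the simulated sensor drift are all illustrative assumptions.

```python
# Linear-PCA sketch of fault detection (SPE) and iterative variable
# reconstruction for fault isolation. Illustrative only; the paper's
# scheme is nonlinear (NLPCA).
import numpy as np

rng = np.random.default_rng(2)
# Train a PCA model on normal operating data (n samples, m engine variables).
X = rng.normal(size=(500, 6))
X[:, 1] = 0.8 * X[:, 0] + 0.2 * rng.normal(size=500)   # correlated variables
mu, sd = X.mean(0), X.std(0)
Z = (X - mu) / sd
_, _, Vt = np.linalg.svd(Z, full_matrices=False)
P = Vt[:3].T                                  # retained loadings (3 PCs)

def spe(z):
    """Squared prediction error of a standardised sample."""
    r = z - P @ (P.T @ z)
    return r @ r

def reconstruct(z, i):
    """Replace variable i with the value most consistent with the PCA model."""
    z = z.copy()
    for _ in range(50):                        # iterate to a fixed point
        z[i] = (P @ (P.T @ z))[i]
    return z

fault = (X[0] - mu) / sd
fault[2] += 8.0                                # simulated sensor drift
print("SPE before:", spe(fault))
print("SPE after reconstructing var 2:", spe(reconstruct(fault, 2)))
```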
Abstract:
Summary statistics continue to play an important role in identifying and monitoring patterns and trends in educational inequalities between differing groups of pupils over time. However, this article argues that their uncritical use can also encourage the labelling of whole groups of pupils as ‘underachievers’ or ‘overachievers’ as the findings of group-level data are simply applied to individual group members, a practice commonly termed the ‘ecological fallacy’. Some of the adverse consequences of this will be outlined in relation to current debates concerning gender and ethnic differences in educational attainment. It will be argued that one way of countering this uncritical use of summary statistics, and the ecological fallacy that it tends to encourage, is to make much more use of the principles and methods of what has been termed ‘exploratory data analysis’. Such an approach is illustrated through a secondary analysis of data from the Youth Cohort Study of England and Wales, focusing on gender and ethnic differences in educational attainment. It will be shown that, by placing an emphasis on the graphical display of data and on encouraging researchers to describe those data more qualitatively, such an approach represents an essential addition to the use of simple summary statistics and helps to avoid the limitations associated with them.
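The contrast the article draws, group summaries versus the distributions behind them, is easy to reproduce. The sketch below uses synthetic attainment scores (an assumption, not the Youth Cohort Study data) to show two groups whose means differ while their distributions overlap heavily, which is why group-level labels do not transfer to individuals.

```python
# Illustrative sketch of the ecological fallacy: a gap in group means
# coexists with heavily overlapping individual distributions. Synthetic
# data, not the Youth Cohort Study.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)
group_a = rng.normal(52, 15, 2000)      # attainment scores, group A
group_b = rng.normal(47, 15, 2000)      # group B: mean 5 points lower

print(f"mean gap: {group_a.mean() - group_b.mean():.1f} points")
plt.hist(group_a, bins=40, alpha=0.5, label="group A")
plt.hist(group_b, bins=40, alpha=0.5, label="group B")
plt.xlabel("attainment score")
plt.ylabel("pupils")
plt.legend()
plt.show()                              # the overlap dwarfs the mean gap
```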
The Working Poor in Northern Ireland: What can analysis of administrative (WFTC) statistics tell us?
Abstract:
This study sought to extend earlier work by Mulhern and Wylie (2004) to investigate a UK-wide sample of psychology undergraduates. A total of 890 participants from eight universities across the UK were tested on six broadly defined components of mathematical thinking relevant to the teaching of statistics in psychology: calculation, algebraic reasoning, graphical interpretation, proportionality and ratio, probability and sampling, and estimation. Results were consistent with Mulhern and Wylie's (2004) previously reported findings. Overall, participants across institutions exhibited marked deficiencies in many aspects of mathematical thinking. Results also revealed significant gender differences on calculation, proportionality and ratio, and estimation. Level of qualification in mathematics was found to predict overall performance. Analysis of the nature and content of errors revealed consistent patterns of misconceptions in core mathematical knowledge, likely to hamper the learning of statistics.