928 results for improved principal components analysis (IPCA) algorithm
Abstract:
Thesis (Master's)--University of Washington, 2016-06
Abstract:
In broader catchment-scale investigations, there is a need to understand and ultimately exploit the spatial variation of agricultural crops for an improved economic return. In many instances, this spatial variation is temporally unstable and may differ between crop attributes and crop species. In the Australian sugar industry, the opportunity arose to evaluate the performance of 231 farms in the Tully Mill area in far north Queensland using production information on cane yield (t/ha) and CCS (a fresh-weight measure of sucrose content in the cane) accumulated over a 12-year period. Such data can be expressed as a 3-way array in which a farm × attribute × year structure can be evaluated and interactions considered. Two multivariate techniques, the 3-way mixture method of clustering and 3-mode principal component analysis, were employed to identify meaningful relationships between farms that performed similarly for both cane yield and CCS. In this context, farm has a spatial component, and the aim of this analysis was to determine whether systematic patterns in farm performance expressed by cane yield and CCS persisted over time. There was no spatial relationship between cane yield and CCS. However, the analysis revealed that the relationship between farms was remarkably stable from one year to the next for both attributes, and there was some spatial aggregation of farm performance in parts of the mill area. This finding is important, since temporally consistent spatial variation may be exploited to improve regional production. Alternatively, the putative causes of the spatial variation may be explored to enhance the understanding of sugarcane production in the wet tropics of Australia.
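As an editorial illustration of the 3-way analysis described above, the sketch below (simulated numbers, not the Tully Mill data) unfolds a farm × attribute × year array along the farm mode and applies an ordinary PCA via the SVD; a full 3-mode PCA would decompose all three modes, but the unfolding step conveys the core idea.

    import numpy as np

    rng = np.random.default_rng(0)
    n_farms, n_attrs, n_years = 231, 2, 12          # cane yield and CCS over 12 years
    X = rng.normal(size=(n_farms, n_attrs, n_years))

    # Unfold: each farm becomes one row of attribute-by-year observations.
    X_unfolded = X.reshape(n_farms, n_attrs * n_years)
    X_centered = X_unfolded - X_unfolded.mean(axis=0)

    # PCA via SVD; farm scores on the first two components summarise
    # which farms behaved similarly across attributes and years.
    U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)
    farm_scores = U[:, :2] * s[:2]
    explained = s**2 / np.sum(s**2)
    print("variance explained by PC1, PC2:", explained[:2])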
Abstract:
A combination of uni- and multiplex PCR assays targeting 58 virulence genes (VGs) associated with Escherichia coli strains causing intestinal and extraintestinal disease in humans and other mammals was used to analyze the VG repertoire of 23 commensal E. coli isolates from healthy pigs and 52 clinical isolates associated with porcine neonatal diarrhea (ND) and postweaning diarrhea (PWD). The relationship between the presence and absence of VGs was interrogated using three statistical methods. According to the generalized linear model, 17 of the 58 VGs were found to be significant (P < 0.05) in distinguishing between commensal and clinical isolates. Nine of the 17 genes, represented by iha, hlyA, aidA, east1, aah, fimH, iroN(E. coli), traT, and saa, have not previously been identified as important VGs in clinical porcine isolates in Australia. The remaining eight VGs code for fimbriae (F4, F5, F18, and F41) and toxins (STa, STh, LT, and Stx2) normally associated with porcine enterotoxigenic E. coli. Agglomerative hierarchical algorithm analysis grouped the E. coli strains into subclusters based primarily on their serogroup. Multivariate analyses of clonal relationships based on the 17 VGs were collapsed into two-dimensional space by principal coordinate analysis. PWD clones were distributed in two quadrants, separated from ND and commensal clones, which tended to cluster within one quadrant. Clonal subclusters within quadrants were highly correlated with serogroups. These methods of analysis provide different perspectives in our attempts to understand how commensal and clinical porcine enterotoxigenic E. coli strains have evolved and are engaged in the dynamic process of losing or acquiring VGs within the pig population.
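Principal coordinate analysis of binary presence/absence data, as used above, is classical multidimensional scaling on a distance matrix. A minimal sketch, assuming simulated isolates rather than the study's own data:

    import numpy as np
    from scipy.spatial.distance import pdist, squareform

    rng = np.random.default_rng(1)
    genes = rng.integers(0, 2, size=(75, 17))       # 75 isolates x 17 significant VGs

    # Jaccard distance is a common choice for presence/absence data.
    D = squareform(pdist(genes, metric="jaccard"))

    # Classical MDS: double-centre the squared distances, then eigendecompose.
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D**2) @ J
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1]
    coords = eigvecs[:, order[:2]] * np.sqrt(np.maximum(eigvals[order[:2]], 0))
    print(coords.shape)                              # isolates in two-dimensional space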
Abstract:
Quantitative genetics provides a powerful framework for studying phenotypic evolution and the evolution of adaptive genetic variation. Central to the approach is G, the matrix of additive genetic variances and covariances. G summarizes the genetic basis of the traits and can be used to predict the phenotypic response to multivariate selection or to drift. Recent analytical and computational advances have improved both the power and the accessibility of the necessary multivariate statistics. It is now possible to study the relationships between G and other evolutionary parameters, such as those describing the mutational input, the shape and orientation of the adaptive landscape, and the phenotypic divergence among populations. At the same time, we are moving towards a greater understanding of how the genetic variation summarized by G evolves. Computer simulations of the evolution of G, innovations in matrix comparison methods, and the rapid development of powerful molecular genetic tools have all opened the way for dissecting the interaction between allelic variation and evolutionary processes. Here I discuss some current uses of G and problems with the application of these approaches, and identify avenues for future research.
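The predictive use of G mentioned above is the multivariate breeder's equation, Δz̄ = Gβ. A minimal worked example with illustrative numbers only:

    import numpy as np

    G = np.array([[0.50, 0.20],        # additive genetic variances (diagonal)
                  [0.20, 0.30]])       # and covariance between two traits

    beta = np.array([0.4, -0.1])       # selection gradient on each trait

    delta_z = G @ beta                 # predicted per-generation change in trait means
    print(delta_z)                     # trait 2 responds via the covariance term
                                       # even though direct selection on it is weak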
Abstract:
Onsite wastewater treatment systems aim to assimilate domestic effluent into the environment. Unfortunately, failure of such systems is common, and inadequate effluent treatment can have serious environmental implications. The capacity of a particular soil to treat wastewater will change over time: its physical properties influence the rate of effluent movement through the soil, and its chemical properties dictate its ability to renovate effluent. A research project was undertaken to determine the role that physical and chemical soil properties play in predicting the long-term behaviour of soil under effluent irrigation, and to determine whether they have a potential function as early indicators of adverse effects of effluent irrigation on treatment sustainability. Principal Component Analysis (PCA) and Cluster Analysis grouped the soils independently of their soil classifications and allowed us to distinguish the soils most suitable for sustainable long-term effluent irrigation and to determine the most influential soil parameters for characterising them. Multivariate analysis allowed a clear distinction between soils based on their cation exchange capacities, which in turn correlated well with the soil mineralogy. Mixed-mineralogy soils, in particular sodium- or magnesium-dominant soils, are the most susceptible to dispersion under effluent irrigation. The soil Exchangeable Sodium Percentage (ESP) was identified as a crucial parameter and was highly correlated with percentage clay, electrical conductivity, exchangeable sodium, exchangeable magnesium and low Ca:Mg ratios (less than 0.5).
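A hedged sketch of the PCA-plus-clustering workflow described above, using simulated values and hypothetical soil variables (the study's own measurements are not reproduced here):

    import numpy as np
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA
    from sklearn.cluster import AgglomerativeClustering

    rng = np.random.default_rng(2)
    # columns: clay %, EC, exchangeable Na, exchangeable Mg, Ca:Mg ratio, ESP
    soils = rng.normal(size=(40, 6))

    Z = StandardScaler().fit_transform(soils)       # PCA is scale-sensitive
    scores = PCA(n_components=2).fit_transform(Z)

    labels = AgglomerativeClustering(n_clusters=3).fit_predict(scores)
    print(labels)                                    # groupings independent of soil class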
Abstract:
Background: In 1992, Frisch et al. (Psychol Assess. 1992;4:92-101) developed the Quality of Life Inventory (QOLI) to measure the concept of quality of life (QOL), which has long been thought to be related to both physical and emotional well-being. However, the psychometric properties of the QOLI in clinical populations are still in debate. The present study examined the factor structure of the QOLI and reported its validity and reliability in a clinical sample. Method: Two hundred seventeen patients with anxiety and depressive disorders completed the QOLI; additional questionnaires measuring symptoms (Zung Self-rating Depression Scale, Beck Anxiety Inventory, Fear Questionnaire, Depression Anxiety Stress Scale-Stress) and subjective well-being (Satisfaction With Life Scale) were also used. Results: Exploratory factor analysis via the principal components method, with oblique rotation, revealed a 2-factor structure that accounted for 42.73% of the total variance, and a subsequent confirmatory factor analysis suggested a moderate fit of the data to this model. The 2 factors appeared to describe self-oriented QOL and externally oriented QOL. The Cronbach alpha coefficients were 0.85 for the overall QOLI score, 0.81 for the first factor, and 0.75 for the second factor. Conclusion: Consistent evidence was also found to support the concurrent, discriminant, predictive, and criterion-related validity of the QOLI. © 2006 Elsevier Inc. All rights reserved.
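The Cronbach alpha coefficients reported above follow the standard formula α = k/(k−1) · (1 − Σσᵢ² / σ²_total). A minimal sketch on simulated item scores (the 217 respondents match the abstract; the 16 items and their values are illustrative):

    import numpy as np

    rng = np.random.default_rng(3)
    items = rng.integers(1, 7, size=(217, 16)).astype(float)   # patients x items

    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)          # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)      # variance of the summed scale
    alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
    print(round(alpha, 2))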
Abstract:
Visualization of high-dimensional data has always been a challenging task. Here we discuss and propose variants of non-linear data projection methods (Generative Topographic Mapping (GTM) and GTM with simultaneous feature saliency (GTM-FS)) that are adapted to be effective on very high-dimensional data. The adaptations use log-space values at certain steps of the Expectation Maximization (EM) algorithm and during the visualization process. We have tested the proposed algorithms by visualizing electrostatic potential data for Major Histocompatibility Complex (MHC) class-I proteins. The experiments show that the variants of the original GTM and GTM-FS worked successfully with data of more than 2000 dimensions, and we compare the results with other linear/nonlinear projection methods: Principal Component Analysis (PCA), Neuroscale (NSC) and the Gaussian Process Latent Variable Model (GPLVM).
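The log-space adaptation mentioned above is, at its core, the log-sum-exp trick: in 2000+ dimensions the per-point likelihoods underflow ordinary floating point, so EM responsibilities must be normalised in log space. A minimal sketch, assuming simulated log-likelihoods rather than the GTM model itself:

    import numpy as np
    from scipy.special import logsumexp

    rng = np.random.default_rng(4)
    # log p(x_n | latent point k): around -3000 for very high-dimensional
    # data, so a direct exp() would underflow to zero.
    log_lik = rng.normal(loc=-3000.0, scale=10.0, size=(500, 64))

    # Responsibilities r_nk = p_nk / sum_k p_nk, computed entirely in log space.
    log_r = log_lik - logsumexp(log_lik, axis=1, keepdims=True)
    r = np.exp(log_r)
    print(r.sum(axis=1)[:3])                         # each row sums to 1 despite tiny p_nk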
Abstract:
Visualising data for exploratory analysis is a major challenge in many applications. Visualisation allows scientists to gain insight into the structure and distribution of the data, for example finding common patterns and relationships between samples as well as variables. Typically, visualisation methods like principal component analysis and multi-dimensional scaling are employed. These methods are favoured because of their simplicity, but they cannot cope with missing data and it is difficult to incorporate prior knowledge about properties of the variable space into the analysis; this is particularly important in the high-dimensional, sparse datasets typical of geochemistry. In this paper we show how to utilise a block-structured correlation matrix using a modification of a well-known non-linear probabilistic visualisation model, the Generative Topographic Mapping (GTM), which can cope with missing data. The block structure supports direct modelling of strongly correlated variables. We show that by including prior structural information it is possible to improve both the data visualisation and the model fit. These benefits are demonstrated on artificial data as well as a real geochemical dataset used for oil exploration, where the proposed modifications improved the missing-data imputation results by 3 to 13%.
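A sketch of what a block-structured correlation matrix of the kind described above might look like, with hypothetical block sizes and correlations (how the GTM modification consumes this structure is not reproduced here):

    import numpy as np
    from scipy.linalg import block_diag

    def corr_block(size, rho):
        """Equicorrelated block: 1 on the diagonal, rho elsewhere."""
        return np.full((size, size), rho) + (1.0 - rho) * np.eye(size)

    # e.g. three groups of strongly correlated geochemical variables,
    # with zero prior correlation between groups
    C = block_diag(corr_block(4, 0.8), corr_block(3, 0.6), corr_block(5, 0.7))
    print(C.shape)                                   # (12, 12) prior correlation structure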
Abstract:
The thesis presents new methodology and algorithms that can be used to analyse and measure the hand tremor and fatigue of surgeons while performing surgery. This will assist them in deriving useful information about their fatigue levels and make them aware of changes in their tool-point accuracy. This thesis proposes that the muscular changes of surgeons, which occur through a day of operating, can be monitored using Electromyography (EMG) signals. Multi-channel EMG signals were measured at different muscles in the upper arm of surgeons. The interdependence of the EMG signals was examined to test the hypothesis that EMG signals are coupled with, and dependent on, each other. The results demonstrated that EMG signals collected from different channels while mimicking an operating posture are independent; consequently, single-channel fatigue analysis was performed. In measuring hand tremor, a new method for determining the maximum tremor amplitude using Principal Component Analysis (PCA) and a new technique to detrend acceleration signals using the Empirical Mode Decomposition algorithm were introduced. This tremor determination method is more representative for surgeons, and it is suggested as an alternative fatigue measure. This was combined with the complexity analysis method and applied to surgically captured data to determine whether operating has an effect on a surgeon's fatigue and tremor levels. It was found that surgical tremor and fatigue develop throughout a day of operating and that this could be determined based solely on their initial values. Finally, several Nonlinear AutoRegressive with eXogenous inputs (NARX) neural networks were evaluated. The results suggest that it is possible to monitor surgeon tremor variations during surgery from their EMG fatigue measurements.
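A hedged sketch of the PCA-based maximum tremor amplitude idea described above: project 3-axis acceleration onto its dominant principal axis and take the peak-to-peak excursion along it. The signal is simulated, not a surgical recording:

    import numpy as np

    rng = np.random.default_rng(5)
    t = np.linspace(0, 10, 5000)
    tremor = np.sin(2 * np.pi * 9 * t)               # ~9 Hz physiological tremor
    axis = np.array([0.6, 0.7, 0.4]) / np.linalg.norm([0.6, 0.7, 0.4])
    acc = np.outer(tremor, axis) + 0.05 * rng.normal(size=(t.size, 3))

    # First principal axis of the centred 3-axis acceleration via SVD.
    acc_centered = acc - acc.mean(axis=0)
    _, _, Vt = np.linalg.svd(acc_centered, full_matrices=False)
    projection = acc_centered @ Vt[0]                # scores on the dominant axis
    print(projection.max() - projection.min())       # peak-to-peak tremor amplitude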
Abstract:
The objective of this project was to identify those elements of management practice which characterised firms in the West Midlands road transport industry, the aim being to establish the contents of what might be termed a management policy portfolio for growth. The first phase reviewed those factors which were generally accepted as having an influence on the success rate of transport firms, in order to ascertain whether they explained observed patterns. The second, where this proved not to be the case, was to instigate a fieldwork study to isolate those policies which were associated with growth organizations. Investigation of the vehicle movements of the entire West Midlands fleet over a complete licence cycle suggested that conventional explanations could not fully account for the observed patterns. To carry out the second phase of the study, a sample of growth firms was visited in order to measure their attitudes on a range of factors hypothesised to affect growth. Field data were analysed to establish management activities over a wide range of areas, and the results were further investigated through a Principal Components and Cluster Analysis programme. The outcome of the study indicates that some past views on the skills and attitudes of transport managers may have to be re-examined. As a result, the project produced a new classification of road transport firms based not on the conventional categories of long and short haul, or the types of traffic carried, but on the marketing policies and management skills employed within the organization.
Abstract:
The ability to measure ocular surface temperature (OST) with thermal imaging offers potential insight into ocular physiology that has been acknowledged in the literature. The TH7102MX thermo-camera (NEC San-ei, Japan) continuously records dynamic information about OST without sacrificing spatial resolution. Using purpose-designed image analysis software, it was possible to select and quantify the principal components of absolute temperature values and the magnitude and rate of the temperature change that followed blinking. The technique was examined for repeatability, reproducibility and the effects of extrinsic factors, and a suitable experimental protocol was thus developed. The precise source of the measured thermal radiation has previously been subject to dispute: in this thesis, the results of a study examining the relationships between physical parameters of the anterior eye and OST confirmed a principal role for the tear film in OST. The dynamic changes in OST were studied in a large group of young subjects: quantifying the post-blink changes in temperature with time also established a role for tear flow dynamics in OST. Using dynamic thermography, the effects of hydrogel contact lens wear on OST were investigated, using a model eye for in vitro work and both neophyte and adapted contact lens wearers for in vivo studies. Significantly greater OST was observed in contact lens wearers, particularly with silicone hydrogel lenses compared to etafilcon A, and tended to be greatest when lenses had been worn continuously. This finding is important to understanding the ocular response to contact lens wear. In a group of normal subjects, dynamic thermography appeared to measure the ocular response to the application of artificial tear drops: this may prove to be a significant research and clinical tool.
Abstract:
This thesis first considers the calibration and signal processing requirements of a neuromagnetometer for the measurement of human visual function. Gradiometer calibration using straight wire grids is examined and optimal grid configurations determined, given realistic constructional tolerances. Simulations show that for a gradiometer balance of 1:10⁴ and a wire spacing error of 0.25mm, the achievable calibration accuracy is 0.3% for gain, 0.3mm for position and 0.6° for orientation. Practical results with a 19-channel 2nd-order gradiometer-based system exceed this performance. The real-time application of adaptive reference noise cancellation filtering to running-average evoked response data is examined. In the steady state, the filter can be assumed to be driven by a non-stationary step input arising at epoch boundaries. Based on empirical measures of this driving step, an optimal progression for the filter time constant is proposed which improves upon fixed-time-constant filter performance. The incorporation of the time-derivatives of the reference channels was found to improve the performance of the adaptive filtering algorithm by 15-20% for unaveraged data, falling to 5% with averaging. The thesis concludes with a neuromagnetic investigation of evoked cortical responses to chromatic and luminance grating stimuli. The global magnetic field power of evoked responses to the onset of sinusoidal gratings was shown to have distinct chromatic and luminance sensitive components. Analysis of the results, using a single equivalent current dipole model, shows that these components arise from activity within two distinct cortical locations. Co-registration of the resulting current source localisations with MRI shows a chromatically responsive area lying along the midline within the calcarine fissure, possibly extending onto the lingual and cuneal gyri. It is postulated that this area is the human homologue of the primate cortical area V4.
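Adaptive reference noise cancellation of the kind examined above is commonly implemented as an LMS filter. The sketch below (simulated signals, not MEG recordings) includes the time-derivative of the reference channel as an extra input, echoing the derivative augmentation reported to help:

    import numpy as np

    rng = np.random.default_rng(6)
    n = 10000
    ref = rng.normal(size=n)                         # reference (noise) channel
    noise_in_signal = np.convolve(ref, [0.5, 0.3], mode="same")
    signal = np.sin(2 * np.pi * 0.01 * np.arange(n)) + noise_in_signal

    X = np.column_stack([ref, np.gradient(ref)])     # reference + its time-derivative
    w = np.zeros(2)
    mu = 0.01                                        # adaptation step size
    cleaned = np.empty(n)
    for i in range(n):
        est = w @ X[i]                               # estimated noise contribution
        cleaned[i] = signal[i] - est                 # subtract it from the data channel
        w += 2 * mu * cleaned[i] * X[i]              # LMS weight update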
Abstract:
This paper presents a fast part-based subspace selection algorithm, termed the binary sparse nonnegative matrix factorization (B-SNMF). Both the training process and the testing process of B-SNMF are much faster than those of binary principal component analysis (B-PCA). Besides, B-SNMF is more robust to occlusions in images. Experimental results on face images demonstrate the effectiveness and the efficiency of the proposed B-SNMF.
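For context, the sketch below shows plain NMF with Lee-Seung multiplicative updates, the part-based factorization family to which B-SNMF belongs; it is not the B-SNMF algorithm itself, which adds binary sparsity constraints:

    import numpy as np

    rng = np.random.default_rng(7)
    V = rng.random((64, 100))                        # e.g. vectorised face images
    k = 10
    W, H = rng.random((64, k)), rng.random((k, 100))

    for _ in range(200):                             # Lee-Seung multiplicative updates
        H *= (W.T @ V) / (W.T @ W @ H + 1e-9)
        W *= (V @ H.T) / (W @ H @ H.T + 1e-9)

    print(np.linalg.norm(V - W @ H))                 # reconstruction error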
Abstract:
Biological experiments often produce enormous amounts of data, which are usually analyzed by data clustering. Cluster analysis refers to statistical methods that are used to assign data with similar properties into several smaller, more meaningful groups. Two commonly used clustering techniques are introduced in the following section: principal component analysis (PCA) and hierarchical clustering. PCA calculates the variance between variables and groups them into a few uncorrelated groups or principal components (PCs) that are orthogonal to each other. Hierarchical clustering is carried out by separating data into many clusters and merging similar clusters together. Here, we use the example of human leukocyte antigen (HLA) supertype classification to demonstrate the use of the two methods. Two programs, Generating Optimal Linear Partial Least Square Estimations (GOLPE) and Sybyl, are used for PCA and hierarchical clustering, respectively. However, the reader should bear in mind that these methods have been incorporated into other software as well, such as SIMCA, statistiXL, and R.
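A minimal sketch of the two techniques on a small simulated data matrix (the chapter itself uses GOLPE and Sybyl; this Python version is an editorial illustration):

    import numpy as np
    from sklearn.decomposition import PCA
    from scipy.cluster.hierarchy import linkage, fcluster

    rng = np.random.default_rng(8)
    data = rng.normal(size=(30, 8))                  # 30 samples, 8 variables

    # PCA: uncorrelated, orthogonal components ordered by variance explained.
    pcs = PCA(n_components=2).fit_transform(data)

    # Hierarchical clustering: merge similar clusters bottom-up, then cut the tree.
    tree = linkage(data, method="average")
    groups = fcluster(tree, t=3, criterion="maxclust")
    print(groups)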
Abstract:
Principal component analysis (PCA) is well recognized for dimensionality reduction, and kernel PCA (KPCA) has also been proposed for statistical data analysis. However, KPCA fails to detect the nonlinear structure of data well when outliers exist. To address this problem, this paper presents a novel algorithm, named iterative robust KPCA (IRKPCA). IRKPCA works well in dealing with outliers, and can be carried out in an iterative manner, which makes it suitable for processing incremental input data. As in traditional robust PCA (RPCA), a binary field is employed to characterize the outlier process, and the optimization problem is formulated as maximizing the marginal distribution of a Gibbs distribution. In this paper, this optimization problem is solved by stochastic gradient descent techniques. In IRKPCA, the outlier process lies in a high-dimensional feature space, and therefore the kernel trick is used. IRKPCA can be regarded as a kernelized version of RPCA and a robust form of the kernel Hebbian algorithm. Experimental results on synthetic data demonstrate the effectiveness of IRKPCA. © 2010 Taylor & Francis.
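For reference, a sketch of standard (non-robust) kernel PCA, the starting point that IRKPCA iterates and robustifies; this is not the IRKPCA algorithm itself:

    import numpy as np

    def rbf_kernel(X, gamma=1.0):
        # Gaussian (RBF) kernel matrix from pairwise squared distances.
        sq = np.sum(X**2, axis=1)
        return np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))

    rng = np.random.default_rng(9)
    X = rng.normal(size=(100, 3))

    K = rbf_kernel(X)
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J                                   # centre the kernel matrix
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:2]
    scores = eigvecs[:, order] * np.sqrt(np.maximum(eigvals[order], 0))
    print(scores.shape)                              # data projected onto 2 kernel PCs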