914 results for principal component
Abstract:
Data associated with germplasm collections are typically large and multivariate with a considerable number of descriptors measured on each of many accessions. Pattern analysis methods of clustering and ordination have been identified as techniques for statistically evaluating the available diversity in germplasm data. While used in many studies, the approaches have not dealt explicitly with the computational consequences of large data sets (i.e. greater than 5000 accessions). To consider the application of these techniques to germplasm evaluation data, 11328 accessions of groundnut (Arachis hypogaea L.) from the International Crops Research Institute for the Semi-Arid Tropics, Andhra Pradesh, India were examined. Data for nine quantitative descriptors measured in the rainy and post-rainy growing seasons were used. The ordination technique of principal component analysis was used to reduce the dimensionality of the germplasm data. The identification of phenotypically similar groups of accessions within large scale data via the computationally intensive hierarchical clustering techniques was not feasible and non-hierarchical techniques had to be used. Finite mixture models that maximise the likelihood of an accession belonging to a cluster were used to cluster the accessions in this collection. The patterns of response for the different growing seasons were found to be highly correlated. However, in relating the results to passport and other characterisation and evaluation descriptors, the observed patterns did not appear to be related to taxonomy or any other well known characteristics of groundnut.
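The two-step workflow described in the abstract above (ordination by principal component analysis, then non-hierarchical clustering with a finite mixture model) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the diagonal-covariance Gaussian mixture, the quantile-based initialisation along the first component, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def pca_scores(X, n_components):
    """Standardise each descriptor, then project onto the leading principal components."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)          # proportion of variance per component
    return Z @ Vt[:n_components].T, explained[:n_components]

def fit_diag_gmm(Y, k, n_iter=100):
    """EM for a Gaussian mixture with diagonal covariances (illustrative assumption)."""
    n, _ = Y.shape
    # Assumed initialisation: split accessions into k equal chunks along the first score.
    order = np.argsort(Y[:, 0])
    means = np.array([Y[idx].mean(axis=0) for idx in np.array_split(order, k)])
    variances = np.tile(Y.var(axis=0), (k, 1))
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: posterior probability of each accession belonging to each cluster.
        diff = Y[:, None, :] - means[None, :, :]
        log_p = -0.5 * np.sum(diff**2 / variances + np.log(2 * np.pi * variances), axis=2)
        log_p += np.log(weights)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update mixing weights, means and variances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / n
        means = (resp.T @ Y) / nk[:, None]
        diff = Y[:, None, :] - means[None, :, :]
        variances = np.einsum("nk,nkd->kd", resp, diff**2) / nk[:, None] + 1e-6
    return resp.argmax(axis=1), means

# Synthetic stand-in for an accessions-by-descriptors matrix (not real germplasm data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (60, 9)), rng.normal(6, 1, (60, 9))])
scores, explained = pca_scores(X, n_components=2)
labels, cluster_means = fit_diag_gmm(scores, k=2)
```

Clustering the low-dimensional scores rather than the raw descriptors keeps the cost per EM iteration linear in the number of accessions, which is the practical reason a mixture model remains feasible where hierarchical clustering of more than 10000 accessions does not.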
Abstract:
As a sequel to a paper that dealt with the analysis of two-way quantitative data in large germplasm collections, this paper presents analytical methods appropriate for two-way data matrices consisting of mixed data types, namely, ordered multicategory and quantitative data types. While various pattern analysis techniques have been identified as suitable for analysis of the mixed data types which occur in germplasm collections, the clustering and ordination methods used often cannot deal explicitly with the computational consequences of large data sets (i.e. greater than 5000 accessions) with incomplete information. However, it is shown that the ordination technique of principal component analysis and the mixture maximum likelihood method of clustering can be employed to achieve such analyses. Germplasm evaluation data for 11436 accessions of groundnut (Arachis hypogaea L.) from the International Crops Research Institute for the Semi-Arid Tropics, Andhra Pradesh, India were examined. Data for nine quantitative descriptors measured in the post-rainy season and five ordered multicategory descriptors were used. Pattern analysis results generally indicated that the accessions could be distinguished into four regions along the continuum of growth habit (or plant erectness). Interpretation of accession membership in these regions was found to be consistent with taxonomic information, such as subspecies. Each growth habit region contained accessions from three of the most common groundnut botanical varieties. This implies that within each of the habit types there is the full range of expression for the other descriptors used in the analysis. Using these types of insights, the patterns of variability in germplasm collections can provide scientists with valuable information for their plant improvement programs.
Abstract:
Information on the variation available for different plant attributes has enabled germplasm collections to be effectively utilised in plant breeding. A world sourced collection of white clover germplasm has been developed at the White Clover Resource Centre at Glen Innes, New South Wales. This collection of 439 accessions was characterised under field conditions as a preliminary study of the genotypic variation for morphological attributes: stolon density, stolon branching, number of nodes, number of rooted nodes, stolon thickness, internode length, leaf length, plant height and plant spread, together with seasonal herbage yield. Characterisation was conducted on different batches of germplasm (subsets of accessions taken from the complete collection) over a period of five years. Inclusion of two check cultivars, Haifa and Huia, in each batch enabled adjustment of the characterisation data for year effects and attribute-by-year interaction effects. The component of variance for seasonal herbage yield among batches was large relative to that for accessions. Accession-by-experiment and accession-by-season interactions for herbage yield were not detected. Accession mean repeatability for herbage yield across seasons was intermediate (0.453). The components of genotypic variance among accessions for all attributes, except plant height, were larger than their respective standard errors. The estimates of accession mean repeatability for the attributes ranged from low (0.277 for plant height) to intermediate (0.544 for internode length). Multivariate techniques of clustering and ordination were used to investigate the diversity present among the accessions in the collection. Both cluster analysis and principal component analysis suggested that seven groups of accessions existed.
It was also proposed from the pattern analysis results that accessions from a group characterised by large leaves, tall plants and thick stolons could be crossed with accessions from a group that had above average stolon density and stolon branching. This material could produce breeding populations to be used in recurrent selection for the development of white clover cultivars for dryland summer moisture stress environments in Australia. The germplasm collection was also found to be deficient in genotypes with high stolon density, high number of branches, high number of rooted nodes and large leaves. This warrants addition of new germplasm accessions possessing these characteristics to the present germplasm collection.
Abstract:
A catchment-scale multivariate statistical analysis of hydrochemistry enabled assessment of interactions between alluvial groundwater and Cressbrook Creek, an intermittent drainage system in southeast Queensland, Australia. Hierarchical cluster analyses and principal component analysis were applied to time-series data to evaluate the hydrochemical evolution of groundwater during periods of extreme drought and severe flooding. A simple three-dimensional geological model was developed to conceptualise the catchment morphology and the stratigraphic framework of the alluvium. The alluvium forms a two-layer system with a basal coarse-grained layer overlain by a clay-rich low-permeability unit. In the upper and middle catchment, alluvial groundwater is chemically similar to streamwater, particularly near the creek (reflected by high HCO3/Cl and K/Na ratios and low salinities), indicating a high degree of connectivity. In the lower catchment, groundwater is more saline with lower HCO3/Cl and K/Na ratios, notably during dry periods. Groundwater salinity substantially decreased following severe flooding in 2011, notably in the lower catchment, confirming that flooding is an important mechanism for both recharge and maintaining groundwater quality. The integrated approach used in this study enabled effective interpretation of hydrological processes and can be applied to a variety of hydrological settings to synthesise and evaluate large hydrochemical datasets.
Abstract:
The contamination of electrical insulators is one of the major contributors to the risk of operation outages in electrical substations, especially in coastal zones with high salinity levels and atmospheric pollution. Using measurements of leakage current, which is one of the main indicators of contamination in insulators, this work seeks to determine its correlation with climatic variables such as ambient temperature, relative humidity, solar irradiance, atmospheric pressure, and wind speed and direction. The results obtained provide insight into the behaviour of the leakage current under atmospheric conditions that are particular to the Caribbean coast of Colombia. Spearman’s rank correlation coefficients and principal component analysis are utilised to determine the significant relationships among the different variables under consideration. The necessary information for the study was obtained from historical databases of both the atmospheric variables and the leakage current, measured over a period of one year in a 220-kV potential transformer insulator. We identified temperature, humidity, radiation, and wind speed and direction as the most relevant factors influencing the magnitude of the leakage current.
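As a rough illustration of the rank-based step mentioned in the abstract above, the sketch below computes Spearman's rank correlation as the Pearson correlation of average ranks. The function names and the humidity and leakage-current values are hypothetical and are not taken from the study.

```python
import numpy as np

def average_ranks(x):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(len(x), dtype=float)
    ranks[order] = np.arange(1, len(x) + 1)
    for value in np.unique(x):
        tied = x == value
        ranks[tied] = ranks[tied].mean()
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    return float(np.corrcoef(average_ranks(x), average_ranks(y))[0, 1])

# Hypothetical readings: relative humidity (%) and leakage current (mA).
humidity = [55, 62, 70, 78, 85, 91]
leakage = [0.11, 0.10, 0.15, 0.22, 0.40, 0.95]
rho = spearman_rho(humidity, leakage)
```

Because it works on ranks, the coefficient captures monotonic but nonlinear dependence, which is a plausible reason to prefer it over Pearson's coefficient for a quantity such as leakage current.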
Abstract:
Research problem: Overfitting and collinearity problems commonly exist in current construction cost estimation applications and obstruct researchers and practitioners in achieving better modelling results. Research objective and method: A hybrid approach of Akaike information criterion (AIC) stepwise regression and principal component regression (PCR) is proposed to help solve overfitting and collinearity problems. Utilization of this approach in linear regression is validated by comparing it with other commonly used approaches. The mean square error obtained by leave-one-out cross validation (MSELOOCV) is used in model selection in deciding predictive variables.
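The principal component regression and leave-one-out error described in the abstract above might be sketched as follows. This is an assumption-laden illustration: the AIC stepwise selection step is omitted, the function names are hypothetical, and the data are synthetic rather than construction cost records.

```python
import numpy as np

def pcr_fit_predict(X_train, y_train, X_test, n_components):
    """Regress y on the leading principal component scores of standardised X."""
    mu = X_train.mean(axis=0)
    sd = X_train.std(axis=0, ddof=1)
    Z = (X_train - mu) / sd
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    V = Vt[:n_components].T                      # loadings of the retained components
    T = Z @ V                                    # uncorrelated scores replace collinear predictors
    A = np.column_stack([np.ones(len(T)), T])    # intercept plus scores
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    T_test = ((X_test - mu) / sd) @ V
    return coef[0] + T_test @ coef[1:]

def loocv_mse(X, y, n_components):
    """Mean squared error from leave-one-out cross validation."""
    n = len(y)
    errors = []
    for i in range(n):
        keep = np.arange(n) != i
        pred = pcr_fit_predict(X[keep], y[keep], X[i:i + 1], n_components)
        errors.append((y[i] - pred[0]) ** 2)
    return float(np.mean(errors))

# Synthetic example: choose the number of components with the lowest LOOCV error.
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 3.0 + rng.normal(scale=0.1, size=30)
best_k = min(range(1, 4), key=lambda k: loocv_mse(X, y, k))
```

Refitting the standardisation and the components inside each fold keeps the held-out observation from leaking into the model, which matters when the error is used for model selection.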
Abstract:
Development literature has argued that empowering women can effectively increase the utilisation of maternal health care. This study examines this hypothesis in the context of Nepal, where only 28% of women delivered in facilities. Two-level random intercept logit models were fitted to data from the Nepal Demographic and Health Survey 2011. Women's empowerment was quantified with a single index constructed from many variables. These variables captured different aspects of women's lives and decision-making in their households, and were combined using principal component analysis. The results confirmed a positive relationship between women's as an inevitable product of the economic development process.
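One common way to build a single index from many variables with principal component analysis is to take the first principal component score of the standardised variables. The sketch below assumes that construction; the abstract does not specify the exact procedure, and the function name, sign convention, and synthetic indicators are illustrative only.

```python
import numpy as np

def first_pc_index(indicators):
    """Composite index: score on the first principal component of standardised indicators."""
    Z = (indicators - indicators.mean(axis=0)) / indicators.std(axis=0, ddof=1)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[0]
    # The sign of a principal component is arbitrary; orient it so that
    # higher indicator values give a higher index (assumed convention).
    if loadings.sum() < 0:
        loadings = -loadings
    return Z @ loadings, loadings

# Synthetic indicators driven by one latent trait (not survey data).
rng = np.random.default_rng(3)
latent = rng.normal(size=200)
indicators = latent[:, None] * np.array([1.0, 0.8, 0.6]) + 0.3 * rng.normal(size=(200, 3))
index, loadings = first_pc_index(indicators)
```

The resulting index could then enter a regression model as a single covariate in place of the original correlated variables.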
Abstract:
This paper presents an online, unsupervised training algorithm enabling vision-based place recognition across a wide range of changing environmental conditions such as those caused by weather, seasons, and day-night cycles. The technique applies principal component analysis to distinguish between aspects of a location’s appearance that are condition-dependent and those that are condition-invariant. Removing the dimensions associated with environmental conditions produces condition-invariant images that can be used by appearance-based place recognition methods. This approach has a unique benefit – it requires training images from only one type of environmental condition, unlike existing data-driven methods that require training images with labelled frame correspondences from two or more environmental conditions. The method is applied to two benchmark variable condition datasets. Performance is equivalent or superior to the current state of the art despite the lesser training requirements, and is demonstrated to generalise to previously unseen locations.
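The idea of removing condition-dependent dimensions can be illustrated with a small sketch. It assumes, purely for illustration, that the condition-dependent variation is captured by the leading principal components of the training images; the paper's actual criterion for selecting dimensions is not stated in the abstract, and the function names and synthetic "images" are hypothetical.

```python
import numpy as np

def fit_pca(train_images):
    """Return the mean image and principal directions of flattened training images."""
    mean = train_images.mean(axis=0)
    _, _, Vt = np.linalg.svd(train_images - mean, full_matrices=False)
    return mean, Vt

def remove_leading_components(images, mean, components, n_drop):
    """Project out the first n_drop principal directions from mean-centred images."""
    centred = images - mean
    lead = components[:n_drop]
    return centred - (centred @ lead.T) @ lead

# Synthetic 16-pixel images whose main variation is a global brightness shift.
rng = np.random.default_rng(4)
brightness = rng.normal(scale=5.0, size=50)
train = 0.1 * rng.normal(size=(50, 16)) + brightness[:, None]
mean, components = fit_pca(train)

place = rng.normal(size=16)            # one location seen under two conditions
day, night = place + 3.0, place - 3.0
day_inv, night_inv = remove_leading_components(np.vstack([day, night]), mean, components, n_drop=1)
```

After the brightness direction is projected out, the two views of the same place lie much closer together, so a standard appearance-based matcher can compare them directly.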
Abstract:
Thirteen sites in Deception Bay, Queensland, Australia were sampled three times over a period of 7 months and assessed for contamination by a range of heavy metals, primarily As, Cd, Cr, Cu, Pb and Hg. Fraction analysis, enrichment factors and Principal Components Analysis-Absolute Principal Component Scores (PCA-APCS) analysis were conducted in order to identify the potential bioavailability of these elements of concern and their sources. Hg and Te were identified as the elements of highest enrichment in Deception Bay while marine sediments, shipping and antifouling agents were identified as the sources of the Weak acid Extractable Metals (WE-M), with antifouling agents showing long residence time for mercury contamination. This has significant implications for the future of monitoring and regulation of heavy metal contamination within Deception Bay.
Abstract:
The Galilee and Eromanga basins are sub-basins of the Great Artesian Basin (GAB). In this study, a multivariate statistical approach (hierarchical cluster analysis, principal component analysis and factor analysis) is carried out to identify hydrochemical patterns and assess the processes that control hydrochemical evolution within key aquifers of the GAB in these basins. The results of the hydrochemical assessment are integrated into a 3D geological model (previously developed) to support the analysis of spatial patterns of hydrochemistry, and to identify the hydrochemical and hydrological processes that control hydrochemical variability. In this area of the GAB, the hydrochemical evolution of groundwater is dominated by evapotranspiration near the recharge area resulting in a dominance of the Na–Cl water types. This is shown conceptually using two selected cross-sections which represent discrete groundwater flow paths from the recharge areas to the deeper parts of the basins. With increasing distance from the recharge area, a shift towards a dominance of carbonate (e.g. Na–HCO3 water type) has been observed. The assessment of hydrochemical changes along groundwater flow paths highlights how aquifers are separated in some areas, and how mixing between groundwater from different aquifers occurs elsewhere controlled by geological structures, including between GAB aquifers and coal bearing strata of the Galilee Basin. The results of this study suggest that distinct hydrochemical differences can be observed within the previously defined Early Cretaceous–Jurassic aquifer sequence of the GAB. A revision of the two previously recognised hydrochemical sequences is being proposed, resulting in three hydrochemical sequences based on systematic differences in hydrochemistry, salinity and dominant hydrochemical processes. 
The integrated approach presented in this study, which combines different complementary multivariate statistical techniques with a detailed assessment of the geological framework of these sedimentary basins, can be adopted in other complex multi-aquifer systems to assess hydrochemical evolution and its geological controls.
Abstract:
Structural damage detection using measured dynamic data for pattern recognition is a promising approach. These pattern recognition techniques utilize artificial neural networks and genetic algorithms to match pattern features. In this study, an artificial neural network–based damage detection method using frequency response functions is presented, which can effectively detect nonlinear damage for a given level of excitation. The main objective of this article is to present a feasible method for structural vibration–based health monitoring that reduces the dimension of the initial frequency response function data, transforms it into new damage indices, and employs artificial neural networks to detect different levels of nonlinearity using damage patterns recognized by the proposed algorithm. Experimental data from the three-story bookshelf structure at Los Alamos National Laboratory are used to validate the proposed method. Results showed that the levels of nonlinear damage can be identified precisely by the developed artificial neural networks. Moreover, artificial neural networks trained with summation frequency response functions give more precise damage detection results than artificial neural networks trained with individual frequency response functions. The proposed method is therefore a promising tool for structural assessment in a real structure because it gives reliable results with experimental data for nonlinear damage detection, which renders the frequency response function–based method convenient for structural health monitoring.
Abstract:
Diagnosis of articular cartilage pathology in the early disease stages using current clinical diagnostic imaging modalities is challenging, particularly because there is often no visible change in the tissue surface and matrix content, such as proteoglycans (PG). In this study, we propose the use of near infrared (NIR) spectroscopy to spatially map PG content in articular cartilage. The relationship between NIR spectra and reference data (PG content) obtained from histology of normal and artificially induced PG-depleted cartilage samples was investigated using principal component (PC) and partial least squares (PLS) regression analyses. A significant correlation was obtained between the two data sets (R² = 91.40%, p < 0.0001). The resulting correlation was used to predict PG content from spectra acquired from a whole joint sample; this was then employed to spatially map this component of cartilage across the intact sample. We conclude that NIR spectroscopy is a feasible tool for evaluating cartilage contents and mapping their distribution across mammalian joint