893 resultados para principal component regression
Resumo:
Radiomics is the high-throughput extraction and analysis of quantitative image features. For non-small cell lung cancer (NSCLC) patients, radiomics can be applied to standard of care computed tomography (CT) images to improve tumor diagnosis, staging, and response assessment. The first objective of this work was to show that CT image features extracted from pre-treatment NSCLC tumors could be used to predict tumor shrinkage in response to therapy. This is important since tumor shrinkage is an important cancer treatment endpoint that is correlated with probability of disease progression and overall survival. Accurate prediction of tumor shrinkage could also lead to individually customized treatment plans. To accomplish this objective, 64 stage NSCLC patients with similar treatments were all imaged using the same CT scanner and protocol. Quantitative image features were extracted and principal component regression with simulated annealing subset selection was used to predict shrinkage. Cross validation and permutation tests were used to validate the results. The optimal model gave a strong correlation between the observed and predicted shrinkages with . The second objective of this work was to identify sets of NSCLC CT image features that are reproducible, non-redundant, and informative across multiple machines. Feature sets with these qualities are needed for NSCLC radiomics models to be robust to machine variation and spurious correlation. To accomplish this objective, test-retest CT image pairs were obtained from 56 NSCLC patients imaged on three CT machines from two institutions. For each machine, quantitative image features with concordance correlation coefficient values greater than 0.90 were considered reproducible. Multi-machine reproducible feature sets were created by taking the intersection of individual machine reproducible feature sets. Redundant features were removed through hierarchical clustering. The findings showed that image feature reproducibility and redundancy depended on both the CT machine and the CT image type (average cine 4D-CT imaging vs. end-exhale cine 4D-CT imaging vs. helical inspiratory breath-hold 3D CT). For each image type, a set of cross-machine reproducible, non-redundant, and informative image features was identified. Compared to end-exhale 4D-CT and breath-hold 3D-CT, average 4D-CT derived image features showed superior multi-machine reproducibility and are the best candidates for clinical correlation.
Resumo:
Copyright © (2014) by the International Machine Learning Society (IMLS) All rights reserved. Classical methods such as Principal Component Analysis (PCA) and Canonical Correlation Analysis (CCA) are ubiquitous in statistics. However, these techniques are only able to reveal linear re-lationships in data. Although nonlinear variants of PCA and CCA have been proposed, these are computationally prohibitive in the large scale. In a separate strand of recent research, randomized methods have been proposed to construct features that help reveal nonlinear patterns in data. For basic tasks such as regression or classification, random features exhibit little or no loss in performance, while achieving drastic savings in computational requirements. In this paper we leverage randomness to design scalable new variants of nonlinear PCA and CCA; our ideas extend to key multivariate analysis tools such as spectral clustering or LDA. We demonstrate our algorithms through experiments on real- world data, on which we compare against the state-of-the-art. A simple R implementation of the presented algorithms is provided.
Resumo:
Subspace monitoring has recently been proposed as a condition monitoring tool that requires considerably fewer variables to be analysed compared to dynamic principal component analysis (PCA). This paper analyses subspace monitoring in identifying and isolating fault conditions, which reveals that the existing work suffers from inherent limitations if complex fault senarios arise. Based on the assumption that the fault signature is deterministic while the monitored variables are stochastic, the paper introduces a regression-based reconstruction technique to overcome these limitations. The utility of the proposed fault identification and isolation method is shown using a simulation example and the analysis of experimental data from an industrial reactive distillation unit.
Resumo:
Single component geochemical maps are the most basic representation of spatial elemental distributions and commonly used in environmental and exploration geochemistry. However, the compositional nature of geochemical data imposes several limitations on how the data should be presented. The problems relate to the constant sum problem (closure), and the inherently multivariate relative information conveyed by compositional data. Well known is, for instance, the tendency of all heavy metals to show lower values in soils with significant contributions of diluting elements (e.g., the quartz dilution effect); or the contrary effect, apparent enrichment in many elements due to removal of potassium during weathering. The validity of classical single component maps is thus investigated, and reasonable alternatives that honour the compositional character of geochemical concentrations are presented. The first recommended such method relies on knowledge-driven log-ratios, chosen to highlight certain geochemical relations or to filter known artefacts (e.g. dilution with SiO2 or volatiles). This is similar to the classical normalisation approach to a single element. The second approach uses the (so called) log-contrasts, that employ suitable statistical methods (such as classification techniques, regression analysis, principal component analysis, clustering of variables, etc.) to extract potentially interesting geochemical summaries. The caution from this work is that if a compositional approach is not used, it becomes difficult to guarantee that any identified pattern, trend or anomaly is not an artefact of the constant sum constraint. In summary the authors recommend a chain of enquiry that involves searching for the appropriate statistical method that can answer the required geological or geochemical question whilst maintaining the integrity of the compositional nature of the data. The required log-ratio transformations should be applied followed by the chosen statistical method. Interpreting the results may require a closer working relationship between statisticians, data analysts and geochemists.
Resumo:
The problem of regression under Gaussian assumptions is treated generally. The relationship between Bayesian prediction, regularization and smoothing is elucidated. The ideal regression is the posterior mean and its computation scales as O(n3), where n is the sample size. We show that the optimal m-dimensional linear model under a given prior is spanned by the first m eigenfunctions of a covariance operator, which is a trace-class operator. This is an infinite dimensional analogue of principal component analysis. The importance of Hilbert space methods to practical statistics is also discussed.
Resumo:
A model to predict the buildup of mainly traffic-generated volatile organic compounds or VOCs (toluene, ethylbenzene, ortho-xylene, meta-xylene, and para-xylene) on urban road surfaces is presented. The model required three traffic parameters, namely average daily traffic (ADT), volume to capacity ratio (V/C), and surface texture depth (STD), and two chemical parameters, namely total suspended solid (TSS) and total organic carbon (TOC), as predictor variables. Principal component analysis and two phase factor analysis were performed to characterize the model calibration parameters. Traffic congestion was found to be the underlying cause of traffic-related VOC buildup on urban roads. The model calibration was optimized using orthogonal experimental design. Partial least squares regression was used for model prediction. It was found that a better optimized orthogonal design could be achieved by including the latent factors of the data matrix into the design. The model performed fairly accurately for three different land uses as well as five different particle size fractions. The relative prediction errors were 10–40% for the different size fractions and 28–40% for the different land uses while the coefficients of variation of the predicted intersite VOC concentrations were in the range of 25–45% for the different size fractions. Considering the sizes of the data matrices, these coefficients of variation were within the acceptable interlaboratory range for analytes at ppb concentration levels.
Resumo:
Many studies into construction procurement methods reveal evidence of a need to change the culture and attitude in the construction industry, transition from traditional adversarial relationships to cooperative and collaborative relationships. At the same time there is also increasing concern and discussion on alternative procurement methods, involving a movement away from traditional procurement systems. Relational contracting approaches, such as partnering and relationship management, are business strategies that align the objectives of clients, commercial participants and stakeholders. It provides a collaborative environment and a framework for all participants to adapt their behaviour to project objectives and allows for engagement of those subcontractors and suppliers down the supply chain. The efficacy of relationship management in the client and contractor groups is proven and well documented. However, the industry has a history of slow implementation of relational contracting down the supply chain. Furthermore, there exists little research on relationship management conducted in the supply chain context. This research aims to explore the association between relational contracting structures and processes and supply chain sustainability in the civil engineering construction industry. It endeavours to shed light on the practices and prerequisites for relationship management implementation success and for supply sustainability to develop. The research methodology is a triangulated approach based on Cheung.s (2006) earlier research where questionnaire survey, interviews and case studies were conducted. This new research includes a face-to-face questionnaire survey that was carried out with 100 professionals from 27 contracting organisations in Queensland from June 2008 to January 2009. A follow-up survey sub-questionnaire, further examining project participants. perspectives was sent to another group of professionals (as identified in the main questionnaire survey). Statistical analysis including multiple regression, correlation, principal component factor analysis and analysis of variance were used to identify the underlying dimensions and test the relationships among variables. Interviews and case studies were conducted to assist in providing a deeper understanding as well as explaining findings of the quantitative study. The qualitative approaches also gave the opportunity to critique and validate the research findings. This research presents the implementation of relationship management from the contractor.s perspective. Findings show that the adaption of relational contracting approach in the supply chain is found to be limited; contractors still prefer to keep the suppliers and subcontractors at arm.s length. This research shows that the degree of match and mismatch between organisational structuring and organisational process has an impact on staff.s commitment level and performance effectiveness. Key issues affecting performance effectiveness and relationship effectiveness include total influence between parties, access to information, personal acquaintance, communication process, risk identification, timely problem solving and commercial framework. Findings also indicate that alliance and Early Contractor Involvement (ECI) projects achieve higher performance effectiveness at both short-term and long-term levels compared to projects with either no or partial relationship management adopted.
Resumo:
In this paper, spatially offset Raman spectroscopy (SORS) is demonstrated for non-invasively investigating the composition of drug mixtures inside an opaque plastic container. The mixtures consisted of three components including a target drug (acetaminophen or phenylephrine hydrochloride) and two diluents (glucose and caffeine). The target drug concentrations ranged from 5% to 100%. After conducting SORS analysis to ascertain the Raman spectra of the concealed mixtures, principal component analysis (PCA) was performed on the SORS spectra to reveal trends within the data. Partial least squares (PLS) regression was used to construct models that predicted the concentration of each target drug, in the presence of the other two diluents. The PLS models were able to predict the concentration of acetaminophen in the validation samples with a root-mean-square error of prediction (RMSEP) of 3.8% and the concentration of phenylephrine hydrochloride with an RMSEP of 4.6%. This work demonstrates the potential of SORS, used in conjunction with multivariate statistical techniques, to perform non-invasive, quantitative analysis on mixtures inside opaque containers. This has applications for pharmaceutical analysis, such as monitoring the degradation of pharmaceutical products on the shelf, in forensic investigations of counterfeit drugs, and for the analysis of illicit drug mixtures which may contain multiple components.
Resumo:
The content and context of work significantly influences an employees’ satisfaction. While managers see work motivation as a tool to engage the employees so that they perform better, academicians value work motivation for its contribution to human behaviour. Though the relationship between employee motivation and project success has been extensively covered in the literature, more research focusing on the nature of job design on project success may have been wanting. We address this gap through this study. The present study contributes to the extant literature by suggesting an operational framework of work motivation for project—based organizations. We are also advancing the conceptual understanding of this variable by understanding how the different facets of work motivation have a differing impact of the various parameters of project performance. A survey instrument using standardized scales of work motivation and project success was used. 199 project workers from various industries completed the survey. We first ‘operationalized’ the definition of work motivation for the purpose of our study through a principal component analysis of work motivation items. We obtained a five factor structure that had items pertaining to employee development, work climate, goal clarity, and job security. We then performed a Pearson’s correlation analysis which revealed moderate to significant relationship between project outcomes ad work climate; project outcomes & employee development. In order to establish a causality between work motivation and project management success, we employed linear regression analysis. The results show that work climate is a significant predictor of client satisfaction, while it moderately influences the project quality. Further, bringing in objectivity to project work is important for a successful implementation.
Resumo:
Traffic generated semi and non volatile organic compounds (SVOCs and NVOCs) pose a serious threat to human and ecosystem health when washed off into receiving water bodies by stormwater. Climate change influenced rainfall characteristics makes the estimation of these pollutants in stormwater quite complex. The research study discussed in the paper developed a prediction framework for such pollutants under the dynamic influence of climate change on rainfall characteristics. It was established through principal component analysis (PCA) that the intensity and durations of low to moderate rain events induced by climate change mainly affect the wash-off of SVOCs and NVOCs from urban roads. The study outcomes were able to overcome the limitations of stringent laboratory preparation of calibration matrices by extracting uncorrelated underlying factors in the data matrices through systematic application of PCA and factor analysis (FA). Based on the initial findings from PCA and FA, the framework incorporated orthogonal rotatable central composite experimental design to set up calibration matrices and partial least square regression to identify significant variables in predicting the target SVOCs and NVOCs in four particulate fractions ranging from >300-1 μm and one dissolved fraction of <1 μm. For the particulate fractions range >300-1 μm, similar distributions of predicted and observed concentrations of the target compounds from minimum to 75th percentile were achieved. The inter-event coefficient of variations for particulate fractions of >300-1 μm were 5% to 25%. The limited solubility of the target compounds in stormwater restricted the predictive capacity of the proposed method for the dissolved fraction of <1 μm.
Resumo:
Objective The aim of this study was to demonstrate the potential of near-infrared (NIR) spectroscopy for categorizing cartilage degeneration induced in animal models. Method Three models of osteoarthritic degeneration were induced in laboratory rats via one of the following methods: (i) menisectomy (MSX); (ii) anterior cruciate ligament transaction (ACLT); and (iii) intra-articular injection of mono-ido-acetete (1 mg) (MIA), in the right knee joint, with 12 rats per model group. After 8 weeks, the animals were sacrificed and tibial knee joints were collected. A custom-made nearinfrared (NIR) probe of diameter 5 mm was placed on the cartilage surface and spectral data were acquired from each specimen in the wavenumber range 4 000 – 12 500 cm−1. Following spectral data acquisition, the specimens were fixed and Safranin–O staining was performed to assess disease severity based on the Mankin scoring system. Using multivariate statistical analysis based on principal component analysis and partial least squares regression, the spectral data were then related to the Mankinscores of the samples tested. Results Mild to severe degenerative cartilage changes were observed in the subject animals. The ACLT models showed mild cartilage degeneration, MSX models moderate, and MIA severe cartilage degenerative changes both morphologically and histologically. Our result demonstrate that NIR spectroscopic information is capable of separating the cartilage samples into different groups relative to the severity of degeneration, with NIR correlating significantly with their Mankinscore (R2 = 88.85%). Conclusion We conclude that NIR is a viable tool for evaluating articularcartilage health and physical properties such as change in thickness with degeneration.
Resumo:
Determining the properties and integrity of subchondral bone in the developmental stages of osteoarthritis, especially in a form that can facilitate real-time characterization for diagnostic and decision-making purposes, is still a matter for research and development. This paper presents relationships between near infrared absorption spectra and properties of subchondral bone obtained from 3 models of osteoarthritic degeneration induced in laboratory rats via: (i) menisectomy (MSX); (ii) anterior cruciate ligament transaction (ACL); and (iii) intra-articular injection of mono-ido-acetate (1 mg) (MIA), in the right knee joint, with 12 rats per model group (N = 36). After 8 weeks, the animals were sacrificed and knee joints were collected. A custom-made diffuse reflectance NIR probe of diameter 5 mm was placed on the tibial surface and spectral data were acquired from each specimen in the wavenumber range 4000–12 500 cm− 1. After spectral acquisition, micro computed tomography (micro-CT) was performed on the samples and subchondral bone parameters namely: bone volume (BV) and bone mineral density (BMD) were extracted from the micro-CT data. Statistical correlation was then conducted between these parameters and regions of the near infrared spectra using multivariate techniques including principal component analysis (PCA), discriminant analysis (DA), and partial least squares (PLS) regression. Statistically significant linear correlations were found between the near infrared absorption spectra and subchondral bone BMD (R2 = 98.84%) and BV (R2 = 97.87%). In conclusion, near infrared spectroscopic probing can be used to detect, qualify and quantify changes in the composition of the subchondral bone, and could potentially assist in distinguishing healthy from OA bone as demonstrated with our laboratory rat models.
Resumo:
Near-infrared spectroscopy (NIRS) calibrations were developed for the discrimination of Chinese hawthorn (Crataegus pinnatifida Bge. var. major) fruit from three geographical regions as well as for the estimation of the total sugar, total acid, total phenolic content, and total antioxidant activity. Principal component analysis (PCA) was used for the discrimination of the fruit on the basis of their geographical origin. Three pattern recognition methods, linear discriminant analysis, partial least-squares-discriminant analysis, and back-propagation artificial neural networks, were applied to classify and compare these samples. Furthermore, three multivariate calibration models based on the first derivative NIR spectroscopy, partial least-squares regression, back-propagation artificial neural networks, and least-squares-support vector machines, were constructed for quantitative analysis of the four analytes, total sugar, total acid, total phenolic content, and total antioxidant activity, and validated by prediction data sets.
Resumo:
The main purpose of this article is to gain an insight into the relationships between variables describing the environmental conditions of the Far Northern section of the Great Barrier Reef, Australia. Several of the variables describing these conditions had different measurement levels and often they had non-linear relationships. Using non-linear principal component analysis, it was possible to acquire an insight into these relationships. Furthermore, three geographical areas with unique environmental characteristics could be identified.
Resumo:
Diagnosis of articular cartilage pathology in the early disease stages using current clinical diagnostic imaging modalities is challenging, particularly because there is often no visible change in the tissue surface and matrix content, such as proteoglycans (PG). In this study, we propose the use of near infrared (NIR) spectroscopy to spatially map PG content in articular cartilage. The relationship between NIR spectra and reference data (PG content) obtained from histology of normal and artificially induced PG-depleted cartilage samples was investigated using principal component (PC) and partial least squares (PLS) regression analyses. Significant correlation was obtained between both data (R2 = 91.40%, p<0.0001). The resulting correlation was used to predict PG content from spectra acquired from whole joint sample, this was then employed to spatially map this component of cartilage across the intact sample. We conclude that NIR spectroscopy is a feasible tool for evaluating cartilage contents and mapping their distribution across mammalian joint