61 results for least squares learning
Abstract:
Many of the most interesting questions ecologists ask lead to analyses of spatial data. Yet, perhaps confused by the large number of statistical models and fitting methods available, many ecologists seem to believe this is best left to specialists. Here, we describe the issues that need consideration when analysing spatial data and illustrate these using simulation studies. Our comparative analysis uses methods including generalized least squares, spatial filters, wavelet revised models, conditional autoregressive models and generalized additive mixed models to estimate regression coefficients from synthetic but realistic data sets, including some which violate standard regression assumptions. We assess the performance of each method using two measures and using statistical error rates for model selection. Methods that performed well included the generalized least squares family of models and a Bayesian implementation of the conditional autoregressive model. Ordinary least squares also performed adequately in the absence of model selection, but had poorly controlled Type I error rates and so did not show the improvements in performance under model selection seen with the above methods. Removing large-scale spatial trends in the response led to poor performance. These are empirical results; hence extrapolation of these findings to other situations should be performed cautiously. Nevertheless, our simulation-based approach provides much stronger evidence for comparative analysis than assessments based on single or small numbers of data sets, and should be considered a necessary foundation for statements of this type in future.
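The generalized least squares (GLS) estimator named in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' simulation code: the exponential covariance function, its `sill` and `range_` parameters, the sample size and the true coefficients are all assumptions chosen for the example. It simulates spatially correlated errors and compares GLS with ordinary least squares (OLS).

```python
import numpy as np

def exponential_covariance(coords, sill=1.0, range_=2.0):
    """Covariance that decays exponentially with distance between sites (assumed model)."""
    diffs = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    return sill * np.exp(-dist / range_)

def gls_fit(X, y, omega):
    """GLS estimate: beta = (X' Omega^-1 X)^-1 X' Omega^-1 y."""
    omega_inv_X = np.linalg.solve(omega, X)
    omega_inv_y = np.linalg.solve(omega, y)
    return np.linalg.solve(X.T @ omega_inv_X, X.T @ omega_inv_y)

def ols_fit(X, y):
    """Ordinary least squares, which ignores the error covariance."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Hypothetical synthetic data: 200 sites on a 10 x 10 square.
rng = np.random.default_rng(0)
n = 200
coords = rng.uniform(0, 10, size=(n, 2))
omega = exponential_covariance(coords)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_beta = np.array([1.0, 0.5])

# Draw spatially correlated errors via a Cholesky factor (small jitter for stability).
chol = np.linalg.cholesky(omega + 1e-10 * np.eye(n))
y = X @ true_beta + chol @ rng.normal(size=n)

print("GLS estimate:", gls_fit(X, y, omega))
print("OLS estimate:", ols_fit(X, y))
```

In this sketch the true covariance matrix is passed to `gls_fit`; in practice it would have to be estimated from the data, which is one of the modelling choices the abstract refers to.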
Abstract:
A study was undertaken to examine a range of sample preparation and near infrared reflectance spectroscopy (NIRS) methodologies, using undried samples, for predicting organic matter digestibility (OMD, g/kg) and ad libitum intake (g/kg W^0.75) of grass silages. A total of eight sample preparation/NIRS scanning methods were examined involving three extents of silage comminution, two liquid extracts and scanning via either external probe (1100-2200 nm) or internal cell (1100-2500 nm). The spectral data (log 1/R) for each of the eight methods were examined by three regression techniques, each with a range of data transformations. The 136 silages used in the study were obtained from farms across Northern Ireland, over a two-year period, and had in vivo OMD (sheep) and ad libitum intake (cattle) determined under uniform conditions. In the comparisons of the eight sample preparation/scanning methods, and the differing mathematical treatments of the spectral data, the sample population was divided into calibration (n = 91) and validation (n = 45) sets. The standard error of performance (SEP) on the validation set was used in comparisons of prediction accuracy. Across all eight sample preparation/scanning methods, the modified partial least squares (MPLS) technique generally minimized SEPs for both OMD and intake. The accuracy of prediction also increased with degree of comminution of the forage and with scanning by internal cell rather than external probe. The system providing the lowest SEP used the MPLS regression technique on spectra from the finely milled material scanned through the internal cell. This resulted in SEP and R² (variance accounted for in the validation set) values of 24 g/kg OM and 0.88 for OMD, and 5.37 g/kg W^0.75 and 0.77 for intake, respectively. These data indicate that with appropriate techniques NIRS scanning of undried samples of grass silage can produce predictions of intake and digestibility with accuracies similar to those achieved previously using NIRS with dried samples. (C) 1998 Elsevier Science B.V.
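The validation statistics used in the abstract above (SEP and R² on a held-out validation set) can be illustrated with a short sketch. It assumes one common definition of SEP, the bias-corrected standard deviation of the prediction residuals; the abstract does not state which variant was used, and the reference and predicted values below are invented for the example.

```python
import math

def standard_error_of_performance(reference, predicted):
    """Bias-corrected SEP: sqrt(sum((e_i - bias)^2) / (n - 1)), with e_i = predicted - reference."""
    n = len(reference)
    residuals = [p - r for p, r in zip(predicted, reference)]
    bias = sum(residuals) / n
    return math.sqrt(sum((e - bias) ** 2 for e in residuals) / (n - 1))

def r_squared(reference, predicted):
    """Proportion of variance in the reference values accounted for by the predictions."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot

# Hypothetical validation-set values (OMD in g/kg), not data from the study.
reference_omd = [650, 700, 720, 680, 740]
predicted_omd = [660, 690, 730, 675, 745]

sep = standard_error_of_performance(reference_omd, predicted_omd)
r2 = r_squared(reference_omd, predicted_omd)
print(f"SEP = {sep:.2f} g/kg, R^2 = {r2:.3f}")
```

Because this variant subtracts the mean residual, a calibration that is wrong by a constant offset still gets a low SEP, so the bias is usually reported alongside it.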
Abstract:
A study combining high resolution mass spectrometry (liquid chromatography-quadrupole time-of-flight-mass spectrometry, UPLC-QTof-MS) and chemometrics for the analysis of post-mortem brain tissue from subjects with Alzheimer’s disease (AD) (n = 15) and healthy age-matched controls (n = 15) was undertaken. The huge potential of this metabolomics approach for distinguishing AD cases is underlined by the correct prediction of disease status in 94–97% of cases. Predictive power was confirmed in a blind test set of 60 samples, reaching 100% diagnostic accuracy. The approach also indicated compounds significantly altered in concentration following the onset of human AD. Using orthogonal partial least-squares discriminant analysis (OPLS-DA), a multivariate model was created for both modes of acquisition explaining the maximum amount of variation between sample groups (positive mode: R² = 97%, Q² = 93%, root mean squared error of validation (RMSEV) = 13%; negative mode: R² = 99%, Q² = 92%, RMSEV = 15%). In brain extracts, 1264 and 1457 ions of interest were detected in the positive and negative modes of acquisition, respectively. Incorporation of gender into the model increased predictive accuracy and decreased RMSEV values. High resolution UPLC-QTof-MS has not previously been employed to biochemically profile post-mortem brain tissue, and the novel methods described and validated herein prove its potential for making new discoveries related to the etiology, pathophysiology, and treatment of degenerative brain disorders.
Abstract:
The techniques of principal component analysis (PCA) and partial least squares (PLS) are introduced from the point of view of providing a multivariate statistical method for modelling process plants. The advantages and limitations of PCA and PLS are discussed from the perspective of the type of data and problems that might be encountered in this application area. These concepts are exemplified by two case studies dealing first with data from a continuous stirred tank reactor (CSTR) simulation and second a literature source describing a low-density polyethylene (LDPE) reactor simulation.
Abstract:
Objective
To examine whether early inflammation is related to cortisol levels at 18 months corrected age (CA) in children born very preterm.
Study Design
Infants born ≤ 32 weeks gestational age were recruited in the NICU, and placental histopathology, MRI, and chart review were obtained. At 18 months CA developmental assessment and collection of 3 salivary cortisol samples were carried out. Generalized least squares was used to analyze data from 85 infants providing 222 cortisol samples.
Results
Infants exposed to chorioamnionitis with funisitis had a significantly different pattern of cortisol across the samples compared to infants with chorioamnionitis alone or no prenatal inflammation (F[4,139] = 7.3996, P < .0001). Postnatal infections, necrotizing enterocolitis and chronic lung disease were not significantly associated with the cortisol pattern at 18 months CA.
Conclusion
In children born very preterm, prenatal inflammatory stress may contribute to altered programming of the HPA axis.
Keywords: preterm, chorioamnionitis, funisitis, premature infants, hypothalamic-pituitary-adrenal axis, infection, cortisol, stress
Abstract:
Objective
To investigate the effect of fast food consumption on mean population body mass index (BMI) and explore the possible influence of market deregulation on fast food consumption and BMI.
Methods
The within-country association between fast food consumption and BMI in 25 high-income member countries of the Organisation for Economic Co-operation and Development between 1999 and 2008 was explored through multivariate panel regression models, after adjustment for per capita gross domestic product, urbanization, trade openness, lifestyle indicators and other covariates. The possible mediating effect of annual per capita intake of soft drinks, animal fats and total calories on the association between fast food consumption and BMI was also analysed. Two-stage least squares regression models were conducted, using economic freedom as an instrumental variable, to study the causal effect of fast food consumption on BMI.
Findings
After adjustment for covariates, each 1-unit increase in annual fast food transactions per capita was associated with an increase of 0.033 kg/m² in age-standardized BMI (95% confidence interval, CI: 0.013–0.052). Only the intake of soft drinks – not animal fat or total calories – mediated the observed association (β: 0.030; 95% CI: 0.010–0.050). Economic freedom was an independent predictor of fast food consumption (β: 0.27; 95% CI: 0.16–0.37). When economic freedom was used as an instrumental variable, the association between fast food and BMI weakened but remained significant (β: 0.023; 95% CI: 0.001–0.045).
Conclusion
Fast food consumption is an independent predictor of mean BMI in high-income countries. Market deregulation policies may contribute to the obesity epidemic by facilitating the spread of fast food.
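The two-stage least squares idea used in the study above can be sketched with one regressor and one instrument. This is an illustration under stated assumptions, not the study's panel model: the variables `z`, `x`, `y` and the unobserved confounder `u` are simulated, the coefficients are invented, and no covariates or country effects are included.

```python
import numpy as np

def ols(X, y):
    """Least squares coefficients for design matrix X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def two_stage_least_squares(y, x, z):
    """2SLS with one endogenous regressor x and one instrument z (intercept included)."""
    n = len(y)
    Z = np.column_stack([np.ones(n), z])
    x_hat = Z @ ols(Z, x)                      # stage 1: project x onto the instrument
    X_hat = np.column_stack([np.ones(n), x_hat])
    return ols(X_hat, y)                       # stage 2: regress y on the fitted x

# Hypothetical data: u confounds x and y, z affects y only through x.
rng = np.random.default_rng(1)
n = 5000
z = rng.normal(size=n)
u = rng.normal(size=n)
x = 0.8 * z + u + rng.normal(size=n)
y = 2.0 + 0.5 * x + u + rng.normal(size=n)     # true effect of x on y is 0.5

naive = ols(np.column_stack([np.ones(n), x]), y)
iv = two_stage_least_squares(y, x, z)
print("naive OLS slope:", naive[1])
print("2SLS slope:     ", iv[1])
```

The sketch returns point estimates only. Standard errors taken directly from the second-stage regression are not valid for 2SLS and would need the usual correction.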
Abstract:
Thermocouples are one of the most popular devices for temperature measurement due to their robustness, ease of manufacture and installation, and low cost. However, when used in certain harsh environments, for example, in combustion systems and engine exhausts, large wire diameters are required, and consequently the measurement bandwidth is reduced. This article discusses a software compensation technique to address the loss of high frequency fluctuations based on measurements from two thermocouples. In particular, a difference equation (DE) approach is proposed and compared with existing methods both in simulation and on experimental test rig data with constant flow velocity. It is found that the DE algorithm, combined with the use of generalized total least squares for parameter identification, provides better performance in terms of time constant estimation without any a priori assumption on the time constant ratios of the thermocouples.
Abstract:
BACKGROUND: It is now common for individuals to require dialysis following the failure of a kidney transplant. Management of complications and preparation for dialysis are suboptimal in this group. To aid planning, it is desirable to estimate the time to dialysis requirement. The rate of decline in the estimated glomerular filtration rate (eGFR) may be used to this end.
METHODS: This study compared the rate of eGFR decline prior to dialysis commencement between individuals with failing transplants and transplant-naïve patients. The rate of eGFR decline was also compared between transplant recipients with and without graft failure. eGFR was calculated using the four-variable MDRD equation with rate of decline calculated by least squares linear regression.
RESULTS: The annual rate of eGFR decline in incident dialysis patients with graft failure exceeded that of the transplant-naïve incident dialysis patients. In the transplant cohort, the mean annual rate of eGFR decline prior to graft failure was 7.3 ml/min/1.73 m² compared to 4.8 ml/min/1.73 m² in the transplant-naïve group (p < 0.001) and 0.35 ml/min/1.73 m² in recipients without graft failure (p < 0.001). Factors associated with eGFR decline were recipient age, decade of transplantation, HLA mismatch and histological evidence of chronic immunological injury.
CONCLUSIONS: Individuals with graft failure have a rapid decline in eGFR prior to dialysis commencement. To improve outcomes, dialysis planning and management of chronic kidney disease complications should be initiated earlier than in the transplant-naïve population.
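The rate-of-decline calculation described in the methods above (a least squares line through serial eGFR values) can be sketched as follows. The measurements and the planning threshold are hypothetical, and the helper `years_until` is an illustrative addition for the planning use case mentioned in the background, not something the study reports.

```python
import numpy as np

def annual_egfr_slope(years, egfr):
    """Least squares line through (time, eGFR); returns (slope per year, intercept)."""
    slope, intercept = np.polyfit(years, egfr, deg=1)
    return slope, intercept

def years_until(threshold, slope, intercept):
    """Extrapolated time at which the fitted line reaches `threshold`; None if eGFR is not falling."""
    if slope >= 0:
        return None
    return (threshold - intercept) / slope

# Hypothetical serial measurements for one patient (years from baseline, ml/min/1.73 m²).
years = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
egfr = np.array([30.0, 26.5, 23.0, 19.5, 16.0])

slope, intercept = annual_egfr_slope(years, egfr)
print(f"annual change: {slope:.2f} ml/min/1.73 m² per year")
print("years until eGFR reaches 9 (assumed threshold):", years_until(9.0, slope, intercept))
```

A straight-line extrapolation assumes the decline stays linear, which is a simplification; the threshold of 9 is an arbitrary value chosen for the example.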
Abstract:
Tropical peatlands represent globally important carbon sinks with a unique biodiversity and are currently threatened by climate change and human activities. It is now imperative that proxy methods are developed to understand the ecohydrological dynamics of these systems and for testing peatland development models. Testate amoebae have been used as environmental indicators in ecological and palaeoecological studies of peatlands, primarily in ombrotrophic Sphagnum-dominated peatlands in the mid- and high-latitudes. We present the first ecological analysis of testate amoebae in a tropical peatland, a nutrient-poor domed bog in western (Peruvian) Amazonia. Litter samples were collected from different hydrological microforms (hummock to pool) along a transect from the edge to the interior of the peatland. We recorded 47 taxa from 21 genera. The most common taxa are Cryptodifflugia oviformis, Euglypha rotunda type, Phryganella acropodia, Pseudodifflugia fulva type and Trinema lineare. One species found only in the southern hemisphere, Argynnia spicata, is present. Arcella spp., Centropyxis aculeata and Lesqueresia spiralis are indicators of pools containing standing water. Canonical correspondence analysis and non-metric multidimensional scaling illustrate that water table depth is a significant control on the distribution of testate amoebae, similar to the results from mid- and high-latitude peatlands. A transfer function model for water table based on weighted averaging partial least-squares (WAPLS) regression is presented and performs well under cross-validation (r² apparent = 0.76, RMSE = 4.29; r² jack = 0.68, RMSEP = 5.18). The transfer function was applied to a 1-m peat core, and sample-specific reconstruction errors were generated using bootstrapping. The reconstruction generally suggests near-surface water tables over the last 3,000 years, with a shift to drier conditions at c. cal. 1218-1273 AD.
Abstract:
Brain tissue from so-called Alzheimer's disease (AD) mouse models has previously been examined using ¹H NMR-metabolomics, but comparable information concerning human AD is negligible. Since no animal model recapitulates all the features of human AD, we undertook the first ¹H NMR-metabolomics investigation of human AD brain tissue. Human post-mortem tissue from 15 AD subjects and 15 age-matched controls was prepared for analysis through a series of lyophilisation, milling, extraction and randomisation steps, and samples were analysed using ¹H NMR. Using partial least squares discriminant analysis, a model was built using data obtained from brain extracts. Analysis of brain extracts led to the elucidation of 24 metabolites. Significant elevations in brain alanine (15.4 %) and taurine (18.9 %) were observed in AD patients (p ≤ 0.05). Pathway topology analysis implicated either dysregulation of taurine and hypotaurine metabolism or alanine, aspartate and glutamate metabolism. Furthermore, screening of metabolites for AD biomarkers demonstrated that individual metabolites weakly discriminated cases of AD [receiver operating characteristic (ROC) AUC < 0.67; p < 0.05]. However, paired metabolite ratios (e.g. alanine/carnitine) were more powerful discriminating tools (ROC AUC = 0.76; p < 0.01). This study further demonstrates the potential of metabolomics for elucidating the underlying biochemistry and helping to identify AD in patients attending the memory clinic.
Abstract:
Many AMS systems can measure 14C, 13C and 12C simultaneously, thus providing δ13C values which can be used for fractionation normalization without the need for offline 13C/12C measurements on isotope ratio mass spectrometers (IRMS). However, AMS δ13C values on our 0.5 MV NEC Compact Accelerator often differ from IRMS values on the same material by 4-5‰ or more. It has been postulated that the AMS δ13C values account for the potential graphitization and machine-induced fractionation, in addition to natural fractionation, but how much does this affect the 14C ages or F14C? We present an analysis of F14C as a linear least squares fit with AMS δ13C results for several of our secondary standards. While there are samples for which there is an obvious correlation between AMS δ13C and F14C, as quantified with the calculated probability of no correlation, we find that the trend lies within one standard deviation of the variance on our F14C measurements. Our laboratory produces both zinc- and hydrogen-reduced graphite, and we present our results for each type. Additionally, we show the variance on our AMS δ13C measurements of our secondary standards.
Abstract:
In this paper, a multiloop robust control strategy is proposed based on H∞ control and a partial least squares (PLS) model (H∞_PLS) for multivariable chemical processes. It is developed especially for multivariable systems in ill-conditioned plants and non-square systems. The advantage of PLS is to extract the strongest relationship between the input and the output variables in the reduced space of the latent variable model rather than in the original space of the high-dimensional variables. Without conventional decouplers, the dynamic PLS framework automatically decomposes the MIMO process into multiple single-loop systems in the PLS subspace so that the controller design can be simplified. Since plant/model mismatch is almost inevitable in practical applications, to enhance the robustness of this control system, the controllers based on the H∞ mixed sensitivity problem are designed in the PLS latent subspace. The feasibility and the effectiveness of the proposed approach are illustrated by the simulation results of a distillation column and a mixing tank process. Comparisons between H∞_PLS control and conventional individual control (either H∞ control or PLS control only) are also made.
Abstract:
One of the major challenges in systems biology is to understand the complex responses of a biological system to external perturbations or internal signalling depending on its biological conditions. Genome-wide transcriptomic profiling of cellular systems under various chemical perturbations allows the manifestation of certain features of the chemicals through their transcriptomic expression profiles. The insights obtained may help to establish the connections between human diseases, associated genes and therapeutic drugs. The main objective of this study was to systematically analyse cellular gene expression data under various drug treatments to elucidate drug-feature specific transcriptomic signatures. We first extracted drug-related information (drug features) from the collected textual description of DrugBank entries using text-mining techniques. A novel statistical method employing orthogonal least square learning was proposed to obtain drug-feature-specific signatures by integrating gene expression with DrugBank data. To obtain robust signatures from noisy input datasets, a stringent ensemble approach was applied with the combination of three techniques: resampling, leave-one-out cross validation, and aggregation. The validation experiments showed that the proposed method has the capacity of extracting biologically meaningful drug-feature-specific gene expression signatures. It was also shown by regulatory network analysis that most of the signature genes are connected with common hub genes. The common hub genes were further shown to be related to general drug metabolism by Gene Ontology analysis. Each set of genes has relatively few interactions with other sets, indicating the modular nature of each signature and its drug-feature-specificity. Based on Gene Ontology analysis, we also found that each set of drug feature (DF)-specific genes was indeed enriched in biological processes related to the drug feature. The results of these experiments demonstrated the potential of the method for predicting certain features of new drugs using their transcriptomic profiles, providing a useful methodological framework and a valuable resource for drug development and characterization.
Abstract:
The aim of the study was to investigate the potential of a metabolomics platform to distinguish between pigs treated with ronidazole, dimetridazole and metronidazole and non-medicated animals (controls), at two withdrawal periods (day 0 and 5). Livers from each animal were biochemically profiled using UHPLC–QTof-MS in ESI+ mode of acquisition. Several Orthogonal Partial Least Squares-Discriminant Analysis models were generated from the acquired mass spectrometry data. The models classified animals into the two groups, control and treated. A total of 42 ions of interest explained the variation in ESI+. It was possible to find the identity of 3 of the ions and to positively classify 4 of the ionic features, which can be used as potential biomarkers of illicit 5-nitroimidazole abuse. Further evidence of the toxic mechanisms of 5-nitroimidazole drugs has been revealed, which may be of substantial importance as metronidazole is widely used in human medicine.
Abstract:
A geostatistical version of the classical Fisher rule (linear discriminant analysis) is presented. This method is applicable when a large dataset of multivariate observations is available within a domain split into several known subdomains, and it assumes that the variograms (or covariance functions) are comparable between subdomains, which only differ in the mean values of the available variables. The method consists of finding the eigen-decomposition of the matrix W⁻¹B, where W is the matrix of sills of all direct- and cross-variograms, and B is the covariance matrix of the vectors of weighted means within each subdomain, obtained by generalized least squares. The method is used to map peat blanket occurrence in Northern Ireland, with data from the Tellus survey, which requires a minimal change to the general recipe: to use compositionally-compliant variogram tools and models, and work with log-ratio transformed data.
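The eigen-decomposition step described in the abstract above can be sketched numerically. This is an illustration under stated assumptions only: `W` and the subdomain means are made-up numbers standing in for the variogram sills and the generalized least squares means, and the Cholesky-based route is one convenient way to decompose W⁻¹B rather than the authors' stated procedure.

```python
import numpy as np

def discriminant_directions(W, B):
    """Eigen-decomposition of W^-1 B for symmetric positive definite W and symmetric B.

    Uses W = L L' so that L^-1 B L^-T is symmetric and shares its eigenvalues
    with W^-1 B; eigenvectors are mapped back with L^-T.
    """
    L = np.linalg.cholesky(W)
    A = np.linalg.solve(L, B)            # L^-1 B
    M = np.linalg.solve(L, A.T).T        # L^-1 B L^-T (symmetric)
    eigvals, U = np.linalg.eigh(M)
    order = np.argsort(eigvals)[::-1]    # largest discriminating power first
    eigvals, U = eigvals[order], U[:, order]
    V = np.linalg.solve(L.T, U)          # eigenvectors of W^-1 B
    return eigvals, V

# Hypothetical inputs: a 2 x 2 within-domain sill matrix and three subdomain mean vectors.
W = np.array([[2.0, 0.5],
              [0.5, 1.0]])
subdomain_means = np.array([[0.0, 0.0],
                            [2.0, 1.0],
                            [4.0, 0.5]])
B = np.cov(subdomain_means, rowvar=False)   # covariance of the subdomain means

eigvals, V = discriminant_directions(W, B)
print("eigenvalues:", eigvals)
print("discriminant directions (columns):")
print(V)
```

Projecting an observation onto the leading columns of `V` gives its discriminant scores. For compositional data, the abstract indicates that this would be done on log-ratio transformed variables.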