866 results for Ordinary Least Squares Method
Abstract:
Raman spectroscopy has been used for the first time to predict the FA composition of unextracted adipose tissue of pork, beef, lamb, and chicken. It was found that the bulk unsaturation parameters could be predicted successfully [R-2 = 0.97, root mean square error of prediction (RMSEP) = 4.6% of 4 sigma], with cis unsaturation, which accounted for the majority of the unsaturation, giving similar correlations. The combined abundance of all measured PUFA (>= 2 double bonds per chain) was also well predicted with R-2 = 0.97 and RMSEP = 4.0% of 4 sigma. Trans unsaturation was not as well modeled (R-2 = 0.52, RMSEP = 18% of 4 sigma); this reduced prediction ability can be attributed to the low levels of trans FA found in adipose tissue (0.035 times the cis unsaturation level). For the individual FA, the average partial least squares (PLS) regression coefficient of the 18 most abundant FA (relative abundances ranging from 0.1 to 38.6% of the total FA content) was R-2 = 0.73; the average RMSEP = 11.9% of 4 sigma. Regression coefficients and prediction errors for the five most abundant FA were all better than the average value (in some cases as low as RMSEP = 4.7% of 4 sigma). Cross-correlation between the abundances of the minor FA and more abundant acids could be determined by principal component analysis methods, and the resulting groups of correlated compounds were also well-predicted using PLS. The accuracy of the prediction of individual FA was at least as good as other spectroscopic methods, and the extremely straightforward sampling method meant that very rapid analysis of samples at ambient temperature was easily achieved. This work shows that Raman profiling of hundreds of samples per day is easily achievable with an automated sampling system.
Abstract:
Raman spectroscopy has been used to predict the abundance of the FA in clarified butterfat obtained from dairy cows fed a range of levels of rapeseed oil in their diet. Partial least squares regression of the Raman spectra against FA compositions obtained by GC showed good prediction for the five major (abundance >5%) FA, with R-2=0.74-0.92 and a root mean square error of prediction (RMSEP) that was 5-7% of the mean. In general, the prediction accuracy fell with decreasing abundance in the sample, but the RMSEP was 1.25%. The Raman method has the best prediction ability for unsaturated FA (R-2=0.85-0.92), and in particular trans unsaturated FA (the best-predicted FA was 18:1 tDelta9). This enhancement was attributed to the isolation of the unsaturated modes from the saturated modes and the significantly higher spectral response of unsaturated bonds compared with saturated bonds. Raman spectra of the melted butter samples could also be used to predict bulk parameters calculated from standard analyses, such as iodine value (R-2=0.80) and solid fat content at low temperature (R-2=0.87). For solid fat contents determined at higher temperatures, the prediction ability was significantly reduced (R-2=0.42); this decrease in performance was attributed to the smaller range of solid fat content values at the higher temperatures. Finally, although the prediction errors for the abundances of each of the FA in a given sample are much larger with Raman than with full GC analysis, the accuracy is acceptably high for quality control applications. This, combined with the fact that Raman spectra can be obtained with no sample preparation and with 60-s data collection times, means that high-throughput, on-line Raman analysis of butter samples should be possible.
Abstract:
Context: The masses previously obtained for the X-ray binary 2S 0921-630 implied that the compact object was either a high-mass neutron star or a low-mass black hole, but relied on a previously published value for the rotational broadening (v sin i) with large uncertainties. Aims: We aim to determine an accurate mass for the compact object through an improved measurement of the secondary star's projected equatorial rotational velocity. Methods: We have used UVES echelle spectroscopy to determine the v sin i of the secondary star (V395 Car) in the low-mass X-ray binary 2S 0921-630 by comparison with an artificially broadened spectral-type template star. In addition, we have also measured v sin i from a single high signal-to-noise ratio absorption line profile calculated using the method of Least-Squares Deconvolution (LSD). Results: We determine v sin i to lie between 31.3±0.5 km s-1 and 34.7±0.5 km s-1 (assuming zero and continuum limb darkening, respectively), in disagreement with previous results based on intermediate-resolution spectroscopy obtained with the 3.6 m NTT. Using our revised v sin i value in combination with the secondary star's radial velocity gives a binary mass ratio of 0.281±0.034. Furthermore, assuming a binary inclination angle of 75° gives a compact object mass of 1.37±0.13 M_⊙. Conclusions: We find that using relatively low-resolution spectroscopy can result in systematic uncertainties in the v sin i values measured using standard methods. We suggest the use of LSD as a secondary, reliable check of the results, as LSD allows one to directly discern the shape of the absorption line profile. In the light of the new v sin i measurement, we have revised the compact object's mass downwards, such that it is now compatible with a canonical neutron star mass.
Abstract:
Estimation and detection of the hemodynamic response (HDR) are of great importance in functional MRI (fMRI) data analysis. In this paper, we propose the use of three H∞ adaptive filters (finite memory, exponentially weighted, and time-varying) for accurate estimation and detection of the HDR. The H∞ approach is used because it safeguards against worst-case disturbances and makes no assumptions on the (statistical) nature of the signals [B. Hassibi and T. Kailath, in Proc. ICASSP, 1995, vol. 2, pp. 949-952; T. Ratnarajah and S. Puthusserypady, in Proc. 8th IEEE Workshop DSP, 1998, pp. 1483-1487]. The performance of the proposed techniques is compared with the conventional t-test method as well as the well-known least mean squares (LMS) and recursive least squares (RLS) algorithms. Extensive numerical simulations show that the proposed methods result in better HDR estimation and activation detection.
Abstract:
The conventional radial basis function (RBF) network optimization methods, such as orthogonal least squares or the two-stage selection, can produce a sparse network with satisfactory generalization capability. However, the RBF width, as a nonlinear parameter in the network, is not easy to determine. In the aforementioned methods, the width is always pre-determined, either by trial-and-error, or generated randomly. Furthermore, all hidden nodes share the same RBF width. This will inevitably reduce the network performance, and more RBF centres may then be needed to meet a desired modelling specification. In this paper we investigate a new two-stage construction algorithm for RBF networks. It utilizes the particle swarm optimization method to search for the optimal RBF centres and their associated widths. Although the new method needs more computation than conventional approaches, it can greatly reduce the model size and improve model generalization performance. The effectiveness of the proposed technique is confirmed by two numerical simulation examples.
Abstract:
It is convenient and effective to solve nonlinear problems with a model that has a linear-in-the-parameters (LITP) structure. However, the nonlinear parameters of each model term (e.g. the width of a Gaussian function) need to be pre-determined, either from expert experience or through exhaustive search. An alternative approach is to optimize them by a gradient-based technique (e.g. Newton's method). Unfortunately, all of these methods remain computationally expensive. Recently, the extreme learning machine (ELM) has shown its advantages in terms of fast learning from data, but the sparsity of the constructed model cannot be guaranteed. This paper proposes a novel algorithm for the automatic construction of a nonlinear system model based on the extreme learning machine. This is achieved by effectively integrating the ELM and leave-one-out (LOO) cross-validation with our two-stage stepwise construction procedure [1]. The main objective is to improve the compactness and generalization capability of the model constructed by the ELM method. Numerical analysis shows that the proposed algorithm involves only about half the computation of the orthogonal least squares (OLS) based method. Simulation examples are included to confirm the efficacy and superiority of the proposed technique.
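The basic ELM step this abstract builds on can be sketched as follows: the hidden-layer weights are random and never trained, so the model is linear in the output-layer parameters, which are solved by least squares. The paper's two-stage construction and LOO cross-validation are not reproduced here; all names and sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def elm_fit(X, y, n_hidden=50):
    """Extreme learning machine: random input weights, least-squares output layer."""
    W = rng.normal(size=(X.shape[1], n_hidden))   # random, never trained
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                        # hidden-layer outputs
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)  # linear-in-the-parameters solve
    return W, b, beta

def elm_predict(model, X):
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta

# Toy 1-D nonlinear regression
X = np.linspace(-3, 3, 200)[:, None]
y = np.sin(X).ravel() + 0.05 * rng.normal(size=200)
model = elm_fit(X, y)
resid = y - elm_predict(model, X)
print(f"training RMSE = {np.sqrt(np.mean(resid**2)):.3f}")
```

Because only the output layer is fitted, training reduces to one linear solve, which is why ELM learns quickly; the trade-off the paper addresses is that nothing in this step encourages a sparse hidden layer.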
Abstract:
The techniques of principal component analysis (PCA) and partial least squares (PLS) are introduced from the point of view of providing a multivariate statistical method for modelling process plants. The advantages and limitations of PCA and PLS are discussed from the perspective of the type of data and problems that might be encountered in this application area. These concepts are exemplified by two case studies dealing first with data from a continuous stirred tank reactor (CSTR) simulation and second a literature source describing a low-density polyethylene (LDPE) reactor simulation.
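The PCA side of this approach can be illustrated with a toy "plant" data set in which ten measured process variables are driven by two latent factors. Everything below is synthetic and only shows how PCA compresses correlated process measurements, not the CSTR or LDPE case studies themselves.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
# Synthetic plant data: 500 samples of 10 correlated variables
# driven by 2 underlying factors (e.g. feed rate, temperature).
scores = rng.normal(size=(500, 2))
loadings = rng.normal(size=(2, 10))
X = scores @ loadings + 0.1 * rng.normal(size=(500, 10))

pca = PCA().fit(X - X.mean(axis=0))
explained = np.cumsum(pca.explained_variance_ratio_)
# Two components should capture nearly all systematic variation
print(f"variance explained by 2 PCs: {explained[1]:.3f}")
```

In process monitoring, the residual variation outside the retained components (and the scores within them) is what is typically charted to flag abnormal plant behaviour.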
Abstract:
A geostatistical version of the classical Fisher rule (linear discriminant analysis) is presented. This method is applicable when a large dataset of multivariate observations is available within a domain split into several known subdomains, and it assumes that the variograms (or covariance functions) are comparable between subdomains, which differ only in the mean values of the available variables. The method consists in finding the eigen-decomposition of the matrix W^-1 B, where W is the matrix of sills of all direct- and cross-variograms, and B is the covariance matrix of the vectors of weighted means within each subdomain, obtained by generalized least squares. The method is used to map peat blanket occurrence in Northern Ireland, with data from the Tellus survey, which requires a minimal change to the general recipe: to use compositionally-compliant variogram tools and models, and to work with log-ratio transformed data.
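The core computation, the eigen-decomposition of W^-1 B, can be sketched in a few lines. The W and B below are random stand-ins (a symmetric positive-definite sill matrix and a covariance of three hypothetical subdomain means), not the Tellus data.

```python
import numpy as np

rng = np.random.default_rng(3)
p = 4  # number of variables

# W: matrix of sills of direct- and cross-variograms (stand-in: a random SPD matrix)
A = rng.normal(size=(p, p))
W = A @ A.T + p * np.eye(p)

# B: covariance of the subdomain mean vectors (here, 3 hypothetical subdomains)
means = rng.normal(size=(3, p))
B = np.cov(means.T)

# Discriminant directions: eigen-decomposition of W^-1 B
eigvals, eigvecs = np.linalg.eig(np.linalg.solve(W, B))
order = np.argsort(eigvals.real)[::-1]
directions = eigvecs[:, order].real  # columns = discriminant axes, best first
print(eigvals.real[order])
```

As in classical LDA, with k subdomains B has rank at most k-1, so only the leading eigenvectors carry discriminant information; projecting observations onto them gives the coordinates used for classification.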
Abstract:
The main objective of this thesis is to develop robust variogram estimators with good efficiency properties. The variogram is a fundamental tool in geostatistics, since it models the dependence structure of the process under study and decisively influences the prediction of new observations. Traditional variogram estimation methods are not robust, that is, they are sensitive to small deviations from the model assumptions. This issue matters because the properties that motivate the use of such methods may no longer hold in neighbourhoods of the assumed model. The thesis begins with a review of the main concepts of geostatistics and of traditional variogram estimation, followed by a summary of fundamental notions of statistical robustness. A new variogram estimation method, named the multiple variogram estimator, is then presented. The method consists of four stages, in which robustness and efficiency criteria alternately prevail: from the initial sample, a set of pointwise variogram estimates is computed robustly; based on these pointwise estimates, the model parameters are estimated by least squares; the two previous stages are repeated, creating a set of multiple estimates of the variogram function; finally, the variogram estimate is defined as the median of the estimates obtained previously. In this way it is possible to obtain an estimator with good robustness properties and good efficiency under Gaussian processes. The research revealed that, when discrete estimates are used in the first stage of variogram estimation, there are situations in which the identifiability of the parameters is not assured. For the most common variogram models, it was possible to establish mild conditions that guarantee uniqueness of the solution in variogram estimation.
Variogram estimation always assumes that the mean of the process is stationary. Since objective procedures to assess this condition are important, this work proposes a test to validate that hypothesis. The test statistic is an MM-estimator, whose distribution is unknown under the dependence conditions assumed; to approximate it, a version of the bootstrap method suited to dependent observations from spatial processes is presented. Finally, the multiple variogram estimator is assessed in terms of its practical application. The thesis includes a simulation study that confirms the established properties. In all cases analysed, the multiple variogram estimator produced better results than the usual alternatives, both under the assumed distribution and under contaminated distributions.
Abstract:
Simultaneous equations models (SEM) are statistical models with a long tradition in econometrics, since they can represent and study a wide range of economic processes. The estimators most commonly used in SEM result from applying the least squares method or the maximum likelihood method, neither of which is robust. In Maronna and Yohai (1997), the authors propose ways to "robustify" these estimators. Another estimation method of interest for these models is the generalized method of moments (GMM), which also leads to non-robust estimators. Estimators that lack robustness are very inconvenient, since they can produce misleading results when the assumptions underlying the assumed model are violated. Robust estimators are of great value, particularly when the models under study are complex, as is the case of SEM. The main objective of this research was to find such estimators; a robust estimator named GMMOGK, a robust version of the GMM estimator, was constructed. To evaluate the performance of the new estimator, a suitable simulation study was carried out, and the estimator was also applied to a real data set. The robust estimator performs well in the heteroscedastic models considered and, under those conditions, behaves better than the non-robust estimators used in the study. However, when the analysis is carried out equation by equation, the specificity of each individual equation and the dependence structure of the system are two aspects that influence the estimator's performance, as happens with the usual estimators. To frame the research, the text includes a review of essential aspects of SEM, their role in econometrics, and the main estimation methods, with particular emphasis on GMM, together with a short introduction to robust estimation.
Abstract:
The work reported in this thesis aimed at applying the methodology known as metabonomics to the detailed study of a particular type of beer and its quality control, based on the use of multivariate analysis (MVA) to extract meaningful information from given analytical data sets. In Chapter 1, a detailed description of beer is given, considering the brewing process, the main characteristics and typical composition of beer, beer stability, and the analytical techniques commonly used for beer analysis. The fundamentals of the analytical methods employed here, namely nuclear magnetic resonance (NMR) spectroscopy, gas chromatography-mass spectrometry (GC-MS) and mid-infrared (MIR) spectroscopy, together with the metabonomics methodology, are briefly described in Chapter 2. In Chapter 3, the application of high resolution NMR to characterize the chemical composition of a lager beer is described. The 1H NMR spectrum obtained by direct analysis of beer shows a high degree of complexity, confirming the great potential of NMR spectroscopy for detecting a wide variety of families of compounds in a single run. Spectral assignment was carried out by 2D NMR, resulting in the identification of about 40 compounds, including alcohols, amino acids, organic acids, nucleosides and sugars. In the second part of Chapter 3, the compositional variability of beer was assessed. For that purpose, metabonomics was applied to 1H NMR data (NMR/MVA) to evaluate variability between beers of the same brand (lager), produced nationally but differing in brewing site and date of production. Differences between brewing sites and/or dates were observed, reflecting compositional differences related to particular processing steps, including mashing, fermentation and maturation. Chapter 4 describes the quantification of organic acids in beer by NMR, for which different quantitative methods were developed and compared: direct integration of NMR signals (against an internal reference or an external electronic reference, the ERETIC method) and quantitative statistical methods based on partial least squares (PLS) regression. PLS1 regression models were built using different quantitative methods as reference: capillary electrophoresis with direct and indirect detection, and enzymatic assays. It was found that NMR integration results generally agree with those obtained by the best-performing PLS models, although some overestimation for malic and pyruvic acids and an apparent underestimation for citric acid were observed. Finally, Chapter 5 describes metabonomic studies performed to better understand the forced aging (18 days at 45 ºC) of beer. The aging process of lager beer was followed by i) NMR, ii) GC-MS, and iii) MIR spectroscopy. MVA of each analytical data set revealed clear separation between different aging days for both NMR and GC-MS data, enabling the identification of compounds closely related to the aging process: 5-hydroxymethylfurfural (5-HMF), organic acids, γ-aminobutyric acid (GABA), proline and the linear/branched dextrin ratio (NMR domain), and 5-HMF, furfural, diethyl succinate and phenylacetaldehyde (known aging markers) and, for the first time, 2,3-dihydro-3,5-dihydroxy-6-methyl-4(H)-pyran-4-one (DDMP) and maltoxazine (GC-MS domain). For MIR/MVA, no aging trend could be measured, the results reflecting the need for further experimental optimization. Correlation between the NMR and GC-MS data was performed by outer product analysis (OPA) and statistical heterospectroscopy (SHY) methodologies, enabling the identification of further compounds (11 compounds, 5 of which remain unassigned) highly related to the aging process. Correlation between sensory characteristics and the NMR and GC-MS data was also assessed through PLS1 regression models using the sensory response as reference.
The results obtained showed good relationships between the analytical data and the sensory response, particularly for the aromatic region of the NMR spectra and for the GC-MS data (r > 0.89). However, the prediction power of all the PLS1 regression models built was relatively low, possibly reflecting the low number of samples/tasters employed, an aspect to improve in future studies.
Abstract:
The central objective of this research was to investigate whether the spatiotemporal behaviour of the urban tourist influences their satisfaction with the multi-attraction visit experience. Although mobility is a sine qua non condition of tourism, and visiting multiple attractions is the usual context in which the urban tourist experience unfolds, research in this field tends to ignore the spatiotemporal and multi-attraction dimensions of that experience. The proposed conceptual model systematizes the analysis of the tourist's spatiotemporal behaviour and the study of its relationship with satisfaction, both overall satisfaction and satisfaction with dimensions of the experience. From it, the research model was defined which, modelling the central question under study, was based on two main instruments: a GPS tracking study and a questionnaire survey, carried out with guests of ten Lisbon hotels (n = 413). The data analysis is of a dual nature: spatial and statistical. For the spatial analysis, the GIS methodology underlying the maps was implemented with ArcGIS for Desktop 10.1, generating visualizations useful for the question under study. The statistical analysis comprised descriptive, exploratory and inferential methods, with PLS-PM modelling, complemented by PLS-MGA analysis using SmartPLS 2.0, as the main instrument for testing the hypotheses formulated. Among the several significant relationships found, the most important conclusion that can be drawn from the empirical research is that the spatiotemporal behaviour of the urban tourist does indeed influence their satisfaction with the multi-attraction visit experience; investigating the heterogeneity underlying the population under study appears particularly important in this context, both scientifically and empirically.
Abstract:
Master's dissertation, Qualidade em Análises, Faculdade de Ciências e Tecnologia, Univ. do Algarve, 2013
Abstract:
In this work, kriging with covariates is used to model and map the spatial distribution of salinity measurements gathered by an autonomous underwater vehicle in a sea outfall monitoring campaign, with the aim of distinguishing the effluent plume from the receiving waters and characterizing its spatial variability in the vicinity of the discharge. Four different geostatistical linear models for salinity were assumed, with the distance to the diffuser, the west-east positioning, and the south-north positioning used as covariates. Sample variograms were fitted with Matérn models using both weighted least squares and maximum likelihood estimation, in order to detect possible discrepancies. Typically, the maximum likelihood method estimated very low ranges, which limited the kriging process; so, at least for these data sets, weighted least squares proved to be the more appropriate estimation method for variogram fitting. The kriged maps clearly show the spatial variation of salinity, and it is possible to identify the effluent plume in the area studied. The results provide some guidelines for sewage monitoring when a geostatistical analysis of the data is intended: it is important to treat the existence of anomalous values properly and to adopt a sampling strategy that includes transects parallel and perpendicular to the effluent dispersion.
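Variogram fitting by weighted least squares, as favoured in the study above, can be sketched as follows. This is a toy example: made-up empirical variogram values and a simpler exponential model stand in for the study's Matérn fits to the salinity data, and the Cressie-style weighting by pair counts is one common choice.

```python
import numpy as np
from scipy.optimize import curve_fit

# Empirical variogram: lag distances h, semivariances gamma_hat, pair counts N(h)
h = np.array([50., 100., 150., 200., 250., 300.])
gamma_hat = np.array([0.30, 0.45, 0.54, 0.59, 0.61, 0.62])
n_pairs = np.array([400, 380, 350, 300, 260, 200])

def exp_variogram(h, nugget, sill, r):
    """Exponential variogram model: nugget + partial sill * (1 - exp(-h/r))."""
    return nugget + sill * (1.0 - np.exp(-h / r))

# Weighted least squares: weight each lag by its number of pairs,
# i.e. sigma ~ 1/sqrt(N(h)), so well-populated lags count more.
popt, _ = curve_fit(exp_variogram, h, gamma_hat,
                    p0=[0.1, 0.5, 100.0],
                    sigma=1.0 / np.sqrt(n_pairs),
                    bounds=(0, np.inf))
nugget, sill, r = popt
print(f"nugget={nugget:.3f}, partial sill={sill:.3f}, "
      f"practical range={3 * r:.0f} m")
```

The fitted range parameter is what the abstract's WLS-versus-ML comparison hinges on: an implausibly short range makes kriging weights collapse onto the nearest few observations, which is the behaviour the authors observed with maximum likelihood.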
Abstract:
Different oil-containing substrates, namely used cooking oil (UCO), a fatty acid by-product of biodiesel production (FAB) and olive oil deodorizer distillate (OODD), were tested as inexpensive carbon sources for the production of polyhydroxyalkanoates (PHA) using twelve bacterial strains in batch experiments. OODD and FAB were exploited for the first time as alternative substrates for PHA production. Among the tested bacterial strains, Cupriavidus necator and Pseudomonas resinovorans exhibited the most promising results, producing poly-3-hydroxybutyrate, P(3HB), from UCO and OODD, and mcl-PHA mainly composed of 3-hydroxyoctanoate (3HO) and 3-hydroxydecanoate (3HD) monomers from OODD, respectively. These bacterial strains were then cultivated in bioreactors. C. necator was cultivated using UCO as carbon source, and different feeding strategies were tested, namely batch, exponential feeding and DO-stat mode. The highest overall PHA productivity (12.6±0.78 g L-1 day-1) was obtained using DO-stat mode. Apparently, the different feeding regimes had no impact on polymer thermal properties; however, differences in the polymer's molecular mass distribution were observed. C. necator was also tested in batch and fed-batch modes using a different type of oil-containing substrate, extracted from spent coffee grounds (SCG) by supercritical carbon dioxide (sc-CO2). Under fed-batch mode (DO-stat), the overall PHA productivity was 4.7 g L-1 day-1 with a storage yield of 0.77 g g-1. The results showed that SCG can be a bioresource for the production of PHA with interesting properties. Furthermore, P. resinovorans was cultivated using OODD as substrate in a bioreactor under fed-batch mode (pulse feeding regime). The polymer was highly amorphous, as shown by its low crystallinity of 6±0.2%, with low melting and glass transition temperatures of 36±1.2 and -16±0.8 ºC, respectively.
Due to its sticky behavior at room temperature, its adhesiveness and mechanical properties were also studied. Its shear bond strength on wood (67±9.4 kPa) and glass (65±7.3 kPa) suggests it may be used for the development of biobased glues. Bioreactor operation and monitoring with oil-containing substrates is very challenging, since these substrates are water-immiscible. Thus, near-infrared (NIR) spectroscopy was implemented for online monitoring of the C. necator cultivation on UCO, using a transflectance probe. Partial least squares (PLS) regression was applied to relate the NIR spectra to the biomass, UCO and PHA concentrations in the broth. The NIR predictions were compared with values obtained by offline reference methods. The prediction errors were 1.18 g L-1, 2.37 g L-1 and 1.58 g L-1 for biomass, UCO and PHA, respectively, which indicates the suitability of NIR spectroscopy for online monitoring and as a method to assist bioreactor control. UCO and OODD are low-cost substrates with potential for use in batch and fed-batch PHA production. The use of NIR in this bioprocess also opens an opportunity for optimization and control of the PHA production process.