867 results for LINEAR-REGRESSION MODELS
Abstract:
In this paper, we tackle the problem of learning a linear regression model whose parameter is a fixed-rank matrix. We study the Riemannian manifold geometry of the set of fixed-rank matrices and develop efficient line-search algorithms. The proposed algorithms have many applications, scale to high-dimensional problems, enjoy local convergence properties and confer a geometric basis to recent contributions on learning fixed-rank matrices. Numerical experiments on benchmarks suggest that the proposed algorithms compete with the state-of-the-art, and that manifold optimization offers a versatile framework for the design of rank-constrained machine learning algorithms. Copyright 2011 by the author(s)/owner(s).
Abstract:
Copyright © 2014, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. This paper presents the beginnings of an automatic statistician, focusing on regression problems. Our system explores an open-ended space of statistical models to discover a good explanation of a data set, and then produces a detailed report with figures and natural-language text. Our approach treats unknown regression functions nonparametrically using Gaussian processes, which has two important consequences. First, Gaussian processes can model functions in terms of high-level properties (e.g. smoothness, trends, periodicity, changepoints). Taken together with the compositional structure of our language of models, this allows us to automatically describe functions in simple terms. Second, the use of flexible nonparametric models and a rich language for composing them in an open-ended manner also results in state-of-the-art extrapolation performance evaluated over 13 real time series data sets from various domains.
Abstract:
We use a computational homogenisation approach to derive a nonlinear constitutive model for lattice materials. A representative volume element (RVE) of the lattice is modelled by means of discrete structural elements, and macroscopic stress-strain relationships are numerically evaluated after applying appropriate periodic boundary conditions to the RVE. The influence of the choice of the RVE on the predictions of the model is discussed. The model has been used for the analysis of the hexagonal and the triangulated lattices subjected to large strains. The fidelity of the model has been demonstrated by analysing a plate with a central hole under prescribed in-plane compressive and tensile loads, and then comparing the results from the discrete and the homogenised models. © 2013 Elsevier Ltd.
Abstract:
IEECAS SKLLQG
Abstract:
This paper provides a root-n consistent, asymptotically normal weighted least squares estimator of the coefficients in a truncated regression model. The distribution of the errors is unknown and permits general forms of unknown heteroskedasticity. Also provided is an instrumental variables based two-stage least squares estimator for this model, which can be used when some regressors are endogenous, mismeasured, or otherwise correlated with the errors. A simulation study indicates that the new estimators perform well in finite samples. Our limiting distribution theory includes a new asymptotic trimming result addressing the boundary bias in first-stage density estimation without knowledge of the support boundary. © 2007 Cambridge University Press.
Abstract:
1. We collated information from the literature on life history traits of the roach (a generalist freshwater fish), and analysed variation in absolute fecundity, von Bertalanffy parameters, and reproductive lifespan in relation to latitude, using both linear and non-linear regression models. We hypothesized that, because most life history traits depend on growth rate and growth rate is non-linearly related to temperature, variation in key life history traits would likely show non-linear patterns with latitude when analysed over the whole distribution range of roach.
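For readers unfamiliar with the von Bertalanffy parameters mentioned above, the standard growth function is L(t) = L_inf * (1 - exp(-K * (t - t0))). The short sketch below only evaluates that curve. The parameter values are illustrative assumptions, not estimates from the roach study.

```python
import math


def von_bertalanffy(age, l_inf, k, t0=0.0):
    """Expected length at a given age under the von Bertalanffy growth function."""
    return l_inf * (1.0 - math.exp(-k * (age - t0)))


# Hypothetical parameters: asymptotic length (cm), growth coefficient (1/year),
# and the theoretical age at zero length (years).
L_INF, K, T0 = 35.0, 0.2, -0.5

for age in range(0, 11, 2):
    print(age, round(von_bertalanffy(age, L_INF, K, T0), 1))
```

Length rises quickly at young ages and flattens towards L_inf, which is why growth-dependent traits need not vary linearly with a driver such as temperature or latitude.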
Abstract:
The paper describes the development and application of a multiple linear regression model to identify how the key elements of waste and recycling infrastructure, namely container capacity and frequency of collection, affect the yield from municipal kerbside recycling programmes. The overall aim of the research was to gain an understanding of the factors affecting the yield from municipal kerbside recycling programmes in Scotland. The study isolates the principal kerbside collection service offered by 32 councils across Scotland, eliminating those recycling programmes associated with flatted properties or multi-occupancies. The regression analysis identified three principal factors that explain 80% of the variability in the average yield of the principal dry recyclate services: weekly residual waste capacity, number of materials collected and weekly recycling capacity. The use of the model has been evaluated, and recommendations are made on ongoing methodological development and on the use of the results in informing the design of kerbside recycling programmes. The authors hope that the research can provide insights for the ongoing development of methods to optimise the design and operation of kerbside recycling programmes.
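As a rough illustration of what "three factors explain 80% of the variability" means, the sketch below fits a multiple linear regression by ordinary least squares and reports the coefficient of determination R². All numbers are made up for the example; the variable names merely mirror the three factors named in the abstract, and nothing here reproduces the authors' data or fitted model.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return [intercept, b1, b2, ...] from the normal equations X'X b = X'y."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


def r_squared(rows, y, beta):
    """Share of the variance in y explained by the fitted linear model."""
    preds = [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, preds))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical councils: (weekly residual capacity in litres,
# number of materials collected, weekly recycling capacity in litres).
rows = [(240, 3, 120), (120, 5, 240), (180, 4, 140), (240, 2, 55),
        (140, 6, 240), (120, 4, 180), (200, 3, 100), (160, 5, 200)]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.25, -0.15]
# Synthetic yields generated from an assumed linear relation plus fixed noise.
y = [5 - 0.02 * a + 0.8 * b + 0.03 * c + e for (a, b, c), e in zip(rows, noise)]

beta = fit_ols(rows, y)
r2 = r_squared(rows, y, beta)
print("coefficients:", [round(v, 3) for v in beta])
print("R^2:", round(r2, 3))
```

An R² of 0.8 would mean that the residual sum of squares is one fifth of the total sum of squares around the mean yield.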
Abstract:
Virtual metrology (VM) aims to predict metrology values using sensor data from production equipment and physical metrology values of preceding samples. VM is a promising technology for the semiconductor manufacturing industry as it can reduce the frequency of in-line metrology operations and provide supportive information for other operations such as fault detection, predictive maintenance and run-to-run control. The prediction models for VM can be drawn from a large variety of linear and nonlinear regression methods, and the selection of a proper regression method for a specific VM problem is not straightforward, especially when the candidate predictor set is of high dimension, correlated and noisy. Using process data from a benchmark semiconductor manufacturing process, this paper evaluates the performance of four typical regression methods for VM: multiple linear regression (MLR), least absolute shrinkage and selection operator (LASSO), neural networks (NN) and Gaussian process regression (GPR). It is observed that GPR performs the best among the four methods and that, remarkably, the performance of linear regression approaches that of GPR as the subset of selected input variables is increased. The observed competitiveness of high-dimensional linear regression models, which does not hold true in general, is explained in the context of extreme learning machines and functional link neural networks.
Abstract:
Over 1 million km² of seafloor experience permanent low-oxygen conditions within oxygen minimum zones (OMZs). OMZs are predicted to grow as a consequence of climate change, potentially affecting oceanic biogeochemical cycles. The Arabian Sea OMZ impinges upon the western Indian continental margin at bathyal depths (150 - 1500 m), producing a strong depth-dependent oxygen gradient at the sea floor. The influence of the OMZ upon the short-term processing of organic matter by sediment ecosystems was investigated using in situ stable isotope pulse-chase experiments. These deployed doses of ¹³C:¹⁵N labeled organic matter onto the sediment surface at four stations from across the OMZ (water depth 540 - 1100 m; [O₂] = 0.35 - 15 μM). To prevent experimentally induced anoxia, the mesocosms were not sealed. ¹³C and ¹⁵N labels were traced into sediment, bacteria and fauna, and ¹³C into sediment porewater DIC and DOC. However, the DIC and DOC flux to the water column could not be measured, limiting our capacity to obtain a mass balance for C in each experimental mesocosm. Linear Inverse Modeling (LIM) provides a method to obtain a mass-balanced model of carbon flow that integrates stable-isotope tracer data with community biomass and biogeochemical flux data from a range of sources. Here we present an adaptation of the LIM methodology used to investigate how ecosystem structure influenced carbon flow across the Indian margin OMZ. We demonstrate how oxygen conditions affect food-web complexity and the linkages between the bacteria, foraminifera and metazoan fauna, as well as their contributions to benthic respiration. The food-web models demonstrate how changes in ecosystem complexity are associated with oxygen availability across the OMZ and allow us to obtain a complete carbon budget for the stations where stable-isotope labelling experiments were conducted.
Abstract:
Statistical techniques are fundamental in science, and linear regression analysis is perhaps one of the most widely used methodologies. It is well known from the literature that, under certain conditions, linear regression is an extremely powerful statistical tool. Unfortunately, in practice, some of these conditions are rarely satisfied and regression models become ill-posed, which precludes the application of traditional estimation methods. This work presents some contributions to maximum entropy theory in the estimation of ill-posed models, in particular the estimation of linear regression models with small samples affected by collinearity and outliers. The research is developed along three lines, namely the estimation of technical efficiency with state-contingent production frontiers, the estimation of the ridge parameter in ridge regression and, finally, new developments in maximum entropy estimation. In the estimation of technical efficiency with state-contingent production frontiers, the work shows that maximum entropy estimators perform better than the maximum likelihood estimator. This good performance is evident in models with few observations per state and in models with a large number of states, which are commonly affected by collinearity. It is hoped that the use of maximum entropy estimators will contribute to the much-desired increase in empirical work with these production frontiers. In ridge regression, the greatest challenge is the estimation of the ridge parameter. Although numerous procedures are available in the literature, none of them outperforms all the others. This work proposes a new estimator of the ridge parameter that combines ridge trace analysis with maximum entropy estimation.
The results obtained in the simulation studies suggest that this new estimator is one of the best procedures available in the literature for estimating the ridge parameter. The Leuven maximum entropy estimator is based on the least squares method, Shannon entropy and concepts from quantum electrodynamics. This estimator overcomes the main criticism levelled at the generalized maximum entropy estimator, since it dispenses with the supports for the parameters and errors of the regression model. This work presents new contributions to maximum entropy theory in the estimation of ill-posed models, based on the Leuven maximum entropy estimator, information theory and robust regression. The estimators developed perform well in linear regression models with small samples affected by collinearity and outliers. Finally, some computer code for maximum entropy estimation is presented, thereby adding to the scarce computational resources currently available.
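To make the ridge-parameter discussion concrete, here is a minimal sketch of ordinary ridge regression with two nearly collinear predictors, printing the coefficients over a small grid of ridge parameters k (a crude ridge trace). It uses the textbook estimator (X'X + kI)^-1 X'y on centered data; it is not the maximum-entropy-based estimator proposed in the thesis, and the data and the function name `ridge_2d` are assumptions made for illustration.

```python
def ridge_2d(x1, x2, y, k):
    """Ridge coefficients for two centered predictors: (X'X + kI)^-1 X'y."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    a = [v - m1 for v in x1]
    b = [v - m2 for v in x2]
    c = [v - my for v in y]
    # Entries of X'X + kI and X'y for the centered data.
    s11 = sum(u * u for u in a) + k
    s22 = sum(v * v for v in b) + k
    s12 = sum(u * v for u, v in zip(a, b))
    t1 = sum(u * w for u, w in zip(a, c))
    t2 = sum(v * w for v, w in zip(b, c))
    det = s11 * s22 - s12 * s12
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)


# Made-up sample in which x2 is almost a copy of x1 (strong collinearity).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

for k in (0.0, 0.1, 1.0, 10.0):
    b1, b2 = ridge_2d(x1, x2, y, k)
    print(f"k={k:<5} b1={b1:.3f} b2={b2:.3f}")
```

With k = 0 this reduces to least squares, which is unstable when the predictors are nearly collinear. Increasing k shrinks the coefficient vector, and choosing k is exactly the estimation problem the abstract refers to.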
Abstract:
The problem of Small Area Estimation is about how to produce reliable estimates of domain characteristics when the sample sizes within the domain are very small or even zero.
Abstract:
Long-term contractual decisions are the basis of efficient risk management. However, such decisions must be supported by a robust price forecasting methodology. This paper reports a different approach to long-term price forecasting that tries to answer that need. Making use of regression models, the main objective of the proposed methodology is to find the maximum and minimum Market Clearing Price (MCP) for a specific programming period, with a desired confidence level α. Due to the complexity of the problem, the meta-heuristic Particle Swarm Optimization (PSO) was used to find the best regression parameters, and the results were compared with those obtained using a Genetic Algorithm (GA). To validate these models, results from realistic data are presented and discussed in detail.
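The idea of letting PSO search for regression parameters can be sketched as follows: each particle is a candidate parameter vector, and the fitness is the sum of squared errors of the resulting regression line. This is a generic, minimal PSO under assumed settings (inertia 0.7, acceleration constants 1.5, 30 particles, a straight-line model, synthetic data); the paper's actual model, its confidence-level bounds on the MCP, and its PSO configuration are not given in the abstract and are not reproduced here.

```python
import random


def sse(params, xs, ys):
    """Sum of squared errors of the line y = a + b * x."""
    a, b = params
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def pso_fit(xs, ys, n_particles=30, n_iter=300, seed=0):
    """Minimise the SSE over (a, b) with a basic global-best particle swarm."""
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5  # assumed inertia and acceleration constants
    pos = [[rng.uniform(-10, 10) for _ in range(2)] for _ in range(n_particles)]
    vel = [[0.0, 0.0] for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [sse(p, xs, ys) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(2):
                r1, r2 = rng.random(), rng.random()
                # Pull towards the particle's own best and the swarm's best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = sse(pos[i], xs, ys)
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


# Synthetic series generated from y = 1 + 2x, used only to check the search.
xs = list(range(10))
ys = [1.0 + 2.0 * x for x in xs]
(a_hat, b_hat), err = pso_fit(xs, ys)
print("a =", round(a_hat, 3), "b =", round(b_hat, 3), "SSE =", err)
```

For a straight line, least squares has a closed-form solution, so PSO is unnecessary here; a meta-heuristic becomes attractive when the objective is non-smooth or non-convex, as interval-type forecasting objectives can be.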