955 results for Variable-selection Problems
Abstract:
Negative-ion mode electrospray ionization, ESI(-), with Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS) was coupled to Partial Least Squares (PLS) regression and variable selection methods to estimate the total acid number (TAN) of Brazilian crude oil samples. Generally, ESI(-)-FT-ICR mass spectra have a resolving power of ca. 500,000 and a mass error below 1 ppm, producing a data matrix containing over 5700 variables per sample. These variables correspond to heteroatom-containing species detected as deprotonated molecules, [M - H]⁻ ions, which are identified primarily as naphthenic acids, phenols and carbazole analog species. The TAN values for all samples ranged from 0.06 to 3.61 mg KOH g⁻¹. To facilitate the spectral interpretation, three methods of variable selection were studied: variable importance in the projection (VIP), interval partial least squares (iPLS) and uninformative variable elimination (UVE). The UVE method appears to be the most appropriate for selecting important variables, reducing the number of variables to 183 and producing a root mean square error of prediction of 0.32 mg KOH g⁻¹. By reducing the size of the data, it was possible to relate the selected variables to their corresponding molecular formulas, thus identifying the main chemical species responsible for the TAN values.
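As an illustration of one of the three selection methods named above, the following is a minimal sketch of VIP scores computed from a single-response PLS (PLS1) model fitted with a NIPALS-style loop. It is not the authors' implementation: the function name `pls1_vip`, the assumption that the data are already mean-centered, and the toy numbers in the usage note are all hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def pls1_vip(X, y, n_components):
    """Return one VIP score per column of X.

    Assumptions (not from the abstract): X is a list of rows that has
    already been column-centered, y is a centered response, and y is
    not orthogonal to every column of X.
    """
    X = [list(row) for row in X]  # work on copies; inputs are left untouched
    y = list(y)
    n_vars = len(X[0])
    weights, explained = [], []
    for _ in range(n_components):
        # Weight vector: direction of maximal covariance between X and y.
        w = [dot([row[j] for row in X], y) for j in range(n_vars)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:  # nothing left in y that X can explain
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in X]  # scores
        tt = dot(t, t)
        p = [dot([row[j] for row in X], t) / tt for j in range(n_vars)]  # loadings
        q = dot(y, t) / tt  # regression of y on the scores
        # Deflate X and y before extracting the next component.
        X = [[row[j] - ti * p[j] for j in range(n_vars)] for row, ti in zip(X, t)]
        y = [yi - q * ti for yi, ti in zip(y, t)]
        weights.append(w)
        explained.append(q * q * tt)  # y-variance explained by this component
    total = sum(explained)
    return [
        math.sqrt(n_vars * sum(ss * w[j] ** 2 for ss, w in zip(explained, weights)) / total)
        for j in range(n_vars)
    ]
```

With the hypothetical centered data X = [[-2, 1, 1], [-1, -1, -2], [0, 0, 2], [1, -1, -2], [2, 1, 1]] and y = [-4.1, -1.8, 0.2, 2.1, 3.6], the first column dominates the scores. Because each weight vector is normalized, the squared VIP scores sum to the number of variables, which is why a cutoff of 1 is a common rule of thumb for keeping a variable.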
Abstract:
Causal inference methods - mainly path analysis and structural equation modeling - offer plant physiologists information about cause-and-effect relationships among plant traits. Recently, an unusual approach to causal inference through stepwise variable selection has been proposed and used in various works on plant physiology. The approach should not be considered correct from a biological point of view. Here, we explain why stepwise variable selection should not be used for causal inference, and show what strange conclusions can be drawn when such an analysis is used to interpret cause-and-effect relationships among plant traits.
Abstract:
In this paper we present two continuous selection theorems in hyperconvex metric spaces and apply them to study fixed point and coincidence point problems as well as variational inequality problems in hyperconvex metric spaces.
Abstract:
Copyright © 2013 Springer Netherlands.
Abstract:
The aim of this paper is to predict time series of SO2 concentrations emitted by coal-fired power stations in order to anticipate emission episodes and to analyze the influence of some meteorological variables on the prediction. An emission episode is said to occur when the series of bi-hourly means of SO2 exceeds a specified level. For coal-fired power stations it is essential to predict emission episodes sufficiently in advance so that appropriate preventive measures can be taken. We propose a methodology to predict SO2 emission episodes based on an additive model and an algorithm for variable selection. The methodology was applied to the estimation of SO2 emissions registered at sampling locations near a coal-fired power station located in Northern Spain. The results indicate that the model performs well with only two terms of the time series and that the inclusion of the meteorological variables in the model is not significant.
Abstract:
This paper develops methods for Stochastic Search Variable Selection (currently popular with regression and Vector Autoregressive models) for Vector Error Correction models where there are many possible restrictions on the cointegration space. We show how this allows the researcher to begin with a single unrestricted model and either do model selection or model averaging in an automatic and computationally efficient manner. We apply our methods to a large UK macroeconomic model.
Abstract:
This paper develops stochastic search variable selection (SSVS) for zero-inflated count models which are commonly used in health economics. This allows for either model averaging or model selection in situations with many potential regressors. The proposed techniques are applied to a data set from Germany considering the demand for health care. A package for the free statistical software environment R is provided.
Abstract:
An input variable selection procedure is introduced for the identification and construction of multi-input multi-output (MIMO) neurofuzzy operating point dependent models. The algorithm extends a forward modified Gram-Schmidt orthogonal least squares procedure for a linear model structure, modifying it to accommodate nonlinear system modeling by incorporating piecewise locally linear model fitting. The proposed input node selection procedure effectively tackles the curse of dimensionality associated with lattice-based modeling algorithms such as radial basis function neurofuzzy networks, enabling the resulting neurofuzzy operating point dependent model to be widely applied in control and estimation. Some numerical examples are given to demonstrate the effectiveness of the proposed construction algorithm.
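The linear building block mentioned above, forward selection by orthogonal least squares, can be sketched as follows. This covers only the plain linear case with an error reduction ratio criterion; the piecewise locally linear extension described in the abstract is not reproduced. The function name `forward_ols` and the toy data in the usage note are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def forward_ols(candidates, y, n_terms):
    """Greedy forward selection of regressors by error reduction ratio.

    candidates: list of candidate regressor columns (lists of floats).
    y: target vector, assumed to be nonzero.
    Returns (selected column indices, error reduction ratio of each pick).
    """
    work = [list(c) for c in candidates]  # orthogonalized copies
    yy = dot(y, y)
    selected, ratios = [], []
    for _ in range(n_terms):
        best, best_ratio = None, -1.0
        for j, w in enumerate(work):
            if j in selected:
                continue
            ww = dot(w, w)
            if ww < 1e-12:  # column already spanned by earlier picks
                continue
            # Fraction of the energy of y explained by this orthogonal direction.
            ratio = dot(w, y) ** 2 / (ww * yy)
            if ratio > best_ratio:
                best, best_ratio = j, ratio
        if best is None:
            break
        selected.append(best)
        ratios.append(best_ratio)
        ref = work[best]
        ref_sq = dot(ref, ref)
        # Modified Gram-Schmidt step: remove the chosen direction from
        # every remaining candidate before the next round.
        for j in range(len(work)):
            if j not in selected:
                coef = dot(work[j], ref) / ref_sq
                work[j] = [a - coef * b for a, b in zip(work[j], ref)]
    return selected, ratios
```

For the hypothetical candidates [1, 0, 0, 0], [0, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 0] and target y = [2, 2, 0.5, 0], the procedure picks the third column first and the fourth column second. Since y is an exact combination of those two columns, their error reduction ratios sum to 1.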
Abstract:
This paper is concerned with the use of a genetic algorithm to select financial ratios for corporate distress classification models. For this purpose, the fitness value associated with a set of ratios is made to reflect the requirements of maximizing the amount of information available to the model and minimizing the collinearity between the model inputs. A case study involving 60 failed and continuing British firms in the period 1997-2000 is used for illustration. The classification model based on ratios selected by the genetic algorithm compares favorably with a model employing ratios usually found in the financial distress literature.
Abstract:
This work analyses the optimal menu of contracts offered by a risk-neutral principal to a risk-averse agent under moral hazard, adverse selection and limited liability. There are two output levels, whose probabilities of occurrence are determined by the agent's choice of effort, which is private information. The agent's cost of effort is also private information. First, we show that without assumptions on the cost function, it is not possible to guarantee that the optimal contract menu is simple when the agent is strictly risk averse. Then, we provide sufficient conditions on the cost function under which it is optimal to offer a single contract, regardless of the agent's risk aversion. Our full-pooling cases are caused by non-responsiveness, which is induced by the high cost of enforcing higher effort levels. We also show that limited liability generates non-responsiveness.
Abstract:
This article presents an application of a method for the simultaneous spectrophotometric determination of divalent copper, manganese and zinc ions to the analysis of a multivitamin/multimineral pharmaceutical. The method uses 4-(2-pyridylazo)resorcinol (PAR), multivariate calibration and variable selection techniques, and was optimized using the successive projections algorithm (SPA) and the genetic algorithm (GA) to choose the most informative wavelengths for the analysis. With these techniques, it was possible to build calibration models by multiple linear regression (MLR-SPA and MLR-GA). The results were compared with principal component regression (PCR) and partial least squares (PLS) models. The root mean square error of prediction (RMSEP) shows that the models perform similarly in predicting the concentrations of the three analytes in the pharmaceutical. However, the MLR models are simpler, since they require far fewer wavelengths, and are easier to interpret than those based on latent variables.
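The wavelength selection step by successive projections can be sketched as below. This is a minimal illustration of the projection loop only, assuming a given starting column and a given number of wavelengths to keep; the function name `successive_projections` and the toy columns in the usage note are hypothetical, and the way the article chooses these two settings is not reproduced here.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def successive_projections(columns, start, n_select):
    """Pick n_select minimally collinear columns, beginning with `start`.

    columns: list of column vectors, one per wavelength.
    Assumes n_select does not exceed the rank of the column set, so the
    most recently selected residual column is never the zero vector.
    """
    work = [list(c) for c in columns]  # copies; inputs are left untouched
    selected = [start]
    while len(selected) < n_select:
        ref = work[selected[-1]]
        ref_sq = dot(ref, ref)
        best, best_sq = None, -1.0
        for j in range(len(work)):
            if j in selected:
                continue
            # Project column j onto the subspace orthogonal to the
            # most recently selected column.
            coef = dot(work[j], ref) / ref_sq
            work[j] = [a - coef * b for a, b in zip(work[j], ref)]
            norm_sq = dot(work[j], work[j])
            # Keep the column with the largest remaining norm, i.e. the
            # one carrying the most information not yet selected.
            if norm_sq > best_sq:
                best, best_sq = j, norm_sq
        selected.append(best)
    return selected
```

For the hypothetical columns [1, 0, 0], [0.9, 0.1, 0], [0, 1, 0], [0, 0, 0.5], starting from the first column and keeping three, the second column is skipped because it is nearly collinear with the first. The selected columns would then feed an ordinary multiple linear regression, which is what keeps the resulting model small and easy to interpret.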