881 results for Generalized Least Squares Estimation


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Quantile regression (QR) was first introduced by Roger Koenker and Gilbert Bassett in 1978. It is robust to outliers, which can strongly affect the least squares estimator in linear regression. Instead of modeling the mean of the response, QR models the relationship between quantiles of the response and the covariates. QR is therefore widely applicable to problems in econometrics, environmental sciences and health sciences. Sample size is an important factor in the planning stage of experimental designs and observational studies. In ordinary linear regression, sample size may be determined from either precision analysis or power analysis using closed-form formulas. For QR, there are methods that calculate sample size based on precision analysis, such as that of C. Jennen-Steinmetz and S. Wellek (2005), and a method based on power analysis was proposed by Shao and Wang (2009). In this paper, a new method is proposed to calculate sample size based on power analysis for hypothesis tests of covariate effects. Even though no error distribution assumption is necessary for QR analysis itself, researchers have to make assumptions about the error distribution and covariate structure in the planning stage of a study to obtain a reasonable estimate of sample size. In this project, both parametric and nonparametric methods are provided to estimate the error distribution. Since the proposed method can be implemented in R, the user can choose either a parametric distribution or nonparametric kernel density estimation for the error distribution. The user also needs to specify the covariate structure and effect size to carry out the sample size and power calculation. The performance of the proposed method is further evaluated using numerical simulation. The results suggest that the sample sizes obtained from our method provide empirical powers that are close to the nominal power level, for example, 80%.
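The robustness claim in this abstract can be illustrated with the check (pinball) loss that quantile regression minimises. The sketch below is not taken from the paper: the function names and the toy data are assumptions, and it fits only a constant (no covariates), in which case the minimiser of the check loss is a sample quantile.

```python
def check_loss(residual, tau):
    """Check (pinball) loss for a quantile level tau in (0, 1)."""
    # Positive residuals are weighted by tau, negative ones by (1 - tau).
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def best_constant_quantile(values, tau):
    """Return the observed value that minimises the total check loss.

    Hypothetical helper: a constant-only quantile 'regression', searched
    over the observed values because a minimiser always lies among them.
    """
    return min(
        sorted(values),
        key=lambda c: sum(check_loss(y - c, tau) for y in values),
    )


# Toy data with one gross outlier (illustrative values only).
data = [1.0, 2.0, 3.0, 4.0, 100.0]
median_fit = best_constant_quantile(data, 0.5)  # check-loss fit at tau = 0.5
mean_fit = sum(data) / len(data)                # least squares fit of a constant
```

With this toy data the least squares fit is pulled towards the outlier, while the tau = 0.5 fit stays at the middle observation.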

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Motivated by environmental protection concerns, monitoring the flue gas of thermal power plants is now often mandatory to ensure that emission levels stay within safe limits. Optical-based gas sensing systems are increasingly employed for this purpose, with regression techniques used to relate gas optical absorption spectra to the concentrations of specific gas components of interest (NOx, SO2, etc.). Accurately predicting gas concentrations from absorption spectra remains a challenging problem due to the presence of nonlinearities in the relationships and the high-dimensional and correlated nature of the spectral data. This article proposes a generalized fuzzy linguistic model (GFLM) to address this challenge. The GFLM is made up of a series of “If-Then” fuzzy rules. The absorption spectra are input variables in the rule antecedent. The rule consequent is a general nonlinear polynomial function of the absorption spectra. Model parameters are estimated using least squares and gradient descent optimization algorithms. The performance of GFLM is compared with other traditional prediction models, such as partial least squares, support vector machines, multilayer perceptron neural networks and radial basis function networks, for two real flue gas spectral datasets: one from a coal-fired power plant and one from a gas-fired power plant. The experimental results show that the generalized fuzzy linguistic model has good predictive ability, and is competitive with alternative approaches, while having the added advantage of providing an interpretable model.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

One of the most disputed matters in the theory of finance has been the theory of capital structure. The seminal contributions of Modigliani and Miller (1958, 1963) gave rise to a multitude of studies and debates. Since the initial spark, the financial literature has offered two competing theories of the financing decision: the trade-off theory and the pecking order theory. The trade-off theory suggests that firms have an optimal capital structure balancing the benefits and costs of debt. The pecking order theory approaches the firm's capital structure from an information asymmetry perspective and assumes a hierarchy of financing, with firms using internal funds first, followed by debt and, as a last resort, equity. This thesis analyses the trade-off and pecking order theories and their predictions on panel data consisting of 78 Finnish firms listed on the OMX Helsinki stock exchange. Estimations are performed for the period 2003–2012. The data is collected from the Datastream system and consists of financial statement data. A number of capital structure characteristics are identified: firm size, profitability, firm growth opportunities, risk, asset tangibility and taxes, speed of adjustment and financial deficit. A regression analysis is used to examine the effects of the firm characteristics on capital structure. The regression models were formed based on the relevant theories. The general capital structure model is estimated with the fixed effects estimator. Additionally, dynamic models play an important role in several areas of corporate finance, but the combination of fixed effects and lagged dependent variables makes model estimation more complicated. A dynamic partial adjustment model is estimated using the Arellano and Bond (1991) first-differencing generalized method of moments, ordinary least squares and fixed effects estimators. The results for Finnish listed firms show support for the predictions of profitability, firm size and non-debt tax shields.
However, no conclusive support for the pecking order theory is found. Nevertheless, the effect of the pecking order cannot be fully ignored, and it is concluded that, instead of being substitutes, the trade-off and pecking order theories appear to complement each other. For the partial adjustment model, the results show that Finnish listed firms adjust towards their target capital structure at a speed of 29% a year using the book debt ratio.
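To make the reported adjustment speed concrete, the following worked equation uses the textbook form of a partial adjustment model. The notation (D for the debt ratio, D* for the target, λ for the speed) is an assumption here and may differ from the thesis's exact specification.

```latex
% Textbook partial adjustment model (notation assumed, not taken from the thesis):
% each year a firm closes a fraction \lambda of the gap to its target debt ratio.
D_{i,t} - D_{i,t-1} = \lambda \left( D^{*}_{i,t} - D_{i,t-1} \right) + \varepsilon_{i,t}

% With a fixed target, the remaining gap after k years is (1-\lambda)^{k} of the
% initial gap, so the half-life of a deviation at \lambda = 0.29 is
k_{1/2} = \frac{\ln 0.5}{\ln (1 - \lambda)} = \frac{\ln 0.5}{\ln 0.71} \approx 2.0 \text{ years}
```

Under these assumptions, a speed of 29% a year corresponds to roughly half of any deviation from the target being closed in about two years.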

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We developed orthogonal least-squares techniques for fitting crystalline lens shapes, and used the bootstrap method to determine uncertainties associated with the estimated vertex radii of curvature and asphericities of five different models. Three existing models were investigated including one that uses two separate conics for the anterior and posterior surfaces, and two whole lens models based on a modulated hyperbolic cosine function and on a generalized conic function. Two new models were proposed including one that uses two interdependent conics and a polynomial based whole lens model. The models were used to describe the in vitro shape for a data set of twenty human lenses with ages 7–82 years. The two-conic-surface model (7 mm zone diameter) and the interdependent surfaces model had significantly lower merit functions than the other three models for the data set, indicating that most likely they can describe human lens shape over a wide age range better than the other models (although with the two-conic-surfaces model being unable to describe the lens equatorial region). Considerable differences were found between some models regarding estimates of radii of curvature and surface asphericities. The hyperbolic cosine model and the new polynomial based whole lens model had the best precision in determining the radii of curvature and surface asphericities across the five considered models. Most models found significant increase in anterior, but not posterior, radius of curvature with age. Most models found a wide scatter of asphericities, but with the asphericities usually being positive and not significantly related to age. As the interdependent surfaces model had lower merit function than three whole lens models, there is further scope to develop an accurate model of the complete shape of human lenses of all ages. The results highlight the continued difficulty in selecting an appropriate model for the crystalline lens shape.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Financial processes may possess long memory and their probability densities may display heavy tails. Many models have been developed to deal with this tail behaviour, which reflects the jumps in the sample paths. On the other hand, the presence of long memory, which contradicts the efficient market hypothesis, is still an issue for further debates. These difficulties present challenges with the problems of memory detection and modelling the co-presence of long memory and heavy tails. This PhD project aims to respond to these challenges. The first part aims to detect memory in a large number of financial time series on stock prices and exchange rates using their scaling properties. Since financial time series often exhibit stochastic trends, a common form of nonstationarity, strong trends in the data can lead to false detection of memory. We will take advantage of a technique known as multifractal detrended fluctuation analysis (MF-DFA) that can systematically eliminate trends of different orders. This method is based on the identification of scaling of the q-th-order moments and is a generalisation of the standard detrended fluctuation analysis (DFA) which uses only the second moment; that is, q = 2. We also consider the rescaled range R/S analysis and the periodogram method to detect memory in financial time series and compare their results with the MF-DFA. An interesting finding is that short memory is detected for stock prices of the American Stock Exchange (AMEX) and long memory is found present in the time series of two exchange rates, namely the French franc and the Deutsche mark. Electricity price series of the five states of Australia are also found to possess long memory. For these electricity price series, heavy tails are also pronounced in their probability densities. The second part of the thesis develops models to represent short-memory and long-memory financial processes as detected in Part I.
These models take the form of continuous-time AR(∞)-type equations whose kernel is the Laplace transform of a finite Borel measure. By imposing appropriate conditions on this measure, short memory or long memory in the dynamics of the solution will result. A specific form of the models, which has a good MA(∞)-type representation, is presented for the short memory case. Parameter estimation for this type of model is performed via least squares, and the models are applied to the stock prices in the AMEX, which have been established in Part I to possess short memory. By selecting the kernel in the continuous-time AR(∞)-type equations to have the form of a Riemann-Liouville fractional derivative, we obtain a fractional stochastic differential equation driven by Brownian motion. This type of equation is used to represent financial processes with long memory, whose dynamics are described by the fractional derivative in the equation. These models are estimated via quasi-likelihood, namely via a continuous-time version of the Gauss-Whittle method. The models are applied to the exchange rates and the electricity prices of Part I with the aim of confirming their possible long-range dependence established by MF-DFA. The third part of the thesis provides an application of the results established in Parts I and II to characterise and classify financial markets. We will pay attention to the New York Stock Exchange (NYSE), the American Stock Exchange (AMEX), the NASDAQ Stock Exchange (NASDAQ) and the Toronto Stock Exchange (TSX). The parameters from MF-DFA and those of the short-memory AR(∞)-type models will be employed in this classification. We propose the Fisher discriminant algorithm to find a classifier in the two- and three-dimensional spaces of data sets and then provide cross-validation to verify discriminant accuracies. This classification is useful for understanding and predicting the behaviour of different processes within the same market.
The fourth part of the thesis investigates the heavy-tailed behaviour of financial processes which may also possess long memory. We consider fractional stochastic differential equations driven by stable noise to model financial processes such as electricity prices. The long memory of electricity prices is represented by a fractional derivative, while the stable noise input models their non-Gaussianity via the tails of their probability density. A method using the empirical densities and MF-DFA will be provided to estimate all the parameters of the model and simulate sample paths of the equation. The method is then applied to analyse daily spot prices for five states of Australia. Comparisons with the results obtained from the R/S analysis, periodogram method and MF-DFA are provided. The results from fractional SDEs agree with those from MF-DFA, which are based on multifractal scaling, while those from the periodograms, which are based on the second-order moment, seem to underestimate the long memory dynamics of the process. This highlights the need for and usefulness of fractal methods in modelling non-Gaussian financial processes with long memory.
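The standard DFA mentioned in this abstract (the q = 2 special case of MF-DFA) can be sketched in a few lines. This is a minimal illustration with linear detrending and non-overlapping windows, not the thesis's implementation; the function names, the scale grid and the simulated white-noise input are all assumptions.

```python
import math
import random


def detrended_residual_ss(segment):
    """Sum of squared residuals after removing a least squares line."""
    n = len(segment)
    t_mean = (n - 1) / 2.0
    y_mean = sum(segment) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(segment))
    slope = sxy / sxx
    return sum(
        (y - y_mean - slope * (t - t_mean)) ** 2 for t, y in enumerate(segment)
    )


def dfa_fluctuation(series, scale):
    """Second-order (q = 2) fluctuation function F(scale)."""
    mean = sum(series) / len(series)
    profile, total = [], 0.0
    for x in series:  # profile = cumulative sum of the centred series
        total += x - mean
        profile.append(total)
    n_windows = len(profile) // scale  # leftover points at the end are dropped
    ss = sum(
        detrended_residual_ss(profile[w * scale:(w + 1) * scale])
        for w in range(n_windows)
    )
    return math.sqrt(ss / (n_windows * scale))


def dfa_exponent(series, scales):
    """Slope of log F(s) against log s, fitted by least squares."""
    xs = [math.log(s) for s in scales]
    ys = [math.log(dfa_fluctuation(series, s)) for s in scales]
    x_mean, y_mean = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    return sxy / sxx


# Simulated white noise (an assumption for illustration): no memory is
# built in, so the estimated exponent should be near 0.5.
random.seed(1)
noise = [random.gauss(0.0, 1.0) for _ in range(4096)]
alpha_noise = dfa_exponent(noise, [16, 32, 64, 128, 256])
```

An exponent clearly above 0.5 for a stationary series is usually read as evidence of long memory; MF-DFA repeats the same calculation for a range of moment orders q.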

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper firstly presents an extended ambiguity resolution model that deals with an ill-posed problem and constraints among the estimated parameters. In the extended model, the regularization criterion is used instead of the traditional least squares in order to estimate the float ambiguities better. The existing models can be derived from the general model. Secondly, the paper examines the existing ambiguity searching methods from four aspects: exclusion of nuisance integer candidates based on the available integer constraints; integer rounding; integer bootstrapping and integer least squares estimations. Finally, this paper systematically addresses the similarities and differences between the generalized TCAR and decorrelation methods from both theoretical and practical aspects.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The main goal of this research is to design an efficient compression algorithm for fingerprint images. The wavelet transform technique is the principal tool used to reduce interpixel redundancies and to obtain a parsimonious representation for these images. A specific fixed decomposition structure is designed to be used by the wavelet packet in order to save on the computation, transmission, and storage costs. This decomposition structure is based on analysis of information packing performance of several decompositions, two-dimensional power spectral density, effect of each frequency band on the reconstructed image, and the human visual sensitivities. This fixed structure is found to provide the "most" suitable representation for fingerprints, according to the chosen criteria. Different compression techniques are used for different subbands, based on their observed statistics. The decision is based on the effect of each subband on the reconstructed image according to the mean square criteria as well as the sensitivities in human vision. To design an efficient quantization algorithm, a precise model for distribution of the wavelet coefficients is developed. The model is based on the generalized Gaussian distribution. A least squares algorithm on a nonlinear function of the distribution model shape parameter is formulated to estimate the model parameters. A noise shaping bit allocation procedure is then used to assign the bit rate among subbands. To obtain high compression ratios, vector quantization is used. In this work, the lattice vector quantization (LVQ) is chosen because of its superior performance over other types of vector quantizers. The structure of a lattice quantizer is determined by its parameters known as truncation level and scaling factor. In lattice-based compression algorithms reported in the literature the lattice structure is commonly predetermined leading to a nonoptimized quantization approach.
In this research, a new technique for determining the lattice parameters is proposed. In the lattice structure design, no assumption about the lattice parameters is made and no training and multi-quantizing is required. The design is based on minimizing the quantization distortion by adapting to the statistical characteristics of the source in each subimage. Since LVQ is a multidimensional generalization of uniform quantizers, it produces minimum distortion for inputs with uniform distributions. In order to take advantage of the properties of LVQ and its fast implementation, while considering the i.i.d. nonuniform distribution of wavelet coefficients, the piecewise-uniform pyramid LVQ algorithm is proposed. The proposed algorithm quantizes almost all of source vectors without the need to project these on the lattice outermost shell, while it properly maintains a small codebook size. It also resolves the wedge region problem commonly encountered with sharply distributed random sources. These represent some of the drawbacks of the algorithm proposed by Barlaud [26]. The proposed algorithm handles all types of lattices, not only the cubic lattices, as opposed to the algorithms developed by Fischer [29] and Jeong [42]. Furthermore, no training and multiquantizing (to determine lattice parameters) is required, as opposed to Powell's algorithm [78]. For coefficients with high-frequency content, the positive-negative mean algorithm is proposed to improve the resolution of reconstructed images. For coefficients with low-frequency content, a lossless predictive compression scheme is used to preserve the quality of reconstructed images. A method to reduce bit requirements of necessary side information is also introduced. Lossless entropy coding techniques are subsequently used to remove coding redundancy. The algorithms result in high quality reconstructed images with better compression ratios than other available algorithms.
To evaluate the proposed algorithms, their objective and subjective performance comparisons with other available techniques are presented. The quality of the reconstructed images is important for a reliable identification. Enhancement and feature extraction on the reconstructed images are also investigated in this research. A structural-based feature extraction algorithm is proposed in which the unique properties of fingerprint textures are used to enhance the images and improve the fidelity of their characteristic features. The ridges are extracted from enhanced grey-level foreground areas based on the local ridge dominant directions. The proposed ridge extraction algorithm properly preserves the natural shape of grey-level ridges as well as precise locations of the features, as opposed to the ridge extraction algorithm in [81]. Furthermore, it is fast and operates only on foreground regions, as opposed to the adaptive floating average thresholding process in [68]. Spurious features are subsequently eliminated using the proposed post-processing scheme.
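The abstract models wavelet coefficients with a generalized Gaussian distribution and estimates its shape parameter by a least squares algorithm that is not spelled out here. As a stand-in, the sketch below uses a different and simpler technique, moment matching: for a zero-mean generalized Gaussian with shape β, the ratio (E|x|)² / E[x²] equals Γ(2/β)² / (Γ(1/β) Γ(3/β)), which can be inverted numerically. The function names and the search interval are assumptions.

```python
import math


def ggd_moment_ratio(beta):
    """Theoretical (E|x|)^2 / E[x^2] for a zero-mean generalized Gaussian."""
    return math.gamma(2.0 / beta) ** 2 / (
        math.gamma(1.0 / beta) * math.gamma(3.0 / beta)
    )


def solve_shape(target_ratio, lo=0.1, hi=10.0, iterations=100):
    """Invert the moment ratio by bisection (the ratio increases with beta).

    Targets outside the ratio's range on [lo, hi] converge to an endpoint.
    """
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if ggd_moment_ratio(mid) < target_ratio:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def estimate_ggd_shape(samples):
    """Moment-matching shape estimate; assumes the samples are zero-mean."""
    n = len(samples)
    mean_abs = sum(abs(x) for x in samples) / n
    mean_sq = sum(x * x for x in samples) / n
    return solve_shape(mean_abs * mean_abs / mean_sq)
```

Two familiar special cases serve as a sanity check: a ratio of 2/π corresponds to β = 2 (Gaussian) and a ratio of 1/2 to β = 1 (Laplacian). Sharply peaked subband coefficients typically give smaller shape values.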

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In semisupervised learning (SSL), a predictive model is learned from a collection of labeled data and a typically much larger collection of unlabeled data. This paper presents a framework called multi-view point cloud regularization (MVPCR), which unifies and generalizes several semisupervised kernel methods that are based on data-dependent regularization in reproducing kernel Hilbert spaces (RKHSs). Special cases of MVPCR include coregularized least squares (CoRLS), manifold regularization (MR), and graph-based SSL. An accompanying theorem shows how to reduce any MVPCR problem to standard supervised learning with a new multi-view kernel.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

One of the fundamental econometric models in finance is predictive regression. The standard least squares method produces biased coefficient estimates when the regressor is persistent and its innovations are correlated with those of the dependent variable. This article proposes a general and convenient method based on the jackknife technique to tackle the estimation problem. The proposed method reduces the bias for both single- and multiple-regressor models and for both short- and long-horizon regressions. The effectiveness of the proposed method is demonstrated by simulations. An empirical application to equity premium prediction using the dividend yield and the short rate highlights the differences between the results by the standard approach and those by the bias-reduced estimator. The significant predictive variables under the ordinary least squares become insignificant after adjusting for the finite-sample bias. These discrepancies suggest that bias reduction in predictive regressions is important in practical applications.
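The abstract does not give the estimator's formula, so the following is only a generic sketch of one common subsample jackknife: the full-sample OLS slope is combined with the slopes from m non-overlapping blocks as m/(m-1) times the full estimate minus 1/(m-1) times the average block estimate. The function names and the block scheme are assumptions and may differ from the article's method.

```python
def ols_slope(x, y):
    """Ordinary least squares slope of y on x (with an intercept)."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((a - x_mean) ** 2 for a in x)
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    return sxy / sxx


def jackknife_slope(x, y, m=2):
    """Subsample jackknife slope using m consecutive blocks (hypothetical helper).

    In a predictive regression, x would hold the lagged predictor and y the
    next-period return. Observations beyond m * (n // m) are used only in
    the full-sample estimate.
    """
    block = len(x) // m
    full = ols_slope(x, y)
    block_slopes = [
        ols_slope(x[i * block:(i + 1) * block], y[i * block:(i + 1) * block])
        for i in range(m)
    ]
    average = sum(block_slopes) / m
    return (m / (m - 1)) * full - (1.0 / (m - 1)) * average
```

The weights are chosen so that a bias term proportional to 1/T cancels between the full-sample and block estimates, which is the kind of finite-sample bias that arises with a persistent regressor.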

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Reliable ambiguity resolution (AR) is essential to Real-Time Kinematic (RTK) positioning and its applications, since incorrect ambiguity fixing can lead to largely biased positioning solutions. A partial ambiguity fixing technique is developed to improve the reliability of AR, involving partial ambiguity decorrelation (PAD) and partial ambiguity resolution (PAR). Decorrelation transformation could substantially amplify the biases in the phase measurements. The purpose of PAD is to find the optimum trade-off between decorrelation and worst-case bias amplification. The concept of PAR refers to the case where only a subset of the ambiguities can be fixed correctly to their integers in the integer least-squares (ILS) estimation system at high success rates. As a result, RTK solutions can be derived from these integer-fixed phase measurements. This is meaningful provided that the number of reliably resolved phase measurements is sufficiently large for least-squares estimation of RTK solutions as well. Considering the GPS constellation alone, partially fixed measurements are often insufficient for positioning. The AR reliability is usually characterised by the AR success rate. In this contribution, an AR validation decision matrix is first introduced to understand the impact of success rate. Moreover, the AR risk probability is included in a more complete evaluation of the AR reliability. We use 16 ambiguity variance-covariance matrices with different levels of success rate to analyse the relation between success rate and AR risk probability. Next, the paper examines how, during the PAD process, a bias in one measurement is propagated and amplified onto many others, leading to more than one wrong integer and affecting the success probability. Furthermore, the paper proposes a partial ambiguity fixing procedure with a predefined success rate criterion and ratio test in the ambiguity validation process.
In this paper, the Galileo constellation data is tested with simulated observations. Numerical results from our experiment clearly demonstrate that only when the computed success rate is very high can the AR validation provide decisions about the correctness of AR that are close to the real world, with both low AR risk and low false alarm probabilities. The results also indicate that the PAR procedure can automatically choose an adequate number of ambiguities to fix at a given high success rate from the multiple constellations instead of fixing all the ambiguities. This is a benefit that multiple GNSS constellations can offer.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Global Navigation Satellite Systems (GNSS)-based observation systems can provide high precision positioning and navigation solutions in real time, in the order of subcentimetre if we make use of carrier phase measurements in the differential mode and deal with all the bias and noise terms well. However, these carrier phase measurements are ambiguous due to unknown, integer numbers of cycles. One key challenge in the differential carrier phase mode is to fix the integer ambiguities correctly. On the other hand, in safety-of-life or liability-critical applications, such as vehicle safety positioning and aviation, not only is high accuracy required, but the reliability requirement is also important. This PhD research studies how to achieve high reliability for ambiguity resolution (AR) in a multi-GNSS environment. GNSS ambiguity estimation and validation problems are the focus of the research effort. Particularly, we study the case of multiple constellations that include initial to full operations of the foreseeable Galileo, GLONASS, Compass and QZSS navigation systems from the next few years to the end of the decade. Since real observation data is only available from GPS and GLONASS systems, the simulation method named Virtual Galileo Constellation (VGC) is applied to generate observational data from another constellation in the data analysis. In addition, both full ambiguity resolution (FAR) and partial ambiguity resolution (PAR) algorithms are used in processing single and dual constellation data. Firstly, a brief overview of related work on AR methods and reliability theory is given. Next, a modified inverse integer Cholesky decorrelation method and its performance on AR are presented. Subsequently, a new measure of decorrelation performance called orthogonality defect is introduced and compared with other measures.
Furthermore, a new AR scheme considering the ambiguity validation requirement in the control of the search space size is proposed to improve the search efficiency. With respect to the reliability of AR, we also discuss the computation of the ambiguity success rate (ASR) and confirm that the success rate computed with the integer bootstrapping method is quite a sharp approximation to the actual integer least-squares (ILS) method success rate. The advantages of multi-GNSS constellations are examined in terms of the PAR technique involving the predefined ASR. Finally, a novel satellite selection algorithm for reliable ambiguity resolution called SARA is developed. In summary, the study demonstrates that when the ASR is close to one, the reliability of AR can be guaranteed and the ambiguity validation is effective. The work then focuses on new strategies to improve the ASR, including a partial ambiguity resolution procedure with a predefined success rate and a novel satellite selection strategy with a high success rate. The proposed strategies bring significant benefits of multi-GNSS signals to real-time high precision and high reliability positioning services.
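The integer bootstrapping success rate referred to above has a well-known closed form: the product over ambiguities of 2Φ(1/(2σ)) − 1, where σ is the conditional standard deviation of each (decorrelated) ambiguity in cycles and Φ is the standard normal distribution function. The sketch below assumes those conditional standard deviations are already available (in practice they come from a triangular decomposition of the ambiguity variance-covariance matrix); the function name and the input values are illustrative.

```python
import math


def bootstrap_success_rate(conditional_sigmas):
    """Integer bootstrapping success rate from conditional std devs (cycles).

    Uses 2 * Phi(x) - 1 = erf(x / sqrt(2)) with x = 1 / (2 * sigma).
    """
    rate = 1.0
    for sigma in conditional_sigmas:
        rate *= math.erf(1.0 / (2.0 * math.sqrt(2.0) * sigma))
    return rate


# Illustrative values only: three well-determined ambiguities and a poor one.
asr_all = bootstrap_success_rate([0.05, 0.08, 0.10, 0.30])
asr_subset = bootstrap_success_rate([0.05, 0.08, 0.10])
```

Dropping the poorly determined ambiguity raises the product, which is the intuition behind fixing only a subset of ambiguities to meet a predefined success rate.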

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Mortality and cost outcomes of elderly intensive care unit (ICU) trauma patients were characterised in a retrospective cohort study from an Australian tertiary ICU. Trauma patients admitted between January 2000 and December 2005 were grouped into three major age categories: aged ≥65 years admitted into ICU (n=272); aged ≥65 years admitted into general ward (n=610) and aged <65 years admitted into ICU (n=1617). Hospital mortality predictors were characterised as odds ratios (OR) using logistic regression. The impact of predictor variables on (log) total hospital-stay costs was determined using least squares regression. An alternate treatment-effects regression model estimated the mortality cost-effect as an endogenous variable. Mortality predictors (P ≤0.0001, comparator: ICU ≥65 years, ventilated) were: ICU <65 not-ventilated (OR 0.014); ICU <65 ventilated (OR 0.090); ICU age ≥65 not-ventilated (OR 0.061) and ward ≥65 (OR 0.086); increasing injury severity score and increased Charlson comorbidity index of 1 and 2, compared with zero (OR 2.21 [1.40 to 3.48] and OR 2.57 [1.45 to 4.55]). The raw mean daily ICU and hospital costs in A$ 2005 (US$) for age <65 and ≥65 to ICU, and ≥65 to the ward were: for year 2000: ICU, $2717 (1462) and $2777 (1494); hospital, $1837 (988) and $1590 (855); ward $933 (502); for year 2005: ICU, $3202 (2393) and $3086 (2307); hospital, $1938 (1449) and $1914 (1431); ward $1180 (882). Cost increments were predicted by age ≥65 and ICU admission, increasing injury severity score, mechanical ventilation, Charlson comorbidity index increments and hospital survival. The mortality cost-effect was estimated at -63% by least squares regression and -82% by the treatment-effects regression model. Patient demographic factors, injury severity and its consequences predict both cost and survival in trauma. The mortality cost-effect was biased upwards by conventional least squares regression estimation.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Near-infrared spectroscopy (NIRS) calibrations were developed for the discrimination of Chinese hawthorn (Crataegus pinnatifida Bge. var. major) fruit from three geographical regions as well as for the estimation of the total sugar, total acid, total phenolic content, and total antioxidant activity. Principal component analysis (PCA) was used for the discrimination of the fruit on the basis of their geographical origin. Three pattern recognition methods, linear discriminant analysis, partial least-squares-discriminant analysis, and back-propagation artificial neural networks, were applied to classify and compare these samples. Furthermore, three multivariate calibration models based on the first derivative NIR spectroscopy, partial least-squares regression, back-propagation artificial neural networks, and least-squares-support vector machines, were constructed for quantitative analysis of the four analytes, total sugar, total acid, total phenolic content, and total antioxidant activity, and validated by prediction data sets.