945 results for Maximum penalized likelihood estimates
Abstract:
This paper discusses the statistical analyses used to derive bridge live load models for Hong Kong from 10 years of weigh-in-motion (WIM) data. The statistical concepts required and the terminology adopted in the development of bridge live load models are introduced. The paper includes studies of representative vehicles drawn from the large amount of WIM data in Hong Kong. Load-affecting parameters such as gross vehicle weights, axle weights, axle spacings and the average daily number of trucks are first analyzed using various stochastic processes in order to obtain the mathematical distributions of these parameters. As a prerequisite to determining accurate bridge design loadings in Hong Kong, this study not only takes advantage of code formulation methods used internationally but also presents a new method for modelling collected WIM data using a statistical approach.
Abstract:
We study model selection strategies based on penalized empirical loss minimization. We point out a tight relationship between error estimation and data-based complexity penalization: any good error estimate may be converted into a data-based penalty function and the performance of the estimate is governed by the quality of the error estimate. We consider several penalty functions, involving error estimates on independent test data, empirical VC dimension, empirical VC entropy, and margin-based quantities. We also consider the maximal difference between the error on the first half of the training data and the second half, and the expected maximal discrepancy, a closely related capacity estimate that can be calculated by Monte Carlo integration. Maximal discrepancy penalty functions are appealing for pattern classification problems, since their computation is equivalent to empirical risk minimization over the training data with some labels flipped.
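The equivalence stated at the end of this abstract can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes 0-1 loss, an even sample size, and a hypothetical class of one-dimensional threshold classifiers. Under those assumptions the maximal discrepancy equals one minus twice the minimum empirical risk on the sample with first-half labels flipped.

```python
import random

def stump_predictions(xs):
    """Yield the +/-1 label vector of every threshold classifier on xs."""
    cuts = sorted(set(xs))
    thresholds = [cuts[0] - 1.0]
    thresholds += [(a + b) / 2.0 for a, b in zip(cuts, cuts[1:])]
    thresholds.append(cuts[-1] + 1.0)
    for t in thresholds:
        for sign in (1, -1):
            yield [sign if x > t else -sign for x in xs]

def error_rate(preds, labels, indices):
    """Fraction of misclassified points among the given indices."""
    return sum(preds[i] != labels[i] for i in indices) / len(indices)

def maximal_discrepancy(xs, ys):
    """Direct definition: max over the class of (first-half error - second-half error)."""
    half = len(xs) // 2
    first, second = range(half), range(half, len(xs))
    return max(error_rate(p, ys, first) - error_rate(p, ys, second)
               for p in stump_predictions(xs))

def maximal_discrepancy_via_erm(xs, ys):
    """Same quantity via empirical risk minimization with first-half labels flipped."""
    half = len(xs) // 2
    flipped = [-y if i < half else y for i, y in enumerate(ys)]
    everyone = range(len(xs))
    min_risk = min(error_rate(p, flipped, everyone) for p in stump_predictions(xs))
    return 1.0 - 2.0 * min_risk

# Synthetic, illustrative data: noisy labels around a threshold at zero.
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(40)]
ys = [(1 if x > 0 else -1) * (1 if rng.random() > 0.2 else -1) for x in xs]
direct = maximal_discrepancy(xs, ys)
via_erm = maximal_discrepancy_via_erm(xs, ys)
```

The two routes agree up to rounding, which is why any routine that minimizes empirical risk can be reused to compute the penalty.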
Abstract:
We investigate the use of certain data-dependent estimates of the complexity of a function class, called Rademacher and Gaussian complexities. In a decision theoretic setting, we prove general risk bounds in terms of these complexities. We consider function classes that can be expressed as combinations of functions from basis classes and show how the Rademacher and Gaussian complexities of such a function class can be bounded in terms of the complexity of the basis classes. We give examples of the application of these techniques in finding data-dependent risk bounds for decision trees, neural networks and support vector machines.
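As an illustration of what a data-dependent complexity estimate looks like in practice, here is a minimal Monte Carlo sketch of an empirical Rademacher complexity. It assumes a finite function class represented by its values on the sample and uses the convention (1/n) sup_f Σ σ_i f(x_i); the paper's own normalisation may differ, and the threshold-function class is a hypothetical example.

```python
import random

def empirical_rademacher(function_values, n_draws=2000, seed=0):
    """Monte Carlo estimate of E_sigma max_f (1/n) sum_i sigma_i * f(x_i).

    function_values: list of rows, one per function, holding f(x_1..x_n).
    """
    rng = random.Random(seed)
    n = len(function_values[0])
    total = 0.0
    for _ in range(n_draws):
        # One draw of independent Rademacher signs.
        sigma = [rng.choice((-1, 1)) for _ in range(n)]
        total += max(sum(s * v for s, v in zip(sigma, values))
                     for values in function_values) / n
    return total / n_draws

# Hypothetical classes evaluated on a fixed sample of 20 points.
sample = [i / 20.0 for i in range(20)]
constant = [[1.0] * len(sample)]
stumps = constant + [[1.0 if x > t else -1.0 for x in sample] for t in sample]
small = empirical_rademacher(constant)   # single function: close to zero
large = empirical_rademacher(stumps)     # richer class: larger complexity
```

Because both calls reuse the same seed and sample size, they see the same sign draws, so the richer class can never score lower than the single-function class.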
Abstract:
We study sample-based estimates of the expectation of the function produced by the empirical minimization algorithm. We investigate the extent to which one can estimate the rate of convergence of the empirical minimizer in a data-dependent manner. We establish three main results. First, we provide an algorithm that upper bounds the expectation of the empirical minimizer in a completely data-dependent manner. This bound is based on a structural result due to Bartlett and Mendelson, which relates expectations to sample averages. Second, we show that these structural upper bounds can be loose compared to previous bounds. In particular, we demonstrate a class for which the expectation of the empirical minimizer decreases as O(1/n) for sample size n, although the upper bound based on structural properties is Ω(1). Third, we show that this looseness of the bound is inevitable: we present an example that shows that a sharp bound cannot be universally recovered from empirical data.
Abstract:
We study Krylov subspace methods for approximating the matrix-function vector product φ(tA)b where φ(z) = [exp(z) - 1]/z. This product arises in the numerical integration of large stiff systems of differential equations by the Exponential Euler Method, where A is the Jacobian matrix of the system. Recently, this method has found application in the simulation of transport phenomena in porous media within mathematical models of wood drying and groundwater flow. We develop an a posteriori upper bound on the Krylov subspace approximation error and provide a new interpretation of a previously published error estimate. This leads to an alternative Krylov approximation to φ(tA)b, the so-called Harmonic Ritz approximant, which we find does not exhibit oscillatory behaviour of the residual error.
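For orientation, the standard Krylov approximation that the abstract takes as its starting point can be sketched as below. This is not the Harmonic Ritz approximant proposed in the paper. It is a minimal sketch that assumes a symmetric matrix A (so the projected matrix can be handled with a symmetric eigendecomposition), and the test matrix is synthetic.

```python
import numpy as np

def phi(z):
    """phi(z) = (exp(z) - 1) / z elementwise, with the limit phi(0) = 1."""
    z = np.asarray(z, dtype=float)
    safe = np.where(z == 0.0, 1.0, z)
    return np.where(z == 0.0, 1.0, np.expm1(safe) / safe)

def krylov_phi(A, b, t, m):
    """Approximate phi(t*A) @ b as beta * V_m @ phi(t*H_m) @ e_1 (A symmetric)."""
    n = b.size
    beta = np.linalg.norm(b)
    V = np.zeros((n, m))
    H = np.zeros((m, m))
    V[:, 0] = b / beta
    for j in range(m):
        w = A @ V[:, j]
        # Full orthogonalisation against all previous basis vectors.
        for i in range(j + 1):
            H[i, j] = V[:, i] @ w
            w -= H[i, j] * V[:, i]
        if j + 1 < m:
            h = np.linalg.norm(w)
            if h < 1e-12:  # breakdown: the subspace is already invariant
                V, H = V[:, : j + 1], H[: j + 1, : j + 1]
                break
            H[j + 1, j] = h
            V[:, j + 1] = w / h
    # H is symmetric up to rounding when A is symmetric; symmetrise before eigh.
    evals, evecs = np.linalg.eigh(0.5 * (H + H.T))
    phi_H_e1 = evecs @ (phi(t * evals) * evecs[0, :])
    return beta * (V @ phi_H_e1)

# Synthetic symmetric negative-definite test matrix with spectrum in [-5, -0.1].
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((30, 30)))
A = -(Q * np.linspace(0.1, 5.0, 30)) @ Q.T
A = 0.5 * (A + A.T)
b = rng.standard_normal(30)
approx = krylov_phi(A, b, t=1.0, m=20)

# Reference value from a full eigendecomposition of A.
lam, U = np.linalg.eigh(A)
exact = U @ (phi(lam) * (U.T @ b))
```

For a nonsymmetric Jacobian the same projection applies, but `phi(t*H_m)` would need a general matrix-function routine instead of `eigh`.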
Abstract:
Understanding the relationship between diet, physical activity and health in humans requires accurate measurement of body composition and daily energy expenditure. Stable isotopes provide a means of measuring total body water (TBW) and total daily energy expenditure (TEE) under free-living conditions. While the use of isotope ratio mass spectrometry (IRMS) for the analysis of 2H (deuterium) and 18O (oxygen-18) is well established in the field of human energy metabolism research, numerous questions remain regarding the factors which influence analytical and measurement error using this methodology. This thesis comprised four studies with the following emphases. The aim of Study 1 was to determine the analytical and measurement error of the IRMS with regard to sample handling under certain conditions. Study 2 involved the comparison of TEE using two commonly employed equations. Further, saliva and urine samples, collected at different times, were used to determine whether clinically significant differences would occur. Study 3 was undertaken to determine the appropriate collection times for TBW estimates and derived body composition values. Finally, Study 4 was a single case study to investigate whether TEE measures are affected when the human condition changes due to altered exercise and water intake. The aim of Study 1 was to validate laboratory approaches to measuring isotopic enrichment to ensure accurate (to international standards), precise (reproducibility of three replicate samples) and linear (isotope ratio constant over the expected concentration range) results. This established the machine variability for the IRMS equipment in use at Queensland University for both TBW and TEE. Using either 0.4 mL or 0.5 mL sample volumes for both oxygen-18 and deuterium was statistically acceptable (p > 0.05) and showed a within-analytical variance of 5.8 Delta VSOW units for deuterium and 0.41 Delta VSOW units for oxygen-18.
This variance was used as "within analytical noise" to determine sample deviations. It was also found that there was no influence of equilibration time on oxygen-18 or deuterium values when comparing the minimum (oxygen-18: 24 hours; deuterium: 3 days) and maximum (oxygen-18 and deuterium: 14 days) equilibration times. With regard to preparation using the vacuum line, any order of preparation is suitable, as the TEE values fall within 8% of each other regardless of preparation order. An 8% variation is acceptable for the TEE values due to biological and technical errors (Schoeller, 1988). However, for the automated line, deuterium must be assessed first, followed by oxygen-18, as the automated machine line does not evacuate tubes but merely refills them with an injection of gas for a predetermined time. Any fractionation (which may occur for both isotopes) would cause a slight elevation in the values and hence a lower TEE. The purpose of the second and third studies was to investigate the use of IRMS to measure TEE and TBW and to validate the current IRMS practices with regard to sample collection times of urine and saliva, the use of two TEE equations from different research centers and the body composition values derived from these TEE and TBW values. Following the collection of fasting baseline urine and saliva samples, 10 people (8 women, 2 men) were dosed with a doubly labeled water dose comprising 1.25 g 10% oxygen-18 and 0.1 g 100% deuterium/kg body weight. The samples were collected hourly for 12 hours on the first day, and then morning, midday and evening samples were collected for the next 14 days. The samples were analyzed using an isotope ratio mass spectrometer. For the TBW, time to equilibration was determined using three commonly employed data analysis approaches. Isotopic equilibration was reached in 90% of the sample by hour 6, and in 100% of the sample by hour 7.
With regard to the TBW estimations, the optimal time for urine collection was found to be between hours 4 and 10, over which there was no significant difference between values. In contrast, statistically significant differences in TBW estimations were found for hours 1-3 and 11-12 when compared with hours 4-10. Most of the individuals in this study were in equilibrium after 7 hours. The TEE equations of Prof. Dale Schoeller (Chicago, USA, IAEA) and Prof. K. Westerterp were compared with that of Prof. Andrew Coward (Dunn Nutrition Centre). When comparing values derived from samples collected in the morning and evening, there was no effect of time or equation on the resulting TEE values. The fourth study was a pilot study (n=1) to test the variability in TEE as a result of manipulations in fluid consumption and level of physical activity, that is, the magnitude of change which may be expected in a sedentary adult. Physical activity levels were manipulated by increasing the number of steps per day to mimic the increases that may result when a sedentary individual commences an activity program. The study comprised three sub-studies completed on the same individual over a period of 8 months. There were no significant changes in TBW across all studies, even though the elimination rates changed with the supplemented water intake and additional physical activity. The extra activity may not have been sufficiently strenuous, nor the water intake high enough, to cause a significant change in the TBW and hence in the CO2 production and TEE values. The measured TEE values show good agreement with the estimated values, which were calculated from an RMR of 1455 kcal/day, a DIT of 10% of TEE and activity based on measured steps. The covariance values tracked when plotting the residuals were found to be representative of "well-behaved" data and are indicative of the analytical accuracy.
The ratio and product plots were found to reflect the water turnover and CO2 production and thus could, with further investigation, be employed to identify the changes in physical activity.
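The estimated TEE mentioned in this abstract combines an RMR, a DIT equal to 10% of TEE, and an activity term. A minimal sketch of that arithmetic follows, assuming the usual additive decomposition TEE = RMR + DIT + activity energy expenditure; the abstract does not state the formula or the energy cost per step, so the activity term is left as an input and the function name is hypothetical.

```python
def estimated_tee(rmr_kcal, activity_kcal, dit_fraction=0.10):
    """Solve TEE = RMR + dit_fraction * TEE + activity for TEE (kcal/day).

    Assumes an additive decomposition of TEE; activity_kcal must be supplied
    by the caller, for example from a step-count conversion.
    """
    return (rmr_kcal + activity_kcal) / (1.0 - dit_fraction)

# RMR of 1455 kcal/day is taken from the abstract; 300 kcal/day of activity
# is an illustrative value only.
tee = estimated_tee(1455.0, 300.0)
dit = 0.10 * tee
```

Because DIT is defined as a share of TEE itself, the three components cannot simply be summed; the division by 0.9 resolves that circularity.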
Abstract:
Analytical expressions are derived for the mean and variance of estimates of the bispectrum of a real time series assuming a cosinusoidal model. The effects of spectral leakage, inherent in the discrete Fourier transform operation when the modes present in the signal have a nonintegral number of wavelengths in the record, are included in the analysis. A single phase-coupled triad of modes can cause the bispectrum to have a nonzero mean value over the entire region of computation owing to leakage. The variance of bispectral estimates in the presence of leakage has contributions from individual modes and from triads of phase-coupled modes. Time-domain windowing reduces the leakage. The theoretical expressions for the mean and variance of bispectral estimates are derived in terms of a function dependent on an arbitrary symmetric time-domain window applied to the record, the number of data, and the statistics of the phase coupling among triads of modes. The theoretical results are verified by numerical simulations for simple test cases and applied to laboratory data to examine phase coupling in a hypothesis testing framework.
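To make the estimator concrete, here is a minimal sketch of a direct bispectrum estimate, averaging X(k1) X(k2) X*(k1+k2) over records. It is an illustration rather than the paper's procedure: the cosinusoidal test signal uses integer numbers of wavelengths (so there is no leakage), the default window is rectangular, and all names and parameter values are hypothetical.

```python
import cmath
import math
import random

def dft_bin(x, k, window=None):
    """Single DFT coefficient X(k) of the (optionally windowed) record x."""
    n = len(x)
    w = window or [1.0] * n
    return sum(x[j] * w[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))

def bispectrum_estimate(records, k1, k2, window=None):
    """Average of X(k1) * X(k2) * conj(X(k1 + k2)) over all records."""
    total = 0j
    for x in records:
        total += (dft_bin(x, k1, window) * dft_bin(x, k2, window)
                  * dft_bin(x, k1 + k2, window).conjugate())
    return total / len(records)

def make_records(n_records, n, k1, k2, coupled, rng):
    """Three unit-amplitude cosines at k1, k2 and k1 + k2 with random phases."""
    records = []
    for _ in range(n_records):
        p1 = rng.uniform(0.0, 2.0 * math.pi)
        p2 = rng.uniform(0.0, 2.0 * math.pi)
        # Phase coupling: the third phase is the sum of the first two.
        p3 = p1 + p2 if coupled else rng.uniform(0.0, 2.0 * math.pi)
        records.append([math.cos(2 * math.pi * k1 * j / n + p1)
                        + math.cos(2 * math.pi * k2 * j / n + p2)
                        + math.cos(2 * math.pi * (k1 + k2) * j / n + p3)
                        for j in range(n)])
    return records

n, k1, k2 = 64, 5, 9
rng = random.Random(0)
scale = (n / 2.0) ** 3  # |X(k1) X(k2) X(k1+k2)| for unit-amplitude cosines
coupled = abs(bispectrum_estimate(make_records(64, n, k1, k2, True, rng), k1, k2)) / scale
uncoupled = abs(bispectrum_estimate(make_records(64, n, k1, k2, False, rng), k1, k2)) / scale
```

With coupled phases the normalised magnitude is one, while independent phases average towards zero as the number of records grows. Passing a tapered `window` list is where the leakage reduction discussed in the abstract would enter.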
Abstract:
This paper proposes the use of eigenvoice modeling techniques with the Cross Likelihood Ratio (CLR) as a criterion for speaker clustering within a speaker diarization system. The CLR has previously been shown to be a robust decision criterion for speaker clustering using Gaussian Mixture Models. Recently, eigenvoice modeling techniques have become increasingly popular, due to their ability to adequately represent a speaker based on sparse training data, as well as their improved capture of differences in speaker characteristics. This paper hence proposes that it would be beneficial to capitalize on the advantages of eigenvoice modeling in a CLR framework. Results obtained on the 2002 Rich Transcription (RT-02) Evaluation dataset show improved clustering performance, resulting in a 35.1% relative improvement in the overall Diarization Error Rate (DER) compared to the baseline system.
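The abstract does not spell out the CLR, so the sketch below uses the form commonly given in the speaker clustering literature: the sum, over both clusters, of the per-frame log-likelihood ratio between the other cluster's model and a background model. As a loud simplification, single one-dimensional Gaussians stand in for the adapted GMM or eigenvoice speaker models, and the data are synthetic.

```python
import math
import random

def fit_gaussian(xs):
    """Maximum-likelihood mean and variance (stand-in for a speaker model)."""
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / len(xs)
    return mu, max(var, 1e-6)

def avg_loglik(xs, model):
    """Average per-frame log-likelihood of xs under a Gaussian model."""
    mu, var = model
    return sum(-0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
               for x in xs) / len(xs)

def cross_likelihood_ratio(xi, xj, background):
    """CLR between two clusters; larger values suggest the same speaker."""
    mi, mj = fit_gaussian(xi), fit_gaussian(xj)
    return ((avg_loglik(xi, mj) - avg_loglik(xi, background))
            + (avg_loglik(xj, mi) - avg_loglik(xj, background)))

# Synthetic "speakers": A centred at -2, B centred at +2.
rng = random.Random(0)
a1 = [rng.gauss(-2.0, 1.0) for _ in range(200)]
a2 = [rng.gauss(-2.0, 1.0) for _ in range(200)]
b1 = [rng.gauss(2.0, 1.0) for _ in range(200)]
background = fit_gaussian(a1 + a2 + b1)
same = cross_likelihood_ratio(a1, a2, background)
different = cross_likelihood_ratio(a1, b1, background)
```

In a clustering loop, the pair with the highest CLR would be merged first, and merging would stop once the best score falls below a chosen threshold.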
Abstract:
...the probabilistic computer simulation study by Dunham and colleagues evaluating the impact of different cervical spine management (CSM) strategies on tetraplegia and brain injury outcomes.1 Based on literature findings and expert opinion, and with the use of advanced programming techniques, the authors conclude that early collar removal without cervical spine magnetic resonance imaging (MRI) is a preferable CSM strategy for comatose, blunt trauma patients with extremity movement and a negative cervical spine computed tomography (CT) scan. Although we do not have the required expertise to comment on the applied statistical approach, we would like to comment on one of the medical assumptions raised by the authors, namely the likelihood of tetraplegia in this specific population....
Abstract:
BACKGROUND: The relationship between temperature and mortality has been explored for decades and many temperature indicators have been applied separately. However, few data are available to show how the effects of different temperature indicators vary across mortality categories, particularly in a typical subtropical climate. OBJECTIVE: To assess the associations between various temperature indicators and different mortality categories in Brisbane, Australia during 1996-2004. METHODS: We applied two methods to assess the threshold and temperature indicator for each age and death group: in the first, mean temperature and the threshold assessed from all-cause mortality were used for all mortality categories; in the second, the specific temperature indicator and the threshold for each mortality category were identified separately by minimising Akaike's Information Criterion (AIC). Using both methods, we fitted a polynomial distributed lag non-linear model to estimate the effect on mortality of a one-degree temperature increase (or decrease) above (or below) the threshold on the current day, together with lagged effects. RESULTS: AIC was minimized when mean temperature was used for all non-external deaths and deaths at ages 75 to 84 years; when minimum temperature was used for deaths at ages 0 to 64 years, 65-74 years and ≥ 85 years, and for deaths from respiratory diseases; and when maximum temperature was used for deaths from cardiovascular diseases. The effect estimates using certain temperature indicators were similar to those using mean temperature, both for current-day and lag effects. CONCLUSION: Different age groups and death categories were sensitive to different temperature indicators. However, the effect estimates from certain temperature indicators did not significantly differ from those of mean temperature.
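The second method, choosing the indicator and threshold that minimise AIC, can be sketched as a grid search. This is a deliberately simplified stand-in for the study's distributed lag non-linear model: it assumes a Gaussian linear "hockey-stick" model for heat effects only, fitted by ordinary least squares, and the daily series are synthetic with a known answer (maximum temperature, threshold 30).

```python
import math
import random

def threshold_model_aic(temps, deaths, threshold, n_params=4):
    """Gaussian AIC of deaths = a + b * max(temp - threshold, 0), or None if unfittable."""
    x = [max(t - threshold, 0.0) for t in temps]
    n = len(x)
    mx, my = sum(x) / n, sum(deaths) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0.0:  # no day exceeds the threshold
        return None
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, deaths)) / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, deaths))
    # AIC = n * ln(RSS / n) + 2k, dropping constants shared by all candidates.
    return n * math.log(rss / n) + 2 * n_params

def select_indicator_and_threshold(indicators, deaths, thresholds):
    """Return (aic, indicator_name, threshold) with the smallest AIC."""
    best = None
    for name, temps in indicators.items():
        for thr in thresholds:
            aic = threshold_model_aic(temps, deaths, thr)
            if aic is not None and (best is None or aic < best[0]):
                best = (aic, name, thr)
    return best

# Synthetic daily series: deaths respond to maximum temperature above 30 degrees.
rng = random.Random(1)
mean_t = [rng.uniform(10.0, 30.0) for _ in range(300)]
max_t = [m + rng.uniform(3.0, 8.0) for m in mean_t]
min_t = [m - rng.uniform(3.0, 8.0) for m in mean_t]
deaths = [50.0 + 2.0 * max(t - 30.0, 0.0) + rng.gauss(0.0, 1.0) for t in max_t]
indicators = {"mean": mean_t, "minimum": min_t, "maximum": max_t}
best_aic, best_name, best_threshold = select_indicator_and_threshold(
    indicators, deaths, range(15, 36))
```

Running the same search separately for each age group or cause of death mirrors the category-specific selection described in the METHODS.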
Abstract:
Taxes are an important component of investing that is commonly overlooked both in the literature and in practice. For example, many understand that taxes will reduce an investment's return, but less understood is the risk-sharing nature of taxes that also reduces the investment's risk. This thesis examines how taxes affect the optimal asset allocation and asset location decision in an Australian environment. It advances the model of Horan & Al Zaman (2008), improving the method by which the present value of tax liabilities is calculated, by using an after-tax risk-free discount rate, and incorporating any new or reduced tax liabilities generated into its expected risk and return estimates. The asset allocation problem is examined for a range of different scenarios using Australian parameters, including different risk aversion levels, personal marginal tax rates, investment horizons, borrowing premiums, high or low inflation environments, and different starting cost bases. The findings support the Horan & Al Zaman (2008) conclusion that equities should be held in the taxable account. In fact, these findings are strengthened, with most of the efficient frontier maximising equity holdings in the taxable account instead of only half. Furthermore, these findings transfer to the Australian case, where it is found that taxed Australian investors should always invest in equities first through the taxable account before investing in super. However, untaxed Australian investors should invest in equities first through superannuation. With borrowings allowed in the taxable account (no borrowing premium), Australian taxed investors should hold 100% of the superannuation account in the risk-free asset, while undertaking leverage in the taxable account to achieve the desired risk-return. Introducing a borrowing premium decreases the likelihood of holding 100% of super in the risk-free asset for taxable investors.
The findings also suggest that the higher the marginal tax rate, the higher the borrowing premium must be to overcome this effect. Finally, as the investor's marginal tax rate increases, the overall allocation to equities should increase due to the increased risk and return sharing caused by taxation: in order to achieve the same risk/return level as at the lower taxation level, the investor must take on more equity exposure. The investment horizon has a minimal impact on the optimal allocation decision in the absence of factors such as mean reversion and human capital.
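The risk-sharing point made at the start of this abstract can be illustrated with a few lines of arithmetic. The sketch assumes a simple proportional tax with full loss offset, which is an idealisation rather than the thesis model, and the return series is made up for illustration. Under that assumption both the mean and the standard deviation of returns shrink by the factor (1 - tax rate).

```python
import statistics

def after_tax_returns(pre_tax_returns, tax_rate):
    """Apply a proportional tax with full loss offset to each period's return."""
    return [r * (1.0 - tax_rate) for r in pre_tax_returns]

# Illustrative pre-tax annual returns (not data from the thesis).
pre_tax = [0.12, -0.05, 0.20, 0.03, -0.10, 0.15]
tax_rate = 0.30
post_tax = after_tax_returns(pre_tax, tax_rate)

mean_ratio = statistics.mean(post_tax) / statistics.mean(pre_tax)
risk_ratio = statistics.pstdev(post_tax) / statistics.pstdev(pre_tax)
```

Both ratios equal 0.7 here, which is the intuition behind the finding that a higher marginal tax rate calls for more equity exposure to reach the same after-tax risk/return level.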