174 results for Statistical Tolerance Analysis
Abstract:
Analytical expressions are derived for the mean and variance of estimates of the bispectrum of a real time series assuming a cosinusoidal model. The effects of spectral leakage, inherent in the discrete Fourier transform operation when the modes present in the signal have a nonintegral number of wavelengths in the record, are included in the analysis. A single phase-coupled triad of modes can cause the bispectrum to have a nonzero mean value over the entire region of computation owing to leakage. The variance of bispectral estimates in the presence of leakage has contributions from individual modes and from triads of phase-coupled modes. Time-domain windowing reduces the leakage. The theoretical expressions for the mean and variance of bispectral estimates are derived in terms of a function dependent on an arbitrary symmetric time-domain window applied to the record, the number of data, and the statistics of the phase coupling among triads of modes. The theoretical results are verified by numerical simulations for simple test cases and applied to laboratory data to examine phase coupling in a hypothesis-testing framework.
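As a concrete illustration of the direct (FFT-based) estimator and the windowing step the abstract describes, here is a minimal Python sketch; the function name, segment length, and the simulated phase-coupled triad are our own assumptions, not taken from the paper. Non-integral frequencies are chosen deliberately so that leakage, the paper's central concern, is present.

```python
import numpy as np

def bispectrum_estimate(x, nperseg=256, window="hann"):
    """Direct (FFT-based) bispectrum estimate, averaged over segments.
    A symmetric time-domain window is applied to each segment to reduce
    spectral leakage, as discussed in the abstract."""
    win = np.hanning(nperseg) if window == "hann" else np.ones(nperseg)
    nseg = len(x) // nperseg
    nf = nperseg // 2
    B = np.zeros((nf, nf), dtype=complex)
    for s in range(nseg):
        seg = x[s * nperseg:(s + 1) * nperseg] * win
        X = np.fft.fft(seg)
        for k in range(nf):
            for l in range(nf - k):
                B[k, l] += X[k] * X[l] * np.conj(X[k + l])
    return B / nseg

# Example: a phase-coupled triad (f1, f2, f1+f2) yields a bispectral peak.
# The frequencies are non-integral multiples of 1/nperseg, so leakage occurs.
rng = np.random.default_rng(0)
t = np.arange(4096)
f1, f2 = 0.05, 0.11
phi1, phi2 = rng.uniform(0, 2 * np.pi, 2)
x = (np.cos(2 * np.pi * f1 * t + phi1)
     + np.cos(2 * np.pi * f2 * t + phi2)
     + np.cos(2 * np.pi * (f1 + f2) * t + phi1 + phi2)  # coupled mode
     + 0.1 * rng.standard_normal(t.size))
B = bispectrum_estimate(x)
```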
An approach to statistical lip modelling for speaker identification via chromatic feature extraction
Abstract:
This paper presents a novel technique for the tracking of moving lips for the purpose of speaker identification. In our system, a model of the lip contour is formed directly from chromatic information in the lip region. Iterative refinement of contour point estimates is not required. Colour features are extracted from the lips via concatenated profiles taken around the lip contour. Reduction of order in the lip features is obtained via principal component analysis (PCA) followed by linear discriminant analysis (LDA). Statistical speaker models are built from the lip features based on the Gaussian mixture model (GMM). Identification experiments performed on the M2VTS database show encouraging results.
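A minimal sketch of the PCA → LDA → GMM modelling chain described above, using scikit-learn; the random placeholder features stand in for the concatenated chromatic lip profiles, and the dimensions, component counts, and number of speakers are illustrative assumptions, not the paper's settings.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.mixture import GaussianMixture

# X: concatenated chromatic profiles per frame; y: speaker labels (placeholders).
rng = np.random.default_rng(1)
X = rng.standard_normal((600, 120))
y = rng.integers(0, 10, 600)

X_pca = PCA(n_components=40).fit_transform(X)                 # order reduction
X_lda = LinearDiscriminantAnalysis(n_components=9).fit_transform(X_pca, y)

# One GMM per speaker; identify a test frame by maximum log-likelihood.
models = {s: GaussianMixture(n_components=4, random_state=0).fit(X_lda[y == s])
          for s in np.unique(y)}
frame = X_lda[0:1]
speaker = max(models, key=lambda s: models[s].score(frame))
```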
Abstract:
Multivariate volatility forecasts are an important input in many financial applications, in particular portfolio optimisation problems. Given the number of models available and the range of loss functions to discriminate between them, selecting the optimal forecasting model is clearly challenging. The aim of this thesis is to thoroughly investigate how effective several commonly used statistical (MSE and QLIKE) and economic (portfolio variance and portfolio utility) loss functions are at discriminating between competing multivariate volatility forecasts. An analytical investigation of the loss functions is performed to determine whether they identify the correct forecast as the best forecast. This is followed by an extensive simulation study that examines the ability of the loss functions to consistently rank forecasts, and their statistical power within tests of predictive ability. For the tests of predictive ability, the model confidence set (MCS) approach of Hansen, Lunde and Nason (2003, 2011) is employed. In addition, an empirical study investigates whether the simulation findings hold in a realistic setting. In light of these earlier studies, a major empirical study seeks to identify the set of superior multivariate volatility forecasting models from 43 models that use either daily squared returns or realised volatility to generate forecasts. This study also assesses how the choice of volatility proxy affects the ability of the statistical loss functions to discriminate between forecasts. Analysis of the loss functions shows that QLIKE, MSE and portfolio variance can discriminate between multivariate volatility forecasts, while portfolio utility cannot. An examination of the effective loss functions shows that they can all identify the correct forecast at a point in time; however, their ability to discriminate between competing forecasts varies: QLIKE is identified as the most effective loss function, followed by portfolio variance and then MSE. The major empirical analysis reports that the optimal set of multivariate volatility forecasting models includes forecasts generated from both daily squared returns and realised volatility. Furthermore, it finds that the volatility proxy affects the statistical loss functions' ability to discriminate between forecasts in tests of predictive ability. These findings deepen our understanding of how to choose between competing multivariate volatility forecasts.
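For reference, the two statistical loss functions named above have standard forms in the multivariate volatility literature, and the sketch below implements those conventional versions (MSE as a squared Frobenius distance, QLIKE as log|H| + tr(H⁻¹Σ) for forecast H and proxy Σ), under the assumption that the thesis uses these definitions. The toy matrices are invented for illustration.

```python
import numpy as np

def mse_loss(proxy, forecast):
    """MSE loss: squared Frobenius distance between proxy and forecast."""
    d = proxy - forecast
    return np.sum(d * d)

def qlike_loss(proxy, forecast):
    """Multivariate QLIKE loss: log|H| + tr(H^{-1} Sigma)."""
    _, logdet = np.linalg.slogdet(forecast)
    return logdet + np.trace(np.linalg.solve(forecast, proxy))

# Toy comparison of two forecasts against a realised-covariance proxy.
Sigma = np.array([[1.0, 0.3], [0.3, 2.0]])   # volatility proxy
H1 = np.array([[1.1, 0.25], [0.25, 1.9]])    # close forecast
H2 = np.array([[2.0, 0.0], [0.0, 1.0]])      # poor forecast
for H in (H1, H2):
    print(mse_loss(Sigma, H), qlike_loss(Sigma, H))
```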
Abstract:
Lower fruit and vegetable intake among socioeconomically disadvantaged groups has been well documented, and may be a consequence of a higher consumption of take-out foods. This study examined whether, and to what extent, take-out food consumption mediated (explained) the association between socioeconomic position and fruit and vegetable intake. A cross-sectional postal survey was conducted among 1500 randomly selected adults aged 25–64 years in Brisbane, Australia in 2009 (response rate = 63.7%, N = 903). A food frequency questionnaire assessed usual daily servings of fruits and vegetables (0 to 6), overall take-out consumption (times/week) and the consumption of 22 specific take-out items (never to ≥ once/day). These specific take-out items were grouped into "less healthy" and "healthy" choices, and an index was created for each type of choice (0 to 100). Socioeconomic position was ascertained by education. The analyses were performed using linear regression, and a bootstrap re-sampling approach estimated the statistical significance of the mediated effects. Mean daily serves of fruits and vegetables were 1.89 (SD 1.05) and 2.47 (SD 1.12), respectively. The least educated group consumed fewer serves of fruit (B = –0.39, p < 0.001) and vegetables (B = –0.43, p < 0.001) than the highest educated group. The consumption of "less healthy" take-out food partly explained (mediated) education differences in fruit and vegetable intake; however, no mediating effects were observed for overall or "healthy" take-out consumption. Regular consumption of "less healthy" take-out items may contribute to socioeconomic differences in fruit and vegetable intake, possibly by displacing these foods.
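The product-of-coefficients mediation estimate with a bootstrap confidence interval, as described above, can be sketched as follows; the data are simulated and every coefficient value is a placeholder, not the survey's result.

```python
import numpy as np

def mediated_effect(x, m, y):
    """Product-of-coefficients mediated effect: a (x -> m) times b (m -> y | x)."""
    a = np.polyfit(x, m, 1)[0]
    X = np.column_stack([np.ones_like(x), x, m])
    b = np.linalg.lstsq(X, y, rcond=None)[0][2]
    return a * b

rng = np.random.default_rng(0)
n = 903  # survey sample size from the abstract
educ = rng.integers(0, 3, n).astype(float)                    # exposure
takeout = 0.5 * educ + rng.standard_normal(n)                 # mediator
veg = -0.3 * takeout + 0.1 * educ + rng.standard_normal(n)    # outcome

# Bootstrap re-sampling for the significance of the mediated effect.
boot = [mediated_effect(*(arr[idx] for arr in (educ, takeout, veg)))
        for idx in (rng.integers(0, n, n) for _ in range(2000))]
lo, hi = np.percentile(boot, [2.5, 97.5])  # 95% CI excluding 0 => significant
```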
Abstract:
This thesis investigates profiling and differentiating customers through the use of statistical data mining techniques. The business application of our work centres on examining individuals' seldom-studied yet critical consumption behaviour over an extensive time period within the context of the wireless telecommunication industry; consumption behaviour (as opposed to purchasing behaviour) is behaviour that has been performed so frequently that it becomes habitual and involves minimal intention or decision making. The key variables investigated are the activity-initiation timestamp and cell tower location, as well as the activity type and usage quantity (e.g., a voice call with its duration in seconds); the research focuses on customers' spatial and temporal usage behaviour. The main methodological emphasis is on the development of clustering models based on Gaussian mixture models (GMMs), which are fitted using the recently developed variational Bayesian (VB) method. VB is an efficient deterministic alternative to the popular but computationally demanding Markov chain Monte Carlo (MCMC) methods. The standard VB-GMM algorithm is extended by allowing component splitting, such that it is robust to initial parameter choices and can automatically and efficiently determine the number of components. The new algorithm we propose allows more effective modelling of individuals' highly heterogeneous and spiky spatial usage behaviour, or more generally human mobility patterns; the term spiky describes data patterns with large areas of low probability mixed with small areas of high probability. Customers are then characterised and segmented based on the fitted GMM, which corresponds to how each of them uses the products/services spatially in their daily lives; this essentially reflects their likely lifestyle and occupational traits. Other significant research contributions include fitting GMMs using VB to circular data, i.e., the temporal usage behaviour, and developing clustering algorithms suitable for high-dimensional data based on the use of VB-GMM.
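The thesis's split-based VB-GMM algorithm is not available off the shelf, but scikit-learn's BayesianGaussianMixture offers an analogous variational-Bayes treatment in which surplus components are pruned to near-zero weight, approximating automatic selection of the number of components. The sketch below applies it to simulated "spiky" spatial data (tight home/work clusters plus a diffuse background), a stand-in for real cell-tower locations.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

# Spiky spatial usage: a few tight clusters (home, work) plus diffuse noise.
rng = np.random.default_rng(0)
home = rng.normal([0, 0], 0.05, (500, 2))
work = rng.normal([5, 3], 0.05, (300, 2))
other = rng.uniform(-2, 7, (100, 2))
X = np.vstack([home, work, other])

# Variational Bayes GMM; surplus components receive near-zero weight and
# are effectively pruned, approximating automatic model-order selection.
vb = BayesianGaussianMixture(n_components=10,
                             weight_concentration_prior_type="dirichlet_process",
                             random_state=0).fit(X)
active = np.sum(vb.weights_ > 0.01)  # number of effective components
```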
Abstract:
The research objectives of this thesis were to contribute to Bayesian statistical methodology, in particular to statistical methodology for risk assessment and for spatial and spatio-temporal modelling, by modelling error structures using complex hierarchical models. Specifically, I hoped to consider two applied areas, and to use these applications as a springboard both for developing new statistical methods and for undertaking analyses that might answer particular applied questions. Thus, this thesis considers a series of models, firstly in the context of risk assessments for recycled water, and secondly in the context of water usage by crops. The research objective was to model error structures using hierarchical models in two problems: firstly, risk assessment analyses for wastewater, and secondly, a four-dimensional dataset assessing differences between cropping systems over time and over three spatial dimensions. The aim was to use the simplicity and insight afforded by Bayesian networks to develop appropriate models for risk scenarios, and to use Bayesian hierarchical models to explore the necessarily complex modelling of four-dimensional agricultural data. The specific objectives of the research were: to develop a method for the calculation of credible intervals for the point estimates of Bayesian networks; to develop a model structure that incorporates all the experimental uncertainty associated with various constants, thereby allowing the calculation of more credible credible intervals for a risk assessment; to model a single day's data from the agricultural dataset in a way that satisfactorily captured the complexities of the data; to build a model for several days' data, in order to consider how the full data might be modelled; and finally to build a model for the full four-dimensional dataset and to consider the time-varying nature of the contrast of interest, having satisfactorily accounted for possible spatial and temporal autocorrelations. This work forms five papers, two of which have been published, two submitted, and the final paper still in draft. The first two objectives were met by recasting the risk assessments as directed acyclic graphs (DAGs). In the first case, we elicited uncertainty for the conditional probabilities needed by the Bayesian net, incorporated these into a corresponding DAG, and used Markov chain Monte Carlo (MCMC) to find credible intervals for all the scenarios and outcomes of interest. In the second case, we incorporated the experimental data underlying the risk assessment constants into the DAG, and also treated some of those data as an 'errors-in-variables' problem [Fuller, 1987]. This illustrated a simple method for the incorporation of experimental error into risk assessments. In considering one day of the three-dimensional agricultural data, it became clear that geostatistical models or conditional autoregressive (CAR) models over the three dimensions were not the best way to approach the data. Instead, CAR models were used with neighbours only in the same depth layer. This gave flexibility to the model, allowing both the spatially structured and non-structured variances to differ at all depths. We call this model the CAR layered model. Given the experimental design, the fixed part of the model could have been modelled as a set of means by treatment and by depth, but doing so allows little insight into how the treatment effects vary with depth.
Hence, a number of essentially non-parametric approaches were taken to see the effects of depth on treatment, with the model of choice incorporating an errors-in-variables approach for depth in addition to a non-parametric smooth. The statistical contribution here was the introduction of the CAR layered model; the applied contribution was the analysis of moisture over depth and the estimation of the contrast of interest together with its credible intervals. These models were fitted using WinBUGS [Lunn et al., 2000]. The work in the fifth paper deals with the fact that with large datasets, the use of WinBUGS becomes problematic because of its highly correlated term-by-term updating. In this work, we introduce a Gibbs sampler with block updating for the CAR layered model. The Gibbs sampler was implemented by Chris Strickland using pyMCMC [Strickland, 2010]. This framework is then used to consider five days' data, and we show that soil moisture for all the various treatments reaches levels particular to each treatment at a depth of 200 cm and thereafter stays constant, albeit with increasing variance with depth. In an analysis across three spatial dimensions and across time, there are many interactions of time and the spatial dimensions to be considered. Hence, we chose to use a daily model and to repeat the analysis at all time points, effectively creating an interaction model of time by the daily model. Such an approach allows great flexibility. However, it does not allow insight into the way in which the parameter of interest varies over time. Hence, a two-stage approach was also used, with estimates from the first stage being analysed as a set of time series. We see this spatio-temporal interaction model as a useful approach to data measured across three spatial dimensions and time, since it does not assume additivity of the random spatial or temporal effects.
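A minimal sketch of the precision structure implied by the CAR layered model described above: each depth layer gets its own CAR precision with no neighbours across layers, so the joint precision is block-diagonal and the spatially structured variance can differ at every depth. The proper-CAR parameterisation, the alpha value, and the layer precisions are our own illustrative assumptions, not the thesis's settings.

```python
import numpy as np
from scipy.linalg import block_diag

def car_precision(adj, tau, alpha=0.95):
    """Proper-CAR precision for one layer: tau * (D - alpha * W),
    where D holds the neighbour counts and W is the adjacency matrix."""
    D = np.diag(adj.sum(axis=1))
    return tau * (D - alpha * adj)

def layered_car_precision(adj_within, taus):
    """CAR layered model: neighbours only within a depth layer, so the
    joint precision is block-diagonal with one CAR block per layer."""
    return block_diag(*[car_precision(adj_within, t) for t in taus])

# 4 sites along a transect within each layer; 3 depth layers, each with
# its own (illustrative) precision parameter.
adj = np.diag(np.ones(3), 1) + np.diag(np.ones(3), -1)
Q = layered_car_precision(adj, taus=[1.0, 2.5, 4.0])
```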
Abstract:
The Request For Proposal (RFP) under the design-build (DB) procurement arrangement is a document in which an owner develops their requirements and conveys the project scope to DB contractors. Owners should provide an appropriate level of design in DB RFPs to adequately describe their requirements without compromising the prospects for innovation. This paper examines and compares the different levels of owner-provided design in DB RFPs through a content analysis of 84 RFPs for public DB projects advertised between 2000 and 2010, with an aggregate contract value of over $5.4 billion. A statistical analysis was also conducted to explore the relationship between the proportion of owner-provided design and other project information, including project type, advertisement time, project size, contractor selection method, procurement process and contract type. The results show that the majority (64.8%) of the RFPs provide less than 10% of the design. The owner-provided design proportion has a significant association with project type, project size, contractor selection method and contract type. In addition, owners have generally provided less design in recent years than previously. The research findings also provide owners with perspectives for determining the appropriate level of owner-provided design in DB RFPs.
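As an illustration of the kind of association test involved, the sketch below checks whether the owner-provided design proportion differs across levels of a categorical project attribute. The use of a Kruskal-Wallis test and all of the data are our own assumptions, not the paper's reported method.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Placeholder data: owner-provided design proportion (0-100%) for 84 RFPs
# and one categorical attribute (e.g., project type with three levels).
design_pct = rng.beta(1, 9, 84) * 100   # skewed low, most RFPs under 10%
project_type = rng.integers(0, 3, 84)

# One-way test of association between the design proportion and the
# categorical variable (Kruskal-Wallis, an assumption on our part).
groups = [design_pct[project_type == k] for k in range(3)]
h, p = stats.kruskal(*groups)
```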
Abstract:
A wireless sensor network system must be able to tolerate harsh environmental conditions and reduce communication failures. In a typical outdoor situation, wind can introduce movement in the foliage. This motion of vegetation structures causes large and rapid signal fading in the communication link and must be accounted for when deploying a wireless sensor network system in such conditions. This thesis examines the fading characteristics experienced by wireless sensor nodes due to the effect of varying wind speed in a foliage-obstructed transmission path. It presents extensive measurement campaigns at two locations using a typical wireless sensor network configuration. The significance of this research lies in the varied approaches of its different experiments, involving a variety of vegetation types, scenarios, and the use of different polarisations (vertical and horizontal). The non-line-of-sight (NLoS) scenarios investigate the wind effect for different vegetation densities, including the Acacia tree, Dogbane tree and tall grass, whereas the line-of-sight (LoS) scenario investigates the effect of wind when swaying grass affects the ground-reflected component of the signal. The vegetation types and scenarios are intended to simulate real-life working conditions of wireless sensor network systems in outdoor foliated environments. The results from the measurements are presented as statistical models involving first- and second-order statistics. We found that in most cases the fading amplitude could be approximated by both the Lognormal and the Nakagami distribution, whose m parameter was found to depend on received power fluctuations. The Lognormal distribution is known to result from slow fading characteristics due to shadowing. This study concludes that fading caused by wind-induced variations in received power in wireless sensor network systems is insignificant: there is no notable difference in Nakagami m values between the low, calm, and windy wind-speed categories. The second-order analysis also shows that the durations of deep fades are very short: 0.1 s for 10 dB attenuation below the RMS level for vertical polarisation and 0.01 s for 10 dB attenuation below the RMS level for horizontal polarisation. Another key finding is that the received signal strength for horizontal polarisation performs more than 3 dB better than vertical polarisation for LoS and near-LoS (thin vegetation) conditions, and up to 10 dB better for denser vegetation conditions.
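A minimal sketch of the first- and second-order fading statistics described above: fitting the Nakagami m parameter to received-envelope samples and computing an average fade duration below a 10 dB-down threshold relative to the RMS level. The envelope data are simulated (real measurements would be correlated in time) and the sampling interval is an assumption.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Placeholder received-envelope samples (Nakagami-distributed by construction).
amp = stats.nakagami.rvs(1.5, scale=1.0, size=5000, random_state=rng)

# First-order statistic: fit the Nakagami m parameter (location pinned at 0,
# as is usual for signal envelopes).
m, loc, scale = stats.nakagami.fit(amp, floc=0)

# Second-order statistic: average fade duration below a threshold 10 dB
# under the RMS level, given a sampling interval dt (assumed here).
dt = 0.001
rms = np.sqrt(np.mean(amp ** 2))
thresh = rms * 10 ** (-10 / 20)
below = amp < thresh
crossings = np.sum(np.diff(below.astype(int)) == 1)   # fades entered
avg_fade = below.sum() * dt / max(crossings, 1)       # time below / fade count
```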
Abstract:
Decellularization of tissues can provide a unique biological environment for regenerative medicine applications, but only if minimal disruption of their microarchitecture is achieved during the process. The goal is to keep the structural integrity of such a construct as functional as the tissue from which it was derived. In this work, cartilage-on-bone laminates were decellularized through enzymatic, non-ionic and ionic protocols. This work investigated the effects of the decellularization process on the microarchitecture of the cartilaginous extracellular matrix (ECM), determining the extent to which each process degraded the structural organization of the network. High-resolution microscopy was used to capture cross-sectional images of samples before and after treatment. The variation of the microarchitecture was then analysed using a well-defined fast Fourier image processing algorithm. Statistical analysis of the results revealed significant differences in the alterations produced by the aforementioned protocols (p < 0.05). Ranked by their effectiveness in disrupting ECM integrity, the treatments were ordered: Trypsin > SDS > Triton X-100.
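A sketch of the kind of fast Fourier image analysis described above: the angular distribution of 2-D FFT power quantifies how strongly the ECM microarchitecture is aligned, since aligned fibres concentrate spectral energy perpendicular to their orientation. The function and the synthetic test image are our own illustrative constructions, not the paper's algorithm.

```python
import numpy as np

def fft_orientation_spectrum(img, nbins=180):
    """Angular distribution of 2-D FFT power over [0, pi); aligned
    structures concentrate energy into a narrow band of angles."""
    F = np.fft.fftshift(np.abs(np.fft.fft2(img - img.mean())) ** 2)
    h, w = F.shape
    y, x = np.mgrid[0:h, 0:w]
    theta = np.arctan2(y - h // 2, x - w // 2) % np.pi
    bins = (theta / np.pi * nbins).astype(int).clip(0, nbins - 1)
    power = np.bincount(bins.ravel(), weights=F.ravel(), minlength=nbins)
    return power / power.sum()

# Synthetic "aligned-matrix" test image: horizontal striations.
img = np.sin(2 * np.pi * np.arange(256)[:, None] / 8) * np.ones((256, 256))
spectrum = fft_orientation_spectrum(img)
anisotropy = spectrum.max() / spectrum.mean()  # higher = more aligned network
```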
Abstract:
With the increasing rate of shipping traffic, the risk of collisions in busy and congested port waters is expected to rise. However, due to low collision frequencies it is difficult to analyze such risk in a statistically sound manner. This study examines the occurrence of traffic conflicts in order to understand the characteristics of vessels involved in navigational hazards. A binomial logit model was employed to evaluate the association of vessel attributes and kinematic conditions with conflict severity levels. Results show a positive association with conflict risk for vessels of small gross tonnage and for overall vessel length, vessel height and draft. Conflicts involving a pair of dynamic vessels sailing at low speeds show similar effects.
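A minimal sketch of a binomial logit model relating vessel attributes to a binary conflict-severity outcome, in the spirit of the analysis above; the data, the coefficient values, and the use of statsmodels are our own assumptions, not the study's specification.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
# Placeholder vessel attributes (standardised) and one kinematic variable.
gross_tonnage = rng.standard_normal(n)
length = rng.standard_normal(n)
speed = rng.standard_normal(n)
# Binary conflict-severity indicator (1 = serious conflict), simulated from
# an assumed logit relationship.
logit_p = -0.5 - 0.6 * gross_tonnage + 0.4 * length - 0.3 * speed
severe = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

X = sm.add_constant(np.column_stack([gross_tonnage, length, speed]))
model = sm.Logit(severe, X).fit(disp=0)
odds_ratios = np.exp(model.params)  # association of each attribute with risk
```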
Abstract:
The discovery of protein variation is an important strategy in disease diagnosis within the biological sciences. The current benchmark for extracting information from multiple biological variables is the so-called "omics" disciplines of the biological sciences. Such variability is uncovered by multivariable data mining techniques, which fall into two primary categories: machine learning strategies and statistics-based approaches. Typically, proteomic studies can produce hundreds or thousands of variables, p, per observation, n, depending on the analytical platform or method used to generate the data. Many classification methods are limited by an n ≪ p constraint and, as such, require pre-treatment to reduce the dimensionality prior to classification. Recently, machine learning techniques have gained popularity in the field for their ability to successfully classify unknown samples. One limitation of such methods is the lack of a functional model allowing meaningful interpretation of results in terms of the features used for classification. This problem might be solved using a statistical model-based approach in which not only is the importance of each individual protein explicit, but the proteins are also combined into a readily interpretable classification rule without relying on a black-box approach. Here we apply the statistical dimension-reduction techniques Partial Least Squares (PLS) and Principal Components Analysis (PCA), followed by both statistical and machine learning classification methods, and compare them to a popular machine learning technique, Support Vector Machines (SVM). Both PLS and SVM demonstrate strong utility for proteomic classification problems.
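A sketch of the comparison described above on simulated n ≪ p data: PLS and PCA dimension reduction feeding a statistical classifier (LDA, our choice for illustration), against a linear SVM on the full variable set. Placing the reduction step inside the cross-validation pipeline keeps the evaluation honest; all data and dimensions are invented.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Simulated n << p proteomic-style data: 60 samples, 2000 variables, 2 classes.
rng = np.random.default_rng(0)
X = rng.standard_normal((60, 2000))
y = rng.integers(0, 2, 60)
X[y == 1, :20] += 1.0  # a handful of genuinely discriminatory "proteins"

# Dimension reduction (supervised PLS or unsupervised PCA) inside the
# cross-validated pipeline, followed by an interpretable statistical classifier.
candidates = {
    "PLS+LDA": make_pipeline(PLSRegression(n_components=5),
                             LinearDiscriminantAnalysis()),
    "PCA+LDA": make_pipeline(PCA(n_components=5), LinearDiscriminantAnalysis()),
    "SVM": SVC(kernel="linear"),  # machine-learning baseline on all variables
}
for name, clf in candidates.items():
    print(name, cross_val_score(clf, X, y, cv=5).mean())
```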
Abstract:
Introduction: The ability to regulate joint stiffness and coordinate movement during landing when impaired by muscle fatigue has important implications for knee function. Unfortunately, the literature examining fatigue effects on landing mechanics suffers from a lack of consensus. Inconsistent results can be attributed to variable fatigue models, as well as to grouping variable responses between individuals when statistically detecting differences between conditions. There remains a need to examine fatigue effects on knee function during landing with attention to these methodological limitations. Aim: The purpose of this study, therefore, was to examine the effects of isokinetic fatigue on pre-impact muscle activity and post-impact knee mechanics during landing using single-subject analysis. Methodology: Sixteen male university students (22.6 ± 3.2 yrs; 1.78 ± 0.07 m; 75.7 ± 6.3 kg) performed maximal concentric and eccentric knee extensions in a reciprocal manner on an isokinetic dynamometer, and step-landing trials, on two occasions. On the first occasion, each participant performed 20 step-landing trials from a knee-high platform followed by 75 maximal contractions on the isokinetic dynamometer. The isokinetic data were used to calculate the operational definition of fatigue. On the second occasion, after a minimum rest of 14 days, participants performed two sets of 20 step-landing trials, followed by isokinetic exercise until the operational definition of fatigue was met, and a final post-fatigue set of 20 step-landing trials. Results: Single-subject analyses revealed that isokinetic fatigue of the quadriceps induced variable responses in pre-impact activation of the knee extensors and flexors (frequency, onset timing and amplitude) and in post-impact knee mechanics (stiffness and coordination). In general, however, isokinetic fatigue induced significant (p < 0.05) reductions in quadriceps activation frequency, delayed onset, and increased amplitude. In addition, knee stiffness was significantly (p < 0.05) increased in some individuals, and sagittal coordination was impaired. Conclusions: Pre-impact activation and post-impact mechanics were adjusted in patterns that were unique to the individual and could not be identified using traditional group-based statistical analysis. The results suggest that individuals optimise knee function differently to satisfy competing demands, such as minimising energy expenditure while maximising joint stability and sensory information.
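A sketch contrasting group-based and single-subject analysis as described above: when simulated individual responses go in opposite directions, per-participant tests (p < 0.05) flag changes that a pooled group test can miss. The stiffness values, trial counts per set, and effect sizes are invented for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Placeholder knee-stiffness values: 20 pre- and 20 post-fatigue landings per
# participant, with individual-specific responses that differ in direction.
n_participants, n_trials = 16, 20
pre = rng.normal(6.0, 0.8, (n_participants, n_trials))
shifts = rng.normal(0.0, 0.7, n_participants)   # unique response per person
post = pre + shifts[:, None] + rng.normal(0, 0.8, (n_participants, n_trials))

# Single-subject analysis: test each participant separately, rather than
# pooling responses that may cancel out at the group level.
for i in range(n_participants):
    t, p = stats.ttest_ind(pre[i], post[i])
    if p < 0.05:
        print(f"participant {i}: stiffness change "
              f"{post[i].mean() - pre[i].mean():+.2f}")

# Group-based contrast: opposite individual shifts can mask the effect.
t, p = stats.ttest_rel(pre.mean(axis=1), post.mean(axis=1))
```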