136 results for Statistical Model

at Queensland University of Technology - ePrints Archive


Relevance: 100.00%

Abstract:

Introduced in this paper is a Bayesian model for isolating the resonant frequency from combustion chamber resonance. The model focuses on characterising the initial rise in the resonant frequency in order to investigate the rise in in-cylinder bulk temperature associated with combustion. By resolving the model parameters, it is possible to determine: the start of pre-mixed combustion, the start of diffusion combustion, the initial resonant frequency, the resonant frequency as a function of crank angle, the in-cylinder bulk temperature as a function of crank angle, and the trapped mass as a function of crank angle. The Bayesian method allows individual cycles to be examined without cycle-averaging, allowing inter-cycle variability studies. Results are shown for a turbo-charged, common-rail compression ignition engine run at 2000 rpm and full load.
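The link between resonant frequency and bulk temperature exploited here is the standard acoustic-resonance relation (Draper's equation): the resonance sits at f = α·c/(π·B), with c the in-cylinder speed of sound. A minimal sketch of inverting that relation, with illustrative parameter values (not the paper's calibrated ones):

```python
import math

def bulk_temperature_from_resonance(f_res_hz, bore_m, gamma=1.33, R=287.0,
                                    alpha=1.841):
    """Invert f = alpha * c / (pi * B), with c = sqrt(gamma * R * T), to
    recover the in-cylinder bulk temperature T (kelvin).

    alpha = 1.841 is the Bessel constant of the first circumferential
    acoustic mode; gamma and R are illustrative gas-property values.
    """
    c = math.pi * bore_m * f_res_hz / alpha  # implied speed of sound
    return c ** 2 / (gamma * R)
```

Applied per crank-angle step to a resonant-frequency trace, this yields the bulk temperature as a function of crank angle that the abstract describes.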

Relevance: 70.00%

Abstract:

Multicarrier code division multiple access (MC-CDMA) is a very promising candidate for the multiple access scheme in fourth generation wireless communication systems. During asynchronous transmission, multiple access interference (MAI) is a major challenge for MC-CDMA systems and significantly affects their performance. The main objectives of this thesis are to analyze the MAI in asynchronous MC-CDMA, and to develop robust techniques to reduce the MAI effect. Focus is first on the statistical analysis of MAI in asynchronous MC-CDMA. A new statistical model of MAI is developed. In the new model, the derivation of MAI can be applied to different distributions of timing offset, and the MAI power is modelled as a Gamma distributed random variable. By applying the new statistical model of MAI, a new computer simulation model is proposed. This model is based on the modelling of a multiuser system as a single user system followed by an additive noise component representing the MAI, which enables the new simulation model to significantly reduce the computation load during computer simulations. MAI reduction using the slow frequency hopping (SFH) technique is the topic of the second part of the thesis. Two subsystems are considered. The first subsystem involves subcarrier frequency hopping as a group, which is referred to as GSFH/MC-CDMA. In the second subsystem, the condition of group hopping is dropped, resulting in a more general system, namely individual subcarrier frequency hopping MC-CDMA (ISFH/MC-CDMA). This research found that with the introduction of SFH, both GSFH/MC-CDMA and ISFH/MC-CDMA systems generate less MAI power than the basic MC-CDMA system during asynchronous transmission. Because of this, both SFH systems are shown to outperform MC-CDMA in terms of BER. This improvement, however, is at the expense of spectral widening.
In the third part of this thesis, base station polarization diversity, as another MAI reduction technique, is introduced to asynchronous MC-CDMA. The combined system is referred to as Pol/MC-CDMA. In this part a new optimum combining technique, namely maximal signal-to-MAI ratio combining (MSMAIRC), is proposed to combine the signals in the two base station antennas. With the application of MSMAIRC and in the absence of additive white Gaussian noise (AWGN), the resulting signal-to-MAI ratio (SMAIR) is not only maximized but also independent of cross polarization discrimination (XPD) and antenna angle. In the case when AWGN is present, the performance of MSMAIRC is still affected by the XPD and antenna angle, but to a much lesser degree than the traditional maximal ratio combining (MRC). Furthermore, this research found that the BER performance of Pol/MC-CDMA can be further improved by changing the angle between the two receiving antennas. Hence the optimum antenna angles for both MSMAIRC and MRC are derived and their effects on the BER performance are compared. With the derived optimum antenna angle, the Pol/MC-CDMA system is able to obtain the lowest BER for a given XPD.
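The simulation idea described above - replacing the full multiuser system with a single-user link plus an additive term whose power is Gamma distributed - can be sketched as follows. The shape/scale parameters and BPSK signalling here are illustrative stand-ins, not the thesis's fitted MAI model:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_ber_with_gamma_mai(n_bits=100_000, shape=2.0, scale=0.05,
                                snr_db=10.0):
    """Single-user BPSK link plus an additive MAI component whose power is
    drawn per bit from a Gamma distribution (illustrative parameters).

    Avoids simulating every interfering user explicitly, which is the
    computational saving the abstract describes.
    """
    bits = rng.integers(0, 2, n_bits)
    symbols = 2.0 * bits - 1.0                       # BPSK mapping
    noise_var = 10 ** (-snr_db / 10)
    awgn = rng.normal(0.0, np.sqrt(noise_var), n_bits)
    mai_power = rng.gamma(shape, scale, n_bits)      # Gamma-distributed MAI power
    mai = rng.normal(0.0, 1.0, n_bits) * np.sqrt(mai_power)
    received = symbols + awgn + mai
    return float(np.mean((received > 0).astype(int) != bits))
```

Sweeping `shape * scale` (the mean MAI power) then reproduces the qualitative effect of adding or removing interferers without re-running a multiuser simulation.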

Relevance: 70.00%

Abstract:

The discovery of protein variation is an important strategy in disease diagnosis within the biological sciences. The current benchmark for eliciting information from multiple biological variables is the so-called "omics" disciplines of the biological sciences. Such variability is uncovered by implementation of multivariable data mining techniques which fall under two primary categories: machine learning strategies and statistics-based approaches. Typically, proteomic studies can produce hundreds or thousands of variables, p, per observation, n, depending on the analytical platform or method employed to generate the data. Many classification methods are limited by an n≪p constraint, and as such require pre-treatment to reduce the dimensionality prior to classification. Recently, machine learning techniques have gained popularity in the field for their ability to successfully classify unknown samples. One limitation of such methods is the lack of a functional model allowing meaningful interpretation of results in terms of the features used for classification. This is a problem that might be solved using a statistical model-based approach, where not only is the importance of each individual protein explicit, but the proteins are also combined into a readily interpretable classification rule without relying on a black box approach. Here we apply the statistical dimension reduction techniques Partial Least Squares (PLS) and Principal Components Analysis (PCA), followed by both statistical and machine learning classification methods, and compare them to a popular machine learning technique, Support Vector Machines (SVM). Both PLS and SVM demonstrate strong utility for proteomic classification problems.
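The two routes being compared - dimension reduction followed by an interpretable classifier versus a direct machine learning classifier - can be sketched in scikit-learn. The synthetic data, component counts, and use of logistic regression after PCA are illustrative choices (the paper also uses PLS, for which `PLSRegression` could substitute):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Synthetic "proteomic" data in the n << p regime: 60 samples, 500 variables
X, y = make_classification(n_samples=60, n_features=500, n_informative=10,
                           random_state=0)

# Statistical route: reduce dimension first, then fit an interpretable model
pca_lr = make_pipeline(PCA(n_components=5), LogisticRegression(max_iter=1000))
pca_lr_acc = cross_val_score(pca_lr, X, y, cv=5).mean()

# Machine learning route: linear SVM applied directly to all p variables
svm_acc = cross_val_score(SVC(kernel="linear"), X, y, cv=5).mean()
```

The pipeline's PCA loadings and regression coefficients give the per-feature interpretability that the black-box comparison in the abstract highlights.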

Relevance: 70.00%

Abstract:

Chatrooms, for example Internet Relay Chat, are generally multi-user, multi-channel and multi-server chat systems which run over the Internet and provide a protocol for real-time text-based conferencing between users all over the world. While a well-trained human observer is able to understand who is chatting with whom, there are no efficient and accurate automated tools to determine the groups of users conversing with each other. A precursor to analysing evolving cyber-social phenomena is to first determine what the conversations are and which groups of chatters are involved in each conversation. We consider this problem in this paper. We propose an algorithm to discover all groups of users that are engaged in conversation. Our algorithm is based on a statistical model of a chatroom that is founded on our experience with real chatrooms. Our approach does not require any semantic analysis of the conversations; rather, it is based purely on the statistical information contained in the sequence of posts. We improve the accuracy by applying graph algorithms to clean the statistical information. We present experimental results which indicate that one can automatically determine the conversing groups in a chatroom purely on the basis of statistical analysis.
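The flavour of a purely statistical, content-free grouping can be sketched as follows: users who repeatedly post close together in time are linked, and connected components of the resulting graph are read as conversations. The window and weight threshold are hypothetical parameters, and this is far simpler than the paper's actual model:

```python
from collections import defaultdict
from itertools import combinations

def conversing_groups(posts, window=3, min_weight=2):
    """posts: list of (timestamp, user). Pairs of users posting within
    `window` seconds of each other accumulate edge weight; components of
    the thresholded graph are returned as conversing groups.

    No message content is inspected - only the timing of posts.
    O(n^2) pairing; fine for a sketch, not for long logs.
    """
    weight = defaultdict(int)
    for (t1, u1), (t2, u2) in combinations(sorted(posts), 2):
        if u1 != u2 and t2 - t1 <= window:
            weight[frozenset((u1, u2))] += 1

    # Union-find over edges that clear the weight threshold
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for pair, w in weight.items():
        if w >= min_weight:
            a, b = pair
            parent[find(a)] = find(b)
        else:
            for u in pair:
                find(u)  # weakly linked users remain singletons

    groups = defaultdict(set)
    for u in parent:
        groups[find(u)].add(u)
    return list(groups.values())
```

The thresholding step stands in for the graph-based "cleaning" of the statistical information that the abstract mentions.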

Relevance: 70.00%

Abstract:

The Bus Rapid Transit (BRT) station is the interface between passengers and services. The station is crucial to line operation as it is typically the only location where buses can pass each other. Congestion may occur here when buses maneuvering into and out of the platform lane interfere with bus flow, or when a queue of buses forms upstream of the platform lane, blocking the passing lane. Further, some systems include operation in which express buses do not stop at the station, resulting in a proportion of non-stopping buses. It is important to understand the operation of the station under this type of operation and its effect on BRT line capacity. This study uses microscopic traffic simulation modeling to model BRT station operation and to analyze the relationship between station bus capacity and BRT line bus capacity. First, the simulation model is developed for the limit state scenario, and then a statistical model is defined and calibrated for a specified range of controlled scenarios of dwell time characteristics. A field survey was conducted to verify parameters such as dwell time, clearance time and coefficient of variation of dwell time, in order to obtain relevant station bus capacity. The proposed model for BRT bus capacity provides a better understanding of BRT line capacity and is useful to transit authorities in BRT planning, design and operation.
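The analytical baseline such simulations are typically compared against is a TCQSM-style loading-area capacity formula driven by exactly the surveyed quantities named above (dwell time, clearance time, coefficient of variation of dwell time). A minimal sketch, with illustrative inputs rather than the paper's calibrated values:

```python
from statistics import NormalDist

def station_bus_capacity(dwell_s, clearance_s, cv_dwell, failure_rate=0.025):
    """TCQSM-style loading-area bus capacity (buses/hour):

        3600 / (t_c + t_d + Z * cv * t_d)

    where t_d is mean dwell time, t_c clearance time, cv the coefficient
    of variation of dwell time, and Z the one-tailed normal variate for
    the accepted probability of a queue forming behind the platform.
    """
    z = NormalDist().inv_cdf(1.0 - failure_rate)
    return 3600.0 / (clearance_s + dwell_s + z * cv_dwell * dwell_s)
```

Raising the dwell-time variability (`cv_dwell`) lowers capacity, which is why the coefficient of variation is one of the surveyed calibration parameters.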

Relevance: 70.00%

Abstract:

Electricity network investment and asset management require accurate estimation of future demand in energy consumption within specified service areas. For this purpose, simple models are typically developed to predict future trends in electricity consumption using various methods and assumptions. This paper presents a statistical model to predict electricity consumption in the residential sector at the Census Collection District (CCD) level over the state of New South Wales, Australia, based on spatial building and household characteristics. Residential household demographic and building data from the Australian Bureau of Statistics (ABS) and actual electricity consumption data from electricity companies are merged for 74% of the 12,000 CCDs in the state. Eighty percent of the merged dataset is randomly set aside to establish the model using regression analysis, and the remaining 20% is used to independently test the accuracy of model prediction against actual consumption. In 90% of the cases, the predicted consumption is shown to be within 5 kWh per dwelling per day of actual values, with an overall state accuracy of -1.15%. Given a future scenario with a shift in climate zone and a growth in population, the model is used to identify the geographical or service areas that are most likely to have increased electricity consumption. Such geographical representation can be of great benefit when assessing alternatives to the centralised generation of energy; having such a model gives a quantifiable method for selecting the 'most' appropriate system when a review or upgrade of the network infrastructure is required.

Relevance: 70.00%

Abstract:

This thesis proposes three novel models which extend the statistical methodology for motor unit number estimation, a clinical neurology technique. Motor unit number estimation is important in the treatment of degenerative muscular diseases and, potentially, spinal injury. Additionally, a recent and untested statistic to enable statistical model choice is found to be a practical alternative for larger datasets. The existing methods for dose finding in dual-agent clinical trials are found to be suitable only for designs of modest dimensions. The model choice case-study is the first of its kind containing interesting results using so-called unit information prior distributions.

Relevance: 70.00%

Abstract:

In the commercial food industry, demonstration of microbiological safety and thermal process equivalence often involves a mathematical framework that assumes log-linear inactivation kinetics and invokes concepts of decimal reduction time (DT), z values, and accumulated lethality. However, many microbes, particularly spores, exhibit inactivation kinetics that are not log linear. This has led to alternative modeling approaches, such as the biphasic and Weibull models, that relax strong log-linear assumptions. Using a statistical framework, we developed a novel log-quadratic model, which approximates the biphasic and Weibull models and provides additional physiological interpretability. As a statistical linear model, the log-quadratic model is relatively simple to fit and straightforwardly provides confidence intervals for its fitted values. It allows a DT-like value to be derived, even from data that exhibit obvious "tailing." We also showed how existing models of non-log-linear microbial inactivation, such as the Weibull model, can fit into a statistical linear model framework that dramatically simplifies their solution. We applied the log-quadratic model to thermal inactivation data for the spore-forming bacterium Clostridium botulinum and evaluated its merits compared with those of popular previously described approaches. The log-quadratic model was used as the basis of a secondary model that can capture the dependence of microbial inactivation kinetics on temperature. This model, in turn, was linked to models of spore inactivation of Sapru et al. and Rodriguez et al. that posit different physiological states for spores within a population. We believe that the log-quadratic model provides a useful framework in which to test vitalistic and mechanistic hypotheses of inactivation by thermal and other processes. Copyright © 2009, American Society for Microbiology. All Rights Reserved.
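The log-quadratic model is simple enough to state concretely: log10 N(t) = a + b·t + c·t², fitted by ordinary least squares, with a DT-like value obtained by solving for the first decimal reduction. A minimal sketch (synthetic coefficients, not the paper's C. botulinum fits):

```python
import numpy as np

def fit_log_quadratic(times, log10_counts):
    """Fit log10 N(t) = a + b*t + c*t^2 by ordinary least squares.

    As a linear model in (1, t, t^2) it is 'relatively simple to fit',
    and standard regression machinery yields confidence intervals.
    """
    c, b, a = np.polyfit(times, log10_counts, 2)  # highest power first
    return a, b, c

def dt_like_value(a, b, c):
    """DT-like value: smallest positive t with a + b*t + c*t^2 = a - 1,
    i.e. the time for the first decimal reduction, which remains
    well defined even for curves that exhibit tailing (c > 0)."""
    roots = np.roots([c, b, 1.0])
    positive = [r.real for r in roots if abs(r.imag) < 1e-9 and r.real > 0]
    return min(positive)
```

With c = 0 the model collapses to classical log-linear kinetics and `dt_like_value` returns the usual decimal reduction time -1/b.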

Relevance: 70.00%

Abstract:

In images with low contrast-to-noise ratio (CNR), the information gain from the observed pixel values can be insufficient to distinguish foreground objects. A Bayesian approach to this problem is to incorporate prior information about the objects into a statistical model. A method for representing spatial prior information as an external field in a hidden Potts model is introduced. This prior distribution over the latent pixel labels is a mixture of Gaussian fields, centred on the positions of the objects at a previous point in time. It is particularly applicable in longitudinal imaging studies, where the manual segmentation of one image can be used as a prior for automatic segmentation of subsequent images. The method is demonstrated by application to cone-beam computed tomography (CT), an imaging modality that exhibits distortions in pixel values due to X-ray scatter. The external field prior results in a substantial improvement in segmentation accuracy, reducing the mean pixel misclassification rate for an electron density phantom from 87% to 6%. The method is also applied to radiotherapy patient data, demonstrating how to derive the external field prior in a clinical context.
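The key ingredient is the external field: a spatial prior over pixel labels centred on the object's position in the previous image. A minimal two-label sketch of constructing such a field is below; the full hidden Potts machinery (neighbour-interaction term, sampling of the latent labels) is omitted, and the Gaussian width is a hypothetical parameter:

```python
import numpy as np

def external_field(shape, centre, sigma):
    """Gaussian external field favouring the object label near `centre`,
    e.g. the object's position in a previous, manually segmented image."""
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    d2 = (yy - centre[0]) ** 2 + (xx - centre[1]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

def label_prior(field):
    """Per-pixel prior over (background, object) labels from the field.
    With several objects, one field per object would enter the mixture."""
    obj = field
    bg = 1.0 - field
    norm = obj + bg  # equals 1 for this two-label case; kept for generality
    return np.stack([bg / norm, obj / norm])
```

In the full model this prior multiplies (in log space, adds to) the Potts neighbour term, pulling the segmentation toward the previously observed object position when the pixel likelihood is uninformative.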

Relevance: 60.00%

Abstract:

Aim – To develop and assess the predictive capabilities of a statistical model that relates routinely collected Trauma Injury Severity Score (TRISS) variables to length of hospital stay (LOS) in survivors of traumatic injury. Method – Retrospective cohort study of adults who sustained a serious traumatic injury, and who survived until discharge from Auckland City, Middlemore, Waikato, or North Shore Hospitals between 2002 and 2006. Cubic-root transformed LOS was analysed using two-level mixed-effects regression models. Results – 1498 eligible patients were identified, 1446 (97%) injured from a blunt mechanism and 52 (3%) from a penetrating mechanism. For blunt mechanism trauma, 1096 (76%) were male, average age was 37 years (range: 15-94 years), and LOS and TRISS score information was available for 1362 patients. Spearman’s correlation and the median absolute prediction error between LOS and the original TRISS model was ρ=0.31 and 10.8 days, respectively, and between LOS and the final multivariable two-level mixed-effects regression model was ρ=0.38 and 6.0 days, respectively. Insufficient data were available for the analysis of penetrating mechanism models. Conclusions – Neither the original TRISS model nor the refined model has sufficient ability to accurately or reliably predict LOS. Additional predictor variables for LOS and other indicators for morbidity need to be considered.
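The modelling step - a cube-root transform of LOS before regression, evaluated by Spearman's correlation between predicted and actual stays - can be sketched on synthetic data. The single-predictor ordinary regression below is a hypothetical simplification (the study fits two-level mixed-effects models on TRISS variables):

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(1)

# Hypothetical severity scores and right-skewed lengths of stay (days)
severity = rng.uniform(0.0, 1.0, 500)
los_days = np.exp(1.0 + 2.0 * severity + rng.normal(0.0, 0.5, 500))

# Cube-root transform tames the right skew before linear modelling
los_cbrt = np.cbrt(los_days)
slope, intercept = np.polyfit(severity, los_cbrt, 1)

# Back-transform predictions to days, then assess rank agreement
predicted = (intercept + slope * severity) ** 3
rho, _ = spearmanr(predicted, los_days)
```

Spearman's rho is the natural metric here because the back-transformed predictions need only rank patients correctly, not match the skewed day counts exactly.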

Relevance: 60.00%

Abstract:

We study the suggestion that Markov switching (MS) models should be used to determine cyclical turning points. A Kalman filter approximation is used to derive the dating rules implicit in such models. We compare these with dating rules in an algorithm that provides a good approximation to the chronology determined by the NBER. We find that there is very little that is attractive in the MS approach when compared with this algorithm. The most important difference relates to robustness. The MS approach depends on the validity of that statistical model. Our approach is valid in a wider range of circumstances.
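The algorithmic alternative to Markov switching referred to here is an NBER-style dating rule: a point is a peak (trough) if it is higher (lower) than its neighbours within a window. A minimal sketch, with the censoring rules (minimum phase length, alternation of peaks and troughs) omitted:

```python
import numpy as np

def turning_points(y, k=2):
    """Flag candidate turning points: y[t] is a peak if it exceeds the k
    values on each side, a trough if it lies below them. This is the core
    of Bry-Boschan-type dating algorithms; censoring rules are omitted."""
    peaks, troughs = [], []
    for t in range(k, len(y) - k):
        window = np.r_[y[t - k:t], y[t + 1:t + k + 1]]
        if y[t] > window.max():
            peaks.append(t)
        elif y[t] < window.min():
            troughs.append(t)
    return peaks, troughs
```

Because the rule makes no distributional assumptions, it stays valid when a Markov switching specification is wrong - the robustness point the abstract draws.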

Relevance: 60.00%

Abstract:

World economies increasingly demand reliable and economical power supply and distribution. To achieve this aim the majority of power systems are becoming interconnected, with several power utilities supplying the one large network. One problem that occurs in a large interconnected power system is the regular occurrence of system disturbances which can result in the creation of intra-area oscillating modes. These modes can be regarded as the transient responses of the power system to excitation, which are generally characterised as decaying sinusoids. For an ideally operating power system these transient responses would have a “ring-down” time of 10-15 seconds. Sometimes equipment failures disturb the ideal operation of power systems and oscillating modes with ring-down times greater than 15 seconds arise. The larger settling times associated with such “poorly damped” modes cause substantial power flows between generation nodes, resulting in significant physical stresses on the power distribution system. If these modes are not just poorly damped but “negatively damped”, catastrophic failures of the system can occur. To ensure system stability and security of large power systems, the potentially dangerous oscillating modes generated from disturbances (such as equipment failure) must be quickly identified. The power utility must then apply appropriate damping control strategies. In power system monitoring there exist two facets of critical interest. The first is the estimation of modal parameters for a power system in normal, stable, operation. The second is the rapid detection of any substantial changes to this normal, stable operation (because of equipment breakdown for example). Most work to date has concentrated on the first of these two facets, i.e. on modal parameter estimation. Numerous modal parameter estimation techniques have been proposed and implemented, but all have limitations [1-13].
One of the key limitations of all existing parameter estimation methods is the fact that they require very long data records to provide accurate parameter estimates. This is a particularly significant problem after a sudden detrimental change in damping. One simply cannot afford to wait long enough to collect the large amounts of data required for existing parameter estimators. Motivated by this gap in the current body of knowledge and practice, the research reported in this thesis focuses heavily on rapid detection of changes (i.e. on the second facet mentioned above). This thesis reports on a number of new algorithms which can rapidly flag whether or not there has been a detrimental change to a stable operating system. It will be seen that the new algorithms enable sudden modal changes to be detected within quite short time frames (typically about 1 minute), using data from power systems in normal operation. The new methods reported in this thesis are summarised below. The Energy Based Detector (EBD): The rationale for this method is that the modal disturbance energy is greater for lightly damped modes than it is for heavily damped modes (because the latter decay more rapidly). Sudden changes in modal energy, then, imply sudden changes in modal damping. Because the method relies on data from power systems in normal operation, the modal disturbances are random. Accordingly, the disturbance energy is modelled as a random process (with the parameters of the model being determined from the power system under consideration). A threshold is then set based on the statistical model. The energy method is very simple to implement and is computationally efficient. It is, however, only able to determine whether or not a sudden modal deterioration has occurred; it cannot identify which mode has deteriorated. For this reason the method is particularly well suited to smaller interconnected power systems that involve only a single mode. 
Optimal Individual Mode Detector (OIMD): As discussed in the previous paragraph, the energy detector can only determine whether or not a change has occurred; it cannot flag which mode is responsible for the deterioration. The OIMD seeks to address this shortcoming. It uses optimal detection theory to test for sudden changes in individual modes. In practice, one can have an OIMD operating for all modes within a system, so that changes in any of the modes can be detected. Like the energy detector, the OIMD is based on a statistical model and a subsequently derived threshold test. The Kalman Innovation Detector (KID): This detector is an alternative to the OIMD. Unlike the OIMD, however, it does not explicitly monitor individual modes. Rather it relies on a key property of a Kalman filter, namely that the Kalman innovation (the difference between the estimated and observed outputs) is white as long as the Kalman filter model is valid. A Kalman filter model is set to represent a particular power system. If some event in the power system (such as equipment failure) causes a sudden change to the power system, the Kalman model will no longer be valid and the innovation will no longer be white. Furthermore, if there is a detrimental system change, the innovation spectrum will display strong peaks at frequency locations associated with the changes. Hence the innovation spectrum can be monitored both to set off an “alarm” when a change occurs and to identify which modal frequency has given rise to the change. The threshold for alarming is based on the simple Chi-Squared PDF for a normalised white noise spectrum [14, 15]. While the method can identify the mode which has deteriorated, it does not necessarily indicate whether there has been a frequency or damping change. The PPM discussed next can monitor frequency changes and so can provide some discrimination in this regard.
The Polynomial Phase Method (PPM): In [16] the cubic phase (CP) function was introduced as a tool for revealing frequency related spectral changes. This thesis extends the cubic phase function to a generalised class of polynomial phase functions which can reveal frequency related spectral changes in power systems. A statistical analysis of the technique is performed. When applied to power system analysis, the PPM can provide knowledge of sudden shifts in frequency through both the new frequency estimate and the polynomial phase coefficient information. This knowledge can be then cross-referenced with other detection methods to provide improved detection benchmarks.
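The Kalman Innovation Detector's whiteness test can be sketched concretely: for a white innovation sequence, the scaled periodogram ordinates are approximately chi-squared with 2 degrees of freedom, so a quantile of that distribution gives the alarm threshold. The multiple-comparison (Šidák-style) correction below is an illustrative choice, not necessarily the thesis's exact test:

```python
import numpy as np
from scipy.stats import chi2

def innovation_alarm(innovation, alpha=0.01):
    """Alarm on loss of innovation whiteness.

    Under whiteness, 2*I(f)/sigma^2 ~ chi2(2) for each interior
    periodogram ordinate I(f). Any ordinate above the corrected
    (1 - alpha) quantile triggers the alarm, and its bin index
    identifies the modal frequency responsible.
    """
    x = innovation - innovation.mean()
    n = len(x)
    pxx = np.abs(np.fft.rfft(x)) ** 2 / n            # periodogram
    sigma2 = x.var()
    scaled = 2.0 * pxx[1:-1] / sigma2                # ~ chi2(2) if white
    m = len(scaled)
    threshold = chi2.ppf((1.0 - alpha) ** (1.0 / m), df=2)  # Sidak correction
    hits = np.nonzero(scaled > threshold)[0] + 1     # rfft bin numbers
    return hits  # empty array => innovation still white, no alarm
```

A narrowband component appearing in the innovation (a deteriorated mode) concentrates energy in a few bins, which clear the threshold while the broadband remainder does not.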

Relevance: 60.00%

Abstract:

This thesis investigates aspects of encoding the speech spectrum at low bit rates, with extensions to the effect of such coding on automatic speaker identification. Vector quantization (VQ) is a technique for jointly quantizing a block of samples at once, in order to reduce the bit rate of a coding system. The major drawback in using VQ is the complexity of the encoder. Recent research has indicated the potential applicability of the VQ method to speech when product code vector quantization (PCVQ) techniques are utilized. The focus of this research is the efficient representation, calculation and utilization of the speech model as stored in the PCVQ codebook. In this thesis, several VQ approaches are evaluated, and the efficacy of two training algorithms is compared experimentally. It is then shown that these product-code vector quantization algorithms may be augmented with lossless compression algorithms, thus yielding an improved overall compression rate. An approach using a statistical model for the vector codebook indices for subsequent lossless compression is introduced. This coupling of lossy compression and lossless compression enables further compression gain. It is demonstrated that this approach is able to reduce the bit rate requirement from the current 24 bits per 20 millisecond frame to below 20, using a standard spectral distortion metric for comparison. Several fast-search VQ methods for use in speech spectrum coding have been evaluated. The usefulness of fast-search algorithms is highly dependent upon the source characteristics and, although previous research has been undertaken for coding of images using VQ codebooks trained with the source samples directly, the product-code structured codebooks for speech spectrum quantization place new constraints on the search methodology. The second major focus of the research is an investigation of the effect of low-rate spectral compression methods on the task of automatic speaker identification.
The motivation for this aspect of the research arose from a need to simultaneously preserve the speech quality and intelligibility and to provide for machine-based automatic speaker recognition using the compressed speech. This is important because there are several emerging applications of speaker identification where compressed speech is involved. Examples include mobile communications where the speech has been highly compressed, or where a database of speech material has been assembled and stored in compressed form. Although these two application areas have the same objective - that of maximizing the identification rate - the starting points are quite different. On the one hand, the speech material used for training the identification algorithm may or may not be available in compressed form. On the other hand, the new test material on which identification is to be based may only be available in compressed form. Using the spectral parameters which have been stored in compressed form, two main classes of speaker identification algorithm are examined. Some studies have been conducted in the past on bandwidth-limited speaker identification, but the use of short-term spectral compression deserves separate investigation. Combining the major aspects of the research, some important design guidelines for the construction of an identification model when based on the use of compressed speech are put forward.
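The coupling of lossy VQ with lossless coding of the indices can be sketched generically: train a codebook, quantize, then measure the empirical entropy of the index stream - the rate a lossless coder could approach, versus the fixed log2(codebook size) bits per index. This plain k-means sketch stands in for the thesis's product-code (PCVQ) structure:

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)

def train_codebook(vectors, n_codes=8, iters=20):
    """Plain k-means (LBG-style) codebook training.
    Returns the codebook and each vector's nearest-code index."""
    codebook = vectors[rng.choice(len(vectors), n_codes, replace=False)]
    for _ in range(iters):
        d = ((vectors[:, None, :] - codebook[None]) ** 2).sum(-1)
        idx = d.argmin(1)
        for k in range(n_codes):
            if np.any(idx == k):
                codebook[k] = vectors[idx == k].mean(0)
    return codebook, idx

def index_entropy_bits(idx):
    """Empirical entropy of the index stream in bits per index.
    Entropy below log2(n_codes) is the headroom a lossless coder
    of the indices can exploit."""
    counts = np.array(list(Counter(idx.tolist()).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())
```

When the source uses some codes far more often than others, the index entropy drops well below the fixed rate, which is the compression gain the statistical index model targets.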

Relevance: 60.00%

Abstract:

A national-level safety analysis tool is needed to complement existing analytical tools for assessment of the safety impacts of roadway design alternatives. FHWA has sponsored the development of the Interactive Highway Safety Design Model (IHSDM), which is roadway design and redesign software that estimates the safety effects of alternative designs. Considering the importance of IHSDM in shaping the future of safety-related transportation investment decisions, FHWA justifiably sponsored research with the sole intent of independently validating some of the statistical models and algorithms in IHSDM. Statistical model validation aims to accomplish many important tasks, including (a) assessment of the logical defensibility of proposed models, (b) assessment of the transferability of models over future time periods and across different geographic locations, and (c) identification of areas in which future model improvements should be made. These three activities are reported for five proposed types of rural intersection crash prediction models. The internal validation of the model revealed that the crash models potentially suffer from omitted variables that affect safety, site selection and countermeasure selection bias, poorly measured and surrogate variables, and misspecification of model functional forms. The external validation indicated the inability of models to perform on par with model estimation performance. Recommendations for improving the state of the practice from this research include the systematic conduct of carefully designed before-and-after studies, improvements in data standardization and collection practices, and the development of analytical methods to combine the results of before-and-after studies with cross-sectional studies in a meaningful and useful way.

Relevance: 60.00%

Abstract:

Background, aim, and scope Urban motor vehicle fleets are a major source of particulate matter pollution, especially of ultrafine particles (diameters < 0.1 µm), and exposure to particulate matter has known serious health effects. A considerable body of literature is available on vehicle particle emission factors derived using a wide range of different measurement methods for different particle sizes, conducted in different parts of the world. Therefore, the choice of the most suitable particle emission factors to use in transport modelling and health impact assessments presents a very difficult task. The aim of this study was to derive a comprehensive set of tailpipe particle emission factors for different vehicle and road type combinations, covering the full size range of particles emitted, which are suitable for modelling urban fleet emissions. Materials and methods A large body of data available in the international literature on particle emission factors for motor vehicles derived from measurement studies was compiled and subjected to advanced statistical analysis, to determine the most suitable emission factors to use in modelling urban fleet emissions. Results This analysis resulted in the development of five statistical models which explained 86%, 93%, 87%, 65% and 47% of the variation in published emission factors for particle number, particle volume, PM1, PM2.5 and PM10 respectively. A sixth model for total particle mass was proposed but no significant explanatory variables were identified in the analysis. From the outputs of these statistical models, the most suitable particle emission factors were selected.
This selection was based on examination of the statistical robustness of the model outputs, including consideration of conservative average particle emission factors with the lowest standard errors, narrowest 95% confidence intervals and largest sample sizes, and on the explanatory model variables, which were Vehicle Type (all particle metrics), Instrumentation (particle number and PM2.5), Road Type (PM10), and Size Range Measured and Speed Limit on the Road (particle volume). Discussion A multiplicity of factors needs to be considered in determining emission factors that are suitable for modelling motor vehicle emissions, and this study derived a set of average emission factors suitable for quantifying motor vehicle tailpipe particle emissions in developed countries. Conclusions The comprehensive set of tailpipe particle emission factors presented in this study for different vehicle and road type combinations enables the full size range of particles generated by fleets to be quantified, including ultrafine particles (measured in terms of particle number). These emission factors have particular application for regions which may lack funding to undertake measurements, or have insufficient measurement data upon which to derive emission factors for their region. Recommendations and perspectives In urban areas motor vehicles continue to be a major source of particulate matter pollution and of ultrafine particles. It is critical that, in order to manage this major pollution source, methods are available to quantify the full size range of particles emitted, for traffic modelling and health impact assessments.