966 results for Bayesian point estimate
Abstract:
Genetic research of complex diseases is a challenging, but exciting, area of research. The early development of the research was limited, however, until the completion of the Human Genome and HapMap projects, along with the reduction in the cost of genotyping, paved the way for understanding the genetic composition of complex diseases. In this thesis, we focus on the statistical methods for two aspects of genetic research: phenotype definition for diseases with complex etiology, and methods for identifying potentially associated Single Nucleotide Polymorphisms (SNPs) and SNP-SNP interactions. With regard to phenotype definition for diseases with complex etiology, we first investigate the effects of different statistical phenotyping approaches on the subsequent analysis. In light of the findings, and the difficulties in validating the estimated phenotype, we propose two different methods for reconciling phenotypes of different models, using Bayesian model averaging as a coherent mechanism for accounting for model uncertainty. In the second part of the thesis, the focus turns to methods for identifying associated SNPs and SNP interactions. We review the use of Bayesian logistic regression with variable selection for SNP identification and extend the model to detect interaction effects in population-based case-control studies. In this part of the study, we also develop a machine learning algorithm to cope with large-scale data analysis, namely modified Logic Regression with Genetic Program (MLR-GEP), which is then compared with the Bayesian model, Random Forests and other variants of logic regression.
Abstract:
Circuit breakers (CBs) are subject to electrical stresses from restrikes during capacitor bank operation. These stresses are caused by the overvoltages across the CB, the interrupting currents and the rate of rise of recovery voltage (RRRV). They also depend on the type of system grounding and the type of dielectric strength curve. The aim of this study is to demonstrate a restrike waveform predictive model for an SF6 CB that considers both grounded and non-grounded systems, to compare the computation accuracy obtained with the cold withstand dielectric strength curve and the hot recovery dielectric strength curve, and to include point-on-wave (POW) recommendations for assessing how the remaining life of the CB can be increased. SF6 CB stresses in a typical 400 kV system were simulated, and the results are presented. The simulated restrike waveforms, with features identified using the wavelet transform, can be used to develop a restrike diagnostic algorithm that locates a substation with breaker restrikes. This study found that, in the restrike simulation results, the hot withstand dielectric strength curve has a lower magnitude than the cold withstand dielectric strength curve. Computation accuracy improved with the hot withstand dielectric strength curve, and POW controlled switching can increase the life of an SF6 CB.
Abstract:
Markov chain Monte Carlo (MCMC) estimation provides a solution to the complex integration problems that are faced in the Bayesian analysis of statistical problems. The implementation of MCMC algorithms is, however, code intensive and time consuming. We have developed a Python package, called PyMCMC, that aids in the construction of MCMC samplers, substantially reduces the likelihood of coding errors, and helps to minimise repetitive code. PyMCMC contains classes for Gibbs, Metropolis-Hastings, independent Metropolis-Hastings, random walk Metropolis-Hastings, orientational bias Monte Carlo and slice samplers, as well as specific modules for common models, such as a module for Bayesian regression analysis. PyMCMC is straightforward to optimise, taking advantage of the Python libraries NumPy and SciPy, as well as being readily extensible with C or Fortran.
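To make concrete the kind of sampler such a package wraps, here is a minimal random walk Metropolis-Hastings sketch written directly in NumPy. It does not use PyMCMC or reproduce its API; the function names `random_walk_metropolis` and `log_posterior_normal_mean`, the step size and the toy model (a normal mean with known unit variance and a flat prior) are all illustrative assumptions.

```python
import numpy as np


def random_walk_metropolis(log_post, x0, n_samples, step=0.5, seed=0):
    """Random walk Metropolis-Hastings for a scalar parameter.

    log_post: function returning the log posterior up to an additive constant.
    Returns the chain and the fraction of accepted proposals.
    """
    rng = np.random.default_rng(seed)
    samples = np.empty(n_samples)
    x = float(x0)
    lp = log_post(x)
    accepted = 0
    for i in range(n_samples):
        proposal = x + step * rng.normal()
        lp_prop = log_post(proposal)
        # The Gaussian proposal is symmetric, so the acceptance ratio
        # reduces to the ratio of posterior densities.
        if np.log(rng.uniform()) < lp_prop - lp:
            x, lp = proposal, lp_prop
            accepted += 1
        samples[i] = x
    return samples, accepted / n_samples


def log_posterior_normal_mean(mu, data):
    """Toy model (assumption): data ~ N(mu, 1) with a flat prior on mu."""
    return -0.5 * np.sum((data - mu) ** 2)


if __name__ == "__main__":
    data = np.random.default_rng(1).normal(2.0, 1.0, size=25)
    chain, rate = random_walk_metropolis(
        lambda mu: log_posterior_normal_mean(mu, data), x0=0.0, n_samples=20000
    )
    burned = chain[2000:]  # discard burn-in before summarising
    print("posterior mean estimate:", burned.mean())
    print("acceptance rate:", rate)
```

Under the flat prior, the posterior mean equals the sample mean of the data, which gives a simple check on the chain.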
Abstract:
BACKGROUND - High-density lipoprotein (HDL) protects against arterial atherothrombosis, but it is unknown whether it protects against recurrent venous thromboembolism. METHODS AND RESULTS - We studied 772 patients after a first spontaneous venous thromboembolism (average follow-up 48 months) and recorded the end point of symptomatic recurrent venous thromboembolism, which developed in 100 of the 772 patients. The relationship between plasma lipoprotein parameters and recurrence was evaluated. Plasma apolipoproteins AI and B were measured by immunoassays for all subjects. Compared with those without recurrence, patients with recurrence had lower mean (±SD) levels of apolipoprotein AI (1.12±0.22 versus 1.23±0.27 mg/mL, P<0.001) but similar apolipoprotein B levels. The relative risk of recurrence was 0.87 (95% CI, 0.80 to 0.94) for each increase of 0.1 mg/mL in plasma apolipoprotein AI. Compared with patients with apolipoprotein AI levels in the lowest tertile (<1.07 mg/mL), the relative risk of recurrence was 0.46 (95% CI, 0.27 to 0.77) for the highest-tertile patients (apolipoprotein AI >1.30 mg/mL) and 0.78 (95% CI, 0.50 to 1.22) for midtertile patients (apolipoprotein AI of 1.07 to 1.30 mg/mL). Using nuclear magnetic resonance, we determined the levels of 10 major lipoprotein subclasses and HDL cholesterol for 71 patients with recurrence and 142 matched patients without recurrence. We found a strong trend for association between recurrence and low levels of HDL particles and HDL cholesterol. CONCLUSIONS - Patients with high levels of apolipoprotein AI and HDL have a decreased risk of recurrent venous thromboembolism. © 2007 American Heart Association, Inc.
Abstract:
We study model selection strategies based on penalized empirical loss minimization. We point out a tight relationship between error estimation and data-based complexity penalization: any good error estimate may be converted into a data-based penalty function, and the performance of the resulting penalized estimate is governed by the quality of the error estimate. We consider several penalty functions, involving error estimates on independent test data, empirical VC dimension, empirical VC entropy, and margin-based quantities. We also consider the maximal difference between the error on the first half of the training data and the second half, and the expected maximal discrepancy, a closely related capacity estimate that can be calculated by Monte Carlo integration. Maximal discrepancy penalty functions are appealing for pattern classification problems, since their computation is equivalent to empirical risk minimization over the training data with some labels flipped.
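The label-flipping equivalence in the last sentence can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' code: the hypothesis class (1-D threshold classifiers, "stumps"), the sign convention (first-half error minus second-half error) and all function names are chosen here for the example. For 0-1 loss and an even sample size n, flipping the labels of the first half gives: maximal discrepancy = 1 - 2 × (minimal empirical error on the flipped sample).

```python
import numpy as np


def stump_predictions(x):
    """Predictions of all 1-D threshold classifiers (both orientations).

    Returns an array of shape (n_hypotheses, n) with entries in {-1, +1}.
    """
    xs = np.sort(x)
    thresholds = np.concatenate(
        ([xs[0] - 1.0], (xs[:-1] + xs[1:]) / 2.0, [xs[-1] + 1.0])
    )
    preds = np.where(x[None, :] > thresholds[:, None], 1, -1)
    return np.vstack([preds, -preds])


def maximal_discrepancy_direct(x, y):
    """Max over hypotheses of (error on first half) - (error on second half)."""
    half = len(y) // 2
    errors = stump_predictions(x) != y[None, :]
    gap = errors[:, :half].mean(axis=1) - errors[:, half:].mean(axis=1)
    return gap.max()


def maximal_discrepancy_by_flipping(x, y):
    """Same quantity via empirical risk minimization with flipped labels."""
    assert len(y) % 2 == 0, "the identity below assumes two equal halves"
    half = len(y) // 2
    y_flipped = y.copy()
    y_flipped[:half] *= -1  # flip the labels of the first half only
    errors = stump_predictions(x) != y_flipped[None, :]
    min_error = errors.mean(axis=1).min()  # ERM over the stump class
    return 1.0 - 2.0 * min_error


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=40)
    y = np.where(x + 0.5 * rng.normal(size=40) > 0, 1, -1)
    print(maximal_discrepancy_direct(x, y))
    print(maximal_discrepancy_by_flipping(x, y))
```

The two functions should agree up to floating-point rounding, because an error on a flipped label is exactly a correct prediction on the original label.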
Abstract:
In semisupervised learning (SSL), a predictive model is learned from a collection of labeled data and a typically much larger collection of unlabeled data. This paper presents a framework called multi-view point cloud regularization (MVPCR), which unifies and generalizes several semisupervised kernel methods that are based on data-dependent regularization in reproducing kernel Hilbert spaces (RKHSs). Special cases of MVPCR include coregularized least squares (CoRLS), manifold regularization (MR), and graph-based SSL. An accompanying theorem shows how to reduce any MVPCR problem to standard supervised learning with a new multi-view kernel.
Abstract:
The flood flow in urbanised areas constitutes a major hazard to the population and infrastructure, as seen during the summer 2010-2011 floods in Queensland (Australia). Flood flows in urban environments have been studied only relatively recently, and no study has considered the impact of turbulence in the flow. During the 12-13 January 2011 flood of the Brisbane River, turbulence measurements were conducted in an inundated urban environment in Gardens Point Road, next to Brisbane's central business district (CBD), at relatively high frequency (50 Hz). The properties of the sediment flood deposits were characterised, and the acoustic Doppler velocimeter unit was calibrated to obtain both instantaneous velocity components and suspended sediment concentration in the same sampling volume with the same temporal resolution. While the flow motion in Gardens Point Road was subcritical, the water elevations and velocities fluctuated with a distinctive period between 50 and 80 s. The low-frequency fluctuations were linked with local topographic effects, i.e., a local choke induced by an upstream constriction between stairwells caused slow oscillations with a period close to the natural sloshing period of the car park. The instantaneous velocity data were analysed using a triple decomposition, and the same triple decomposition was applied to the water depth, velocity flux, suspended sediment concentration and suspended sediment flux data. The velocity fluctuation data showed a large energy component in the slow fluctuation range. For the first two tests at z = 0.35 m, the turbulence data suggested some isotropy. At z = 0.083 m, on the other hand, the findings indicated some flow anisotropy. The suspended sediment concentration (SSC) data presented a general trend of increasing SSC with decreasing water depth. During one test (T4), long-period oscillations were observed with a period of about 18 minutes. The cause of these oscillations remains unknown to the authors. The last test (T5) took place in very shallow waters with high suspended sediment concentrations. It is suggested that the flow in the car park was disconnected from the main channel. Overall, the flow conditions at the sampling sites corresponded to a specific momentum between 0.2 and 0.4 m², which would be near the upper end of the scale for safe evacuation of individuals in flooded areas. However, the authors do not believe that the evacuation of individuals in Gardens Point Road would have been safe, because of the intense water surges and flow turbulence. More generally, any criterion for safe evacuation based solely upon the flow velocity, water depth or specific momentum cannot account for the hazards caused by flow turbulence, water depth fluctuations and water surges.
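The triple decomposition mentioned above splits an instantaneous signal into a slowly varying mean, a slow fluctuation and a turbulent fluctuation. The sketch below illustrates the idea with two centred moving averages. The abstract does not state the filters or cut-off periods used in the study, so the moving-average approach, the window lengths and the names `moving_average` and `triple_decomposition` are assumptions made only for this illustration.

```python
import numpy as np


def moving_average(signal, window):
    """Centred moving average with edge padding; output has the input length."""
    window = int(window) | 1  # force an odd window so the filter is centred
    pad = window // 2
    padded = np.pad(signal, pad, mode="edge")
    kernel = np.ones(window) / window
    return np.convolve(padded, kernel, mode="valid")


def triple_decomposition(v, fs, slow_period_s, fast_period_s):
    """Split v into mean, slow-fluctuation and turbulent components.

    fs: sampling frequency in Hz.
    slow_period_s, fast_period_s: averaging windows in seconds (hypothetical
    cut-offs, with slow_period_s > fast_period_s).
    """
    mean_part = moving_average(v, slow_period_s * fs)  # very slow trend
    smooth = moving_average(v, fast_period_s * fs)     # trend plus slow motion
    slow_part = smooth - mean_part                     # slow fluctuation
    turbulent_part = v - smooth                        # remaining fast motion
    return mean_part, slow_part, turbulent_part


if __name__ == "__main__":
    fs = 50.0  # Hz, matching the sampling rate quoted in the abstract
    t = np.arange(0.0, 120.0, 1.0 / fs)
    rng = np.random.default_rng(0)
    # Synthetic velocity: constant mean, a 60 s oscillation and random noise.
    v = 0.5 + 0.1 * np.sin(2 * np.pi * t / 60.0) + 0.02 * rng.normal(size=t.size)
    mean_part, slow_part, turb_part = triple_decomposition(v, fs, 60.0, 2.0)
    print("max reconstruction error:",
          np.abs(mean_part + slow_part + turb_part - v).max())
```

By construction the three components sum back to the original signal; the same function could be applied to water depth or suspended sediment concentration series.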
Abstract:
Axon guidance by molecular gradients plays a crucial role in wiring up the nervous system. However, the mechanisms axons use to detect gradients are largely unknown. We first develop a Bayesian “ideal observer” analysis of gradient detection by axons, based on the hypothesis that a principal constraint on gradient detection is intrinsic receptor binding noise. Second, from this model, we derive an equation predicting how the degree of response of an axon to a gradient should vary with gradient steepness and absolute concentration. Third, we confirm this prediction quantitatively by performing the first systematic experimental analysis of how axonal response varies with both these quantities. These experiments demonstrate a degree of sensitivity much higher than previously reported for any chemotacting system. Together, these results reveal both the quantitative constraints that must be satisfied for effective axonal guidance and the computational principles that may be used by the underlying signal transduction pathways, and allow predictions for the degree of response of axons to gradients in a wide variety of in vivo and in vitro settings.