978 results for Approximate Bayesian Computation
Abstract:
Objective: We aimed to predict sub-national spatial variation in numbers of people infected with Schistosoma haematobium, and associated uncertainties, in Burkina Faso, Mali and Niger, prior to implementation of national control programmes. Methods: We used national field survey datasets covering a contiguous area of 2,750 × 850 km, from 26,790 school-aged children (5–14 years) in 418 schools. Bayesian geostatistical models were used to predict prevalence of high- and low-intensity infections and associated 95% credible intervals (CrI). Numbers infected were determined by multiplying predicted prevalence by numbers of school-aged children in 1 km² pixels covering the study area. Findings: Numbers of school-aged children with low-intensity infections were: 433,268 in Burkina Faso, 872,328 in Mali and 580,286 in Niger. Numbers with high-intensity infections were: 416,009 in Burkina Faso, 511,845 in Mali and 254,150 in Niger. 95% CrIs (indicative of uncertainty) were wide; e.g. the mean number of boys aged 10–14 years infected in Mali was 140,200 (95% CrI 6,200–512,100). Conclusion: National aggregate estimates for numbers infected mask important local variation, e.g. most S. haematobium infections in Niger occur in the Niger River valley. Prevalence of high-intensity infections was strongly clustered in foci in western and central Mali, north-eastern and north-western Burkina Faso and the Niger River valley in Niger. Populations in these foci are likely to carry the bulk of the urinary schistosomiasis burden and should receive priority for schistosomiasis control. Uncertainties in predicted prevalence and numbers infected should be acknowledged and taken into consideration by control programme planners.
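The step from predicted prevalence to numbers infected can be sketched as follows. This is a minimal illustration, assuming posterior draws of prevalence are available per pixel; the function name `numbers_infected`, the array shapes and the toy values are hypothetical and are not taken from the study.

```python
import numpy as np

def numbers_infected(prevalence_draws, population):
    """Summarise the total number infected across pixels.

    prevalence_draws: array (n_draws, n_pixels) of posterior prevalence draws.
    population: array (n_pixels,) of school-aged children per pixel.
    Returns (posterior mean, 2.5th percentile, 97.5th percentile).
    """
    # Multiply prevalence by population in each pixel, then sum over pixels
    # within each posterior draw so that uncertainty is carried through.
    totals = (np.asarray(prevalence_draws) * np.asarray(population)).sum(axis=1)
    lower, upper = np.percentile(totals, [2.5, 97.5])
    return float(totals.mean()), float(lower), float(upper)

# Toy data (made up): 1000 posterior draws for 4 pixels.
rng = np.random.default_rng(0)
toy_draws = rng.beta(2, 8, size=(1000, 4))
toy_population = np.array([120, 300, 50, 80])
toy_mean, toy_lower, toy_upper = numbers_infected(toy_draws, toy_population)
```

Summing within each draw before taking percentiles is what gives an interval for the aggregate count; taking percentiles pixel by pixel and adding them would not.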
Abstract:
Definition of disease phenotype is a necessary preliminary to research into genetic causes of a complex disease. Clinical diagnosis of migraine is currently based on diagnostic criteria developed by the International Headache Society. Previously, we examined the natural clustering of these diagnostic symptoms using latent class analysis (LCA) and found that a four-class model was preferred. However, the classes can be ordered such that all symptoms progressively intensify, suggesting that a single continuous variable representing disease severity may provide a better model. Here, we compare two models: item response theory and LCA, each constructed within a Bayesian context. A deviance information criterion is used to assess model fit. We phenotyped our population sample using these models, estimated heritability and conducted genome-wide linkage analysis using Merlin-qtl. LCA with four classes was again preferred. After transformation, phenotypic trait values derived from both models are highly correlated (correlation = 0.99) and consequently results from subsequent genetic analyses were similar. Heritability was estimated at 0.37, while multipoint linkage analysis produced genome-wide significant linkage to chromosome 7q31-q33 and suggestive linkage to chromosomes 1 and 2. We argue that such continuous measures are a powerful tool for identifying genes contributing to migraine susceptibility.
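For readers unfamiliar with the deviance information criterion, its standard definition is DIC = D̄ + p_D, where D̄ is the posterior mean deviance and p_D = D̄ − D(θ̄) is the effective number of parameters. A minimal sketch, assuming the deviance has already been evaluated at each posterior draw and at the posterior mean of the parameters (the function name `dic` and the toy numbers are illustrative only):

```python
import numpy as np

def dic(deviance_draws, deviance_at_posterior_mean):
    """Return (DIC, p_D) from posterior deviance draws.

    deviance_draws: deviance evaluated at each posterior draw.
    deviance_at_posterior_mean: deviance evaluated at the posterior mean
    of the parameters.
    """
    d_bar = float(np.mean(deviance_draws))      # posterior mean deviance
    p_d = d_bar - deviance_at_posterior_mean    # effective number of parameters
    return d_bar + p_d, p_d

# Toy numbers (made up) to show the arithmetic.
toy_dic, toy_pd = dic([10.0, 12.0, 14.0], 11.0)
```

When candidate models are fitted to the same data, the one with the lower DIC is preferred.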
Abstract:
Perez-Losada et al. [1] analyzed 72 complete genomes corresponding to nine mammalian (67 strains) and two avian (5 strains) polyomavirus species using maximum likelihood and Bayesian methods of phylogenetic inference. Because the data for two of those genomes are no longer available in GenBank, we analyze here the phylogenetic relationships of the remaining 70 complete genomes, corresponding to nine mammalian (65 strains) and two avian (5 strains) polyomavirus species, using a dynamical language model approach developed by our group (Yu et al., [26]). This distance method does not require sequence alignment; it derives the species phylogeny from overall similarities between the complete genomes. Our best tree separates the bird polyomaviruses (avian polyomaviruses and goose hemorrhagic polyomaviruses) from the mammalian polyomaviruses, which supports the idea of splitting the genus into two subgenera. Such a split is consistent with the different viral life strategies of each group. Within the mammalian polyomavirus subgenus, mouse polyomaviruses (MPV), simian virus 40 (SV40), BK viruses (BKV) and JC viruses (JCV) are grouped as different branches, as expected. The topology of our best tree is quite similar to that of the tree constructed by Perez-Losada et al.
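To illustrate what an alignment-free distance between sequences can look like, the sketch below compares k-mer frequency profiles with a cosine distance. This is a deliberately simpler stand-in and not the dynamical language model approach of Yu et al.; the function names, the choice k = 3 and the toy sequences are assumptions made for illustration.

```python
from collections import Counter
from math import sqrt

def kmer_profile(seq, k=3):
    """Relative frequencies of all overlapping k-mers in a sequence."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}

def cosine_distance(p, q):
    """1 - cosine similarity between two k-mer frequency profiles."""
    dot = sum(value * q.get(kmer, 0.0) for kmer, value in p.items())
    norm_p = sqrt(sum(v * v for v in p.values()))
    norm_q = sqrt(sum(v * v for v in q.values()))
    return 1.0 - dot / (norm_p * norm_q)

# Toy sequences (made up); no alignment is performed at any point.
seq_a = "ATGCGATACGCTTGA"
seq_b = "ATGCGATTCGCTTGA"
distance_ab = cosine_distance(kmer_profile(seq_a), kmer_profile(seq_b))
```

A matrix of such pairwise distances can then be passed to a distance-based tree-building method such as neighbour joining.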
Abstract:
Industrial applications of simulated-moving-bed (SMB) chromatographic technology have created an emerging demand to improve SMB process operation for higher efficiency and better robustness. Improved process modelling and more efficient model computation will pave the way to meeting this demand. However, the SMB unit operation exhibits complex dynamics, leading to challenges in SMB process modelling and model computation. One significant problem is how to quickly obtain the steady state of an SMB process model, as process metrics at the steady state are critical for process design and real-time control. The conventional computation method, which solves the process model cycle by cycle and takes the solution only when a cyclic steady state is reached after a certain number of switchings, is computationally expensive. Adopting the concept of the quasi-envelope (QE), this work treats the SMB operation as a pseudo-oscillatory process because of its large number of continuous switchings. An innovative QE computation scheme is then developed to quickly obtain the steady-state solution of an SMB model for an arbitrary initial condition. The QE computation scheme allows larger steps to be taken when predicting the slow change of the starting state within each switching. Combined with a wavelet-based technique, this scheme is demonstrated to be effective and efficient for an SMB sugar separation process. Moreover, investigations are carried out on when the computation scheme should be activated and how the convergence of the scheme is affected by a variable step size.
Abstract:
Object tracking systems require accurate segmentation of the objects from the background for effective tracking. Motion segmentation or optical flow can be used to segment incoming images. Whilst optical flow allows multiple moving targets to be separated based on their individual velocities, optical flow techniques are prone to errors caused by changing lighting and occlusions, both common in a surveillance environment. Motion segmentation techniques are more robust to fluctuating lighting and occlusions, but do not provide information on the direction of the motion. In this paper we propose a combined motion segmentation/optical flow algorithm for use in object tracking. The proposed algorithm uses the motion segmentation results to inform the optical flow calculations, ensuring that optical flow is only calculated in regions of motion and improving the performance of the optical flow around the edges of moving objects. Optical flow is calculated at pixel resolution, and tracking of flow vectors is employed to improve performance and detect discontinuities, which can indicate the location of overlaps between objects. The algorithm is evaluated by attempting to extract a moving target within the flow images, given expected horizontal and vertical movement (i.e. the algorithm's intended use for object tracking). Results show that the proposed algorithm outperforms other widely used optical flow techniques for this surveillance application.
Abstract:
The main objective of this PhD was to further develop Bayesian spatio-temporal models (specifically the Conditional Autoregressive (CAR) class of models), for the analysis of sparse disease outcomes such as birth defects. The motivation for the thesis arose from problems encountered when analyzing a large birth defect registry in New South Wales. The specific components and related research objectives of the thesis were developed from gaps in the literature on current formulations of the CAR model, and health service planning requirements. Data from a large probabilistically-linked database from 1990 to 2004, consisting of fields from two separate registries: the Birth Defect Registry (BDR) and Midwives Data Collection (MDC) were used in the analyses in this thesis. The main objective was split into smaller goals. The first goal was to determine how the specification of the neighbourhood weight matrix will affect the smoothing properties of the CAR model, and this is the focus of chapter 6. Secondly, I hoped to evaluate the usefulness of incorporating a zero-inflated Poisson (ZIP) component as well as a shared-component model in terms of modeling a sparse outcome, and this is carried out in chapter 7. The third goal was to identify optimal sampling and sample size schemes designed to select individual level data for a hybrid ecological spatial model, and this is done in chapter 8. Finally, I wanted to put together the earlier improvements to the CAR model, and along with demographic projections, provide forecasts for birth defects at the SLA level. Chapter 9 describes how this is done. For the first objective, I examined a series of neighbourhood weight matrices, and showed how smoothing the relative risk estimates according to similarity by an important covariate (i.e. maternal age) helped improve the model’s ability to recover the underlying risk, as compared to the traditional adjacency (specifically the Queen) method of applying weights. 
Next, to address the sparseness and excess zeros commonly encountered in the analysis of rare outcomes such as birth defects, I compared a few models, including an extension of the usual Poisson model to encompass excess zeros in the data. This was achieved via a mixture model, which also encompassed the shared component model to improve on the estimation of sparse counts through borrowing strength across a shared component (e.g. latent risk factor/s) with the referent outcome (caesarean section was used in this example). Using the Deviance Information Criterion (DIC), I showed how the proposed model performed better than the usual models, but only when both outcomes shared a strong spatial correlation. The next objective involved identifying the optimal sampling and sample size strategy for incorporating individual-level data with areal covariates in a hybrid study design. I performed extensive simulation studies, evaluating thirteen different sampling schemes along with variations in sample size. This was done in the context of an ecological regression model that incorporated spatial correlation in the outcomes, as well as accommodating both individual and areal measures of covariates. Using the Average Mean Squared Error (AMSE), I showed how a simple random sample of 20% of the SLAs, followed by selecting all cases in the SLAs chosen, along with an equal number of controls, provided the lowest AMSE. The final objective involved combining the improved spatio-temporal CAR model with population (i.e. women) forecasts, to provide 30-year annual estimates of birth defects at the Statistical Local Area (SLA) level in New South Wales, Australia. The projections were illustrated using sixteen different SLAs, representing the various areal measures of socio-economic status and remoteness. A sensitivity analysis of the assumptions used in the projection was also undertaken.
By the end of the thesis, I will show how challenges in the spatial analysis of rare diseases such as birth defects can be addressed, by specifically formulating the neighbourhood weight matrix to smooth according to a key covariate (i.e. maternal age), incorporating a ZIP component to model excess zeros in outcomes and borrowing strength from a referent outcome (i.e. caesarean counts). An efficient strategy to sample individual-level data and sample size considerations for rare disease will also be presented. Finally, projections in birth defect categories at the SLA level will be made.
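The zero-inflated Poisson (ZIP) component mentioned above can be illustrated with its standard probability mass function: with probability pi an observation is a structural zero, and otherwise it follows a Poisson distribution with mean lam. The sketch below shows only this textbook mixture, not the spatial CAR or shared-component structure of the thesis; the function name `zip_pmf` and the parameter values are illustrative.

```python
from math import exp, factorial

def zip_pmf(y, lam, pi):
    """P(Y = y) under a zero-inflated Poisson with rate lam and
    structural-zero probability pi."""
    poisson = exp(-lam) * lam ** y / factorial(y)
    if y == 0:
        # Zeros arise either structurally or from the Poisson part.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson

# With lam = 2 and 30% structural zeros, zeros are far more likely
# than under a plain Poisson with the same rate.
p_zero_zip = zip_pmf(0, 2.0, 0.3)
p_zero_poisson = zip_pmf(0, 2.0, 0.0)
```

Setting pi to 0 recovers the ordinary Poisson model, which is what makes the two directly comparable with a criterion such as the DIC.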
Abstract:
Ecological problems are typically multifaceted and need to be addressed from both a scientific and a management perspective. There is a wealth of modelling and simulation software available, each designed to address a particular aspect of the issue of concern. Choosing the appropriate tool, making sense of the disparate outputs, and taking decisions when little or no empirical data is available are everyday challenges facing the ecologist and environmental manager. Bayesian Networks (BNs) provide a statistical modelling framework that enables analysis and integration of information in its own right, as well as integration of a variety of models addressing different aspects of a common overall problem. There has been increased interest in the use of BNs to model environmental systems and issues of concern. However, the development of more sophisticated BNs, utilising dynamic and object-oriented (OO) features, is still at the frontier of ecological research. Such features are particularly appealing in an ecological context, since the underlying facts are often spatial and temporal in nature. This thesis focuses on an integrated BN approach which facilitates OO modelling. Our research devises a new heuristic method, the Iterative Bayesian Network Development Cycle (IBNDC), for the development of BN models within a multi-field and multi-expert context. Expert elicitation is a popular method used to quantify BNs when data is sparse but expert knowledge is abundant. The resulting BNs need to be substantiated and validated taking this uncertainty into account. Our research demonstrates the application of the IBNDC approach to support these aspects of BN modelling. The complex nature of environmental issues makes them ideal case studies for the proposed integrated approach to modelling.
Moreover, they lend themselves to a series of integrated sub-networks describing different scientific components, combining scientific and management perspectives, or pooling similar contributions developed in different locations by different research groups. In southern Africa the two largest free-ranging cheetah (Acinonyx jubatus) populations are in Namibia and Botswana, where the majority of cheetahs are located outside protected areas. Consequently, cheetah conservation in these two countries is focussed primarily on the free-ranging populations as well as the mitigation of conflict between humans and cheetahs. In contrast, in neighbouring South Africa, the majority of cheetahs are found in fenced reserves. Nonetheless, conflict between humans and cheetahs remains an issue here. Conservation effort in South Africa is also focussed on managing the geographically isolated cheetah populations as one large meta-population. Relocation is one option among a suite of tools used to resolve human-cheetah conflict in southern Africa. Successfully relocating captured problem cheetahs, and maintaining a viable free-ranging cheetah population, are two environmental issues in cheetah conservation forming the first case study in this thesis. The second case study involves the initiation of blooms of Lyngbya majuscula, a blue-green alga, in Deception Bay, Australia. L. majuscula forms toxic blooms which have severe health, ecological and economic impacts on the community located in their vicinity. Deception Bay is an important tourist destination owing to its proximity to Brisbane, Australia's third largest city. Lyngbya is one of several algae whose blooms are considered Harmful Algal Blooms (HABs). This group includes other widespread blooms such as red tides. The occurrence of Lyngbya blooms is not a local phenomenon: blooms of this toxic weed occur in coastal waters worldwide.
With the increase in frequency and extent of these HABs, it is important to gain a better understanding of the underlying factors contributing to their initiation and sustenance. This knowledge will contribute to better management practices and to the identification of those management actions which could prevent these blooms or diminish their severity.
Abstract:
This thesis is about the derivation of the addition law on an arbitrary elliptic curve and efficiently adding points on this elliptic curve using the derived addition law. The outcomes of this research guarantee practical speedups in higher-level operations which depend on point additions. In particular, the contributions immediately find applications in cryptology. Mastered by 19th-century mathematicians, the study of the theory of elliptic curves has been active for decades. Elliptic curves over finite fields made their way into public key cryptography in the late 1980s with independent proposals by Miller [Mil86] and Koblitz [Kob87]. Elliptic Curve Cryptography (ECC), following Miller's and Koblitz's proposals, employs the group of rational points on an elliptic curve in building discrete-logarithm-based public key cryptosystems. Since the late 1990s, the emergence of the ECC market has boosted research into computational aspects of elliptic curves. This thesis falls into this same area of research, where the main aim is to speed up the addition of rational points on an arbitrary elliptic curve (over a field of large characteristic). The outcomes of this work can be used to speed up applications which are based on elliptic curves, including cryptographic applications in ECC. The aforementioned goals of this thesis are achieved in five main steps. As the first step, this thesis brings together several algebraic tools in order to derive the unique group law of an elliptic curve. This step also includes an investigation of recent computer algebra packages and their capabilities. Although the group law is unique, its evaluation can be performed using abundant (in fact infinitely many) formulae. As the second step, this thesis advances the search for the best formulae for efficient addition of points. In the third step, the group law is stated explicitly by handling all possible summands.
The fourth step presents the algorithms to be used for efficient point additions. In the fifth and final step, optimized software implementations of the proposed algorithms are presented in order to show that the theoretical speedups of step four can be obtained in practice. In each of the five steps, this thesis focuses on five forms of elliptic curves over finite fields of large characteristic. These forms and their defining equations are as follows:
(a) Short Weierstrass form, $y^2 = x^3 + ax + b$,
(b) Extended Jacobi quartic form, $y^2 = dx^4 + 2ax^2 + 1$,
(c) Twisted Hessian form, $ax^3 + y^3 + 1 = dxy$,
(d) Twisted Edwards form, $ax^2 + y^2 = 1 + dx^2y^2$,
(e) Twisted Jacobi intersection form, $bs^2 + c^2 = 1$, $as^2 + d^2 = 1$.
These forms are the most promising candidates for efficient computations and are thus considered in this work. Nevertheless, the methods employed in this thesis are capable of handling arbitrary elliptic curves. From a high-level point of view, the following outcomes are achieved in this thesis.
- Related literature results are brought together and further revisited. For most of the cases several missed formulae, algorithms, and efficient point representations are discovered.
- Analogies are made among all studied forms. For instance, it is shown that two sets of affine addition formulae are sufficient to cover all possible affine inputs as long as the output is also an affine point in any of these forms. In the literature, many special cases, especially interactions with points at infinity, were omitted from discussion. This thesis handles all of the possibilities.
- Several new point doubling/addition formulae and algorithms are introduced, which are more efficient than the existing alternatives in the literature. Most notably, the speeds of the extended Jacobi quartic, twisted Edwards, and Jacobi intersection forms are improved. New unified addition formulae are proposed for the short Weierstrass form. New coordinate systems are studied for the first time.
- An optimized implementation is developed using a combination of generic x86-64 assembly instructions and the plain C language. The practical advantages of the proposed algorithms are supported by computer experiments.
- All formulae presented in the body of this thesis are checked for correctness using computer algebra scripts, together with details on register allocations.
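As a point of reference for the group law discussed above, the following sketch implements the textbook chord-and-tangent addition in affine coordinates on a short Weierstrass curve over a prime field, including the special cases involving the point at infinity. It is not one of the optimized formulae or algorithms of the thesis; the function name `ec_add`, the use of `None` for the point at infinity, and the small example curve y^2 = x^3 + 2x + 2 over F_17 are choices made here for illustration. The modular inverse via `pow(x, -1, p)` requires Python 3.8 or later.

```python
def ec_add(P, Q, a, p):
    """Add two points on y^2 = x^3 + a*x + b over F_p (p an odd prime).

    Points are (x, y) tuples reduced modulo p; None is the point at infinity.
    Both inputs are assumed to lie on the curve.
    """
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    # P + (-P) = infinity; this also covers doubling a point with y = 0.
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        # Tangent slope for point doubling.
        slope = (3 * x1 * x1 + a) * pow((2 * y1) % p, -1, p) % p
    else:
        # Chord slope for adding distinct points.
        slope = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (slope * slope - x1 - x2) % p
    y3 = (slope * (x1 - x3) - y1) % p
    return (x3, y3)

# Small example curve: y^2 = x^3 + 2x + 2 over F_17, base point G = (5, 1).
G = (5, 1)
double_G = ec_add(G, G, 2, 17)
triple_G = ec_add(G, double_G, 2, 17)
```

Each affine addition here costs one field inversion, which is the expensive operation that alternative curve forms and projective-style coordinate systems are designed to avoid.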
Abstract:
Longitudinal data, in which observations or measurements are repeated over time or age, provide the foundation for the analysis of processes which evolve over time; models of such processes can be referred to as growth or trajectory models. One of the traditional ways of looking at growth models is to employ either linear or polynomial functional forms to model trajectory shape, and to account for variation around an overall mean trend by including random effects, or individual variation, on the functional shape parameters. The identification of distinct subgroups or sub-classes (latent classes) within these trajectory models, not based on some pre-existing individual classification, provides an important methodology with substantive implications. The identification of subgroups or classes has wide application in the medical arena, where responder/non-responder identification based on distinctly differing trajectories delivers further information for clinical processes. This thesis develops Bayesian statistical models and techniques for the identification of subgroups in the analysis of longitudinal data where the number of time intervals is limited. These models are then applied to a single case study which investigates neuropsychological cognition in early-stage breast cancer patients undergoing adjuvant chemotherapy treatment, drawn from the Cognition in Breast Cancer Study undertaken by the Wesley Research Institute of Brisbane, Queensland. Alternative formulations to the linear or polynomial approach are taken: piecewise linear models with a single turning point, change-point or knot at a known time point, and latent basis models for the non-linear trajectories found for the verbal memory domain of cognitive function before and after chemotherapy treatment.
Hierarchical Bayesian random effects models are used as a starting point for the latent class modelling process and are extended with the incorporation of covariates in the trajectory profiles and as predictors of class membership. The Bayesian latent basis models enable the degree of recovery post-chemotherapy to be estimated for short- and long-term follow-up occasions, and the distinct class trajectories assist in the identification of breast cancer patients who may be at risk of long-term verbal memory impairment.
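The piecewise linear trajectory with a single knot at a known time point can be sketched as a mean function that changes slope at the knot while remaining continuous. This shows only the deterministic trajectory shape, not the hierarchical Bayesian or latent class machinery; the function name and the parameter values are hypothetical.

```python
def piecewise_linear(t, intercept, slope_before, slope_after, knot):
    """Mean trajectory with one change of slope at a known knot.

    The two segments meet at the knot, so the trajectory is continuous.
    """
    if t <= knot:
        return intercept + slope_before * t
    # Value reached at the knot, then continue with the second slope.
    return intercept + slope_before * knot + slope_after * (t - knot)

# Illustrative values only: a score that declines until the knot at
# t = 3 and then partially recovers.
trajectory = [piecewise_linear(t, 50.0, -2.0, 1.0, 3.0) for t in range(6)]
```

In a latent class setting, each class would have its own intercept and pair of slopes, with individual random effects added around the class-level trajectory.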