967 results for Sequential analysis


Relevance:

30.00%

Publisher:

Abstract:

In this paper we present a sequential Monte Carlo algorithm for Bayesian sequential experimental design applied to generalised non-linear models for discrete data. The approach is computationally convenient in that the information from newly observed data can be incorporated through a simple re-weighting step. We also consider a flexible parametric model for the stimulus-response relationship, together with a newly developed hybrid design utility that can produce more robust estimates of the target stimulus in the presence of substantial model and parameter uncertainty. The algorithm is applied to hypothetical clinical trial or bioassay scenarios. In the discussion, potential generalisations of the algorithm are suggested that may extend its applicability to a wide variety of scenarios.
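The re-weighting step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a two-parameter logistic stimulus-response model and a binary response; the function names `reweight` and `effective_sample_size` and the model form are illustrative assumptions, not the authors' implementation.

```python
import math


def reweight(particles, weights, stimulus, response):
    """Update normalised particle weights after one binary observation.

    Assumption (not from the abstract): each particle is an (a, b) pair for
    a logistic model p(y = 1 | x) = 1 / (1 + exp(-(a + b * x))).
    """
    new_weights = []
    for (a, b), w in zip(particles, weights):
        p = 1.0 / (1.0 + math.exp(-(a + b * stimulus)))
        likelihood = p if response == 1 else 1.0 - p
        # Importance re-weighting: old weight times likelihood of the new datum.
        new_weights.append(w * likelihood)
    total = sum(new_weights)
    return [w / total for w in new_weights]


def effective_sample_size(weights):
    """Common degeneracy diagnostic for normalised weights."""
    return 1.0 / sum(w * w for w in weights)
```

In a full sequential Monte Carlo scheme, a low effective sample size would typically trigger a resample-and-move step; that part is omitted here.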

Relevance:

30.00%

Publisher:

Abstract:

Complex networks have been studied extensively due to their relevance to many real-world systems such as the world-wide web, the internet, and biological and social systems. During the past two decades, studies of such networks in different fields have produced many significant results concerning their structures, topological properties, and dynamics. Three well-known properties of complex networks are scale-free degree distribution, the small-world effect and self-similarity. The search for additional meaningful properties and the relationships among these properties is an active area of current research. This thesis investigates a newer aspect of complex networks, namely their multifractality, which is an extension of the concept of self-similarity. The first part of the thesis aims to confirm that the study of properties of complex networks can be expanded to a wider field including more complex weighted networks. The real networks that have been shown to possess the self-similarity property in the existing literature are all unweighted networks. We use protein-protein interaction (PPI) networks as a key example to show that their weighted networks inherit self-similarity from the original unweighted networks. Firstly, we confirm that the random sequential box-covering algorithm is an effective tool to compute the fractal dimension of complex networks. This is demonstrated on the Homo sapiens and E. coli PPI networks as well as their skeletons. Our results verify that the fractal dimension of the skeleton is smaller than that of the original network because the shortest distance between nodes is larger in the skeleton, hence for a fixed box size more boxes are needed to cover the skeleton. Then we adopt the iterative scoring method to generate weighted PPI networks of five species, namely Homo sapiens, E. coli, yeast, C. elegans and Arabidopsis thaliana.
By using the random sequential box-covering algorithm, we calculate the fractal dimensions for both the original unweighted PPI networks and the generated weighted networks. The results show that self-similarity is still present in the generated weighted PPI networks. This finding will be useful for our treatment of the networks in the third part of the thesis. The second part of the thesis aims to explore the multifractal behaviour of different complex networks. Fractals such as the Cantor set, the Koch curve and the Sierpinski gasket are homogeneous since these fractals consist of a geometrical figure which repeats on an ever-reduced scale. Fractal analysis is a useful method for their study. However, real-world fractals are not homogeneous; there is rarely an identical motif repeated on all scales. Their singularity may vary on different subsets, implying that these objects are multifractal. Multifractal analysis is a useful way to systematically characterize the spatial heterogeneity of both theoretical and experimental fractal patterns. However, the tools for multifractal analysis of objects in Euclidean space are not suitable for complex networks. In this thesis, we propose a new box-covering algorithm for multifractal analysis of complex networks. This algorithm is demonstrated in the computation of the generalized fractal dimensions of some theoretical networks, namely scale-free networks, small-world networks and random networks, and of one kind of real network, namely PPI networks of different species. Our main finding is the existence of multifractality in scale-free networks and PPI networks, while multifractal behaviour is not confirmed for small-world networks and random networks. As another application, we generate gene interaction networks for patients and healthy people using the correlation coefficients between microarrays of different genes. Our results confirm the existence of multifractality in gene interaction networks.
This multifractal analysis then provides a potentially useful tool for gene clustering and identification. The third part of the thesis aims to investigate the topological properties of networks constructed from time series. Characterizing complicated dynamics from time series is a fundamental problem of continuing interest in a wide variety of fields. Recent works indicate that complex network theory can be a powerful tool to analyse time series. Many existing methods for transforming time series into complex networks share a common feature: they define the connectivity of a complex network by the mutual proximity of different parts (e.g., individual states, state vectors, or cycles) of a single trajectory. In this thesis, we propose a new method to construct networks of time series: we define nodes by vectors of a certain length in the time series, and weight of edges between any two nodes by the Euclidean distance between the corresponding two vectors. We apply this method to build networks for fractional Brownian motions, whose long-range dependence is characterised by their Hurst exponent. We verify the validity of this method by showing that time series with stronger correlation, hence larger Hurst exponent, tend to have smaller fractal dimension, hence smoother sample paths. We then construct networks via the technique of horizontal visibility graph (HVG), which has been widely used recently. We confirm a known linear relationship between the Hurst exponent of fractional Brownian motion and the fractal dimension of the corresponding HVG network. In the first application, we apply our newly developed box-covering algorithm to calculate the generalized fractal dimensions of the HVG networks of fractional Brownian motions as well as those for binomial cascades and five bacterial genomes. The results confirm the monoscaling of fractional Brownian motion and the multifractality of the rest. 
As an additional application, we discuss the resilience of networks constructed from time series via two different approaches: the visibility graph (VG) and the horizontal visibility graph (HVG). Our finding is that the degree distribution of VG networks of fractional Brownian motions is scale-free (i.e., follows a power law), meaning that one needs to destroy a large percentage of nodes before the network collapses into isolated parts; for HVG networks of fractional Brownian motions, by contrast, the degree distribution has exponential tails, implying that HVG networks would not survive the same kind of attack.
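The horizontal visibility graph used in this thesis abstract can be sketched as follows, using the standard HVG criterion: two time points are linked when every value strictly between them is lower than both. The function name `horizontal_visibility_graph` is an illustrative assumption, and this is a plain quadratic-time sketch rather than the thesis's own code.

```python
def horizontal_visibility_graph(series):
    """Return the HVG edge set of a numeric sequence as (i, j) pairs, i < j.

    Nodes i and j are linked when every value strictly between them is
    smaller than min(series[i], series[j]).
    """
    edges = set()
    n = len(series)
    for i in range(n - 1):
        highest_between = float("-inf")
        for j in range(i + 1, n):
            if highest_between < min(series[i], series[j]):
                edges.add((i, j))
            highest_between = max(highest_between, series[j])
            # Once a value at least as high as series[i] is seen,
            # nothing further to the right is visible from i.
            if highest_between >= series[i]:
                break
    return edges
```

The degree of each node can then be counted from the edge set to inspect the degree distribution discussed above.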

Relevance:

30.00%

Publisher:

Abstract:

It is a big challenge to guarantee the quality of discovered relevance features in text documents for describing user preferences because of the large number of terms, patterns, and noise. Most existing popular text mining and classification methods have adopted term-based approaches. However, they have all suffered from the problems of polysemy and synonymy. Over the years, people have often held the hypothesis that pattern-based methods should perform better than term-based ones in describing user preferences, but many experiments do not support this hypothesis. This research presents a promising method, Relevance Feature Discovery (RFD), for solving this challenging issue. It discovers both positive and negative patterns in text documents as high-level features in order to accurately weight low-level features (terms) based on their specificity and their distributions in the high-level features. The thesis also introduces an adaptive model (called ARFD) to enhance the flexibility of using RFD in adaptive environments. ARFD automatically updates the system's knowledge based on a sliding window over new incoming feedback documents. It can efficiently decide which incoming documents can bring new knowledge into the system. Substantial experiments using the proposed models on Reuters Corpus Volume 1 and TREC topics show that the proposed models significantly outperform both the state-of-the-art term-based methods underpinned by Okapi BM25, Rocchio or Support Vector Machine and other pattern-based methods.

Relevance:

30.00%

Publisher:

Abstract:

Fusion techniques have received considerable attention for achieving lower error rates with biometrics. A fused classifier architecture based on sequential integration of multi-instance and multi-sample fusion schemes allows a controlled trade-off between false alarms and false rejects. Expressions for each type of error for the fused system have previously been derived for the case of statistically independent classifier decisions. It is shown in this paper that the performance of this architecture can be improved by modelling the correlation between classifier decisions. Correlation modelling also enables better tuning of the fusion model parameters, ‘N’, the number of classifiers, and ‘M’, the number of attempts/samples, and facilitates the determination of error bounds for false rejects and false accepts for each specific user. Error trade-off performance of the architecture is evaluated using HMM-based speaker verification on utterances of individual digits. Results show that performance is improved for the case of favourably correlated decisions. The architecture investigated here is directly applicable to speaker verification from spoken digit strings such as credit card numbers in telephone or voice over internet protocol based applications. It is also applicable to other biometric modalities such as fingerprints and handwriting samples.
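For the statistically independent baseline mentioned in this abstract, the trade-off between N and M can be sketched with elementary probability. The arrangement below is an assumption for illustration (an 'OR' rule over M attempts at each classifier, then an 'AND' rule across N identical classifiers); the abstract does not spell out the exact expressions, and the function name `fused_error_rates` is hypothetical.

```python
def fused_error_rates(far, frr, n_classifiers, m_attempts):
    """False accept / false reject rates of a fused system, assuming
    independent decisions and identical per-classifier error rates.

    Assumed arrangement (not confirmed by the abstract):
      - OR over m_attempts at each classifier (accept if any attempt passes)
      - AND over n_classifiers (accept only if every classifier accepts)
    """
    # OR rule: an impostor gets m chances, a client fails only if all m fail.
    stage_far = 1.0 - (1.0 - far) ** m_attempts
    stage_frr = frr ** m_attempts
    # AND rule: an impostor must pass every stage, a client fails if any stage rejects.
    fused_far = stage_far ** n_classifiers
    fused_frr = 1.0 - (1.0 - stage_frr) ** n_classifiers
    return fused_far, fused_frr
```

Under these assumptions, increasing M lowers false rejects and raises false accepts, while increasing N does the opposite, which is the controlled trade-off the abstract refers to.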

Relevance:

30.00%

Publisher:

Abstract:

Fusion techniques have received considerable attention for achieving performance improvement with biometrics. While a multi-sample fusion architecture reduces false rejects, it also increases false accepts. This impact on performance also depends on the nature of subsequent attempts, i.e., random or adaptive. Expressions for error rates are presented and experimentally evaluated in this work by considering the multi-sample fusion architecture for text-dependent speaker verification using HMM-based digit-dependent speaker models. Analysis incorporating correlation modelling demonstrates that the use of adaptive samples improves overall fusion performance compared to randomly repeated samples. For a text-dependent speaker verification system using digit strings, sequential decision fusion of seven instances with three random samples is shown to reduce the overall error of the verification system by 26%, which can be further reduced by 6% for adaptive samples. This analysis, novel in its treatment of random and adaptive multiple presentations within a sequential fused decision architecture, is also applicable to other biometric modalities such as fingerprints and handwriting samples.

Relevance:

30.00%

Publisher:

Abstract:

Statistical dependence between classifier decisions is often shown to improve performance over statistically independent decisions. Though the solution for favourable dependence between two classifier decisions has been derived, the theoretical analysis for the general case of 'n' client and impostor decision fusion has not been presented before. This paper presents the expressions developed for favourable dependence of multi-instance and multi-sample fusion schemes that employ 'AND' and 'OR' rules. The expressions are experimentally evaluated by considering the proposed architecture for text-dependent speaker verification using HMM-based digit-dependent speaker models. The improvement in fusion performance is found to be higher when digit combinations with favourable client and impostor decisions are used for speaker verification. The total error rate of 20% for fusion of independent decisions is reduced to 2.1% for fusion of decisions that are favourable for both clients and impostors. The expressions developed here are also applicable to other biometric modalities, such as fingerprints and handwriting samples, for reliable identity verification.

Relevance:

30.00%

Publisher:

Abstract:

The position of housing demand and supply is not consistent. The Australian situation counters the experience demonstrated in many other parts of the world in the aftermath of the Global Financial Crisis, with residential housing prices proving particularly resilient. A seemingly inexorable housing demand remains a critical issue affecting the socio-economic landscape. Underpinned by high levels of population growth fuelled by immigration, and further buoyed by sustained historically low interest rates, increasing income levels, and increased government assistance for first home buyers, this strong housing demand level ensures problems related to housing affordability continue almost unabated. A significant but less visible factor impacting housing affordability relates to holding costs. Although they are only one contributor in the housing affordability matrix, the nature and extent of their impact requires elucidation: for example, the computation and methodology behind the calculation of holding costs vary widely, and in some instances holding costs are ignored completely. In addition, ambiguity exists in terms of the inclusion of various elements that comprise holding costs, thereby affecting the assessment of their relative contribution. Such anomalies may be explained by considering that assessment is conducted over time in an ever-changing environment. A strong relationship with opportunity cost, in turn dependent inter alia upon prevailing inflation and/or interest rates, adds further complexity. By extending research in the general area of housing affordability, this thesis seeks to provide a detailed investigation of those elements related to holding costs specifically in the context of mid-sized (i.e. between 15 and 200 lots) greenfield residential property developments in South East Queensland. With the dimensions of holding costs and their influence over housing affordability determined, the null hypothesis H0 that holding costs are not passed on can be addressed.
Arriving at these conclusions involves the development of robust economic and econometric models which seek to clarify the componentry impacts of holding cost elements. An explanatory sequential design research methodology has been adopted, whereby the compilation and analysis of quantitative data and the development of an economic model is informed by the subsequent collection and analysis of primarily qualitative data derived from surveying development related organisations. Ultimately, there are significant policy implications in relation to the framework used in Australian jurisdictions that promote, retain, or otherwise maximise, the opportunities for affordable housing.

Relevance:

30.00%

Publisher:

Abstract:

The use of Wireless Sensor Networks (WSNs) for Structural Health Monitoring (SHM) has become a promising approach due to many advantages such as low cost and fast, flexible deployment. However, inherent technical issues such as data synchronization error and data loss have prevented these systems from being extensively used. Recently, several SHM-oriented WSNs have been proposed and are believed to be able to overcome a large number of technical uncertainties. Nevertheless, there is limited research verifying the applicability of those WSNs with respect to demanding SHM applications like modal analysis and damage identification. This paper first presents a brief review of the most inherent uncertainties of the SHM-oriented WSN platforms and then investigates their effects on the outcomes and performance of the most robust Output-only Modal Analysis (OMA) techniques when employing merged data from multiple tests. The two OMA families selected for this investigation are Frequency Domain Decomposition (FDD) and Data-driven Stochastic Subspace Identification (SSI-data), because both have been widely applied in the past decade. Experimental accelerations collected by a wired sensory system on a large-scale laboratory bridge model are initially used as clean data before being contaminated by different data pollutants in a sequential manner to simulate practical SHM-oriented WSN uncertainties. The results of this study show the robustness of FDD and the precautions needed for the SSI-data family when dealing with SHM-WSN uncertainties. Finally, the use of the measurement channel projection for the time-domain OMA techniques and the preferred combination of the OMA techniques to cope with the SHM-WSN uncertainties are recommended.

Relevance:

30.00%

Publisher:

Abstract:

Objective: To evaluate the prescribing practices of Australian dispensing doctors (DDs) and to explore their interpretations of the findings. Design, participants and setting: Sequential explanatory mixed methods. The quantitative phase comprised analysis of Pharmaceutical Benefits Scheme (PBS) claims data of DDs and non-DDs, 1 July 2005 to 30 June 2007. The qualitative phase involved semi-structured interviews with DDs in rural and remote general practice across Australian states, August 2009 to February 2010. Main outcome measures: The number of PBS prescriptions per 1000 patients and use of Regulation 24 of the National Health (Pharmaceutical Benefits) Regulations 1960 (r. 24); DDs' interpretation of the findings. Results: 72 DDs' and 1080 non-DDs' PBS claims data were analysed quantitatively. DDs issued fewer prescriptions per 1000 patients (9452 v 15057; P = 0.003), even with a similar proportion of concessional patients and patients aged >65 years in their populations. DDs issued significantly more r. 24 prescriptions per 1000 prescriptions than non-DDs (314 v 67; P = 0.008). In interviews, 22 DDs explained that the fewer prescriptions were due to perceived expectations from their peers regarding prescribing norms and the need to generate less administrative paperwork in small practices. Conclusions: Contrary to overseas findings, we found no evidence that Australian DDs overprescribed because of their additional dispensing role. MJA 2011; 195: 172-175

Relevance:

30.00%

Publisher:

Abstract:

Classifier selection is a problem encountered by multi-biometric systems that aim to improve performance through fusion of decisions. A particular decision fusion architecture that combines multiple instances (n classifiers) and multiple samples (m attempts at each classifier) has been proposed in previous work to achieve a controlled trade-off between false alarms and false rejects. Although analysis on text-dependent speaker verification has demonstrated better performance for fusion of decisions with favourable dependence compared to statistically independent decisions, the performance is not always optimal. Given a pool of instances, best performance with this architecture is obtained for certain combinations of instances. Heuristic rules and diversity measures have been commonly used for classifier selection, but it is shown that optimal performance is achieved for the 'best combination performance' rule. As the search complexity for this rule increases exponentially with the addition of classifiers, a measure - the sequential error ratio (SER) - is proposed in this work that is specifically adapted to the characteristics of the sequential fusion architecture. The proposed measure can be used to select a classifier that is most likely to produce a correct decision at each stage. Error rates for fusion of text-dependent HMM-based speaker models using SER are compared with other classifier selection methodologies. SER is shown to achieve near-optimal performance for sequential fusion of multiple instances with or without the use of multiple samples. The methodology applies to multiple speech utterances for telephone or internet based access control and to other systems such as multiple fingerprint and multiple handwriting sample based identity verification systems.

Relevance:

30.00%

Publisher:

Abstract:

Objective To evaluate methods for monitoring monthly aggregated hospital adverse event data that display clustering, non-linear trends and possible autocorrelation. Design Retrospective audit. Setting The Northern Hospital, Melbourne, Australia. Participants 171,059 patients admitted between January 2001 and December 2006. Measurements The analysis is illustrated with 72 months of patient fall injury data using a modified Shewhart U control chart, and charts derived from a quasi-Poisson generalised linear model (GLM) and a generalised additive mixed model (GAMM) that included an approximate upper control limit. Results The data were overdispersed and displayed a downward trend and possible autocorrelation. The downward trend was followed by a predictable period after December 2003. The GLM-estimated incidence rate ratio was 0.98 (95% CI 0.98 to 0.99) per month. The GAMM-fitted count fell from 12.67 (95% CI 10.05 to 15.97) in January 2001 to 5.23 (95% CI 3.82 to 7.15) in December 2006 (p<0.001). The corresponding values for the GLM were 11.9 and 3.94. Residual plots suggested that the GLM underestimated the rate at the beginning and end of the series and overestimated it in the middle. The data suggested a more rapid rate fall before 2004 and a steady state thereafter, a pattern reflected in the GAMM chart. The approximate upper two-sigma equivalent control limit in the GLM and GAMM charts identified 2 months that showed possible special-cause variation. Conclusion Charts based on GAMM analysis are a suitable alternative to Shewhart U control charts with these data.
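As a point of reference for the charts compared in this abstract, the textbook Shewhart U chart limits can be sketched as below. This is the standard unmodified chart, not the modified version or the GLM/GAMM charts the study used; the function name `u_chart_limits` and the example inputs in the usage note are assumptions.

```python
import math


def u_chart_limits(counts, exposures, sigma=3.0):
    """Centre line and per-period control limits of a standard U chart.

    counts:    number of events per period (e.g. falls per month)
    exposures: size of the area of opportunity per period (e.g. admissions)
    sigma:     width of the limits in standard errors (3 is conventional)
    """
    u_bar = sum(counts) / sum(exposures)  # pooled event rate (centre line)
    limits = []
    for n in exposures:
        half_width = sigma * math.sqrt(u_bar / n)  # Poisson-based standard error
        limits.append((max(0.0, u_bar - half_width), u_bar + half_width))
    return u_bar, limits
```

For example, `u_chart_limits([4, 6], [100, 100])` gives a centre line of 0.05 events per unit of exposure. Because these limits assume Poisson variation, they tend to be too narrow for overdispersed data such as those described above.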

Relevance:

30.00%

Publisher:

Abstract:

‘Approximate Bayesian Computation’ (ABC) represents a powerful methodology for the analysis of complex stochastic systems for which the likelihood of the observed data under an arbitrary set of input parameters may be entirely intractable – the latter condition rendering useless the standard machinery of tractable likelihood-based, Bayesian statistical inference [e.g. conventional Markov chain Monte Carlo (MCMC) simulation]. In this paper, we demonstrate the potential of ABC for astronomical model analysis by application to a case study in the morphological transformation of high-redshift galaxies. To this end, we develop, first, a stochastic model for the competing processes of merging and secular evolution in the early Universe, and secondly, through an ABC-based comparison against the observed demographics of massive (Mgal > 10^11 M⊙) galaxies (at 1.5 < z < 3) in the Cosmic Assembly Near-IR Deep Extragalactic Legacy Survey (CANDELS)/Extended Groth Strip (EGS) data set, we derive posterior probability densities for the key parameters of this model. The ‘Sequential Monte Carlo’ implementation of ABC exhibited herein, featuring both a self-generating target sequence and self-refining MCMC kernel, is amongst the most efficient of contemporary approaches to this important statistical algorithm. We highlight as well through our chosen case study the value of careful summary statistic selection, and demonstrate two modern strategies for assessment and optimization in this regard. Ultimately, our ABC analysis of the high-redshift morphological mix returns tight constraints on the evolving merger rate in the early Universe and favours major merging (with disc survival or rapid reformation) over secular evolution as the mechanism most responsible for building up the first generation of bulges in early-type discs.

Relevance:

30.00%

Publisher:

Abstract:

Most of the existing algorithms for approximate Bayesian computation (ABC) assume that it is feasible to simulate pseudo-data from the model at each iteration. However, the computational cost of these simulations can be prohibitive for high dimensional data. An important example is the Potts model, which is commonly used in image analysis. Images encountered in real world applications can have millions of pixels, therefore scalability is a major concern. We apply ABC with a synthetic likelihood to the hidden Potts model with additive Gaussian noise. Using a pre-processing step, we fit a binding function to model the relationship between the model parameters and the synthetic likelihood parameters. Our numerical experiments demonstrate that the precomputed binding function dramatically improves the scalability of ABC, reducing the average runtime required for model fitting from 71 hours to only 7 minutes. We also illustrate the method by estimating the smoothing parameter for remotely sensed satellite imagery. Without precomputation, Bayesian inference is impractical for datasets of that scale.
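The cost this abstract refers to, simulating pseudo-data at every iteration, is easiest to see in the simplest ABC variant, rejection sampling. The sketch below is that basic variant on a toy Gaussian model with the sample mean as summary statistic; it is not the synthetic-likelihood method with a precomputed binding function described above, and `abc_rejection` and its parameters are illustrative assumptions.

```python
import random


def abc_rejection(observed_mean, n_obs, prior_draw, tolerance, n_draws, rng):
    """Basic ABC rejection sampler for the mean of a unit-variance Gaussian.

    Every iteration simulates a full pseudo-data set of n_obs points, which
    is the step that becomes prohibitive for high-dimensional data.
    """
    accepted = []
    for _ in range(n_draws):
        theta = prior_draw(rng)  # propose a parameter from the prior
        pseudo = [rng.gauss(theta, 1.0) for _ in range(n_obs)]  # simulate data
        summary = sum(pseudo) / n_obs  # summary statistic of the pseudo-data
        if abs(summary - observed_mean) <= tolerance:
            accepted.append(theta)  # keep parameters that reproduce the data
    return accepted
```

A call such as `abc_rejection(2.0, 50, lambda r: r.uniform(-5.0, 5.0), 0.2, 2000, random.Random(0))` returns approximate posterior draws concentrated near 2; the run time grows with `n_obs * n_draws`, which illustrates why avoiding per-iteration simulation matters for images with millions of pixels.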

Relevance:

30.00%

Publisher:

Abstract:

Progression of spinal deformity in children was studied with Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) to identify how gravity affects the deformity and to determine the full three-dimensional character of the deformity. The CT study showed that gravity is significant in deformity progression in some patients, which has implications for clinical patient management. The world-first MRI study showed that the standard clinical measure used to define the extent of the deformity is inadequate and that further use of three-dimensional MRI should be considered by spinal surgeons.