255 results for sequential coalescence


Relevance:

20.00%

Publisher:

Abstract:

With the overwhelming increase in the amount of text on the web, it is almost impossible for people to keep abreast of up-to-date information. Text mining is a process by which interesting information is derived from text through the discovery of patterns and trends. Text mining algorithms are used to guarantee the quality of extracted knowledge. However, extracting patterns with text or data mining algorithms leads to noisy patterns and inconsistency. Thus, different challenges arise, such as how to understand these patterns, whether the model that has been used is suitable, and whether all the patterns that have been extracted are relevant. Furthermore, the research raises the question of how to assign a correct weight to the extracted knowledge. To address these issues, this paper presents a text post-processing method that uses a pattern co-occurrence matrix to find the relations between extracted patterns and thereby reduce noisy patterns. The main objective of this paper is not only to reduce the number of closed sequential patterns but also to improve the performance of pattern mining. Experimental results on the Reuters Corpus Volume 1 data collection and TREC filtering topics show that the proposed method is promising.
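
As a rough illustration of the post-processing idea, the sketch below builds a pattern co-occurrence matrix from the documents each pattern occurs in and drops patterns that never co-occur with any other pattern. The data structure, function names and pruning threshold are illustrative assumptions, not the paper's actual algorithm.

```python
from itertools import combinations

def cooccurrence_matrix(pattern_docs):
    """Count how often each pair of extracted patterns appears in the same document.

    pattern_docs maps a pattern identifier to the set of documents that support it.
    """
    patterns = sorted(pattern_docs)
    matrix = {p: {q: 0 for q in patterns} for p in patterns}
    for p, q in combinations(patterns, 2):
        shared = len(pattern_docs[p] & pattern_docs[q])
        matrix[p][q] = matrix[q][p] = shared
    return matrix

def prune_noisy_patterns(pattern_docs, min_cooccurrence=1):
    """Drop patterns that never co-occur with any other pattern (treated as noise here)."""
    matrix = cooccurrence_matrix(pattern_docs)
    return {p for p in pattern_docs
            if max(matrix[p].values(), default=0) >= min_cooccurrence}

# Toy usage: patterns 'a b' and 'b c' share document support, 'x y' does not.
patterns = {"a b": {1, 2, 3}, "b c": {2, 3}, "x y": {9}}
print(prune_noisy_patterns(patterns))  # prints {'a b', 'b c'} (order may vary)
```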

Relevance:

20.00%

Publisher:

Abstract:

Classifier selection is a problem encountered by multi-biometric systems that aim to improve performance through fusion of decisions. A particular decision fusion architecture that combines multiple instances (n classifiers) and multiple samples (m attempts at each classifier) has been proposed in previous work to achieve a controlled trade-off between false alarms and false rejects. Although analysis of text-dependent speaker verification has demonstrated better performance for fusion of decisions with favourable dependence compared to statistically independent decisions, the performance is not always optimal. Given a pool of instances, the best performance with this architecture is obtained for a certain combination of instances. Heuristic rules and diversity measures have commonly been used for classifier selection, but it is shown that optimal performance is achieved by the 'best combination performance' rule. As the search complexity for this rule increases exponentially with the addition of classifiers, a measure specifically adapted to the characteristics of the sequential fusion architecture, the sequential error ratio (SER), is proposed in this work. The proposed measure can be used to select the classifier that is most likely to produce a correct decision at each stage. Error rates for fusion of text-dependent HMM-based speaker models using SER are compared with other classifier selection methodologies. SER is shown to achieve near-optimal performance for sequential fusion of multiple instances with or without the use of multiple samples. The methodology applies to multiple speech utterances for telephone- or internet-based access control and to other systems such as multiple-fingerprint and multiple-handwriting-sample based identity verification systems.
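
To illustrate why the 'best combination performance' rule scales poorly and why a per-stage selection measure is attractive, the sketch below contrasts an exhaustive search over classifier subsets with a greedy per-stage pick. The cost function and the greedy criterion are placeholders; they are not the paper's SER definition.

```python
from itertools import combinations

def combination_error(classifier_errors, combo):
    """Placeholder cost for a combination of classifiers: the product of individual
    error rates, standing in for the empirically measured fused error rate."""
    cost = 1.0
    for c in combo:
        cost *= classifier_errors[c]
    return cost

def best_combination_exhaustive(classifier_errors, n_stages):
    """'Best combination performance' rule: evaluate every subset of size n_stages.

    The number of subsets grows combinatorially with the pool size, hence the
    exponential search complexity noted in the abstract.
    """
    return min(combinations(classifier_errors, n_stages),
               key=lambda combo: combination_error(classifier_errors, combo))

def greedy_per_stage(classifier_errors, n_stages):
    """Greedy alternative: at each stage pick the remaining classifier with the
    lowest individual error, in the spirit of a per-stage selection measure."""
    remaining = dict(classifier_errors)
    chosen = []
    for _ in range(n_stages):
        best = min(remaining, key=remaining.get)
        chosen.append(best)
        del remaining[best]
    return tuple(chosen)

errors = {"digit_0": 0.08, "digit_1": 0.05, "digit_2": 0.11, "digit_3": 0.04}
print(best_combination_exhaustive(errors, 2))
print(greedy_per_stage(errors, 2))
```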

Relevance:

20.00%

Publisher:

Abstract:

Reliability of the performance of biometric identity verification systems remains a significant challenge. Individual biometric samples of the same person (identity class) are not identical at each presentation, and performance degradation arises from intra-class variability and inter-class similarity. These limitations lead to false accepts and false rejects that are dependent, so it is difficult to reduce the rate of one type of error without increasing the other. The focus of this dissertation is to investigate a method based on classifier fusion techniques to better control the trade-off between the verification errors, using text-dependent speaker verification as the test platform. A sequential classifier fusion architecture that integrates multi-instance and multi-sample fusion schemes is proposed. This fusion method enables a controlled trade-off between false alarms and false rejects. For statistically independent classifier decisions, analytical expressions for each type of verification error are derived from the base classifier performances. As this assumption may not always be valid, these expressions are modified to incorporate the correlation between statistically dependent decisions from clients and impostors. The architecture is empirically evaluated for text-dependent speaker verification using Hidden Markov Model based, digit-dependent speaker models in each stage, with multiple attempts for each digit utterance. The trade-off between the verification errors is controlled using two parameters, the number of decision stages (instances) and the number of attempts at each decision stage (samples), fine-tuned on an evaluation/tuning set. The statistical validity of the derived expressions for the error estimates is assessed on test data. The performance of the sequential method is further shown to depend on the order in which the digits (instances) are combined and on the nature of the repeated attempts (samples). The false rejection and false acceptance rates for the proposed fusion are estimated using the base classifier performances, the variance in correlation between classifier decisions, and the sequence of classifiers with favourable dependence selected using the 'Sequential Error Ratio' criterion. The error rates are better estimated by incorporating user-dependent information (such as speaker-dependent thresholds and speaker-specific digit combinations) and class-dependent information (such as client-impostor dependent favourable combinations and class-error based threshold estimation). The proposed architecture is desirable in most speaker verification applications, such as remote authentication and telephone and internet shopping. The tuning of the parameters, the number of instances and samples, serves both the security and user-convenience requirements of speaker-specific verification. The architecture investigated here is also applicable to verification using other biometric modalities such as handwriting, fingerprints and keystrokes.
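
A minimal sketch of independence-based error expressions of this kind is given below. It assumes a specific decision logic, that a stage accepts if any of its m attempts is accepted and the overall claim is accepted only if all n stages accept, which may differ from the thesis's exact rule; it is included only to show how the trade-off is driven by the number of stages and attempts.

```python
import numpy as np

def sequential_fusion_errors(stage_far, stage_frr, m_attempts):
    """Overall false accept / false reject rates under statistical independence.

    Assumed decision logic (illustrative, not necessarily the thesis's exact rule):
    a stage accepts if ANY of its m attempts is accepted, and the overall claim is
    accepted only if ALL stages accept.
    """
    stage_far = np.asarray(stage_far, dtype=float)
    stage_frr = np.asarray(stage_frr, dtype=float)

    # Impostor: a stage accepts with prob 1 - (1 - FAR_i)^m; all stages must accept.
    overall_far = np.prod(1.0 - (1.0 - stage_far) ** m_attempts)

    # Client: a stage accepts with prob 1 - FRR_i^m; a reject at any stage rejects overall.
    overall_frr = 1.0 - np.prod(1.0 - stage_frr ** m_attempts)

    return overall_far, overall_frr

# Three digit stages, two attempts per digit (toy base-classifier rates).
far, frr = sequential_fusion_errors([0.02, 0.03, 0.02], [0.05, 0.04, 0.06], m_attempts=2)
print(f"overall FAR = {far:.5f}, overall FRR = {frr:.5f}")
```

Under these assumptions, adding stages (instances) pushes the false accept rate down at the cost of more false rejects, while adding attempts (samples) does the reverse, which is the controlled trade-off described above.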

Relevance:

20.00%

Publisher:

Abstract:

This thesis presents a sequential pattern-based model (PMM) to detect news topics on a popular microblogging platform, Twitter. PMM captures key topics and measures their importance using pattern properties and Twitter characteristics. This study shows that PMM outperforms traditional term-based models and can potentially be implemented as a decision support system. The research contributes to news topic detection and addresses the challenging problem of extracting information from short, noisy text.

Relevance:

20.00%

Publisher:

Abstract:

In this paper we present a unified sequential Monte Carlo (SMC) framework for performing sequential experimental design to discriminate between a set of models. The model discrimination utility that we advocate is fully Bayesian and based upon the mutual information, and SMC provides a convenient way to estimate it. Our experience suggests that the approach works well on sets of either discrete or continuous models and outperforms other model discrimination approaches.
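
The sketch below gives a plain Monte Carlo version of the mutual-information utility for discriminating between two hypothetical models with fixed parameters; the paper's SMC machinery (and parameter uncertainty) is omitted for brevity, so treat it as an illustration of the utility rather than the proposed framework.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def mutual_information_utility(design, model_priors, samplers, likelihoods, n_sim=5000):
    """Monte Carlo estimate of the mutual information between the model indicator
    and a future observation collected at a candidate design point.

    samplers[m](design, n) draws n observations from model m at the design;
    likelihoods[m](y, design) returns p(y | model m, design). The paper estimates
    this expectation with SMC particle approximations; direct simulation is used
    here to keep the example self-contained.
    """
    model_priors = np.asarray(model_priors, dtype=float)
    utility = 0.0
    for m, prior_m in enumerate(model_priors):
        y = samplers[m](design, n_sim)
        p_y_given_m = likelihoods[m](y, design)
        evidence = sum(pk * likelihoods[k](y, design)
                       for k, pk in enumerate(model_priors))
        utility += prior_m * np.mean(np.log(p_y_given_m / evidence))
    return utility

# Two toy models (fixed parameters) for the response at design x: linear vs quadratic mean, unit noise.
samplers = [
    lambda x, n: rng.normal(1.0 * x, 1.0, size=n),
    lambda x, n: rng.normal(0.5 * x ** 2, 1.0, size=n),
]
likelihoods = [
    lambda y, x: norm.pdf(y, loc=1.0 * x, scale=1.0),
    lambda y, x: norm.pdf(y, loc=0.5 * x ** 2, scale=1.0),
]

for x in (0.5, 1.0, 2.0, 3.0):
    u = mutual_information_utility(x, [0.5, 0.5], samplers, likelihoods)
    print(f"design x = {x}: utility = {u:.3f}")
```

Designs where the two models predict similar responses score near zero, while designs where the predictions diverge score highest, which is exactly the behaviour a discrimination utility should have.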

Relevance:

20.00%

Publisher:

Abstract:

Dose-finding trials are a form of clinical data collection in which the primary objective is to estimate the optimum dose of an investigational new drug when given to a patient. This thesis develops and explores three novel dose-finding design methodologies. All of the design methodologies presented in this thesis are pragmatic: they use statistical models, incorporate clinicians' prior knowledge efficiently, and can stop a trial prematurely for safety or futility reasons. Designing actual dose-finding trials with these methodologies will minimize practical difficulties, improve the efficiency of dose estimation, allow flexible early stopping, and reduce possible patient discomfort or harm.
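
As a generic example of the kind of early-stopping rule such designs can use (not one of the thesis's specific methodologies), the sketch below computes a Beta-Binomial posterior probability that a dose is too toxic and stops if that probability exceeds a cutoff.

```python
from scipy.stats import beta

def prob_toxicity_exceeds(target, events, n_patients, prior_a=1.0, prior_b=1.0):
    """Posterior probability that the true toxicity rate at a dose exceeds `target`,
    under a Beta(prior_a, prior_b) prior and binomial toxicity counts."""
    posterior = beta(prior_a + events, prior_b + n_patients - events)
    return 1.0 - posterior.cdf(target)

def stop_for_safety(target, events, n_patients, cutoff=0.9):
    """Stop the dose (or trial) early if it is very likely too toxic."""
    return prob_toxicity_exceeds(target, events, n_patients) > cutoff

# 4 toxicities in 6 patients at a dose with a 30% target toxicity rate.
print(prob_toxicity_exceeds(0.30, events=4, n_patients=6))   # ≈ 0.97
print(stop_for_safety(0.30, events=4, n_patients=6))         # True
```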

Relevance:

20.00%

Publisher:

Abstract:

Contents: Sequential Design; Molecular Weight Range; Functional Monomers: Possibilities, Limits, and Challenges; Block Copolymers: Combinations, Block Lengths, and Purities; Modular Design; End-Group Chemistry; Ligation Protocols; Conclusions.

Relevance:

20.00%

Publisher:

Abstract:

We investigated memories of room-sized spatial layouts learned by sequentially or simultaneously viewing objects from a stationary position. In three experiments, sequential viewing (one or two objects at a time) yielded subsequent memory performance that was equivalent or superior to simultaneous viewing of all objects, even though sequential viewing lacked direct access to the entire layout. This finding was replicated by replacing sequential viewing with directed viewing in which all objects were presented simultaneously and participants’ attention was externally focused on each object sequentially, indicating that the advantage of sequential viewing over simultaneous viewing may have originated from focal attention to individual object locations. These results suggest that memory representation of object-to-object relations can be constructed efficiently by encoding each object location separately, when those locations are defined within a single spatial reference system. These findings highlight the importance of considering object presentation procedures when studying spatial learning mechanisms.

Relevance:

20.00%

Publisher:

Abstract:

Objective: To evaluate methods for monitoring monthly aggregated hospital adverse event data that display clustering, non-linear trends and possible autocorrelation. Design: Retrospective audit. Setting: The Northern Hospital, Melbourne, Australia. Participants: 171,059 patients admitted between January 2001 and December 2006. Measurements: The analysis is illustrated with 72 months of patient fall injury data using a modified Shewhart U control chart, and charts derived from a quasi-Poisson generalised linear model (GLM) and a generalised additive mixed model (GAMM) that included an approximate upper control limit. Results: The data were overdispersed and displayed a downward trend and possible autocorrelation. The downward trend was followed by a predictable period after December 2003. The GLM-estimated incidence rate ratio was 0.98 (95% CI 0.98 to 0.99) per month. The GAMM-fitted count fell from 12.67 (95% CI 10.05 to 15.97) in January 2001 to 5.23 (95% CI 3.82 to 7.15) in December 2006 (p<0.001). The corresponding values for the GLM were 11.9 and 3.94. Residual plots suggested that the GLM underestimated the rate at the beginning and end of the series and overestimated it in the middle. The data suggested a more rapid rate fall before 2004 and a steady state thereafter, a pattern reflected in the GAMM chart. The approximate upper two-sigma equivalent control limit in the GLM and GAMM charts identified 2 months that showed possible special-cause variation. Conclusion: Charts based on GAMM analysis are a suitable alternative to Shewhart U control charts with these data.
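
The sketch below shows, on simulated data, two of the ingredients mentioned above: approximate Shewhart U chart limits for monthly rates, and a quasi-Poisson GLM trend fitted with a log-exposure offset. The data, variable choices and dispersion handling are illustrative rather than a reproduction of the paper's analysis.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)

# Illustrative monthly data: fall-injury counts and occupied bed-days (exposure).
months = np.arange(72)
exposure = rng.normal(8000, 300, size=72).round()
true_rate = 0.0015 * np.exp(-0.01 * months)          # slowly falling injury rate
counts = rng.poisson(true_rate * exposure)

# Shewhart U chart: centre line and approximate 3-sigma upper limit per month.
u_bar = counts.sum() / exposure.sum()
ucl = u_bar + 3 * np.sqrt(u_bar / exposure)
rates = counts / exposure
signals = np.where(rates > ucl)[0]                    # months flagged as special cause

# Quasi-Poisson GLM trend with a log link and log-exposure offset;
# scale='X2' inflates standard errors by the Pearson dispersion estimate.
X = sm.add_constant(months)
glm = sm.GLM(counts, X, family=sm.families.Poisson(), offset=np.log(exposure))
fit = glm.fit(scale='X2')
irr_per_month = np.exp(fit.params[1])                 # incidence rate ratio per month
print(f"monthly IRR = {irr_per_month:.3f}, flagged months: {signals}")
```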

Relevance:

20.00%

Publisher:

Abstract:

A computationally efficient sequential Monte Carlo algorithm is proposed for the sequential design of experiments for the collection of block data described by mixed effects models. The difficulty in applying a sequential Monte Carlo algorithm in such settings is the need to evaluate the observed data likelihood, which is typically intractable for all but linear Gaussian models. To overcome this difficulty, we propose to unbiasedly estimate the likelihood, and perform inference and make decisions based on an exact-approximate algorithm. Two estimates are proposed: using Quasi Monte Carlo methods and using the Laplace approximation with importance sampling. Both of these approaches can be computationally expensive, so we propose exploiting parallel computational architectures to ensure designs can be derived in a timely manner. We also extend our approach to allow for model uncertainty. This research is motivated by important pharmacological studies related to the treatment of critically ill patients.
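
A minimal sketch of the Laplace-plus-importance-sampling likelihood estimate is given below for a single subject in a Poisson random-intercept model; the model, prior and tuning constants are assumptions chosen to keep the example self-contained, not the study's actual setting.

```python
import numpy as np
from scipy import optimize
from scipy.stats import norm, poisson

def is_likelihood_estimate(y, beta, sigma_b, n_draws=2000, rng=None):
    """Importance-sampling estimate of the observed-data likelihood
    p(y | beta, sigma_b) = integral of p(y | b) p(b) db for one subject in a
    Poisson random-intercept model: y_j ~ Poisson(exp(beta + b)), b ~ N(0, sigma_b^2).

    The proposal is a Gaussian centred at the Laplace mode of the integrand,
    giving an unbiased (and lower-variance) estimate of the intractable integral.
    """
    if rng is None:
        rng = np.random.default_rng()

    def neg_log_integrand(b):
        return -(poisson.logpmf(y, np.exp(beta + b)).sum() + norm.logpdf(b, 0, sigma_b))

    # Laplace step: mode and curvature of the integrand over the random effect b.
    res = optimize.minimize_scalar(neg_log_integrand)
    b_hat = res.x
    eps = 1e-4
    hess = (neg_log_integrand(b_hat + eps) - 2 * neg_log_integrand(b_hat)
            + neg_log_integrand(b_hat - eps)) / eps ** 2
    prop_sd = 1.0 / np.sqrt(hess)

    # Importance sampling with the Laplace Gaussian as the proposal.
    b = rng.normal(b_hat, prop_sd, size=n_draws)
    log_w = (-np.array([neg_log_integrand(bi) for bi in b])
             - norm.logpdf(b, b_hat, prop_sd))
    return np.exp(log_w).mean()

y = np.array([3, 5, 2, 4])
print(is_likelihood_estimate(y, beta=1.0, sigma_b=0.5))
```

Plugging such an unbiased estimate into the particle weights is what makes the exact-approximate SMC approach described above feasible when the observed-data likelihood itself cannot be evaluated.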

Relevance:

20.00%

Publisher:

Abstract:

A new transdimensional sequential Monte Carlo (SMC) algorithm called SMCVB is proposed. In an SMC approach, a weighted sample of particles is generated from a sequence of probability distributions which ‘converge’ to the target distribution of interest, in this case a Bayesian posterior distribution. The approach is based on the use of variational Bayes to propose new particles at each iteration of the SMCVB algorithm in order to target the posterior more efficiently. The variational-Bayes-generated proposals are not limited to a fixed dimension, which means that the weighted particle sets that arise can have varying dimensions, thereby also allowing an appropriate dimension for the model to be estimated. This novel algorithm is outlined within the context of finite mixture model estimation. It provides a less computationally demanding alternative to using reversible jump Markov chain Monte Carlo kernels within an SMC approach. We illustrate these ideas in a simulated data analysis and in applications.
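
The sketch below illustrates the core idea of correcting variational-Bayes proposals with particle weights on a toy Gaussian-mean problem: particles are drawn from a stand-in VB approximation and importance-weighted towards the exact posterior. It is a one-step, fixed-dimension illustration, not the transdimensional SMCVB algorithm itself.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)

# Toy target: posterior for a Gaussian mean mu with known noise sd = 1 and prior N(0, 10^2).
y = rng.normal(2.0, 1.0, size=30)
prior = norm(0.0, 10.0)

# Stand-in for a variational approximation q(mu): slightly offset from the truth.
q = norm(y.mean() + 0.1, 0.3)

# Propose particles from q and importance-weight them towards the posterior,
# mirroring how SMCVB corrects VB-generated proposals with particle weights.
mu = q.rvs(size=5000, random_state=rng)
log_w = (norm.logpdf(y[:, None], mu, 1.0).sum(axis=0)   # log-likelihood of each particle
         + prior.logpdf(mu) - q.logpdf(mu))             # + log prior - log proposal
w = np.exp(log_w - log_w.max())
w /= w.sum()

ess = 1.0 / np.sum(w ** 2)                              # effective sample size of the weighted set
post_var = 1.0 / (len(y) / 1.0 ** 2 + 1.0 / 10.0 ** 2)  # exact conjugate posterior for comparison
post_mean = post_var * y.sum() / 1.0 ** 2
print(f"weighted mean = {np.sum(w * mu):.3f}, exact posterior mean = {post_mean:.3f}, ESS = {ess:.0f}")
```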