26 results for Log cabins.
Abstract:
This letter presents data from triaxial tests conducted as part of a research programme into the stress-strain behaviour of clays and silts at Cambridge University. To support findings from earlier research using databases of soil tests, eighteen CIU triaxial tests on speswhite kaolin were performed to confirm an assumed link between mobilisation strain ($\gamma_{M=2}$) and overconsolidation ratio (OCR). In the moderate shear stress range ($0.2c_u$ to $0.8c_u$) the test data are essentially linear on log-log plots. Both the slopes and intercepts of these lines are simple functions of OCR.
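A straight line on a log-log plot corresponds to a power law, so its slope and intercept can be recovered by ordinary least squares on the logarithms. The sketch below illustrates only that step. The data are synthetic, the choice of axes (shear strain on x, mobilised stress ratio on y) is an assumption, and `fit_power_law` is a hypothetical helper rather than anything from the letter.

```python
import math

def fit_power_law(xs, ys):
    """Least-squares line through (log10 x, log10 y); returns (slope, intercept)."""
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic, made-up data covering the moderate range 0.2 to 0.8 of c_u.
ratios = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
# Hypothetical power law linking strain to stress ratio (not from the letter).
strains = [0.004 * r ** 2.0 for r in ratios]

slope, intercept = fit_power_law(strains, ratios)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

Repeating such a fit for each test would give one slope and one intercept per OCR value, which is the form in which the letter reports its trends.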
Abstract:
Mandarin Chinese is based on characters which are syllabic in nature and morphological in meaning. All spoken languages have syllabiotactic rules which govern the construction of syllables and their allowed sequences. These constraints are not as restrictive as those learned from word sequences, but they can provide additional useful linguistic information. Hence, it is possible to improve speech recognition performance by appropriately combining these two types of constraints. For the Chinese language considered in this paper, character level language models (LMs) can be used as a first level approximation to allowed syllable sequences. To test this idea, word and character level n-gram LMs were trained on 2.8 billion words (equivalent to 4.3 billion characters) of texts from a wide collection of text sources. Both hypothesis and model based combination techniques were investigated to combine word and character level LMs. Significant character error rate reductions up to 7.3% relative were obtained on a state-of-the-art Mandarin Chinese broadcast audio recognition task using an adapted history dependent multi-level LM that performs a log-linear combination of character and word level LMs. This supports the hypothesis that character or syllable sequence models are useful for improving Mandarin speech recognition performance.
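A log-linear combination of two models can be sketched as a weighted sum of log-probabilities followed by renormalisation. The toy example below shows only that generic operation over a fixed candidate set. The weights, the probabilities and the function name `log_linear_combine` are illustrative assumptions, and the adapted, history-dependent multi-level LM described in the abstract is considerably more involved.

```python
import math

def log_linear_combine(p_word, p_char, w_word=0.7, w_char=0.3):
    """Combine two distributions over the same candidates log-linearly.

    Each candidate's score is a weighted sum of log-probabilities; the scores
    are then renormalised so the result sums to one.
    """
    scores = {
        k: w_word * math.log(p_word[k]) + w_char * math.log(p_char[k])
        for k in p_word
    }
    log_z = math.log(sum(math.exp(s) for s in scores.values()))
    return {k: math.exp(s - log_z) for k, s in scores.items()}

# Toy probabilities for three candidate words given some history (made up).
p_word = {"A": 0.6, "B": 0.3, "C": 0.1}  # word-level LM
p_char = {"A": 0.2, "B": 0.5, "C": 0.3}  # character-level LM score of each word

combined = log_linear_combine(p_word, p_char)
for candidate, prob in combined.items():
    print(candidate, round(prob, 4))
```

Setting `w_char=0` and `w_word=1` recovers the word-level distribution unchanged, which is a convenient sanity check on the normalisation.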
Abstract:
Conventional Hidden Markov models generally consist of a Markov chain observed through a linear map corrupted by additive noise. This general class of model has enjoyed a huge and diverse range of applications, for example, speech processing, biomedical signal processing and more recently quantitative finance. However, a lesser known extension of this general class of model is the so-called Factorial Hidden Markov Model (FHMM). FHMMs also have diverse applications, notably in machine learning, artificial intelligence and speech recognition [13, 17]. FHMMs extend the usual class of HMMs by supposing the partially observed state process is a finite collection of distinct Markov chains, either statistically independent or dependent. There is also considerable current activity in applying collections of partially observed Markov chains to complex action recognition problems, see, for example, [6]. In this article we consider the Maximum Likelihood (ML) parameter estimation problem for FHMMs. Much of the extant literature concerning this problem presents parameter estimation schemes based on full data log-likelihood EM algorithms. This approach can be slow to converge and often imposes heavy demands on computer memory. The latter point is particularly relevant for the class of FHMMs where state space dimensions are relatively large. The contribution in this article is to develop new recursive formulae for a filter-based EM algorithm that can be implemented online. Our new formulae are equivalent ML estimators; however, they are purely recursive and so significantly reduce numerical complexity and memory requirements. A computer simulation is included to demonstrate the performance of our results. © Taylor & Francis Group, LLC.
Abstract:
Unbiased location- and scale-invariant `elemental' estimators for the GPD tail parameter are constructed. Each involves three log-spacings. The estimators are unbiased for finite sample sizes, even as small as N=3. It is shown that the elementals form a complete basis for unbiased location- and scale-invariant estimators constructed from linear combinations of log-spacings. Preliminary numerical evidence is presented which suggests that elemental combinations can be constructed which are consistent estimators of the tail parameter for samples drawn from the pure GPD family.
Abstract:
In a companion paper (McRobie (2013), arXiv:1304.3918), a simple set of `elemental' estimators was presented for the Generalized Pareto tail parameter. Each elemental estimator: involves only three log-spacings; is absolutely unbiased for all values of the tail parameter; is location- and scale-invariant; and is valid for all sample sizes $N$, even as small as $N= 3$. It was suggested that linear combinations of such elementals could then be used to construct efficient unbiased estimators. In this paper, the analogous mathematical approach is taken to the Generalised Extreme Value (GEV) distribution. The resulting elemental estimators, although not absolutely unbiased, are found to have very small bias, and may thus provide a useful basis for the construction of efficient estimators.
Identifying cancer subtypes in glioblastoma by combining genomic, transcriptomic and epigenomic data
Abstract:
We present a nonparametric Bayesian method for disease subtype discovery in multi-dimensional cancer data. Our method can simultaneously analyse a wide range of data types, allowing for both agreement and disagreement between their underlying clustering structure. It includes feature selection and infers the most likely number of disease subtypes, given the data. We apply the method to 277 glioblastoma samples from The Cancer Genome Atlas, for which there are gene expression, copy number variation, methylation and microRNA data. We identify 8 distinct consensus subtypes and study their prognostic value for death, new tumour events, progression and recurrence. The consensus subtypes are prognostic of tumour recurrence (log-rank p-value of $3.6 \times 10^{-4}$ after correction for multiple hypothesis tests). This is driven principally by the methylation data (log-rank p-value of $2.0 \times 10^{-3}$) but the effect is strengthened by the other 3 data types, demonstrating the value of integrating multiple data types. Of particular note is a subtype of 47 patients characterised by very low levels of methylation. This subtype has very low rates of tumour recurrence and no new events in 10 years of follow up. We also identify a small gene expression subtype of 6 patients that shows particularly poor survival outcomes. Additionally, we note a consensus subtype that shows a highly distinctive data signature and suggest that it is therefore a biologically distinct subtype of glioblastoma. The code is available from https://sites.google.com/site/multipledatafusion/
Abstract:
Convergence analysis of consensus algorithms is revisited in the light of the Hilbert distance. The Lyapunov function used in the early analysis by Tsitsiklis is shown to be the Hilbert distance to consensus in log coordinates. Birkhoff theorem, which proves contraction of the Hilbert metric for any positive homogeneous monotone map, provides an early yet general convergence result for consensus algorithms. Because Birkhoff theorem holds in arbitrary cones, we extend consensus algorithms to the cone of positive definite matrices. The proposed generalization finds applications in the convergence analysis of quantum stochastic maps, which are a generalization of stochastic maps to non-commutative probability spaces. ©2010 IEEE.
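One way to read the link between the two functions: for a state vector x, the spread max(x) - min(x) equals the Hilbert projective distance between exp(x) and the all-ones vector, which spans the consensus ray. The sketch below checks this identity numerically and shows the spread shrinking under a row-stochastic averaging step. The matrix `A`, the initial state and the helper names are illustrative assumptions, not taken from the paper.

```python
import math

def hilbert_distance(x, y):
    """Hilbert projective distance between two strictly positive vectors."""
    ratios = [a / b for a, b in zip(x, y)]
    return math.log(max(ratios) / min(ratios))

def spread(x):
    """Max-minus-min Lyapunov function used in classical consensus analysis."""
    return max(x) - min(x)

def consensus_step(A, x):
    """One linear consensus update x <- A x with a row-stochastic matrix A."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

# Illustrative row-stochastic matrix (each row sums to 1) and initial state.
A = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]
x = [1.0, 4.0, -2.0]
ones = [1.0] * len(x)

# The spread of x equals the Hilbert distance from exp(x) to the consensus ray.
in_log_coordinates = hilbert_distance([math.exp(v) for v in x], ones)
print(spread(x), in_log_coordinates)

# Track the spread over a few iterations; it never increases.
history = [spread(x)]
state = x
for _ in range(5):
    state = consensus_step(A, state)
    history.append(spread(state))
print(history)
```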
Abstract:
Statistically planar turbulent partially premixed flames for different initial intensities of decaying turbulence have been simulated for global equivalence ratios of 0.7 and 1.0 using three-dimensional, simplified chemistry-based direct numerical simulations (DNS). The simulation parameters are chosen such that the flames represent the thin reaction zones regime combustion. A random bimodal distribution of equivalence ratio is introduced in the unburned gas ahead of the flame to account for the mixture inhomogeneity. The results suggest that the probability density functions (PDFs) of the mixture fraction gradient magnitude |∇ξ| (i.e., P(|∇ξ|)) can be reasonably approximated using a log-normal distribution. However, this presumed PDF distribution captures only the qualitative nature of the PDF of the reaction progress variable gradient magnitude |∇c| (i.e., P(|∇c|)). It has been found that a bivariate log-normal distribution does not sufficiently capture the quantitative behavior of the joint PDF of |∇ξ| and |∇c| (i.e., P(|∇ξ|, |∇c|)), and the agreement with the DNS data has been found to be poor in certain regions of the flame brush, particularly toward the burned gas side of the flame brush. Moreover, the variables |∇ξ| and |∇c| show appreciable correlation toward the burned gas side of the flame brush. These findings are corroborated further using DNS data of a lifted jet flame to study the flame geometry dependence of these statistics. © 2013 Copyright Taylor and Francis Group, LLC.
Abstract:
An analysis is presented of a database of 67 tests on 21 clays and silts of undrained shear stress-strain data of fine-grained soils. Normalizations of secant $G$ in terms of initial mean effective stress $p'$ (i.e., $G/p'$ versus $\log \gamma$) or undrained shear strength $c_u$ (i.e., $G/c_u$ versus $\log \gamma$) are shown to be much less successful in reducing the scatter between different clays than the approach that uses the maximum shear modulus, $G_{max}$, a technique still not universally adopted by geotechnical researchers and constitutive modelers. Analysis of semiempirical expressions for $G_{max}$ is presented and a simple expression that uses only a void-ratio function and a confining-stress function is proposed. This is shown to be superior to a Hardin-style equation, and the void ratio function is demonstrated as an alternative to an overconsolidation ratio (OCR) function. To derive correlations that offer reliable estimates of secant stiffness at any required magnitude of working strain, secant shear modulus $G$ is normalized with respect to its small-strain value $G_{max}$, and shear strain $\gamma$ is normalized with respect to a reference strain $\gamma_{ref}$ at which this stiffness has halved. The data are corrected to two standard strain rates to reduce the discrepancy between data obtained from static and cyclic testing. The reference strain $\gamma_{ref}$ is approximated as a function of the plasticity index. A unique normalized shear modulus reduction curve in the shape of a modified hyperbola is fitted to all the available data up to shear strains of the order of 1%. As a result, good estimates can be made of the modulus reduction $G/G_{max}$ to within ±30% across all strain levels in approximately 90% of the cases studied. New design charts are proposed to update the commonly used design curves. © 2013 American Society of Civil Engineers.
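The abstract does not state the fitted equation. A common way to write a modified hyperbola that halves the stiffness at the reference strain is $G/G_{max} = 1 / (1 + (\gamma/\gamma_{ref})^{a})$, and the sketch below assumes that form. The exponent `curvature`, the reference strain value and the function name are hypothetical placeholders, not values from the paper.

```python
def modulus_reduction(gamma, gamma_ref, curvature=1.0):
    """Assumed modified-hyperbola form: G/Gmax = 1 / (1 + (gamma/gamma_ref)**curvature).

    By construction the ratio equals 0.5 when gamma == gamma_ref, whatever
    the curvature exponent is.
    """
    return 1.0 / (1.0 + (gamma / gamma_ref) ** curvature)

# Hypothetical inputs for illustration only.
gamma_ref = 1e-3   # reference shear strain (placeholder value)
curvature = 0.8    # placeholder exponent, not a fitted value

# Evaluate the curve at a few strains up to the order of 1%.
strains = [1e-5, 1e-4, 1e-3, 1e-2]
curve = [(g, modulus_reduction(g, gamma_ref, curvature)) for g in strains]
for g, ratio in curve:
    print(f"gamma = {g:.0e}  G/Gmax = {ratio:.3f}")
```

With a plasticity-index correlation for the reference strain, such a function would give a secant stiffness estimate at any working strain once $G_{max}$ is known.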
Abstract:
State-of-the-art speech recognisers are usually based on hidden Markov models (HMMs). They model a hidden symbol sequence with a Markov process, with the observations independent given that sequence. These assumptions yield efficient algorithms, but limit the power of the model. An alternative model that allows a wide range of features, including word- and phone-level features, is a log-linear model. To handle, for example, word-level variable-length features, the original feature vectors must be segmented into words. Thus, decoding must find the optimal combination of segmentation of the utterance into words and word sequence. Features must therefore be extracted for each possible segment of audio. For many types of features, this becomes slow. In this paper, long-span features are derived from the likelihoods of word HMMs. Derivatives of the log-likelihoods, which break the Markov assumption, are appended. Previously, decoding with this model took cubic time in the length of the sequence, and longer for higher-order derivatives. This paper shows how to decode in quadratic time. © 2013 IEEE.
Abstract:
Performance on visual working memory tasks decreases as more items need to be remembered. Over the past decade, a debate has unfolded between proponents of slot models and slotless models of this phenomenon (Ma, Husain, & Bays, Nature Neuroscience 17, 347-356, 2014). Zhang and Luck (Nature 453, (7192), 233-235, 2008) and Anderson, Vogel, and Awh (Attention, Perception, Psychophys 74, (5), 891-910, 2011) noticed that as more items need to be remembered, "memory noise" seems to first increase and then reach a "stable plateau." They argued that three summary statistics characterizing this plateau are consistent with slot models, but not with slotless models. Here, we assess the validity of their methods. We generated synthetic data both from a leading slot model and from a recent slotless model and quantified model evidence using log Bayes factors. We found that the summary statistics provided at most 0.15 % of the expected model evidence in the raw data. In a model recovery analysis, a total of more than a million trials were required to achieve 99 % correct recovery when models were compared on the basis of summary statistics, whereas fewer than 1,000 trials were sufficient when raw data were used. Therefore, at realistic numbers of trials, plateau-related summary statistics are highly unreliable for model comparison. Applying the same analyses to subject data from Anderson et al. (Attention, Perception, Psychophys 74, (5), 891-910, 2011), we found that the evidence in the summary statistics was at most 0.12 % of the evidence in the raw data and far too weak to warrant any conclusions. The evidence in the raw data, in fact, strongly favored the slotless model. These findings call into question claims about working memory that are based on summary statistics.