894 results for gaussian mixture model
Abstract:
A physically motivated statistical model is used to diagnose variability and trends in wintertime (October-March) Global Precipitation Climatology Project (GPCP) pentad (5-day mean) precipitation. Quasi-geostrophic theory suggests that extratropical precipitation amounts should depend multiplicatively on the pressure gradient, saturation specific humidity, and the meridional temperature gradient. This physical insight has been used to guide the development of a suitable statistical model for precipitation using a mixture of generalized linear models: a logistic model for the binary occurrence of precipitation and a Gamma distribution model for the wet day precipitation amount. The statistical model allows for the investigation of the role of each factor in determining variations and long-term trends. Saturation specific humidity q_s has a generally negative effect on global precipitation occurrence and on the tropical wet pentad precipitation amount, but has a positive relationship with the pentad precipitation amount at mid- and high latitudes. The North Atlantic Oscillation, a proxy for the meridional temperature gradient, is also found to have a statistically significant positive effect on precipitation over much of the Atlantic region. Residual time trends in wet pentad precipitation are extremely sensitive to the choice of the wet pentad threshold because of increasing trends in low-amplitude precipitation pentads; too low a choice of threshold can lead to a spurious decreasing trend in wet pentad precipitation amounts. However, for thresholds that are not too small, it is found that the meridional temperature gradient is an important factor for explaining part of the long-term trend in Atlantic precipitation.
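The two-part structure described in this abstract (a logistic model for occurrence, a Gamma model for the wet amount) can be illustrated with a minimal sketch. The log link for the Gamma part is an assumption made here because it turns additive predictors into multiplicative factors; the abstract does not state the link function, and all function names and coefficients below are hypothetical.

```python
import math

def wet_probability(x, beta):
    """Logistic part: probability that a pentad is wet, given predictors x."""
    eta = sum(b * v for b, v in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-eta))

def wet_mean_amount(x, gamma):
    """Gamma part (assumed log link): mean amount on wet pentads.

    With a log link, each predictor contributes a multiplicative factor
    exp(gamma_i * x_i) to the mean.
    """
    return math.exp(sum(g * v for g, v in zip(gamma, x)))

def expected_precipitation(x, beta, gamma):
    """Overall mean = P(wet) * E[amount | wet]."""
    return wet_probability(x, beta) * wet_mean_amount(x, gamma)
```

Fitting the coefficients would normally be done with a GLM routine; the sketch only shows how the two fitted parts combine into one expected value.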
Abstract:
We investigate the performance of phylogenetic mixture models in reducing a well-known and pervasive artifact of phylogenetic inference known as the node-density effect, comparing them to partitioned analyses of the same data. The node-density effect refers to the tendency for the amount of evolutionary change in longer branches of phylogenies to be underestimated compared to that in regions of the tree where there are more nodes and thus branches are typically shorter. Mixture models allow more than one model of sequence evolution to describe the sites in an alignment without prior knowledge of the evolutionary processes that characterize the data or how they correspond to different sites. If multiple evolutionary patterns are common in sequence evolution, mixture models may be capable of reducing node-density effects by characterizing the evolutionary processes more accurately. In gene-sequence alignments simulated to have heterogeneous patterns of evolution, we find that mixture models can reduce node-density effects to negligible levels or remove them altogether, performing as well as partitioned analyses based on the known simulated patterns. The mixture models achieve this without knowledge of the patterns that generated the data and even in some cases without specifying the full or true model of sequence evolution known to underlie the data. The latter result is especially important in real applications, as the true model of evolution is seldom known. We find the same patterns of results for two real data sets with evidence of complex patterns of sequence evolution: mixture models substantially reduced node-density effects and returned better likelihoods compared to partitioning models specifically fitted to these data. 
We suggest that the presence of more than one pattern of evolution in the data is a common source of error in phylogenetic inference and that mixture models can often detect these patterns even without prior knowledge of their presence in the data. Routine use of mixture models alongside other approaches to phylogenetic inference may often reveal hidden or unexpected patterns of sequence evolution and can improve phylogenetic inference.
Abstract:
This article is about modeling count data with zero truncation. A parametric count density family is considered. The truncated mixture of densities from this family is different from the mixture of truncated densities from the same family. Whereas the former model is more natural to formulate and to interpret, the latter model is theoretically easier to treat. It is shown that for any mixing distribution leading to a truncated mixture, a (usually different) mixing distribution can be found so that the associated mixture of truncated densities equals the truncated mixture, and vice versa. This implies that the likelihood surfaces for both situations agree, and in this sense both models are equivalent. Zero-truncated count data models are used frequently in the capture-recapture setting to estimate population size, and it can be shown that the two Horvitz-Thompson estimators, associated with the two models, agree. In particular, it is possible to achieve strong results for mixtures of truncated Poisson densities, including reliable, global construction of the unique NPMLE (nonparametric maximum likelihood estimator) of the mixing distribution, implying a unique estimator for the population size. The benefit of these results lies in the fact that it is valid to work with the mixture of truncated count densities, which is less appealing for the practitioner but theoretically easier. Mixtures of truncated count densities form a convex linear model, for which a developed theory exists, including global maximum likelihood theory as well as algorithmic approaches. Once the problem has been solved in this class, it might readily be transformed back to the original problem by means of an explicitly given mapping. Applications of these ideas are given, particularly in the case of the truncated Poisson family.
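The equivalence claimed in this abstract can be checked numerically for a discrete mixing distribution over Poisson components. The sketch below assumes the weight mapping q_j ∝ w_j (1 − e^{−λ_j}), which follows from 1 − Σ w_j e^{−λ_j} = Σ w_j (1 − e^{−λ_j}); the function names are hypothetical and the article's own notation may differ.

```python
import math

def poisson_pmf(x, lam):
    return math.exp(-lam) * lam ** x / math.factorial(x)

def truncated_mixture_pmf(x, w, lams):
    """Mix first, then condition on x >= 1."""
    p0 = sum(wj * math.exp(-l) for wj, l in zip(w, lams))
    mix = sum(wj * poisson_pmf(x, l) for wj, l in zip(w, lams))
    return mix / (1.0 - p0)

def mixture_of_truncated_pmf(x, q, lams):
    """Truncate each component at zero first, then mix with weights q."""
    return sum(qj * poisson_pmf(x, l) / (1.0 - math.exp(-l))
               for qj, l in zip(q, lams))

def map_weights(w, lams):
    """Map truncated-mixture weights w to mixture-of-truncated weights q."""
    raw = [wj * (1.0 - math.exp(-l)) for wj, l in zip(w, lams)]
    total = sum(raw)
    return [r / total for r in raw]

def ht_truncated_mixture(n_observed, w, lams):
    """Horvitz-Thompson estimate n / (1 - p0) under the truncated mixture."""
    p0 = sum(wj * math.exp(-l) for wj, l in zip(w, lams))
    return n_observed / (1.0 - p0)

def ht_mixture_of_truncated(n_observed, q, lams):
    """Horvitz-Thompson estimate under the mixture of truncated densities."""
    return n_observed * sum(qj / (1.0 - math.exp(-l)) for qj, l in zip(q, lams))
```

With `q = map_weights(w, lams)`, the two probability mass functions coincide for every x ≥ 1, and the two population-size estimates coincide as well.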
Abstract:
This paper introduces a new fast, effective and practical model structure construction algorithm for a mixture of experts network system utilising only process data. The algorithm is based on a novel forward constrained regression procedure. Given a full set of the experts as potential model bases, the structure construction algorithm, formed on the forward constrained regression procedure, selects the most significant model base one by one so as to minimise the overall system approximation error at each iteration, while the gate parameters in the mixture of experts network system are accordingly adjusted so as to satisfy the convex constraints required in the derivation of the forward constrained regression procedure. The procedure continues until a proper system model is constructed that utilises some or all of the experts. A pruning algorithm of the consequent mixture of experts network system is also derived to generate an overall parsimonious construction algorithm. Numerical examples are provided to demonstrate the effectiveness of the new algorithms. The mixture of experts network framework can be applied to a wide variety of applications ranging from multiple model controller synthesis to multi-sensor data fusion.
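One plausible reading of a single forward constrained selection step is sketched below. It assumes that the current model output is blended with one candidate expert through a weight mu restricted to [0, 1], so the overall gate weights stay non-negative and sum to one. This is an illustrative assumption, not the paper's exact procedure, and the function name is hypothetical.

```python
def forward_constrained_step(target, current, experts):
    """Select the expert and blend weight mu in [0, 1] that minimise squared error.

    target, current: lists of outputs over the training samples.
    experts: list of candidate expert outputs, each a list of the same length.
    Returns (sse, expert_index, mu, blended_output).
    """
    residual = [t - c for t, c in zip(target, current)]
    best = None
    for j, expert in enumerate(experts):
        direction = [e - c for e, c in zip(expert, current)]
        norm = sum(d * d for d in direction)
        if norm == 0.0:
            mu = 0.0
        else:
            # Unconstrained least-squares step, clipped to keep the mix convex.
            mu = sum(r * d for r, d in zip(residual, direction)) / norm
            mu = min(1.0, max(0.0, mu))
        blended = [c + mu * d for c, d in zip(current, direction)]
        sse = sum((t - b) ** 2 for t, b in zip(target, blended))
        if best is None or sse < best[0]:
            best = (sse, j, mu, blended)
    return best
```

Repeating the step with `current` set to the returned blend adds experts one by one; after each step the earlier gate weights are scaled by (1 − mu), which preserves the convex constraint.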
Abstract:
A new incremental four-dimensional variational (4D-Var) data assimilation algorithm is introduced. The algorithm does not require the computationally expensive integrations with the nonlinear model in the outer loops. Nonlinearity is accounted for by modifying the linearization trajectory of the observation operator based on integrations with the tangent linear (TL) model. This allows us to update the linearization trajectory of the observation operator in the inner loops at negligible computational cost. As a result the distinction between inner and outer loops is no longer necessary. The key idea on which the proposed 4D-Var method is based is that by using Gaussian quadrature it is possible to get an exact correspondence between the nonlinear time evolution of perturbations and the time evolution in the TL model. It is shown that J-point Gaussian quadrature can be used to derive the exact adjoint-based observation impact equations and furthermore that it is straightforward to account for the effect of multiple outer loops in these equations if the proposed 4D-Var method is used. The method is illustrated using a three-level quasi-geostrophic model and the Lorenz (1996) model.
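The exactness property of Gaussian quadrature that this abstract relies on can be shown in isolation: a J-point Gauss-Legendre rule integrates polynomials of degree up to 2J − 1 exactly. The sketch below uses the standard 3-point rule and does not reproduce the 4D-Var algorithm itself; the function name is hypothetical.

```python
import math

# Standard 3-point Gauss-Legendre nodes and weights on [-1, 1].
NODES = (-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0))
WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)

def gauss_legendre_3(f, a, b):
    """Approximate the integral of f over [a, b] with the 3-point rule.

    The result is exact (up to rounding) when f is a polynomial of degree <= 5.
    """
    half = 0.5 * (b - a)
    mid = 0.5 * (a + b)
    return half * sum(w * f(mid + half * t) for w, t in zip(WEIGHTS, NODES))
```

For example, `gauss_legendre_3(lambda x: x**5, 0.0, 1.0)` matches the exact value 1/6 to rounding error, while a degree-6 integrand would no longer be integrated exactly.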
Abstract:
The assimilation of observations with a forecast is often heavily influenced by the description of the error covariances associated with the forecast. When a temperature inversion is present at the top of the boundary layer (BL), a significant part of the forecast error may be described as a vertical positional error (as opposed to amplitude error normally dealt with in data assimilation). In these cases, failing to account for positional error explicitly is shown to result in an analysis for which the inversion structure is erroneously weakened and degraded. In this article, a new assimilation scheme is proposed to explicitly include the positional error associated with an inversion. This is done through the introduction of an extra control variable to allow position errors in the a priori to be treated simultaneously with the usual amplitude errors. This new scheme, referred to as the ‘floating BL scheme’, is applied to the one-dimensional (vertical) variational assimilation of temperature. The floating BL scheme is tested with a series of idealised experiments and with real data from radiosondes. For each idealised experiment, the floating BL scheme gives an analysis which has the inversion structure and position in agreement with the truth, and outperforms the assimilation which accounts only for forecast amplitude error. When the floating BL scheme is used to assimilate a large sample of radiosonde data, its ability to give an analysis with an inversion height in better agreement with that observed is confirmed. However, it is found that the use of Gaussian statistics is an inappropriate description of the error statistics of the extra control variable. This problem is alleviated by incorporating a non-Gaussian description of the new control variable in the new scheme. Anticipated challenges in implementing the scheme operationally are discussed towards the end of the article.
Abstract:
The prebiotic Bimuno® is a mixture containing galactooligosaccharide, produced by the galactosyltransferase activity of Bifidobacterium bifidum NCIMB 41171 in the presence of lactose. Previous studies have implicated prebiotics in reducing infections by enteric pathogens; thus it was hypothesized that Bimuno® may confer some protection in the murine host from Salmonella enterica serovar Typhimurium (S. Typhimurium) infection. In this study, infection caused by S. Typhimurium SL1344nal(r) in the presence or absence of Bimuno® was assessed using tissue culture assays, a murine ligated ileal gut loop model and a murine oral challenge model. In tissue culture adherence and invasion assays with HT-29-16E cells, the presence of ~2 mM Bimuno® significantly reduced the invasion of S. Typhimurium SL1344nal(r) (P < 0.0001). In the murine ligated ileal gut loops, the presence of Bimuno® prevented colonization and the associated pathology of S. Typhimurium. In the BALB/c mouse model, the oral delivery of Bimuno® prior to challenge with S. Typhimurium resulted in significant reductions in colonization in the five organs sampled, with highly significant reductions being observed in the spleen at 72 and 96 h post-challenge (P = 0.0002 and P < 0.0001, respectively). Collectively, the results indicate that Bimuno® significantly reduced the colonization and pathology associated with S. Typhimurium infection in a murine model system, possibly by reducing the invasion of the pathogen into host cells.
Abstract:
Experiments assimilating the RAPID dataset of deep temperature and salinity profiles at 26.5°N on the western and eastern Atlantic boundaries into a 1° global NEMO ocean model have been performed. The meridional overturning circulation (MOC) is then assessed against the transports calculated directly from observations. The best initialization found for this short period was obtained by assimilating the EN3 upper-ocean hydrography database prior to 2004, after which different methods of assimilating 5-day average RAPID profiles at the western boundary were tested. The model MOC is strengthened by ∼2 Sv, giving closer agreement with the RAPID array transports, when the western boundary profiles are assimilated only below 900 m (the approximate depth of the Florida Straits, which are not well resolved) and when the T,S observations are spread meridionally from 10 to 35°N along the deep western boundary. The use of boundary-focused covariances has the largest impact on the assimilation results; using more conventional Gaussian covariances instead has a very local impact on the MOC at 26°N, with strong adverse impacts on the MOC stream function at higher and lower latitudes. Even with boundary-focused covariances, the MOC can be strengthened only for ∼2 years, after which the increased transport of warm waters leads to a negative feedback on water formation in the subpolar gyre, which then reduces the MOC. This negative feedback can be mitigated if EN3 hydrography data continue to be assimilated along with the RAPID array boundary data. Copyright © 2012 Royal Meteorological Society and Crown in the right of Canada.
Abstract:
This paper introduces a new adaptive nonlinear equalizer relying on a radial basis function (RBF) model, which is designed based on the minimum bit error rate (MBER) criterion, in the system setting of the intersymbol interference channel plus a co-channel interference. Our proposed algorithm is referred to as the on-line mixture of Gaussians estimator aided MBER (OMG-MBER) equalizer. Specifically, a mixture of Gaussians based probability density function (PDF) estimator is used to model the PDF of the decision variable, for which a novel on-line PDF update algorithm is derived to track the incoming data. With the aid of this novel on-line mixture of Gaussians based sample-by-sample updated PDF estimator, our adaptive nonlinear equalizer is capable of updating its parameters sample by sample to aim directly at minimizing the RBF nonlinear equalizer’s achievable bit error rate (BER). The proposed OMG-MBER equalizer significantly outperforms the existing on-line nonlinear MBER equalizer, known as the least bit error rate equalizer, in terms of both the convergence speed and the achievable BER, as is confirmed in our simulation study.
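The link between a mixture-of-Gaussians PDF estimate and the BER can be sketched as follows. The sketch assumes equal-weight Gaussian kernels of common width `rho` centred on the signed decision variables (decision output multiplied by the transmitted ±1 bit), so the BER estimate is the probability mass of that PDF below zero. It illustrates the general idea only; the paper's on-line update algorithm is not reproduced, and all names are hypothetical.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x) = P(Z > x) for standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def mixture_pdf(y, centres, rho):
    """Equal-weight mixture of Gaussians with common standard deviation rho."""
    norm = 1.0 / (rho * math.sqrt(2.0 * math.pi))
    return sum(norm * math.exp(-0.5 * ((y - c) / rho) ** 2)
               for c in centres) / len(centres)

def estimated_ber(signed_outputs, rho):
    """Mass of the mixture PDF below zero: mean of Q(s_k / rho).

    signed_outputs: decision variable times the transmitted bit (+1 or -1),
    so a negative value corresponds to a bit error.
    """
    return sum(q_function(s / rho) for s in signed_outputs) / len(signed_outputs)
```

Because `estimated_ber` is a smooth function of the decision outputs, its gradient with respect to the equalizer parameters can drive a sample-by-sample update, which is the role the PDF estimator plays in an MBER design.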
Abstract:
Numerical simulations are presented of the ion distribution functions seen by middle-altitude spacecraft in the low-latitude boundary layer (LLBL) and cusp regions when reconnection is, or has recently been, taking place at the equatorial magnetopause. From the evolution of the distribution function with time elapsed since the field line was opened, both the observed energy/observation-time and pitch-angle/energy dispersions are well reproduced. Distribution functions showing a mixture of magnetosheath and magnetospheric ions, often thought to be a signature of the LLBL, are found on newly opened field lines as a natural consequence of the magnetopause effects on the ions and their flight times. In addition, it is shown that the extent of the source region of the magnetosheath ions that are detected by a satellite is a function of the sensitivity of the ion instrument. If the instrument one-count level is high (and/or solar-wind densities are low), the cusp ion precipitation detected comes from a localised region of the mid-latitude magnetopause (around the magnetic cusp), even though the reconnection takes place at the equatorial magnetopause. However, if the instrument sensitivity is high enough, then ions injected from a large segment of the dayside magnetosphere (in the relevant hemisphere) will be detected in the cusp. Ion precipitation classed as LLBL is shown to arise from the low-latitude magnetopause, irrespective of the instrument sensitivity. Adoption of threshold flux definitions has the same effect as instrument sensitivity in artificially restricting the apparent source region.
Abstract:
Learning a low-dimensional manifold from highly nonlinear, high-dimensional data has become increasingly important for discovering intrinsic representations that can be utilized for data visualization and preprocessing. The autoencoder is a powerful dimensionality reduction technique based on minimizing reconstruction error, and it has regained popularity because it has been efficiently used for greedy pretraining of deep neural networks. Compared to the Neural Network (NN), the superiority of the Gaussian Process (GP) has been shown in model inference, optimization and performance. The GP has been successfully applied in nonlinear Dimensionality Reduction (DR) algorithms, such as the Gaussian Process Latent Variable Model (GPLVM). In this paper we propose the Gaussian Processes Autoencoder Model (GPAM) for dimensionality reduction by extending the classic NN-based autoencoder to a GP-based autoencoder. More interestingly, the novel model can also be viewed as a back-constrained GPLVM (BC-GPLVM) where the back-constraint smooth function is represented by a GP. Experiments verify the performance of the newly proposed model.
Effects of orange juice formulation on prebiotic functionality using an in vitro colonic model system
Abstract:
A three-stage continuous fermentative colonic model system was used to monitor in vitro the effect of different orange juice formulations on prebiotic activity. Three different juices, with and without Bimuno, a mixture containing galactooligosaccharides (B-GOS), were assessed in terms of their ability to induce a bifidogenic microbiota. The recipe development was based on incorporating 2.75 g B-GOS into a 250 ml serving of juice (65°Brix of concentrate juice). Alongside the production of B-GOS juice, a control juice (orange juice without any additional Bimuno) and a positive control juice, containing all the components of Bimuno (glucose, galactose and lactose) in the same relative proportions with the exception of B-GOS, were developed. Ion Exchange Chromatography analysis was used to test the maintenance of Bimuno components after the production process. Data showed that sterilisation had no significant effect on the concentration of B-GOS and simple sugars. The three juice formulations were digested under conditions resembling the gastric and small intestinal environments. Main bacterial groups of the faecal microbiota were evaluated throughout the colonic model study using 16S rRNA-based fluorescence in situ hybridization (FISH). Potential effects of supplementation of the juices on microbial metabolism were studied by measuring short chain fatty acids (SCFAs) using gas chromatography. B-GOS juices showed positive modulations of the microbiota composition and metabolic activity. In particular, numbers of faecal bifidobacteria and lactobacilli were significantly higher when B-GOS juice was fermented compared to controls. Furthermore, fermentation of B-GOS juice resulted in an increase in Roseburia subcluster and concomitantly increased butyrate production, which is of potential benefit to the host. In conclusion, this study has shown B-GOS within orange juice can have a beneficial effect on the faecal microbiota.
Abstract:
The disadvantage of the majority of data assimilation schemes is the assumption that the conditional probability density function of the state of the system given the observations [posterior probability density function (PDF)] is distributed either locally or globally as a Gaussian. The advantage, however, is that through various different mechanisms they ensure initial conditions that are predominantly in linear balance and therefore spurious gravity wave generation is suppressed. The equivalent-weights particle filter is a data assimilation scheme that allows for a representation of a potentially multimodal posterior PDF. It does this via proposal densities that lead to extra terms being added to the model equations and means the advantage of the traditional data assimilation schemes, in generating predominantly balanced initial conditions, is no longer guaranteed. This paper looks in detail at the impact the equivalent-weights particle filter has on dynamical balance and gravity wave generation in a primitive equation model. The primary conclusions are that (i) provided the model error covariance matrix imposes geostrophic balance, then each additional term required by the equivalent-weights particle filter is also geostrophically balanced; (ii) the relaxation term required to ensure the particles are in the locality of the observations has little effect on gravity waves and actually induces a reduction in gravity wave energy if sufficiently large; and (iii) the equivalent-weights term, which leads to the particles having equivalent significance in the posterior PDF, produces a change in gravity wave energy comparable to the stochastic model error. Thus, the scheme does not produce significant spurious gravity wave energy and so has potential for application in real high-dimensional geophysical applications.
Abstract:
The deterpenation of bergamot essential oil can be performed by liquid-liquid extraction using hydrous ethanol as the solvent. A ternary mixture composed of 1-methyl-4-prop-1-en-2-yl-cyclohexene (limonene), 3,7-dimethylocta-1,6-dien-3-yl-acetate (linalyl acetate), and 3,7-dimethylocta-1,6-dien-3-ol (linalool), three major compounds commonly found in bergamot oil, was used to simulate this essential oil. Liquid-liquid equilibrium data were experimentally determined for systems containing essential oil compounds, ethanol, and water at 298.2 K and are reported in this paper. The experimental data were correlated using the NRTL and UNIQUAC models, and the mean deviations between calculated and experimental data were lower than 0.0062 in all systems, indicating the good descriptive quality of the molecular models. To verify the effect of the water mass fraction in the solvent and the linalool mass fraction in the terpene phase on the distribution coefficients of the essential oil compounds, nonlinear regression analyses were performed, obtaining mathematical models with correlation coefficient values higher than 0.99. The results show that as the water content in the solvent phase increased, the kappa value decreased, regardless of the type of compound studied. Conversely, as the linalool content increased, the distribution coefficients of hydrocarbon terpene and ester also increased. However, the linalool distribution coefficient values were negatively affected when the terpene alcohol content increased in the terpene phase.