971 results for Gaussian mixture models
Abstract:
The Asian summer monsoon is a high-dimensional and highly nonlinear phenomenon involving considerable moisture transport from the ocean towards land, and is critical for the whole region. We have used daily ECMWF reanalysis (ERA-40) sea-level pressure (SLP) anomalies relative to the seasonal cycle, over the region 50-145°E, 20°S-35°N, to study the nonlinearity of the Asian monsoon using Isomap. We have focused on the two-dimensional embedding of the SLP anomalies for ease of interpretation. Unlike the unimodality obtained from tests performed in empirical orthogonal function space, the probability density function within the two-dimensional Isomap space turns out to be bimodal. However, a clustering procedure applied to the SLP data reveals support for three clusters, which are identified using a three-component bivariate Gaussian mixture model. The modes resemble active and break phases of the monsoon over South Asia, in addition to a third phase that shows active conditions over the Western North Pacific. Using the low-level wind field anomalies, the active phase over South Asia is found to be characterised by a strengthening and an eastward extension of the Somali jet, whereas during the break phase the Somali jet is weakened near southern India and the monsoon trough in northern India also weakens. Interpretation is aided by the APHRODITE gridded land precipitation product for monsoon Asia. The effect of the large-scale seasonal mean monsoon and lower boundary forcing, in the form of ENSO, is also investigated and discussed. The outcome is that ENSO is shown to perturb the intraseasonal regimes, in agreement with conceptual ideas.
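The three-component bivariate Gaussian mixture step can be sketched as follows; the synthetic 2-D points below stand in for the Isomap embedding of the SLP anomalies, and the cluster locations and sizes are illustrative assumptions, not the paper's data.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic stand-in for the 2-D Isomap embedding of daily SLP anomalies:
# three well-separated clouds of points (locations are assumptions).
points = np.vstack([
    rng.normal(loc=c, scale=0.5, size=(200, 2))
    for c in ([-3.0, 0.0], [3.0, 0.0], [0.0, 3.0])
])

# Three-component bivariate Gaussian mixture, as in the clustering step;
# each fitted component plays the role of one intraseasonal regime.
gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=0)
labels = gmm.fit_predict(points)
```

The fitted `gmm.means_` then locate the modes of the density in the embedding space.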
Abstract:
Scene classification based on latent Dirichlet allocation (LDA) is a more general modeling method within the bag-of-visual-words framework, in which the construction of a visual vocabulary is a crucial quantization process for the success of the classification. A framework is developed using the following new aspects: Gaussian mixture clustering for the quantization process; the use of an integrated visual vocabulary (IVV), built as the union of all centroids obtained from the separate quantization process of each class; and the use of several features, including the edge orientation histogram, CIELab color moments, and the gray-level co-occurrence matrix (GLCM). The experiments are conducted on IKONOS images with six semantic classes (tree, grassland, residential, commercial/industrial, road, and water). The results show that the use of an IVV increases the overall accuracy (OA) by 11 to 12% when implemented on the selected features and by 6% when implemented on all features. The selected features of CIELab color moments and GLCM together provide a better OA than either CIELab color moments or GLCM individually, which increase the OA by only ∼2 to 3%. Moreover, the results show that the OA of LDA outperforms that of C4.5 and naive Bayes tree by ∼20%. © 2014 Society of Photo-Optical Instrumentation Engineers (SPIE) [DOI: 10.1117/1.JRS.8.083690]
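The IVV construction, per-class Gaussian mixture quantization followed by a union of the centroids, can be sketched as below; the feature dimension, class names, and components-per-class count are illustrative assumptions, not the paper's settings.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
# Hypothetical descriptor vectors per class (stand-ins for the EOH /
# CIELab color moment / GLCM features extracted from IKONOS tiles).
class_features = {
    "tree": rng.normal(0.0, 1.0, size=(300, 8)),
    "water": rng.normal(4.0, 1.0, size=(300, 8)),
}

k_per_class = 5  # assumed vocabulary size per class
vocab_parts = []
for name, feats in class_features.items():
    # Gaussian mixture clustering as the quantization step for this class.
    gmm = GaussianMixture(n_components=k_per_class, random_state=0).fit(feats)
    vocab_parts.append(gmm.means_)  # class-specific centroids

# Integrated visual vocabulary: union of all per-class centroids.
ivv = np.vstack(vocab_parts)
```

New descriptors are then assigned to their nearest IVV centroid to form the bag-of-visual-words histograms that LDA models.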
Abstract:
We develop a process-based model for the dispersion of a passive scalar in the turbulent flow around the buildings of a city centre. The street network model is based on dividing the airspace of the streets and intersections into boxes, within which the turbulence renders the air well mixed. Mean flow advection through the network of street and intersection boxes then mediates further lateral dispersion. At the same time, turbulent mixing in the vertical detrains scalar from the streets and intersections into the turbulent boundary layer above the buildings. When the geometry is regular, the street network model has an analytical solution that describes the variation in concentration in the near field downwind of a single source, where the majority of scalar lies below roof level. The power of the analytical solution is that it demonstrates how the concentration is determined by only three parameters. The plume direction parameter describes the branching of scalar at the street intersections and hence determines the direction of the plume centreline, which may be very different from the above-roof wind direction. The transmission parameter determines the distance travelled before the majority of scalar is detrained into the atmospheric boundary layer above roof level and conventional atmospheric turbulence takes over as the dominant mixing process. Finally, a normalised source strength multiplies this pattern of concentration. This analytical solution converges to a Gaussian plume after a large number of intersections have been traversed, providing theoretical justification for previous studies that have developed empirical fits to Gaussian plume models.
The analytical solution is shown to compare well with very high-resolution simulations and with wind tunnel experiments, although re-entrainment of scalar previously detrained into the boundary layer above roofs, which is not accounted for in the analytical solution, is shown to become an important process further downwind from the source.
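The street network solution itself is not reproduced in the abstract, so the sketch below implements only the classical ground-reflected Gaussian plume that the solution converges to far downwind; the sigma(x) power laws and all parameter defaults are illustrative assumptions, not fitted values.

```python
import math

def gaussian_plume(x, y, z, Q=1.0, u=2.0, H=10.0):
    """Ground-reflected Gaussian plume concentration at (x, y, z).

    x downwind, y crosswind, z height (m); Q source strength, u mean wind
    speed (m/s), H effective release height (m). The spread power laws
    below are illustrative assumptions, not calibrated dispersion curves.
    """
    sigma_y = 0.08 * x ** 0.9  # assumed lateral spread growth
    sigma_z = 0.06 * x ** 0.9  # assumed vertical spread growth
    lateral = math.exp(-y ** 2 / (2 * sigma_y ** 2))
    # Image source term models perfect reflection at the ground.
    vertical = (math.exp(-(z - H) ** 2 / (2 * sigma_z ** 2))
                + math.exp(-(z + H) ** 2 / (2 * sigma_z ** 2)))
    return Q / (2 * math.pi * u * sigma_y * sigma_z) * lateral * vertical
```

The crosswind profile is symmetric and Gaussian, which is the limiting pattern the network model reaches once many intersections have been traversed.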
Abstract:
In this paper we introduce a parametric model for handling lifetime data in which an early lifetime can be related to infant-mortality failure or to wear processes, but we do not know which risk is responsible for the failure. The maximum likelihood approach and the sampling-based approach are used to obtain the inferences of interest. Some special cases of the proposed model are studied via Monte Carlo methods for the size and power of hypothesis tests. To illustrate the proposed methodology, we present an example based on a real data set.
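One minimal way to picture such data, assuming an exponential infant-mortality risk mixed with a Weibull wear-out risk (the distributions and all parameter values are illustrative, not the paper's model), is:

```python
import random

random.seed(0)

def sample_lifetime(p_infant=0.3, rate_infant=2.0, shape=3.0, scale=5.0):
    """Latent-cause lifetime: with probability p_infant the unit fails
    early (exponential infant mortality), otherwise it fails by wear-out
    (Weibull). Only the failure time is observed, never which risk
    caused it -- mirroring the unknown-cause setup in the abstract.
    All parameter values here are illustrative assumptions.
    """
    if random.random() < p_infant:
        return random.expovariate(rate_infant)   # infant-mortality failure
    return random.weibullvariate(scale, shape)   # wear-out failure

lifetimes = [sample_lifetime() for _ in range(5000)]
```

Maximum likelihood estimation on such data must then average over the latent failure cause, since the cause label is never recorded.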
Abstract:
We present a Bayesian approach for modeling heterogeneous data and estimating multimodal densities using mixtures of Skew Student-t-Normal distributions [Gomez, H.W., Venegas, O., Bolfarine, H., 2007. Skew-symmetric distributions generated by the distribution function of the normal distribution. Environmetrics 18, 395-407]. A stochastic representation that is useful for implementing an MCMC-type algorithm and results about the existence of posterior moments are obtained. Marginal likelihood approximations are obtained in order to compare mixture models with different numbers of component densities. Data sets concerning Gross Domestic Product per capita (Human Development Report) and body mass index (National Health and Nutrition Examination Survey), previously studied in the related literature, are analyzed. (c) 2008 Elsevier B.V. All rights reserved.
Abstract:
A relativistic four-component study was performed for the XeF(2) molecule by using the Dirac-Coulomb (DC) Hamiltonian and the relativistic adapted Gaussian basis sets (RAGBSs). The comparison of bond lengths obtained showed that relativistic effects on this property are small (increase of only 0.01 angstrom) while the contribution of electron correlation, obtained at CCSD(T) or CCSD-T levels, is more important (increase of 0.05 angstrom). Electron correlation is also dominant over relativistic effects for dissociation energies. Moreover, the correlation-relativity interaction is shown to be negligible for these properties. The electron affinity, the first ionization potential and the double ionization potential are obtained by means of the Fock-space coupled cluster (FSCC) method, resulting in DC-CCSD-T values of 0.3 eV, 12.5 eV and 32.3 eV, respectively. Vibrational frequencies and some anharmonicity constants were also calculated under the four-component formalism by means of standard perturbation equations. All these molecular properties are, in general, in satisfactory agreement with available experimental results. Finally, a partition in terms of charge-charge flux-dipole flux (CCFDF) contributions derived by means of the quantum theory of atoms in molecules (QTAIM) in non-relativistic QCISD(FC)/3-21G* calculations was carried out for XeF(2) and KrF(2). This analysis showed that the most remarkable difference between both molecules lies in the charge flux contribution to the asymmetric stretching mode, which is negligible in KrF(2) but important in XeF(2). (c) 2008 Elsevier B.V. All rights reserved.
Abstract:
In survival analysis, long-duration models allow for the estimation of the cure fraction, which represents the portion of the population immune to the event of interest. Here we address classical and Bayesian estimation based on mixture models and promotion time models, using different distributions (exponential, Weibull and Pareto) to model failure time. The database used to illustrate the implementations is described in Kersey et al. (1987) and consists of a group of leukemia patients who underwent a certain type of transplant. The specific implementations used were numeric optimization by BFGS as implemented in R (base::optim), Laplace approximation (own implementation) and Gibbs sampling as implemented in WinBUGS. We describe the main features of the models used, the estimation methods and the computational aspects. We also discuss how different prior information can affect the Bayesian estimates.
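A minimal sketch of classical estimation for the simplest of these models, a mixture cure model with exponential failure times fitted by numerical optimization (here SciPy's L-BFGS-B rather than R's base::optim, and simulated data in place of the Kersey et al. set; all parameter values are assumptions):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)

# Simulated data: a fraction pi of patients is cured (never fails), the
# rest fail at exponential times; follow-up is censored at t = 5.
true_pi, true_rate, tau = 0.4, 1.0, 5.0
cured = rng.random(400) < true_pi
t_event = np.where(cured, np.inf, rng.exponential(1 / true_rate, size=400))
time = np.minimum(t_event, tau)
event = (t_event <= tau).astype(float)   # 1 = observed failure, 0 = censored

def neg_loglik(params):
    pi, rate = params
    # Mixture cure model: S(t) = pi + (1 - pi) * exp(-rate * t)
    dens = (1 - pi) * rate * np.exp(-rate * time)   # failure density
    surv = pi + (1 - pi) * np.exp(-rate * time)     # survivor function
    return -np.sum(event * np.log(dens) + (1 - event) * np.log(surv))

fit = minimize(neg_loglik, x0=[0.5, 0.5],
               bounds=[(0.01, 0.99), (0.01, 10.0)], method="L-BFGS-B")
pi_hat, rate_hat = fit.x
```

Censored observations contribute through the survivor function, which is what lets the cure fraction pi be identified from long-term survivors.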
Abstract:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Abstract:
The aim of this work is to discriminate vegetation classes through remote sensing images from the CBERS-2 satellite, related to the winter and summer seasons in the Campos Gerais region, Paraná State, Brazil. The vegetation cover of the region comprises different vegetation types: summer and winter crops, reforestation areas, natural areas and pasture. Supervised classification techniques such as the Maximum Likelihood Classifier (MLC) and Decision Tree were evaluated, considering a set of image attributes composed of the CCD sensor bands (1, 2, 3, 4), vegetation indices (CTVI, DVI, GEMI, NDVI, SR, SAVI, TVI), mixture models (soil, shadow, vegetation) and the first two principal components. The classification accuracy was evaluated using the classification error matrix and the kappa coefficient. A high discriminatory level was adopted when defining the classes, in order to allow separation of different kinds of winter and summer crops. The classification accuracy by Decision Tree was 94.5% and the kappa coefficient was 0.9389 for scene 157/128; for scene 158/127, the values were 88% and 0.8667, respectively. The classification accuracy by MLC was 84.86% and the kappa coefficient was 0.8099 for scene 157/128; for scene 158/127, the values were 77.90% and 0.7476, respectively. The results showed a better performance of the Decision Tree classifier than of the MLC, especially for classes related to cultivated crops, supporting the use of the Decision Tree classifier for vegetation cover mapping involving different kinds of crops.
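The kappa coefficient reported for each scene is derived from the classification error matrix; a minimal sketch of the computation (the 2x2 matrix in the test is a toy example, not the paper's data):

```python
import numpy as np

def kappa(confusion):
    """Cohen's kappa from a classification error (confusion) matrix:
    chance-corrected agreement between map classes and reference classes."""
    confusion = np.asarray(confusion, dtype=float)
    n = confusion.sum()
    observed = np.trace(confusion) / n  # overall accuracy (diagonal share)
    # Expected agreement under chance, from the row and column marginals.
    expected = (confusion.sum(axis=0) * confusion.sum(axis=1)).sum() / n ** 2
    return (observed - expected) / (1 - expected)
```

Applied to the full error matrices of scenes 157/128 and 158/127, this is the statistic behind the reported values such as 0.9389.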
Abstract:
Many recent survival studies propose modeling data with a cure fraction, i.e., data in which part of the population is not susceptible to the event of interest. This event may occur more than once for the same individual (recurrent event). We then have a scenario of recurrent event data in the presence of a cure fraction, which may appear in various areas such as oncology, finance and industry, among others. This paper proposes a multiple time scale survival model to analyze recurrent events with a cure fraction. The objective is to analyze the efficiency of certain interventions, so that the studied event will not happen again, in terms of covariates and censoring. All estimates were obtained using a sampling-based approach, which allows prior information to be incorporated with lower computational effort. Simulations were performed based on a clinical scenario in order to observe some frequentist properties of the estimation procedure for small and moderate sample sizes. An application to a well-known set of real mammary tumor data is provided.
Abstract:
In this paper, we propose a new three-parameter long-term lifetime distribution induced by a latent complementary risk framework, with decreasing, increasing and unimodal hazard functions: the long-term complementary exponential geometric distribution. The new distribution arises from latent complementary risk scenarios, where the lifetime associated with a particular risk is not observable; rather, we observe only the maximum lifetime value among all risks, together with the presence of long-term survival. The properties of the proposed distribution are discussed, including its probability density function and explicit algebraic formulas for its reliability, hazard and quantile functions and order statistics. Parameter estimation is based on the usual maximum-likelihood approach. A simulation study assesses the performance of the estimation procedure. We compare the new distribution with its particular cases, as well as with the long-term Weibull distribution, on three real data sets, observing its potential and competitiveness in comparison with some usual long-term lifetime distributions.
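Under the construction described, the observed lifetime is the maximum among a latent number of risks, plus a long-term fraction that never experiences the event; draws can be simulated as below. The geometric mechanism for the number of risks and all parameter values are illustrative assumptions, not the paper's fitted quantities.

```python
import random

random.seed(3)

def sample_ltceg(p_longterm=0.2, rate=1.0, theta=0.5):
    """One draw from a long-term complementary exponential geometric
    construction (illustrative parameters). A long-term survivor
    (probability p_longterm) never fails; otherwise the observed lifetime
    is the MAXIMUM of Z i.i.d. exponential latent lifetimes, with Z a
    geometric(theta) count of latent risks -- the complementary setup.
    """
    if random.random() < p_longterm:
        return float("inf")              # immune to the event of interest
    z = 1
    while random.random() >= theta:      # geometric number of latent risks
        z += 1
    return max(random.expovariate(rate) for _ in range(z))

draws = [sample_ltceg() for _ in range(2000)]
```

In real data the immune individuals appear only as censored observations, which is why the long-term fraction must be estimated rather than observed.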
Abstract:
Neurally adjusted ventilatory assist (NAVA) delivers airway pressure (P(aw)) in proportion to the electrical activity of the diaphragm (EAdi) using an adjustable proportionality constant (NAVA level, cm·H(2)O/μV). During systematic increases in the NAVA level, feedback-controlled down-regulation of the EAdi results in a characteristic two-phased response in P(aw) and tidal volume (Vt). The transition from the 1st to the 2nd response phase allows identification of adequate unloading of the respiratory muscles with NAVA (NAVA(AL)). We aimed to develop and validate a mathematical algorithm to identify NAVA(AL). P(aw), Vt, and EAdi were recorded while systematically increasing the NAVA level in 19 adult patients. In a multistep approach, inspiratory P(aw) peaks were first identified by dividing the EAdi into inspiratory portions using Gaussian mixture modeling. Two polynomials were then fitted onto the curves of both P(aw) peaks and Vt. The beginning of the P(aw) and Vt plateaus, and thus NAVA(AL), was identified at the minimum of squared polynomial derivative and polynomial fitting errors. A graphical user interface was developed in the Matlab computing environment. Median NAVA(AL) visually estimated by 18 independent physicians was 2.7 (range 0.4 to 5.8) cm·H(2)O/μV and identified by our model was 2.6 (range 0.6 to 5.0) cm·H(2)O/μV. NAVA(AL) identified by our model was below the range of visually estimated NAVA(AL) in two instances and was above in one instance. We conclude that our model identifies NAVA(AL) in most instances with acceptable accuracy for application in clinical routine and research.
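The plateau-detection step (polynomial fitting, then locating the minimum of the squared derivative) can be sketched as below; the saturating Paw-versus-NAVA-level curve is synthetic, and the polynomial degree and search grid are assumptions, not the authors' Matlab settings.

```python
import numpy as np

# Illustrative two-phase response: Paw peaks rise steeply at low NAVA
# levels, then plateau (synthetic values, not patient data).
nava_level = np.linspace(0.5, 4.0, 15)
paw_peaks = 25 * (1 - np.exp(-1.5 * nava_level))

# Fit a polynomial to the curve of Paw peaks, as in the multistep algorithm.
coeffs = np.polyfit(nava_level, paw_peaks, deg=4)
deriv = np.polyder(coeffs)

# Candidate plateau onset (NAVA_AL): the NAVA level where the squared
# derivative of the fitted polynomial is smallest, i.e. the flattest point.
grid = np.linspace(nava_level[0], nava_level[-1], 400)
nava_al = grid[np.argmin(np.polyval(deriv, grid) ** 2)]
```

In the full algorithm the same criterion is combined with the polynomial fitting errors and applied to both the Paw-peak and Vt curves.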
Abstract:
We present a new approach for corpus-based speech enhancement that significantly improves over a method published by Xiao and Nickel in 2010. Corpus-based enhancement systems do not merely filter an incoming noisy signal, but resynthesize its speech content via an inventory of pre-recorded clean signals. The goal of the procedure is to perceptually improve the sound of speech signals in background noise. The proposed new method modifies Xiao's method in four significant ways. Firstly, it employs a Gaussian mixture model (GMM) instead of a vector quantizer in the phoneme recognition front-end. Secondly, the state decoding of the recognition stage is supported with an uncertainty modeling technique. With the GMM and the uncertainty modeling it is possible to eliminate the need for noise dependent system training. Thirdly, the post-processing of the original method via sinusoidal modeling is replaced with a powerful cepstral smoothing operation. And lastly, due to the improvements of these modifications, it is possible to extend the operational bandwidth of the procedure from 4 kHz to 8 kHz. The performance of the proposed method was evaluated across different noise types and different signal-to-noise ratios. The new method was able to significantly outperform traditional methods, including the one by Xiao and Nickel, in terms of PESQ scores and other objective quality measures. Results of subjective CMOS tests over a smaller set of test samples support our claims.
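The GMM front-end replacing the vector quantizer can be sketched as follows: one Gaussian mixture per phoneme class, with recognition by maximum log-likelihood. The "cepstral" frames, class names, and component counts are synthetic assumptions, not the paper's trained inventory.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(4)

# Toy cepstral-like feature frames for two "phoneme" classes.
train = {
    "a": rng.normal(-2.0, 1.0, size=(400, 6)),
    "s": rng.normal(+2.0, 1.0, size=(400, 6)),
}

# One GMM per phoneme class, standing in for the vector-quantizer codebook.
models = {ph: GaussianMixture(n_components=3, random_state=0).fit(x)
          for ph, x in train.items()}

def recognize(frame):
    """Label a frame with the phoneme whose GMM assigns it the highest
    log-likelihood -- the decision rule of a GMM recognition front-end."""
    scores = {ph: m.score_samples(frame[None, :])[0]
              for ph, m in models.items()}
    return max(scores, key=scores.get)
```

In the enhancement system these per-frame decisions are further smoothed by the state decoding and uncertainty modeling described above.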
Abstract:
Boston Harbor has had a history of poor water quality, including contamination by enteric pathogens. We conduct a statistical analysis of data collected by the Massachusetts Water Resources Authority (MWRA) between 1996 and 2002 to evaluate the effects of court-mandated improvements in sewage treatment. Motivated by the ineffectiveness of standard Poisson mixture models and their zero-inflated counterparts, we propose a new negative binomial model for time series of Enterococcus counts in Boston Harbor, where nonstationarity and autocorrelation are modeled using a nonparametric smooth function of time in the predictor. Without further restrictions, this function is not identifiable in the presence of time-dependent covariates; consequently we use a basis orthogonal to the space spanned by the covariates and use penalized quasi-likelihood (PQL) for estimation. We conclude that Enterococcus counts were greatly reduced near the Nut Island Treatment Plant (NITP) outfalls following the transfer of wastewaters from NITP to the Deer Island Treatment Plant (DITP) and that the transfer of wastewaters from Boston Harbor to the offshore diffusers in Massachusetts Bay reduced the Enterococcus counts near the DITP outfalls.
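The identifiability fix, replacing the raw smooth-in-time basis by its component orthogonal to the space spanned by the covariates, can be sketched as follows; the covariates and polynomial time basis are illustrative stand-ins, not the MWRA data or the authors' basis choice.

```python
import numpy as np

rng = np.random.default_rng(5)

# Design: time-dependent covariates plus a smooth-in-time basis.
n = 120
t = np.linspace(0.0, 1.0, n)
X = np.column_stack([np.ones(n),            # intercept
                     np.sin(6 * t),         # a time-dependent covariate
                     rng.normal(size=n)])   # another covariate
B = np.column_stack([t ** k for k in range(1, 5)])  # raw polynomial basis

# Without a constraint, the smooth trend is confounded with the
# time-dependent covariates. Project the basis onto the orthogonal
# complement of span(X) so the trend is identifiable.
Q, _ = np.linalg.qr(X)
B_perp = B - Q @ (Q.T @ B)
```

The columns of `B_perp` can then carry the nonstationary trend in a penalized quasi-likelihood fit without absorbing any covariate effect.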