54 resultados para Statistical inference
Resumo:
This paper attempts to unravel any relations that may exist between turbulent shear flows and statistical mechanics through a detailed numerical investigation in the simplest case where both can be well defined. The flow considered for the purpose is the two-dimensional (2D) temporal free shear layer with a velocity difference Delta U across it, statistically homogeneous in the streamwise direction (x) and evolving from a plane vortex sheet in the direction normal to it (y) in a periodic-in-x domain L x +/-infinity. Extensive computer simulations of the flow are carried out through appropriate initial-value problems for a ``vortex gas'' comprising N point vortices of the same strength (gamma = L Delta U/N) and sign. Such a vortex gas is known to provide weak solutions of the Euler equation. More than ten different initial-condition classes are investigated using simulations involving up to 32 000 vortices, with ensemble averages evaluated over up to 10(3) realizations and integration over 10(4)L/Delta U. The temporal evolution of such a system is found to exhibit three distinct regimes. In Regime I the evolution is strongly influenced by the initial condition, sometimes lasting a significant fraction of L/Delta U. Regime III is a long-time domain-dependent evolution towards a statistically stationary state, via ``violent'' and ``slow'' relaxations P.-H. Chavanis, Physica A 391, 3657 (2012)], over flow time scales of order 10(2) and 10(4)L/Delta U, respectively (for N = 400). The final state involves a single structure that stochastically samples the domain, possibly constituting a ``relative equilibrium.'' The vortex distribution within the structure follows a nonisotropic truncated form of the Lundgren-Pointin (L-P) equilibrium distribution (with negatively high temperatures; L-P parameter lambda close to -1). The central finding is that, in the intermediate Regime II, the spreading rate of the layer is universal over the wide range of cases considered here. The value (in terms of momentum thickness) is 0.0166 +/- 0.0002 times Delta U. Regime II, extensively studied in the turbulent shear flow literature as a self-similar ``equilibrium'' state, is, however, a part of the rapid nonequilibrium evolution of the vortex-gas system, which we term ``explosive'' as it lasts less than one L/Delta U. Regime II also exhibits significant values of N-independent two-vortex correlations, indicating that current kinetic theories that neglect correlations or consider them as O(1/N) cannot describe this regime. The evolution of the layer thickness in present simulations in Regimes I and II agree with the experimental observations of spatially evolving (3D Navier-Stokes) shear layers. Further, the vorticity-stream-function relations in Regime III are close to those computed in 2D Navier-Stokes temporal shear layers J. Sommeria, C. Staquet, and R. Robert, J. Fluid Mech. 233, 661 (1991)]. These findings suggest the dominance of what may be called the Kelvin-Biot-Savart mechanism in determining the growth of the free shear layer through large-scale momentum and vorticity dispersal.
Resumo:
Inference of molecular function of proteins is the fundamental task in the quest for understanding cellular processes. The task is getting increasingly difficult with thousands of new proteins discovered each day. The difficulty arises primarily due to lack of high-throughput experimental technique for assessing protein molecular function, a lacunae that computational approaches are trying hard to fill. The latter too faces a major bottleneck in absence of clear evidence based on evolutionary information. Here we propose a de novo approach to annotate protein molecular function through structural dynamics match for a pair of segments from two dissimilar proteins, which may share even <10% sequence identity. To screen these matches, corresponding 1 mu s coarse-grained (CG) molecular dynamics trajectories were used to compute normalized root-mean-square-fluctuation graphs and select mobile segments, which were, thereafter, matched for all pairs using unweighted three-dimensional autocorrelation vectors. Our in-house custom-built forcefield (FF), extensively validated against dynamics information obtained from experimental nuclear magnetic resonance data, was specifically used to generate the CG dynamics trajectories. The test for correspondence of dynamics-signature of protein segments and function revealed 87% true positive rate and 93.5% true negative rate, on a dataset of 60 experimentally validated proteins, including moonlighting proteins and those with novel functional motifs. A random test against 315 unique fold/function proteins for a negative test gave >99% true recall. A blind prediction on a novel protein appears consistent with additional evidences retrieved therein. This is the first proof-of-principle of generalized use of structural dynamics for inferring protein molecular function leveraging our custom-made CG FF, useful to all. (C) 2014 Wiley Periodicals, Inc.
Resumo:
Several statistical downscaling models have been developed in the past couple of decades to assess the hydrologic impacts of climate change by projecting the station-scale hydrological variables from large-scale atmospheric variables simulated by general circulation models (GCMs). This paper presents and compares different statistical downscaling models that use multiple linear regression (MLR), positive coefficient regression (PCR), stepwise regression (SR), and support vector machine (SVM) techniques for estimating monthly rainfall amounts in the state of Florida. Mean sea level pressure, air temperature, geopotential height, specific humidity, U wind, and V wind are used as the explanatory variables/predictors in the downscaling models. Data for these variables are obtained from the National Centers for Environmental Prediction-National Center for Atmospheric Research (NCEP-NCAR) reanalysis dataset and the Canadian Centre for Climate Modelling and Analysis (CCCma) Coupled Global Climate Model, version 3 (CGCM3) GCM simulations. The principal component analysis (PCA) and fuzzy c-means clustering method (FCM) are used as part of downscaling model to reduce the dimensionality of the dataset and identify the clusters in the data, respectively. Evaluation of the performances of the models using different error and statistical measures indicates that the SVM-based model performed better than all the other models in reproducing most monthly rainfall statistics at 18 sites. Output from the third-generation CGCM3 GCM for the A1B scenario was used for future projections. For the projection period 2001-10, MLR was used to relate variables at the GCM and NCEP grid scales. Use of MLR in linking the predictor variables at the GCM and NCEP grid scales yielded better reproduction of monthly rainfall statistics at most of the stations (12 out of 18) compared to those by spatial interpolation technique used in earlier studies.
Resumo:
Frequent episode discovery is one of the methods used for temporal pattern discovery in sequential data. An episode is a partially ordered set of nodes with each node associated with an event type. For more than a decade, algorithms existed for episode discovery only when the associated partial order is total (serial episode) or trivial (parallel episode). Recently, the literature has seen algorithms for discovering episodes with general partial orders. In frequent pattern mining, the threshold beyond which a pattern is inferred to be interesting is typically user-defined and arbitrary. One way of addressing this issue in the pattern mining literature has been based on the framework of statistical hypothesis testing. This paper presents a method of assessing statistical significance of episode patterns with general partial orders. A method is proposed to calculate thresholds, on the non-overlapped frequency, beyond which an episode pattern would be inferred to be statistically significant. The method is first explained for the case of injective episodes with general partial orders. An injective episode is one where event-types are not allowed to repeat. Later it is pointed out how the method can be extended to the class of all episodes. The significance threshold calculations for general partial order episodes proposed here also generalize the existing significance results for serial episodes. Through simulations studies, the usefulness of these statistical thresholds in pruning uninteresting patterns is illustrated. (C) 2014 Elsevier Inc. All rights reserved.
Resumo:
We formulate a natural model of loops and isolated vertices for arbitrary planar graphs, which we call the monopole-dimer model. We show that the partition function of this model can be expressed as a determinant. We then extend the method of Kasteleyn and Temperley-Fisher to calculate the partition function exactly in the case of rectangular grids. This partition function turns out to be a square of a polynomial with positive integer coefficients when the grid lengths are even. Finally, we analyse this formula in the infinite volume limit and show that the local monopole density, free energy and entropy can be expressed in terms of well-known elliptic functions. Our technique is a novel determinantal formula for the partition function of a model of isolated vertices and loops for arbitrary graphs.
Resumo:
Facial emotions are the most expressive way to display emotions. Many algorithms have been proposed which employ a particular set of people (usually a database) to both train and test their model. This paper focuses on the challenging task of database independent emotion recognition, which is a generalized case of subject-independent emotion recognition. The emotion recognition system employed in this work is a Meta-Cognitive Neuro-Fuzzy Inference System (McFIS). McFIS has two components, a neuro-fuzzy inference system, which is the cognitive component and a self-regulatory learning mechanism, which is the meta-cognitive component. The meta-cognitive component, monitors the knowledge in the neuro-fuzzy inference system and decides on what-to-learn, when-to-learn and how-to-learn the training samples, efficiently. For each sample, the McFIS decides whether to delete the sample without being learnt, use it to add/prune or update the network parameter or reserve it for future use. This helps the network avoid over-training and as a result improve its generalization performance over untrained databases. In this study, we extract pixel based emotion features from well-known (Japanese Female Facial Expression) JAFFE and (Taiwanese Female Expression Image) TFEID database. Two sets of experiment are conducted. First, we study the individual performance of both databases on McFIS based on 5-fold cross validation study. Next, in order to study the generalization performance, McFIS trained on JAFFE database is tested on TFEID and vice-versa. The performance The performance comparison in both experiments against SVNI classifier gives promising results.
Resumo:
It is well known that wrist pulse signals contain information about the status of health of a person and hence diagnosis based on pulse signals has assumed great importance since long time. In this paper the efficacy of signal processing techniques in extracting useful information from wrist pulse signals has been demonstrated by using signals recorded under two different experimental conditions viz. before lunch condition and after lunch condition. We have used Pearson's product-moment correlation coefficient, which is an effective measure of phase synchronization, in making a statistical analysis of wrist pulse signals. Contour plots and box plots are used to illustrate various differences. Two-sample t-tests show that the correlations show statistically significant differences between the groups. Results show that the correlation coefficient is effective in distinguishing the changes taking place after having lunch. This paper demonstrates the ability of the wrist pulse signals in detecting changes occurring under two different conditions. The study assumes importance in view of limited literature available on the analysis of wrist pulse signals in the case of food intake and also in view of its potential health care applications.
Resumo:
In this paper, we consider the problem of power allocation in MIMO wiretap channel for secrecy in the presence of multiple eavesdroppers. Perfect knowledge of the destination channel state information (CSI) and only the statistical knowledge of the eavesdroppers CSI are assumed. We first consider the MIMO wiretap channel with Gaussian input. Using Jensen's inequality, we transform the secrecy rate max-min optimization problem to a single maximization problem. We use generalized singular value decomposition and transform the problem to a concave maximization problem which maximizes the sum secrecy rate of scalar wiretap channels subject to linear constraints on the transmit covariance matrix. We then consider the MIMO wiretap channel with finite-alphabet input. We show that the transmit covariance matrix obtained for the case of Gaussian input, when used in the MIMO wiretap channel with finite-alphabet input, can lead to zero secrecy rate at high transmit powers. We then propose a power allocation scheme with an additional power constraint which alleviates this secrecy rate loss problem, and gives non-zero secrecy rates at high transmit powers.
Resumo:
Diffusion-a measure of dynamics, and entropy-a measure of disorder in the system are found to be intimately correlated in many systems, and the correlation is often strongly non-linear. We explore the origin of this complex dependence by studying diffusion of a point Brownian particle on a model potential energy surface characterized by ruggedness. If we assume that the ruggedness has a Gaussian distribution, then for this model, one can obtain the excess entropy exactly for any dimension. By using the expression for the mean first passage time, we present a statistical mechanical derivation of the well-known and well-tested scaling relation proposed by Rosenfeld between diffusion and excess entropy. In anticipation that Rosenfeld diffusion-entropy scaling (RDES) relation may continue to be valid in higher dimensions (where the mean first passage time approach is not available), we carry out an effective medium approximation (EMA) based analysis of the effective transition rate and hence of the effective diffusion coefficient. We show that the EMA expression can be used to derive the RDES scaling relation for any dimension higher than unity. However, RDES is shown to break down in the presence of spatial correlation among the energy landscape values. (C) 2015 AIP Publishing LLC.