888 results for Reproducing kernel
Abstract:
This contribution proposes a novel probability density function (PDF) estimation-based over-sampling (PDFOS) approach for two-class imbalanced classification problems. The classical Parzen-window kernel function is adopted to estimate the PDF of the positive class. Then, according to the estimated PDF, synthetic instances are generated as additional training data. The essential concept is to re-balance the class distribution of the original imbalanced data set under the principle that the synthetic data samples follow the same statistical properties as the original data. Based on the over-sampled training data, the radial basis function (RBF) classifier is constructed by applying the orthogonal forward selection procedure, in which the classifier's structure and the parameters of the RBF kernels are determined using a particle swarm optimisation algorithm based on the criterion of minimising the leave-one-out misclassification rate. The effectiveness of the proposed PDFOS approach is demonstrated by an empirical study on several imbalanced data sets.
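The oversampling step described above, estimating a Parzen-window (Gaussian-kernel) density of the positive class and then drawing synthetic instances from it, can be sketched as follows. This is a minimal illustration with a hand-picked bandwidth, not the paper's implementation; all names and values here are assumptions.

```python
import numpy as np

def pdfos_oversample(X_pos, n_new, bandwidth=0.5, seed=None):
    """Draw synthetic minority-class samples from a Parzen-window
    (Gaussian-kernel) density estimate of the positive class.
    Minimal sketch; bandwidth selection is left to the user."""
    rng = np.random.default_rng(seed)
    n, d = X_pos.shape
    # Sampling from a Gaussian kernel density estimate amounts to picking
    # a training point at random, then adding kernel-scaled Gaussian noise.
    centres = X_pos[rng.integers(0, n, size=n_new)]
    return centres + rng.normal(scale=bandwidth, size=(n_new, d))

X_pos = np.random.default_rng(0).normal(size=(20, 2))   # toy positive class
X_syn = pdfos_oversample(X_pos, n_new=80, bandwidth=0.3, seed=1)
```

The synthetic block `X_syn` would then be appended to the training data before fitting the classifier.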
Abstract:
We consider a three-dimensional system consisting of a large number of small spherical particles, distributed in a range of sizes and heights (with uniform distribution in the horizontal direction). Particles move vertically at a size-dependent terminal velocity. They either merge whenever they cross, or a size-ratio criterion is enforced to account for collision efficiency. Such a system may be described, in the mean-field approximation, by the Smoluchowski kinetic equation with a differential sedimentation kernel. We obtain self-similar steady-state and time-dependent solutions to the kinetic equation, using methods borrowed from weak turbulence theory. Analytical results are compared with direct numerical simulations (DNS) of moving and merging particles, and good agreement is found.
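The mean-field description mentioned above is the Smoluchowski kinetic equation; in generic notation (symbols chosen here for illustration, not taken from the paper) it reads

```latex
\frac{\partial n(m,t)}{\partial t}
  = \frac{1}{2}\int_0^{m} K(m',\,m-m')\,n(m',t)\,n(m-m',t)\,\mathrm{d}m'
  \;-\; n(m,t)\int_0^{\infty} K(m,m')\,n(m',t)\,\mathrm{d}m' ,
```

where the differential sedimentation kernel for spheres of radii $r_1, r_2$ settling at size-dependent terminal velocities $v(r)$ has the geometric sweep-out form $K \propto (r_1+r_2)^2\,\lvert v(r_1)-v(r_2)\rvert$.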
Abstract:
We study the solutions of the Smoluchowski coagulation equation with a regularization term which removes clusters from the system when their mass exceeds a specified cutoff size, M. We focus primarily on collision kernels which would exhibit an instantaneous gelation transition in the absence of any regularization. Numerical simulations demonstrate that for such kernels with monodisperse initial data, the regularized gelation time decreases as M increases, consistent with the expectation that the gelation time is zero in the unregularized system. This decrease appears to be a logarithmically slow function of M, indicating that instantaneously gelling kernels may still be justifiable as physical models despite the fact that they are highly singular in the absence of a cutoff. We also study the case when a source of monomers is introduced in the regularized system. In this case a stationary state is reached. We present a complete analytic description of this regularized stationary state for the model kernel, K(m₁, m₂) = (max{m₁, m₂})^ν, which gels instantaneously when M → ∞ if ν > 1. The stationary cluster size distribution decays as a stretched exponential for small cluster sizes and crosses over to a power-law decay with exponent ν for large cluster sizes. The total particle density in the stationary state slowly vanishes as [(ν − 1) log M]^{−1/2} when M → ∞. The approach to the stationary state is nontrivial: oscillations about the stationary state emerge from the interplay between the monomer injection and the cutoff, M, and decay very slowly when M is large. A quantitative analysis of these oscillations is provided for the addition model, which describes the situation in which clusters can only grow by absorbing monomers.
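The regularized system with a source can be explored numerically with the discrete Smoluchowski rate equations; the sketch below (explicit Euler time stepping, small toy M, illustrative parameters, not the paper's solver) uses the model kernel K(i, j) = max(i, j)^ν and removes any merger product heavier than M by simply not retaining it.

```python
import numpy as np

def step(c, nu, M, source=1.0, dt=1e-4):
    """One explicit Euler step of the discrete Smoluchowski equations with
    kernel K(i, j) = max(i, j)**nu, a monomer source, and removal of any
    cluster whose mass would exceed the cutoff M.  Illustrative sketch only."""
    sizes = np.arange(1, M + 1)
    K = np.maximum.outer(sizes, sizes).astype(float) ** nu
    gain = np.zeros(M)
    # Gain for size k from mergers i + (k - i); products with mass > M
    # are dropped, which is the regularization described in the abstract.
    for k in range(2, M + 1):
        i = np.arange(1, k)
        gain[k - 1] = 0.5 * np.sum(K[i - 1, k - i - 1] * c[i - 1] * c[k - i - 1])
    loss = c * (K @ c)          # loss of size k by merging with anything
    dc = gain - loss
    dc[0] += source             # monomer injection
    return np.maximum(c + dt * dc, 0.0)

c = np.zeros(50)
c[0] = 1.0                      # monodisperse initial data, cutoff M = 50
for _ in range(200):
    c = step(c, nu=1.5, M=50)
```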
Abstract:
Monte Carlo algorithms often aim to draw from a distribution π by simulating a Markov chain with transition kernel P such that π is invariant under P. However, there are many situations for which it is impractical or impossible to draw from the transition kernel P. For instance, this is the case with massive datasets, where it is prohibitively expensive to calculate the likelihood, and with intractable likelihood models arising from, for example, Gibbs random fields, such as those found in spatial statistics and network analysis. A natural approach in these cases is to replace P by an approximation P̂. Using theory from the stability of Markov chains, we explore a variety of situations where it is possible to quantify how 'close' the chain given by the transition kernel P̂ is to the chain given by P. We apply these results to several examples from spatial statistics and network analysis.
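One simple way to construct an approximate kernel P̂ of the kind discussed above is to replace the full-data log-likelihood in a Metropolis step by a rescaled subsample estimate. The sketch below (a toy Gaussian-mean model with a standard normal prior; all names and numbers are illustrative assumptions, not from the paper) shows the idea.

```python
import numpy as np

def approx_mh_step(theta, data, rng, subsample=100, step_sd=0.1):
    """One Metropolis step whose log-likelihood is evaluated on a random
    subsample of the data and rescaled by len(data)/subsample.  The
    resulting transition kernel is an approximation P-hat of the exact
    full-data kernel P.  Toy model: data ~ N(theta, 1), prior N(0, 1)."""
    idx = rng.choice(len(data), size=subsample, replace=False)
    batch = data[idx]
    scale = len(data) / subsample   # rescale subsample to full-data size

    def log_post(t):
        return -0.5 * t**2 + scale * np.sum(-0.5 * (batch - t) ** 2)

    prop = theta + rng.normal(scale=step_sd)
    if np.log(rng.uniform()) < log_post(prop) - log_post(theta):
        return prop
    return theta

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, size=10_000)
theta = 0.0
for _ in range(500):
    theta = approx_mh_step(theta, data, rng)
```

The perturbation results discussed in the abstract are what justify trusting the samples such an approximate chain produces.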
Abstract:
European researchers across heterogeneous disciplines voice concerns and argue for new paths towards a brighter future for scientific and knowledge creation and communication. Recently, in the biological and natural sciences, concerns have been expressed that major threats are intentionally ignored. These threats challenge Europe's future sustainability in creating knowledge that effectively deals with emerging social, environmental, health, and economic problems of a planetary scope. Within social science circles, however, the root causes of the above challenges have been linked with macro-level forces of neo-liberal ways of valuing, and with rules in academia and beyond which we take for granted. These concerns, raised by heterogeneous scholars in the natural and applied social sciences, concern the ethics of today's research and academic integrity. Applying Bourdieu's sociology may not offer an optimistic lens on whether change is possible. Rather than attributing the replication of neo-liberal habitus to intentional agent and institutional choices, Bourdieu's work raises the importance of thoughtlessly internalised habits in human and social action. Accordingly, most action within a given paradigm (in this case, neo-liberalism) is understood as habituated, i.e. unconsciously reproducing external social fields, even ill-defined ways of valuing. This essay analyses these ideas and how they may help critically examine the current habitus surrounding research and knowledge production, evaluation, and communication, and related aspects of academic freedom. Although it is acknowledged that transformation is not easy, the essay presents arguments and recent theory paths to suggest that change may nevertheless be a realistic hope once certain action logics are encouraged.
A benchmark-driven modelling approach for evaluating deployment choices on a multi-core architecture
Abstract:
The complexity of current and emerging architectures provides users with options about how best to use the available resources, but makes predicting performance challenging. In this work a benchmark-driven model is developed for a simple shallow water code on a Cray XE6 system, to explore how deployment choices such as domain decomposition and core affinity affect performance. The resource sharing present in modern multi-core architectures adds various levels of heterogeneity to the system. Shared resources often include cache, memory, network controllers and, in some cases, floating point units (as in the AMD Bulldozer), which means that access time depends on the mapping of application tasks and on each core's location within the system. Heterogeneity increases further with the use of hardware accelerators such as GPUs and the Intel Xeon Phi, where many specialist cores are attached to general-purpose cores. This trend towards shared resources and non-uniform cores is expected to continue into the exascale era. The complexity of these systems means that various runtime scenarios are possible, and it has been found that under-populating nodes, altering the domain decomposition and non-standard task-to-core mappings can dramatically alter performance. Discovering this, however, is often a process of trial and error. To better inform this process, a performance model was developed for a simple regular grid-based kernel code, shallow. The code comprises two distinct types of work: loop-based array updates and nearest-neighbour halo exchanges. Separate performance models were developed for each part, both based on a similar methodology. Application-specific benchmarks were run to measure performance for different problem sizes under different execution scenarios. These results were then fed into a performance model that derives resource usage for a given deployment scenario, interpolating between results as necessary.
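The benchmark-driven approach described above, separate measurements for the compute loops and the halo exchanges, combined by interpolation for unmeasured scenarios, can be sketched as follows. The benchmark tables and numbers here are purely illustrative assumptions, not measurements from the paper.

```python
import numpy as np

# Hypothetical benchmark tables: per-cell loop cost versus problem size
# (cache effects make it non-constant) and halo-exchange cost versus
# message size.  All values are invented for illustration.
cells      = np.array([1e4, 1e5, 1e6])
t_per_cell = np.array([12e-9, 15e-9, 22e-9])   # seconds per cell
msg_bytes  = np.array([1e3, 1e4, 1e5])
t_halo     = np.array([5e-6, 12e-6, 80e-6])    # seconds per exchange

def predict(n_cells, halo_bytes, n_steps):
    """Predict runtime as (loop time + halo-exchange time) per timestep,
    interpolating the benchmark tables for the requested scenario."""
    loop = n_cells * np.interp(n_cells, cells, t_per_cell)
    halo = np.interp(halo_bytes, msg_bytes, t_halo)
    return n_steps * (loop + halo)

t = predict(n_cells=5e5, halo_bytes=4e4, n_steps=1000)
```

In practice, separate tables would be kept per deployment scenario (node population, task-to-core mapping), which is where the model's explanatory power comes from.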
Abstract:
A procedure (concurrent multiplicative-additive objective analysis scheme [CMA-OAS]) is proposed for operational rainfall estimation using rain gauges and radar data. On the basis of a concurrent multiplicative-additive (CMA) decomposition of the spatially nonuniform radar bias, within-storm variability of rainfall and fractional coverage of rainfall are taken into account. Thus both spatially nonuniform radar bias, given that rainfall is detected, and bias in radar detection of rainfall are handled. The interpolation procedure of CMA-OAS is built on Barnes' objective analysis scheme (OAS), whose purpose is to estimate a filtered spatial field of the variable of interest through successive correction of residuals resulting from a Gaussian kernel smoother applied on spatial samples. The CMA-OAS first poses an optimization problem at each gauge-radar support point to obtain both a local multiplicative-additive radar bias decomposition and a regionalization parameter. Second, local biases and regionalization parameters are integrated into an OAS to estimate the multisensor rainfall at the ground level. The procedure is suited to relatively sparse rain gauge networks. To demonstrate the procedure, six storms are analyzed at hourly steps over 10,663 km². Results generally indicated improved quality with respect to the other methods evaluated: a standard mean-field bias adjustment, a spatially variable adjustment with multiplicative factors, and ordinary cokriging.
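The Barnes-style core of the scheme, a Gaussian kernel smoother followed by successive correction of the residuals at the observation sites, can be sketched as below. This is a generic two-pass Barnes analysis with invented parameter names and data, not the CMA-OAS itself.

```python
import numpy as np

def barnes(obs_xy, obs_val, grid_xy, kappa, passes=2, gamma=0.3):
    """Barnes objective analysis sketch: Gaussian-kernel smoothing of
    scattered observations onto grid points, then successive correction
    passes on the residuals with a sharpened kernel (gamma * kappa)."""
    def weights(targets, k):
        d2 = ((targets[:, None, :] - obs_xy[None, :, :]) ** 2).sum(-1)
        w = np.exp(-d2 / k)
        return w / w.sum(axis=1, keepdims=True)

    grid = weights(grid_xy, kappa) @ obs_val          # first-pass field
    at_obs = weights(obs_xy, kappa) @ obs_val         # field at gauges
    for _ in range(passes - 1):                       # correction passes
        resid = obs_val - at_obs
        k = gamma * kappa
        grid += weights(grid_xy, k) @ resid
        at_obs += weights(obs_xy, k) @ resid
    return grid

rng = np.random.default_rng(0)
obs_xy = rng.uniform(0, 10, size=(30, 2))             # toy gauge locations
obs_val = np.sin(obs_xy[:, 0]) + rng.normal(scale=0.1, size=30)
grid_xy = np.array([[x, 5.0] for x in np.linspace(0, 10, 21)])
field = barnes(obs_xy, obs_val, grid_xy, kappa=4.0)
```

In CMA-OAS the quantities fed into such a smoother are the locally decomposed multiplicative-additive biases rather than raw gauge values.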
Abstract:
An efficient data-based modeling algorithm for nonlinear system identification is introduced for radial basis function (RBF) neural networks, with the aim of maximizing generalization capability based on the concept of leave-one-out (LOO) cross validation. Each of the RBF kernels has its own kernel width parameter, and the basic idea is to optimize the multiple pairs of regularization parameters and kernel widths, each of which is associated with a kernel, one at a time within the orthogonal forward regression (OFR) procedure. Thus, each OFR step consists of one model term selection based on the LOO mean square error (LOOMSE), followed by the optimization of the associated kernel width and regularization parameter, also based on the LOOMSE. Since the same LOOMSE is adopted for model selection as in our previous state-of-the-art local regularization assisted orthogonal least squares (LROLS) algorithm, our proposed new OFR algorithm is also capable of producing a very sparse RBF model with excellent generalization performance. Unlike our previous LROLS algorithm, which requires an additional iterative loop to optimize the regularization parameters as well as an additional procedure to optimize the kernel width, the proposed new OFR algorithm optimizes both the kernel widths and regularization parameters within the single OFR procedure, and consequently the required computational complexity is dramatically reduced. Nonlinear system identification examples are included to demonstrate the effectiveness of this new approach in comparison to the well-known approaches of support vector machine and least absolute shrinkage and selection operator as well as the LROLS algorithm.
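A greatly simplified, OFR-style greedy selection driven by the LOO mean square error can be sketched as follows. Unlike the paper's algorithm, this sketch uses a fixed kernel width, no regularization, and no orthogonalization; it only illustrates "add the term that most reduces the LOOMSE", with the LOO errors obtained via the standard hat-matrix shortcut.

```python
import numpy as np

def loo_mse(Phi, y):
    """Leave-one-out MSE of the least-squares fit y ~ Phi @ w, using the
    hat-matrix shortcut e_i / (1 - h_ii) (no refitting per left-out point)."""
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    H = Phi @ np.linalg.pinv(Phi)
    e = (y - Phi @ w) / (1.0 - np.clip(np.diag(H), 0.0, 0.999))
    return np.mean(e ** 2)

def ofr_select(X, y, width=1.0, max_terms=5):
    """Greedy forward selection of RBF centres: at each step, add the
    training point whose Gaussian-kernel column most reduces the LOO MSE."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    G = np.exp(-d2 / (2 * width ** 2))    # full candidate kernel matrix
    chosen = []
    for _ in range(max_terms):
        scores = [(loo_mse(G[:, chosen + [j]], y), j)
                  for j in range(len(X)) if j not in chosen]
        best_mse, j = min(scores)
        chosen.append(j)
    return chosen

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(40, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.05, size=40)
centres = ofr_select(X, y)
```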
Abstract:
A new sparse kernel density estimator is introduced based on the minimum integrated square error criterion for the finite mixture model. Since the constraint on the mixing coefficients of the finite mixture model is on the multinomial manifold, we use the well-known Riemannian trust-region (RTR) algorithm for solving this problem. The first- and second-order Riemannian geometry of the multinomial manifold are derived and utilized in the RTR algorithm. Numerical examples are employed to demonstrate that the proposed approach is effective in constructing sparse kernel density estimators with an accuracy competitive with those of existing kernel density estimators.
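In generic notation (symbols chosen here, not taken from the paper), the minimum integrated square error criterion for a kernel mixture $\hat p(x) = \sum_k \beta_k K_\sigma(x, x_k)$ with mixing coefficients $\beta$ reduces to a quadratic objective over the probability simplex:

```latex
\min_{\beta}\;\int \bigl(\hat p(x;\beta) - p(x)\bigr)^{2}\,\mathrm{d}x
  \;=\; \beta^{\mathsf T} Q\,\beta \;-\; 2\,\beta^{\mathsf T} r \;+\; \text{const},
\qquad
\beta_k \ge 0,\;\; \textstyle\sum_k \beta_k = 1,
```

where $Q_{kl} = \int K_\sigma(x, x_k)\,K_\sigma(x, x_l)\,\mathrm{d}x$ and $r_k = \mathbb{E}_p[K_\sigma(x, x_k)]$ is estimated from the sample. The simplex constraint on $\beta$ is what places the optimization on the multinomial manifold, where the RTR algorithm operates.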
Abstract:
A new class of parameter estimation algorithms is introduced for Gaussian process regression (GPR) models. It is shown that the integration of the GPR model with two probability distance measures, (i) the integrated square error and (ii) the Kullback–Leibler (K–L) divergence, is analytically tractable. An efficient coordinate descent algorithm is proposed to iteratively estimate the kernel width using golden section search, which includes a fast gradient descent algorithm as an inner loop to estimate the noise variance. Numerical examples are included to demonstrate the effectiveness of the new identification approaches.
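Golden section search over a scalar kernel width, as used in the coordinate descent above, can be sketched as follows. For simplicity this sketch maximizes the standard GP log marginal likelihood with a fixed noise level rather than the paper's probability distance measures, and all model details are illustrative assumptions.

```python
import numpy as np

def gpr_log_marginal(X, y, width, noise=0.1):
    """Log marginal likelihood (up to a constant) of a zero-mean GP with
    an RBF kernel of the given width and fixed observation noise."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / (2 * width ** 2)) + noise ** 2 * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return -0.5 * y @ alpha - np.log(np.diag(L)).sum()

def golden_section(f, a, b, tol=1e-3):
    """Golden-section search for the maximiser of a unimodal f on [a, b]."""
    g = (np.sqrt(5) - 1) / 2
    c, d = b - g * (b - a), a + g * (b - a)
    while abs(b - a) > tol:
        if f(c) > f(d):
            b, d = d, c
            c = b - g * (b - a)
        else:
            a, c = c, d
            d = a + g * (b - a)
    return 0.5 * (a + b)

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(30, 1))
y = np.sin(2 * X[:, 0]) + rng.normal(scale=0.1, size=30)
best_w = golden_section(lambda w: gpr_log_marginal(X, y, w), 0.05, 3.0)
```

In the paper's coordinate descent, a step like this over the kernel width alternates with a gradient descent inner loop over the noise variance.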
Abstract:
In low-temperature anti-ferromagnetic LaMnO3, strong and localized electronic interactions among Mn 3d electrons prevent a satisfactory description from standard local density and generalized gradient approximations in density functional theory calculations. Here we show that the strong on-site electronic interactions are described well only by using direct and exchange corrections to the intra-orbital Coulomb potential. Only DFT+U calculations with explicit exchange corrections produce a balanced picture of electronic, magnetic and structural observables in agreement with experiment. To understand the reason, a rewriting of the functional form of the +U corrections is presented that leads to a more physical and transparent understanding of the effect of these correction terms. The approach highlights the importance of Hund’s coupling (intra-orbital exchange) in providing anisotropy across the occupation and energy eigenvalues of the Mn d states. This intra-orbital exchange is the key to fully activating the Jahn-Teller distortion, reproducing the experimental band gap and stabilizing the correct magnetic ground state in LaMnO3. The best parameter values for LaMnO3 within the DFT(PBEsol)+U framework are determined to be U = 8 eV and J = 1.9 eV.
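For context, the widely used simplified (Dudarev) form of the +U correction folds the direct and exchange parameters into a single effective value,

```latex
E_{U} \;=\; \sum_{\sigma} \frac{U_{\mathrm{eff}}}{2}\,
  \operatorname{Tr}\!\bigl[\,n^{\sigma} - n^{\sigma} n^{\sigma}\,\bigr],
\qquad U_{\mathrm{eff}} = U - J ,
```

where $n^{\sigma}$ is the on-site occupation matrix of the correlated (here Mn 3d) states for spin $\sigma$. The abstract's argument is that this collapse of $U$ and $J$ into $U_{\mathrm{eff}}$ discards the intra-orbital exchange anisotropy, and that corrections retaining an explicit $J$ (as in the reported $U = 8$ eV, $J = 1.9$ eV) are needed to activate the Jahn-Teller distortion and recover the correct ground state.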
Abstract:
The concept of resilience has emerged out of a complex literature that has sought to make sense of an increasingly interconnected world that appears ever more beset by crises. Resilience's appeal is reflected by the burgeoning mass of literature that has appeared on the subject in the past five years. However, there is ongoing debate surrounding its usage, with some commentators claiming that the term is inherently too conservative to be usefully applied to situations of vulnerability in which more radical social change is required. This article extends existing efforts to formulate more transformative notions of resilience by reframing it as a double-edged outcome of the pre-reflective and critical ways in which actors draw upon their internal structures following the occurrence of a negative event, thus reproducing or changing the external structural context that gave rise to the event in the first place. By applying a structuration-inspired analysis to the study of small-scale farmer responses to a flood-induced resettlement programme in central Mozambique, the article presents a systematic approach to the examination of resilience in light of this reframing. The case study findings suggest that more attention should be paid to the facilitative, as well as constraining, nature of structures if vulnerable populations are to be assisted in their efforts to exert transformative capacity over the wider conditions that give rise to their difficulties.
Investigating the relationship between Eurasian snow and the Arctic Oscillation with data and models
Abstract:
Recent research suggests Eurasian snow-covered area (SCA) influences the Arctic Oscillation (AO) via the polar vortex. This could be important for Northern Hemisphere winter season forecasting. A fairly strong negative correlation between October SCA and the AO, based on both monthly and daily observational data, has been noted in the literature. While reproducing these previous links when using the same data, we find no further evidence of the link when using an independent satellite data source, or when using a climate model.
Abstract:
A generalization of Arakawa and Schubert's convective quasi-equilibrium principle is presented for a closure formulation of mass-flux convection parameterization. The original principle is based on the budget of the cloud work function. This principle is generalized by considering the budget for a vertical integral of an arbitrary convection-related quantity. The closure formulation includes Arakawa and Schubert's quasi-equilibrium, as well as both CAPE and moisture closures, as special cases. The formulation also includes new possibilities for considering vertical integrals that depend on convective-scale variables, such as the moisture within convection. The generalized convective quasi-equilibrium is defined by a balance between large-scale forcing and convective response for a given vertically integrated quantity. The latter takes the form of a convolution of a kernel matrix and a mass-flux spectrum, as in the original convective quasi-equilibrium. The kernel reduces to a scalar when either a bulk formulation is adopted or only large-scale variables are considered within the vertical integral. Various physical implications of the generalized closure are discussed. These include the possibility that precipitation might be considered as a potentially significant contribution to the large-scale forcing. Two dicta are proposed as guiding physical principles for specifying a suitable vertically integrated quantity.
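Schematically, and in generic notation (symbols chosen here, not taken from the paper), the balance described above between large-scale forcing and convective response for a vertically integrated quantity $\Phi_i$ of convection type $i$ can be written as

```latex
\frac{\mathrm{d}\Phi_i}{\mathrm{d}t}
  \;=\; F_i \;-\; \sum_{j} K_{ij}\,M_j \;\approx\; 0
\quad\Longrightarrow\quad
\sum_{j} K_{ij}\,M_j \;=\; F_i ,
```

where $F_i$ is the large-scale forcing, $M_j$ the mass-flux spectrum, and $K_{ij}$ the kernel matrix; choosing $\Phi_i$ to be the cloud work function recovers Arakawa and Schubert's original quasi-equilibrium, while CAPE and moisture closures correspond to other choices of the integrand.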
Abstract:
A set of four eddy-permitting global ocean reanalyses produced in the framework of the MyOcean project have been compared over the altimetry period 1993–2011. The main differences among the reanalyses used here come from the data assimilation scheme implemented to control the ocean state by inserting reprocessed observations of sea surface temperature (SST), in situ temperature and salinity profiles, sea level anomaly and sea-ice concentration. A first objective of this work is to assess the interannual variability and trends for a series of parameters usually considered in the community as essential ocean variables: SST, sea surface salinity, temperature and salinity averaged over meaningful layers of the water column, sea level, transports across pre-defined sections, and sea ice parameters. The eddy-permitting nature of the global reanalyses also makes it possible to estimate eddy kinetic energy. The results show that in general there is good consistency between the different reanalyses. An intercomparison against experiments without data assimilation was done during the MyOcean project, and we conclude that data assimilation is crucial for correctly simulating some quantities, such as regional trends of sea level as well as the eddy kinetic energy. A second objective is to show that the ensemble mean of reanalyses can be evaluated as one single system regarding its reliability in reproducing the climate signals, where both variability and uncertainties are assessed through the ensemble spread and signal-to-noise ratio. The main advantage of having access to several reanalyses differing in the way data assimilation is performed is that it becomes possible to assess part of the total uncertainty. Given the fact that we use very similar ocean models and atmospheric forcing, we can conclude that the spread of the ensemble of reanalyses is mainly representative of our ability to gauge uncertainty in the assimilation methods.
This uncertainty varies considerably from one ocean parameter to another, especially in global indices. However, despite several caveats in the design of the multi-system ensemble, the main conclusion from this study is that an eddy-permitting multi-system ensemble approach has become mature, and our results provide a first step towards a systematic comparison of eddy-permitting global ocean reanalyses aimed at providing robust conclusions on the recent evolution of the oceanic state.