904 resultados para Random coefficient multinomial logit
Resumo:
In order to validate the reported precision of space‐based atmospheric composition measurements, validation studies often focus on measurements in the tropical stratosphere, where natural variability is weak. The scatter in tropical measurements can then be used as an upper limit on single‐profile measurement precision. Here we introduce a method of quantifying the scatter of tropical measurements which aims to minimize the effects of short‐term atmospheric variability while maintaining large enough sample sizes that the results can be taken as representative of the full data set. We apply this technique to measurements of O3, HNO3, CO, H2O, NO, NO2, N2O, CH4, CCl2F2, and CCl3F produced by the Atmospheric Chemistry Experiment–Fourier Transform Spectrometer (ACE‐FTS). Tropical scatter in the ACE‐FTS retrievals is found to be consistent with the reported random errors (RREs) for H2O and CO at altitudes above 20 km, validating the RREs for these measurements. Tropical scatter in measurements of NO, NO2, CCl2F2, and CCl3F is roughly consistent with the RREs as long as the effect of outliers in the data set is reduced through the use of robust statistics. The scatter in measurements of O3, HNO3, CH4, and N2O in the stratosphere, while larger than the RREs, is shown to be consistent with the variability simulated in the Canadian Middle Atmosphere Model. This result implies that, for these species, stratospheric measurement scatter is dominated by natural variability, not random error, which provides added confidence in the scientific value of single‐profile measurements.
Resumo:
Ensemble learning can be used to increase the overall classification accuracy of a classifier by generating multiple base classifiers and combining their classification results. A frequently used family of base classifiers for ensemble learning are decision trees. However, alternative approaches can potentially be used, such as the Prism family of algorithms that also induces classification rules. Compared with decision trees, Prism algorithms generate modular classification rules that cannot necessarily be represented in the form of a decision tree. Prism algorithms produce a similar classification accuracy compared with decision trees. However, in some cases, for example, if there is noise in the training and test data, Prism algorithms can outperform decision trees by achieving a higher classification accuracy. However, Prism still tends to overfit on noisy data; hence, ensemble learners have been adopted in this work to reduce the overfitting. This paper describes the development of an ensemble learner using a member of the Prism family as the base classifier to reduce the overfitting of Prism algorithms on noisy datasets. The developed ensemble classifier is compared with a stand-alone Prism classifier in terms of classification accuracy and resistance to noise.
Resumo:
Peatland habitats are important carbon stocks that also have the potential to be significant sources of greenhouse gases, particularly when subject to changes such as artificial drainage and application of fertilizer. Models aiming to estimate greenhouse gas release from peatlands require an accurate estimate of the diffusion coefficient of gas transport through soil (Ds). The availability of specific measurements for peatland soils is currently limited. This study measured Ds for a peat soil with an overlying clay horizon and compared values with those from widely available models. The Ds value of a sandy loam reference soil was measured for comparison. Using the Currie (1960) method, Ds was measured between an air-filled porosity (ϵ) range of 0 and 0.5 cm3 cm−3. Values of Ds for the peat cores ranged between 3.2 × 10−4 and 4.4 × 10−3 m2 hour−1, for loamy clay cores between 0 and 4.7 × 10−3 m2 hour−1 and for the sandy reference soil they were between 5.4 × 10−4 and 3.4 × 10−3 m2 hour−1. The agreement of measured and modelled values of relative diffusivity (Ds/D0, with D0 the diffusion coefficient through free air) varied with soil type; however, the Campbell (1985) model provided the best replication of measured values for all soils. This research therefore suggests that the use of the Campbell model in the absence of accurately measured Ds and porosity values for a study soil would be appropriate. Future research into methods to reduce shrinkage of peat during measurement and therefore allow measurement of Ds for a greater range of ϵ would be beneficial.
Resumo:
We study the homogeneous Riemann-Hilbert problem with a vanishing scalar-valued continuous coefficient. We characterize non-existence of nontrivial solutions in the case where the coefficient has its values along several rays starting from the origin. As a consequence, some results on injectivity and existence of eigenvalues of Toeplitz operators in Hardy spaces are obtained.
Resumo:
In the present paper we study the approximation of functions with bounded mixed derivatives by sparse tensor product polynomials in positive order tensor product Sobolev spaces. We introduce a new sparse polynomial approximation operator which exhibits optimal convergence properties in L2 and tensorized View the MathML source simultaneously on a standard k-dimensional cube. In the special case k=2 the suggested approximation operator is also optimal in L2 and tensorized H1 (without essential boundary conditions). This allows to construct an optimal sparse p-version FEM with sparse piecewise continuous polynomial splines, reducing the number of unknowns from O(p2), needed for the full tensor product computation, to View the MathML source, required for the suggested sparse technique, preserving the same optimal convergence rate in terms of p. We apply this result to an elliptic differential equation and an elliptic integral equation with random loading and compute the covariances of the solutions with View the MathML source unknowns. Several numerical examples support the theoretical estimates.
Resumo:
In order to examine metacognitive accuracy (i.e., the relationship between metacognitive judgment and memory performance), researchers often rely on by-participant analysis, where metacognitive accuracy (e.g., resolution, as measured by the gamma coefficient or signal detection measures) is computed for each participant and the computed values are entered into group-level statistical tests such as the t-test. In the current work, we argue that the by-participant analysis, regardless of the accuracy measurements used, would produce a substantial inflation of Type-1 error rates, when a random item effect is present. A mixed-effects model is proposed as a way to effectively address the issue, and our simulation studies examining Type-1 error rates indeed showed superior performance of mixed-effects model analysis as compared to the conventional by-participant analysis. We also present real data applications to illustrate further strengths of mixed-effects model analysis. Our findings imply that caution is needed when using the by-participant analysis, and recommend the mixed-effects model analysis.
Resumo:
In this paper we develop and apply methods for the spectral analysis of non-selfadjoint tridiagonal infinite and finite random matrices, and for the spectral analysis of analogous deterministic matrices which are pseudo-ergodic in the sense of E. B. Davies (Commun. Math. Phys. 216 (2001), 687–704). As a major application to illustrate our methods we focus on the “hopping sign model” introduced by J. Feinberg and A. Zee (Phys. Rev. E 59 (1999), 6433–6443), in which the main objects of study are random tridiagonal matrices which have zeros on the main diagonal and random ±1’s as the other entries. We explore the relationship between spectral sets in the finite and infinite matrix cases, and between the semi-infinite and bi-infinite matrix cases, for example showing that the numerical range and p-norm ε - pseudospectra (ε > 0, p ∈ [1,∞] ) of the random finite matrices converge almost surely to their infinite matrix counterparts, and that the finite matrix spectra are contained in the infinite matrix spectrum Σ. We also propose a sequence of inclusion sets for Σ which we show is convergent to Σ, with the nth element of the sequence computable by calculating smallest singular values of (large numbers of) n×n matrices. We propose similar convergent approximations for the 2-norm ε -pseudospectra of the infinite random matrices, these approximations sandwiching the infinite matrix pseudospectra from above and below.
Resumo:
A new sparse kernel density estimator is introduced based on the minimum integrated square error criterion for the finite mixture model. Since the constraint on the mixing coefficients of the finite mixture model is on the multinomial manifold, we use the well-known Riemannian trust-region (RTR) algorithm for solving this problem. The first- and second-order Riemannian geometry of the multinomial manifold are derived and utilized in the RTR algorithm. Numerical examples are employed to demonstrate that the proposed approach is effective in constructing sparse kernel density estimators with an accuracy competitive with those of existing kernel density estimators.
Resumo:
The induction of classification rules from previously unseen examples is one of the most important data mining tasks in science as well as commercial applications. In order to reduce the influence of noise in the data, ensemble learners are often applied. However, most ensemble learners are based on decision tree classifiers which are affected by noise. The Random Prism classifier has recently been proposed as an alternative to the popular Random Forests classifier, which is based on decision trees. Random Prism is based on the Prism family of algorithms, which is more robust to noise. However, like most ensemble classification approaches, Random Prism also does not scale well on large training data. This paper presents a thorough discussion of Random Prism and a recently proposed parallel version of it called Parallel Random Prism. Parallel Random Prism is based on the MapReduce programming paradigm. The paper provides, for the first time, novel theoretical analysis of the proposed technique and in-depth experimental study that show that Parallel Random Prism scales well on a large number of training examples, a large number of data features and a large number of processors. Expressiveness of decision rules that our technique produces makes it a natural choice for Big Data applications where informed decision making increases the user’s trust in the system.
Resumo:
Let X be a locally compact Polish space. A random measure on X is a probability measure on the space of all (nonnegative) Radon measures on X. Denote by K(X) the cone of all Radon measures η on X which are of the form η =
Resumo:
We consider the billiard dynamics in a non-compact set of ℝ d that is constructed as a bi-infinite chain of translated copies of the same d-dimensional polytope. A random configuration of semi-dispersing scatterers is placed in each copy. The ensemble of dynamical systems thus defined, one for each global realization of the scatterers, is called quenched random Lorentz tube. Under some fairly general conditions, we prove that every system in the ensemble is hyperbolic and almost every system is recurrent, ergodic, and enjoys some higher chaotic properties.
Resumo:
We consider the billiard dynamics in a striplike set that is tessellated by countably many translated copies of the same polygon. A random configuration of semidispersing scatterers is placed in each copy. The ensemble of dynamical systems thus defined, one for each global choice of scatterers, is called quenched random Lorentz tube. We prove that under general conditions, almost every system in the ensemble is recurrent.
Resumo:
There remains large disagreement between ice-water path (IWP) in observational data sets, largely because the sensors observe different parts of the ice particle size distribution. A detailed comparison of retrieved IWP from satellite observations in the Tropics (!30 " latitude) in 2007 was made using collocated measurements. The radio detection and ranging(radar)/light detection and ranging (lidar) (DARDAR) IWP data set, based on combined radar/lidar measurements, is used as a reference because it provides arguably the best estimate of the total column IWP. For each data set, usable IWP dynamic ranges are inferred from this comparison. IWP retrievals based on solar reflectance measurements, in the moderate resolution imaging spectroradiometer (MODIS), advanced very high resolution radiometer–based Climate Monitoring Satellite Applications Facility (CMSAF), and Pathfinder Atmospheres-Extended (PATMOS-x) datasets, were found to be correlated with DARDAR over a large IWP range (~20–7000 g m -2 ). The random errors of the collocated data sets have a close to lognormal distribution, and the combined random error of MODIS and DARDAR is less than a factor of 2, which also sets the upper limit for MODIS alone. In the same way, the upper limit for the random error of all considered data sets is determined. Data sets based on passive microwave measurements, microwave surface and precipitation products system (MSPPS), microwave integrated retrieval system (MiRS), and collocated microwave only (CMO), are largely correlated with DARDAR for IWP values larger than approximately 700 g m -2 . The combined uncertainty between these data sets and DARDAR in this range is slightly less MODIS-DARDAR, but the systematic bias is nearly an order of magnitude.
Resumo:
We establish a methodology for calculating uncertainties in sea surface temperature estimates from coefficient based satellite retrievals. The uncertainty estimates are derived independently of in-situ data. This enables validation of both the retrieved SSTs and their uncertainty estimate using in-situ data records. The total uncertainty budget is comprised of a number of components, arising from uncorrelated (eg. noise), locally systematic (eg. atmospheric), large scale systematic and sampling effects (for gridded products). The importance of distinguishing these components arises in propagating uncertainty across spatio-temporal scales. We apply the method to SST data retrieved from the Advanced Along Track Scanning Radiometer (AATSR) and validate the results for two different SST retrieval algorithms, both at a per pixel level and for gridded data. We find good agreement between our estimated uncertainties and validation data. This approach to calculating uncertainties in SST retrievals has a wider application to data from other instruments and retrieval of other geophysical variables.
Resumo:
The co-polar correlation coefficient (ρhv) has many applications, including hydrometeor classification, ground clutter and melting layer identification, interpretation of ice microphysics and the retrieval of rain drop size distributions (DSDs). However, we currently lack the quantitative error estimates that are necessary if these applications are to be fully exploited. Previous error estimates of ρhv rely on knowledge of the unknown "true" ρhv and implicitly assume a Gaussian probability distribution function of ρhv samples. We show that frequency distributions of ρhv estimates are in fact highly negatively skewed. A new variable: L = -log10(1 - ρhv) is defined, which does have Gaussian error statistics, and a standard deviation depending only on the number of independent radar pulses. This is verified using observations of spherical drizzle drops, allowing, for the first time, the construction of rigorous confidence intervals in estimates of ρhv. In addition, we demonstrate how the imperfect co-location of the horizontal and vertical polarisation sample volumes may be accounted for. The possibility of using L to estimate the dispersion parameter (µ) in the gamma drop size distribution is investigated. We find that including drop oscillations is essential for this application, otherwise there could be biases in retrieved µ of up to ~8. Preliminary results in rainfall are presented. In a convective rain case study, our estimates show µ to be substantially larger than 0 (an exponential DSD). In this particular rain event, rain rate would be overestimated by up to 50% if a simple exponential DSD is assumed.