979 results for Rademacher averages
Abstract:
We investigate the use of certain data-dependent estimates of the complexity of a function class, called Rademacher and Gaussian complexities. In a decision-theoretic setting, we prove general risk bounds in terms of these complexities. We consider function classes that can be expressed as combinations of functions from basis classes and show how the Rademacher and Gaussian complexities of such a function class can be bounded in terms of the complexity of the basis classes. We give examples of the application of these techniques in finding data-dependent risk bounds for decision trees, neural networks, and support vector machines.
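For reference, the empirical Rademacher complexity underlying such bounds is the standard quantity (the textbook definition, not necessarily this paper's exact normalization):

  \hat{R}_n(F) = \mathbb{E}_\sigma [ \sup_{f \in F} (1/n) \sum_{i=1}^n \sigma_i f(x_i) ],  with \sigma_i i.i.d. uniform on \{-1, +1\}.

Because the expectation is over the random signs only, it can be estimated from the sample itself. A minimal Monte Carlo sketch for a finite class of predictors (illustrative code, not from the paper):

import numpy as np

def empirical_rademacher(preds, n_draws=1000, seed=0):
    # preds: (m, n) array; row j holds the values of candidate function f_j
    # on the n sample points. Returns a Monte Carlo estimate of
    # E_sigma[ sup_f (1/n) sum_i sigma_i f(x_i) ] over this finite class.
    rng = np.random.default_rng(seed)
    m, n = preds.shape
    total = 0.0
    for _ in range(n_draws):
        sigma = rng.choice([-1.0, 1.0], size=n)  # i.i.d. Rademacher signs
        total += preds.dot(sigma).max() / n      # sup over the finite class
    return total / n_draws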
Abstract:
We propose new bounds on the error of learning algorithms in terms of a data-dependent notion of complexity. The estimates we establish give optimal rates and are based on a local and empirical version of Rademacher averages, in the sense that the Rademacher averages are computed from the data, on a subset of functions with small empirical error. We present some applications to classification and prediction with convex function classes, and with kernel classes in particular.
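In the notation usual for local Rademacher averages (an assumption about this paper's conventions, following the standard localization argument), the supremum is restricted to functions with small empirical second moment:

  \hat{R}_n\{ f \in F : P_n f^2 \le r \},  where P_n f^2 = (1/n) \sum_{i=1}^n f(X_i)^2,

and the optimal rates come from the fixed point r^* of the map r \mapsto \hat{R}_n\{ f \in F : P_n f^2 \le r \}, rather than from the global average over all of F.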
Abstract:
Tools known as maximal functions are frequently used in harmonic analysis when studying the local behaviour of functions. Typically they measure the suprema of local averages of non-negative functions. It is essential that the size (more precisely, the L^p-norm) of the maximal function is comparable to the size of the original function. When dealing with families of operators between Banach spaces we are often forced to replace the uniform bound with the larger R-bound. Hence such a replacement is also needed in the maximal function for functions taking values in spaces of operators. More specifically, the suprema of norms of local averages (i.e. their uniform bound in the operator norm) have to be replaced by their R-bound. This procedure gives us the Rademacher maximal function, which was introduced by Hytönen, McIntosh and Portal in order to prove a certain vector-valued Carleson's embedding theorem. They noticed that the sizes of an operator-valued function and its Rademacher maximal function are comparable for many common range spaces, but not for all. Certain requirements on the type and cotype of the spaces involved are necessary for this comparability, henceforth referred to as the "RMF-property". It was shown that other objects and parameters appearing in the definition, such as the domain of functions and the exponent p of the norm, make no difference to this.

After a short introduction to randomized norms and geometry in Banach spaces, we study the Rademacher maximal function on Euclidean spaces. The requirements on the type and cotype are considered, providing examples of spaces without RMF. L^p-spaces are shown to have RMF not only for p >= 2 (when it is trivial) but also for 1 < p < 2. A dyadic version of Carleson's embedding theorem is proven for scalar- and operator-valued functions.

As the analysis with dyadic cubes can be generalized to filtrations on sigma-finite measure spaces, we consider the Rademacher maximal function in this case as well. It turns out that the RMF-property is independent of the filtration and the underlying measure space, and that it is enough to consider very simple ones known as Haar filtrations. Scalar- and operator-valued analogues of Carleson's embedding theorem are also provided.

With the RMF-property proven independent of the underlying measure space, we can use probabilistic notions and formulate it for martingales. Following a similar result for UMD-spaces, a weak type inequality is shown to be (necessary and) sufficient for the RMF-property. The RMF-property is also studied using concave functions, giving yet another proof of its independence from various parameters.
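For orientation, the R-bound mentioned above is the standard one from the literature (a general definition, not specific to this thesis): a family \mathcal{T} of bounded operators is R-bounded with constant C if, for all finite collections T_1, ..., T_N \in \mathcal{T} and vectors x_1, ..., x_N,

  \mathbb{E} \| \sum_k \varepsilon_k T_k x_k \| \le C \, \mathbb{E} \| \sum_k \varepsilon_k x_k \|,

where the \varepsilon_k are independent Rademacher variables. The Rademacher maximal function then replaces the supremum of the operator norms of the local averaging operators by the R-bound of that family.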
Abstract:
In the multi-view approach to semi-supervised learning, we choose one predictor from each of multiple hypothesis classes, and we co-regularize our choices by penalizing disagreement among the predictors on the unlabeled data. We examine the co-regularization method used in the co-regularized least squares (CoRLS) algorithm, in which the views are reproducing kernel Hilbert spaces (RKHSs), and the disagreement penalty is the average squared difference in predictions. The final predictor is the pointwise average of the predictors from each view. We call the set of predictors that can result from this procedure the co-regularized hypothesis class. Our main result is a tight bound on the Rademacher complexity of the co-regularized hypothesis class in terms of the kernel matrices of each RKHS. We find that co-regularization reduces the Rademacher complexity by an amount that depends on the distance between the two views, as measured by a data-dependent metric. We then use standard techniques to bound the gap between training error and test error for the CoRLS algorithm. Experimentally, we find that the amount of reduction in complexity introduced by co-regularization correlates with the amount of improvement that co-regularization gives in the CoRLS algorithm.
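One common way to write the CoRLS objective described here (coefficients and notation are illustrative, not the paper's exact statement): with labeled pairs (x_i, y_i), i = 1..l, unlabeled points x_j, j = 1..u, and views H_1, H_2,

  \min_{f_1 \in H_1, f_2 \in H_2} \sum_{i=1}^l [ (f_1(x_i) - y_i)^2 + (f_2(x_i) - y_i)^2 ] + \gamma_1 \|f_1\|_{H_1}^2 + \gamma_2 \|f_2\|_{H_2}^2 + (\mu / u) \sum_{j=1}^u (f_1(x_j) - f_2(x_j))^2,

with final predictor f = (f_1 + f_2)/2. The Rademacher complexity bound is then stated for the class of such averaged minimizers, and the last term is what shrinks that class as the views agree.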
Abstract:
Biological responses to climate change are typically communicated in generalized terms such as poleward and altitudinal range shifts, but adaptation efforts relevant to management decisions often require forecasts that incorporate the interaction of multiple climatic and nonclimatic stressors at far smaller spatiotemporal scales. We argue that the desire for generalizations has, ironically, contributed to the frequent conflation of weather with climate, even within the scientific community. As a result, current predictions of ecological responses to climate change, and the design of experiments to understand underlying mechanisms, are too often based on broad-scale trends and averages that at a proximate level may have very little to do with the vulnerability of organisms and ecosystems. The creation of biologically relevant metrics of environmental change that incorporate the physical mechanisms by which climate drives patterns of weather, coupled with knowledge of how organisms and ecosystems respond to these changes, can offer insight into which aspects of climate change may be most important to monitor and predict. This approach also has the potential to enhance our ability to communicate impacts of climate change to nonscientists and especially to stakeholders attempting to enact climate change adaptation policies.
Abstract:
Four problems of physical interest have been solved in this thesis using the path integral formalism. Using the trigonometric expansion method of Burton and de Borde (1955), we found the kernel for two interacting one-dimensional oscillators. The result is the same as one would obtain using a normal coordinate transformation. We next introduced the method of Papadopoulos (1969), which is a systematic perturbation-type method specifically geared to finding the partition function Z, or equivalently, the Helmholtz free energy F, of a system of interacting oscillators. We applied this method to the next three problems considered. First, by summing the perturbation expansion, we found F for a system of N interacting Einstein oscillators. The result obtained is the same as the usual result obtained by Shukla and Muller (1972). Next, we found F to O(λ⁴), where λ is the usual Van Hove ordering parameter. The results obtained are the same as those of Shukla and Cowley (1971), who used a diagrammatic procedure and did the necessary sums in Fourier space; we performed the work in temperature space. Finally, slightly modifying the method of Papadopoulos, we found the finite-temperature expressions for the Debye-Waller factor in Bravais lattices, to O(λ²) and O(|K|⁴), where K is the scattering vector. The high-temperature limits of the expressions obtained here are in complete agreement with the classical results of Maradudin and Flinn (1963).
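The two quantities this method targets are related in the standard way (a general identity, not specific to the thesis): the canonical partition function and the Helmholtz free energy satisfy

  Z = \mathrm{Tr}\, e^{-\beta H},  F = -k_B T \ln Z,  \beta = 1/(k_B T),

and in the path integral formalism Z is obtained by integrating the imaginary-time kernel over closed paths, which is why a perturbation expansion for the kernel yields F directly.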
Abstract:
In this paper we make use of some stochastic volatility models to analyse the behaviour of a series of weekly ozone average measurements. The models considered here have been used previously in problems related to financial time series. Two models are considered and their parameters are estimated using a Bayesian approach based on Markov chain Monte Carlo (MCMC) methods. Both models are applied to the data provided by the monitoring network of the Metropolitan Area of Mexico City. The selection of the best model for that specific data set is performed using the Deviance Information Criterion and the Conditional Predictive Ordinate method.
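For context, a canonical discrete-time stochastic volatility specification of the kind used for financial series (the paper's two models may differ in detail) writes the observation y_t and the latent log-volatility h_t as

  y_t = \exp(h_t / 2)\, \epsilon_t,  h_t = \mu + \phi (h_{t-1} - \mu) + \eta_t,

with \epsilon_t ~ N(0, 1) and \eta_t ~ N(0, \sigma_\eta^2). MCMC then targets the joint posterior of the parameters (\mu, \phi, \sigma_\eta) and the latent states h_t given the weekly ozone averages.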