73 resultados para Error estimate.
Resumo:
To obtain a state-of-the-art benchmark potential energy surface (PES) for the archetypal oxidative addition of the methane C-H bond to the palladium atom, we have explored this PES using a hierarchical series of ab initio methods (Hartree-Fock, second-order Møller-Plesset perturbation theory, fourth-order Møller-Plesset perturbation theory with single, double and quadruple excitations, coupled cluster theory with single and double excitations (CCSD), and with triple excitations treated perturbatively [CCSD(T)]) and hybrid density functional theory using the B3LYP functional, in combination with a hierarchical series of ten Gaussian-type basis sets, up to g polarization. Relativistic effects are taken into account either through a relativistic effective core potential for palladium or through a full four-component all-electron approach. Counterpoise corrected relative energies of stationary points are converged to within 0.1-0.2 kcal/mol as a function of the basis-set size. Our best estimate of kinetic and thermodynamic parameters is -8.1 (-8.3) kcal/mol for the formation of the reactant complex, 5.8 (3.1) kcal/mol for the activation energy relative to the separate reactants, and 0.8 (-1.2) kcal/mol for the reaction energy (zero-point vibrational energy-corrected values in parentheses). This agrees well with available experimental data. Our work highlights the importance of sufficient higher angular momentum polarization functions, f and g, for correctly describing metal-d-electron correlation and, thus, for obtaining reliable relative energies. We show that standard basis sets, such as LANL2DZ+ 1f for palladium, are not sufficiently polarized for this purpose and lead to erroneous CCSD(T) results. B3LYP is associated with smaller basis set superposition errors and shows faster convergence with basis-set size but yields relative energies (in particular, a reaction barrier) that are ca. 3.5 kcal/mol higher than the corresponding CCSD(T) values
Resumo:
Recently, the surprising result that ab initio calculations on benzene and other planar arenes at correlated MP2, MP3, configuration interaction with singles and doubles (CISD), and coupled cluster with singles and doubles levels of theory using standard Pople’s basis sets yield nonplanar minima has been reported. The planar optimized structures turn out to be transition states presenting one or more large imaginary frequencies, whereas single-determinant-based methods lead to the expected planar minima and no imaginary frequencies. It has been suggested that such anomalous behavior can be originated by two-electron basis set incompleteness error. In this work, we show that the reported pitfalls can be interpreted in terms of intramolecular basis set superposition error (BSSE) effects, mostly between the C–H moieties constituting the arenes. We have carried out counterpoise-corrected optimizations and frequency calculations at the Hartree–Fock, B3LYP, MP2, and CISD levels of theory with several basis sets for a number of arenes. In all cases, correcting for intramolecular BSSE fixes the anomalous behavior of the correlated methods, whereas no significant differences are observed in the single-determinant case. Consequently, all systems studied are planar at all levels of theory. The effect of different intramolecular fragment definitions and the particular case of charged species, namely, cyclopentadienyl and indenyl anions, respectively, are also discussed
Resumo:
A number of experimental methods have been reported for estimating the number of genes in a genome, or the closely related coding density of a genome, defined as the fraction of base pairs in codons. Recently, DNA sequence data representative of the genome as a whole have become available for several organisms, making the problem of estimating coding density amenable to sequence analytic methods. Estimates of coding density for a single genome vary widely, so that methods with characterized error bounds have become increasingly desirable. We present a method to estimate the protein coding density in a corpus of DNA sequence data, in which a ‘coding statistic’ is calculated for a large number of windows of the sequence under study, and the distribution of the statistic is decomposed into two normal distributions, assumed to be the distributions of the coding statistic in the coding and noncoding fractions of the sequence windows. The accuracy of the method is evaluated using known data and application is made to the yeast chromosome III sequence and to C.elegans cosmid sequences. It can also be applied to fragmentary data, for example a collection of short sequences determined in the course of STS mapping.
Resumo:
Human arteries affected by atherosclerosis are characterized by altered wall viscoelastic properties. The possibility of noninvasively assessing arterial viscoelasticity in vivo would significantly contribute to the early diagnosis and prevention of this disease. This paper presents a noniterative technique to estimate the viscoelastic parameters of a vascular wall Zener model. The approach requires the simultaneous measurement of flow variations and wall displacements, which can be provided by suitable ultrasound Doppler instruments. Viscoelastic parameters are estimated by fitting the theoretical constitutive equations to the experimental measurements using an ARMA parameter approach. The accuracy and sensitivity of the proposed method are tested using reference data generated by numerical simulations of arterial pulsation in which the physiological conditions and the viscoelastic parameters of the model can be suitably varied. The estimated values quantitatively agree with the reference values, showing that the only parameter affected by changing the physiological conditions is viscosity, whose relative error was about 27% even when a poor signal-to-noise ratio is simulated. Finally, the feasibility of the method is illustrated through three measurements made at different flow regimes on a cylindrical vessel phantom, yielding a parameter mean estimation error of 25%.
Resumo:
This paper presents a technique to estimate and model patient-specific pulsatility of cerebral aneurysms over onecardiac cycle, using 3D rotational X-ray angiography (3DRA) acquisitions. Aneurysm pulsation is modeled as a time varying-spline tensor field representing the deformation applied to a reference volume image, thus producing the instantaneousmorphology at each time point in the cardiac cycle. The estimated deformation is obtained by matching multiple simulated projections of the deforming volume to their corresponding original projections. A weighting scheme is introduced to account for the relevance of each original projection for the selected time point. The wide coverage of the projections, together with the weighting scheme, ensures motion consistency in all directions. The technique has been tested on digital and physical phantoms that are realistic and clinically relevant in terms of geometry, pulsation and imaging conditions. Results from digital phantomexperiments demonstrate that the proposed technique is able to recover subvoxel pulsation with an error lower than 10% of the maximum pulsation in most cases. The experiments with the physical phantom allowed demonstrating the feasibility of pulsation estimation as well as identifying different pulsation regions under clinical conditions.
Resumo:
For the standard kernel density estimate, it is known that one can tune the bandwidth such that the expected L1 error is within a constant factor of the optimal L1 error (obtained when one is allowed to choose the bandwidth with knowledge of the density). In this paper, we pose the same problem for variable bandwidth kernel estimates where the bandwidths are allowed to depend upon the location. We show in particular that for positive kernels on the real line, for any data-based bandwidth, there exists a densityfor which the ratio of expected L1 error over optimal L1 error tends to infinity. Thus, the problem of tuning the variable bandwidth in an optimal manner is ``too hard''. Moreover, from the class of counterexamples exhibited in the paper, it appears thatplacing conditions on the densities (monotonicity, convexity, smoothness) does not help.
Resumo:
This paper explores three aspects of strategic uncertainty: its relation to risk, predictability of behavior and subjective beliefs of players. In a laboratory experiment we measure subjects certainty equivalents for three coordination games and one lottery. Behavior in coordination games is related to risk aversion, experience seeking, and age.From the distribution of certainty equivalents we estimate probabilities for successful coordination in a wide range of games. For many games, success of coordination is predictable with a reasonable error rate. The best response to observed behavior is close to the global-game solution. Comparing choices in coordination games with revealed risk aversion, we estimate subjective probabilities for successful coordination. In games with a low coordination requirement, most subjects underestimate the probability of success. In games with a high coordination requirement, most subjects overestimate this probability. Estimating probabilistic decision models, we show that the quality of predictions can be improved when individual characteristics are taken into account. Subjects behavior is consistent with probabilistic beliefs about the aggregate outcome, but inconsistent with probabilistic beliefs about individual behavior.
Resumo:
There is a controversial debate about the effects of permanent disability benefits on labormarket behavior. In this paper we estimate equations for deserving and receiving disabilitybenefits to evaluate the award error as the difference in the probability of receiving anddeserving using survey data from Spain. Our results indicate that individuals aged between55 and 59, self-employers or working in an agricultural sector have a probability of receiving a benefit without deserving it significantly higher than the rest of individuals. We also find evidence of gender discrimination since male have a significantly higher probability of receiving a benefit without deserving it. This seems to confirm that disability benefits are being used as an instrument for exiting the labor market for some individuals approaching the early retirement or those who do not have right to retire early. Taking into account that awarding process depends on Social Security Provincial Department, this means that some departments are applying loosely the disability requirements for granting disability benefits.
Resumo:
Consider the problem of testing k hypotheses simultaneously. In this paper,we discuss finite and large sample theory of stepdown methods that providecontrol of the familywise error rate (FWE). In order to improve upon theBonferroni method or Holm's (1979) stepdown method, Westfall and Young(1993) make eective use of resampling to construct stepdown methods thatimplicitly estimate the dependence structure of the test statistics. However,their methods depend on an assumption called subset pivotality. The goalof this paper is to construct general stepdown methods that do not requiresuch an assumption. In order to accomplish this, we take a close look atwhat makes stepdown procedures work, and a key component is a monotonicityrequirement of critical values. By imposing such monotonicity on estimatedcritical values (which is not an assumption on the model but an assumptionon the method), it is demonstrated that the problem of constructing a validmultiple test procedure which controls the FWE can be reduced to the problemof contructing a single test which controls the usual probability of a Type 1error. This reduction allows us to draw upon an enormous resamplingliterature as a general means of test contruction.
Resumo:
This work is part of a project studying the performance of model basedestimators in a small area context. We have chosen a simple statisticalapplication in which we estimate the growth rate of accupation for severalregions of Spain. We compare three estimators: the direct one based onstraightforward results from the survey (which is unbiassed), and a thirdone which is based in a statistical model and that minimizes the mean squareerror.
Resumo:
Given $n$ independent replicates of a jointly distributed pair $(X,Y)\in {\cal R}^d \times {\cal R}$, we wish to select from a fixed sequence of model classes ${\cal F}_1, {\cal F}_2, \ldots$ a deterministic prediction rule $f: {\cal R}^d \to {\cal R}$ whose risk is small. We investigate the possibility of empirically assessingthe {\em complexity} of each model class, that is, the actual difficulty of the estimation problem within each class. The estimated complexities are in turn used to define an adaptive model selection procedure, which is based on complexity penalized empirical risk.The available data are divided into two parts. The first is used to form an empirical cover of each model class, and the second is used to select a candidate rule from each cover based on empirical risk. The covering radii are determined empirically to optimize a tight upper bound on the estimation error. An estimate is chosen from the list of candidates in order to minimize the sum of class complexity and empirical risk. A distinguishing feature of the approach is that the complexity of each model class is assessed empirically, based on the size of its empirical cover.Finite sample performance bounds are established for the estimates, and these bounds are applied to several non-parametric estimation problems. The estimates are shown to achieve a favorable tradeoff between approximation and estimation error, and to perform as well as if the distribution-dependent complexities of the model classes were known beforehand. In addition, it is shown that the estimate can be consistent,and even possess near optimal rates of convergence, when each model class has an infinite VC or pseudo dimension.For regression estimation with squared loss we modify our estimate to achieve a faster rate of convergence.
Resumo:
The classical binary classification problem is investigatedwhen it is known in advance that the posterior probability function(or regression function) belongs to some class of functions. We introduceand analyze a method which effectively exploits this knowledge. The methodis based on minimizing the empirical risk over a carefully selected``skeleton'' of the class of regression functions. The skeleton is acovering of the class based on a data--dependent metric, especiallyfitted for classification. A new scale--sensitive dimension isintroduced which is more useful for the studied classification problemthan other, previously defined, dimension measures. This fact isdemonstrated by performance bounds for the skeleton estimate in termsof the new dimension.
Resumo:
We continue the development of a method for the selection of a bandwidth or a number of design parameters in density estimation. We provideexplicit non-asymptotic density-free inequalities that relate the $L_1$ error of the selected estimate with that of the best possible estimate,and study in particular the connection between the richness of the classof density estimates and the performance bound. For example, our methodallows one to pick the bandwidth and kernel order in the kernel estimatesimultaneously and still assure that for {\it all densities}, the $L_1$error of the corresponding kernel estimate is not larger than aboutthree times the error of the estimate with the optimal smoothing factor and kernel plus a constant times $\sqrt{\log n/n}$, where $n$ is the sample size, and the constant only depends on the complexity of the family of kernels used in the estimate. Further applications include multivariate kernel estimates, transformed kernel estimates, and variablekernel estimates.
Resumo:
Ground clutter caused by anomalous propagation (anaprop) can affect seriously radar rain rate estimates, particularly in fully automatic radar processing systems, and, if not filtered, can produce frequent false alarms. A statistical study of anomalous propagation detected from two operational C-band radars in the northern Italian region of Emilia Romagna is discussed, paying particular attention to its diurnal and seasonal variability. The analysis shows a high incidence of anaprop in summer, mainly in the morning and evening, due to the humid and hot summer climate of the Po Valley, particularly in the coastal zone. Thereafter, a comparison between different techniques and datasets to retrieve the vertical profile of the refractive index gradient in the boundary layer is also presented. In particular, their capability to detect anomalous propagation conditions is compared. Furthermore, beam path trajectories are simulated using a multilayer ray-tracing model and the influence of the propagation conditions on the beam trajectory and shape is examined. High resolution radiosounding data are identified as the best available dataset to reproduce accurately the local propagation conditions, while lower resolution standard TEMP data suffers from interpolation degradation and Numerical Weather Prediction model data (Lokal Model) are able to retrieve a tendency to superrefraction but not to detect ducting conditions. Observing the ray tracing of the centre, lower and upper limits of the radar antenna 3-dB half-power main beam lobe it is concluded that ducting layers produce a change in the measured volume and in the power distribution that can lead to an additional error in the reflectivity estimate and, subsequently, in the estimated rainfall rate.