46 resultados para large sample distributions


Relevância:

80.00% 80.00%

Publicador:

Resumo:

This paper investigates the price effect of EPC ratings on the residential dwelling prices in Wales. It examines the capitalisation of energy efficiency ratings into house prices using two approaches. The first adopts a cross-sectional framework to investigate the effect of EPC band (and EPC rating) on a large sample of dwelling transactions. The second approach is based on a repeat-sales methodology to examine the impact of EPC band and rating on house price appreciation. The results show that, controlling for other price influencing dwelling characteristics, EPC band does affect house prices. This observed influence of EPC on price may not be a result of energy performance alone; the effect may be due to non-energy related benefits associated with certain types, specifications and ages of dwellings or there may be unobserved quality differences unrelated to energy performance such as better quality fittings and materials. An analysis of the private rental segment reveals that, in contrast to the general market, low-EPC rated properties were not traded at a significant discount, suggesting different implicit prices of potential energy savings for landlords and owner-occupiers.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Combining data on structural characteristics and economic performance for a large sample of Italian firms with data on exporting and importing activity, we uncover evidence supporting recent theories on firm heterogeneity and international trade, together with some new facts. In particular, we find that importing is associated with substantial firm heterogeneity. First, we document that trade is more concentrated than employment and sales, and show that importing is even more concentrated than exporting both within sectors and along the sector- and country-extensive margins. Second, while supporting the fact that firms involved in both are the best performers, we also find that firms involved only in importing activities perform better than those involved only in exporting. Our evidence suggests there is a strong self-selection effect in the case of importers and the performance premia of internationalised firms correlate relatively more with the degree of geographical and sectoral diversification of imports.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The challenge of moving past the classic Window Icons Menus Pointer (WIMP) interface, i.e. by turning it ‘3D’, has resulted in much research and development. To evaluate the impact of 3D on the ‘finding a target picture in a folder’ task, we built a 3D WIMP interface that allowed the systematic manipulation of visual depth, visual aides, semantic category distribution of targets versus non-targets; and the detailed measurement of lower-level stimuli features. Across two separate experiments, one large sample web-based experiment, to understand associations, and one controlled lab environment, using eye tracking to understand user focus, we investigated how visual depth, use of visual aides, use of semantic categories, and lower-level stimuli features (i.e. contrast, colour and luminance) impact how successfully participants are able to search for, and detect, the target image. Moreover in the lab-based experiment, we captured pupillometry measurements to allow consideration of the influence of increasing cognitive load as a result of either an increasing number of items on the screen, or due to the inclusion of visual depth. Our findings showed that increasing the visible layers of depth, and inclusion of converging lines, did not impact target detection times, errors, or failure rates. Low-level features, including colour, luminance, and number of edges, did correlate with differences in target detection times, errors, and failure rates. Our results also revealed that semantic sorting algorithms significantly decreased target detection times. Increased semantic contrasts between a target and its neighbours correlated with an increase in detection errors. Finally, pupillometric data did not provide evidence of any correlation between the number of visible layers of depth and pupil size, however, using structural equation modelling, we demonstrated that cognitive load does influence detection failure rates when there is luminance contrasts between the target and its surrounding neighbours. Results suggest that WIMP interaction designers should consider stimulus-driven factors, which were shown to influence the efficiency with which a target icon can be found in a 3D WIMP interface.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Global climate and weather models tend to produce rainfall that is too light and too regular over the tropical ocean. This is likely because of convective parametrizations, but the problem is not well understood. Here, distributions of precipitation rates are analyzed for high-resolution UK Met Office Unified Model simulations of a 10 day case study over a large tropical domain (∼20°S–20°N and 42°E–180°E). Simulations with 12 km grid length and parametrized convection have too many occurrences of light rain and too few of heavier rain when interpolated onto a 1° grid and compared with Tropical Rainfall Measuring Mission (TRMM) data. In fact, this version of the model appears to have a preferred scale of rainfall around 0.4 mm h−1 (10 mm day−1), unlike observations of tropical rainfall. On the other hand, 4 km grid length simulations with explicit convection produce distributions much more similar to TRMM observations. The apparent preferred scale at lighter rain rates seems to be a feature of the convective parametrization rather than the coarse resolution, as demonstrated by results from 12 km simulations with explicit convection and 40 km simulations with parametrized convection. In fact, coarser resolution models with explicit convection tend to have even more heavy rain than observed. Implications for models using convective parametrizations, including interactions of heating and moistening profiles with larger scales, are discussed. One important implication is that the explicit convection 4 km model has temperature and moisture tendencies that favour transitions in the convective regime. Also, the 12 km parametrized convection model produces a more stable temperature profile at its extreme high-precipitation range, which may reduce the chance of very heavy rainfall. Further study is needed to determine whether unrealistic precipitation distributions are due to some fundamental limitation of convective parametrizations or whether parametrizations can be improved, in order to better simulate these distributions.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A regional study of the prediction of extratropical cyclones by the European Centre for Medium-Range Weather Forecasts (ECMWF) Ensemble Prediction System (EPS) has been performed. An objective feature-tracking method has been used to identify and track the cyclones along the forecast trajectories. Forecast error statistics have then been produced for the position, intensity, and propagation speed of the storms. In previous work, data limitations meant it was only possible to present the diagnostics for the entire Northern Hemisphere (NH) or Southern Hemisphere. A larger data sample has allowed the diagnostics to be computed separately for smaller regions around the globe and has made it possible to explore the regional differences in the prediction of storms by the EPS. Results show that in the NH there is a larger ensemble mean error in the position of storms over the Atlantic Ocean. Further analysis revealed that this is mainly due to errors in the prediction of storm propagation speed rather than in direction. Forecast storms propagate too slowly in all regions, but the bias is about 2 times as large in the NH Atlantic region. The results show that storm intensity is generally overpredicted over the ocean and underpredicted over the land and that the absolute error in intensity is larger over the ocean than over the land. In the NH, large errors occur in the prediction of the intensity of storms that originate as tropical cyclones but then move into the extratropics. The ensemble is underdispersive for the intensity of cyclones (i.e., the spread is smaller than the mean error) in all regions. The spatial patterns of the ensemble mean error and ensemble spread are very different for the intensity of cyclones. Spatial distributions of the ensemble mean error suggest that large errors occur during the growth phase of storm development, but this is not indicated by the spatial distributions of the ensemble spread. In the NH there are further differences. First, the large errors in the prediction of the intensity of cyclones that originate in the tropics are not indicated by the spread. Second, the ensemble mean error is larger over the Pacific Ocean than over the Atlantic, whereas the opposite is true for the spread. The use of a storm-tracking approach, to both weather forecasters and developers of forecast systems, is also discussed.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

It has been generally accepted that the method of moments (MoM) variogram, which has been widely applied in soil science, requires about 100 sites at an appropriate interval apart to describe the variation adequately. This sample size is often larger than can be afforded for soil surveys of agricultural fields or contaminated sites. Furthermore, it might be a much larger sample size than is needed where the scale of variation is large. A possible alternative in such situations is the residual maximum likelihood (REML) variogram because fewer data appear to be required. The REML method is parametric and is considered reliable where there is trend in the data because it is based on generalized increments that filter trend out and only the covariance parameters are estimated. Previous research has suggested that fewer data are needed to compute a reliable variogram using a maximum likelihood approach such as REML, however, the results can vary according to the nature of the spatial variation. There remain issues to examine: how many fewer data can be used, how should the sampling sites be distributed over the site of interest, and how do different degrees of spatial variation affect the data requirements? The soil of four field sites of different size, physiography, parent material and soil type was sampled intensively, and MoM and REML variograms were calculated for clay content. The data were then sub-sampled to give different sample sizes and distributions of sites and the variograms were computed again. The model parameters for the sets of variograms for each site were used for cross-validation. Predictions based on REML variograms were generally more accurate than those from MoM variograms with fewer than 100 sampling sites. A sample size of around 50 sites at an appropriate distance apart, possibly determined from variograms of ancillary data, appears adequate to compute REML variograms for kriging soil properties for precision agriculture and contaminated sites. (C) 2007 Elsevier B.V. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The paper concerns the design and analysis of serial dilution assays to estimate the infectivity of a sample of tissue when it is assumed that the sample contains a finite number of indivisible infectious units such that a subsample will be infectious if it contains one or more of these units. The aim of the study is to estimate the number of infectious units in the original sample. The standard approach to the analysis of data from such a study is based on the assumption of independence of aliquots both at the same dilution level and at different dilution levels, so that the numbers of infectious units in the aliquots follow independent Poisson distributions. An alternative approach is based on calculation of the expected value of the total number of samples tested that are not infectious. We derive the likelihood for the data on the basis of the discrete number of infectious units, enabling calculation of the maximum likelihood estimate and likelihood-based confidence intervals. We use the exact probabilities that are obtained to compare the maximum likelihood estimate with those given by the other methods in terms of bias and standard error and to compare the coverage of the confidence intervals. We show that the methods have very similar properties and conclude that for practical use the method that is based on the Poisson assumption is to be recommended, since it can be implemented by using standard statistical software. Finally we consider the design of serial dilution assays, concluding that it is important that neither the dilution factor nor the number of samples that remain untested should be too large.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We consider the comparison of two formulations in terms of average bioequivalence using the 2 × 2 cross-over design. In a bioequivalence study, the primary outcome is a pharmacokinetic measure, such as the area under the plasma concentration by time curve, which is usually assumed to have a lognormal distribution. The criterion typically used for claiming bioequivalence is that the 90% confidence interval for the ratio of the means should lie within the interval (0.80, 1.25), or equivalently the 90% confidence interval for the differences in the means on the natural log scale should be within the interval (-0.2231, 0.2231). We compare the gold standard method for calculation of the sample size based on the non-central t distribution with those based on the central t and normal distributions. In practice, the differences between the various approaches are likely to be small. Further approximations to the power function are sometimes used to simplify the calculations. These approximations should be used with caution, because the sample size required for a desirable level of power might be under- or overestimated compared to the gold standard method. However, in some situations the approximate methods produce very similar sample sizes to the gold standard method. Copyright © 2005 John Wiley & Sons, Ltd.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The problem of estimating the individual probabilities of a discrete distribution is considered. The true distribution of the independent observations is a mixture of a family of power series distributions. First, we ensure identifiability of the mixing distribution assuming mild conditions. Next, the mixing distribution is estimated by non-parametric maximum likelihood and an estimator for individual probabilities is obtained from the corresponding marginal mixture density. We establish asymptotic normality for the estimator of individual probabilities by showing that, under certain conditions, the difference between this estimator and the empirical proportions is asymptotically negligible. Our framework includes Poisson, negative binomial and logarithmic series as well as binomial mixture models. Simulations highlight the benefit in achieving normality when using the proposed marginal mixture density approach instead of the empirical one, especially for small sample sizes and/or when interest is in the tail areas. A real data example is given to illustrate the use of the methodology.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We describe a Bayesian method for investigating correlated evolution of discrete binary traits on phylogenetic trees. The method fits a continuous-time Markov model to a pair of traits, seeking the best fitting models that describe their joint evolution on a phylogeny. We employ the methodology of reversible-jump ( RJ) Markov chain Monte Carlo to search among the large number of possible models, some of which conform to independent evolution of the two traits, others to correlated evolution. The RJ Markov chain visits these models in proportion to their posterior probabilities, thereby directly estimating the support for the hypothesis of correlated evolution. In addition, the RJ Markov chain simultaneously estimates the posterior distributions of the rate parameters of the model of trait evolution. These posterior distributions can be used to test among alternative evolutionary scenarios to explain the observed data. All results are integrated over a sample of phylogenetic trees to account for phylogenetic uncertainty. We implement the method in a program called RJ Discrete and illustrate it by analyzing the question of whether mating system and advertisement of estrus by females have coevolved in the Old World monkeys and great apes.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Resolving the relationships between Metazoa and other eukaryotic groups as well as between metazoan phyla is central to the understanding of the origin and evolution of animals. The current view is based on limited data sets, either a single gene with many species (e.g., ribosomal RNA) or many genes but with only a few species. Because a reliable phylogenetic inference simultaneously requires numerous genes and numerous species, we assembled a very large data set containing 129 orthologous proteins (similar to30,000 aligned amino acid positions) for 36 eukaryotic species. Included in the alignments are data from the choanoflagellate Monosiga ovata, obtained through the sequencing of about 1,000 cDNAs. We provide conclusive support for choanoflagellates as the closest relative of animals and for fungi as the second closest. The monophyly of Plantae and chromalveolates was recovered but without strong statistical support. Within animals, in contrast to the monophyly of Coelomata observed in several recent large-scale analyses, we recovered a paraphyletic Coelamata, with nematodes and platyhelminths nested within. To include a diverse sample of organisms, data from EST projects were used for several species, resulting in a large amount of missing data in our alignment (about 25%). By using different approaches, we verify that the inferred phylogeny is not sensitive to these missing data. Therefore, this large data set provides a reliable phylogenetic framework for studying eukaryotic and animal evolution and will be easily extendable when large amounts of sequence information become available from a broader taxonomic range.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A revised Bayesian algorithm for estimating surface rain rate, convective rain proportion, and latent heating profiles from satellite-borne passive microwave radiometer observations over ocean backgrounds is described. The algorithm searches a large database of cloud-radiative model simulations to find cloud profiles that are radiatively consistent with a given set of microwave radiance measurements. The properties of these radiatively consistent profiles are then composited to obtain best estimates of the observed properties. The revised algorithm is supported by an expanded and more physically consistent database of cloud-radiative model simulations. The algorithm also features a better quantification of the convective and nonconvective contributions to total rainfall, a new geographic database, and an improved representation of background radiances in rain-free regions. Bias and random error estimates are derived from applications of the algorithm to synthetic radiance data, based upon a subset of cloud-resolving model simulations, and from the Bayesian formulation itself. Synthetic rain-rate and latent heating estimates exhibit a trend of high (low) bias for low (high) retrieved values. The Bayesian estimates of random error are propagated to represent errors at coarser time and space resolutions, based upon applications of the algorithm to TRMM Microwave Imager (TMI) data. Errors in TMI instantaneous rain-rate estimates at 0.5°-resolution range from approximately 50% at 1 mm h−1 to 20% at 14 mm h−1. Errors in collocated spaceborne radar rain-rate estimates are roughly 50%–80% of the TMI errors at this resolution. The estimated algorithm random error in TMI rain rates at monthly, 2.5° resolution is relatively small (less than 6% at 5 mm day−1) in comparison with the random error resulting from infrequent satellite temporal sampling (8%–35% at the same rain rate). Percentage errors resulting from sampling decrease with increasing rain rate, and sampling errors in latent heating rates follow the same trend. Averaging over 3 months reduces sampling errors in rain rates to 6%–15% at 5 mm day−1, with proportionate reductions in latent heating sampling errors.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Over recent years there has been an increasing deployment of renewable energy generation technologies, particularly large-scale wind farms. As wind farm deployment increases, it is vital to gain a good understanding of how the energy produced is affected by climate variations, over a wide range of time-scales, from short (hours to weeks) to long (months to decades) periods. By relating wind speed at specific sites in the UK to a large-scale climate pattern (the North Atlantic Oscillation or "NAO"), the power generated by a modelled wind turbine under three different NAO states is calculated. It was found that the wind conditions under these NAO states may yield a difference in the mean wind power output of up to 10%. A simple model is used to demonstrate that forecasts of future NAO states can potentially be used to improve month-ahead statistical forecasts of monthly-mean wind power generation. The results confirm that the NAO has a significant impact on the hourly-, daily- and monthly-mean power output distributions from the turbine with important implications for (a) the use of meteorological data (e.g. their relationship to large scale climate patterns) in wind farm site assessment and, (b) the utilisation of seasonal-to-decadal climate forecasts to estimate future wind farm power output. This suggests that further research into the links between large-scale climate variability and wind power generation is both necessary and valuable.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We compare rain event size distributions derived from measurements in climatically different regions, which we find to be well approximated by power laws of similar exponents over broad ranges. Differences can be seen in the large-scale cutoffs of the distributions. Event duration distributions suggest that the scale-free aspects are related to the absence of characteristic scales in the meteorological mesoscale.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Magmas in volcanic conduits commonly contain microlites in association with preexisting phenocrysts, as often indicated by volcanic rock textures. In this study, we present two different experiments that inves- tigate the flow behavior of these bidisperse systems. In the first experiments, rotational rheometric methods are used to determine the rheology of monodisperse and polydisperse suspensions consisting of smaller, prolate particles (microlites) and larger, equant particles (phenocrysts) in a bubble‐free Newtonian liquid (silicate melt). Our data show that increasing the relative proportion of prolate microlites to equant pheno- crysts in a magma at constant total particle content can increase the relative viscosity by up to three orders of magnitude. Consequently, the rheological effect of particles in magmas cannot be modeled by assuming a monodisperse population of particles. We propose a new model that uses interpolated parameters based on the relative proportions of small and large particles and produces a considerably improved fit to the data than earlier models. In a second series of experiments we investigate the textures produced by shearing bimodal suspensions in gradually solidifying epoxy resin in a concentric cylinder setup. The resulting textures show the prolate particles are aligned with the flow lines and spherical particles are found in well‐organized strings, with sphere‐depleted shear bands in high‐shear regions. These observations may explain the measured variation in the shear thinning and yield stress behavior with increasing solid fraction and particle aspect ratio. The implications for magma flow are discussed, and rheological results and tex- tural observations are compared with observations on natural samples.