935 results for test data generation
Abstract:
This paper presents an approximate closed form sample size formula for determining non-inferiority in active-control trials with binary data. We use the odds-ratio as the measure of the relative treatment effect, derive the sample size formula based on the score test and compare it with a second, well-known formula based on the Wald test. Both closed form formulae are compared with simulations based on the likelihood ratio test. Within the range of parameter values investigated, the score test closed form formula is reasonably accurate when non-inferiority margins are based on odds-ratios of about 0.5 or above and when the magnitude of the odds ratio under the alternative hypothesis lies between about 1 and 2.5. The accuracy generally decreases as the odds ratio under the alternative hypothesis moves upwards from 1. As the non-inferiority margin odds ratio decreases from 0.5, the score test closed form formula increasingly overestimates the sample size irrespective of the magnitude of the odds ratio under the alternative hypothesis. The Wald test closed form formula is also reasonably accurate in the cases where the score test closed form formula works well. Outside these scenarios, the Wald test closed form formula can either underestimate or overestimate the sample size, depending on the magnitude of the non-inferiority margin odds ratio and the odds ratio under the alternative hypothesis. Although neither approximation is accurate for all cases, both approaches lead to satisfactory sample size calculation for non-inferiority trials with binary data where the odds ratio is the parameter of interest.
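For orientation, a Wald-type closed form for this setting is commonly written, per group and with equal allocation, as n = (z_(1-alpha) + z_(1-beta))^2 [1/(p_C(1-p_C)) + 1/(p_E(1-p_E))] / (ln OR_alt - ln OR_margin)^2. The sketch below implements that textbook form only; the function name, default error rates and example values are illustrative assumptions, and the formulae derived in the paper itself may differ in detail.

```python
import math
from statistics import NormalDist


def wald_sample_size_or(p_control, or_alt, or_margin, alpha=0.025, power=0.9):
    """Per-group sample size for a non-inferiority test on the odds ratio.

    Assumes a Wald test on the log odds ratio, equal allocation, and a
    one-sided significance level `alpha` (all assumptions of this sketch).
    """
    if not 0.0 < p_control < 1.0:
        raise ValueError("p_control must lie strictly between 0 and 1")
    effect = math.log(or_alt) - math.log(or_margin)
    if effect <= 0.0:
        raise ValueError("or_alt must exceed or_margin")

    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    z_beta = NormalDist().inv_cdf(power)

    # Response rate in the experimental arm implied by the alternative odds ratio.
    odds_exp = or_alt * p_control / (1.0 - p_control)
    p_exp = odds_exp / (1.0 + odds_exp)

    # Variance of the log odds ratio estimate, per unit of per-group sample size.
    unit_var = 1.0 / (p_control * (1.0 - p_control)) + 1.0 / (p_exp * (1.0 - p_exp))

    return math.ceil((z_alpha + z_beta) ** 2 * unit_var / effect ** 2)


if __name__ == "__main__":
    # Illustrative inputs: 50% control response, OR of 1 under the alternative,
    # non-inferiority margin OR of 0.5.
    print(wald_sample_size_or(0.5, 1.0, 0.5))
```

Because the required size scales with the inverse square of the distance between the two log odds ratios, it grows quickly as the margin approaches the odds ratio assumed under the alternative.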
Abstract:
Met Office station data from 1980 to 2012 have been used to characterise the interannual variability of incident solar irradiance across the UK. The same data are used to evaluate four popular historical irradiance products to determine which are most suitable for use by the UK PV industry for site selection and system design. The study confirmed previous findings that interannual variability is typically 3–6%, and the weighted average probability of a particular percentage deviation from the mean at an average site in the UK was calculated. This weighted average showed that fewer than 2% of site-years could be expected to fall below 90% of the long-term site mean. The historical irradiance products were compared against Met Office station data from the input years of each product. This investigation has found that all products perform well. No products have a strong spatial trend. Meteonorm 7 is most conservative (MBE = −2.5%), CMSAF is most optimistic (MBE = +3.4%) and an average of all four products performs better than any one individual product (MBE = 0.3%).
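The MBE figures quoted above are mean bias errors expressed as percentages. A minimal sketch of one common way to compute such a figure follows; it assumes the bias is normalised by the mean of the station observations, which the abstract does not state, and the function name and sample values are purely illustrative.

```python
def relative_mbe(product, station):
    """Mean bias error of `product` against `station`, in percent of the station mean.

    Positive values mean the product overestimates (optimistic),
    negative values mean it underestimates (conservative).
    """
    if len(product) != len(station) or not station:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_bias = sum(p - s for p, s in zip(product, station)) / len(station)
    mean_station = sum(station) / len(station)
    return 100.0 * mean_bias / mean_station


if __name__ == "__main__":
    # Made-up annual irradiation values (kWh/m^2), not data from the study.
    station = [1000.0, 1000.0, 1000.0]
    product = [1020.0, 980.0, 1030.0]
    print(f"MBE = {relative_mbe(product, station):+.1f}%")
```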
Abstract:
Upscaling ecological information to larger scales in space and downscaling remote sensing observations or model simulations to finer scales remain grand challenges in Earth system science. Downscaling often involves inferring subgrid information from coarse-scale data, and such ill-posed problems are classically addressed using regularization. Here, we apply two-dimensional Tikhonov Regularization (2DTR) to simulate subgrid surface patterns for ecological applications. Specifically, we test the ability of 2DTR to simulate the spatial statistics of high-resolution (4 m) remote sensing observations of the normalized difference vegetation index (NDVI) in a tundra landscape. We find that the 2DTR approach as applied here can capture the major mode of spatial variability of the high-resolution information, but not multiple modes of spatial variability, and that the Lagrange multiplier (γ) used to impose the condition of smoothness across space is related to the range of the experimental semivariogram. We used observed and 2DTR-simulated maps of NDVI to estimate landscape-level leaf area index (LAI) and gross primary productivity (GPP). NDVI maps simulated using a γ value that approximates the range of observed NDVI result in a landscape-level GPP estimate that differs by ca 2% from those created using observed NDVI. Following findings that GPP per unit LAI is lower near vegetation patch edges, we simulated vegetation patch edges using multiple approaches and found that simulated GPP declined by up to 12% as a result. 2DTR can generate random landscapes rapidly and can be applied to disaggregate ecological information and compare spatial observations against simulated landscapes.
Abstract:
Wind generation's contribution to supporting peak electricity demand is one of the key questions in wind integration studies. Unlike conventional units, the available outputs of different wind farms cannot be approximated as being statistically independent, and hence near-zero wind output is possible across an entire power system. This paper will review the risk model structures currently used to assess wind's capacity value, along with discussion of the resulting data requirements. A central theme is the benefits from performing statistical estimation of the joint distribution for demand and available wind capacity, focusing attention on uncertainties due to limited histories of wind and demand data; examination of Great Britain data from the last 25 years shows that the data requirements are greater than generally thought. A discussion is therefore presented of how analysis of the types of weather system which have historically driven extreme electricity demands can help to deliver robust insights into wind's contribution to supporting demand, even in the face of such data limitations. The role of the form of the probability distribution for available conventional capacity in driving wind capacity credit results is also discussed.
Abstract:
There has been an ongoing concern about the lack of reliable data on disabled children in schools. To date there has been no consistent way of identifying and categorising disabilities. Schools in England are currently required to collect data on children with Special Educational Need (SEN), but this does not capture information about all disabled children. The lack of this information may seriously restrict capacity at all levels of policy and practice to understand and respond to the needs of disabled children and their families in line with the Disability Discrimination Act (2005) and the single Equality Act (2010). The aim of the project was to test the draft tools for identifying disability and accompanying guidance in a sample of all types of maintained schools in order to assess their usability and reliability and whether they resulted in the generation of robust and consistent data that could reliably inform school returns for the annual School Census.
Abstract:
We use sunspot group observations from the Royal Greenwich Observatory (RGO) to investigate the effects of intercalibrating data from observers with different visual acuities. The tests are made by counting the number of groups RB above a variable cut-off threshold of observed total whole-spot area (uncorrected for foreshortening) to simulate what a lower acuity observer would have seen. The synthesised annual means of RB are then re-scaled to the full observed RGO group number RA using a variety of regression techniques. It is found that a very high correlation between RA and RB (rAB > 0.98) does not prevent large errors in the intercalibration (for example sunspot maximum values can be over 30 % too large even for such levels of rAB). In generating the backbone sunspot number (RBB), Svalgaard and Schatten (2015, this issue) force regression fits to pass through the scatter plot origin which generates unreliable fits (the residuals do not form a normal distribution) and causes sunspot cycle amplitudes to be exaggerated in the intercalibrated data. It is demonstrated that the use of Quantile-Quantile (“Q-Q”) plots to test for a normal distribution is a useful indicator of erroneous and misleading regression fits. Ordinary least squares linear fits, not forced to pass through the origin, are sometimes reliable (although the optimum method used is shown to be different when matching peak and average sunspot group numbers). However, other fits are only reliable if non-linear regression is used. From these results it is entirely possible that the inflation of solar cycle amplitudes in the backbone group sunspot number as one goes back in time, relative to related solar-terrestrial parameters, is entirely caused by the use of inappropriate and non-robust regression techniques to calibrate the sunspot data.
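The effect of forcing a fit through the origin can be illustrated with a small synthetic example. The sketch below is not the paper's analysis: the data are invented, noise-free, and deliberately built with a positive intercept, and the function names are hypothetical. It only shows that, when the true relation has a positive intercept, a through-origin fit takes a steeper slope and so overestimates the largest values.

```python
def fit_through_origin(x, y):
    """Least-squares slope for the model y = b * x (no intercept)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def fit_ols(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


if __name__ == "__main__":
    # Synthetic stand-ins for a lower-acuity count (x) and a full count (y),
    # constructed as y = 2 + 1.2 * x. These are not RGO data.
    x = [float(i) for i in range(1, 11)]
    y = [2.0 + 1.2 * xi for xi in x]

    slope_origin = fit_through_origin(x, y)
    intercept, slope = fit_ols(x, y)

    print("through origin: slope =", round(slope_origin, 3))
    print("OLS: intercept =", round(intercept, 3), "slope =", round(slope, 3))
    print("prediction at largest x:",
          round(slope_origin * x[-1], 2), "vs true", y[-1])
```

With real, noisy data the size and sign of the effect depend on the true intercept, which is why inspecting the residuals, for example with a Q-Q plot, remains the more general check.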
Abstract:
Solitary meanders of the Agulhas Current, so-called Natal pulses, may play an important role in the overall dynamics of this current system. Several hypotheses concerning the triggering of these pulses are tested using sea surface height and temperature data from satellites. The data show the formation of pulses in the Natal Bight area at irregular intervals ranging from 50 to 240 days. Moving downstream at speeds between 10 and 20 km day⁻¹, they sometimes reach sizes of up to 300 km. They seem to play a role in the shedding of Agulhas rings that penetrate the South Atlantic. The intermittent formation of these solitary meanders is argued to be most probably related to barotropic instability of the strongly baroclinic Agulhas Current in the Natal Bight. The vorticity structure of the observed basic flow is argued to be stable anywhere along its path. However, a proper perturbation of the jet in the Natal Bight area will allow barotropic instability, because the bottom slope there is considerably less steep than elsewhere along the South African east coast. Using satellite altimetry, these perturbations seem to be related to the intermittent presence of offshore anticyclonic anomalies, both upstream and eastward of the Natal Bight.
Abstract:
Existing urban meteorological networks have an important role to play as test beds for inexpensive and more sustainable measurement techniques that are now becoming possible in our increasingly smart cities. The Birmingham Urban Climate Laboratory (BUCL) is a near-real-time, high-resolution urban meteorological network (UMN) of automatic weather stations and inexpensive, nonstandard air temperature sensors. The network has recently been implemented with an initial focus on monitoring urban heat, infrastructure, and health applications. A number of UMNs exist worldwide; however, BUCL is novel in its density, the low-cost nature of the sensors, and the use of proprietary Wi-Fi networks. This paper provides an overview of the logistical aspects of implementing a UMN test bed at such a density, including selecting appropriate urban sites; testing and calibrating low-cost, nonstandard equipment; implementing strict quality-assurance/quality-control mechanisms (including metadata); and utilizing preexisting Wi-Fi networks to transmit data. Also included are visualizations of data collected by the network, including data from the July 2013 U.K. heatwave as well as highlighting potential applications. The paper is an open invitation to use the facility as a test bed for evaluating models and/or other nonstandard observation techniques such as those generated via crowdsourcing techniques.
Abstract:
The increased availability of digital elevation models and satellite image data enables testing of morphometric relationships between sand dune variables (dune height, spacing and equivalent sand thickness), which were originally established using limited field survey data. These long-established geomorphological hypotheses can now be tested against very much larger samples than were possible when available data were limited to what could be collected by field surveys alone. This project uses ASTER Global Digital Elevation Model (GDEM) data to compare morphometric relationships between sand dune variables in the southwest Kalahari dunefield to those of the Namib Sand Sea, to test whether the relationships found in an active sand sea (Namib) also hold for the fixed dune system of the nearby southwest Kalahari. The data show significant morphometric differences between the simple linear dunes of the Namib sand sea and the southwest Kalahari; the latter do not show the expected positive relationship between dune height and spacing. The southwest Kalahari dunes show a similar range of dune spacings, but they are less tall, on average, than the Namib sand sea dunes. There is a clear spatial pattern to these morphometric data; the tallest and most closely spaced dunes are towards the southeast of the Kalahari dunefield; and this is where the highest values of equivalent sand thickness result. We consider the possible reasons for the observed differences and highlight the need for more studies comparing sand seas and dunefields from different environmental settings.
Abstract:
More than 70 years ago it was recognised that ionospheric F2-layer critical frequencies [foF2] had a strong relationship to sunspot number. Using historic datasets from the Slough and Washington ionosondes, we evaluate the best statistical fits of foF2 to sunspot numbers (at each Universal Time [UT] separately) in order to search for drifts and abrupt changes in the fit residuals over Solar Cycles 17-21. This test is carried out for the original composite of the Wolf/Zürich/International sunspot number [R], the new “backbone” group sunspot number [RBB] and the proposed “corrected sunspot number” [RC]. Polynomial fits are made both with and without allowance for the white-light facular area, which has been reported as being associated with cycle-to-cycle changes in the sunspot number - foF2 relationship. Over the interval studied here, R, RBB, and RC largely differ in their allowance for the “Waldmeier discontinuity” around 1945 (the correction factor for which for R, RBB and RC is, respectively, zero, effectively over 20 %, and explicitly 11.6 %). It is shown that for Solar Cycles 18-21, all three sunspot data sequences perform well, but that the fit residuals are lowest and most uniform for RBB. We here use foF2 for those UTs for which R, RBB, and RC all give correlations exceeding 0.99 for intervals both before and after the Waldmeier discontinuity. The error introduced by the Waldmeier discontinuity causes R to underestimate the fitted values based on the foF2 data for 1932-1945 but RBB overestimates them by almost the same factor, implying that the correction for the Waldmeier discontinuity inherent in RBB is too large by a factor of two. Fit residuals are smallest and most uniform for RC and the ionospheric data support the optimum discontinuity multiplicative correction factor derived from the independent Royal Greenwich Observatory (RGO) sunspot group data for the same interval.
Abstract:
Clusters of galaxies are the most impressive gravitationally-bound systems in the universe, and their abundance (the cluster mass function) is an important statistic to probe the matter density parameter (Omega(m)) and the amplitude of density fluctuations (sigma(8)). The cluster mass function is usually described in terms of the Press-Schechter (PS) formalism where the primordial density fluctuations are assumed to be a Gaussian random field. In previous works we have proposed a non-Gaussian analytical extension of the PS approach based on the q-power law distribution (PL) of the nonextensive kinetic theory. In this paper, by applying the PL distribution to fit the observational mass function data from the X-ray highest flux-limited sample (HIFLUGCS), we find a strong degeneracy among the cosmic parameters, sigma(8), Omega(m) and the q parameter from the PL distribution. A joint analysis involving recent observations from the baryon acoustic oscillation (BAO) peak and the Cosmic Microwave Background (CMB) shift parameter is carried out in order to break this degeneracy and better constrain the physically relevant parameters. The present results suggest that the next generation of cluster surveys will be able to probe the quantities of cosmological interest (sigma(8), Omega(m)) and the underlying cluster physics quantified by the q-parameter.
Abstract:
Electromagnetic induction (EMI) method results are shown for the vertical magnetic dipole (VMD) configuration using the EM38 equipment. Performance in the location of metallic pipes and electrical cables is compared as a function of instrumental drift correction by linear and quadratic adjustment under controlled conditions. Metallic pipes and electrical cables are buried at the IAG/USP shallow geophysical test site in São Paulo City, Brazil. Results show that apparent electrical conductivity and magnetic susceptibility data were affected by ambient temperature variation. In order to obtain better contrast between background and metallic targets it was necessary to correct the drift. This correction was accomplished by using linear and quadratic relations between conductivity/susceptibility and temperature, for comparative purposes. The correction of temperature drift by using a quadratic relation was effective: all metallic targets were located, and the detection of deeper targets was also improved. (C) 2010 Elsevier B.V. All rights reserved.
Abstract:
Upper-mantle seismic anisotropy has been extensively used to infer both present and past deformation processes at lithospheric and asthenospheric depths. Analysis of shear-wave splitting (mainly from core-refracted SKS phases) provides information regarding upper-mantle anisotropy. We present average measurements of fast-polarization directions at 21 new sites in poorly sampled regions of intra-plate South America, such as northern and northeastern Brazil. Despite sparse data coverage for the South American stable platform, consistent orientations are observed over hundreds of kilometers. Over most of the continent, the fast-polarization direction tends to be close to the absolute plate motion direction given by the hotspot reference model HS3-NUVEL-1A. A previous global comparison of the SKS fast-polarization directions with flow models of the upper mantle showed relatively poor correlation on the continents, which was interpreted as evidence for a large contribution of "frozen" anisotropy in the lithosphere. For the South American plate, our data indicate that one of the reasons for the poor correlation may have been the relatively coarse model of lithospheric thicknesses. We suggest that improved models of upper-mantle flow that are based on more detailed lithospheric thicknesses in South America may help to explain most of the observed anisotropy patterns.
Abstract:
The matrix-tolerance hypothesis suggests that the most abundant species in the inter-habitat matrix would be less vulnerable to habitat fragmentation. This model was tested with leaf-litter frogs in the Atlantic Forest, where the fragmentation process is older and more severe than in the Amazon, where the model was first developed. Frog abundance data from the agricultural matrix, forest fragments and continuous forest localities were used. We found the expected negative correlation between the abundance of frogs in the matrix and their vulnerability to fragmentation; however, results varied with fragment size and species traits. Smaller fragments exhibited a stronger matrix-vulnerability correlation than intermediate fragments, while no significant relation was observed for large fragments. Moreover, some species that avoid the matrix were not sensitive to a decrease in patch size, and the opposite was also true, indicating significant differences from what the model predicts. Most of the species that use the matrix were forest species with aquatic larval development, but those species do not necessarily respond to fragmentation or fragment size, and thus more strongly affect the strength of the expected relationship. Therefore, the main relationship expected by the matrix-tolerance hypothesis was observed in the Atlantic Forest; however, we noted that the prediction of this hypothesis can be substantially affected by the size of the fragments and by species traits. We propose that the matrix-tolerance model should be broadened to become a more effective model, including other patch characteristics, particularly fragment size, and individual species traits (e.g., reproductive mode and habitat preference).
Abstract:
Phylogenetic analyses of chloroplast DNA sequences, morphology, and combined data have provided consistent support for many of the major branches within the angiosperm clade Dipsacales. Here we use sequences from three mitochondrial loci to test the existing broad-scale phylogeny and to attempt to resolve several relationships that have remained uncertain. Parsimony, maximum likelihood, and Bayesian analyses of a combined mitochondrial data set recover trees broadly consistent with previous studies, although resolution and support are lower than in the largest chloroplast analyses. Combining chloroplast and mitochondrial data results in a generally well-resolved and very strongly supported topology, but the previously recognized problem areas remain. To investigate why these relationships have been difficult to resolve, we conducted a series of experiments using different data partitions and heterogeneous substitution models. Usually more complex modeling schemes are favored regardless of the partitions recognized, but model choice had little effect on topology or support values. In contrast, there are consistent but weakly supported differences in the topologies recovered from coding and non-coding matrices. These conflicts directly correspond to relationships that were poorly resolved in analyses of the full combined chloroplast-mitochondrial data set. We suggest incongruent signal has contributed to our inability to confidently resolve these problem areas. (c) 2007 Elsevier Inc. All rights reserved.