Biblioteca Digital

952 resultados para Data sets storage

CAMCR: Computer-Assisted Mixture model analysis for Capture–Recapture count data

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Population size estimation with discrete or nonparametric mixture models is considered, and reliable ways of construction of the nonparametric mixture model estimator are reviewed and set into perspective. Construction of the maximum likelihood estimator of the mixing distribution is done for any number of components up to the global nonparametric maximum likelihood bound using the EM algorithm. In addition, the estimators of Chao and Zelterman are considered with some generalisations of Zelterman’s estimator. All computations are done with CAMCR, a special software developed for population size estimation with mixture models. Several examples and data sets are discussed and the estimators illustrated. Problems using the mixture model-based estimators are highlighted.

Jpowder: a Java-based program for the display and examination of powder diffraction data

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The ability to display and inspect powder diffraction data quickly and efficiently is a central part of the data analysis process. Whilst many computer programs are capable of displaying powder data, their focus is typically on advanced operations such as structure solution or Rietveld refinement. This article describes a lightweight software package, Jpowder, whose focus is fast and convenient visualization and comparison of powder data sets in a variety of formats from computers with network access. Jpowder is written in Java and uses its associated Web Start technology to allow ‘single-click deployment’ from a web page, http://www.jpowder.org. Jpowder is open source, free and available for use by anyone.

Solving surface structures from normal incidence X-ray standing wave data

Relevância:

90.00% 90.00%

Publicador:

Resumo:

A program is provided to determine structural parameters of atoms in or adsorbed on surfaces by refinement of atomistic models towards experimentally determined data generated by the normal incidence X-ray standing wave (NIXSW) technique. The method employs a combination of Differential Evolution Genetic Algorithms and Steepest Descent Line Minimisations to provide a fast, reliable and user friendly tool for experimentalists to interpret complex multidimensional NIXSW data sets.

Design considerations in the sequential analysis of matched case–control data

Relevância:

90.00% 90.00%

Publicador:

Resumo:

A role for sequential test procedures is emerging in genetic and epidemiological studies using banked biological resources. This stems from the methodology's potential for improved use of information relative to comparable fixed sample designs. Studies in which cost, time and ethics feature prominently are particularly suited to a sequential approach. In this paper sequential procedures for matched case–control studies with binary data will be investigated and assessed. Design issues such as sample size evaluation and error rates are identified and addressed. The methodology is illustrated and evaluated using both real and simulated data sets.

Centennial changes in the heliospheric magnetic field and open solar flux: the consensus view from geomagnetic data and cosmogenic isotopes and its implications

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Svalgaard and Cliver (2010) recently reported a consensus between the various reconstructions of the heliospheric field over recent centuries. This is a significant development because, individually, each has uncertainties introduced by instrument calibration drifts, limited numbers of observatories, and the strength of the correlations employed. However, taken collectively, a consistent picture is emerging. We here show that this consensus extends to more data sets and methods than reported by Svalgaard and Cliver, including that used by Lockwood et al. (1999), when their algorithm is used to predict the heliospheric field rather than the open solar flux. One area where there is still some debate relates to the existence and meaning of a floor value to the heliospheric field. From cosmogenic isotope abundances, Steinhilber et al. (2010) have recently deduced that the near-Earth IMF at the end of the Maunder minimum was 1.80 ± 0.59 nT which is considerably lower than the revised floor of 4nT proposed by Svalgaard and Cliver. We here combine cosmogenic and geomagnetic reconstructions and modern observations (with allowance for the effect of solar wind speed and structure on the near-Earth data) to derive an estimate for the open solar flux of (0.48 ± 0.29) × 1014 Wb at the end of the Maunder minimum. By way of comparison, the largest and smallest annual means recorded by instruments in space between 1965 and 2010 are 5.75 × 1014 Wb and 1.37 × 1014 Wb, respectively, set in 1982 and 2009, and the maximum of the 11 year running means was 4.38 × 1014 Wb in 1986. Hence the average open solar flux during the Maunder minimum is found to have been 11% of its peak value during the recent grand solar maximum.

Data dilemmas in forecasting European office market rents

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This paper uses data provided by three major real estate advisory firms to investigate the level and pattern of variation in the measurement of historic real estate rental values for the main European office centres. The paper assesses the extent to which the data providing organizations agree on historic market performance in terms of returns, risk and timing and examines the relationship between market maturity and agreement. The analysis suggests that at the aggregate level and for many markets, there is substantial agreement on direction, quantity and timing of market change. However, there is substantial variability in the level of agreement among cities. The paper also assesses whether the different data sets produce different explanatory models and market forecast. It is concluded that, although disagreement on the direction of market change is high for many market, the different data sets often produce similar explanatory models and predict similar relative performance.

Modeling data with structural and temporal correlation using lower level and higher level multilevel models

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Novel imaging techniques are playing an increasingly important role in drug development, providing insight into the mechanism of action of new chemical entities. The data sets obtained by these methods can be large with complex inter-relationships, but the most appropriate statistical analysis for handling this data is often uncertain - precisely because of the exploratory nature of the way the data are collected. We present an example from a clinical trial using magnetic resonance imaging to assess changes in atherosclerotic plaques following treatment with a tool compound with established clinical benefit. We compared two specific approaches to handle the correlations due to physical location and repeated measurements: two-level and four-level multilevel models. The two methods identified similar structural variables, but higher level multilevel models had the advantage of explaining a greater proportion of variation, and the modeling assumptions appeared to be better satisfied.

Validation of ACE-FTS satellite data in the upper troposphere/lower stratosphere (UTLS) using non-coincident measurements

Relevância:

90.00% 90.00%

Publicador:

Resumo:

CO, O3, and H2O data in the upper troposphere/lower stratosphere (UTLS) measured by the Atmospheric Chemistry Experiment Fourier Transform Spectrometer(ACE-FTS) on Canada’s SCISAT-1 satellite are validated using aircraft and ozonesonde measurements. In the UTLS, validation of chemical trace gas measurements is a challenging task due to small-scale variability in the tracer fields, strong gradients of the tracers across the tropopause, and scarcity of measurements suitable for validation purposes. Validation based on coincidences therefore suffers from geophysical noise. Two alternative methods for the validation of satellite data are introduced, which avoid the usual need for coincident measurements: tracer-tracer correlations, and vertical tracer profiles relative to tropopause height. Both are increasingly being used for model validation as they strongly suppress geophysical variability and thereby provide an “instantaneous climatology”. This allows comparison of measurements between non-coincident data sets which yields information about the precision and a statistically meaningful error-assessment of the ACE-FTS satellite data in the UTLS. By defining a trade-off factor, we show that the measurement errors can be reduced by including more measurements obtained over a wider longitude range into the comparison, despite the increased geophysical variability. Applying the methods then yields the following upper bounds to the relative differences in the mean found between the ACE-FTS and SPURT aircraft measurements in the upper troposphere (UT) and lower stratosphere (LS), respectively: for CO ±9% and ±12%, for H2O ±30% and ±18%, and for O3 ±25% and ±19%. The relative differences for O3 can be narrowed down by using a larger dataset obtained from ozonesondes, yielding a high bias in the ACEFTS measurements of 18% in the UT and relative differences of ±8% for measurements in the LS. When taking into account the smearing effect of the vertically limited spacing between measurements of the ACE-FTS instrument, the relative differences decrease by 5–15% around the tropopause, suggesting a vertical resolution of the ACE-FTS in the UTLS of around 1 km. The ACE-FTS hence offers unprecedented precision and vertical resolution for a satellite instrument, which will allow a new global perspective on UTLS tracer distributions.

Comparison of different boundary layer surface schemes using single point micrometeorological field data

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In the last decade, a vast number of land surface schemes has been designed for use in global climate models, atmospheric weather prediction, mesoscale numerical models, ecological models, and models of global changes. Since land surface schemes are designed for different purposes they have various levels of complexity in the treatment of bare soil processes, vegetation, and soil water movement. This paper is a contribution to a little group of papers dealing with intercomparison of differently designed and oriented land surface schemes. For that purpose we have chosen three schemes for classification: i) global climate models, BATS (Dickinson et al., 1986; Dickinson et al., 1992); ii) mesoscale and ecological models, LEAF (Lee, 1992) and iii) mesoscale models, LAPS (Mihailović, 1996; Mihailović and Kallos, 1997; Mihailović et al., 1999) according to the Shao et al. (1995) classification. These schemes were compared using surface fluxes and leaf temperature outputs obtained by time integrations of data sets derived from the micrometeorological measurements above a maize field at an experimental site in De Sinderhoeve (The Netherlands) for 18 August, 8 September, and 4 October 1988. Finally, comparison of the schemes was supported applying a simple statistical analysis on the surface flux outputs.

Grid computing solutions for distributed repositories of protein folding and unfolding simulations

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The P-found protein folding and unfolding simulation repository is designed to allow scientists to perform analyses across large, distributed simulation data sets. There are two storage components in P-found: a primary repository of simulation data and a data warehouse. Here we demonstrate how grid technologies can support multiple, distributed P-found installations. In particular we look at two aspects, first how grid data management technologies can be used to access the distributed data warehouses; and secondly, how the grid can be used to transfer analysis programs to the primary repositories --- this is an important and challenging aspect of P-found because the data volumes involved are too large to be centralised. The grid technologies we are developing with the P-found system will allow new large data sets of protein folding simulations to be accessed and analysed in novel ways, with significant potential for enabling new scientific discoveries.

Comparison of different boundary layer surface schemes using single point micrometeorological field data

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In the last decade, a vast number of land surface schemes has been designed for use in global climate models, atmospheric weather prediction, mesoscale numerical models, ecological models, and models of global changes. Since land surface schemes are designed for different purposes they have various levels of complexity in the treatment of bare soil processes, vegetation, and soil water movement. This paper is a contribution to a little group of papers dealing with intercomparison of differently designed and oriented land surface schemes. For that purpose we have chosen three schemes for classification: i) global climate models, BATS (Dickinson et al., 1986; Dickinson et al., 1992); ii) mesoscale and ecological models, LEAF (Lee, 1992) and iii) mesoscale models, LAPS (Mihailović, 1996; Mihailović and Kallos, 1997; Mihailović et al., 1999) according to the Shao et al. (1995) classification. These schemes were compared using surface fluxes and leaf temperature outputs obtained by time integrations of data sets derived from the micrometeorological measurements above a maize field at an experimental site in De Sinderhoeve (The Netherlands) for 18 August, 8 September, and 4 October 1988. Finally, comparison of the schemes was supported applying a simple statistical analysis on the surface flux outputs.

A method for under-sampled ecological network data analysis: plant-pollination as case study

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In this paper, we develop a method, termed the Interaction Distribution (ID) method, for analysis of quantitative ecological network data. In many cases, quantitative network data sets are under-sampled, i.e. many interactions are poorly sampled or remain unobserved. Hence, the output of statistical analyses may fail to differentiate between patterns that are statistical artefacts and those which are real characteristics of ecological networks. The ID method can support assessment and inference of under-sampled ecological network data. In the current paper, we illustrate and discuss the ID method based on the properties of plant-animal pollination data sets of flower visitation frequencies. However, the ID method may be applied to other types of ecological networks. The method can supplement existing network analyses based on two definitions of the underlying probabilities for each combination of pollinator and plant species: (1), pi,j: the probability for a visit made by the i’th pollinator species to take place on the j’th plant species; (2), qi,j: the probability for a visit received by the j’th plant species to be made by the i’th pollinator. The method applies the Dirichlet distribution to estimate these two probabilities, based on a given empirical data set. The estimated mean values for pi,j and qi,j reflect the relative differences between recorded numbers of visits for different pollinator and plant species, and the estimated uncertainty of pi,j and qi,j decreases with higher numbers of recorded visits.

Construction of neurofuzzy models for imbalanced data classification

Relevância:

90.00% 90.00%

Publicador:

Resumo:

We propose a new class of neurofuzzy construction algorithms with the aim of maximizing generalization capability specifically for imbalanced data classification problems based on leave-one-out (LOO) cross validation. The algorithms are in two stages, first an initial rule base is constructed based on estimating the Gaussian mixture model with analysis of variance decomposition from input data; the second stage carries out the joint weighted least squares parameter estimation and rule selection using orthogonal forward subspace selection (OFSS)procedure. We show how different LOO based rule selection criteria can be incorporated with OFSS, and advocate either maximizing the leave-one-out area under curve of the receiver operating characteristics, or maximizing the leave-one-out Fmeasure if the data sets exhibit imbalanced class distribution. Extensive comparative simulations illustrate the effectiveness of the proposed algorithms.

SPARC Data Initiative: Comparison of water vapor climatologies from international satellite limb sounders

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Within the SPARC Data Initiative, the first comprehensive assessment of the quality of 13 water vapor products from 11 limb-viewing satellite instruments (LIMS, SAGE II, UARS-MLS, HALOE, POAM III, SMR, SAGE III, MIPAS, SCIAMACHY, ACE-FTS, and Aura-MLS) obtained within the time period 1978-2010 has been performed. Each instrument's water vapor profile measurements were compiled into monthly zonal mean time series on a common latitude-pressure grid. These time series serve as basis for the "climatological" validation approach used within the project. The evaluations include comparisons of monthly or annual zonal mean cross sections and seasonal cycles in the tropical and extratropical upper troposphere and lower stratosphere averaged over one or more years, comparisons of interannual variability, and a study of the time evolution of physical features in water vapor such as the tropical tape recorder and polar vortex dehydration. Our knowledge of the atmospheric mean state in water vapor is best in the lower and middle stratosphere of the tropics and midlatitudes, with a relative uncertainty of. 2-6% (as quantified by the standard deviation of the instruments' multiannual means). The uncertainty increases toward the polar regions (+/- 10-15%), the mesosphere (+/- 15%), and the upper troposphere/lower stratosphere below 100 hPa (+/- 30-50%), where sampling issues add uncertainty due to large gradients and high natural variability in water vapor. The minimum found in multiannual (1998-2008) mean water vapor in the tropical lower stratosphere is 3.5 ppmv (+/- 14%), with slightly larger uncertainties for monthly mean values. The frequently used HALOE water vapor data set shows consistently lower values than most other data sets throughout the atmosphere, with increasing deviations from the multi-instrument mean below 100 hPa in both the tropics and extratropics. The knowledge gained from these comparisons and regarding the quality of the individual data sets in different regions of the atmosphere will help to improve model-measurement comparisons (e.g., for diagnostics such as the tropical tape recorder or seasonal cycles), data merging activities, and studies of climate variability.

SPARC Data Initiative: A comparison of ozone climatologies from international satellite limb sounders

Relevância:

90.00% 90.00%

Publicador:

Resumo:

A comprehensive quality assessment of the ozone products from 18 limb-viewing satellite instruments is provided by means of a detailed intercomparison. The ozone climatologies in form of monthly zonal mean time series covering the upper troposphere to lower mesosphere are obtained from LIMS, SAGE I/II/III, UARS-MLS, HALOE, POAM II/III, SMR, OSIRIS, MIPAS, GOMOS, SCIAMACHY, ACE-FTS, ACE-MAESTRO, Aura-MLS, HIRDLS, and SMILES within 1978–2010. The intercomparisons focus on mean biases of annual zonal mean fields, interannual variability, and seasonal cycles. Additionally, the physical consistency of the data is tested through diagnostics of the quasi-biennial oscillation and Antarctic ozone hole. The comprehensive evaluations reveal that the uncertainty in our knowledge of the atmospheric ozone mean state is smallest in the tropical and midlatitude middle stratosphere with a 1σ multi-instrument spread of less than ±5%. While the overall agreement among the climatological data sets is very good for large parts of the stratosphere, individual discrepancies have been identified, including unrealistic month-to-month fluctuations, large biases in particular atmospheric regions, or inconsistencies in the seasonal cycle. Notable differences between the data sets exist in the tropical lower stratosphere (with a spread of ±30%) and at high latitudes (±15%). In particular, large relative differences are identified in the Antarctic during the time of the ozone hole, with a spread between the monthly zonal mean fields of ±50%. The evaluations provide guidance on what data sets are the most reliable for applications such as studies of ozone variability, model-measurement comparisons, detection of long-term trends, and data-merging activities.

«
1
2
...
9
10
11
12
13
14
15
...
63
64
»