939 results for STATISTICAL DATA


Relevance: 30.00%

Abstract:

This paper presents an approximate closed-form sample size formula for determining non-inferiority in active-control trials with binary data. We use the odds ratio as the measure of the relative treatment effect, derive the sample size formula based on the score test and compare it with a second, well-known formula based on the Wald test. Both closed-form formulae are compared with simulations based on the likelihood ratio test. Within the range of parameter values investigated, the score test closed-form formula is reasonably accurate when non-inferiority margins are based on odds ratios of about 0.5 or above and when the magnitude of the odds ratio under the alternative hypothesis lies between about 1 and 2.5. The accuracy generally decreases as the odds ratio under the alternative hypothesis moves upwards from 1. As the non-inferiority margin odds ratio decreases from 0.5, the score test closed-form formula increasingly overestimates the sample size, irrespective of the magnitude of the odds ratio under the alternative hypothesis. The Wald test closed-form formula is also reasonably accurate in the cases where the score test closed-form formula works well. Outside these scenarios, the Wald test closed-form formula can either underestimate or overestimate the sample size, depending on the magnitudes of the non-inferiority margin odds ratio and the odds ratio under the alternative hypothesis. Although neither approximation is accurate in all cases, both approaches lead to satisfactory sample size calculations for non-inferiority trials with binary data where the odds ratio is the parameter of interest.
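
As a rough illustration of how such a closed-form calculation proceeds, the sketch below implements the textbook Wald-type approximation on the log odds-ratio scale; it is not the paper's score-test formula, whose exact variance term is not reproduced here, and the control response rate, margin and alternative odds ratio are made-up inputs.

```python
from math import log
from scipy.stats import norm

def wald_ss_noninferiority_or(p_control, or_margin, or_alt,
                              alpha=0.025, power=0.9):
    """Per-group sample size for a non-inferiority test on the
    odds-ratio scale, using the Wald statistic for the log odds
    ratio (a textbook approximation, not the paper's score-test
    formula)."""
    # Response rate in the treated group implied by the alternative OR.
    odds_treat = or_alt * p_control / (1.0 - p_control)
    p_treat = odds_treat / (1.0 + odds_treat)
    # Per-subject variance factor of the estimated log odds ratio.
    v = 1.0 / (p_control * (1.0 - p_control)) \
        + 1.0 / (p_treat * (1.0 - p_treat))
    z = norm.ppf(1.0 - alpha) + norm.ppf(power)
    return (z / (log(or_alt) - log(or_margin))) ** 2 * v

# Illustrative values: 60% control response, margin OR = 0.5,
# true OR = 1 under the alternative.
print(wald_ss_noninferiority_or(0.6, 0.5, 1.0))
```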

Relevance: 30.00%

Abstract:

Social networks have gained remarkable attention in the last decade. Accessing social network sites such as Twitter, Facebook, LinkedIn and Google+ through the internet and Web 2.0 technologies has become more affordable. People are increasingly interested in, and reliant on, social networks for information, news and the opinions of other users on diverse subject matters. This heavy reliance on social network sites causes them to generate massive data characterised by three computational issues, namely size, noise and dynamism. These issues often make social network data too complex to analyse manually, calling for computational means of analysis. Data mining provides a wide range of techniques for detecting useful knowledge, such as trends, patterns and rules, from massive datasets [44]. Data mining techniques are used for information retrieval, statistical modelling and machine learning. These techniques employ data pre-processing, data analysis and data interpretation processes in the course of data analysis. This survey discusses the data mining techniques used to mine diverse aspects of social networks over the decades, from historical techniques to up-to-date models, including our novel technique named TRCM. All the techniques covered in this survey are listed in Table 1, together with the tools employed and the names of their authors.

Relevance: 30.00%

Abstract:

A method is proposed for merging different nadir-sounding climate data records using measurements from high-resolution limb sounders to provide a transfer function between the different nadir measurements. The two nadir-sounding records need not be overlapping so long as the limb-sounding record bridges between them. The method is applied to global-mean stratospheric temperatures from the NOAA Climate Data Records based on the Stratospheric Sounding Unit (SSU) and the Advanced Microwave Sounding Unit-A (AMSU), extending the SSU record forward in time to yield a continuous data set from 1979 to the present, and providing a simple framework for extending the SSU record into the future using AMSU. SSU and AMSU are bridged using temperature measurements from the Michelson Interferometer for Passive Atmospheric Sounding (MIPAS), which is of high enough vertical resolution to accurately represent the weighting functions of both SSU and AMSU. For this application, a purely statistical approach is not viable since the different nadir channels are not sufficiently linearly independent, statistically speaking. The near-global-mean linear temperature trends for extended SSU for 1980–2012 are −0.63 ± 0.13, −0.71 ± 0.15 and −0.80 ± 0.17 K decade⁻¹ (95 % confidence) for channels 1, 2 and 3, respectively. The extended SSU temperature changes are in good agreement with those from the Microwave Limb Sounder (MLS) on the Aura satellite, with both exhibiting a cooling trend of ~0.6 ± 0.3 K decade⁻¹ in the upper stratosphere from 2004 to 2012. The extended SSU record is found to be in agreement with high-top coupled atmosphere–ocean models over the 1980–2012 period, including the continued cooling over the first decade of the 21st century.
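
The bridging idea can be sketched as follows, under heavy assumptions: the weighting functions below are toy Gaussians standing in for the real SSU and AMSU kernels, and the "limb" profiles are synthetic rather than MIPAS retrievals. The point is only to show how high-resolution profiles projected onto two broad nadir kernels yield a regression-based transfer function without any direct overlap of the two nadir records.

```python
import numpy as np

rng = np.random.default_rng(0)
z = np.linspace(20, 60, 200)                      # altitude grid, km

def weighting_fn(peak_km, width_km):
    w = np.exp(-0.5 * ((z - peak_km) / width_km) ** 2)
    return w / w.sum()                            # normalise to unit weight

w_ssu = weighting_fn(peak_km=45, width_km=8)      # stand-in for an SSU channel
w_amsu = weighting_fn(peak_km=40, width_km=6)     # stand-in for an AMSU channel

# Fake monthly global-mean temperature profiles with a cooling trend.
months = np.arange(240)
profiles = (240 - 0.3 * (z[None, :] - 20)
            - 0.005 * months[:, None]
            + rng.normal(0, 0.3, (240, z.size)))

t_ssu = profiles @ w_ssu                          # synthetic SSU-equivalent record
t_amsu = profiles @ w_amsu                        # synthetic AMSU-equivalent record

# Linear transfer function AMSU -> SSU-equivalent, fitted over the
# limb-sounder period.
slope, offset = np.polyfit(t_amsu, t_ssu, 1)
print(f"transfer: SSU_eq = {slope:.3f} * AMSU + {offset:.1f}")
```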

Relevance: 30.00%

Abstract:

We present the first multi-event study of the spatial and temporal structuring of the aurora to provide statistical evidence of the near-Earth plasma instability which causes the substorm onset arc. Using data from ground-based auroral imagers, we study repeatable signatures of along-arc auroral beads, which are thought to represent the ionospheric projection of magnetospheric instability in the near-Earth plasma sheet. We show that the growth and spatial scales of these wave-like fluctuations are similar across multiple events, indicating that each sudden auroral brightening has a common explanation. We find statistically that growth rates for auroral beads peak at low wavenumber, with the most unstable spatial scales mapping to an azimuthal wavelength λ ≈ 1700–2500 km in the equatorial magnetosphere at around 9–12 R_E. We compare growth rates and spatial scales with a range of theoretical predictions of magnetotail instabilities, including the cross-field current instability and the shear-flow ballooning instability. We conclude that, although the cross-field current instability can generate growth rates of similar magnitude, the range of unstable wavenumbers indicates that the shear-flow ballooning instability is the most likely explanation for our observations.
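
A minimal sketch of this kind of bead analysis, assuming synthetic imager data in place of real keograms: Fourier transform the along-arc intensity at each time step, then fit an exponential to each mode's amplitude to estimate the growth rate as a function of wavenumber. The seeded mode, cadence and pixel scale are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
n_t, n_x = 120, 256
dx_km, dt_s = 10.0, 3.0                  # along-arc pixel size, frame cadence

x = np.arange(n_x) * dx_km
t = np.arange(n_t) * dt_s
k0 = 2 * np.pi / 320.0                   # seed a growing 320 km bead mode
intensity = (1.0
             + 0.01 * np.exp(0.02 * t)[:, None] * np.cos(k0 * x)[None, :]
             + 0.005 * rng.normal(size=(n_t, n_x)))

amp = np.abs(np.fft.rfft(intensity, axis=1))      # mode amplitudes vs time
k = 2 * np.pi * np.fft.rfftfreq(n_x, d=dx_km)     # wavenumber, rad/km

# Growth rate of each mode from a log-linear fit of amplitude vs time.
growth = np.array([np.polyfit(t, np.log(amp[:, m] + 1e-12), 1)[0]
                   for m in range(1, amp.shape[1])])
m_best = 1 + np.argmax(growth)
print(f"most unstable wavelength ~ {2 * np.pi / k[m_best]:.0f} km, "
      f"growth rate ~ {growth[m_best - 1]:.3f} 1/s")
```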

Relevance: 30.00%

Abstract:

More than 70 years ago it was recognised that ionospheric F2-layer critical frequencies [foF2] had a strong relationship to sunspot number. Using historic datasets from the Slough and Washington ionosondes, we evaluate the best statistical fits of foF2 to sunspot numbers (at each Universal Time [UT] separately) in order to search for drifts and abrupt changes in the fit residuals over Solar Cycles 17–21. This test is carried out for the original composite of the Wolf/Zürich/International sunspot number [R], the new "backbone" group sunspot number [RBB] and the proposed "corrected sunspot number" [RC]. Polynomial fits are made both with and without allowance for the white-light facular area, which has been reported as being associated with cycle-to-cycle changes in the sunspot number–foF2 relationship. Over the interval studied here, R, RBB and RC largely differ in their allowance for the "Waldmeier discontinuity" around 1945 (for which the correction factor in R, RBB and RC is, respectively, zero, effectively over 20 %, and explicitly 11.6 %). It is shown that for Solar Cycles 18–21, all three sunspot data sequences perform well, but that the fit residuals are lowest and most uniform for RBB. We here use foF2 for those UTs for which R, RBB and RC all give correlations exceeding 0.99 for intervals both before and after the Waldmeier discontinuity. The error introduced by the Waldmeier discontinuity causes R to underestimate the fitted values based on the foF2 data for 1932–1945, but RBB overestimates them by almost the same factor, implying that the correction for the Waldmeier discontinuity inherent in RBB is too large by a factor of two. Across the full interval spanning the discontinuity, fit residuals are smallest and most uniform for RC, and the ionospheric data support the optimum discontinuity multiplicative correction factor derived from the independent Royal Greenwich Observatory (RGO) sunspot group data for the same interval.
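
A minimal sketch of the per-UT fitting procedure, with synthetic series standing in for the Slough foF2 data and a sunspot number composite: fit a low-order polynomial and compare the mean residuals on either side of a candidate discontinuity year. A step between the two means points to an inhomogeneity in the sunspot series rather than in the ionospheric data.

```python
import numpy as np

rng = np.random.default_rng(2)
years = np.arange(1932, 1963)
ssn = 80 + 70 * np.sin(2 * np.pi * (years - 1933) / 11.0)   # fake solar cycle
fof2 = 4.0 + 0.05 * ssn + 1e-4 * ssn**2 + rng.normal(0, 0.1, years.size)

coeffs = np.polyfit(ssn, fof2, deg=2)        # polynomial fit for this UT
residuals = fof2 - np.polyval(coeffs, ssn)

# Compare mean residuals before and after 1945 (the Waldmeier
# discontinuity in the real application).
before, after = residuals[years < 1946], residuals[years >= 1946]
print(f"mean residual before 1946: {before.mean():+.3f} MHz, "
      f"after: {after.mean():+.3f} MHz")
```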

Relevance: 30.00%

Abstract:

A dynamical wind-wave climate simulation covering the North Atlantic Ocean and spanning the whole 21st century under the A1B scenario has been compared with a set of statistical projections using atmospheric variables or large-scale climate indices as predictors. As a first step, the performance of all statistical models has been evaluated for the present-day climate; namely, they have been compared with a dynamical wind-wave hindcast in terms of winter Significant Wave Height (SWH) trends and variance, as well as with altimetry data. For the projections, it has been found that statistical models that use wind speed as predictor are able to capture a larger fraction of the winter SWH inter-annual variability (68% on average) and of the long-term changes projected by the dynamical simulation. Conversely, regression models using climate indices, sea level pressure and/or pressure gradient as predictors account for a smaller fraction of the SWH variance (from 2.8% to 33%) and do not reproduce the dynamically projected long-term trends over the North Atlantic. Investigating the wind-sea and swell components separately, we have found that the combination of two regression models, one for wind-sea waves and another for the swell component, can significantly improve the wave-field projections obtained from single regression models over the North Atlantic.
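
The two-component combination can be sketched as follows, with synthetic series standing in for the wind, climate-index and wave data: fit one regression for the wind-sea component and one for swell, then recombine the predicted heights in quadrature (a standard way to merge independent wave components; the study's actual predictors and regression forms may differ).

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
wind = 8 + 3 * rng.normal(size=n)            # surface wind speed, m/s
nao = rng.normal(size=n)                     # NAO-like large-scale index
hs_windsea = 0.025 * wind**2 + 0.2 * rng.normal(size=n)
hs_swell = 1.5 + 0.4 * nao + 0.2 * rng.normal(size=n)

# Fit each component with ordinary least squares.
A_ws = np.column_stack([wind**2, np.ones(n)])
A_sw = np.column_stack([nao, np.ones(n)])
c_ws, *_ = np.linalg.lstsq(A_ws, hs_windsea, rcond=None)
c_sw, *_ = np.linalg.lstsq(A_sw, hs_swell, rcond=None)

# Total SWH from the two fitted components, combined in quadrature.
hs_total = np.sqrt((A_ws @ c_ws) ** 2 + (A_sw @ c_sw) ** 2)
obs_total = np.sqrt(hs_windsea**2 + hs_swell**2)
print(f"explained variance: {np.corrcoef(hs_total, obs_total)[0, 1]**2:.2f}")
```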

Relevance: 30.00%

Abstract:

Clusters of galaxies are the most impressive gravitationally bound systems in the universe, and their abundance (the cluster mass function) is an important statistic for probing the matter density parameter (Ω_m) and the amplitude of density fluctuations (σ_8). The cluster mass function is usually described in terms of the Press-Schechter (PS) formalism, where the primordial density fluctuations are assumed to be a Gaussian random field. In previous works we have proposed a non-Gaussian analytical extension of the PS approach based on the q-power-law distribution (PL) of the nonextensive kinetic theory. In this paper, by applying the PL distribution to fit the observational mass function data from the X-ray highest flux-limited sample (HIFLUGCS), we find a strong degeneracy among the cosmic parameters σ_8, Ω_m and the q parameter of the PL distribution. A joint analysis involving recent observations of the baryon acoustic oscillation (BAO) peak and the Cosmic Microwave Background (CMB) shift parameter is carried out in order to break this degeneracy and better constrain the physically relevant parameters. The present results suggest that the next generation of cluster surveys will be able to probe the quantities of cosmological interest (σ_8, Ω_m) and the underlying cluster physics quantified by the q parameter.
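
The kernel swap behind such a non-Gaussian extension can be illustrated as below. This is not the paper's exact parameterisation, only the replacement of the Gaussian factor in the Press-Schechter multiplicity function with a Tsallis q-exponential, e_q(x) = [1 + (1 − q)x]^(1/(1−q)), which recovers the Gaussian case as q → 1; normalisations are omitted.

```python
import numpy as np

delta_c = 1.686                       # critical collapse overdensity

def q_exp(x, q):
    """Tsallis q-exponential; reduces to exp(x) as q -> 1."""
    if abs(q - 1.0) < 1e-8:
        return np.exp(x)
    base = np.maximum(1.0 + (1.0 - q) * x, 0.0)
    return base ** (1.0 / (1.0 - q))

def multiplicity(sigma, q=1.0):
    """Press-Schechter-style multiplicity with a swappable kernel."""
    nu = delta_c / sigma
    return np.sqrt(2.0 / np.pi) * nu * q_exp(-0.5 * nu**2, q)

sigma = np.linspace(0.5, 3.0, 6)
print("Gaussian PS :", np.round(multiplicity(sigma, q=1.0), 4))
print("q = 1.15    :", np.round(multiplicity(sigma, q=1.15), 4))
```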

Relevance: 30.00%

Abstract:

Astronomy has evolved almost exclusively through the use of spectroscopic and imaging techniques, operated separately. With the development of modern technologies, it is possible to obtain data cubes in which both techniques are combined simultaneously, producing images with spectral resolution. Extracting information from them can be quite complex, and hence the development of new methods of data analysis is desirable. We present a method of analysing data cubes (data from single-field observations, containing two spatial and one spectral dimension) that uses Principal Component Analysis (PCA) to express the data in a form of reduced dimensionality, facilitating efficient information extraction from very large data sets. PCA transforms the system of correlated coordinates into a system of uncorrelated coordinates ordered by principal components of decreasing variance. The new coordinates are referred to as eigenvectors, and the projections of the data onto these coordinates produce images we will call tomograms. The association of the tomograms (images) with the eigenvectors (spectra) is important for the interpretation of both. The eigenvectors are mutually orthogonal, and this property is fundamental for their handling and interpretation. When the data cube shows objects that present uncorrelated physical phenomena, the eigenvectors' orthogonality may be instrumental in separating and identifying them. By handling eigenvectors and tomograms, one can enhance features, extract noise, compress data, extract spectra, etc. We applied the method, for illustration purposes only, to the central region of the low-ionization nuclear emission region (LINER) galaxy NGC 4736, and demonstrate that it has a type 1 active nucleus, not known before. Furthermore, we show that it is displaced from the centre of its stellar bulge.
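
A minimal sketch of the PCA step on a toy cube: the cube is unfolded into a (spaxel × wavelength) matrix, the principal components of the spectral covariance give the eigenvectors (eigenspectra), and projecting each spaxel onto them gives the tomograms. The cube here is random noise, standing in for real integral-field data.

```python
import numpy as np

rng = np.random.default_rng(4)
ny, nx, nl = 20, 20, 300
cube = rng.normal(size=(ny, nx, nl))       # (y, x, wavelength)

X = cube.reshape(ny * nx, nl)
X = X - X.mean(axis=0)                     # centre each wavelength channel

# SVD of the centred matrix: rows of Vt are the eigenspectra, ordered
# by decreasing variance; U * s gives the projections (tomograms).
U, s, Vt = np.linalg.svd(X, full_matrices=False)
eigenspectra = Vt                          # (component, wavelength)
tomograms = (U * s).reshape(ny, nx, -1)    # (y, x, component)

variance_fraction = s**2 / np.sum(s**2)
print("variance in first 3 components:", np.round(variance_fraction[:3], 3))
```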

Relevance: 30.00%

Abstract:

The use of bivariate distributions plays a fundamental role in survival and reliability studies. In this paper, we consider a location-scale model for bivariate survival times based on a proposed copula to model the dependence of bivariate survival data. For the proposed model, we consider inferential procedures based on maximum likelihood. Gains in efficiency from bivariate models are also examined in the censored-data setting. For different parameter settings, sample sizes and censoring percentages, various simulation studies are performed, and the performance of the bivariate regression model for matched paired survival data is compared. Sensitivity analysis methods, such as local and total influence, are presented and derived under three perturbation schemes. The marginal martingale and marginal deviance residual measures are used to check the adequacy of the model. Furthermore, we propose a new measure, which we call the modified deviance component residual. The methodology in the paper is illustrated on a lifetime data set for kidney patients.
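
A sketch of copula-based maximum likelihood for bivariate survival times, under assumptions the paper does not specify here: a Clayton copula on the marginal survival functions, exponential margins, and no censoring (the location-scale structure and censored-data likelihood are omitted). The data are simulated with the same copula, so the fit should recover the generating parameters.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)
n, theta_true = 400, 1.5

# Simulate Clayton-dependent uniforms via the conditional method, then
# invert the exponential survival functions to get the two lifetimes.
u = rng.uniform(size=n)
w = rng.uniform(size=n)
v = ((w ** (-theta_true / (1 + theta_true)) - 1) * u ** (-theta_true) + 1) \
    ** (-1 / theta_true)
t1, t2 = -np.log(u) / 0.8, -np.log(v) / 1.2     # exponential margins

def neg_loglik(params):
    lam1, lam2, theta = np.exp(params)          # positivity via log-scale
    s1, s2 = np.exp(-lam1 * t1), np.exp(-lam2 * t2)
    a = s1 ** (-theta) + s2 ** (-theta) - 1.0
    # log joint density: Clayton copula density times the two margins.
    log_dens = (np.log1p(theta)
                - (theta + 1) * (np.log(s1) + np.log(s2))
                - (1 / theta + 2) * np.log(a)
                + np.log(lam1) - lam1 * t1 + np.log(lam2) - lam2 * t2)
    return -np.sum(log_dens)

fit = minimize(neg_loglik, x0=np.log([1.0, 1.0, 1.0]), method="Nelder-Mead")
print("lambda1, lambda2, theta:", np.round(np.exp(fit.x), 3))
```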

Relevance: 30.00%

Abstract:

In this paper, we develop a flexible cure rate survival model by assuming the number of competing causes of the event of interest to follow the Conway-Maxwell Poisson distribution. This model includes as special cases some of the well-known cure rate models discussed in the literature. Next, we discuss the maximum likelihood estimation of the parameters of this cure rate survival model. Finally, we illustrate the usefulness of this model by applying it to real cutaneous melanoma data.
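
The standard cure-rate construction can be written down directly: if the number of competing causes N has probability generating function G, the population survival function is S_pop(t) = G(S(t)) for latent survival S(t), with cure fraction G(0). For a Conway-Maxwell Poisson N with normalising constant Z(λ, φ) = Σ_j λ^j / (j!)^φ, this gives S_pop(t) = Z(λ S(t), φ) / Z(λ, φ). The sketch below assumes exponential latent times; the (λ, φ, rate) values are illustrative.

```python
import numpy as np
from scipy.special import gammaln

def com_poisson_Z(lam, phi, j_max=200):
    """Normalising constant Z(lam, phi) = sum_j lam^j / (j!)^phi."""
    j = np.arange(j_max + 1)
    return np.sum(np.exp(j * np.log(lam) - phi * gammaln(j + 1)))

def pop_survival(t, lam=2.0, phi=1.5, rate=0.5):
    """Population survival: the pgf of N evaluated at the latent S(t)."""
    s_latent = np.exp(-rate * t)              # exponential latent survival
    return com_poisson_Z(lam * s_latent, phi) / com_poisson_Z(lam, phi)

print("cure fraction:", 1.0 / com_poisson_Z(2.0, 1.5))
for t in (0.0, 1.0, 5.0):
    print(f"S_pop({t}) = {pop_survival(t):.4f}")
```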

Relevance: 30.00%

Abstract:

The use of inter-laboratory test comparisons to determine the performance of individual laboratories for specific tests or calibrations [ISO/IEC Guide 43-1, 1997. Proficiency testing by interlaboratory comparisons - Part 1: Development and operation of proficiency testing schemes] is called Proficiency Testing (PT). In this paper we propose the use of the generalized likelihood ratio test to compare the performance of a group of laboratories on specific tests relative to the assigned value, and we illustrate the procedure using actual data from the PT program in the area of volume. The proposed test extends the test criteria in use by allowing one to test for the consistency of the group of laboratories. Moreover, the class of elliptical distributions is considered for the obtained measurements.
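
A sketch of a generalized likelihood ratio test of this kind in the Gaussian special case of the elliptical family (the paper's general elliptical treatment is not reproduced): H0 fixes the laboratories' common mean at the assigned value, H1 leaves it free, and the variance is a nuisance parameter profiled out under both hypotheses. The measurements below are illustrative, not the paper's volume data.

```python
import numpy as np
from scipy.stats import chi2

x = np.array([10.02, 9.98, 10.05, 10.01, 9.97, 10.04])  # lab results
mu0 = 10.00                                             # assigned value

n = x.size
ss_h1 = np.sum((x - x.mean()) ** 2)        # residual SS, mean free
ss_h0 = np.sum((x - mu0) ** 2)             # residual SS, mean fixed at mu0

lr_stat = n * np.log(ss_h0 / ss_h1)        # -2 log(likelihood ratio)
p_value = chi2.sf(lr_stat, df=1)           # one constrained parameter
print(f"-2 log Lambda = {lr_stat:.3f}, p = {p_value:.3f}")
```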

Relevance: 30.00%

Abstract:

We report a statistical analysis of Doppler broadening coincidence data of electron-positron annihilation radiation in silicon using a ²²Na source. The Doppler broadening coincidence spectrum was fitted using a model function that included positron annihilation at rest with 1s, 2s, 2p and valence band electrons. In-flight positron annihilation was also fitted. The response functions of the detectors accounted for backscattering, combinations of Compton effects, pileup, ballistic deficit and pulse-shaping problems. The procedure allows the quantitative determination of the core and valence electron annihilation intensities, as well as their standard deviations, directly from the experimental spectrum. The intensities obtained for core and valence band electron annihilation were 2.56(9)% and 97.44(9)%, respectively. These intensities are consistent with published experimental data treated by conventional analysis methods. This new procedure has the advantage of allowing one to distinguish additional effects from those associated with the response function of the detection system.
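
As a rough illustration of the component-decomposition idea, the sketch below fits a narrow "valence" plus a broad "core" Gaussian to simulated counts and converts the fitted areas into intensity fractions. The paper's actual model function (physically motivated line shapes convolved with a detailed detector response) is far more elaborate; everything here is a stand-in.

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(6)
e = np.linspace(-15, 15, 301)                     # Doppler-shift axis

def two_component(e, a_val, w_val, a_core, w_core):
    valence = a_val * np.exp(-0.5 * (e / w_val) ** 2)   # narrow component
    core = a_core * np.exp(-0.5 * (e / w_core) ** 2)    # broad component
    return valence + core

truth = (1000.0, 2.0, 30.0, 6.0)
counts = rng.poisson(two_component(e, *truth)) + 0.0

popt, pcov = curve_fit(two_component, e, counts, p0=(800, 1.5, 20, 5))
a_val, w_val, a_core, w_core = popt

# Component intensities as Gaussian areas (amplitude * width * sqrt(2*pi)).
area_val = a_val * w_val * np.sqrt(2 * np.pi)
area_core = a_core * w_core * np.sqrt(2 * np.pi)
print(f"core fraction = {100 * area_core / (area_val + area_core):.2f} %")
```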

Relevance: 30.00%

Abstract:

In this paper a new parametric method to deal with discrepant experimental results is developed. The method is based on fitting a probability density function to the data. This paper also compares the characteristics of different methods used to deduce recommended values and uncertainties from a discrepant set of experimental data. The methods are applied to the published half-lives of ¹³⁷Cs and ⁹⁰Sr, and special emphasis is given to the deduced confidence intervals. The results obtained are analysed in the light of two fundamental properties expected of an experimental result: the probability content of confidence intervals and the statistical consistency between different recommended values. The recommended values and uncertainties for the ¹³⁷Cs and ⁹⁰Sr half-lives are 10,984 (24) days and 10,523 (70) days, respectively.
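
One simple parametric scheme in this spirit (not necessarily the paper's exact density) models each measurement x_i with quoted uncertainty s_i as Gaussian about the true value with variance s_i² + τ², where τ absorbs unrecognised systematic scatter; μ and τ are then found by maximum likelihood. The half-life values below are made up for illustration.

```python
import numpy as np
from scipy.optimize import minimize

x = np.array([10965.0, 11020.0, 10940.0, 11009.0, 10970.0])  # days
s = np.array([15.0, 20.0, 25.0, 12.0, 18.0])                 # quoted sigmas

def neg_loglik(params):
    mu, log_tau = params
    var = s**2 + np.exp(log_tau) ** 2        # inflate quoted variances
    return 0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var)

fit = minimize(neg_loglik, x0=(x.mean(), np.log(10.0)), method="Nelder-Mead")
mu_hat, tau_hat = fit.x[0], np.exp(fit.x[1])
print(f"recommended value = {mu_hat:.0f} days, extra scatter = {tau_hat:.0f} days")
```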

Relevance: 30.00%

Abstract:

Specific choices about how to represent complex networks can have a substantial impact on the execution time required for the construction and analysis of those structures. In this work we report a comparison of the effects of representing complex networks statically by adjacency matrices or dynamically by adjacency lists. Three theoretical models of complex networks are considered: two types of Erdős-Rényi graphs as well as the Barabási-Albert model. We investigated the effect of the different representations with respect to the construction and measurement of several topological properties (i.e. degree, clustering coefficient, shortest path length and betweenness centrality). We found that the different forms of representation generally have a substantial effect on the execution time, with the sparse representation frequently resulting in remarkably superior performance.
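
The comparison is easy to reproduce in miniature: build the same Erdős-Rényi graph as a dense adjacency matrix and as an adjacency list, then time a neighbour-dependent measurement on each. For a sparse graph the list form touches only the actual neighbours per node, while the matrix form scans all n entries. The sizes below are illustrative.

```python
import random
import time

n, p = 2000, 0.01
random.seed(0)

matrix = [[0] * n for _ in range(n)]       # static: dense adjacency matrix
adj_list = [[] for _ in range(n)]          # dynamic: adjacency list
for i in range(n):
    for j in range(i + 1, n):
        if random.random() < p:
            matrix[i][j] = matrix[j][i] = 1
            adj_list[i].append(j)
            adj_list[j].append(i)

t0 = time.perf_counter()
deg_matrix = [sum(row) for row in matrix]  # scans all n entries per node
t1 = time.perf_counter()
deg_list = [len(nb) for nb in adj_list]    # touches only the neighbours
t2 = time.perf_counter()

assert deg_matrix == deg_list
print(f"matrix: {t1 - t0:.4f} s, list: {t2 - t1:.4f} s")
```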

Relevance: 30.00%

Abstract:

In the present study, we propose a theoretical graph procedure to investigate multiple pathways in brain functional networks. By taking into account all the possible paths consisting of h links between the node pairs of the network, we measured the global network redundancy R(h) as the number of parallel paths and the global network permeability P(h) as the probability of getting connected. We used this procedure to investigate the structural and dynamical changes in the cortical networks estimated from a dataset of high-resolution EEG signals in a group of spinal cord injured (SCI) patients during attempted foot movement. In a statistical contrast with a healthy population, the permeability index P(h) of the SCI networks increased significantly (P < 0.01) in the Theta frequency band (3-6 Hz) for distances h ranging from 2 to 4. On the contrary, no significant differences were found between the two populations for the redundancy index R(h). The most significant changes in the brain functional network of SCI patients occurred mainly in the lower spectral contents. These changes were related to an improved propagation of communication between the closest cortical areas rather than to a different level of redundancy. This evidence strengthens the hypothesis of a need for higher functional interaction among the closest ROIs as a mechanism to compensate for the lack of feedback from the peripheral nerves to the sensorimotor areas.
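
A toy version of the two indices on a small undirected graph, with the caveat that powers of the adjacency matrix count walks rather than simple paths (a common proxy for parallel-path counting), and that the study's exact normalisations may differ: (A^h)_ij gives the h-link walk count behind an R(h)-like index, and the fraction of node pairs reachable within h links gives a P(h)-like index.

```python
import numpy as np

A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=int)
n = A.shape[0]
pairs = n * (n - 1) / 2

reach = np.eye(n, dtype=int)
Ah = np.eye(n, dtype=int)
for h in range(1, 4):
    Ah = Ah @ A                                   # walks of exactly h links
    reach = reach + Ah                            # walks of length <= h
    iu = np.triu_indices(n, k=1)
    r_h = Ah[iu].mean()                           # mean walk count per pair
    p_h = np.count_nonzero(reach[iu]) / pairs     # fraction of connected pairs
    print(f"h={h}: mean parallel-walk count={r_h:.2f}, P(h)={p_h:.2f}")
```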