892 results for rank transform
Abstract:
We demonstrate the use of Fourier transform infrared spectroscopy (FTIRS) to make quantitative measures of total organic carbon (TOC), total inorganic carbon (TIC) and biogenic silica (BSi) concentrations in sediment. FTIRS is a fast and cost-effective technique, and only small sediment samples (0.01 g) are needed. Statistically significant models were developed using sediment samples from northern Sweden and were applied to sediment records from Sweden, northeast Siberia and Macedonia. The correlations between FTIRS-inferred values and conventionally assessed amounts of the biogeochemical constituents were r = 0.84–0.99 for TOC, r = 0.85–0.99 for TIC, and r = 0.68–0.94 for BSi. Because FTIR spectra contain information on a large number of both inorganic and organic components, there is great potential for FTIRS to become an important tool in paleolimnology.
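A minimal sketch of what such a calibration model might look like in code, assuming a partial least squares (PLS) regression, a common choice for spectral calibration (the abstract does not name the specific model), with synthetic stand-in data:

```python
# Hypothetical FTIRS calibration sketch: regress a measured property (TOC)
# on FTIR spectra for a training set, then predict it for new samples.
# PLS regression is an assumption here; the study's exact model is not named.
import numpy as np
from scipy.stats import pearsonr
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
n_samples, n_wavenumbers = 100, 400        # synthetic stand-ins, not real data
spectra = rng.normal(size=(n_samples, n_wavenumbers))
toc = spectra[:, :5].sum(axis=1) + rng.normal(scale=0.1, size=n_samples)

model = PLSRegression(n_components=5).fit(spectra, toc)
toc_hat = model.predict(spectra).ravel()
r, _ = pearsonr(toc, toc_hat)              # cf. the reported r = 0.84-0.99
print(f"calibration r = {r:.2f}")
```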
Abstract:
We consider the problem of fitting a union of subspaces to a collection of data points drawn from one or more subspaces and corrupted by noise and/or gross errors. We pose this problem as a non-convex optimization problem, where the goal is to decompose the corrupted data matrix as the sum of a clean and self-expressive dictionary plus a matrix of noise and/or gross errors. By self-expressive we mean a dictionary whose atoms can be expressed as linear combinations of themselves with low-rank coefficients. In the case of noisy data, our key contribution is to show that this non-convex matrix decomposition problem can be solved in closed form from the SVD of the noisy data matrix. The solution involves a novel polynomial thresholding operator on the singular values of the data matrix, which requires minimal shrinkage. For one subspace, a particular case of our framework leads to classical PCA, which requires no shrinkage. For multiple subspaces, the low-rank coefficients obtained by our framework can be used to construct a data affinity matrix from which the clustering of the data according to the subspaces can be obtained by spectral clustering. In the case of data corrupted by gross errors, we solve the problem using an alternating minimization approach, which combines our polynomial thresholding operator with the more traditional shrinkage-thresholding operator. Experiments on motion segmentation and face clustering show that our framework performs on par with state-of-the-art techniques at a reduced computational cost.
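A rough sketch of the final clustering stage, under loud assumptions: the paper's closed-form polynomial thresholding operator is not reproduced, and a simple hard threshold of the singular values stands in for it, after which the low-rank coefficients build the affinity matrix for spectral clustering:

```python
# Illustrative sketch only: a hard singular-value threshold stands in for
# the paper's polynomial thresholding operator. The low-rank coefficient
# matrix C = V V^T is self-expressive, and |C| + |C|^T serves as the data
# affinity matrix for spectral clustering.
import numpy as np
from sklearn.cluster import SpectralClustering

def subspace_cluster(X, n_clusters, tau=1.0):
    # X: (ambient_dim, n_points) noisy data matrix
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    V = Vt[s > tau].T                    # keep directions above the threshold
    C = V @ V.T                          # low-rank self-expressive coefficients
    W = np.abs(C) + np.abs(C.T)          # symmetric, non-negative affinity
    return SpectralClustering(n_clusters=n_clusters,
                              affinity="precomputed").fit_predict(W)

# Two random 2-D subspaces in R^10, 30 noisy points each
rng = np.random.default_rng(1)
X = np.hstack([rng.normal(size=(10, 2)) @ rng.normal(size=(2, 30))
               for _ in range(2)]) + 0.01 * rng.normal(size=(10, 60))
print(subspace_cluster(X, n_clusters=2))
```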
Abstract:
The rank-based nonlinear predictability score was recently introduced as a test for determinism in point processes. Here we adapt this measure to time series sampled from time-continuous flows. We use noisy Lorenz signals to compare this approach against a classical amplitude-based nonlinear prediction error. Both measures show an almost identical robustness against Gaussian white noise. In contrast, when the amplitude distribution of the noise has a narrower central peak and heavier tails than the normal distribution, the rank-based nonlinear predictability score outperforms the amplitude-based nonlinear prediction error. For this type of noise, the nonlinear predictability score has a higher sensitivity to deterministic structure in noisy signals. It also yields a higher statistical power in a surrogate test of the null hypothesis of linearly correlated stochastic signals. We show the high relevance of this improved performance in an application to electroencephalographic (EEG) recordings from epilepsy patients. Here, too, the nonlinear predictability score appears more sensitive to nonrandomness. Importantly, it yields an improved contrast between signals recorded from brain areas where the first ictal EEG signal changes were detected (focal EEG signals) and signals recorded from brain areas that were not involved at seizure onset (nonfocal EEG signals).
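The published score is defined in the cited work; as a loose illustration of the rank-versus-amplitude contrast, the following sketch computes a nearest-neighbor forecast in a delay-embedding space and measures its error both on amplitudes and on ranks (the embedding parameters and error definitions are assumptions):

```python
# Loose illustration, not the published score: one nearest-neighbor forecast
# in delay-embedding space, with the error measured on amplitudes (classical
# nonlinear prediction error) versus on ranks (rank-based variant).
import numpy as np
from scipy.spatial.distance import cdist
from scipy.stats import rankdata

def delay_embed(x, dim, lag):
    n = len(x) - (dim - 1) * lag
    return np.column_stack([x[i * lag: i * lag + n] for i in range(dim)])

def prediction_errors(x, dim=3, lag=5, horizon=5):
    E = delay_embed(x, dim, lag)
    m = len(E) - horizon
    D = cdist(E[:m], E[:m])
    np.fill_diagonal(D, np.inf)                 # exclude self-matches
    nn = D.argmin(axis=1)                       # nearest neighbor of each state
    target = x[(dim - 1) * lag + horizon:][:m]  # true future amplitudes
    forecast = target[nn]                       # neighbor's future as forecast
    amp_err = np.sqrt(np.mean((forecast - target) ** 2))
    rank_err = np.sqrt(np.mean((rankdata(forecast) - rankdata(target)) ** 2))
    return amp_err, rank_err

x = np.sin(np.linspace(0, 60, 1500))
x += 0.1 * np.random.default_rng(2).standard_normal(x.size)
print(prediction_errors(x))
```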
Abstract:
We propose notions of calibration for probabilistic forecasts of general multivariate quantities. Probabilistic copula calibration is a natural analogue of probabilistic calibration in the univariate setting. It can be assessed empirically by checking for the uniformity of the copula probability integral transform (CopPIT), which is invariant under coordinate permutations and coordinatewise strictly monotone transformations of the predictive distribution and the outcome. The CopPIT histogram can be interpreted as a generalization and variant of the multivariate rank histogram, which has been used to check the calibration of ensemble forecasts. Climatological copula calibration is an analogue of marginal calibration in the univariate setting. Methods and tools are illustrated in a simulation study and applied to compare raw numerical model and statistically postprocessed ensemble forecasts of bivariate wind vectors.
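As background for the construction, the univariate probability integral transform (PIT) check that copula calibration generalizes can be sketched in a few lines; the CopPIT itself additionally requires the Kendall distribution function of the forecast copula, which is not reproduced here:

```python
# Univariate PIT check: if an outcome y really is drawn from the forecast
# CDF F, then F(y) is uniform on [0, 1], so the histogram of PIT values
# should be flat. The CopPIT extends this idea to multivariate forecasts.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
y = rng.normal(loc=0.3, scale=1.0, size=2000)      # outcomes

pit_good = stats.norm(loc=0.3, scale=1.0).cdf(y)   # calibrated forecast
pit_bad = stats.norm(loc=0.0, scale=1.0).cdf(y)    # biased forecast

# A uniformity test flags the miscalibrated forecast
print(stats.kstest(pit_good, "uniform").pvalue)    # large: calibrated
print(stats.kstest(pit_bad, "uniform").pvalue)     # tiny: miscalibrated
```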
Abstract:
Debates over the merits of competing schemes for ranking metropolitan areas as high-tech centers shed little light on the important policy questions that should be the core of economic development policy. There are no strong theoretical reasons for preferring one ranking system to others. Rankings often conflate different industries and ignore history, obscuring the varied and often idiosyncratic processes that drive growth in different regions. Although an occupational perspective is a useful one for examining economic activity, it is a supplement to, not a replacement for, a careful understanding of metropolitan industrial specialization. Practitioners should not put too much weight on any ranking system but instead should work to develop detailed knowledge of their region’s special economic niche and to develop relationships and strategies that build on established strengths.
Abstract:
Let Y_i = f(x_i) + E_i (1 ≤ i ≤ n), with given covariates x_1 < x_2 < ... < x_n, an unknown regression function f, and independent random errors E_i with median zero. It is shown how to apply several linear rank test statistics simultaneously in order to test the monotonicity of f in various regions and to identify its local extrema.
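As a hedged illustration of the idea (the paper's specific linear rank statistics and its simultaneous inference procedure are not reproduced), a windowed Kendall rank-correlation test can flag locally increasing or decreasing regions of f:

```python
# Stand-in illustration: a Kendall rank-correlation test applied to windows
# of the data flags regions where f appears monotone; the paper's linear
# rank statistics and multiplicity control are not reproduced here.
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(4)
x = np.linspace(0, 1, 200)
y = np.sin(2 * np.pi * x) + rng.laplace(scale=0.2, size=x.size)  # median-zero errors

window = 50
for start in range(0, x.size - window + 1, window):
    sl = slice(start, start + window)
    tau, p = kendalltau(x[sl], y[sl])
    print(f"region [{x[start]:.2f}, {x[start + window - 1]:.2f}]: "
          f"tau = {tau:+.2f}, p = {p:.3g}")
```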
Abstract:
Theoretical models predict lognormal species abundance distributions (SADs) in stable and productive environments, with log-series SADs in less stable, dispersal-driven communities. We studied patterns of relative species abundances of perennial vascular plants in global dryland communities to: (i) assess the influence of climatic and soil characteristics on the observed SADs, (ii) infer how environmental variability influences relative abundances, and (iii) evaluate how colonisation dynamics and environmental filters shape abundance distributions. We fitted lognormal and log-series SADs to 91 sites containing at least 15 species of perennial vascular plants. The dependence of species relative abundances on soil and climate variables was assessed using general linear models. Irrespective of habitat type and latitude, the majority of the SADs (70.3%) were best described by a lognormal distribution. Lognormal SADs were associated with low annual precipitation, higher aridity, high soil carbon content, and higher variability of climate variables and soil nitrate. Our results do not corroborate models predicting the prevalence of log-series SADs in dryland communities. As lognormal SADs were particularly associated with sites with drier conditions and a higher environmental variability, we reject models linking lognormality to environmental stability and high productivity conditions. Instead, our results point to the prevalence of lognormal SADs in heterogeneous environments, allowing for more evenly distributed plant communities, or in stressful ecosystems, which are generally shaped by strong habitat filters and limited colonisation. This suggests that drylands may be resilient to environmental changes because the many species with intermediate relative abundances could take over ecosystem functioning if the environment becomes suboptimal for dominant species.
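A compact sketch of the model comparison described, with illustrative abundance counts: fit a log-series and a lognormal by maximum likelihood and compare AICs (careful SAD work typically uses a discretized lognormal; the continuous fit here is a simplification):

```python
# Hedged sketch: compare a one-parameter log-series fit against a
# two-parameter lognormal fit to species abundances via AIC. The
# continuous lognormal is a simplification for illustration.
import numpy as np
from scipy import stats, optimize

abundances = np.array([120, 80, 45, 30, 22, 15, 11, 8, 6, 5, 4, 3, 3, 2, 2, 1, 1, 1])

# Log-series: one parameter p in (0, 1), fit by maximum likelihood
nll_ls = lambda p: -stats.logser.logpmf(abundances, p).sum()
p_hat = optimize.minimize_scalar(nll_ls, bounds=(1e-6, 1 - 1e-6),
                                 method="bounded").x
aic_ls = 2 * 1 + 2 * nll_ls(p_hat)

# Lognormal: two parameters (shape, scale), location fixed at zero
s, _, scale = stats.lognorm.fit(abundances, floc=0)
aic_ln = 2 * 2 - 2 * stats.lognorm.logpdf(abundances, s, 0, scale).sum()

print(f"AIC log-series = {aic_ls:.1f}, AIC lognormal = {aic_ln:.1f}")
```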
Abstract:
Every x-ray attenuation curve inherently contains all the information necessary to extract the complete energy spectrum of a beam. To date, attempts to obtain accurate spectral information from attenuation data have been inadequate. This investigation presents a mathematical pair model, grounded in physical reality by the Laplace transformation, to describe the attenuation of a photon beam and the corresponding bremsstrahlung spectral distribution. In addition, the Laplace model has been mathematically extended to include characteristic radiation in a physically meaningful way. A method to determine the fraction of characteristic radiation in any diagnostic x-ray beam was introduced for use with the extended model. This work examined the reconstructive capability of the Laplace pair model for photon beams ranging from 50 kVp to 25 MV, using both theoretical and experimental methods. In the diagnostic region, excellent agreement between a wide variety of experimental spectra and those reconstructed with the Laplace model was obtained when the atomic composition of the attenuators was accurately known. The model successfully reproduced a 2 MV spectrum but had difficulty accurately reconstructing orthovoltage and 6 MV spectra. The 25 MV spectrum was successfully reconstructed, although agreement with the spectrum obtained by Levy was poor. The analysis of errors, performed with diagnostic-energy data, demonstrated the relative insensitivity of the model to typical experimental errors and confirmed that the model can be used to derive accurate spectral information from experimental attenuation data.
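In standard form, the Laplace-pair idea the abstract builds on relates the measured attenuation curve to the spectrum (the dissertation's specific parameterized pair model and its characteristic-radiation extension are not reproduced here):

```latex
% Transmission through an absorber of thickness x, for spectrum \Phi(E):
T(x) = \int_0^{E_{\max}} \Phi(E)\, e^{-\mu(E)\,x}\, \mathrm{d}E .
% Changing variables from energy E to attenuation coefficient \mu,
% with \psi(\mu)\,\mathrm{d}\mu = \Phi(E)\,\mathrm{d}E, this is a Laplace transform,
T(x) = \int_0^{\infty} \psi(\mu)\, e^{-\mu x}\, \mathrm{d}\mu = \mathcal{L}\{\psi\}(x),
% so the spectrum follows by inverting the measured attenuation curve:
\psi = \mathcal{L}^{-1}\{T\}.
```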
Abstract:
An extension of k-ratio multiple comparison methods to rank-based analyses is described. The new method is analogous to the Duncan-Godbold approximate k-ratio procedure for unequal sample sizes or correlated means. The close parallel between the new methods and the Duncan-Godbold approach is shown by demonstrating that they are based on different parameterizations as starting points. A semi-parametric basis for the new methods is established by starting from the Cox proportional hazards model and using Wald statistics; from there, the log-rank and Gehan-Breslow-Wilcoxon methods may be seen as score-statistic-based methods. Simulations and analysis of a published data set are used to show the performance of the new methods.
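A small sketch of the starting point named in the abstract, a Wald statistic for a group effect in a Cox proportional hazards model, here via the lifelines package (the k-ratio machinery built on top is not reproduced):

```python
# Building-block sketch: Wald statistic for a two-arm comparison in a Cox
# proportional hazards model, using lifelines. The k-ratio multiple
# comparison procedure layered on such statistics is not shown.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(5)
n = 200
group = rng.integers(0, 2, size=n)                  # two treatment arms
latent = rng.exponential(scale=np.where(group == 1, 0.5, 1.0))
cens = rng.exponential(scale=3.0, size=n)           # random censoring times
df = pd.DataFrame({"T": np.minimum(latent, cens),
                   "E": (latent <= cens).astype(int),
                   "group": group})

cph = CoxPHFitter().fit(df, duration_col="T", event_col="E")
print(cph.summary[["coef", "z", "p"]])              # z is the Wald statistic
```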
Abstract:
Sizes and power of selected two-sample tests of the equality of survival distributions are compared by simulation for small samples from unequally, randomly censored exponential distributions. The tests investigated include parametric tests (F, Score, Likelihood, Asymptotic), log-rank tests (Mantel, Peto-Peto), and Wilcoxon-type tests (Gehan, Prentice). Equal-sized samples, n = 8, 16, 32, with 1000 (size) and 500 (power) simulation trials, are compared for 16 combinations of the censoring proportions 0%, 20%, 40%, and 60%. For n = 8 and 16, the Asymptotic, Peto-Peto, and Wilcoxon tests perform at the nominal 5% size, but the F, Score, and Mantel tests exceeded the 5% size confidence limits for one-third of the censoring combinations. For n = 32, all tests showed proper size, with the Peto-Peto test most conservative in the presence of unequal censoring. Powers of all tests are compared for exponential hazard ratios of 1.4 and 2.0. There is little difference in the power characteristics of the tests within the classes considered. The Mantel test showed 90% to 95% power efficiency relative to the parametric tests. Wilcoxon-type tests have the lowest relative power but are robust to differential censoring patterns. A modified Peto-Peto test shows power comparable to the Mantel test. For n = 32, a specific Weibull-exponential comparison of crossing survival curves suggests that the relative powers of log-rank and Wilcoxon-type tests depend on the scale parameter of the Weibull distribution. Wilcoxon-type tests appear more powerful than log-rank tests for late-crossing survival curves and less powerful for early-crossing ones. Guidelines for the appropriate selection of two-sample tests are given.
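A hedged sketch of this kind of simulation: estimating the empirical size of the log-rank test for small, unequally censored exponential samples (using lifelines.statistics.logrank_test; the study's full battery of tests is not reproduced):

```python
# Size simulation under the null: both arms exponential with the same
# hazard, but with unequal random censoring; the rejection rate of the
# log-rank test at the 5% level should stay near 0.05.
import numpy as np
from lifelines.statistics import logrank_test

def simulate_arm(rng, n, cens_prop):
    t = rng.exponential(scale=1.0, size=n)
    if cens_prop == 0:
        return t, np.ones(n, dtype=int)
    # exponential censoring tuned so that P(censored) = cens_prop
    c = rng.exponential(scale=(1 - cens_prop) / cens_prop, size=n)
    return np.minimum(t, c), (t <= c).astype(int)

rng = np.random.default_rng(6)
n, trials, rejections = 16, 1000, 0
for _ in range(trials):
    t1, e1 = simulate_arm(rng, n, 0.20)   # 20% censoring in arm 1
    t2, e2 = simulate_arm(rng, n, 0.60)   # 60% censoring in arm 2 (unequal)
    res = logrank_test(t1, t2, event_observed_A=e1, event_observed_B=e2)
    rejections += res.p_value < 0.05
print(f"empirical size = {rejections / trials:.3f} (nominal 0.05)")
```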
Abstract:
The determination of the size as well as the power of a test is a vital part of clinical trial design. This research focuses on the simulation of clinical trial data with time-to-event as the primary outcome, and investigates the impact of different recruitment patterns and time-dependent hazard structures on the size and power of the log-rank test. A non-homogeneous Poisson process is used to simulate entry times according to the different accrual patterns, and a Weibull distribution is employed to simulate survival times according to the different hazard structures. The size of the log-rank test is estimated by simulating survival times with identical hazard rates in the treatment and control arms, giving a hazard ratio of one. Power at specific hazard ratios (≠ 1) is estimated by simulating survival times with different, but proportional, hazard rates for the two arms. Different shapes (constant, decreasing, or increasing) of the Weibull hazard function are also considered to assess the effect of the hazard structure on the size and power of the log-rank test.
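A sketch of this simulation design: entry times by Lewis-Shedler thinning for the non-homogeneous Poisson process, Weibull survival with proportional hazards between arms, administrative censoring at the end of the study, and power estimated by the rejection rate. All numeric settings below are illustrative assumptions, not the study's values:

```python
# Power simulation: NHPP accrual (thinning), Weibull survival times with a
# proportional-hazards shift between arms, censoring at study end, and the
# log-rank rejection rate as the power estimate.
import numpy as np
from lifelines.statistics import logrank_test

def nhpp_entries(rng, rate_fn, rate_max, horizon):
    """Entry times on [0, horizon] by Lewis-Shedler thinning."""
    t, out = 0.0, []
    while True:
        t += rng.exponential(1.0 / rate_max)
        if t > horizon:
            return np.array(out)
        if rng.uniform() < rate_fn(t) / rate_max:
            out.append(t)

def weibull_times(rng, n, shape, scale, hr):
    # multiplying a Weibull(k, lam) hazard by hr rescales lam by hr**(-1/k)
    return rng.weibull(shape, size=n) * scale * hr ** (-1.0 / shape)

rng = np.random.default_rng(7)
accrual, study_end, hr, rejections, trials = 12.0, 36.0, 2.0, 0, 500
ramp = lambda t: 10.0 * t / accrual                 # accrual rate ramps up

for _ in range(trials):
    arms = []
    for arm_hr in (1.0, hr):
        entry = nhpp_entries(rng, ramp, 10.0, accrual)
        surv = weibull_times(rng, len(entry), shape=1.5, scale=20.0, hr=arm_hr)
        obs = np.minimum(surv, study_end - entry)   # administrative censoring
        arms.append((obs, (surv <= study_end - entry).astype(int)))
    (t1, e1), (t2, e2) = arms
    res = logrank_test(t1, t2, event_observed_A=e1, event_observed_B=e2)
    rejections += res.p_value < 0.05
print(f"estimated power at HR = {hr}: {rejections / trials:.2f}")
```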