891 results for Statistical hypothesis testing.
Abstract:
A framework for adaptive and non-adaptive statistical compressive sensing is developed, where a statistical model replaces the standard sparsity model of classical compressive sensing. Within this framework we propose optimal task-specific sensing protocols specifically and jointly designed for classification and reconstruction. A two-step adaptive sensing paradigm is developed, where online sensing is applied to detect the signal class in the first step, followed by a reconstruction step adapted to the detected class and the observed samples. The approach is based on information theory, here tailored for Gaussian mixture models (GMMs), where an information-theoretic objective, relating the sensed signals to a representation of the specific task of interest, is maximized. Experimental results using synthetic signals, Landsat satellite attributes, and natural images of different sizes and with different noise levels show the improvements achieved using the proposed framework when compared to more standard sensing protocols. The underlying formulation can be applied beyond GMMs, at the price of higher mathematical and computational complexity.
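The two-step sense-classify-reconstruct idea can be illustrated with a minimal numerical sketch. This is not the authors' information-theoretically optimized sensing design: the sensing matrix here is generic, and all dimensions, class parameters and noise levels are illustrative assumptions. Under a GMM prior and a linear Gaussian measurement model, the class is detected from the compressed measurements by posterior probability, and the signal is then reconstructed with the class-conditional posterior mean.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
n, m, K = 16, 6, 2                          # signal dim, measurement dim, classes (illustrative)

# Illustrative GMM prior: class means and covariances, equal weights
mus = [rng.normal(size=n) for _ in range(K)]
Sigmas = [np.eye(n) * (k + 1) for k in range(K)]
weights = np.full(K, 1.0 / K)

A = rng.normal(size=(m, n)) / np.sqrt(m)    # a generic (non-adaptive) sensing matrix
sigma2 = 0.01                               # measurement noise variance

# Draw a signal from class 1 and sense it
x_true = rng.multivariate_normal(mus[1], Sigmas[1])
y = A @ x_true + rng.normal(scale=np.sqrt(sigma2), size=m)

# Step 1: detect the class from the compressed measurements.
# p(y | class k) is Gaussian with mean A mu_k and covariance A Sigma_k A^T + sigma2 I.
log_post = np.array([
    np.log(weights[k]) + multivariate_normal.logpdf(
        y, mean=A @ mus[k], cov=A @ Sigmas[k] @ A.T + sigma2 * np.eye(m))
    for k in range(K)
])
k_hat = int(np.argmax(log_post))

# Step 2: reconstruct with the class-conditional posterior mean (Gaussian conditioning)
S = Sigmas[k_hat]
G = S @ A.T @ np.linalg.inv(A @ S @ A.T + sigma2 * np.eye(m))
x_hat = mus[k_hat] + G @ (y - A @ mus[k_hat])

print("detected class:", k_hat, " reconstruction error:", np.linalg.norm(x_hat - x_true))
```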
Abstract:
In this paper we propose exact likelihood-based mean-variance efficiency tests of the market portfolio in the context of the Capital Asset Pricing Model (CAPM), allowing for a wide class of error distributions which include normality as a special case. These tests are developed in the framework of multivariate linear regressions (MLR). It is well known, however, that despite their simple statistical structure, standard asymptotically justified MLR-based tests are unreliable. In financial econometrics, exact tests have been proposed for a few specific hypotheses [Jobson and Korkie (Journal of Financial Economics, 1982), MacKinlay (Journal of Financial Economics, 1987), Gibbons, Ross and Shanken (Econometrica, 1989), Zhou (Journal of Finance, 1993)], most of which depend on normality. For the Gaussian model, our tests correspond to Gibbons, Ross and Shanken's mean-variance efficiency tests. In non-Gaussian contexts, we reconsider mean-variance efficiency tests allowing for multivariate Student-t and Gaussian mixture errors. Our framework allows us to cast more light on whether the normality assumption is too restrictive when testing the CAPM. We also propose exact multivariate diagnostic checks (including tests for multivariate GARCH and a multivariate generalization of the well-known variance ratio tests) and goodness-of-fit tests, as well as a set estimate for the intervening nuisance parameters. Our results [over five-year subperiods] show the following: (i) multivariate normality is rejected in most subperiods, (ii) residual checks reveal no significant departures from the multivariate i.i.d. assumption, and (iii) mean-variance efficiency of the market portfolio is rejected less frequently once the possibility of non-normal errors is allowed for.
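For reference, a minimal sketch of one common form of the Gibbons-Ross-Shanken (GRS) mean-variance efficiency statistic, to which the paper's Gaussian-case tests correspond. Conventions differ across texts (notably whether covariances are divided by T or by degrees of freedom); the form below uses maximum-likelihood (divide-by-T) estimates, and the simulated data are purely illustrative.

```python
import numpy as np
from scipy import stats

def grs_test(excess_returns, market_excess):
    """One common form of the Gibbons-Ross-Shanken (1989) test: regress N asset
    excess returns on the market excess return and test whether all intercepts
    are jointly zero. With ML (divide-by-T) covariance estimates and i.i.d.
    Gaussian errors the statistic is F(N, T - N - 1) distributed."""
    T, N = excess_returns.shape
    X = np.column_stack([np.ones(T), market_excess])
    B = np.linalg.lstsq(X, excess_returns, rcond=None)[0]     # 2 x N coefficients
    alpha = B[0]                                              # intercepts
    resid = excess_returns - X @ B
    Sigma = resid.T @ resid / T
    mu_m, sig2_m = market_excess.mean(), market_excess.var()  # ML variance (ddof=0)
    W = ((T - N - 1) / N) * (alpha @ np.linalg.solve(Sigma, alpha)) / (1 + mu_m**2 / sig2_m)
    return W, 1.0 - stats.f.cdf(W, N, T - N - 1)

# Illustrative data: 5 assets, 240 months, generated so that the CAPM holds exactly.
rng = np.random.default_rng(6)
T, N = 240, 5
mkt = rng.normal(0.005, 0.04, size=T)
betas = rng.uniform(0.5, 1.5, size=N)
rets = mkt[:, None] * betas + rng.normal(0, 0.02, size=(T, N))   # zero true alphas
print("GRS statistic and p-value:", grs_test(rets, mkt))
```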
Abstract:
We discuss statistical inference problems associated with identification and testability in econometrics, and we emphasize the common nature of the two issues. After reviewing the relevant statistical notions, we consider in turn inference in nonparametric models and recent developments on weakly identified models (or weak instruments). We point out that many hypotheses, for which test procedures are commonly proposed, are not testable at all, while some frequently used econometric methods are fundamentally inappropriate for the models considered. Such situations lead to ill-defined statistical problems and are often associated with a misguided use of asymptotic distributional results. Concerning nonparametric hypotheses, we discuss three basic problems for which such difficulties occur: (1) testing a mean (or a moment) under (too) weak distributional assumptions; (2) inference under heteroskedasticity of unknown form; (3) inference in dynamic models with an unlimited number of parameters. Concerning weakly identified models, we stress that valid inference should be based on proper pivotal functions (a condition not satisfied by standard Wald-type methods based on standard errors), and we discuss recent developments in this field, mainly from the viewpoint of building valid tests and confidence sets. The techniques discussed include alternative proposed statistics, bounds, projection, split-sampling, conditioning, and Monte Carlo tests. The possibility of deriving a finite-sample distributional theory, robustness to the presence of weak instruments, and robustness to the specification of a model for endogenous explanatory variables are stressed as important criteria for assessing alternative procedures.
Abstract:
Statistical tests in vector autoregressive (VAR) models are typically based on large-sample approximations, involving the use of asymptotic distributions or bootstrap techniques. After documenting that such methods can be very misleading even with fairly large samples, especially when the number of lags or the number of equations is not small, we propose a general simulation-based technique that allows one to control completely the level of tests in parametric VAR models. In particular, we show that maximized Monte Carlo tests [Dufour (2002)] can provide provably exact tests for such models, whether they are stationary or integrated. Applications to order selection and causality testing are considered as special cases. The technique developed is applied to quarterly and monthly VAR models of the U.S. economy, comprising income, money, interest rates and prices, over the period 1965-1996.
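The Monte Carlo test principle underlying the proposed technique can be sketched in a few lines: simulate the test statistic under the null a fixed number of times and rank the observed value among the simulations. The sketch below shows only this basic (non-maximized) principle on a toy AR(1) example, not the paper's maximized Monte Carlo procedure for VAR models; the statistic, sample size and AR coefficient are illustrative assumptions.

```python
import numpy as np

def mc_pvalue(stat_obs, stat_fn, simulate_null, n_rep=99, rng=None):
    """Monte Carlo test p-value: simulate the statistic under the null with
    n_rep replications and rank the observed value among them."""
    rng = np.random.default_rng() if rng is None else rng
    sims = np.array([stat_fn(simulate_null(rng)) for _ in range(n_rep)])
    # With a continuous statistic, this p-value yields an exact alpha-level test
    # whenever alpha * (n_rep + 1) is an integer.
    return (1 + np.sum(sims >= stat_obs)) / (n_rep + 1)

# Toy illustration (not the paper's VAR causality test): test H0: mean = 0
# for an AR(1) process with known coefficient 0.5 and Gaussian innovations.
rng = np.random.default_rng(1)

def simulate_null(rng, T=200, phi=0.5):
    e = rng.normal(size=T)
    y = np.zeros(T)
    for t in range(1, T):
        y[t] = phi * y[t - 1] + e[t]
    return y

stat_fn = lambda y: abs(y.mean())
y_obs = simulate_null(rng) + 0.2            # data generated away from the null
print("MC p-value:", mc_pvalue(stat_fn(y_obs), stat_fn, simulate_null, n_rep=199, rng=rng))
```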
Abstract:
Several eco-toxicological studies have shown that insectivorous mammals, due to their feeding habits, easily accumulate high amounts of pollutants in relation to other mammal species. To assess the bio-accumulation levels of toxic metals and their influence on essential metals, we quantified the concentration of 19 elements (Ca, K, Fe, B, P, S, Na, Al, Zn, Ba, Rb, Sr, Cu, Mn, Hg, Cd, Mo, Cr and Pb) in bones of 105 greater white-toothed shrews (Crocidura russula) from a polluted (Ebro Delta) and a control (Medas Islands) area. Since the chemical contents of a bio-indicator are mainly compositional data, conventional statistical analyses currently used in eco-toxicology can give misleading results. Therefore, to improve the interpretation of the data obtained, we used statistical techniques for compositional data analysis to define groups of metals and to evaluate the relationships between them, from an inter-population viewpoint. Hypothesis testing on the appropriate balance-coordinates allows us to confirm intuition-based hypotheses and some previous results. The main statistical goal was to test equality of means of balance-coordinates for the two defined populations. After checking normality, one-way ANOVA or Mann-Whitney tests were carried out for the inter-group balances.
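The balance-coordinate testing workflow described above can be sketched as follows. The grouping of elements into a "toxic" and an "essential" set, the data and the two-group comparison are illustrative assumptions, not the paper's actual balances or measurements; the sketch only shows how a balance between two groups of parts is computed and then compared across populations with ANOVA or a Mann-Whitney test after a normality check.

```python
import numpy as np
from scipy import stats

def balance(parts_num, parts_den):
    """Isometric log-ratio balance between two groups of compositional parts.
    parts_num, parts_den: arrays of shape (n_samples, r) and (n_samples, s)."""
    r, s = parts_num.shape[1], parts_den.shape[1]
    gm_num = np.exp(np.mean(np.log(parts_num), axis=1))
    gm_den = np.exp(np.mean(np.log(parts_den), axis=1))
    return np.sqrt(r * s / (r + s)) * np.log(gm_num / gm_den)

# Illustrative data: a "toxic" group (e.g. Pb, Cd, Hg) versus an "essential" group
# (e.g. Ca, Fe, Zn) in two populations; the concentration values are made up.
rng = np.random.default_rng(2)
polluted = rng.lognormal(mean=[1.0, 0.5, 0.2, 3.0, 2.5, 2.0], sigma=0.3, size=(40, 6))
control = rng.lognormal(mean=[0.5, 0.2, 0.0, 3.0, 2.5, 2.0], sigma=0.3, size=(40, 6))

b_pol = balance(polluted[:, :3], polluted[:, 3:])
b_con = balance(control[:, :3], control[:, 3:])

# Normality check, then parametric or non-parametric comparison of the two groups
if stats.shapiro(b_pol).pvalue > 0.05 and stats.shapiro(b_con).pvalue > 0.05:
    print("one-way ANOVA:", stats.f_oneway(b_pol, b_con))
else:
    print("Mann-Whitney U:", stats.mannwhitneyu(b_pol, b_con))
```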
Abstract:
Ranald Roderick Macdonald (1945-2007) was an important contributor to mathematical psychology in the UK, as a referee and action editor for British Journal of Mathematical and Statistical Psychology and as a participant and organizer at the British Psychological Society's Mathematics, statistics and computing section meetings. This appreciation argues that his most important contribution was to the foundations of significance testing, where his concern about what information was relevant in interpreting the results of significance tests led him to be a persuasive advocate for the 'Weak Fisherian' form of hypothesis testing.
Abstract:
Equivalence testing is growing in use in scientific research outside of its traditional role in the drug approval process. Largely due to its ease of use and its recommendation in United States Food and Drug Administration guidance, the most common statistical method for testing (bio)equivalence is the two one-sided tests procedure (TOST). Like classical point-null hypothesis testing, TOST is subject to multiplicity concerns as more comparisons are made. In this manuscript, a condition that bounds the family-wise error rate (FWER) using TOST is given. This condition then leads to a simple solution for controlling the FWER. Specifically, we demonstrate that if all pairwise comparisons of k independent groups are being evaluated for equivalence, then simply scaling the nominal Type I error rate down by (k - 1) is sufficient to maintain the family-wise error rate at the desired value or less. The resulting rule is much less conservative than the equally simple Bonferroni correction. An example of equivalence testing in a non-drug-development setting is given.
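A minimal sketch of the abstract's adjustment, assuming Welch-type TOST for differences in means: for all pairwise comparisons among k groups, each TOST is carried out at level alpha/(k - 1) rather than alpha. The group data, the equivalence margin delta and the use of t-tests are illustrative assumptions.

```python
import numpy as np
from itertools import combinations
from scipy import stats

def tost_pvalue(x, y, delta):
    """Two one-sided tests (TOST) p-value for equivalence of two independent
    group means within +/- delta, using Welch t-tests on shifted data."""
    p_lower = stats.ttest_ind(x, y - delta, equal_var=False, alternative="greater").pvalue
    p_upper = stats.ttest_ind(x, y + delta, equal_var=False, alternative="less").pvalue
    return max(p_lower, p_upper)

# Illustrative example: k = 4 groups, all pairwise equivalence comparisons.
# Per the abstract's rule, the nominal level alpha is divided by (k - 1).
rng = np.random.default_rng(3)
k, alpha, delta = 4, 0.05, 1.0
groups = [rng.normal(loc=10.0, scale=1.0, size=30) for _ in range(k)]
alpha_adj = alpha / (k - 1)

for (i, gi), (j, gj) in combinations(enumerate(groups), 2):
    p = tost_pvalue(gi, gj, delta)
    print(f"groups {i}-{j}: TOST p = {p:.3f}, equivalent at FWER-adjusted level: {p < alpha_adj}")
```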
Abstract:
Testing for simultaneous vicariance across comparative phylogeographic data sets is a notoriously difficult problem hindered by mutational variance, coalescent variance, and variability across pairs of sister taxa in parameters that affect genetic divergence. We simulate vicariance to characterize the behaviour of several commonly used summary statistics across a range of divergence times, and to characterize this behaviour in comparative phylogeographic data sets having multiple taxon-pairs. We found Tajima's D to be relatively uncorrelated with other summary statistics across divergence times, and using simple hypothesis testing of simultaneous vicariance given variable population sizes, we counter-intuitively found that the variance across taxon pairs in Nei and Li's net nucleotide divergence (pi(net)), a common measure of population divergence, is often inferior to using the variance in Tajima's D across taxon pairs as a test statistic to distinguish ancient simultaneous vicariance from variable vicariance histories. The opposite and more intuitive pattern is found for testing more recent simultaneous vicariance, and overall we found that, depending on the timing of vicariance, one of these two test statistics can achieve high statistical power for rejecting simultaneous vicariance, given a reasonable number of intron loci (> 5 loci, 400 bp) and a range of conditions. These results suggest that components of these two composite summary statistics should be used in future simulation-based methods which can simultaneously use a pool of summary statistics to test the comparative phylogeographic hypotheses we consider here.
Abstract:
Citation information: Armstrong RA, Davies LN, Dunne MCM & Gilmartin B. Statistical guidelines for clinical studies of human vision. Ophthalmic Physiol Opt 2011, 31, 123-136. doi: 10.1111/j.1475-1313.2010.00815.x ABSTRACT: Statistical analysis of data can be complex, and different statisticians may disagree as to the correct approach, leading to conflict between authors, editors, and reviewers. The objective of this article is to provide some statistical advice for contributors to optometric and ophthalmic journals, to provide advice specifically relevant to clinical studies of human vision, and to recommend statistical analyses that could be used in a variety of circumstances. In submitting an article in which quantitative data are reported, authors should describe clearly the statistical procedures that they have used and justify each stage of the analysis. This is especially important if more complex or 'non-standard' analyses have been carried out. The article begins with some general comments relating to data analysis concerning sample size and 'power', hypothesis testing, parametric and non-parametric variables, 'bootstrap methods', one- and two-tail testing, and the Bonferroni correction. More specific advice is then given with reference to particular statistical procedures that can be used on a variety of types of data. Where relevant, examples of correct statistical practice are given with reference to recently published articles in the optometric and ophthalmic literature.
Abstract:
The rising problems associated with construction, such as decreasing quality and productivity, labour shortages, occupational safety, and inferior working conditions, have opened the possibility of more revolutionary solutions within the industry. One prospective option is the implementation of innovative technologies such as automation and robotics, which have the potential to improve the industry in terms of productivity, safety and quality. The construction work site could, theoretically, be contained in a safer environment, with more efficient execution of the work, greater consistency of the outcome and a higher level of control over the production process. By identifying the barriers to construction automation and robotics implementation in construction, and investigating ways in which to overcome them, contributions could be made in terms of better understanding and facilitating, where relevant, greater use of these technologies in the construction industry so as to promote its efficiency. This research aims to ascertain and explain the barriers to construction automation and robotics implementation by exploring and establishing the relationship of characteristics of the construction industry and attributes of existing construction automation and robotics technologies to the level of usage and implementation in three selected countries: Japan, Australia and Malaysia. These three countries were chosen as their construction industry characteristics provide contrast in terms of culture, gross domestic product, technology application, organisational structure and labour policies. This research uses a mixed-method approach to gathering data, both quantitative and qualitative, by employing a questionnaire survey and an interview schedule, using a wide-ranging sample from management through to on-site users, working in companies ranging from small (less than AUD 0.2 million) to large (more than AUD 500 million), and involved in a broad range of business types and construction sectors. Detailed quantitative (statistical) and qualitative (content) data analysis is performed to provide a set of descriptions, relationships, and differences. The statistical tests selected for use include cross-tabulations and bivariate and multivariate analysis for investigating possible relationships between variables, and Kruskal-Wallis and Mann-Whitney U tests of independent samples for hypothesis testing and for inferring from the research sample to the construction industry population. Findings and conclusions arising from the research work, which include the ranking schemes produced for four key areas (construction attributes affecting level of usage; barrier variables; differing levels of usage between countries; and future trends), have established a number of potential areas that could impact the level of implementation both globally and for individual countries.
Abstract:
Analytical expressions are derived for the mean and variance of estimates of the bispectrum of a real time series, assuming a cosinusoidal model. The effects of spectral leakage, inherent in the discrete Fourier transform operation when the modes present in the signal have a nonintegral number of wavelengths in the record, are included in the analysis. A single phase-coupled triad of modes can cause the bispectrum to have a nonzero mean value over the entire region of computation owing to leakage. The variance of bispectral estimates in the presence of leakage has contributions from individual modes and from triads of phase-coupled modes. Time-domain windowing reduces the leakage. The theoretical expressions for the mean and variance of bispectral estimates are derived in terms of a function dependent on an arbitrary symmetric time-domain window applied to the record, the number of data points, and the statistics of the phase coupling among triads of modes. The theoretical results are verified by numerical simulations for simple test cases and applied to laboratory data to examine phase coupling in a hypothesis testing framework.
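A minimal direct (FFT-based) bispectrum estimator illustrates the quantity whose mean and variance the paper analyses; it is not the paper's analytical derivation. The segment length, number of segments, Hann window and phase-coupled test signal are illustrative assumptions.

```python
import numpy as np

def bispectrum(x, nseg, window=np.hanning):
    """Direct (FFT-based) bispectrum estimate of a real time series, averaged over
    nseg non-overlapping windowed segments: B(k, l) = < X(k) X(l) conj(X(k + l)) >."""
    seglen = len(x) // nseg
    w = window(seglen)
    nf = seglen // 2
    B = np.zeros((nf, nf), dtype=complex)
    for s in range(nseg):
        X = np.fft.fft(w * x[s * seglen:(s + 1) * seglen])
        for k in range(nf):
            for l in range(nf - k):
                B[k, l] += X[k] * X[l] * np.conj(X[k + l])
    return B / nseg

# Toy test case: a phase-coupled triad f3 = f1 + f2 with phi3 = phi1 + phi2, plus
# noise; the bispectrum magnitude should peak near (f1, f2).
rng = np.random.default_rng(4)
N, nseg = 4096, 16
seglen = N // nseg
f1, f2 = 12 / seglen, 30 / seglen           # integral numbers of wavelengths per segment
t = np.arange(N)
phi1, phi2 = rng.uniform(0, 2 * np.pi, 2)
x = (np.cos(2 * np.pi * f1 * t + phi1) + np.cos(2 * np.pi * f2 * t + phi2)
     + np.cos(2 * np.pi * (f1 + f2) * t + phi1 + phi2) + 0.5 * rng.normal(size=N))

B = bispectrum(x, nseg)
k1, k2 = np.unravel_index(np.argmax(np.abs(B)), B.shape)
print("peak at normalized frequencies:", k1 / seglen, k2 / seglen, "expected:", f1, f2)
```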
Abstract:
The dynamic capabilities view (DCV) focuses on renewal of firms' strategic knowledge resources so as to sustain competitive advantage within turbulent markets. Within the context of the DCV, the focus of knowledge management (KM) is to develop the knowledge management capability (KMC) by deploying knowledge governance mechanisms that are conducive to facilitating knowledge processes so as to produce superior business performance over time. The essence of KM performance evaluation is to assess how well the KMC is configured with knowledge governance mechanisms and processes that enable a firm to achieve superior performance through matching its knowledge base with market needs. However, little research has been undertaken to evaluate KM performance from the DCV perspective. This study employed a survey study design and adopted hypothesis-testing approaches to develop a capability-based KM evaluation framework (CKMEF) that upholds the basic assertions of the DCV. Under the governance of the framework, a KM index (KMI) and a KM maturity model (KMMM) were derived not only to indicate the extent to which a firm's KM implementations fulfill its strategic objectives and to identify the evolutionary phase of its KMC, but also to benchmark the KMC in the research population. The research design ensured that the evaluation framework and instruments have statistical significance and good generalizability to be applied in the research population, namely construction firms operating in the dynamic Hong Kong construction market. The study demonstrated the feasibility of quantitatively evaluating the development of the KMC and revealing the performance heterogeneity associated with that development.
Abstract:
This paper presents the application of a statistical method for model structure selection of lift-drag and viscous damping components in ship manoeuvring models. The damping model is posed as a family of linear stochastic models, which is postulated based on previous work in the literature. A nested hypothesis-testing problem is then considered. The testing reduces to a recursive comparison of two competing models, for which optimal tests in the Neyman sense exist. The method yields a preferred model structure and its initial parameter estimates. Alternatively, the method can give a reduced set of likely models. Using simulated data, we study how the selection method performs when there is both uncorrelated and correlated noise in the measurements. The first case is related to instrumentation noise, whereas the second case is related to spurious wave-induced motion often present during sea trials. We then consider the model structure selection of a modern high-speed trimaran ferry from full-scale trial data.
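The recursive comparison of nested candidate models can be sketched with an ordinary nested F-test on least-squares fits. The candidate damping terms, noise model and significance level below are illustrative assumptions, not the paper's regressor family or its Neyman-optimal test.

```python
import numpy as np
from scipy import stats

def nested_f_test(y, X_small, X_big):
    """F-test comparing two nested linear models fitted by least squares.
    Returns the p-value of H0: the extra regressors in X_big are not needed."""
    rss = lambda X: np.sum((y - X @ np.linalg.lstsq(X, y, rcond=None)[0]) ** 2)
    n, p_small, p_big = len(y), X_small.shape[1], X_big.shape[1]
    F = ((rss(X_small) - rss(X_big)) / (p_big - p_small)) / (rss(X_big) / (n - p_big))
    return 1.0 - stats.f.cdf(F, p_big - p_small, n - p_big)

# Illustrative damping-model selection (not the paper's actual regressor family):
# candidate terms v, v|v|, v^3 tested recursively against simulated "measurements".
rng = np.random.default_rng(5)
v = rng.uniform(-2, 2, size=300)
y = 1.5 * v + 0.8 * v * np.abs(v) + 0.1 * rng.normal(size=300)   # true model: two terms

candidates = [v, v * np.abs(v), v ** 3]
selected = np.empty((len(v), 0))
for term in candidates:
    bigger = np.column_stack([selected, term]) if selected.size else term[:, None]
    p = nested_f_test(y, selected, bigger) if selected.size else 0.0
    if p < 0.05:              # extra term significantly reduces residual variance
        selected = bigger
    else:
        break
print("selected number of damping terms:", selected.shape[1])
```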
Abstract:
Integer ambiguity resolution is an indispensable procedure for all high-precision GNSS applications. The correctness of the estimated integer ambiguities is the key to achieving highly reliable positioning, but the solution cannot be validated with classical hypothesis testing methods. The integer aperture estimation theory unifies all existing ambiguity validation tests and provides a new perspective from which to review existing methods, which enables us to have a better understanding of the ambiguity validation problem. This contribution analyses two simple but efficient ambiguity validation tests, the ratio test and the difference test, from three aspects: acceptance region, probability basis and numerical results. The major contributions of this paper can be summarized as follows: (1) The ratio test acceptance region is an overlap of ellipsoids, while the difference test acceptance region is an overlap of half-spaces. (2) The probability basis of these two popular tests is analyzed for the first time. The difference test is an approximation to optimal integer aperture estimation, while the ratio test follows an exponential relationship in probability. (3) The limitations of the two tests are identified for the first time. The two tests may under-evaluate the failure risk if the model is not strong enough or the float ambiguities fall in a particular region. (4) Extensive numerical results are used to compare the performance of these two tests. The simulation results show that the ratio test outperforms the difference test in some models, while the difference test performs better in others. In particular, in the medium-baseline kinematic model, the difference test outperforms the ratio test; this superiority is independent of frequency number, observation noise and satellite geometry, while it depends on the success rate and the failure rate tolerance. A smaller failure rate leads to a larger performance discrepancy.
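A minimal numerical sketch of the two acceptance tests discussed: with q1 and q2 the squared distances (in the metric of the inverse ambiguity covariance) from the float solution to the best and second-best integer candidates, the ratio test accepts fixing when q2/q1 exceeds a critical value and the difference test when q2 - q1 does. The brute-force candidate search, thresholds, float solution and covariance below are illustrative assumptions; real GNSS software uses LAMBDA-type searches and calibrated thresholds.

```python
import numpy as np
from itertools import product

def ambiguity_tests(a_float, Q, c_ratio=2.0, d_diff=12.0):
    """Validate the best integer ambiguity candidate with the ratio test and the
    difference test. q_i is the squared distance (in the metric of Q^{-1}) of the
    float solution to the i-th best integer candidate."""
    Qinv = np.linalg.inv(Q)
    # Brute-force integer search in a small neighbourhood of the rounded float
    # solution (adequate for this low-dimensional toy example only).
    base = np.round(a_float)
    cands = [base + np.array(d) for d in product([-1, 0, 1], repeat=len(a_float))]
    q = sorted(float((a_float - z) @ Qinv @ (a_float - z)) for z in cands)
    q1, q2 = q[0], q[1]
    return {"ratio_test_pass": q2 / q1 >= c_ratio,
            "difference_test_pass": q2 - q1 >= d_diff,
            "q1": q1, "q2": q2}

# Illustrative 3-dimensional example with made-up float solution and covariance.
Q = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.05, 0.01],
              [0.00, 0.01, 0.06]])
a_float = np.array([3.1, -2.05, 7.9])
print(ambiguity_tests(a_float, Q))
```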