19 results for Statistics - Analysis
in Aston University Research Archive
Abstract:
This book is aimed primarily at microbiologists who are undertaking research and who require a basic knowledge of statistics to analyse their experimental data. Computer software employing a wide range of data analysis methods is widely available to experimental scientists, and the availability of this software makes it essential that investigators understand the basic principles of statistics. Statistical analysis of data can be complex, with many possible methods of approach, each of which applies in a particular experimental circumstance; it is therefore possible to apply an incorrect statistical method to data and to draw the wrong conclusions from an experiment. The purpose of this book, which has its origin in a series of articles published in the Society for Applied Microbiology journal ‘The Microbiologist’, is to present the basic logic of statistics as clearly as possible and thereby to dispel some of the myths that often surround the subject. The 28 ‘Statnotes’ deal with various topics that are likely to be encountered, including the nature of variables, the comparison of means of two or more groups, non-parametric statistics, analysis of variance, correlating variables, and more complex methods such as multiple linear regression and principal components analysis. In each case, the relevant statistical method is illustrated with examples drawn from experiments in microbiological research. The text incorporates a glossary of the most commonly used statistical terms, and there are two appendices designed to aid the investigator in the selection of the most appropriate test.
Abstract:
The accurate in silico identification of T-cell epitopes is a critical step in the development of peptide-based vaccines, reagents, and diagnostics, and it has a direct impact on the success of subsequent experimental work. Epitopes arise as a consequence of complex proteolytic processing within the cell. Prior to being recognized by T cells, an epitope is presented on the cell surface as a complex with a major histocompatibility complex (MHC) protein. A prerequisite for T-cell recognition is therefore that an epitope is also a good MHC binder, so T-cell epitope prediction overlaps strongly with the prediction of MHC binding. In the present study, we compare discriminant analysis and multiple linear regression as algorithmic engines for the definition of quantitative matrices for binding affinity prediction. We apply these methods to peptides which bind the well-studied human MHC allele HLA-A*0201. A matrix combining the results of the two methods proved powerfully predictive under cross-validation. The new matrix was also tested on an external set of 160 binders to HLA-A*0201; it was able to recognize 135 (84%) of them.
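As a hedged illustration of the regression half of this approach, the sketch below fits a position-specific scoring matrix to one-hot-encoded 9-mer peptides by ordinary least squares; the peptide set, affinities, and resulting matrix are synthetic stand-ins, not the study's HLA-A*0201 data or coefficients.

```python
# Hypothetical sketch: quantitative scoring matrix via multiple linear
# regression on one-hot-encoded 9-mers (synthetic data throughout).
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
PEPTIDE_LEN = 9

def one_hot(peptide):
    """Encode a 9-mer as a flat 9 x 20 indicator vector."""
    x = np.zeros(PEPTIDE_LEN * len(AMINO_ACIDS))
    for pos, aa in enumerate(peptide):
        x[pos * len(AMINO_ACIDS) + AA_INDEX[aa]] = 1.0
    return x

rng = np.random.default_rng(0)
peptides = ["".join(rng.choice(list(AMINO_ACIDS), PEPTIDE_LEN))
            for _ in range(200)]
affinities = rng.normal(size=len(peptides))  # stand-in log-affinities

X = np.array([one_hot(p) for p in peptides])
coef, *_ = np.linalg.lstsq(X, affinities, rcond=None)
matrix = coef.reshape(PEPTIDE_LEN, len(AMINO_ACIDS))  # 9 x 20 matrix

def predict(peptide):
    """Predicted score: sum of per-position residue contributions."""
    return sum(matrix[pos, AA_INDEX[aa]] for pos, aa in enumerate(peptide))

print(f"predicted score for {peptides[0]}: {predict(peptides[0]):.3f}")
```

The additive per-position scoring is what makes such matrices fast enough to scan whole proteomes; a discriminant-analysis engine would replace the least-squares fit with a class-separation criterion over binders and non-binders.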
Abstract:
Loss of coherence with increasing excitation amplitudes and spatial size modulation is a fundamental problem in designing Raman fiber lasers. While it is known that ramping up laser pump power increases the amplitude of stochastic excitations, such higher energy inputs can also lead to a transition from a linearly stable, coherent laminar regime to an undesirable disordered turbulent state. This report presents a new statistical methodology, based on first-passage statistics, that classifies lasing regimes in Raman fiber lasers, enabling fast and highly accurate identification of the strong instability underlying the laminar-turbulent phase transition through a self-consistently defined order parameter. The results are consistent across a wide range of pump power values, heralding a breakthrough in the non-invasive analysis of fiber laser dynamics.
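The following is a minimal sketch of first-passage statistics on an intensity trace; the synthetic data, threshold choice, and coefficient-of-variation order parameter are illustrative assumptions, not the report's self-consistent definition.

```python
# Illustrative first-passage analysis of a (synthetic) intensity trace.
import numpy as np

rng = np.random.default_rng(1)
intensity = 1.0 + 0.1 * rng.standard_normal(100_000)  # stand-in trace

threshold = intensity.mean() + intensity.std()

# First-passage times: gaps between successive upward threshold crossings.
above = intensity > threshold
crossings = np.flatnonzero(~above[:-1] & above[1:])
passage_times = np.diff(crossings)

# One candidate order parameter: the coefficient of variation of the
# passage-time distribution; laminar and turbulent regimes would then
# separate by its value.
cv = passage_times.std() / passage_times.mean()
print(f"mean passage time: {passage_times.mean():.1f} samples, CV: {cv:.2f}")
```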
Abstract:
Since the original Data Envelopment Analysis (DEA) study by Charnes et al. [Measuring the efficiency of decision-making units. European Journal of Operational Research 1978;2(6):429–44], there has been rapid and continuous growth in the field. As a result, a considerable amount of published research has appeared, with a significant portion focused on DEA applications of efficiency and productivity in both public and private sector activities. While several bibliographic collections have been reported, a comprehensive listing and analysis of DEA research covering its first 30 years of history is not available. This paper thus presents an extensive, if not nearly complete, listing of DEA research covering theoretical developments as well as “real-world” applications from inception to the year 2007. A listing of the most utilized/relevant journals, a keyword analysis, and selected statistics are presented.
Abstract:
This accessible, practice-oriented and compact text provides a hands-on introduction to the principles of market research. Using the market research process as a framework, the authors explain how to collect and describe the necessary data and present the most important and frequently used quantitative analysis techniques, such as ANOVA, regression analysis, factor analysis, and cluster analysis. An explanation is provided of the theoretical choices a market researcher has to make with regard to each technique, as well as how these are translated into actions in IBM SPSS Statistics. This includes a discussion of what the outputs mean and how they should be interpreted from a market research perspective. Each chapter concludes with a case study that illustrates the process based on real-world data. A comprehensive web appendix includes additional analysis techniques, datasets, video files and case studies. Several mobile tags in the text allow readers to quickly browse related web content using a mobile device.
Abstract:
In some applications of data envelopment analysis (DEA) there may be doubt as to whether all the DMUs form a single group with a common efficiency distribution. The Mann-Whitney rank statistic has been used both to evaluate whether two groups of DMUs come from a common efficiency distribution, under the assumption that they share a common frontier, and to test whether the two groups have a common frontier. These procedures have subsequently been extended using the Kruskal-Wallis rank statistic to consider more than two groups. This technical note identifies problems with the second of these applications of both the Mann-Whitney and Kruskal-Wallis rank statistics. It also considers possible alternative methods of testing whether groups have a common frontier, and the difficulties of disaggregating managerial and programmatic efficiency within a non-parametric framework. © 2007 Springer Science+Business Media, LLC.
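For orientation, a minimal sketch of the two rank tests discussed in the note is given below; the efficiency scores are synthetic stand-ins (a real study would first compute them with a DEA model), and the note's caveats concern precisely how such tests are applied to frontier comparisons.

```python
# Rank tests on (synthetic) DEA efficiency scores for groups of DMUs.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
group_a = rng.beta(5, 2, size=30)   # stand-in efficiency scores in (0, 1)
group_b = rng.beta(4, 2, size=25)
group_c = rng.beta(5, 3, size=28)

# Two groups: Mann-Whitney U test of a common efficiency distribution.
u, p_two = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
print(f"Mann-Whitney U = {u:.1f}, p = {p_two:.3f}")

# More than two groups: the Kruskal-Wallis extension.
h, p_k = stats.kruskal(group_a, group_b, group_c)
print(f"Kruskal-Wallis H = {h:.2f}, p = {p_k:.3f}")
```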
Abstract:
Different types of numerical data can be collected in a scientific investigation and the choice of statistical analysis will often depend on the distribution of the data. A basic distinction between variables is whether they are ‘parametric’ or ‘non-parametric’. When a variable is parametric, the data come from a symmetrically shaped distribution known as the ‘Gaussian’ or ‘normal distribution’ whereas non-parametric variables may have a distribution which deviates markedly in shape from normal. This article describes several aspects of the problem of non-normality including: (1) how to test for two common types of deviation from a normal distribution, viz., ‘skew’ and ‘kurtosis’, (2) how to fit the normal distribution to a sample of data, (3) the transformation of non-normally distributed data and scores, and (4) commonly used ‘non-parametric’ statistics which can be used in a variety of circumstances.
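A minimal sketch of these checks, using SciPy's standard tests for skew and kurtosis on a synthetic right-skewed sample, might look as follows; the log transformation is one common choice among several.

```python
# Normality checks: tests for skew and kurtosis, a fitted normal, and a
# log transformation of skewed (synthetic) data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
data = rng.lognormal(mean=0.0, sigma=0.5, size=200)  # right-skewed sample

print("skewtest:", stats.skewtest(data))          # H0: skew of a normal
print("kurtosistest:", stats.kurtosistest(data))  # H0: kurtosis of a normal

# Fit a normal distribution (maximum-likelihood mean and SD) to the sample.
mu, sd = stats.norm.fit(data)
print(f"fitted normal: mean = {mu:.2f}, sd = {sd:.2f}")

# A log transformation often restores approximate normality for skewed data.
transformed = np.log(data)
print("after log transform:", stats.skewtest(transformed))
```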
Abstract:
We investigate the feasibility of simultaneously suppressing amplification noise and nonlinearity, the most fundamental limiting factors in modern optical communication. To accomplish this task we developed a general design optimisation technique based on the concepts of noise and nonlinearity management. We demonstrate the efficiency of the new approach by applying it to the design optimisation of transmission lines with periodic dispersion compensation using Raman and hybrid Raman-EDFA amplification. Moreover, we showed, using nonlinearity management considerations, that the optimal performance in high bit-rate dispersion-managed fibre systems with hybrid amplification is achieved for a certain amplifier spacing, which differs from the commonly known optimum for noise performance corresponding to fully distributed amplification. Complete knowledge of the signal statistics, required for an accurate estimation of the bit error rate (BER), is crucial for modern transmission links with strong inherent nonlinearity. We therefore implemented the advanced multicanonical Monte Carlo (MMC) method, acknowledged for its efficiency in estimating distribution tails. We have accurately computed marginal probability density functions for soliton parameters by numerical modelling of the Fokker-Planck equation using the MMC simulation technique. Moreover, applying the MMC method we have studied the BER penalty caused by deviations from the optimal decision level in systems employing in-line 2R optical regeneration. We have demonstrated that in such systems an analytical linear approximation that better fits the central part of the regenerator's nonlinear transfer function produces a more accurate approximation of the BER and BER penalty. Finally, we present a statistical analysis of the RZ-DPSK optical signal at a direct-detection receiver with Mach-Zehnder interferometer demodulation.
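As a rough illustration of the multicanonical idea, the sketch below iteratively flattens the sampled histogram of a scalar observable so that the Metropolis walker reaches rare tails; the Gaussian toy model, bin layout, and update rule are assumptions for illustration, not the thesis's soliton or Fokker-Planck computation.

```python
# Multicanonical-style tail estimation on a toy model: reweight a
# Metropolis walk so rare values of the observable are visited often,
# then unbias the histogram to recover the (relative) PDF.
import numpy as np

rng = np.random.default_rng(4)
DIM, N_BINS, LO, HI = 8, 60, -15.0, 15.0
edges = np.linspace(LO, HI, N_BINS + 1)

def log_p(x):                      # log-density of the underlying state
    return -0.5 * np.sum(x**2)

def observable(x):                 # scalar whose tail PDF we want
    return float(np.sum(x))

def bin_of(e):
    return int(np.clip(np.searchsorted(edges, e) - 1, 0, N_BINS - 1))

log_w = np.zeros(N_BINS)           # multicanonical log-weights
log_pdf = np.full(N_BINS, -np.inf)
x, e = np.zeros(DIM), 0.0

for it in range(8):
    hist = np.zeros(N_BINS)
    for _ in range(20_000):
        x_new = x + 0.5 * rng.standard_normal(DIM)
        e_new = observable(x_new)
        # Metropolis step targeting p(x) * exp(log_w[bin(E)]).
        log_a = (log_p(x_new) - log_p(x)
                 + log_w[bin_of(e_new)] - log_w[bin_of(e)])
        if np.log(rng.random()) < log_a:
            x, e = x_new, e_new
        hist[bin_of(e)] += 1
    seen = hist > 0
    # Unbias counts by the weights, then set new weights to flatten.
    log_pdf[seen] = np.log(hist[seen]) - log_w[seen]
    log_w[seen] = -log_pdf[seen]

pdf = np.exp(log_pdf - log_pdf.max())  # relative (unnormalised) PDF
print(f"relative density near E = 12: {pdf[bin_of(12.0)]:.2e}")
```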
Abstract:
Analyzing geographical patterns by collocating events, objects or their attributes has a long history in surveillance and monitoring, and is particularly applied in environmental contexts such as ecology or epidemiology. The identification of patterns or structures at some scales can be addressed using spatial statistics, particularly marked point process methodologies. Classification and regression trees are also related to this goal of finding "patterns" by deducing the hierarchy of influence of variables on a dependent outcome. Such variable selection methods have been applied to spatial data, but often without explicitly acknowledging the spatial dependence. Many methods routinely used in exploratory point pattern analysis are 2nd-order statistics, used in a univariate context, though there is also a wide literature on modelling methods for multivariate point pattern processes. This paper proposes an exploratory approach for multivariate spatial data using higher-order statistics built from co-occurrences of events or marks given by the point processes. A spatial entropy measure, derived from these multinomial distributions of co-occurrences at a given order, constitutes the basis of the proposed exploratory methods. © 2010 Elsevier Ltd.
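A minimal sketch of an order-2 version of such a measure follows: count the mark pairs co-occurring among neighbouring points and take the Shannon entropy of the resulting multinomial distribution. The radius, marks, and estimator details here are illustrative assumptions, not the paper's exact construction.

```python
# Order-2 spatial entropy from mark co-occurrences in a (synthetic)
# marked point pattern.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(5)
coords = rng.uniform(0, 1, size=(500, 2))        # point locations
marks = rng.choice(["A", "B", "C"], size=500)    # categorical marks

RADIUS = 0.05
pairs = cKDTree(coords).query_pairs(RADIUS)       # neighbouring pairs

# Multinomial counts over unordered mark pairs (order-2 co-occurrences).
counts = {}
for i, j in pairs:
    key = tuple(sorted((marks[i], marks[j])))
    counts[key] = counts.get(key, 0) + 1

probs = np.array(list(counts.values()), dtype=float)
probs /= probs.sum()
entropy = -np.sum(probs * np.log(probs))
print(f"order-2 spatial entropy at r = {RADIUS}: {entropy:.3f} nats")
```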
Abstract:
There is an alternative model of the 1-way ANOVA called the ‘random effects’ model or ‘nested’ design, in which the objective is not to test specific effects but to estimate the degree of variation of a particular measurement and to compare different sources of variation that influence the measurement in space and/or time. The most important statistics from a random effects model are the components of variance, which estimate the variance associated with each of the sources of variation influencing a measurement. The nested design is particularly useful in preliminary experiments designed to estimate different sources of variation and in the planning of appropriate sampling strategies.
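A minimal sketch of the variance-component calculation for a balanced one-way random-effects design, on synthetic data, might look as follows.

```python
# Components of variance from a balanced one-way random-effects ANOVA:
# k batches, n replicate measurements per batch (synthetic data).
import numpy as np

rng = np.random.default_rng(6)
k, n = 8, 5
batch_means = rng.normal(10.0, 2.0, size=k)                  # between-batch
data = batch_means[:, None] + rng.normal(0.0, 1.0, (k, n))   # within-batch

grand = data.mean()
ms_between = n * np.sum((data.mean(axis=1) - grand) ** 2) / (k - 1)
ms_within = np.sum((data - data.mean(axis=1, keepdims=True)) ** 2) / (k * (n - 1))

# Expected mean squares: E[MS_between] = sigma_w^2 + n * sigma_b^2.
var_within = ms_within
var_between = max((ms_between - ms_within) / n, 0.0)
print(f"within-batch variance:  {var_within:.2f}")
print(f"between-batch variance: {var_between:.2f}")
```

Comparing the two components directly shows which source of variation dominates, which is exactly the information needed to plan how many batches versus replicates to sample.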
Abstract:
This article explains, first, the reasons why a knowledge of statistics is necessary and describes the role that statistics plays in an experimental investigation. Second, the normal distribution, which describes the natural variability shown by many measurements in optometry and vision sciences, is introduced. Third, the application of the normal distribution to some common statistical problems is described, including how to determine whether an individual observation is a typical member of a population and how to determine the confidence interval for a sample mean.
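By way of illustration (with invented numbers, not the article's data), the two problems can be sketched as follows.

```python
# Two classic uses of the normal distribution, with made-up numbers.
import numpy as np
from scipy import stats

# Is an observation of 26 typical of a population with mean 16, SD 3?
z = (26 - 16) / 3
p_beyond = 2 * stats.norm.sf(abs(z))   # two-tailed tail probability
print(f"z = {z:.2f}, P(|Z| >= z) = {p_beyond:.4f}")

# 95% confidence interval for a sample mean, via the t distribution.
rng = np.random.default_rng(7)
sample = rng.normal(16, 3, size=20)
mean, sem = sample.mean(), stats.sem(sample)
lo, hi = stats.t.interval(0.95, len(sample) - 1, loc=mean, scale=sem)
print(f"mean = {mean:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```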
Abstract:
In this second article, statistical ideas are extended to the problem of testing whether there is a true difference between two samples of measurements. First, it will be shown that the difference between the means of two samples comes from a population of such differences which is normally distributed. Second, the ‘t’ distribution, one of the most important in statistics, will be applied to a test of the difference between two means using a simple data set drawn from a clinical experiment in optometry. Third, in making a t-test, a statistical judgement is made as to whether there is a significant difference between the means of the two samples. Before the widespread use of statistical software, this judgement was made with reference to a statistical table; even if such tables are no longer used, it is useful to understand their logical structure and how to use them. Finally, the analysis of data known to depart significantly from the normal distribution will be described.
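A minimal sketch of the test described, with synthetic measurements standing in for the clinical data set, might look as follows; the rank-based alternative at the end covers the non-normal case.

```python
# Two-sample t-test, the table's critical value, and a non-parametric
# alternative (synthetic measurements).
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
group_1 = rng.normal(12.0, 2.0, size=15)   # e.g., treated eyes
group_2 = rng.normal(10.5, 2.0, size=15)   # e.g., control eyes

t, p = stats.ttest_ind(group_1, group_2)   # assumes equal variances
print(f"t = {t:.2f}, p = {p:.3f}")

# Before software, t was compared against a table's critical value:
crit = stats.t.ppf(0.975, df=len(group_1) + len(group_2) - 2)
print(f"critical t at alpha = 0.05 (two-tailed): {crit:.2f}")

# For data departing markedly from normality, a rank-based test:
u, p_u = stats.mannwhitneyu(group_1, group_2, alternative="two-sided")
print(f"Mann-Whitney U = {u:.1f}, p = {p_u:.3f}")
```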
Abstract:
In a general introduction to the road-accident phenomenon inside and outside Iran, the results of previous research and of international conferences and seminars on road safety are reviewed, and a sample road between Tehran and Mashad is investigated as a case study. In examining road-accident data and information, first, the information presented in road-accident report forms in developed countries is discussed and, second, the procedures for road-accident data collection in Iran are investigated in detail. The data supplied by the Iran Road-Police Central Statistics Office are analysed, different rates are computed, comparisons with other nations are made, and the results are discussed. Such analyses and comparisons are also presented for the different provinces of Iran. It is concluded that each province, with its own natural, geographical, social and economic characteristics, has its own reasons for the nature and number of its road accidents and therefore requires its own appropriate remedial solutions. The questions of what road accidents cost, why and how that cost should be evaluated, and what the appropriate approach to such an evaluation is are all discussed, and the cost of road accidents in Iran is then computed using two different approaches, "gross national output" and "court award". It is concluded that this cost is about 1.5 per cent of the country's national product. Appendix 3 gives a striking example of the trend of costs and benefits that can be attributed to investment in road-safety measures.
Abstract:
Firstly, we numerically model a practical 20 Gb/s undersea configuration employing the Return-to-Zero Differential Phase Shift Keying (RZ-DPSK) data format. The modelling uses the Split-Step Fourier Method to solve the Generalised Nonlinear Schrödinger Equation. We optimise the dispersion map and per-channel launch power of these channels and investigate how the choice of pre/post compensation can influence the performance. After obtaining these optimal configurations, we investigate the Bit Error Rate (BER) estimation of these systems and find that estimation based on Gaussian statistics of the electrical current is appropriate for systems of this type, indicating quasi-linear behaviour. The introduction of narrower pulses due to the deployment of quasi-linear transmission decreases the tolerance to chromatic dispersion and intra-channel nonlinearity. We used tools from mathematical statistics to study the behaviour of these channels in order to develop new methods of estimating the BER. In the final section, we consider the estimation of Eye Closure Penalty, a popular measure of signal distortion. Using a numerical example and assuming the symmetry of eye closure, we show that Eye Closure Penalty can be estimated simply using Gaussian statistics. We also see that the statistics of the logical ones dominate the statistics of signal distortion in the case of Return-to-Zero On-Off Keying configurations.
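As a hedged illustration of the numerical core, a minimal split-step Fourier step for the dimensionless nonlinear Schrödinger equation is sketched below; the parameters and soliton input are textbook toys, and a realistic link model would add loss, amplification, noise, and WDM channels.

```python
# Minimal symmetric split-step Fourier solver for the dimensionless NLSE
#   i*u_z = (beta2/2)*u_tt - gamma*|u|^2*u
# (toy parameters; propagating a fundamental soliton as a sanity check).
import numpy as np

N, T_WIN = 1024, 40.0
t = np.linspace(-T_WIN / 2, T_WIN / 2, N, endpoint=False)
w = 2 * np.pi * np.fft.fftfreq(N, d=t[1] - t[0])   # angular frequencies

beta2, gamma = -1.0, 1.0      # anomalous dispersion, Kerr coefficient
dz, n_steps = 0.01, 500       # step size and number of steps

u = 1.0 / np.cosh(t)          # fundamental soliton input pulse

half_disp = np.exp(0.25j * beta2 * w**2 * dz)  # half-step of dispersion
for _ in range(n_steps):
    # Symmetric split step: D/2, full nonlinearity, D/2.
    u = np.fft.ifft(half_disp * np.fft.fft(u))
    u *= np.exp(1j * gamma * np.abs(u)**2 * dz)
    u = np.fft.ifft(half_disp * np.fft.fft(u))

# The soliton should emerge with its peak power essentially unchanged.
print(f"peak power in/out: 1.000 / {np.max(np.abs(u))**2:.3f}")
```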