9 results for statistical software
in Aston University Research Archive
Abstract:
The topic of this thesis is the development of knowledge-based statistical software. The shortcomings of conventional statistical packages are discussed to illustrate the need to develop software which is able to exhibit a greater degree of statistical expertise, thereby reducing the misuse of statistical methods by those not well versed in the art of statistical analysis. Some of the issues involved in the development of knowledge-based software are presented and a review is given of some of the systems that have been developed so far. The majority of these have moved away from conventional architectures by adopting what can be termed an expert systems approach. The thesis then proposes an approach which is based upon the concept of semantic modelling. By representing some of the semantic meaning of data, it is envisaged that a system could examine a request to apply a statistical technique and check whether the use of the chosen technique is semantically sound, i.e. whether the results obtained will be meaningful. Current systems, in contrast, can only perform what can be considered as syntactic checks. The prototype system that has been implemented to explore the feasibility of such an approach is presented; it has been designed as an enhanced variant of a conventional-style statistical package. This involved developing a semantic data model to represent some of the statistically relevant knowledge about data and identifying sets of requirements that should be met for the application of the statistical techniques to be valid. Those areas of statistics covered in the prototype are measures of association and tests of location.
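As a rough illustration of the kind of semantic check described in this abstract (not the thesis's actual implementation), the Python sketch below records a measurement scale for each variable and refuses a technique whose scale requirements are not met; the class, the requirement table, and all variable names are hypothetical.

```python
# Hypothetical sketch of a semantic validity check before applying a technique.
# The names (SemanticModel, REQUIREMENTS) are illustrative, not from the thesis.

REQUIREMENTS = {
    "pearson_correlation": {"x": {"interval", "ratio"}, "y": {"interval", "ratio"}},
    "t_test": {"x": {"interval", "ratio"}, "group": {"nominal"}},
}

class SemanticModel:
    def __init__(self):
        self.scales = {}          # variable name -> measurement scale

    def declare(self, variable, scale):
        self.scales[variable] = scale

    def check(self, technique, **roles):
        """Raise an error unless every variable's scale satisfies the technique's requirements."""
        required = REQUIREMENTS[technique]
        for role, variable in roles.items():
            if self.scales.get(variable) not in required[role]:
                raise ValueError(
                    f"{technique}: variable '{variable}' ({self.scales.get(variable)}) "
                    f"is not valid for role '{role}' (needs one of {required[role]})"
                )
        return True

model = SemanticModel()
model.declare("eye_colour", "nominal")
model.declare("reaction_time", "ratio")
model.check("t_test", x="reaction_time", group="eye_colour")      # semantically sound
# model.check("pearson_correlation", x="eye_colour", y="reaction_time")  # raises ValueError
```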
Abstract:
Statistical software is now commonly available to calculate power (P') and sample size (N) for most experimental designs. In many circumstances, however, sample size is constrained by lack of time, cost, and, in research involving human subjects, the problems of recruiting suitable individuals. In addition, the calculation of N is often based on erroneous assumptions about variability, and therefore such estimates are often inaccurate. At best, we would suggest that such calculations provide only a very rough guide to how to proceed in an experiment. Nevertheless, calculation of P' is very useful, especially in experiments that have failed to detect a difference which the experimenter thought was present. We would recommend that P' should always be calculated in these circumstances to determine whether the experiment was actually too small to test null hypotheses adequately.
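For readers unfamiliar with how such calculations look in practice, here is a minimal sketch using statsmodels' TTestIndPower for a two-sample design; the effect size, alpha, power, and group size values are arbitrary examples, not taken from the abstract.

```python
# Illustrative power and sample-size calculation for a two-sample t-test.
# Effect size, alpha, power and group sizes are arbitrary example values.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Sample size (N per group) needed to detect a medium effect (d = 0.5)
# with 80% power at the 5% significance level.
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(f"required n per group: {n_per_group:.1f}")

# Achieved power (P') of an experiment that actually used 15 subjects per group.
achieved_power = analysis.solve_power(effect_size=0.5, alpha=0.05, nobs1=15)
print(f"achieved power with n = 15: {achieved_power:.2f}")
```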
Abstract:
With the advent of globalisation, companies all around the world must improve their performance in order to survive. The threats are coming from everywhere, and in different ways, such as low cost products, high quality products, new technologies, and new products. Different companies in different countries are using various techniques and quality criteria items to strive for excellence. Continuous improvement techniques are used to enable companies to improve their operations. Therefore, companies are using techniques such as TQM, Kaizen, Six-Sigma, Lean Manufacturing, and quality award criteria items such as Customer Focus, Human Resources, Information & Analysis, and Process Management. The purpose of this paper is to compare the use of these techniques and criteria items in two countries, Mexico and the United Kingdom, which differ in culture and industrial structure. In terms of the use of continuous improvement tools and techniques, Mexico formally started to deal with continuous improvement by creating its National Quality Award soon after the Americans began the Malcolm Baldrige National Quality Award. The United Kingdom formally started by using the European Quality Award (EQA), modified and renamed as the EFQM Excellence Model. The methodology used in this study was to undertake a literature review of the subject matter and to study some general applications around the world. A questionnaire survey was then designed and a survey undertaken based on the same scale, about the same sample size, and about the same industrial sector within the two countries. The survey presents a brief definition of each of the constructs to facilitate understanding of the questions. The analysis of the data was then conducted with the assistance of a statistical software package. The survey results indicate both similarities and differences in the strengths and weaknesses of the companies in the two countries. One outcome of the analysis is that it enables the companies to use the results to benchmark themselves and thus act to reinforce their strengths and to reduce their weaknesses.
Abstract:
Growth in the availability and capability of modern statistical software has resulted in greater numbers of research techniques being applied across the marketing discipline. However, with such advances come concerns that techniques may be misinterpreted by researchers. This issue is critical, since misinterpretation could cause erroneous findings. This paper investigates some assumptions regarding: 1) the assessment of discriminant validity; and 2) what confirmatory factor analysis accomplishes. Examples that address these points are presented, and some procedural remedies are suggested based upon the literature. This paper is, therefore, primarily concerned with the development of measurement theory and practice. If advances in theory development are not based upon sound methodological practice, we as researchers could be basing our work upon shaky foundations.
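One procedural check commonly recommended in the measurement literature, and possibly among the remedies the paper has in mind, is the Fornell-Larcker criterion for discriminant validity: each construct's average variance extracted (AVE) should exceed its squared correlation with every other construct. The sketch below illustrates that calculation; the standardised loadings and inter-construct correlation are invented for illustration only.

```python
# Illustrative Fornell-Larcker check for discriminant validity.
# Standardised loadings and the inter-construct correlation are invented numbers.
import numpy as np

def average_variance_extracted(loadings):
    """AVE = mean of the squared standardised loadings of a construct's indicators."""
    loadings = np.asarray(loadings)
    return float(np.mean(loadings ** 2))

ave_a = average_variance_extracted([0.82, 0.76, 0.71])   # construct A
ave_b = average_variance_extracted([0.79, 0.74, 0.68])   # construct B
phi_ab = 0.62                                             # estimated correlation between A and B

# Discriminant validity is supported if each AVE exceeds the squared correlation.
print(f"AVE(A) = {ave_a:.3f}, AVE(B) = {ave_b:.3f}, r^2 = {phi_ab ** 2:.3f}")
print("discriminant validity supported:", min(ave_a, ave_b) > phi_ab ** 2)
```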
Abstract:
In this second article, statistical ideas are extended to the problem of testing whether there is a true difference between two samples of measurements. First, it will be shown that the difference between the means of two samples comes from a population of such differences which is normally distributed. Second, the 't' distribution, one of the most important in statistics, will be applied to a test of the difference between two means using a simple data set drawn from a clinical experiment in optometry. Third, in making a t-test, a statistical judgement is made as to whether there is a significant difference between the means of two samples. Before the widespread use of statistical software, this judgement was made with reference to a statistical table. Even if such tables are not used, it is useful to understand their logical structure and how to use them. Finally, the analysis of data that are known to depart significantly from the normal distribution will be described.
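To show what such a test looks like in statistical software, here is a minimal sketch with scipy; the measurements are invented and are not the article's clinical data set.

```python
# Illustrative two-sample t-test; the measurements are invented example values.
from scipy import stats

group_a = [12.1, 11.8, 12.6, 13.0, 12.4, 11.9, 12.8]
group_b = [13.2, 13.5, 12.9, 13.8, 13.1, 13.6, 13.0]

# Student's t-test for the difference between two independent means.
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# For data known to depart from the normal distribution, a non-parametric
# alternative such as the Mann-Whitney U test can be used instead.
u_stat, p_mw = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
print(f"U = {u_stat:.1f}, p = {p_mw:.4f}")
```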
Abstract:
Aim. To test a model of eight thematic determinants of whether nurses intend to remain in nursing roles. Background. Despite the dramatic increase in the supply of nurses in England over the past decade, a combination of the economic downturn, funding constraints and, more generally, an ageing nursing population means that healthcare organizations are likely to encounter long-term problems in the recruitment and retention of nursing staff. Design. Survey. Method. Data were collected from a large staff survey conducted in the National Health Service in England between September and December 2009. A multi-level model was tested using MPlus statistical software on a sub-sample of 16,707 nurses drawn from 167 healthcare organizations. Results. Findings were generally supportive of the proposed model. Nurses who reported being psychologically engaged with their jobs reported a lower intention to leave their current job. The perceived availability of developmental opportunities, the ability to achieve a good work-life balance, and whether nurses encountered work pressures also influenced their turnover intentions. However, relationships formed with colleagues and patients showed comparatively weak associations with turnover intentions. Conclusion. The focus at the local level needs to be on promoting employee engagement: equipping staff with the resources (physical and monetary) and control to enable them to perform their tasks to the standards they aspire to, creating a work environment where staff are fully involved in the wider running of their organizations, and communicating to staff that patient care is the top priority of the organization. © 2012 Blackwell Publishing Ltd.
Abstract:
This book is aimed primarily at microbiologists who are undertaking research and who require a basic knowledge of statistics to analyse their experimental data. Computer software employing a wide range of data analysis methods is widely available to experimental scientists. The availability of this software, however, makes it essential that investigators understand the basic principles of statistics. Statistical analysis of data can be complex, with many different methods of approach, each of which applies in a particular experimental circumstance. Hence, it is possible to apply an incorrect statistical method to data and to draw the wrong conclusions from an experiment. The purpose of this book, which has its origin in a series of articles published in the Society for Applied Microbiology journal 'The Microbiologist', is to present the basic logic of statistics as clearly as possible and, therefore, to dispel some of the myths that often surround the subject. The 28 'Statnotes' deal with various topics that are likely to be encountered, including the nature of variables, the comparison of means of two or more groups, non-parametric statistics, analysis of variance, correlating variables, and more complex methods such as multiple linear regression and principal components analysis. In each case, the relevant statistical method is illustrated with examples drawn from experiments in microbiological research. The text incorporates a glossary of the most commonly used statistical terms and there are two appendices designed to aid the investigator in the selection of the most appropriate test.
Abstract:
DEA literature continues apace, but software has lagged behind. This session uses suitably selected data to present newly developed software which includes many of the most recent DEA models. The software enables the user to address a variety of issues not frequently found in existing DEA software, such as:
- Assessments under a variety of possible assumptions of returns to scale, including NIRS and NDRS;
- Scale elasticity computations;
- Numerous input/output variables and a truly unlimited number of assessment units (DMUs);
- Panel data analysis;
- Analysis of categorical data (multiple categories);
- The Malmquist Index and its decompositions;
- Computation of super-efficiency;
- Automated removal of super-efficient outliers under user-specified criteria;
- Graphical presentation of results;
- Integrated statistical tests.
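For orientation, the basic DEA calculation underlying such software is a linear programme per DMU. The sketch below solves the standard input-oriented CCR (constant returns to scale) envelopment model with scipy; it is a generic textbook formulation, not the software described above, and the input/output data are invented.

```python
# Illustrative input-oriented CCR (constant returns to scale) DEA model as a
# linear programme; the DMU input/output data are invented example values.
import numpy as np
from scipy.optimize import linprog

# rows = DMUs, columns = inputs / outputs
X = np.array([[4.0, 3.0], [7.0, 3.0], [8.0, 1.0], [4.0, 2.0], [2.0, 4.0]])  # inputs
Y = np.array([[1.0], [1.0], [1.0], [1.0], [1.0]])                           # outputs

def ccr_efficiency(k, X, Y):
    """Efficiency of DMU k: min theta s.t. X'lambda <= theta * x_k, Y'lambda >= y_k."""
    n = X.shape[0]
    c = np.r_[1.0, np.zeros(n)]                 # minimise theta; variables = [theta, lambdas]
    # inputs:  sum_j lambda_j * x_ij - theta * x_ik <= 0
    A_in = np.c_[-X[k], X.T]
    b_in = np.zeros(X.shape[1])
    # outputs: -sum_j lambda_j * y_rj <= -y_rk
    A_out = np.c_[np.zeros(Y.shape[1]), -Y.T]
    b_out = -Y[k]
    res = linprog(c, A_ub=np.vstack([A_in, A_out]), b_ub=np.r_[b_in, b_out],
                  bounds=[(0, None)] * (n + 1), method="highs")
    return res.x[0]

for k in range(X.shape[0]):
    print(f"DMU {k + 1}: efficiency = {ccr_efficiency(k, X, Y):.3f}")
```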
Abstract:
The accurate identification of T-cell epitopes remains a principal goal of bioinformatics within immunology. As the immunogenicity of peptide epitopes is dependent on their binding to major histocompatibility complex (MHC) molecules, the prediction of binding affinity is a prerequisite to the reliable prediction of epitopes. The iterative self-consistent (ISC) partial-least-squares (PLS)-based additive method is a recently developed bioinformatic approach for predicting class II peptide-MHC binding affinity. The ISC-PLS method overcomes many of the conceptual difficulties inherent in the prediction of class II peptide-MHC affinity, such as the binding of a mixed population of peptide lengths due to the open-ended class II binding site. The method has applications in both the accurate prediction of class II epitopes and the manipulation of affinity for heteroclitic and competitor peptides. The method is applied here to six class II mouse alleles (I-Ab, I-Ad, I-Ak, I-As, I-Ed, and I-Ek) and included peptides up to 25 amino acids in length. A series of regression equations highlighting the quantitative contributions of individual amino acids at each peptide position was established. The initial model for each allele exhibited only moderate predictivity. Once the set of selected peptide subsequences had converged, the final models exhibited a satisfactory predictive power. Convergence was reached between the 4th and 17th iterations, and the leave-one-out cross-validation statistical terms (q², SEP, and NC) ranged between 0.732 and 0.925, 0.418 and 0.816, and 1 and 6, respectively. The non-cross-validated statistical terms r² and SEE ranged between 0.98 and 0.995 and 0.089 and 0.180, respectively. The peptides used in this study are available from the AntiJen database (http://www.jenner.ac.uk/AntiJen). The PLS method is available commercially in the SYBYL molecular modeling software package. The resulting models, which can be used for accurate T-cell epitope prediction, will be made freely available online (http://www.jenner.ac.uk/MHCPred).
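The ISC-PLS additive method itself is implemented in the SYBYL package; as a rough, generic illustration of the kind of statistics reported above (q², SEP, r²), the sketch below fits an ordinary PLS regression with leave-one-out cross-validation to random descriptor data. It is not the authors' additive method, and the data are synthetic.

```python
# Illustrative PLS regression with leave-one-out cross-validation and a q2 statistic.
# Ordinary PLS on synthetic descriptor data, not the authors' ISC-PLS additive method.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 20))                               # e.g. amino-acid indicator descriptors
y = X[:, :5].sum(axis=1) + rng.normal(scale=0.3, size=60)   # synthetic binding affinities

model = PLSRegression(n_components=3)

# Leave-one-out cross-validated predictions.
y_cv = cross_val_predict(model, X, y, cv=LeaveOneOut()).ravel()

press = np.sum((y - y_cv) ** 2)               # predictive residual sum of squares
ss = np.sum((y - y.mean()) ** 2)
q2 = 1.0 - press / ss                         # cross-validated q2
sep = np.sqrt(press / len(y))                 # standard error of prediction

model.fit(X, y)
r2 = model.score(X, y)                        # non-cross-validated r2
print(f"q2 = {q2:.3f}, SEP = {sep:.3f}, r2 = {r2:.3f}")
```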