32 resultados para data analysis: algorithms and implementation
Resumo:
A new surface analysis technique has been developed which has a number of benefits compared to conventional Low Energy Ion Scattering Spectrometry (LEISS). A major potential advantage arising from the absence of charge exchange complications is the possibility of quantification. The instrumentation that has been developed also offers the possibility of unique studies concerning the interaction between low energy ions and atoms and solid surfaces. From these studies it may also be possible, in principle, to generate sensitivity factors to quantify LEISS data. The instrumentation, which is referred to as a Time-of-Flight Fast Atom Scattering Spectrometer has been developed to investigate these conjecture in practice. The development, involved a number of modifications to an existing instrument, and allowed samples to be bombarded with a monoenergetic pulsed beam of either atoms or ions, and provided the capability to analyse the spectra of scattered atoms and ions separately. Further to this a system was designed and constructed to allow incident, exit and azimuthal angles of the particle beam to be varied independently. The key development was that of a pulsed, and mass filtered atom source; which was developed by a cyclic process of design, modelling and experimentation. Although it was possible to demonstrate the unique capabilities of the instrument, problems relating to surface contamination prevented the measurement of the neutralisation probabilities. However, these problems appear to be technical rather than scientific in nature, and could be readily resolved given the appropriate resources. Experimental spectra obtained from a number of samples demonstrate some fundamental differences between the scattered ion and neutral spectra. For practical non-ordered surfaces the ToF spectra are more complex than their LEISS counterparts. This is particularly true for helium scattering where it appears, in the absence of detailed computer simulation, that quantitative analysis is limited to ordered surfaces. Despite this limitation the ToFFASS instrument opens the way for quantitative analysis of the 'true' surface region to a wider range of surface materials.
Resumo:
This article explains first, the reasons why a knowledge of statistics is necessary and describes the role that statistics plays in an experimental investigation. Second, the normal distribution is introduced which describes the natural variability shown by many measurements in optometry and vision sciences. Third, the application of the normal distribution to some common statistical problems including how to determine whether an individual observation is a typical member of a population and how to determine the confidence interval for a sample mean is described.
Resumo:
In this second article, statistical ideas are extended to the problem of testing whether there is a true difference between two samples of measurements. First, it will be shown that the difference between the means of two samples comes from a population of such differences which is normally distributed. Second, the 't' distribution, one of the most important in statistics, will be applied to a test of the difference between two means using a simple data set drawn from a clinical experiment in optometry. Third, in making a t-test, a statistical judgement is made as to whether there is a significant difference between the means of two samples. Before the widespread use of statistical software, this judgement was made with reference to a statistical table. Even if such tables are not used, it is useful to understand their logical structure and how to use them. Finally, the analysis of data, which are known to depart significantly from the normal distribution, will be described.
Resumo:
In any investigation in optometry involving more that two treatment or patient groups, an investigator should be using ANOVA to analyse the results assuming that the data conform reasonably well to the assumptions of the analysis. Ideally, specific null hypotheses should be built into the experiment from the start so that the treatments variation can be partitioned to test these effects directly. If 'post-hoc' tests are used, then an experimenter should examine the degree of protection offered by the test against the possibilities of making either a type 1 or a type 2 error. All experimenters should be aware of the complexity of ANOVA. The present article describes only one common form of the analysis, viz., that which applies to a single classification of the treatments in a randomised design. There are many different forms of the analysis each of which is appropriate to the analysis of a specific experimental design. The uses of some of the most common forms of ANOVA in optometry have been described in a further article. If in any doubt, an investigator should consult a statistician with experience of the analysis of experiments in optometry since once embarked upon an experiment with an unsuitable design, there may be little that a statistician can do to help.
Resumo:
1. Pearson's correlation coefficient only tests whether the data fit a linear model. With large numbers of observations, quite small values of r become significant and the X variable may only account for a minute proportion of the variance in Y. Hence, the value of r squared should always be calculated and included in a discussion of the significance of r. 2. The use of r assumes that a bivariate normal distribution is present and this assumption should be examined prior to the study. If Pearson's r is not appropriate, then a non-parametric correlation coefficient such as Spearman's rs may be used. 3. A significant correlation should not be interpreted as indicating causation especially in observational studies in which there is a high probability that the two variables are correlated because of their mutual correlations with other variables. 4. In studies of measurement error, there are problems in using r as a test of reliability and the ‘intra-class correlation coefficient’ should be used as an alternative. A correlation test provides only limited information as to the relationship between two variables. Fitting a regression line to the data using the method known as ‘least square’ provides much more information and the methods of regression and their application in optometry will be discussed in the next article.
Resumo:
Multiple regression analysis is a complex statistical method with many potential uses. It has also become one of the most abused of all statistical procedures since anyone with a data base and suitable software can carry it out. An investigator should always have a clear hypothesis in mind before carrying out such a procedure and knowledge of the limitations of each aspect of the analysis. In addition, multiple regression is probably best used in an exploratory context, identifying variables that might profitably be examined by more detailed studies. Where there are many variables potentially influencing Y, they are likely to be intercorrelated and to account for relatively small amounts of the variance. Any analysis in which R squared is less than 50% should be suspect as probably not indicating the presence of significant variables. A further problem relates to sample size. It is often stated that the number of subjects or patients must be at least 5-10 times the number of variables included in the study.5 This advice should be taken only as a rough guide but it does indicate that the variables included should be selected with great care as inclusion of an obviously unimportant variable may have a significant impact on the sample size required.
Resumo:
This thesis seeks to describe the development of an inexpensive and efficient clustering technique for multivariate data analysis. The technique starts from a multivariate data matrix and ends with graphical representation of the data and pattern recognition discriminant function. The technique also results in distances frequency distribution that might be useful in detecting clustering in the data or for the estimation of parameters useful in the discrimination between the different populations in the data. The technique can also be used in feature selection. The technique is essentially for the discovery of data structure by revealing the component parts of the data. lhe thesis offers three distinct contributions for cluster analysis and pattern recognition techniques. The first contribution is the introduction of transformation function in the technique of nonlinear mapping. The second contribution is the us~ of distances frequency distribution instead of distances time-sequence in nonlinear mapping, The third contribution is the formulation of a new generalised and normalised error function together with its optimal step size formula for gradient method minimisation. The thesis consists of five chapters. The first chapter is the introduction. The second chapter describes multidimensional scaling as an origin of nonlinear mapping technique. The third chapter describes the first developing step in the technique of nonlinear mapping that is the introduction of "transformation function". The fourth chapter describes the second developing step of the nonlinear mapping technique. This is the use of distances frequency distribution instead of distances time-sequence. The chapter also includes the new generalised and normalised error function formulation. Finally, the fifth chapter, the conclusion, evaluates all developments and proposes a new program. for cluster analysis and pattern recognition by integrating all the new features.
Resumo:
The research investigates the processes of adoption and implementation, by organisations, of computer aided production management systems (CAPM). It is organised around two different theoretical perspectives. The first part is informed by the Rogers model of the diffusion, adoption and implementation of innovations, and the second part by a social constructionist approach to technology. Rogers' work is critically evaluated and a model of adoption and implementation is distilled from it and applied to a set of empirical case studies. In the light of the case study data, strengths and weaknesses of the model are identified. It is argued that the model is too rational and linear to provide an adequate explanation of adoption processes. It is useful for understanding processes of implementation but requires further development. The model is not able to adequately encompass complex computer based technologies. However, the idea of 'reinvention' is identified as Roger's key concept but it needs to be conceptually extended. Both Roger's model and definition of CAPM found in the literature from production engineering tend to treat CAPM in objectivist terms. The problems with this view are addressed through a review of the literature on the sociology of technology, and it is argued that a social constructionist approach offers a more useful framework for understanding CAPM, its nature, adoption, implementation, and use. CAPM it is argued, must be understood on terms of the ways in which it is constituted in discourse, as part of a 'struggle for meaning' on the part of academics, professional engineers, suppliers, and users.
Resumo:
In the general introduction of the road-accident phenomenon inside and outside Iran, the results of previous research-works and international conferences and seminars on road-safety have been reviewed. Also a sample-road between Tehran and Mashad has been investigated as a case-study. Examining the road-accident data and iriformation,first: the information presented in road-accident report-forms in developed countries is discussed and, second: the procedures for road-accident data collection in Iran are investigated in detail. The data supplied by Iran Road-Police Central Statistics Office, is analysed, different rates are computed, due comparisons with other nations are made, and the results are discussed. Also such analysis and comparisons are presented for different provinces of Iran. It is concluded that each province with its own natural, geographical, social and economical characteristics possesses its own reasons for the quality and quantity of road-accidents and therefore must receive its own appropriate remedial solutions. The question~ of "what is the cost of road-accidents", "why and how evaluate the cost", "what is the appropriate way of approach to such evaluation" are all discussed and then "the cost of road-accidents in Iran" based on two different approaches: "Gross National Output"and "court award" is computed. It is concluded that this cost is about 1.5 per cent of the country's national product. In Appendix 3 an impressive example is given of the trend of costs and benefits that can be attributed to investment in road-safety measures.
Resumo:
We analyze a Big Data set of geo-tagged tweets for a year (Oct. 2013–Oct. 2014) to understand the regional linguistic variation in the U.S. Prior work on regional linguistic variations usually took a long time to collect data and focused on either rural or urban areas. Geo-tagged Twitter data offers an unprecedented database with rich linguistic representation of fine spatiotemporal resolution and continuity. From the one-year Twitter corpus, we extract lexical characteristics for twitter users by summarizing the frequencies of a set of lexical alternations that each user has used. We spatially aggregate and smooth each lexical characteristic to derive county-based linguistic variables, from which orthogonal dimensions are extracted using the principal component analysis (PCA). Finally a regionalization method is used to discover hierarchical dialect regions using the PCA components. The regionalization results reveal interesting linguistic regional variations in the U.S. The discovered regions not only confirm past research findings in the literature but also provide new insights and a more detailed understanding of very recent linguistic patterns in the U.S.
Resumo:
The standard reference clinical score quantifying average Parkinson's disease (PD) symptom severity is the Unified Parkinson's Disease Rating Scale (UPDRS). At present, UPDRS is determined by the subjective clinical evaluation of the patient's ability to adequately cope with a range of tasks. In this study, we extend recent findings that UPDRS can be objectively assessed to clinically useful accuracy using simple, self-administered speech tests, without requiring the patient's physical presence in the clinic. We apply a wide range of known speech signal processing algorithms to a large database (approx. 6000 recordings from 42 PD patients, recruited to a six-month, multi-centre trial) and propose a number of novel, nonlinear signal processing algorithms which reveal pathological characteristics in PD more accurately than existing approaches. Robust feature selection algorithms select the optimal subset of these algorithms, which is fed into non-parametric regression and classification algorithms, mapping the signal processing algorithm outputs to UPDRS. We demonstrate rapid, accurate replication of the UPDRS assessment with clinically useful accuracy (about 2 UPDRS points difference from the clinicians' estimates, p < 0.001). This study supports the viability of frequent, remote, cost-effective, objective, accurate UPDRS telemonitoring based on self-administered speech tests. This technology could facilitate large-scale clinical trials into novel PD treatments.
Resumo:
Saudi Arabian Small and Medium Enterprises (SMEs) will face fierce competition from new entrants to local markets as a result of their accession to the Word Trade Organisation (WTO), and electronic commerce (e-commerce) technologies can reinforce SME’s competitive edge. This study investigates the state of e-commerce adoption and analyses the factors that determine the extent to which SMEs in Saudi Arabia are inclined towards deploying e-commerce technologies. This could assist future firms in designing effective implementation projects. Seven SMEs’ e-commerce adoption levels are studied as a case. The Technology-Organization-Environment (TOE) framework was used as the major source of inspiration in our analysis of e-commerce adoption amongst Saudi SMEs. In addition to advancing research on e-commerce in Saudi Arabia, this chapter also highlights several directions for future inquiry and implications for managers and policymakers.
Resumo:
DUE TO COPYRIGHT RESTRICTIONS ONLY AVAILABLE FOR CONSULTATION AT ASTON UNIVERSITY LIBRARY AND INFORMATION SERVICES WITH PRIOR ARRANGEMENT
Resumo:
Consideration of the influence of test technique and data analysis method is important for data comparison and design purposes. The paper highlights the effects of replication interval, crack growth rate averaging and curve-fitting procedures on crack growth rate results for a Ni-base alloy. It is shown that an upper bound crack growth rate line is not appropriate for use in fatigue design, and that the derivative of a quadratic fit to the a vs N data looks promising. However, this type of averaging, or curve fitting, is not useful in developing an understanding of microstructure/crack tip interactions. For this purpose, simple replica-to-replica growth rate calculations are preferable. © 1988.