979 results for STATISTICAL-ANALYSIS
Abstract:
The topic of this thesis is the development of knowledge-based statistical software. The shortcomings of conventional statistical packages are discussed to illustrate the need to develop software which is able to exhibit a greater degree of statistical expertise, thereby reducing the misuse of statistical methods by those not well versed in the art of statistical analysis. Some of the issues involved in the development of knowledge-based software are presented and a review is given of some of the systems that have been developed so far. The majority of these have moved away from conventional architectures by adopting what can be termed an expert systems approach. The thesis then proposes an approach which is based upon the concept of semantic modelling. By representing some of the semantic meaning of data, it is conceived that a system could examine a request to apply a statistical technique and check whether the use of the chosen technique is semantically sound, i.e. whether the results obtained will be meaningful. Current systems, in contrast, can only perform what can be considered syntactic checks. The prototype system that has been implemented to explore the feasibility of such an approach is presented. The system has been designed as an enhanced variant of a conventional-style statistical package. This involved developing a semantic data model to represent some of the statistically relevant knowledge about data and identifying sets of requirements that should be met for the application of the statistical techniques to be valid. The areas of statistics covered in the prototype are measures of association and tests of location.
Abstract:
The central argument of this thesis is that the nature and purpose of corporate reporting have changed over time, so that the corporate report has become a more outward-looking and forward-looking document designed to promote the company and its performance to a wide range of shareholders, rather than merely to report to its owners upon past performance. It is argued that the discourse of environmental accounting and reporting is one driver for this change but that this discourse has been set up as conflicting with the discourse of traditional accounting and performance measurement. The effect of this opposition between the discourses is that the two have been interpreted to be different and incompatible dimensions of performance, with good performance along one dimension only being achievable through a sacrifice of performance along the other dimension. Thus a perceived dialectic in performance is believed to exist. One of the principal purposes of this thesis is to explore this perceived dialectic and, through analysis, to show that it does not exist and that there is no incompatibility. This exploration and analysis is based upon an investigation of the inherent inconsistencies in such corporate reports, and the analysis makes use of both a statistical analysis and a semiotic analysis of corporate reports and the reported performance of companies along these dimensions. Thus the development of a semiology of corporate reporting is one of the significant outcomes of this thesis. A further outcome is a consideration of the implications of the analysis for corporate performance and its measurement. The thesis concludes with a consideration of the way in which the advent of electronic reporting may affect the ability of organisations to maintain the dialectic and the implications for corporate reporting.
Abstract:
Orthodox contingency theory links effective organisational performance to compatible relationships between the environment and organisation strategy and structure, and assumes that organisations have the capacity to adapt as the environment changes. Recent contributions to the literature on organisation theory claim that the key to effective performance is effective adaptation, which in turn requires the simultaneous reconciliation of efficiency and innovation, which is afforded by a unique environment-organisation configuration. The literature on organisation theory recognises the continuing confusion caused by the fragmented and often conflicting results from cross-sectional studies. Although the case is made for longitudinal studies which comprehensively describe the evolving relationship between the environment and the organisation, there is little to suggest how such studies should be executed in practice. Typically the choice is between the approaches of the historicised case study and statistical analysis of large populations, which examine the relationship between environment and organisation strategy and/or structure and ignore the product-process relationship. This study combines the historicised case study and the multi-variable and ordinal scale approach of statistical analysis to construct an analytical framework which tracks and exposes the environment-organisation-performance relationship over time. The framework examines changes in the environment, strategy and structure and uniquely includes an assessment of the organisation's product-process relationship and its contribution to organisational efficiency and innovation. The analytical framework is applied to examine the evolving environment-organisation relationship of two organisations in the same industry over the same twenty-five year period to provide a sector perspective of organisational adaptation.
The findings demonstrate the significance of the environment-organisation configuration to the scope and frequency of adaptation and suggest that the level of sector homogeneity may be linked to the level of product-process standardisation.
Abstract:
Citation information: Armstrong RA, Davies LN, Dunne MCM & Gilmartin B. Statistical guidelines for clinical studies of human vision. Ophthalmic Physiol Opt 2011, 31, 123-136. doi: 10.1111/j.1475-1313.2010.00815.x ABSTRACT: Statistical analysis of data can be complex and different statisticians may disagree as to the correct approach, leading to conflict between authors, editors, and reviewers. The objective of this article is to provide some statistical advice for contributors to optometric and ophthalmic journals, to provide advice specifically relevant to clinical studies of human vision, and to recommend statistical analyses that could be used in a variety of circumstances. In submitting an article in which quantitative data are reported, authors should describe clearly the statistical procedures that they have used and justify each stage of the analysis. This is especially important if more complex or 'non-standard' analyses have been carried out. The article begins with some general comments relating to data analysis concerning sample size and 'power', hypothesis testing, parametric and non-parametric variables, 'bootstrap methods', one and two-tail testing, and the Bonferroni correction. More specific advice is then given with reference to particular statistical procedures that can be used on a variety of types of data. Where relevant, examples of correct statistical practice are given with reference to recently published articles in the optometric and ophthalmic literature.
Abstract:
In this study, a new entropy measure known as kernel entropy (KerEnt), which quantifies the irregularity in a series, was applied to nocturnal oxygen saturation (SaO2) recordings. A total of 96 subjects suspected of suffering from sleep apnea-hypopnea syndrome (SAHS) took part in the study: 32 SAHS-negative and 64 SAHS-positive subjects. Their SaO2 signals were separately processed by means of KerEnt. Our results show that a higher degree of irregularity is associated with SAHS-positive subjects. Statistical analysis revealed significant differences between the KerEnt values of SAHS-negative and SAHS-positive groups. The diagnostic utility of this parameter was studied by means of receiver operating characteristic (ROC) analysis. A classification accuracy of 81.25% (81.25% sensitivity and 81.25% specificity) was achieved. Repeated apneas during sleep increase irregularity in SaO2 data. This effect can be measured by KerEnt in order to detect SAHS. This non-linear measure can provide useful information for the development of alternative diagnostic techniques in order to reduce the demand for conventional polysomnography (PSG). © 2011 IEEE.
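The reported accuracy, sensitivity and specificity can be checked against the group sizes given in the abstract. The sketch below is only an arithmetic illustration: the confusion-matrix counts are not stated in the abstract and are inferred here by assuming that 81.25% of each group was classified correctly; the function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # fraction of positives detected
    specificity = tn / (tn + fp)          # fraction of negatives correctly rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Group sizes from the abstract: 64 SAHS-positive, 32 SAHS-negative subjects.
n_pos, n_neg = 64, 32

# ASSUMPTION: counts inferred from the reported 81.25% rates, not given in the source.
tp = round(0.8125 * n_pos)   # correctly classified SAHS-positive subjects
tn = round(0.8125 * n_neg)   # correctly classified SAHS-negative subjects

sens, spec, acc = classification_metrics(tp, n_pos - tp, tn, n_neg - tn)
print(f"TP={tp}, TN={tn}, sensitivity={sens:.4f}, specificity={spec:.4f}, accuracy={acc:.4f}")
```

Because sensitivity and specificity are equal, the overall accuracy coincides with both regardless of the 2:1 imbalance between the groups.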
Abstract:
The adverse health effects of long-term exposure to lead are well established, with major uptake into the human body occurring mainly through oral ingestion by young children. Lead-based paint was frequently used in homes built before 1978, particularly in inner-city areas. Minority populations experience the effects of lead poisoning disproportionately. Lead-based paint abatement is costly. In the United States, residents of about 400,000 homes, occupied by 900,000 young children, lack the means to correct lead-based paint hazards. The magnitude of this problem demands research on affordable methods of hazard control. One method is encapsulation, defined as any covering or coating that acts as a permanent barrier between the lead-based paint surface and the environment. Two encapsulants were tested for reliability and effective life span through an accelerated lifetime experiment that applied stresses exceeding those encountered under normal use conditions. The resulting time-to-failure data were used to extrapolate the failure time under conditions of normal use. Statistical analysis and models of the test data allow forecasting of long-term reliability relative to the 20-year encapsulation requirement. Typical housing material specimens simulating walls and doors coated with lead-based paint were overstressed before encapsulation. A second, un-aged set was also tested. Specimens were monitored after the stress test with a surface chemical testing pad to identify the presence of lead breaking through the encapsulant. Graphical analysis proposed by Shapiro and Meeker and the general log-linear model developed by Cox were used to obtain results. Findings for the 80% reliability time to failure varied, with close to 21 years of life under normal use conditions for encapsulant A. The application of product A on the aged gypsum and aged wood substrates yielded slightly lower times. Encapsulant B had an 80% reliable life of 19.78 years.
This study reveals that encapsulation technologies can offer safe and effective control of lead-based paint hazards and may be less expensive than other options. The U.S. Department of Health and Human Services and the CDC are committed to eliminating childhood lead poisoning by 2010. This ambitious target is feasible, provided there is an efficient application of innovative technology, a goal to which this study aims to contribute.
Abstract:
This study is a variationist sociolinguistic analysis of two speech styles, performance and interview, of a dinner theatre troupe in Ferryland on the Southern Shore of Newfoundland. Five actors and ten of their characters are analyzed to test if their vowels change across styles. The study adopts a variationist framework with a Community of Practice model, drawing on Bell’s audience and referee design to argue that the performers’ stage conventions and identity construction are influenced by a third person referee: the Idealized Authentic Newfoundlander (IAN). Under this view the goal of the performer is to both communicate with and entertain the audience, which requires different tactics when speaking. These tactics manifest phonetically and are discussed in a quantitative, statistical analysis of the acoustic measurements of the vowel tokens [variables FACE, KIT, LOT/PALM and GOAT lexical sets with Newfoundland Irish English (NIE) variants] and a qualitative discussion.
Abstract:
We are grateful for the co-operation and assistance that we received from NHS staff in the co-ordinating centres and clinical sites. We thank the women who participated in TOMBOLA. The TOMBOLA trial was supported by the Medical Research Council (G9700808) and the NHS in England and Scotland. The TOMBOLA Group comprises the following: Grant-holders: University of Aberdeen and NHS Grampian, Aberdeen, Scotland: Maggie Cruickshank, Graeme Murray, David Parkin, Louise Smart, Eric Walker, Norman Waugh (Principal Investigator 2004–2008) University of Nottingham and Nottingham NHS, Nottingham, England: Mark Avis, Claire Chilvers, Katherine Fielding, Rob Hammond, David Jenkins, Jane Johnson, Keith Neal, Ian Russell, Rashmi Seth, Dave Whynes University of Dundee and NHS Tayside, Dundee, Tayside: Ian Duncan, Alistair Robertson (deceased) University of Ottawa, Ottawa, Canada: Julian Little (Principal Investigator 1999–2004) National Cancer Registry, Cork, Ireland: Linda Sharp Bangor University, Bangor, Wales: Ian Russell University of Hull, Hull, England: Leslie G Walker Staff in clinical sites and co-ordinating centres Grampian Breda Anthony, Sarah Bell, Adrienne Bowie, Katrina Brown (deceased), Joe Brown, Kheng Chew, Claire Cochran, Seonaidh Cotton, Jeannie Dean, Kate Dunn, Jane Edwards, David Evans, Julie Fenty, Al Finlayson, Marie Gallagher, Nicola Gray, Maureen Heddle, Alison Innes, Debbie Jobson, Mandy Keillor, Jayne MacGregor, Sheona Mackenzie, Amanda Mackie, Gladys McPherson, Ike Okorocha, Morag Reilly, Joan Rodgers, Alison Thornton, Rachel Yeats Tayside Lindyanne Alexander, Lindsey Buchanan, Susan Henderson, Tine Iterbeke, Susanneke Lucas, Gillian Manderson, Sheila Nicol, Gael Reid, Carol Robinson, Trish Sandilands Nottingham Marg Adrian, Ahmed Al-Sahab, Elaine Bentley, Hazel Brook, Claire Bushby, Rita Cannon, Brenda Cooper, Ruth Dowell, Mark Dunderdale, Dr Gabrawi, Li Guo, Lisa Heideman, Steve Jones, Salli Lawson, Zoë Philips, Christopher Platt, Shakuntala 
Prabhakaran, John Rippin, Rose Thompson, Elizabeth Williams, Claire Woolley Statistical analysis Seonaidh Cotton, Kirsten Harrild, John Norrie, Linda Sharp External Trial Steering Committee Nicholas Day (chair, 1999–2004), Theresa Marteau (chair 2004-), Mahesh Parmar, Julietta Patnick and Ciaran Woodman.
Abstract:
The role of Constitutional Courts in deeply divided societies is complicated by the danger that the salient societal cleavages may influence judicial decision-making and, consequently, undermine judicial independence and impartiality. With reference to the decisions of the Constitutional Court of Bosnia-Herzegovina, this article investigates the influence of ethno-nationalism on judicial behaviour and the extent to which variation in judicial tenure amplifies or dampens that influence. Based on a statistical analysis of an original dataset of the Court’s decisions, we find that the judges do in fact divide predictably along ethno-national lines, at least in certain types of cases, and that these divisions cannot be reduced to a residual loyalty to their appointing political parties. Contrary to some theoretical expectations, however, we find that long-term tenure does little to dampen the influence of ethno-nationalism on judicial behaviour. Moreover, our findings suggest that the longer a judge serves on the Court the more ethno-national affiliation seems to influence her decision-making. We conclude by considering how alternative arrangements for the selection and tenure of judges might help to ameliorate this problem.
Abstract:
Increases in pediatric thyroid cancer incidence could be partly due to previous clinical intervention. This retrospective cohort study used 1973-2012 data from the Surveillance, Epidemiology, and End Results program to assess the association between previous radiation therapy exposure and the development of second primary thyroid cancer (SPTC) among 0-19-year-old children. Statistical analysis included the calculation of summary statistics and univariable and multivariable logistic regression analysis. Relative to no previous radiation therapy exposure, cases exposed to radiation had 2.46 times the odds of developing SPTC (95% CI: 1.39-4.34). After adjustment for sex and age at diagnosis, Hispanic children who received radiation therapy for a first primary malignancy had 3.51 times the odds of developing SPTC compared to Hispanic children who had not received radiation therapy [AOR=3.51, 99% CI: 0.69-17.70, p=0.04]. These findings support the development of age-specific guidelines for the use of radiation-based interventions among children with and without cancer.
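The two odds ratios above are reported with intervals at different confidence levels (95% and 99%), which makes them hard to compare by eye. The following sketch back-calculates an approximate standard error and two-sided p-value from each interval. It assumes Wald-type intervals that are symmetric on the log-odds scale, which the abstract does not state, so the output is a rough consistency check and not a reproduction of the study's analysis. The helper name `wald_from_ci` is hypothetical.

```python
import math
from statistics import NormalDist


def wald_from_ci(odds_ratio, lower, upper, level):
    """Back out SE(log OR), z and a two-sided p-value from a reported CI.

    ASSUMPTION: the interval is a Wald interval, exp(log(OR) +/- z_crit * SE).
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)           # e.g. about 1.96 for 95%
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)  # half-width on the log scale
    z = math.log(odds_ratio) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return se, z, p


# Values as reported in the abstract.
se_crude, z_crude, p_crude = wald_from_ci(2.46, 1.39, 4.34, level=0.95)
se_adj, z_adj, p_adj = wald_from_ci(3.51, 0.69, 17.70, level=0.99)

print(f"crude OR:    SE={se_crude:.3f}, z={z_crude:.2f}, p={p_crude:.4f}")
print(f"adjusted OR: SE={se_adj:.3f}, z={z_adj:.2f}, p={p_adj:.4f}")
```

Under this assumption the adjusted estimate gives a p-value a little under 0.05, in line with the reported p=0.04, while the 99% interval still includes 1 because that interval corresponds to a 0.01 significance threshold.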
Abstract:
This dissertation proposes statistical methods to formulate, estimate and apply complex transportation models. Two main problems are part of the analyses conducted and presented in this dissertation. The first method solves an econometric problem and is concerned with the joint estimation of models that contain both discrete and continuous decision variables. The use of ordered models along with a regression is proposed and their effectiveness is evaluated with respect to unordered models. Procedures to calculate and optimize the log-likelihood functions of both discrete-continuous approaches are derived, and difficulties associated with the estimation of unordered models are explained. Numerical approximation methods based on the Genz algorithm are implemented in order to solve the multidimensional integral associated with the unordered modeling structure. The problems deriving from the lack of smoothness of the probit model around the maximum of the log-likelihood function, which makes the optimization and the calculation of standard deviations very difficult, are carefully analyzed. A methodology to perform out-of-sample validation in the context of a joint model is proposed. Comprehensive numerical experiments have been conducted on both simulated and real data. In particular, the discrete-continuous models are estimated and applied to vehicle ownership and use models on data extracted from the 2009 National Household Travel Survey. The second part of this work offers a comprehensive statistical analysis of free-flow speed distribution; the method is applied to data collected on a sample of roads in Italy. A linear mixed model that includes speed quantiles in its predictors is estimated. Results show that there is no road effect in the analysis of free-flow speeds, which is particularly important for model transferability.
A very general framework to predict random effects with few observations and incomplete access to model covariates is formulated and applied to predict the distribution of free-flow speed quantiles. The speed distribution of most road sections is successfully predicted; jack-knife estimates are calculated and used to explain why some sections are poorly predicted. Overall, this work contributes to the literature in transportation modeling by proposing econometric model formulations for discrete-continuous variables, more efficient methods for the calculation of multivariate normal probabilities, and random effects models for free-flow speed estimation that take into account the survey design. All methods are rigorously validated on both real and simulated data.
Abstract:
Background: Statistical analysis of DNA microarray data provides a valuable diagnostic tool for the investigation of genetic components of diseases. To take advantage of the multitude of available data sets and analysis methods, it is desirable to combine both different algorithms and data from different studies. Applying ensemble learning, consensus clustering and cross-study normalization methods for this purpose in an almost fully automated process and linking different analysis modules together under a single interface would simplify many microarray analysis tasks. Results: We present ArrayMining.net, a web-application for microarray analysis that provides easy access to a wide choice of feature selection, clustering, prediction, gene set analysis and cross-study normalization methods. In contrast to other microarray-related web-tools, multiple algorithms and data sets for an analysis task can be combined using ensemble feature selection, ensemble prediction, consensus clustering and cross-platform data integration. By interlinking different analysis tools in a modular fashion, new exploratory routes become available, e.g. ensemble sample classification using features obtained from a gene set analysis and data from multiple studies. The analysis is further simplified by automatic parameter selection mechanisms and linkage to web tools and databases for functional annotation and literature mining. Conclusion: ArrayMining.net is a free web-application for microarray analysis combining a broad choice of algorithms based on ensemble and consensus methods, using automatic parameter selection and integration with annotation databases.
Abstract:
This thesis is concerned with change point analysis for time series, i.e. with detection of structural breaks in time-ordered, random data. This long-standing research field regained popularity over the last few years and is still undergoing, as statistical analysis in general, a transformation to high-dimensional problems. We focus on the fundamental »change in the mean« problem and provide extensions of the classical non-parametric Darling-Erdős-type cumulative sum (CUSUM) testing and estimation theory within high-dimensional Hilbert space settings. In the first part we contribute to (long run) principal component based testing methods for Hilbert space valued time series under a rather broad (abrupt, epidemic, gradual, multiple) change setting and under dependence. For the dependence structure we consider either traditional m-dependence assumptions or more recently developed m-approximability conditions which cover, e.g., MA, AR and ARCH models. We derive Gumbel and Brownian bridge type approximations of the distribution of the test statistic under the null hypothesis of no change and consistency conditions under the alternative. A new formulation of the test statistic using projections on subspaces allows us to simplify the standard proof techniques and to weaken common assumptions on the covariance structure. Furthermore, we propose to adjust the principal components by an implicit estimation of a (possible) change direction. This approach adds flexibility to projection based methods, weakens typical technical conditions and provides better consistency properties under the alternative. In the second part we contribute to estimation methods for common changes in the means of panels of Hilbert space valued time series. We analyze weighted CUSUM estimates within a recently proposed »high-dimensional low sample size (HDLSS)« framework, where the sample size is fixed but the number of panels increases.
We derive sharp conditions on »pointwise asymptotic accuracy« or »uniform asymptotic accuracy« of those estimates in terms of the weighting function. Particularly, we prove that a covariance-based correction of Darling-Erdős-type CUSUM estimates is required to guarantee uniform asymptotic accuracy under moderate dependence conditions within panels and that these conditions are fulfilled, e.g., by any MA(1) time series. As a counterexample we show that for AR(1) time series, close to the non-stationary case, the dependence is too strong and uniform asymptotic accuracy cannot be ensured. Finally, we conduct simulations to demonstrate that our results are practically applicable and that our methodological suggestions are advantageous.
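For readers unfamiliar with the »change in the mean« problem, the basic idea behind a CUSUM estimate can be sketched for a single real-valued series. This is the simplest unweighted, univariate version and is an illustration only: it is not the weighted, Hilbert-space-valued or panel procedure developed in the thesis, the statistic is not standardised by any (long-run) variance estimate, and the function name `cusum_change_point` is hypothetical.

```python
def cusum_change_point(x):
    """Estimate a single change in the mean of a real-valued series.

    For each split k the centred partial sum |S_k - (k/n) * S_n| is computed;
    the estimate is the k that maximises it, i.e. the number of observations
    before the estimated change.
    """
    n = len(x)
    total = sum(x)
    best_k, best_stat = 0, 0.0
    partial = 0.0
    for k in range(1, n):
        partial += x[k - 1]                    # S_k = x_1 + ... + x_k
        stat = abs(partial - k * total / n)    # centred CUSUM at split k
        if stat > best_stat:
            best_k, best_stat = k, stat
    # Scale by sqrt(n); dividing by a standard deviation estimate is omitted here.
    return best_k, best_stat / n ** 0.5


# Noise-free toy series: mean 0 for 30 observations, then mean 1 for 20.
series = [0.0] * 30 + [1.0] * 20
k_hat, stat = cusum_change_point(series)
print(f"estimated change after observation {k_hat}, scaled CUSUM statistic {stat:.3f}")
```

In the toy series the centred partial sum grows linearly up to the true change and shrinks afterwards, which is why its maximiser locates the break. With dependent or noisy data the weighting and variance corrections discussed in the abstract become relevant.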
Abstract:
This thesis analyzes the impact of heat extremes in urban and rural environments, considering processes related to severely high temperatures and unusual dryness. The first part deals with the influence of large-scale heatwave events on the local-scale urban heat island (UHI) effect. The temperatures recorded over a 20-year summer period by meteorological stations in 37 European cities are examined to evaluate the variations of UHI during heatwaves with respect to non-heatwave days. A statistical analysis reveals a negligible impact of large-scale extreme temperatures on the local daytime urban climate, but a notable exacerbation of the UHI effect at night. A comparison with the UrbClim model outputs confirms the UHI strengthening during heatwave episodes, with an intensity independent of the climate zone. The investigation of the relationship between large-scale temperature anomalies and UHI highlights a smooth and continuous dependence, but with a strong variability. The lack of a threshold behavior in this relationship suggests that large-scale temperature variability can affect the local-scale UHI even in conditions other than extreme events. The second part examines the transition from meteorological to agricultural drought, which is the first stage of the drought propagation process. A multi-year reanalysis dataset involving numerous drought events over the Iberian Peninsula is considered. The behavior of different non-parametric standardized drought indices in drought detection is evaluated. A statistical approach based on run theory is employed, analyzing the main characteristics of drought propagation. The propagation from meteorological to agricultural drought events is found to develop in about 1-2 months. The duration of agricultural drought appears shorter than that of meteorological drought, but the onset is delayed. The propagation probability increases with the severity of the originating meteorological drought.
A new combined agricultural drought index is developed to be a useful tool for balancing the characteristics of other adopted indices.
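Run theory, mentioned above, treats a drought event as a consecutive run of time steps in which a standardized index stays below a threshold, and characterises each event by its onset, duration and severity. The sketch below illustrates that idea on a made-up monthly index series. The threshold of -1.0, the severity definition (accumulated deficit below the threshold), the sample values and the function name `drought_runs` are all assumptions for illustration; the abstract does not specify the choices used in the thesis.

```python
def drought_runs(index, threshold=-1.0):
    """Extract drought events as runs of consecutive values below `threshold`.

    Returns a list of dicts with onset (0-based position), duration (number of
    time steps) and severity (sum of deficits below the threshold).
    """
    events = []
    start = None
    for t, value in enumerate(index):
        if value < threshold:
            if start is None:
                start = t                      # a new run begins
        elif start is not None:
            events.append(_summarise(index, start, t, threshold))
            start = None                       # the run has ended
    if start is not None:                      # series ends while still in a run
        events.append(_summarise(index, start, len(index), threshold))
    return events


def _summarise(index, start, end, threshold):
    deficits = [threshold - v for v in index[start:end]]
    return {"onset": start, "duration": end - start, "severity": sum(deficits)}


# Hypothetical monthly standardized index values (not data from the thesis).
spi = [0.2, -1.2, -1.5, -0.3, 0.1, -1.1, -2.0, -1.4, 0.5]
for event in drought_runs(spi):
    print(event)
```

Applying the same function to a meteorological and an agricultural index and comparing the onsets of matched events would give a propagation lag of the kind reported in the abstract.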