952 results for Linear models (Statistics)
Abstract:
Undoubtedly, statistics has become one of the most important subjects in the modern world, where its applications are ubiquitous. The importance of statistics is not limited to statisticians, but also impacts upon non-statisticians who have to use statistics within their own disciplines. Several studies have indicated that most of the academic departments around the world have realized the importance of statistics to non-specialist students. Therefore, the number of students enrolled in statistics courses has vastly increased, coming from a variety of disciplines. Consequently, research within the scope of statistics education has been able to develop throughout the last few years. One important issue is how statistics is best taught to, and learned by, non-specialist students. This issue is controlled by several factors that affect the learning and teaching of statistics to non-specialist students, such as the use of technology, the role of the English language (especially for those whose first language is not English), the effectiveness of statistics teachers and their approach towards teaching statistics courses, students’ motivation to learn statistics and the relevance of statistics courses to the main subjects of non-specialist students. Several studies, focused on aspects of learning and teaching statistics, have been conducted in different countries around the world, particularly in Western countries. Conversely, the situation in Arab countries, especially in Saudi Arabia, is different; here, there is very little research in this scope, and what there is does not meet the needs of those countries towards the development of learning and teaching statistics to non-specialist students. This research was instituted in order to develop the field of statistics education. 
The purpose of this mixed methods study was to generate new insights into this subject by investigating how statistics courses are currently taught to non-specialist students in Saudi universities. Hence, this study will contribute towards filling the knowledge gap that exists in Saudi Arabia. This study used multiple data collection approaches, including questionnaire surveys of 1053 non-specialist students who had completed at least one statistics course in different colleges of the universities in Saudi Arabia. These surveys were followed up with qualitative data collected via semi-structured interviews with 16 teachers of statistics from colleges within all six universities where statistics is taught to non-specialist students in Saudi Arabia’s Eastern Region. The questionnaire data were of several types, so different techniques were used in the analysis. Descriptive statistics were used to identify the demographic characteristics of the participants. The chi-square test was used to determine associations between variables. Based on the main issues raised in the literature review, the questions (scale items) were grouped into five key groups: 1) Effectiveness of Teachers; 2) English Language; 3) Relevance of Course; 4) Student Engagement; 5) Using Technology. Exploratory data analysis was used to explore these issues in more detail. Furthermore, because the data are clustered (students within departments, within colleges, within universities), multilevel generalized linear models for dichotomous outcomes were used to clarify the effects of clustering at those levels. Factor analysis was conducted to confirm the dimension reduction of the variables (scale items). The data from teachers’ interviews were analysed on an individual basis.
The responses were assigned to one of the eight themes that emerged from within the data: 1) the lack of students’ motivation to learn statistics; 2) students’ participation; 3) students’ assessment; 4) the effective use of technology; 5) the level of previous mathematical and statistical skills of non-specialist students; 6) the English language ability of non-specialist students; 7) the need for extra time for teaching and learning statistics; and 8) the role of administrators. All the data from students and teachers indicated that the situation of learning and teaching statistics to non-specialist students in Saudi universities needs to be improved in order to meet the needs of those students. The findings suggested a weakness in the use of technology, such as statistical software programs, in these courses; such applications would allow non-specialist students to consolidate their knowledge. The results also indicated that the English language is considered one of the main challenges in learning and teaching statistics, particularly in institutions where English is not used as the main language. Moreover, students' weak mathematical skills are considered another major challenge. Additionally, the results indicated a need to tailor statistics courses to the needs of non-specialist students based on their main subjects. The findings indicate that statistics teachers need to choose appropriate methods when teaching statistics courses.
Abstract:
The presence of gap junction coupling among neurons of the central nervous system has been appreciated for some time now. In recent years there has been an upsurge of interest from the mathematical community in understanding the contribution of these direct electrical connections between cells to large-scale brain rhythms. Here we analyze a class of exactly soluble single neuron models, capable of producing realistic action potential shapes, that can be used as the basis for understanding dynamics at the network level. This work focuses on planar piece-wise linear models that can mimic the firing response of several different cell types. Under constant current injection the periodic response and phase response curve (PRC) are calculated in closed form. A simple formula for the stability of a periodic orbit is found using Floquet theory. From the calculated PRC and the periodic orbit a phase interaction function is constructed that allows the investigation of phase-locked network states using the theory of weakly coupled oscillators. For large networks with global gap junction connectivity we develop a theory of strong coupling instabilities of the homogeneous, synchronous and splay states. For a piece-wise linear caricature of the Morris-Lecar model, with oscillations arising from a homoclinic bifurcation, we show that large amplitude oscillations in the mean membrane potential are organized around such unstable orbits.
Abstract:
Resistant hypertension (RHTN) includes patients with controlled blood pressure (BP) (CRHTN) and uncontrolled BP (UCRHTN). RHTN patients are more likely to have target organ damage (TOD), and resistin, leptin and adiponectin may affect BP control in these subjects. We assessed the relationship between adipokine levels and arterial stiffness, left ventricular hypertrophy (LVH) and microalbuminuria (MA). This cross-sectional study evaluated body mass index, ambulatory blood pressure monitoring, plasma adiponectin, leptin and resistin concentrations, pulse wave velocity (PWV), MA and echocardiography in CRHTN (n=51) and UCRHTN (n=38) patients. Leptin and resistin levels were higher in the UCRHTN subgroup, whereas adiponectin levels were lower in this same subgroup. Similarly, arterial stiffness, LVH and MA were higher in the UCRHTN subgroup. Adiponectin levels correlated negatively with PWV (r=-0.42, P<0.01) and MA (r=-0.48, P<0.01) only in UCRHTN. Leptin was positively correlated with PWV (r=0.37, P=0.02) in the UCRHTN subgroup, whereas resistin was not correlated with TOD in either subgroup. Adiponectin is associated with arterial stiffness and renal injury in UCRHTN patients, whereas leptin is associated with arterial stiffness in the same subgroup. Taken together, our results showed that these adipokines may contribute to vascular and renal damage in UCRHTN patients.
Abstract:
Universidade Estadual de Campinas. Faculdade de Educação Física
Abstract:
Diagnostic methods have been an important tool in regression analysis for detecting anomalies, such as departures from error assumptions and the presence of outliers and influential observations in the fitted models. Assuming censored data, we considered a classical analysis and a Bayesian analysis, assuming noninformative priors for the parameters of the model with a cure fraction. The Bayesian approach used Markov chain Monte Carlo methods with Metropolis-Hastings algorithm steps to obtain the posterior summaries of interest. Some influence methods, such as local influence, total local influence of an individual, local influence on predictions and generalized leverage, were derived, analyzed and discussed for survival data with a cure fraction and covariates. The relevance of the approach was illustrated with a real data set, where it is shown that removing the most influential observations changes the decision about which model best fits the data.
Abstract:
In many occupational safety interventions, the objective is to reduce the injury incidence as well as the mean claims cost once injury has occurred. The claims cost data within a period typically contain a large proportion of zero observations (no claim). The distribution thus comprises a point mass at 0 mixed with a non-degenerate parametric component. Essentially, the likelihood function can be factorized into two orthogonal components. These two components relate respectively to the effect of covariates on the incidence of claims and the magnitude of claims, given that claims are made. Furthermore, the longitudinal nature of the intervention inherently imposes some correlation among the observations. This paper introduces a zero-augmented gamma random effects model for analysing longitudinal data with many zeros. Adopting the generalized linear mixed model (GLMM) approach reduces the original problem to the fitting of two independent GLMMs. The method is applied to evaluate the effectiveness of a workplace risk assessment teams program, trialled within the cleaning services of a Western Australian public hospital.
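The factorization described in this abstract can be written out as a short sketch. The notation below is an assumption introduced for illustration and does not come from the paper: $y_{ij}$ is the claims cost for unit $i$ at occasion $j$, $d_{ij} = 1$ if a claim occurred, $\pi_{ij}$ is the claim probability, $g$ is a gamma density with mean $\mu_{ij}$ and shape $\nu$, and $u_i$, $v_i$ are random effects. The logit and log links are common choices and are likewise assumed.

```latex
% Conditional likelihood given the random effects (u_i, v_i); notation is illustrative.
\begin{align*}
L(\beta, \gamma, \nu \mid u, v)
  &= \underbrace{\prod_{i,j} \pi_{ij}^{\,d_{ij}} (1 - \pi_{ij})^{1 - d_{ij}}}_{\text{incidence of claims}}
     \;\times\;
     \underbrace{\prod_{i,j:\, d_{ij} = 1} g(y_{ij};\, \mu_{ij}, \nu)}_{\text{magnitude of claims, given a claim}}, \\
\operatorname{logit}(\pi_{ij}) &= x_{ij}^{\top} \beta + u_i, \qquad
\log(\mu_{ij}) = z_{ij}^{\top} \gamma + v_i .
\end{align*}
```

If $u_i$ and $v_i$ are taken to be independent, the first factor involves only $(\beta, u)$ and the second only $(\gamma, \nu, v)$, which is consistent with the abstract's statement that the problem reduces to fitting two separate GLMMs: a binary one for incidence and a gamma one for the positive costs.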
Abstract:
We compare Bayesian methodology utilizing free-ware BUGS (Bayesian Inference Using Gibbs Sampling) with the traditional structural equation modelling approach based on another free-ware package, Mx. Dichotomous and ordinal (three category) twin data were simulated according to different additive genetic and common environment models for phenotypic variation. Practical issues are discussed in using Gibbs sampling as implemented by BUGS to fit subject-specific Bayesian generalized linear models, where the components of variation may be estimated directly. The simulation study (based on 2000 twin pairs) indicated that there is a consistent advantage in using the Bayesian method to detect a correct model under certain specifications of additive genetics and common environmental effects. For binary data, both methods had difficulty in detecting the correct model when the additive genetic effect was low (between 10 and 20%) or of moderate range (between 20 and 40%). Furthermore, neither method could adequately detect a correct model that included a modest common environmental effect (20%) even when the additive genetic effect was large (50%). Power was significantly improved with ordinal data for most scenarios, except for the case of low heritability under a true ACE model. We illustrate and compare both methods using data from 1239 twin pairs over the age of 50 years, who were registered with the Australian National Health and Medical Research Council Twin Registry (ATR) and presented symptoms associated with osteoarthritis occurring in joints of the hand.
Abstract:
This paper proposes a template for modelling complex datasets that integrates traditional statistical modelling approaches with more recent advances in statistics and modelling through an exploratory framework. Our approach builds on the well-known and long-standing traditional idea of 'good practice in statistics' by establishing a comprehensive framework for modelling that focuses on exploration, prediction, interpretation and reliability assessment, a relatively new idea that allows individual assessment of predictions. The integrated framework we present comprises two stages. The first involves the use of exploratory methods to help visually understand the data and identify a parsimonious set of explanatory variables. The second encompasses a two-step modelling process, in which non-parametric methods such as decision trees and generalized additive models are used to identify important variables and their modelling relationship with the response before a final predictive model is considered. We focus on fitting the predictive model using parametric, non-parametric and Bayesian approaches. This paper is motivated by a medical problem where interest focuses on developing a risk stratification system for morbidity of 1,710 cardiac patients given a suite of demographic, clinical and preoperative variables. Although the methods we use are applied specifically to this case study, they can be applied across any field, irrespective of the type of response.
Abstract:
Dissertation submitted for the degree of Doctor in Electrical and Computer Engineering (Digital and Perceptual Systems), Universidade Nova de Lisboa, Faculdade de Ciências e Tecnologia
Abstract:
In health-related research it is common to have multiple outcomes of interest in a single study. These outcomes are often analysed separately, ignoring the correlation between them. One would expect that a multivariate approach would be a more efficient alternative to individual analyses of each outcome. Surprisingly, this is not always the case. In this article we discuss different settings of linear models and compare the multivariate and univariate approaches. We show that for linear regression models, the estimates of the regression parameters associated with covariates that are shared across the outcomes are the same for the multivariate and univariate models, while for outcome-specific covariates the multivariate model performs better in terms of efficiency.
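The shared-covariate result stated in this abstract can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes two outcomes, a common design matrix, and a known error covariance `Sigma`, all of which are hypothetical values chosen for illustration. It compares per-outcome ordinary least squares with generalized least squares on the stacked multivariate system.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 3

# Shared design matrix (intercept plus two covariates), used for both outcomes.
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])

# Hypothetical coefficients (one column per outcome) and correlated errors.
B_true = np.array([[1.0, -2.0], [0.5, 0.0], [0.0, 3.0]])
Sigma = np.array([[1.0, 0.8], [0.8, 2.0]])
Y = X @ B_true + rng.multivariate_normal(np.zeros(2), Sigma, size=n)

# Univariate approach: ordinary least squares, one outcome at a time.
B_ols = np.linalg.lstsq(X, Y, rcond=None)[0]            # shape (p, 2)

# Multivariate approach: GLS on the stacked system
#   y = (I_2 kron X) b + e,  Cov(e) = Sigma kron I_n
y_stacked = Y.T.reshape(-1)                              # outcome 1, then outcome 2
Z = np.kron(np.eye(2), X)
W = np.kron(np.linalg.inv(Sigma), np.eye(n))
b_gls = np.linalg.solve(Z.T @ W @ Z, Z.T @ W @ y_stacked)
B_gls = b_gls.reshape(2, p).T                            # back to shape (p, 2)

max_abs_diff = float(np.max(np.abs(B_gls - B_ols)))
print(max_abs_diff)  # differs from zero only by floating-point rounding
```

The two sets of estimates agree because, with an identical design for every outcome, the weighting by the inverse of `Sigma` cancels out of the GLS normal equations. The sketch does not cover the outcome-specific covariate case, where the abstract reports an efficiency gain for the multivariate model.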
Abstract:
In this work, kriging with covariates is used to model and map the spatial distribution of salinity measurements gathered by an autonomous underwater vehicle in a sea outfall monitoring campaign. The aim is to distinguish the effluent plume from the receiving waters and to characterize its spatial variability in the vicinity of the discharge. Four different geostatistical linear models for salinity were assumed, where the distance to the diffuser, the west-east positioning, and the south-north positioning were used as covariates. Sample variograms were fitted with Matérn models using both weighted least squares and maximum likelihood estimation as a way to detect possible discrepancies. Typically, the maximum likelihood method estimated very low ranges, which limited the kriging process. So, at least for these data sets, weighted least squares proved to be the most appropriate estimation method for variogram fitting. The kriged maps clearly show the spatial variation of salinity, and it is possible to identify the effluent plume in the area studied. The results provide some guidelines for sewage monitoring when a geostatistical analysis of the data is planned. It is important to treat anomalous values properly and to adopt a sampling strategy that includes transects parallel and perpendicular to the effluent dispersion.
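For reference, one common parametrization of the Matérn variogram and a weighted least squares fitting criterion are sketched below. The symbols are assumptions introduced here ($\tau^2$ nugget, $\sigma^2$ partial sill, $\phi$ range parameter, $\kappa$ smoothness, $K_\kappa$ the modified Bessel function of the second kind, $N_k$ the number of point pairs in lag bin $k$); the abstract does not state which parametrization or which weights were used.

```latex
% Matern variogram (one common parametrization) and a WLS criterion; notation is illustrative.
\begin{align*}
\gamma(h;\theta) &= \tau^2 + \sigma^2 \left[ 1 - \frac{2^{\,1-\kappa}}{\Gamma(\kappa)}
    \left( \frac{h}{\phi} \right)^{\kappa} K_{\kappa}\!\left( \frac{h}{\phi} \right) \right],
    \qquad h > 0, \quad \theta = (\tau^2, \sigma^2, \phi, \kappa), \\
\hat{\theta}_{\mathrm{WLS}} &= \arg\min_{\theta} \sum_{k} w_k
    \left[ \hat{\gamma}(h_k) - \gamma(h_k;\theta) \right]^2,
    \qquad \text{e.g. } w_k = N_k .
\end{align*}
```

A very small estimated $\phi$ makes $\gamma(h;\theta)$ reach its sill almost immediately, so nearby observations receive little weight in kriging, which is one way to read the abstract's remark that low maximum likelihood ranges limited the kriging process.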
Abstract:
STUDY DESIGN: Prospective, controlled, observational outcome study using clinical, radiographic, and patient/physician-based questionnaire data, with patient outcomes at 12 months follow-up. OBJECTIVE: To validate appropriateness criteria for low back surgery. SUMMARY OF BACKGROUND DATA: Most surgical treatment failures are attributed to poor patient selection, but no widely accepted consensus exists on detailed indications for appropriate surgery. METHODS: Appropriateness criteria for low back surgery have been developed by a multispecialty panel using the RAND appropriateness method. Based on panel criteria, a prospective study compared outcomes of patients appropriately and inappropriately treated at a single institution with 12 months follow-up assessment. Included were patients with low back pain and/or sciatica referred to the neurosurgical department. Information about symptoms, neurologic signs, the health-related quality of life (SF-36), disability status (Roland-Morris), and pain intensity (VAS) was assessed at baseline, at 6 months, and at 12 months follow-up. The appropriateness criteria were administered prospectively to each clinical situation and outside of the clinical setting, with the surgeon and patients blinded to the results of the panel decision. The patients were further stratified into 2 groups: appropriate treatment group (ATG) and inappropriate treatment group (ITG). RESULTS: Overall, 398 patients completed all forms at 12 months. Treatment was considered appropriate for 365 participants and inappropriate for 33 participants. The mean improvement in the SF-36 physical component score at 12 months was significantly higher in the ATG (mean: 12.3 points) than in the ITG (mean: 6.8 points) (P = 0.01), as well as the mean improvement in the SF-36 mental component score (ATG mean: 5.0 points; ITG mean: -0.5 points) (P = 0.02). 
Improvement was also significantly higher in the ATG for the mean VAS back pain (ATG mean: 2.3 points; ITG mean: 0.8 points; P = 0.02) and Roland-Morris disability score (ATG mean: 7.7 points; ITG mean: 4.2 points; P = 0.004). The ATG also had a higher improvement in mean VAS for sciatica (4.0 points) than the ITG (2.8 points), but the difference was not significant (P = 0.08). The SF-36 General Health score declined in both groups after 12 months; however, the decline was worse in the ITG (mean decline: 8.2 points) than in the ATG (mean decline: 1.2 points) (P = 0.04). Overall, in comparison to ITG patients, ATG patients had significantly higher improvement at 12 months, both statistically and clinically. CONCLUSION: In comparison to previously reported literature, our study is the first to assess the utility of appropriateness criteria for low back surgery at 1-year follow-up with multiple outcome dimensions. Our results confirm the hypothesis that application of appropriateness criteria can significantly improve patient outcomes.
Abstract:
In automobile insurance, it is useful to achieve a priori ratemaking by resorting to generalized linear models, and here the Poisson regression model constitutes the most widely accepted basis. However, insurance companies distinguish between claims with or without bodily injuries, or claims with full or partial liability of the insured driver. This paper examines an a priori ratemaking procedure when including two different types of claim. When assuming independence between claim types, the premium can be obtained by summing the premiums for each type of guarantee and is dependent on the rating factors chosen. If the independence assumption is relaxed, then it is unclear how the tariff system might be affected. In order to answer this question, bivariate Poisson regression models, suitable for paired count data exhibiting correlation, are introduced. It is shown that the usual independence assumption is unrealistic here. These models are applied to an automobile insurance claims database containing 80,994 contracts belonging to a Spanish insurance company. Finally, the consequences for pure and loaded premiums when the independence assumption is relaxed by using a bivariate Poisson regression model are analysed.
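To make the role of the dependence between claim types concrete, here is a minimal sketch of one standard bivariate Poisson construction (trivariate reduction), in which the two claim counts share a common Poisson component. The abstract does not say that this exact construction is the one used, and the rates below are hypothetical values chosen for illustration. The code computes the joint probability mass function and, by direct summation, the means, the covariance between claim types, and the variance of the total number of claims.

```python
from math import exp, factorial


def bivariate_poisson_pmf(n1, n2, lam1, lam2, lam0):
    """P(N1 = n1, N2 = n2) when N1 = X1 + X0 and N2 = X2 + X0, with
    X1, X2, X0 independent Poisson(lam1), Poisson(lam2), Poisson(lam0)."""
    total = 0.0
    for k in range(min(n1, n2) + 1):  # k is the value of the shared component X0
        total += (lam1 ** (n1 - k) / factorial(n1 - k)
                  * lam2 ** (n2 - k) / factorial(n2 - k)
                  * lam0 ** k / factorial(k))
    return exp(-(lam1 + lam2 + lam0)) * total


def summarize(lam1, lam2, lam0, upper=30):
    """Means, covariance and variance of N1 + N2, summed over counts below `upper`."""
    mean1 = mean2 = cross = second_total = 0.0
    for n1 in range(upper):
        for n2 in range(upper):
            p = bivariate_poisson_pmf(n1, n2, lam1, lam2, lam0)
            mean1 += n1 * p
            mean2 += n2 * p
            cross += n1 * n2 * p
            second_total += (n1 + n2) ** 2 * p
    covariance = cross - mean1 * mean2
    var_total = second_total - (mean1 + mean2) ** 2
    return mean1, mean2, covariance, var_total


# Hypothetical annual claim frequencies for two claim types and a shared component.
mean1, mean2, covariance, var_total = summarize(lam1=0.08, lam2=0.03, lam0=0.02)
print(mean1, mean2, covariance, var_total)
```

In this construction the marginal means are `lam1 + lam0` and `lam2 + lam0`, the covariance equals `lam0`, and the variance of the total exceeds the sum of the marginal variances by `2 * lam0`. Setting `lam0=0` recovers the independence case, so the expected total claim count is the same for given marginal means while the variability of the total is larger under positive dependence.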
Abstract:
BACKGROUND: To date, there is no quality assurance program that correlates patient outcome to perfusion service provided during cardiopulmonary bypass (CPB). A score was devised, incorporating objective parameters that would reflect the likelihood to influence patient outcome. The purpose was to create a new method for evaluating the quality of care the perfusionist provides during CPB procedures and to deduce whether it predicts patient morbidity and mortality. METHODS: We analysed 295 consecutive elective patients. We chose 10 parameters: fluid balance, blood transfused, Hct, ACT, PaO2, PaCO2, pH, BE, potassium and CPB time. Distribution analysis was performed using the Shapiro-Wilcoxon test. This made up the PerfSCORE and we tried to find a correlation to mortality rate, patient stay in the ICU and length of mechanical ventilation. Univariate analysis (UA) using linear regression was established for each parameter. Statistical significance was established when p < 0.05. Multivariate analysis (MA) was performed with the same parameters. RESULTS: The mean age was 63.8 +/- 12.6 years with 70% males. There were 180 CABG, 88 valves, and 27 combined CABG/valve procedures. The PerfSCORE of 6.6 +/- 2.4 (0-20), mortality of 2.7% (8/295), CPB time 100 +/- 41 min (19-313), ICU stay 52 +/- 62 hrs (7-564) and mechanical ventilation of 10.5 +/- 14.8 hrs (0-564) was calculated. CPB time, fluid balance, PaO2, PerfSCORE and blood transfused were significantly correlated to mortality (UA, p < 0.05). Also, CPB time, blood transfused and PaO2 were parameters predicting mortality (MA, p < 0.01). Only pH was significantly correlated for predicting ICU stay (UA). Ultrafiltration (UF) and CPB time were significantly correlated (UA, p < 0.01) while UF (p < 0.05) was the only parameter predicting mechanical ventilation duration (MA). CONCLUSIONS: CPB time, blood transfused and PaO2 are independent risk factors of mortality. 
Fluid balance, blood transfusion, PaO2, PerfSCORE and CPB time are independent parameters for predicting morbidity. PerfSCORE is a quality of perfusion measure that objectively quantifies perfusion performance.
Abstract:
1. Model-based approaches have been used increasingly in conservation biology over recent years. Species presence data used for predictive species distribution modelling are abundant in natural history collections, whereas reliable absence data are sparse, most notably for vagrant species such as butterflies and snakes. As predictive methods such as generalized linear models (GLM) require absence data, various strategies have been proposed to select pseudo-absence data. However, only a few studies exist that compare different approaches to generating these pseudo-absence data. 2. Natural history collection data are usually available for long periods of time (decades or even centuries), thus allowing historical considerations. However, this historical dimension has rarely been assessed in studies of species distribution, although there is great potential for understanding current patterns, i.e. the past is the key to the present. 3. We used GLM to model the distributions of three 'target' butterfly species, Melitaea didyma, Coenonympha tullia and Maculinea teleius, in Switzerland. We developed and compared four strategies for defining pools of pseudo-absence data and applied them to natural history collection data from the last 10, 30 and 100 years. Pools included: (i) sites without target species records; (ii) sites where butterfly species other than the target species were present; (iii) sites without butterfly species but with habitat characteristics similar to those required by the target species; and (iv) a combination of the second and third strategies. Models were evaluated and compared by the total deviance explained, the maximized Kappa and the area under the curve (AUC). 4. Among the four strategies, model performance was best for strategy 3. Contrary to expectations, strategy 2 resulted in even lower model performance compared with models with pseudo-absence data simulated totally at random (strategy 1). 5. 
Independent of the strategy, model performance was enhanced when sites with historical species presence data were not considered as pseudo-absence data. Therefore, the combination of strategy 3 with species records from the last 100 years achieved the highest model performance. 6. Synthesis and applications. The protection of suitable habitat for species survival or reintroduction in rapidly changing landscapes is a high priority among conservationists. Model-based approaches offer planning authorities the possibility of delimiting priority areas for species detection or habitat protection. The performance of these models can be enhanced by fitting them with pseudo-absence data relying on large archives of natural history collection species presence data rather than using randomly sampled pseudo-absence data.
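Two of the evaluation measures named in this abstract, the area under the curve (AUC) and the maximized Kappa, can be illustrated with a small sketch. The predicted probabilities and labels below are hypothetical, and the functions are a generic implementation under the assumption that presences are coded 1 and pseudo-absences 0; they are not taken from the study.

```python
def auc(scores, labels):
    """AUC as the probability that a random presence outscores a random
    pseudo-absence (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def kappa(scores, labels, threshold):
    """Cohen's kappa for the presence/absence classification at one threshold."""
    n = len(labels)
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, y in zip(pred, labels) if p == 1 and y == 1)
    tn = sum(1 for p, y in zip(pred, labels) if p == 0 and y == 0)
    fp = sum(1 for p, y in zip(pred, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(pred, labels) if p == 0 and y == 1)
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return (observed - expected) / (1 - expected) if expected < 1 else 0.0


def max_kappa(scores, labels):
    """Kappa maximized over all thresholds taken from the observed scores."""
    return max(kappa(scores, labels, t) for t in set(scores))


# Hypothetical fitted probabilities from a presence/pseudo-absence GLM.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

print(auc(scores, labels))        # 0.8125
print(max_kappa(scores, labels))  # 0.5
```

AUC is threshold-free, whereas Kappa depends on the cut-off used to turn probabilities into presences, which is why it is reported at its maximum. Both measures are computed against pseudo-absences here, so their values depend on how the pseudo-absence pool was defined.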