902 results for Generalised Linear Modelling
Abstract:
Behavioural researchers commonly use single-subject designs to evaluate the effects of a given treatment. Several different methods of data analysis are used, each with its own set of methodological strengths and limitations. Visual inspection is commonly used to analyse data; it assesses variability, level, and trend both within and between conditions (Cooper, Heron, & Heward, 2007). In an attempt to quantify treatment outcomes, researchers developed two methods for analysing data: Percentage of Non-overlapping Data Points (PND) and Percentage of Data Points Exceeding the Median (PEM). The purpose of the present study is to compare and contrast the use of Hierarchical Linear Modelling (HLM), PND and PEM in single-subject research. The present study used 39 behaviours across 17 participants to compare treatment outcomes of a group cognitive behavioural therapy program, using PND, PEM, and HLM on three response classes of Obsessive Compulsive Behaviour in children with Autism Spectrum Disorder. Findings suggest that PEM and HLM complement each other and that both add valuable information to the overall treatment results. Future research should consider using both PEM and HLM when analysing single-subject designs, specifically grouped data with variability.
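The two overlap statistics named above can be illustrated with a minimal Python sketch. It assumes the usual textbook definitions (PND counts treatment points beyond the most extreme baseline point; PEM counts treatment points beyond the baseline median); the function names and the data are hypothetical and are not taken from the study.

```python
from statistics import median


def pnd(baseline, treatment, increase_expected=True):
    """Percentage of treatment points beyond the most extreme baseline point."""
    if increase_expected:
        extreme = max(baseline)
        hits = sum(1 for x in treatment if x > extreme)
    else:
        extreme = min(baseline)
        hits = sum(1 for x in treatment if x < extreme)
    return 100.0 * hits / len(treatment)


def pem(baseline, treatment, increase_expected=True):
    """Percentage of treatment points beyond the baseline median."""
    med = median(baseline)
    if increase_expected:
        hits = sum(1 for x in treatment if x > med)
    else:
        hits = sum(1 for x in treatment if x < med)
    return 100.0 * hits / len(treatment)


# Hypothetical counts of a target behaviour per session (a reduction is expected).
baseline = [8, 7, 9, 8, 3]       # one unusually low baseline session
treatment = [6, 5, 9, 4, 3, 2]

print(f"PND = {pnd(baseline, treatment, increase_expected=False):.1f}%")  # PND = 16.7%
print(f"PEM = {pem(baseline, treatment, increase_expected=False):.1f}%")  # PEM = 83.3%
```

In this made-up series a single low baseline session pulls PND down sharply while PEM is barely affected, which is one way variability in the data can separate the two measures.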
Abstract:
We introduce a diagnostic test for the mixing distribution in a generalised linear mixed model. The test is based on the difference between the marginal maximum likelihood and conditional maximum likelihood estimates of a subset of the fixed effects in the model. We derive the asymptotic variance of this difference, and propose a test statistic that has a limiting chi-square distribution under the null hypothesis that the mixing distribution is correctly specified. For the important special case of the logistic regression model with random intercepts, we evaluate via simulation the power of the test in finite samples under several alternative distributional forms for the mixing distribution. We illustrate the method by applying it to data from a clinical trial investigating the effects of hormonal contraceptives in women.
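As a rough illustration of the kind of statistic described, the Python sketch below handles the simplest case of a single fixed effect: the squared difference between the two estimates is divided by the variance of that difference and referred to a chi-square distribution with one degree of freedom. The variance is taken as a given input here (the abstract states that the authors derive it, but does not give its form), and the function name and numbers are hypothetical.

```python
import math


def difference_test(beta_marginal, beta_conditional, var_difference):
    """Wald-type statistic for one coefficient, with its chi-square(1) p-value.

    var_difference is the (asymptotic) variance of the difference between the
    two estimates; it is an input here, not something this sketch derives.
    """
    d = beta_marginal - beta_conditional
    stat = d * d / var_difference
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical estimates of one fixed effect from the two fitting methods.
stat, p = difference_test(beta_marginal=0.80, beta_conditional=0.50,
                          var_difference=0.0225)
print(f"statistic = {stat:.2f}, p = {p:.4f}")
```

With several fixed effects the scalar ratio would become a quadratic form in the vector of differences, with degrees of freedom equal to the number of coefficients compared.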
Abstract:
Protocols for bioassessment often relate changes in summary metrics that describe aspects of biotic assemblage structure and function to environmental stress. Biotic assessment using multimetric indices now forms the basis for setting regulatory standards for stream quality and a range of other goals related to water resource management in the USA and elsewhere. Biotic metrics are typically interpreted with reference to the expected natural state to evaluate whether a site is degraded. It is critical that natural variation in biotic metrics along environmental gradients is adequately accounted for, in order to quantify human disturbance-induced change. A common approach used in the IBI is to examine scatter plots of variation in a given metric along a single stream size surrogate and fit a line (drawn by eye) to form the upper bound, and hence define the maximum likely value of a given metric in a site of a given environmental characteristic (termed the 'maximum species richness line', MSRL). In this paper we examine whether the use of a single environmental descriptor and the MSRL is appropriate for defining the reference condition for a biotic metric (fish species richness) and for detecting human disturbance gradients in rivers of south-eastern Queensland, Australia. We compare the accuracy and precision of the MSRL approach based on single environmental predictors with three regression-based prediction methods (Simple Linear Regression (SLR), Generalised Linear Modelling and Regression Tree modelling) that use (either singly or in combination) a set of landscape and local scale environmental variables as predictors of species richness. We compare the frequency of classification errors from each method against set biocriteria and contrast the ability of each method to accurately reflect human disturbance gradients at a large set of test sites.
The results of this study suggest that the MSRL based upon variation in a single environmental descriptor could not accurately predict species richness at minimally disturbed sites when compared with SLRs based on equivalent environmental variables. Regression-based modelling incorporating multiple environmental variables as predictors more accurately explained natural variation in species richness than did simple models using single environmental predictors. Prediction error arising from the MSRL was substantially higher than for the regression methods and led to an increased frequency of Type I errors (incorrectly classing a site as disturbed). We suggest that problems with the MSRL arise from the inherent scoring procedure used and that it is limited to predicting variation in the dependent variable along a single environmental gradient.
Abstract:
Grazing by domestic livestock is one of the most widespread uses of the rangelands of Australia. There is limited information on the effects of such grazing on Australia's vertebrate fauna. The establishment of a long-term grazing experiment at Wambiana in north-eastern Queensland provided an opportunity to examine changes in vertebrate fauna resulting from the manipulation of stocking rates. The aim was to identify the relative effects of vegetation type, stocking rate and other landscape-scale environmental factors on the patterns recorded. Sixteen 1-ha sites were established within three replicated treatments (moderate, heavy and variable stocking rates). The sites were sampled in the wet and dry seasons in 1999-2000 (T-0) and again in 2003-04 (T-1). All paddocks of the treatments were burnt in 1999. Average annual rainfall declined markedly between the two sampling periods, which made interpretation of the data difficult. A total of 127 species of vertebrate fauna comprising five amphibian, 83 bird, 27 reptile and 12 mammal species were recorded. There was strong separation in faunal composition from T-0 to T-1, although changes in mean compositional dissimilarity between the stocking rate treatments were less well defined. There was a relative change in abundance of 24 bird, four mammal and five reptile species from T-0 to T-1. The generalised linear modelling identified that, in the T-1 data, there was significant variation in the abundance of 16 species explained by the grazing and vegetation factors. This study demonstrated that the vertebrate fauna assemblage did change and that these changes were attributable to the interplay between the stocking rates, the vegetation types on the sites surveyed, the burning of the experimental paddocks and the decrease in rainfall over the course of the two surveys.
It is recommended that the experiment be sampled again, but that the focus be on a rapid survey of abundant taxa (i.e. birds and reptiles) to allow an increase in the frequency of sampling and replication of the data. This would help to articulate more clearly the trajectory of vertebrate change due to the relative effects of stocking rates compared with wider landscape environmental changes. Given the increasing focus on pastoral development in northern Australia, any opportunity to incorporate the collection of data on biodiversity into grazing manipulation experiments should be taken for the assessment of the effects of land management on faunal species.
Abstract:
The number of days spent in acute hospitals (DAH) at the end of life is regarded as an important care quality indicator for cancer patients. We analysed DAH during the 90 days prior to death in patients from four Swiss cantons. Claims data from an insurance provider with about 20% market share and patient record review identified 2086 patients as dying of cancer. We calculated total DAH per patient. Multivariable generalised linear modelling served to evaluate potential explanatory variables. Mean DAH was 26 days. In the multivariable model, using complementary and alternative medicine (DAH = 33.9; +8.8 days compared to non-users) and canton of residence (for patients receiving anti-cancer therapy, Zürich DAH = 22.8 versus Basel DAH = 31.4; for other patients, Valais DAH = 22.7 versus Ticino DAH = 33.7) had the strongest influence. Age at death and days spent in other institutions were additional significant predictors. DAH during the last 90 days of life of cancer patients from four Swiss cantons is high compared to most other countries. Several factors influence DAH. The resulting differences are likely to have a financial impact, as DAH is a major cost driver for end-of-life care. Whether they are supply- or demand-driven and whether patients would prefer fewer days in hospital remains to be established.
Abstract:
Performance analysis in sport plays an essential role in professional football. Although match analysis in football has been applied in a range of settings and situations, several aspects and components of the game remain unstudied. In particular, research needs to move beyond earlier, mainly descriptive studies on several fronts: performance variables and indicators that have been neither defined nor studied, observational methods whose validity has not been tested with football-specific software, the applicability and usefulness of the results, and the limited study of situational/contextual variables. To address these limitations, six independent but interrelated studies were designed. The first study evaluates the inter-observer reliability of the match statistics produced by the private company OPTA Sportsdata, which are the data source for this doctoral thesis. Two groups of experienced observers were asked to analyse a Spanish league match independently. The results show that team and goalkeeper events coded by the independent operators reached very good agreement (kappa values between 0.86 and 0.94). The inter-observer reliability of match actions and individual player data also showed high levels of agreement (intraclass correlation coefficients from 0.88 to 1.00; standardised typical error from 0.00 to 0.37). The results suggest that the match statistics recorded by OPTA Sportsdata's well-trained operators are reliable.
The second, third and fourth studies focus on highlighting the applicability of performance analysis in football and on explaining in depth the influence of situational variables. With the performance-profiling technique, the performance of football players and teams can be evaluated and compared in a graphical, simple and visual way. The technique also makes it possible to control for the effect of situational variables (match location, strength of the team and of the opponent, and final match result). Performance profiles of goalkeepers (n = 46 goalkeepers, 744 observations) and outfield players (n = 409 players, 5288 observations) from the Spanish first professional football division (La Liga, season 2012-13), and of teams (n = 496 matches, 992 observations) from the UEFA Champions League (seasons 2009-10 to 2012-13), were analysed by recording the mean, standard deviation, median, and upper and lower quartiles of the count values of each performance indicator and event, which were presented in standardised and normalised form. The mean values of goalkeepers from La Liga teams of different levels, and of UEFA Champions League teams of different levels playing in different match contexts and situations (situational variables), were compared using one-way ANOVA and the independent-samples t test (for match location, i.e. home versus away differences), and were plotted in radar charts after all records had been unified on the same derived scale of standardised values. Differences in performance between players from the best (Top3) and worst (Bottom3) teams were compared using differences in effect-size magnitude. The fifth and sixth studies analysed football performance from the standpoint of performance prediction.
Generalised linear models and generalised linear mixed models were used to analyse the magnitude of the relationships between match indicators and statistics and the final match result, according to match type (close matches or all matches), in the group stage of the 2014 World Cup in Brazil (n = 48 matches, 38 close matches) and in La Liga 2012-13 (n = 320 close matches). The relationships were evaluated with magnitude-based inferences and expressed as extra matches won or lost per 10 matches for a change of two standard deviations in the variable. The results showed that, across the 48 group-stage matches of the 2014 World Cup, nine variables had a positive effect on the probability of winning (shots, shots on target, counter-attack shots, shots from inside the area, ball possession, short passes, average pass sequence, aerial duels and tackles), four had negative effects (blocked shots, crosses, dribbles and red cards), and the other 12 variables had trivial or unclear effects. In the 38 close matches, the effects of aerial duels and yellow cards were trivial and clearly negative, respectively. In La Liga, there was a moderate positive within-team effect for shots on target (3.4 extra wins per 10 matches; 99% CI ±1.0) and a small positive effect for total shots (1.7 extra wins; ±1.0). The effects of most events were related to ball possession, which had a negative within-team effect (1.2 extra losses; ±1.0) but a small positive between-team effect (1.7 extra wins; ±1.4). Match location showed a small positive within-team effect (1.9 extra wins; ±0.9).
The results obtained from the performance profiles and modelling provide detailed, advanced information for training, pre-match preparation, competition monitoring and post-match analysis, as well as for evaluating and identifying player talent. ABSTRACT Match performance analysis plays an important role in modern professional football. Although research in football match analysis is well developed, some issues and problems remain in this field, mainly the lack of operational definitions of variables, reliability issues, limited applicability of the findings, the lack of contextual/situational variables, and too much focus on descriptive and comparative analysis. To address these issues, six independent but related studies were conducted in the current thesis. The first study evaluated the inter-operator reliability of football match statistics from OPTA Sportsdata Company, which is the data source of the thesis. Two groups of experienced operators were required to analyse a Spanish league match independently in the experiment. Results showed that team events and goalkeeper actions coded by independent operators reached very good agreement (kappa values between 0.86 and 0.94). The inter-operator reliability of match actions and events of individual outfield players was also found to be high (intra-class correlation coefficients ranged from 0.88 to 1.00; standardised typical error varied from 0.00 to 0.37). These results suggest that the football match statistics collected by well-trained operators from OPTA Sportsdata Company are reliable. The second, third and fourth studies aim to enhance the applicability of football match performance analysis and to explore in depth the influences of situational variables.
By using a profiling technique, the technical and tactical performances of football players and teams can be interpreted, evaluated and compared more easily and directly, and the influences of situational variables (match location, strength of team and opposition, and match outcome) on performance can be properly incorporated. Performance profiles of goalkeepers (n = 46 goalkeepers, 744 full match observations) and outfield players (n = 409 players, 5288 full match observations) from the Spanish First Division Professional Football League (La Liga, season 2012-13), and of teams (n = 496 matches, 992 observations) from the UEFA Champions League (seasons 2009-10 to 2012-13), were set up by presenting the mean, standard deviation, median, and lower and upper quartiles of the count values of each performance-related match action and event to represent their typical performances and spreads. Means of goalkeepers from teams of different levels in La Liga and of teams of different strength in the UEFA Champions League when playing under different situational conditions were compared using one-way ANOVA and the independent-samples t test (for match location, i.e. home and away differences), and were plotted on the same radar charts after all event counts had been unified as standardised scores. Differences between the performances of outfield players from Top3 and Bottom3 teams were compared using magnitude-based inferences. The fifth and sixth studies aim to move from descriptive and comparative football match analysis to a more predictive one. Generalised linear modelling and generalised mixed linear modelling were undertaken to quantify the relationships of performance-related match events, actions and variables with match outcome in different types of games (close games and all games) in the group stage of the 2014 FIFA World Cup in Brazil (n = 48 games, 38 close games) and La Liga 2012-13 (n = 320 close games).
Relationships were evaluated with magnitude-based inferences and were expressed as extra matches won or lost per 10 matches for an increase of two standard deviations in a variable. Results showed that, for all 48 games in the group stage of the 2014 FIFA World Cup, nine variables had clearly positive effects on the probability of winning (shot, shot on target, shot from counter attack, shot from inside area, ball possession, short pass, average pass streak, aerial advantage, and tackle), four had clearly negative effects (shot blocked, cross, dribble and red card), and the other 12 variables had either trivial or unclear effects. For the 38 close games, the effects of aerial advantage and yellow card became trivial and clearly negative, respectively. In La Liga, there was a moderate positive within-team effect from shots on target (3.4 extra wins per 10 matches; 99% confidence limits ±1.0), and a small positive within-team effect from total shots (1.7 extra wins; ±1.0). Effects of most other match events were related to ball possession, which had a small negative within-team effect (1.2 extra losses; ±1.0) but a small positive between-team effect (1.7 extra wins; ±1.4). Game location showed a small positive within-team effect (1.9 extra wins; ±0.9). Results from the established performance profiles and modelling can provide detailed and straightforward information for training, pre-match preparations, in-match tactical approaches and post-match evaluations, as well as for player identification and development.
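Abstract Match performance analysis plays a pivotal role in modern football. Although research on football match performance analysis is now relatively mature, many shortcomings remain. These mainly include a lack of clear definitions of study variables, missing evidence of data reliability and validity, limited practical applicability of findings, the omission of situational match factors, and an excessive focus on descriptive and comparative analysis. To address these problems, this thesis refines football match performance analysis through six independent but interrelated studies. The first study experimentally tested the reliability and validity of the thesis's data source, the football match data of OPTA Sportsdata. In the experiment, two groups of data collectors were asked to analyse the same Spanish first-division league match. The results show that the team match events and goalkeeper actions recorded by the two groups were highly consistent (kappa coefficients between 0.86 and 0.94). The match actions and events of outfield players output by the collectors also showed high between-group agreement (intraclass correlation coefficients between 0.88 and 1.00; standardised typical error between 0.00 and 0.37). The experiment demonstrates that the football match data collected by OPTA Sportsdata are sufficiently reliable and valid. The second, third and fourth studies aim to improve the practical applicability of match performance analysis findings and to explore in depth how situational factors affect match performance. By building profiles of the technical and tactical match performance of players and teams, their performance can be presented, evaluated and compared simply and intuitively, and the influence of situational variables (match venue, strength of team and opponent, match result) can also be integrated into the profiles. This part built profiles of the technical and tactical match performance of goalkeepers (n = 46 players, 744 match observations) and outfield players (n = 409 players, 5288 match observations) in the 2012-13 Spanish first-division league season, and of teams (n = 496 matches) in the UEFA Champions League from 2009-10 to 2012-13. In the profiles, the mean, standard deviation, median, and upper and lower quartiles of each technical and tactical indicator are used to show the typical performance and performance variability of goalkeepers, outfield players and teams. Analysis of variance (ANOVA) was used to compare the typical performance (indicator means) of goalkeepers from La Liga teams of different levels and of Champions League teams of different levels under different match situations, and independent-samples t tests were used to compare typical performance between home and away matches. Magnitude-based inferences were used to compare the typical performance of outfield players from the top three and bottom three La Liga teams. All match indicators for players from teams of different levels and for teams of different levels were converted to standardised scores, so that their typical performance (indicator means) under the various match situations could be plotted on the same radar charts for direct visual comparison. The fifth and sixth studies aim to carry out predictive football match performance analysis, moving beyond the previously established descriptive and comparative analysis. Generalised linear models and generalised mixed linear models were used to model the relationships between performance-related match events, actions and variables and match outcome (win, draw, loss) in the group stage of the 2014 World Cup in Brazil (n = 48 matches, 38 close matches) and in La Liga 2012-13 (n = 320 close matches). Relationships in the models were characterised with magnitude-based inferences, expressed as the effect on match outcome of a two-standard-deviation increase in a variable (extra matches won or lost per 10 matches). The results show that, across all 48 group-stage matches of the 2014 World Cup, nine variables (shots, shots on target, shots from counter-attacks, shots from inside the penalty area, ball possession, short passes, average number of consecutive passes, aerial duel success rate and tackles) had a clearly positive relationship with the probability of winning, four variables (blocked shots, crosses, dribbles and red cards) had a clearly negative relationship, and the other 12 variables analysed had trivial or unclear relationships. In the 38 close matches, aerial duel success rate changed from a positive to a trivial relationship, and yellow cards changed from a trivial to a clearly negative relationship. In La Liga, a two-standard-deviation increase in "shots on target" for an individual team brought 3.4 extra wins per 10 matches (99% confidence interval ±1.0), whereas, taking all teams as a whole, a two-standard-deviation increase in "shots on target" brought 1.7 extra wins per 10 matches (99% confidence interval ±1.0). The relationships of most other match events with match outcome were linked to ball possession. A two-standard-deviation increase in possession for an individual team brought 1.2 extra losses per 10 matches (99% confidence interval ±1.0), whereas for all teams as a whole it brought 1.7 extra wins per 10 matches (99% confidence interval ±1.4). Compared with away matches, playing at home brought a team 1.9 extra wins per 10 matches (99% confidence interval ±0.9). The findings from the performance profiles and models can provide detailed and direct reference information for clubs, teams, coaching staff, performance analysts and players. This information can be used for training guidance, pre-match preparation, in-match technical and tactical adjustment and post-match performance analysis, as well as for player selection, cultivation and development.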
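The effect scale used in this entry, extra matches won or lost per 10 matches for a two-standard-deviation increase in a variable, can be illustrated with a small Python sketch. It assumes a logistic (logit-link) model for winning and a reference win probability of 0.5; neither assumption, nor the function name or the numbers, comes from the thesis, whose exact calculation is not given here.

```python
import math


def extra_wins_per_10(coef_per_unit, sd, baseline_win_prob=0.5):
    """Change in expected wins per 10 matches for a 2-SD increase in a predictor.

    coef_per_unit: logit-scale coefficient per raw unit of the predictor (assumed).
    sd: standard deviation of the predictor.
    baseline_win_prob: reference win probability (an assumption of this sketch).
    """
    logit0 = math.log(baseline_win_prob / (1.0 - baseline_win_prob))
    logit1 = logit0 + coef_per_unit * 2.0 * sd   # shift the predictor by two SDs
    p1 = 1.0 / (1.0 + math.exp(-logit1))
    return 10.0 * (p1 - baseline_win_prob)


# Hypothetical coefficient and spread for a match statistic such as shots on target.
effect = extra_wins_per_10(coef_per_unit=0.25, sd=2.0)
print(f"{effect:+.1f} wins per 10 matches")
```

A negative result would be read as extra losses per 10 matches. Because the logistic curve is nonlinear, the same coefficient gives a different figure at a different reference probability.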
Abstract:
Statistical methods are often used to analyse commercial catch and effort data to provide standardised fishing effort and/or a relative index of fish abundance for input into stock assessment models. Achieving reliable results has proved difficult in Australia's Northern Prawn Fishery (NPF), due to a combination of such factors as the biological characteristics of the animals, some aspects of the fleet dynamics, and the changes in fishing technology. For this set of data, we compared four modelling approaches (linear models, mixed models, generalised estimating equations, and generalised linear models) with respect to the outcomes of the standardised fishing effort or the relative index of abundance. We also varied the number and form of vessel covariates in the models. Within a subset of data from this fishery, modelling correlation structures did not alter the conclusions from simpler statistical models. The random-effects models also yielded similar results. This is because the estimators are all consistent even if the correlation structure is mis-specified, and the data set is very large. However, the standard errors from different models differed, suggesting that different methods have different statistical efficiency. We suggest that there is value in modelling the variance function and the correlation structure, to make valid and efficient statistical inferences and gain insight into the data. We found that fishing power was separable from the indices of prawn abundance only when we offset the impact of vessel characteristics at assumed values from external sources. This may be due to the large degree of confounding within the data, and the extreme temporal changes in certain aspects of individual vessels, the fleet and the fleet dynamics.
Abstract:
Vigilance declines when people are exposed to highly predictable and uneventful tasks. Monotonous tasks provide little cognitive and motor stimulation and contribute to human errors. This paper aims to model and detect vigilance decline in real time through participants' reaction times during a monotonous task. A lab-based experiment adapting the Sustained Attention to Response Task (SART) is conducted to quantify the effect of monotony on overall performance. Relevant parameters are then used to build a model detecting hypovigilance throughout the experiment. The accuracy of different mathematical models in detecting lapses in vigilance in real time, minute by minute, during the task is compared. We show that monotonous tasks can lead to an average decline in performance of 45%. Furthermore, vigilance modelling enables the detection of vigilance decline through reaction times with an accuracy of 72% and a 29% false alarm rate. Bayesian models are identified as better at detecting lapses in vigilance than Neural Networks and Generalised Linear Mixed Models. This modelling could be used as a framework to detect vigilance decline in any human performing monotonous tasks.
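For readers unfamiliar with the two detection figures quoted (accuracy and false alarm rate), the Python sketch below shows how they are commonly computed from minute-by-minute detections. The definitions are the standard confusion-matrix ones and are assumed here; the abstract does not state how the authors computed them, and the counts are hypothetical.

```python
def detection_metrics(true_pos, false_pos, true_neg, false_neg):
    """Return (accuracy, false alarm rate) from confusion-matrix counts.

    Accuracy is the share of all minutes classified correctly; the false alarm
    rate is the share of truly vigilant minutes flagged as lapses.
    """
    total = true_pos + false_pos + true_neg + false_neg
    accuracy = (true_pos + true_neg) / total
    false_alarm_rate = false_pos / (false_pos + true_neg)
    return accuracy, false_alarm_rate


# Hypothetical 50-minute session: each minute is labelled lapse / no lapse.
accuracy, far = detection_metrics(true_pos=14, false_pos=6, true_neg=24, false_neg=6)
print(f"accuracy = {accuracy:.2f}, false alarm rate = {far:.2f}")
```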
Abstract:
This paper describes a generalised linear mixed model (GLMM) approach for understanding spatial patterns of participation in population health screening, in the presence of multiple screening facilities. The models presented have a dual focus, namely the prediction of expected patient flows from regions to services and of relative rates of participation by region-service combination, with both outputs having meaningful implications for the monitoring of current service uptake and provision. The novelty of this paper lies with the former focus, and an approach for distributing expected participation by region based on proximity to services is proposed. The modelling of relative rates of participation is achieved through the combination of different random effects, as a means of assigning excess participation to different sources. The methodology is applied to participation data collected from a government-funded mammography program in Brisbane, Australia.
Abstract:
Numerous expert elicitation methods have been suggested for generalised linear models (GLMs). This paper compares three relatively new approaches to eliciting expert knowledge in a form suitable for Bayesian logistic regression. These methods were trialled on two experts in order to model the habitat suitability of the threatened Australian brush-tailed rock-wallaby (Petrogale penicillata). The first elicitation approach is a geographically assisted indirect predictive method with a geographic information system (GIS) interface. The second approach is a predictive indirect method which uses an interactive graphical tool. The third method uses a questionnaire to elicit expert knowledge directly about the impact of a habitat variable on the response. Two variables (slope and aspect) are used to examine the prior and posterior distributions obtained with the three methods. The results indicate that there are both similarities and dissimilarities between the expert-informed priors of the two experts formulated from the different approaches. The choice of elicitation method depends on the statistical knowledge of the expert, their mapping skills, time constraints, access to experts and the funding available. This trial reveals that expert knowledge can be important when modelling rare event data, such as threatened species, because experts can provide additional information that may not be represented in the dataset. However, care must be taken with the way in which this information is elicited and formulated.