921 results for random coefficient regression model
Abstract:
The relative paleointensity (RPI) method assumes that the intensity of post-depositional remanent magnetization (PDRM) depends exclusively on the magnetic field strength and the concentration of the magnetic carriers. Sedimentary remanence is regarded as an equilibrium state between aligning geomagnetic and randomizing interparticle forces. How strong these mechanical and electrostatic forces are depends on many petrophysical factors related to mineralogy, particle size and shape of the matrix constituents. We therefore test the hypothesis that variations in sediment lithology modulate RPI records. A combined paleomagnetic and sedimentological dataset was established for 90 selected Late Quaternary sediment samples from the subtropical and subantarctic South Atlantic Ocean. Misleading alterations of the magnetic mineral fraction were detected by a routine Fe/kappa test (Funk, J., von Dobeneck, T., Reitz, A., 2004. Integrated rock magnetic and geochemical quantification of redoxomorphic iron mineral diagenesis in Late Quaternary sediments from the Equatorial Atlantic. In: Wefer, G., Mulitza, S., Ratmeyer, V. (Eds.), The South Atlantic in the Late Quaternary: reconstruction of material budgets and current systems. Springer-Verlag, Berlin/Heidelberg/New York/Tokyo, pp. 239-262). Samples with any indication of suboxic magnetite dissolution were excluded from the dataset. The parameters under study include carbonate, opal and terrigenous content, grain size distribution and clay mineral composition. Their bi- and multivariate correlations with the RPI signal were statistically investigated using standard techniques and criteria. While several of the parameters did not yield significant results, clay grain size and chlorite correlate weakly, and opal, illite and kaolinite correlate moderately, with the NRM/ARM signal used here as an RPI measure. 
The most influential single sedimentological factor is the kaolinite/illite ratio with a Pearson's coefficient of 0.51 and 99.9% significance. A three-member regression model suggests that matrix effects can make up over 50% of the observed RPI dynamics.
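The Pearson coefficient reported above is a standard sample statistic and can be computed directly from paired measurements. The sketch below uses entirely made-up kaolinite/illite ratios and NRM/ARM values for illustration; none of the study's actual data appear here.

```python
import math

def pearson(x, y):
    """Sample Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical kaolinite/illite ratios and NRM/ARM values, for illustration only.
ki_ratio = [0.8, 1.1, 0.9, 1.4, 1.2, 1.6, 1.0, 1.5]
nrm_arm  = [0.21, 0.25, 0.22, 0.31, 0.27, 0.33, 0.24, 0.30]
r = pearson(ki_ratio, nrm_arm)
```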
Abstract:
The effect of the tumour-forming disease, fibropapillomatosis, on the somatic growth dynamics of green turtles resident in the Pala'au foraging grounds (Moloka'i, Hawai'i) was evaluated using a Bayesian generalised additive mixed modelling approach. This regression model enabled us to account for fixed effects (fibropapilloma tumour severity), nonlinear covariate functional form (carapace size, sampling year) as well as random effects due to individual heterogeneity and correlation between repeated growth measurements on some turtles. Somatic growth rates were found to be nonlinear functions of carapace size and sampling year but were not a function of low-to-moderate tumour severity. On the other hand, growth rates were significantly lower for turtles with advanced fibropapillomatosis, which suggests a limited or threshold-specific disease effect. However, tumour severity was an increasing function of carapace size: larger turtles tended to have higher tumour severity scores, presumably due to longer exposure of larger (older) turtles to the factors that cause the disease. Hence turtles with advanced fibropapillomatosis tended to be the larger turtles, which confounds size and tumour severity in this study. But somatic growth rates for the Pala'au population have also declined since the mid-1980s (sampling year effect) while disease prevalence and severity increased from the mid-1980s before levelling off by the mid-1990s. It is unlikely that this decline was related to the increasing tumour severity because growth rates have also declined over the last 10-20 years for other green turtle populations resident in Hawaiian waters that have low or no disease prevalence. The declining somatic growth rate trends evident in the Hawaiian stock are more likely a density-dependent effect caused by a dramatic increase in abundance by this once-seriously-depleted stock since the mid-1980s. 
So despite increasing fibropapillomatosis risk over the last 20 years, only a limited effect on somatic growth dynamics was apparent and the Hawaiian green turtle stock continues to increase in abundance.
Abstract:
Pharmacodynamics (PD) is the study of the biochemical and physiological effects of drugs. The construction of optimal designs for dose-ranging trials with multiple periods is considered in this paper, where the outcome of the trial (the effect of the drug) is considered to be a binary response: the success or failure of a drug to bring about a particular change in the subject after a given amount of time. The carryover effect of each dose from one period to the next is assumed to be proportional to the direct effect. It is shown for a logistic regression model that the efficiency of optimal parallel (single-period) or crossover (two-period) design is substantially greater than a balanced design. The optimal designs are also shown to be robust to misspecification of the value of the parameters. Finally, the parallel and crossover designs are combined to provide the experimenter with greater flexibility.
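The response model described above (a binary outcome under a logistic regression, with the carryover effect of the previous period's dose proportional to the direct effect) can be sketched as follows. The parameter names `alpha`, `beta` and `lam` are hypothetical, and this shows only the response model, not the optimal-design machinery.

```python
import math

def success_prob(alpha, beta, lam, dose_now, dose_prev=0.0):
    """P(success) under a logistic dose-response model in which the carryover
    effect of the previous period's dose is proportional (factor lam) to the
    direct effect of the current dose."""
    eta = alpha + beta * dose_now + lam * beta * dose_prev
    return 1.0 / (1.0 + math.exp(-eta))

# In a two-period crossover design, the second-period response depends on both
# the current dose and the carryover from period one.
p1 = success_prob(alpha=-1.0, beta=0.8, lam=0.25, dose_now=2.0)                  # period 1
p2 = success_prob(alpha=-1.0, beta=0.8, lam=0.25, dose_now=1.0, dose_prev=2.0)   # period 2
```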
Abstract:
Solving many scientific problems requires effective regression and/or classification models for large high-dimensional datasets. Experts from these problem domains (e.g. biologists, chemists, financial analysts) have insights that can be helpful in developing powerful models, but they need a modelling framework that helps them use these insights. Data visualisation is an effective technique for presenting data and eliciting feedback from the experts. A single global regression model can rarely capture the full behavioural variability of a huge multi-dimensional dataset. Instead, local regression models, each focused on a separate area of input space, often work better, since the behaviour of different areas may vary. Classical local models such as Mixture of Experts segment the input space automatically, which is not always effective, and they do not involve the domain experts in guiding a meaningful segmentation of the input space. In this paper we address this issue by allowing domain experts to interactively segment the input space using data visualisation. The segmentation output obtained is then used to develop effective local regression models.
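The local-modelling idea can be illustrated with a minimal sketch: given a segmentation of the input space (here a hard-coded stand-in for the expert-guided, visualisation-based segmentation), fit one least-squares model per segment and route each prediction to its segment's model. The function names and data below are invented for illustration.

```python
import numpy as np

def fit_local_models(X, y, segment_ids):
    """Fit one ordinary least-squares model per expert-defined segment.
    Returns {segment_id: coefficients} with an intercept term appended."""
    models = {}
    for seg in np.unique(segment_ids):
        mask = segment_ids == seg
        A = np.column_stack([X[mask], np.ones(mask.sum())])  # add intercept column
        coef, *_ = np.linalg.lstsq(A, y[mask], rcond=None)
        models[seg] = coef
    return models

def predict_local(models, X, segment_ids):
    """Route each sample to its segment's local model for prediction."""
    preds = np.empty(len(X))
    for i, (x, seg) in enumerate(zip(X, segment_ids)):
        coef = models[seg]
        preds[i] = x @ coef[:-1] + coef[-1]
    return preds

# Illustrative data: two regimes with different slopes, so a single global
# linear model would fit poorly but two local models fit exactly.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
seg = (X[:, 0] > 5).astype(int)   # stand-in for an expert-drawn segmentation
y = np.where(seg == 0, 2 * X[:, 0], -1 * X[:, 0] + 15)
models = fit_local_models(X, y, seg)
```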
Abstract:
National guidance and clinical guidelines recommend multidisciplinary teams (MDTs) for cancer services in order to bring specialists in relevant disciplines together, ensure clinical decisions are fully informed, and coordinate care effectively. However, the effectiveness of cancer teams had not previously been evaluated systematically. A random sample of 72 breast cancer teams in England was studied (548 members in six core disciplines), stratified by region and caseload. Information about team constitution, processes, effectiveness, clinical performance, and members' mental well-being was gathered using appropriate instruments. Two input variables, team workload (P=0.009) and the proportion of breast care nurses (P=0.003), positively predicted overall clinical performance in multivariate analysis using a two-stage regression model. There were significant correlations between individual team inputs, team composition variables, and clinical performance. Some disciplines consistently perceived their team's effectiveness differently from the mean. Teams with shared leadership of their clinical decision-making were most effective. The mental well-being of team members appeared significantly better than in previous studies of cancer clinicians, the NHS, and the general population. This study established that team composition, working methods, and workloads are related to measures of effectiveness, including the quality of clinical care. © 2003 Cancer Research UK.
Abstract:
In this paper we present a novel method for emulating a stochastic, or random output, computer model and show its application to a complex rabies model. The method is evaluated both in terms of accuracy and computational efficiency on synthetic data and the rabies model. We address the issue of experimental design and provide empirical evidence on the effectiveness of utilizing replicate model evaluations compared to a space-filling design. We employ the Mahalanobis error measure to validate the heteroscedastic Gaussian process based emulator predictions for both the mean and (co)variance. The emulator allows efficient screening to identify important model inputs and better understanding of the complex behaviour of the rabies model.
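The Mahalanobis error measure mentioned above can be sketched in a few lines; the predictive mean and covariance below are synthetic stand-ins, not output of the actual emulator. For a well-calibrated Gaussian predictive distribution the measure averages to the output dimension, which is what makes it useful for validation.

```python
import numpy as np

def mahalanobis_error(y_true, mean, cov):
    """Mahalanobis error D^2 = (y - m)^T C^{-1} (y - m) between a simulator
    output y and an emulator's predictive mean m and covariance C.
    For a well-calibrated Gaussian emulator, E[D^2] equals the output dimension."""
    resid = np.asarray(y_true, dtype=float) - np.asarray(mean, dtype=float)
    return float(resid @ np.linalg.solve(cov, resid))

# Illustrative check: outputs drawn from the emulator's own predictive
# distribution should give D^2 close to the dimension n on average.
rng = np.random.default_rng(1)
n = 5
mean = np.zeros(n)
cov = np.diag(rng.uniform(0.5, 2.0, n))
samples = rng.multivariate_normal(mean, cov, size=2000)
avg_d2 = np.mean([mahalanobis_error(s, mean, cov) for s in samples])
```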
Abstract:
2002 Mathematics Subject Classification: 62M20, 62-07, 62J05, 62P20.
Abstract:
2000 Mathematics Subject Classification: 62J12, 62F35
Abstract:
The solution of a TU cooperative game can be a distribution of the value of the grand coalition, i.e. a distribution of the payoff (utility) that all the players together achieve. In a regression model, the evaluation of the explanatory variables can be a distribution of the overall fit, i.e. the fit of the model in which every regressor variable is involved. Furthermore, we can regard regression models as TU cooperative games in which the explanatory (regressor) variables are the players. In this paper we introduce the class of regression games, characterize it, and apply the Shapley value to evaluate the explanatory variables in regression models. To support our approach we consider Young's (1985) axiomatization of the Shapley value, and conclude that the Shapley value is a reasonable tool for evaluating the explanatory variables of regression models.
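A minimal sketch of a regression game, assuming R² as the characteristic function (a common choice for illustration; the paper's exact characteristic function may differ): each subset of regressors is "worth" the R² of its fitted model, and the Shapley value averages each regressor's marginal contribution to that worth over all coalitions. By the efficiency axiom, the values sum to the full model's R².

```python
import numpy as np
from itertools import combinations
from math import factorial

def r_squared(X, y, subset):
    """R^2 of an OLS fit of y on the chosen columns of X (plus an intercept)."""
    A = np.ones((len(y), 1))
    if subset:
        A = np.column_stack([A, X[:, sorted(subset)]])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    tss = (y - y.mean()) @ (y - y.mean())
    return 1.0 - (resid @ resid) / tss

def shapley_r2(X, y):
    """Shapley value of each regressor in the 'regression game' whose
    characteristic function is the R^2 of the fitted subset."""
    p = X.shape[1]
    phi = np.zeros(p)
    for i in range(p):
        others = [j for j in range(p) if j != i]
        for k in range(p):
            for S in combinations(others, k):
                w = factorial(k) * factorial(p - k - 1) / factorial(p)
                phi[i] += w * (r_squared(X, y, set(S) | {i}) - r_squared(X, y, set(S)))
    return phi

# Illustrative data: y depends on both regressors, column 0 more strongly.
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 2))
y = 3 * X[:, 0] + 1 * X[:, 1] + rng.normal(scale=0.1, size=300)
phi = shapley_r2(X, y)
```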
Abstract:
This paper explains how Poisson regression can be used in studies in which the dependent variable describes the number of occurrences of some rare event such as suicide. After pointing out why ordinary linear regression is inappropriate for treating dependent variables of this sort, we go on to present the basic Poisson regression model and show how it fits in the broad class of generalized linear models. Then we turn to discussing a major problem of Poisson regression known as overdispersion and suggest possible solutions, including the correction of standard errors and negative binomial regression. The paper ends with a detailed empirical example, drawn from our own research on suicide.
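The basic Poisson regression model and the overdispersion diagnostic can be sketched from scratch: a log link fitted by iteratively reweighted least squares, followed by the Pearson chi-square statistic divided by the residual degrees of freedom, where a ratio well above 1 suggests overdispersion. The count data below are simulated for illustration, not drawn from the suicide study.

```python
import numpy as np

def poisson_irls(X, y, iters=50):
    """Poisson regression with log link, fitted by iteratively reweighted
    least squares (Fisher scoring). X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        mu = np.exp(X @ beta)
        W = mu                              # Poisson variance equals the mean
        z = X @ beta + (y - mu) / mu        # working response
        WX = X * W[:, None]
        beta = np.linalg.solve(X.T @ WX, X.T @ (W * z))
    return beta

def pearson_dispersion(X, y, beta):
    """Pearson chi-square divided by residual d.f.; values well above 1
    are the classic symptom of overdispersion."""
    mu = np.exp(X @ beta)
    chi2 = np.sum((y - mu) ** 2 / mu)
    return chi2 / (len(y) - X.shape[1])

# Illustrative counts generated from a true Poisson log-linear model,
# so the dispersion estimate should come out near 1.
rng = np.random.default_rng(3)
x = rng.uniform(0, 1, 500)
X = np.column_stack([np.ones(500), x])
y = rng.poisson(np.exp(0.5 + 1.2 * x))
beta = poisson_irls(X, y)
disp = pearson_dispersion(X, y, beta)
```

When the dispersion ratio is substantially above 1, the corrections discussed above apply: scale the standard errors by its square root, or refit with a negative binomial model.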
Abstract:
The dissertation takes a multivariate approach to answer the question of how applicant age, after controlling for other variables, affects employment success in a public organization. In addition to applicant age, five other categories of variables are examined: organization/applicant variables describing the relationship of the applicant to the organization; organization/position variables describing the target position as it relates to the organization; episodic variables such as applicant age relative to the ages of competing applicants; economic variables relating to the salary needs of older applicants; and cognitive variables that may affect the decision maker's evaluation of the applicant. An exploratory phase of research employs archival data from approximately 500 decisions made in the past three years to hire or promote applicants for positions in one public health administration organization. A logit regression model is employed to examine whether these variables modify the effect of applicant age on employment success. A confirmatory phase of the dissertation is a controlled experiment in which hiring decision makers from the same public organization perform a simulated hiring decision exercise to evaluate hypothetical applicants of similar qualifications but of different ages. The responses of the decision makers to a series of bipolar adjective scales add support to the cognitive component of the theoretical model of the hiring decision. A final section contains information gathered from interviews with key informants. Applicant age has tended to have a curvilinear relationship with employment success. For some positions, the mean age of the applicants most likely to succeed varies with the values of the five groups of moderating variables. The research contributes not only to the practice of public personnel administration, but is also useful in examining larger public policy issues associated with an aging workforce.
Abstract:
The aim of the present study was to trace the mortality profile of the elderly in Brazil using two neighboring age groups: 60 to 69 years (young-old) and 80 years or more (oldest-old). To do this, we sought to characterize the trend and distinctions of different mortality profiles, as well as the quality of the data and associations with socioeconomic and sanitary conditions in the micro-regions of Brazil. Data were collected from the Mortality Information System (SIM) and the Brazilian Institute of Geography and Statistics (IBGE). Based on these data, the coefficients of mortality were calculated for the chapters of the International Disease Classification (ICD-10). A polynomial regression model was used to ascertain the trend of the main chapters. Non-hierarchical cluster analysis (K-Means) was used to obtain the profiles for different Brazilian micro-regions. Factorial analysis of the contextual variables was used to obtain the socio-economic and sanitary deprivation indices (IPSS). The trend of the CMId and of the ratio of its values in the two age groups confirmed a decrease in most of the indicators, particularly for ill-defined causes among the oldest-old. Among the young-old, the following profiles emerged: the Development Profile; the Modernity Profile; the Epidemiological Paradox Profile and the Ignorance Profile. Among the oldest-old, the latter three profiles were confirmed, in addition to the Low Mortality Rates Profile. When comparing the mean IPSS values in global terms, all of the groups were different in both of the age groups. The Ignorance Profile was compared with the other profiles using orthogonal contrasts. This profile differed from all of the others in isolation and in clusters. However, the mean IPSS was similar for the Low Mortality Rates Profile among the oldest-old. 
Furthermore, associations were found between the data quality indicators, the CMId for ill-defined causes, the general coefficient of mortality for each age group (CGMId) and the IPSS of the micro-regions. The worst rates were recorded in areas with the greatest socioeconomic and sanitary deprivation. The findings of the present study show that, despite the decrease in the mortality coefficients, there are notable differences in the profiles related to contextual conditions, including regional differences in data quality. These differences increase the vulnerability of the age groups studied and the health inequities that are already present.
Abstract:
The maintenance and evolution of software systems has become a highly critical task in recent years due to the diversity and high demand of features, devices and users. Understanding and analysing how new changes impact the quality attributes of the architecture of such systems is an essential prerequisite to avoid quality deterioration during their evolution. This thesis proposes an automated approach for analysing variation in the performance quality attribute in terms of execution time (response time). It is implemented by a framework that adopts dynamic analysis and software repository mining techniques to provide an automated way of revealing potential sources (commits and issues) of performance variation in scenarios during software system evolution. The approach defines four phases: (i) preparation: choosing the scenarios and preparing the target releases; (ii) dynamic analysis: determining the performance of scenarios and methods by computing their execution times; (iii) variation analysis: processing and comparing the dynamic analysis results across releases; and (iv) repository mining: identifying issues and commits associated with the detected performance variation. Empirical studies were performed to evaluate the approach from different perspectives. An exploratory study analysed the feasibility of applying the approach to systems from different domains in order to automatically identify source code elements with performance variation and the changes that affected those elements during an evolution. This study analysed three systems: (i) SIGAA, a web system for academic management; (ii) ArgoUML, a UML modelling tool; and (iii) Netty, a framework for network applications. Another study performed an evolutionary analysis by applying the approach to multiple releases of Netty and of the web frameworks Wicket and Jetty. 
This study analysed 21 releases (seven of each system), totalling 57 scenarios. In summary, 14 scenarios with significant performance variation were found for Netty, 13 for Wicket and 9 for Jetty. In addition, feedback was obtained from eight developers of these systems through an online form. Finally, in the last study, a regression model for performance was developed to indicate the properties of commits that are most likely to cause performance degradation. Overall, 997 commits were mined, of which 103 were retrieved from degraded source code elements and 19 from optimised ones, while 875 had no impact on execution time. The number of days before the release and the day of the week proved to be the most relevant variables of performance-degrading commits in our model. The area under the Receiver Operating Characteristic (ROC) curve of the regression model is 60%, which means that using the model to decide whether a commit will cause degradation is 10% better than a random decision.
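The reported ROC area of 60% has a direct probabilistic reading: it is the chance that the model scores a randomly chosen performance-degrading commit above a randomly chosen non-degrading one (0.5 being random guessing). A minimal sketch with invented scores, using the Mann-Whitney form of the statistic:

```python
def roc_auc(scores_pos, scores_neg):
    """ROC area as the probability that a positive (degrading) example scores
    higher than a negative one; ties count one half (Mann-Whitney form)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical model scores for commits that did / did not degrade performance.
degrading = [0.9, 0.7, 0.6, 0.55]
clean = [0.8, 0.5, 0.4, 0.3]
auc = roc_auc(degrading, clean)
```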