916 results for Multilevel Linear Models


Relevance:

80.00%

Publisher:

Abstract:

We consider the problem of dichotomizing a continuous covariate when performing a regression analysis based on a generalized estimation approach. The problem involves estimating the cutpoint for the covariate and testing the hypothesis that the binary covariate constructed from the continuous covariate has a significant impact on the outcome. Because multiple tests are used to find the optimal cutpoint, the usual significance test must be adjusted to preserve the type-I error rate. We illustrate the techniques on a data set of patients given unrelated hematopoietic stem cell transplantation. Here the question is whether the CD34 cell dose given to the patient affects the outcome of the transplant, and what is the smallest cell dose needed for good outcomes. (C) 2010 Elsevier B.V. All rights reserved.
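The multiple-testing issue described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes a continuous outcome, a Welch t statistic at each candidate cutpoint, and a permutation null for the maximum statistic in place of their generalized estimation approach. The function names, the quantile grid, and the simulated data are all hypothetical.

```python
import numpy as np

def max_abs_t(x, y, cutpoints):
    """Return (largest |Welch t| over the candidate cutpoints, cutpoint attaining it)."""
    best_t, best_c = 0.0, None
    for c in cutpoints:
        hi, lo = y[x > c], y[x <= c]
        if len(hi) < 2 or len(lo) < 2:
            continue  # skip cutpoints that leave a group too small for a variance
        se = np.sqrt(hi.var(ddof=1) / len(hi) + lo.var(ddof=1) / len(lo))
        t = abs(hi.mean() - lo.mean()) / se
        if t > best_t:
            best_t, best_c = t, c
    return best_t, best_c

def adjusted_cutpoint_test(x, y, n_perm=500, seed=0):
    """Pick the best cutpoint and return a p-value adjusted for the search."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Assumed grid: 17 candidate cutpoints at the 10%..90% quantiles of x.
    cutpoints = np.quantile(x, np.linspace(0.1, 0.9, 17))
    t_obs, c_obs = max_abs_t(x, y, cutpoints)
    # Permuting y breaks any x-y link; repeating the full search on each
    # permuted sample gives the null distribution of the maximum statistic.
    null = np.array([max_abs_t(x, rng.permutation(y), cutpoints)[0]
                     for _ in range(n_perm)])
    p_adj = (1 + np.sum(null >= t_obs)) / (n_perm + 1)
    return c_obs, t_obs, p_adj

# Simulated example (hypothetical data): the outcome jumps by 1 when x exceeds 4.
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=300)
y = 1.0 * (x > 4) + rng.normal(0, 1, size=300)
c_obs, t_obs, p_adj = adjusted_cutpoint_test(x, y)
```

Comparing the observed maximum with the permutation distribution of the maximum, rather than with a single-test reference distribution, is what keeps the type-I error near its nominal level after searching over cutpoints.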

Relevance:

80.00%

Publisher:

Abstract:

In this article, we deal with the issue of performing accurate small-sample inference in the Birnbaum-Saunders regression model, which can be useful for modeling lifetime or reliability data. We derive a Bartlett-type correction for the score test and numerically compare the corrected test with the usual score test and some other competitors.

Relevance:

80.00%

Publisher:

Abstract:

We review some issues related to the implications of different missing data mechanisms on statistical inference for contingency tables and consider simulation studies to compare the results obtained under such models to those where the units with missing data are disregarded. We confirm that although, in general, analyses under the correct missing at random and missing completely at random models are more efficient even for small sample sizes, there are exceptions where they may not improve the results obtained by ignoring the partially classified data. We show that under the missing not at random (MNAR) model, estimates on the boundary of the parameter space as well as lack of identifiability of the parameters of saturated models may be associated with undesirable asymptotic properties of maximum likelihood estimators and likelihood ratio tests; even in standard cases the bias of the estimators may be low only for very large samples. We also show that the probability of a boundary solution obtained under the correct MNAR model may be large even for large samples and that, consequently, we may not always conclude that a MNAR model is misspecified because the estimate is on the boundary of the parameter space.

Relevance:

80.00%

Publisher:

Abstract:

Calculations of local influence curvatures and leverage have been well developed when the parameters are unrestricted. In this article, we discuss the assessment of local influence and leverage under linear equality parameter constraints with extensions to inequality constraints. Using a penalized quadratic function we express the normal curvature of local influence for arbitrary perturbation schemes and the generalized leverage matrix in interpretable forms, which depend on restricted and unrestricted components. The results are quite general and can be applied in various statistical models. In particular, we derive the normal curvature under three useful perturbation schemes for generalized linear models. Four illustrative examples are analyzed by the methodology developed in the article.

Relevance:

80.00%

Publisher:

Abstract:

Influence diagnostic methods are extended in this article to the Grubbs model when the unknown quantity x (latent variable) follows a skew-normal distribution. Diagnostic measures are derived from the case-deletion approach and the local influence approach under several perturbation schemes. The observed information matrix for the postulated model and the Delta matrices for the corresponding perturbed models are derived. Results obtained for one real data set are reported, illustrating the usefulness of the proposed methodology.

Relevance:

80.00%

Publisher:

Abstract:

Animal traits differ not only in mean, but also in variation around the mean. For instance, one sire’s daughter group may be very homogeneous, while another sire’s daughters are much more heterogeneous in performance. The difference in residual variance can partially be explained by genetic differences. Models for such genetic heterogeneity of environmental variance include genetic effects for the mean and residual variance, and a correlation between the genetic effects for the mean and residual variance to measure how the residual variance might vary with the mean. The aim of this thesis was to develop a method based on double hierarchical generalized linear models for estimating genetic heteroscedasticity, and to apply it to four traits in two domestic animal species: teat count and litter size in pigs, and milk production and somatic cell count in dairy cows. The method developed is fast and has been implemented in software that is widely used in animal breeding, which makes it convenient to use. It is based on an approximation of double hierarchical generalized linear models by normal distributions. With repeated observations on individuals or genetic groups, the estimates were found to be unbiased. For the traits studied, the estimated heritability values for the mean and the residual variance, and the genetic coefficients of variation, were within the ranges usually reported. The genetic correlation between mean and residual variance was estimated for the pig traits only, and was found to be favorable for litter size, but unfavorable for teat count.

Relevance:

80.00%

Publisher:

Abstract:

This thesis develops and evaluates statistical methods for different types of genetic analyses, including quantitative trait loci (QTL) analysis, genome-wide association study (GWAS), and genomic evaluation. The main contribution of the thesis is to provide novel insights in modeling genetic variance, especially via random effects models. In variance component QTL analysis, a full likelihood model accounting for uncertainty in the identity-by-descent (IBD) matrix was developed. It was found to be able to correctly adjust the bias in genetic variance component estimation and gain power in QTL mapping in terms of precision.  Double hierarchical generalized linear models, and a non-iterative simplified version, were implemented and applied to fit data of an entire genome. These whole genome models were shown to have good performance in both QTL mapping and genomic prediction. A re-analysis of a publicly available GWAS data set identified significant loci in Arabidopsis that control phenotypic variance instead of mean, which validated the idea of variance-controlling genes.  The works in the thesis are accompanied by R packages available online, including a general statistical tool for fitting random effects models (hglm), an efficient generalized ridge regression for high-dimensional data (bigRR), a double-layer mixed model for genomic data analysis (iQTL), a stochastic IBD matrix calculator (MCIBD), a computational interface for QTL mapping (qtl.outbred), and a GWAS analysis tool for mapping variance-controlling loci (vGWAS).

Relevance:

80.00%

Publisher:

Abstract:

Flood forecasting systems can be used effectively when the lead time is long enough relative to the time needed for preventive or corrective action. The reliability and accuracy of the forecasts are also fundamentally important. Forecasts of flood levels are always approximations, and confidence intervals are not always applicable, especially when uncertainty is high, which produces very wide confidence intervals. Such intervals are problematic in the presence of very high or very low river levels. In this study, flood levels are forecast both in the traditional numerical form and in the form of categories, for which an expert system based on fuzzy rules and inference is used. Methodologies and computational procedures for learning, simulation, and querying are designed and then developed as an application (SELF: Sistema Especialista com uso de Lógica “Fuzzy”, an expert system using fuzzy logic), intended for research and operational use. Comparisons between fuzzy expert systems and empirical linear models, based on how they are used for forecasting, reveal a strong analogy despite their fundamental theoretical differences. The methodologies are applied to forecasting in the Camaquã river basin (15,543 km²) for lead times between 10 and 48 hours. Practical difficulties in application are identified, leading to solutions that constitute advances in knowledge and technique. Forecasts in both numerical and categorical form are carried out successfully using the new tools. The forecasts are evaluated and compared using a new set of statistics derived from the joint frequencies with which observed and predicted values fall in the same category during simulation.
The effects of varying the density of the gauging network are analyzed, showing that real-time rainfall and river-level (pluvio-hydrometric) forecasting systems are feasible even with a small number of rainfall data stations, for forecasts in the form of fuzzy categories.

Relevance:

80.00%

Publisher:

Abstract:

Market timing performance of mutual funds is usually evaluated with linear models with dummy variables, which allow the beta coefficient of the CAPM to vary across two regimes: bullish and bearish market excess returns. Managers, however, use their predictions of the state of nature, rather than the observed states, to define whether to carry low or high beta portfolios. Our approach here is to take this into account and model market timing as a switching regime in a way similar to Hamilton's Markov-switching GNP model. We then build a measure of market timing success and apply it to simulated and real-world data.
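The conventional dummy-variable regression that this abstract takes as its starting point can be sketched as follows. This is an illustrative sketch of the standard two-regime linear model only, not the Markov-switching approach the paper proposes. The function name, the simulated returns, and the parameter values are assumptions.

```python
import numpy as np

def fit_dummy_timing(r_p, r_m):
    """OLS fit of r_p = alpha + beta_down * r_m + delta * (D * r_m) + e,
    where D = 1 when the market excess return r_m is positive."""
    r_p = np.asarray(r_p, dtype=float)
    r_m = np.asarray(r_m, dtype=float)
    up = (r_m > 0).astype(float)            # bullish-regime dummy
    X = np.column_stack([np.ones_like(r_m), r_m, up * r_m])
    coef, *_ = np.linalg.lstsq(X, r_p, rcond=None)
    alpha, beta_down, delta = coef
    return {
        "alpha": alpha,
        "beta_down": beta_down,              # beta in bearish periods
        "beta_up": beta_down + delta,        # beta in bullish periods
        "timing": delta,                     # delta > 0 suggests timing ability
    }

# Simulated fund (hypothetical): beta 1.3 in up markets, 0.7 in down markets.
rng = np.random.default_rng(0)
r_m = rng.normal(0.0, 0.05, size=2000)
r_p = 0.001 + np.where(r_m > 0, 1.3, 0.7) * r_m + rng.normal(0.0, 0.005, size=2000)
fit = fit_dummy_timing(r_p, r_m)
```

The regime here is defined by the realized sign of the market return, which is exactly the feature the abstract questions: a manager acts on a predicted state, not the observed one.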

Relevance:

80.00%

Publisher:

Abstract:

This master's dissertation in economics was motivated by a complex question widely studied in today's political economy literature: how political campaigns affect voting in an election. The study seeks to model the Brazilian electoral market for federal deputies and senators. Using a linear model, it concludes that campaign spending is a decisive factor in the election of a candidate for federal deputy. Because the variable measuring campaign spending is subject to measurement error (owing to the well-known "caixa dois", or off-the-books funding, for example) and is also endogenous, since candidates with better prospects of winning votes attract more sources of funding, the model was estimated by instrumental variables. For senators, using linear models and models with a binary response variable, the electoral campaign is also found to matter, albeit to a lesser degree, and a more important factor in the Senate race appears to be an a priori perception of the candidate's quality.

Relevance:

80.00%

Publisher:

Abstract:

This text develops, for teaching purposes, the applications of the Generalized Method of Moments (GMM; in Portuguese, Método Generalizado dos Momentos, MGM) to the instrumental variables procedure in linear and nonlinear models. It is part of a work (book) in preparation.
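For orientation, the linear case of GMM applied to instrumental variables can be written out as below. This is the standard textbook formulation, not taken from the text described in the abstract. The symbols are assumptions: y is the response, X the regressors, Z the instruments, and W a positive definite weighting matrix.

```latex
\begin{aligned}
& \mathbb{E}\!\left[ z_i \,(y_i - x_i^{\top}\beta) \right] = 0, \\
& \hat{\beta}_{\mathrm{GMM}}
  = \arg\min_{\beta}\; (y - X\beta)^{\top} Z \, W \, Z^{\top} (y - X\beta)
  = \left( X^{\top} Z W Z^{\top} X \right)^{-1} X^{\top} Z W Z^{\top} y, \\
& W = (Z^{\top} Z)^{-1} \;\Longrightarrow\;
  \hat{\beta}_{\mathrm{2SLS}} = \left( X^{\top} P_Z X \right)^{-1} X^{\top} P_Z\, y,
  \qquad P_Z = Z (Z^{\top} Z)^{-1} Z^{\top}.
\end{aligned}
```

The first line is the moment condition stating that the instruments are uncorrelated with the error. The last line shows that two-stage least squares is the special case of linear GMM obtained with the weighting matrix W = (Z'Z)^{-1}.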

Relevance:

80.00%

Publisher:

Abstract:

This paper presents a study of credit card customers of a large retailer, carried out to measure the risk that a customer with an existing purchase history abandons the relationship. The study has two main parts: the theoretical work and the methodological procedures. The first step was to understand the problem, the importance of the topic, and the definition of the research methods. The study includes a literature survey covering several authors, which shows that customer loyalty is the basis of sustainability and profitability for organizations in various market segments, and examines satisfaction as the key to winning and, especially, retaining customers. Logistic-linear models were fitted, and the best model was selected using the Kolmogorov-Smirnov (KS) test and the Receiver Operating Characteristic (ROC) curve. Registration and transactional data from 100,000 customers of the credit card issuer were used; the software was SPSS, a system for data manipulation, statistical analysis, and graphical presentation. The study identifies, by means of a score, the risk that each customer abandons the product.

Relevance:

80.00%

Publisher:

Abstract:

In this paper we investigate how several national educational policies and practices influence both students' average reading achievement and the social distribution of achievement within schools and countries. Data come from the 2000/2001 administration of PISA (Programme for International Student Assessment) by the Organization for Economic Cooperation and Development (OECD). They include observations from 212,880 15-year-old students attending 8,038 secondary schools, which are located in 39 countries. We analyze these data with three-level Hierarchical Linear Models (HLM), with students nested in schools, which are nested within countries. Results focus on the role played by three country-level educational policies: (1) retention/repetition; (2) the mix of students in schools based on socioeconomic status (school social mix); and (3) vocational education. We explore how these policies influence the social distribution of achievement between schools within countries. Implications of these findings are discussed.
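A generic three-level specification of the kind this abstract describes can be written as below. It is a sketch under assumptions, not the authors' actual model: i indexes students, j schools, and k countries; SES stands for a student-level socioeconomic measure; W_k stands for a country-level policy indicator. All symbols are illustrative.

```latex
\begin{aligned}
\text{Level 1 (students):}\quad
  & Y_{ijk} = \pi_{0jk} + \pi_{1jk}\,\mathrm{SES}_{ijk} + e_{ijk},
    \qquad e_{ijk} \sim N(0, \sigma^2) \\
\text{Level 2 (schools):}\quad
  & \pi_{0jk} = \beta_{00k} + r_{0jk},
    \qquad \pi_{1jk} = \beta_{10k} + r_{1jk} \\
\text{Level 3 (countries):}\quad
  & \beta_{00k} = \gamma_{000} + \gamma_{001} W_k + u_{00k},
    \qquad \beta_{10k} = \gamma_{100} + \gamma_{101} W_k + u_{10k}
\end{aligned}
```

In this notation, \gamma_{001} captures how a country-level policy shifts average achievement, while \gamma_{101} captures how it changes the SES slope, that is, the social distribution of achievement.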

Relevance:

80.00%

Publisher:

Abstract:

This paper presents new methodology for making Bayesian inference about dynamic models for exponential family observations. The approach is simulation-based and involves the use of Markov chain Monte Carlo techniques. A Metropolis-Hastings algorithm is combined with the Gibbs sampler in repeated use of an adjusted version of normal dynamic linear models. Different alternative schemes are derived and compared. The approach is fully Bayesian in obtaining posterior samples for state parameters and unknown hyperparameters. Illustrations with real data sets with sparse counts and missing values are presented. Extensions to accommodate general distributions for observations and disturbances, intervention, non-linear models and multivariate time series are outlined.

Relevance:

80.00%

Publisher:

Abstract:

This study identified the relationship between the agglomeration of firms in the same economic activity and the growth rate of local employment. Data were collected on industrial firms in the State of São Paulo listed in the Relação Anual de Informações Sociais [RAIS] for the years 1996 to 2005. A total of 263,020 employment-level observations were analyzed, covering 26,231 municipality-CNAE combinations and 296 different activities. The criteria of Puga (2003) and Suzigan, Furtado, Garcia, Sampaio (2003) were used to identify agglomerations. A growth curve analysis using a multilevel model was carried out in the Hierarchical Linear Models [HLM] software. The results show a positive relationship between the agglomeration of firms in the same economic activity and employment growth. Considering the externalities expected when firms are located in the same region, the results suggest that, in comparative terms, firms in the same economic activity located in an agglomeration may experience greater growth than competitors located outside one. This result is relevant both for the individual firm and for the design of public policies that support regional development at the municipal level. The evidence confirms earlier case studies, lending greater robustness to the theory.