932 results for Generalized linear models
Abstract:
Predictors of random effects are usually based on the popular mixed effects (ME) model developed under the assumption that the sample is obtained from a conceptual infinite population; such predictors are employed even when the actual population is finite. Two alternatives that incorporate the finite nature of the population are obtained from the superpopulation model proposed by Scott and Smith (1969. Estimation in multi-stage surveys. J. Amer. Statist. Assoc. 64, 830-840) or from the finite population mixed model recently proposed by Stanek and Singer (2004. Predicting random effects from finite population clustered samples with response error. J. Amer. Statist. Assoc. 99, 1119-1130). Predictors derived under the latter model, with the additional assumptions that all variance components are known and that within-cluster variances are equal, have smaller mean squared error (MSE) than the competitors based on either the ME or Scott and Smith's models. As population variances are rarely known, we propose method-of-moments estimators to obtain empirical predictors and conduct a simulation study to evaluate their performance. The results suggest that the finite population mixed model empirical predictor is more stable than its competitors since, in terms of MSE, it is either the best or the second best and, when second best, its performance lies within acceptable limits. When both cluster and unit intra-class correlation coefficients are very high (e.g., 0.95 or more), the performance of the empirical predictors derived under the three models is similar. (c) 2007 Elsevier B.V. All rights reserved.
Abstract:
We review some issues related to the implications of different missing data mechanisms on statistical inference for contingency tables and consider simulation studies to compare the results obtained under such models to those where the units with missing data are disregarded. We confirm that although, in general, analyses under the correct missing at random and missing completely at random models are more efficient even for small sample sizes, there are exceptions where they may not improve the results obtained by ignoring the partially classified data. We show that under the missing not at random (MNAR) model, estimates on the boundary of the parameter space as well as lack of identifiability of the parameters of saturated models may be associated with undesirable asymptotic properties of maximum likelihood estimators and likelihood ratio tests; even in standard cases the bias of the estimators may be low only for very large samples. We also show that the probability of a boundary solution obtained under the correct MNAR model may be large even for large samples and that, consequently, we may not always conclude that a MNAR model is misspecified because the estimate is on the boundary of the parameter space.
Abstract:
Influence diagnostics methods are extended in this article to the Grubbs model when the unknown quantity x (latent variable) follows a skew-normal distribution. Diagnostic measures are derived from the case-deletion approach and the local influence approach under several perturbation schemes. The observed information matrix for the postulated model and the Delta matrices for the corresponding perturbed models are derived. Results obtained for one real data set are reported, illustrating the usefulness of the proposed methodology.
Abstract:
BACKGROUND: Canalization is defined as the stability of a genotype against minor variations in both environment and genetics. Genetic variation in degree of canalization causes heterogeneity of within-family variance. The aims of this study are twofold: (1) quantify genetic heterogeneity of (within-family) residual variance in Atlantic salmon and (2) test whether the observed heterogeneity of (within-family) residual variance can be explained by simple scaling effects. RESULTS: Analysis of body weight in Atlantic salmon using a double hierarchical generalized linear model (DHGLM) revealed substantial heterogeneity of within-family variance. The 95% prediction interval for within-family variance ranged from ~0.4 to 1.2 kg², implying that the within-family variance of the most extreme high families is expected to be approximately three times larger than that of the extreme low families. For cross-sectional data, DHGLM with an animal mean sub-model resulted in severe bias, while a corresponding sire-dam model was appropriate. Heterogeneity of variance was not sensitive to Box-Cox transformations of phenotypes, which implies that heterogeneity of variance exists beyond what would be expected from simple scaling effects. CONCLUSIONS: Substantial heterogeneity of within-family variance was found for body weight in Atlantic salmon. A tendency towards higher variance with higher means (scaling effects) was observed, but heterogeneity of within-family variance existed beyond what could be explained by simple scaling effects. For cross-sectional data, using the animal mean sub-model in the DHGLM resulted in biased estimates of variance components, which differed substantially both from a standard linear mean animal model and a sire-dam DHGLM model. Although genetic differences in canalization were observed, selection for increased canalization is difficult, because there is limited individual information for the variance sub-model, especially when based on cross-sectional data. 
Furthermore, potential macro-environmental changes (diet, climatic region, etc.) may make genetic heterogeneity of variance a less stable trait over time and space.
Abstract:
OBJECTIVE: This study aimed to assess women's acceptability of diagnosis and treatment of incomplete abortion with misoprostol by midwives, compared with physicians. METHODS: This was an analysis of secondary outcomes from a multi-centre randomized controlled equivalence trial at district level in Uganda. Women with first trimester incomplete abortion were randomly allocated to clinical assessment and treatment with misoprostol by a physician or a midwife. The randomisation (1:1) was done in blocks of 12 and stratified for health care facility. Acceptability was measured in expectations and satisfaction at a follow-up visit 14-28 days following treatment. Analysis of women's overall acceptability was done using a generalized linear mixed-effects model with an equivalence range of -4% to 4%. The study was not masked. The trial is registered at ClinicalTrials.gov, NCT01844024. RESULTS: From April 2013 to June 2014, 1108 women were assessed for eligibility, of whom 1010 were randomized (506 to midwife and 504 to physician). 953 women were successfully followed up and included in the acceptability analysis. 95% (904) of the participants found the treatment satisfactory and overall acceptability was found to be equivalent between the two study groups. Treatment failure, not feeling calm and safe following treatment, experiencing severe abdominal pain or heavy bleeding following treatment, were significantly associated with non-satisfaction. No serious adverse events were recorded. CONCLUSIONS: Treatment of incomplete abortion with misoprostol by midwives and physicians was highly, and equally, acceptable to women. TRIAL REGISTRATION: ClinicalTrials.gov NCT01844024.
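The allocation scheme described in this abstract (1:1 randomisation in blocks of 12, stratified by facility) can be illustrated with a minimal sketch. This is not the trial's actual procedure: the function name `block_randomize`, the facility names, the facility sizes and the seeds are all hypothetical, chosen only to show how permuted blocks keep the two arms balanced within each stratum.

```python
import random

def block_randomize(n_participants, block_size=12,
                    arms=("midwife", "physician"), seed=0):
    """Return an allocation list built from shuffled, balanced blocks."""
    rng = random.Random(seed)
    per_arm = block_size // len(arms)  # 6 per arm in a block of 12
    allocation = []
    while len(allocation) < n_participants:
        # Each block holds every arm equally often, in random order.
        block = [arm for arm in arms for _ in range(per_arm)]
        rng.shuffle(block)
        allocation.extend(block)
    return allocation[:n_participants]

# Stratification: one independent sequence per (hypothetical) facility.
facilities = {"facility_A": 24, "facility_B": 36}
schedule = {
    name: block_randomize(n, seed=i)
    for i, (name, n) in enumerate(facilities.items())
}
```

Because every complete block contains six allocations to each arm, the arms can differ by at most six participants within any facility at any point during recruitment.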
Abstract:
Flood forecasting systems can be used effectively when the lead time is sufficient relative to the time needed for preventive or corrective action. The reliability and precision of the forecasts are also fundamentally important. Forecasts of flood levels are always approximations, and confidence intervals are not always applicable, especially when the degree of uncertainty is high, which produces very wide confidence intervals. Such intervals are problematic in the presence of very high or very low river levels. In this study, flood level forecasts are produced both in the traditional numerical form and in the form of categories, for which an expert system based on fuzzy rules and inference is used. Methodologies and computational procedures for learning, simulation and consultation are designed and then developed as a software application (SELF – Sistema Especialista com uso de Lógica “Fuzzy”), intended for research and operational use. Comparisons between fuzzy expert systems and linear empirical models, based on how they are used for forecasting, reveal a strong analogy despite their fundamental theoretical differences. The methodologies are applied to forecasting in the Camaquã River basin (15,543 km²), for lead times between 10 and 48 hours. Practical difficulties in application are identified, leading to solutions that constitute advances in knowledge and technique. Forecasts in both numerical and categorical form are carried out successfully using the new resources. The forecasts are evaluated and compared using a new group of statistics, derived from the joint frequencies with which observed and predicted values fall in the same category during simulation. 
The effects of varying the density of the gauging network are analyzed, showing that real-time forecasting systems based on rainfall and river-level data are feasible even with a small number of rainfall gauging stations, for forecasts in the form of fuzzy categories.
Abstract:
Market timing performance of mutual funds is usually evaluated with linear models with dummy variables which allow for the beta coefficient of CAPM to vary across two regimes: bullish and bearish market excess returns. Managers, however, use their predictions of the state of nature to define whether to carry low or high beta portfolios instead of the observed ones. Our approach here is to take this into account and model market timing as a switching regime in a way similar to Hamilton's Markov-switching GNP model. We then build a measure of market timing success and apply it to simulated and real world data.
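The conventional dummy-variable regression mentioned at the start of this abstract can be sketched as follows. This illustrates only the standard two-regime approach, not the authors' Markov-switching model. The return series is synthetic and noise-free, the regime betas (0.8 and 1.2) are made up, and the helper names `solve` and `ols` are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y):
    """Ordinary least squares via the normal equations X'X b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Synthetic market excess returns and a fund whose beta is 0.8 in
# bearish periods and 1.2 in bullish periods (assumed values).
market = [-0.04, -0.02, -0.01, 0.01, 0.03, 0.05]
fund = [0.001 + (1.2 if x > 0 else 0.8) * x for x in market]

# Regressors: intercept, market return, and dummy * market return,
# where the dummy equals 1 when the market excess return is positive.
design = [[1.0, x, x if x > 0 else 0.0] for x in market]
alpha, beta_bear, beta_shift = ols(design, fund)
```

Here `beta_shift` estimates the difference between the bullish and bearish betas; a positive value is what this type of regression reads as successful market timing. The dummy is built from observed returns, which is the feature the abstract argues against.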
Abstract:
This master's dissertation in economics was motivated by a complex question widely studied in the political economy literature today: how political campaigns affect the vote in an election. The study seeks to model the Brazilian electoral market for federal deputies and senators. Using a linear model, it concludes that campaign spending is a decisive factor in the election of a candidate for federal deputy. After recognizing that the variable measuring campaign spending is subject to measurement error (due, for example, to the well-known "caixa dois", i.e., off-the-books funds) and is also endogenous, since candidates more likely to win votes obtain more sources of financing, the model was estimated by instrumental variables. For senators, using linear models and models with a binary response variable, the campaign is also found to matter, albeit to a lesser extent; a more important factor in the race for the Senate appears to be an a priori perception of the candidate's quality.
Abstract:
This text develops, for teaching purposes, the applications of the Generalized Method of Moments (GMM) to the instrumental variables procedure in linear and non-linear models. It is part of a work (book) in preparation.
Abstract:
This paper presents a study of credit card customers of a large retailer, carried out to measure the risk that a customer with an existing purchase history abandons the relationship. The study has two main components: the theoretical background and the methodological procedures. The first step was to understand the problem, the importance of the theme, and the definition of the research methods. The study includes a literature survey covering several authors, which shows that customer loyalty is the basis of sustainability and profitability for organizations in various market segments, and examines satisfaction as the key to winning and, especially, retaining customers. Logistic-linear models were fitted, and the best model was selected using the Kolmogorov-Smirnov (KS) test and the Receiver Operating Characteristic (ROC) curve. Registration and transactional data from 100,000 customers of the credit card issuer were used; the software was SPSS, a system for data manipulation, statistical analysis and graphical presentation. The study identifies, by means of a score, the risk that each customer will leave the product.
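The two model-selection criteria named in this abstract can be illustrated with a short sketch. It does not reproduce the study's analysis, which was done in SPSS on data not shown here. The function names `ks_statistic` and `roc_auc` and the four example scores are hypothetical; label 1 is assumed to mark a customer who abandoned the product.

```python
def ks_statistic(scores, labels):
    """Largest gap between the score CDFs of the two classes."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best = 0.0
    for t in sorted(set(scores)):
        cdf_pos = sum(s <= t for s in pos) / len(pos)
        cdf_neg = sum(s <= t for s in neg) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best

def roc_auc(scores, labels):
    """Area under the ROC curve as the share of correctly ordered pairs."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # A tie between a positive and a negative score counts as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up churn scores from a hypothetical fitted model.
scores = [0.10, 0.40, 0.35, 0.80]
labels = [0, 0, 1, 1]
ks = ks_statistic(scores, labels)
auc = roc_auc(scores, labels)
```

Under both criteria a larger value indicates better separation between customers who leave and customers who stay, so candidate models can be ranked by either one on a hold-out sample.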
Abstract:
This study aims to evaluate the impact of regional concentrations on the organizational performance of Brazilian firms, with emphasis on the service sector. To this end, the organizational performance of firms located in areas of geographic concentration was compared with that of firms located outside these areas. In addition, the effect of regional concentration on the performance of service firms was contrasted with its effect on industrial firms. The literature review showed that regionally concentrated firms enjoy advantages, which led to the main hypothesis of this work: that such advantages would result in better firm performance. Accordingly, the study sought to determine whether there is a relationship between organizational performance and the geographic location of regionally concentrated service firms. Regional concentrations were identified by adapting the criteria used in the industrial sector to the service sector, based on data on the number of establishments and employees obtained from the Relação Anual de Informações Sociais (RAIS) database. Organizational performance was measured by two indicators: profitability and sales growth. The performance data came from the microdata of the following surveys of the Instituto Brasileiro de Geografia e Estatística (IBGE): Pesquisa Industrial Anual (PIA) and Pesquisa Anual de Serviços (PAS). The sample included 78,789 observations of service providers and 22,460 observations of industrial firms, between 2001 and 2005. The results were obtained by applying hierarchical, or multilevel, models. 
The results revealed a positive effect on the growth of firms located in areas of regional concentration (in both the service and industrial sectors), but no evidence of higher profitability for these firms was found. The conclusions contribute to managers' decision-making when assessing whether or not to locate their business in an area of regional concentration. They also have implications for public policy, since the finding of a positive effect on firm growth in certain concentrations can guide incentive policies aimed at stimulating the formation of such concentrations in particular localities for regional development.
Abstract:
In this paper we investigate how several national educational policies and practices influence both students' average reading achievement and the social distribution of achievement within schools and countries. Data come from the 2000/2001 administration of PISA (Programme for International Student Assessment) by the Organization for Economic Cooperation and Development (OECD). They include observations from 212,880 15-year-old students attending 8,038 secondary schools, which are located in 39 countries. We analyze these data with three-level Hierarchical Linear Models (HLM), with students nested in schools, which are nested within countries. Results focus on the role played by three country-level educational policies: (1) retention/repetition; (2) the mix of students in schools based on socioeconomic status (school social mix); and (3) vocational education. We explore how these policies influence the social distribution of achievement between schools within countries. Implications of these findings are discussed.
Abstract:
This paper presents new methodology for making Bayesian inference about dynamic models for exponential family observations. The approach is simulation-based and makes use of Markov chain Monte Carlo techniques. A Metropolis-Hastings algorithm is combined with the Gibbs sampler in repeated use of an adjusted version of normal dynamic linear models. Different alternative schemes are derived and compared. The approach is fully Bayesian in obtaining posterior samples for state parameters and unknown hyperparameters. Illustrations to real data sets with sparse counts and missing values are presented. Extensions to accommodate general distributions for observations and disturbances, intervention, non-linear models and multivariate time series are outlined.
Abstract:
This study identified the relationship between the agglomeration of firms in the same economic activity and the growth rate of local employment. Data on industrial firms in the State of São Paulo from the Relação Anual de Informações Sociais [RAIS] for the years 1996 to 2005 were collected. A total of 263,020 employment-level observations were analyzed, covering 26,231 municipality-CNAE combinations and 296 different activities. The criteria of Puga (2003) and Suzigan, Furtado, Garcia, Sampaio (2003) were used to identify agglomerations. A growth curve analysis using a multilevel model was carried out in the Hierarchical Linear Models [HLM] software. The results show a positive relationship between the agglomeration of firms in the same economic activity and employment growth. Considering the externalities expected from firms being located in the same region, it can be suggested that, in comparative terms, firms in the same economic activity located in an agglomeration may experience greater growth than their competitors located outside one. This result is relevant both for the individual firm and for the design of public policies that support regional development at the municipal level. The evidence confirms earlier case studies, lending greater robustness to the theory.
Abstract:
The central aims of this study were: (1) to construct age- and gender-specific percentiles for motor coordination (MC), (2) to analyze the change, stability, and prediction of MC, (3) to investigate the relationship between motor performance and body fatness, and (4) to evaluate the relationships between skeletal maturation and fundamental motor skills (FMS) and MC. The data collected was from the ‘Healthy Growth of Madeira Children Study’ and from the ‘Madeira Child Growth Study’. In these studies, MC, FMS, skeletal age, growth characteristics, motor performance, physical activity, socioeconomic status, and geographical area were assessed/measured. Generalized additive models for location, scale and shape, mixed between-within subjects ANOVA, multilevel models, and hierarchical regression (blocks) were some of the statistical procedures used in the analyses. Scores on walking backwards and moving sideways improved with age. It was also found that boys performed better than girls on moving sideways. Normal-weight children outperformed obese peers in almost all gross MC tests. Inter-age correlations were calculated to be between 0.15 and 0.60. Age was associated with a better performance in catching, scramble, speed run, standing long jump, balance, and tennis ball throwing. Body mass index was positively associated with scramble and speed run, and negatively related to the standing long jump. Physical activity was negatively associated with scramble. Semi-urban children displayed better catching skills relative to their urban peers. The standardized residual of skeletal age on chronological age (SAsr) and its interaction with stature and/or body mass accounted for the maximum of 7.0% of variance in FMS and MC over that attributed to body size per se. SAsr alone accounted for a maximum of 9.0% variance in FMS and MC over that attributed to body size per se and interactions between SAsr and body size. 
This study demonstrates the need to promote FMS, MC, motor performance, and physical activity in children.