863 results for REGRESSION MULTINOMIAL ANALYSIS
Abstract:
The maintenance and evolution of software systems has become a critical task in recent years because of the diversity and high demand of features, devices, and users. Understanding and analyzing how new changes affect the quality attributes of such systems' architecture is an essential prerequisite for avoiding the deterioration of their quality as they evolve. This thesis proposes an automated approach for analyzing variation in the performance quality attribute, measured as execution time (response time). It is implemented by a framework that uses dynamic analysis and software repository mining techniques to provide an automated way of revealing potential sources (commits and issues) of performance variation in scenarios during the evolution of software systems. The approach defines four phases: (i) preparation: choosing the scenarios and preparing the target releases; (ii) dynamic analysis: determining the performance of scenarios and methods by computing their execution times; (iii) variation analysis: processing and comparing the dynamic analysis results for different releases; and (iv) repository mining: identifying issues and commits associated with the detected performance variation. Empirical studies were conducted to evaluate the approach from different perspectives. An exploratory study analyzed the feasibility of applying the approach to systems from different domains to automatically identify source code elements with performance variation and the changes that affected those elements during an evolution. This study analyzed three systems: (i) SIGAA, a web system for academic management; (ii) ArgoUML, a UML modeling tool; and (iii) Netty, a framework for network applications. Another study performed an evolutionary analysis by applying the approach to multiple releases of Netty and of the web frameworks Wicket and Jetty. 
That study analyzed 21 releases (seven from each system), totaling 57 scenarios. In summary, 14 scenarios with significant performance variation were found for Netty, 13 for Wicket, and 9 for Jetty. In addition, feedback was obtained from eight developers of these systems through an online form. Finally, in the last study, a regression model for performance was developed to indicate the properties of commits that are most likely to cause performance degradation. Overall, 997 commits were mined: 103 retrieved from degraded source code elements and 19 from optimized ones, while 875 had no impact on execution time. The number of days before the release was made available and the day of the week proved to be the most relevant variables of performance-degrading commits in our model. The area under the receiver operating characteristic (ROC) curve of the regression model is 60%, which means that using the model to decide whether a commit will cause degradation is 10 percentage points better than a random decision (50%).
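The ROC figure can be read as a ranking probability. The following is a minimal sketch, assuming hypothetical model scores (the values are illustrative and are not data from the thesis): it computes the area under the ROC curve as the fraction of (degrading, non-degrading) commit pairs that a model ranks correctly. A value of 0.5 corresponds to random ranking; the thesis reports 0.60 for its model.

```python
def auc(pos_scores, neg_scores):
    """Probability that a random positive outranks a random negative.

    Ties count as half a win, which matches the usual rank-based
    (Mann-Whitney) definition of the area under the ROC curve.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical predicted probabilities of "causes degradation".
degrading_scores = [0.9, 0.4, 0.6]
neutral_scores = [0.5, 0.3, 0.7, 0.2]

area = auc(degrading_scores, neutral_scores)
print(f"AUC = {area:.2f}")
```

The pairwise loop is quadratic in the number of commits, which is fine for illustration; a rank-based formula would be the usual choice for larger samples.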
Abstract:
This research explores Bayesian updating as a tool for estimating parameters probabilistically by dynamic analysis of data sequences. Two distinct Bayesian updating methodologies are assessed. The first approach focuses on Bayesian updating of failure rates for primary events in fault trees. A Poisson Exponentially Weighted Moving Average (PEWMA) model is implemented to carry out Bayesian updating of failure rates for individual primary events in the fault tree. To provide a basis for testing the PEWMA model, a fault tree is developed based on the Texas City Refinery incident, which occurred in 2005. A qualitative fault tree analysis is then carried out to obtain a logical expression for the top event. A dynamic fault tree analysis is carried out by evaluating the top event probability at each Bayesian updating step by Monte Carlo sampling from posterior failure rate distributions. It is demonstrated that PEWMA modeling is advantageous over conventional conjugate Poisson-Gamma updating techniques when failure data are collected over long time spans. The second approach focuses on Bayesian updating of parameters in non-linear forward models. Specifically, the technique is applied to the hydrocarbon material balance equation. To test the accuracy of the implemented Bayesian updating models, a synthetic data set is developed using the Eclipse reservoir simulator. Both structured-grid and MCMC sampling-based solution techniques are implemented and are shown to model the synthetic data set with good accuracy. Furthermore, a graphical analysis shows that the implemented MCMC model displays good convergence properties. A case study demonstrates that the likelihood variance affects the rate at which the posterior assimilates information from the measured data sequence. Error in the measured data significantly affects the accuracy of the posterior parameter distributions. 
Increasing the likelihood variance mitigates random measurement errors, but causes the overall variance of the posterior to increase. Bayesian updating is shown to be advantageous over deterministic regression techniques as it allows for incorporation of prior belief and full modeling of uncertainty over the parameter ranges. As such, the Bayesian approach to estimation of parameters in the material balance equation shows utility for incorporation into reservoir engineering workflows.
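For orientation, the conventional conjugate Poisson-Gamma update that the abstract uses as its baseline can be sketched as follows. This is not the PEWMA model itself, and the prior parameters and observation windows are hypothetical values chosen for illustration. With a Gamma(alpha, beta) prior on the failure rate (shape alpha, rate beta) and k failures observed over exposure time t, the posterior is Gamma(alpha + k, beta + t).

```python
def gamma_poisson_update(alpha, beta, failures, exposure):
    """Conjugate update of a Gamma(alpha, beta) prior on a Poisson rate.

    alpha is the shape and beta the rate parameter (in units of exposure
    time); the posterior is again a Gamma distribution.
    """
    return alpha + failures, beta + exposure


def posterior_mean(alpha, beta):
    """Mean of a Gamma distribution with shape alpha and rate beta."""
    return alpha / beta


# Hypothetical prior and observation windows: (failures, exposure in years).
alpha, beta = 1.0, 10.0
observations = [(2, 5.0), (0, 5.0), (1, 10.0)]

history = []
for failures, exposure in observations:
    alpha, beta = gamma_poisson_update(alpha, beta, failures, exposure)
    history.append(posterior_mean(alpha, beta))

print(history)  # posterior mean failure rate after each window
```

In this update every window carries equal weight however old it is. An exponentially weighted scheme such as PEWMA instead down-weights older data, which is presumably the advantage the abstract refers to for long time spans.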
Abstract:
The L-moments based index-flood procedure was successfully applied for Regional Flood Frequency Analysis (RFFA) for the Island of Newfoundland in 2002 using data up to 1998. This thesis, however, considered both Labrador and the Island of Newfoundland using the L-moments index-flood method with flood data up to 2013. For Labrador, the homogeneity test showed that Labrador can be treated as a single homogeneous region, and the generalized extreme value (GEV) distribution was found to be more robust than any other frequency distribution. The drainage area (DA) is the only significant variable for estimating the index flood at ungauged sites in Labrador. In previous studies, the Island of Newfoundland has been considered as four homogeneous regions (A, B, C and D) as well as two Water Survey of Canada sub-regions, Y and Z. Homogeneous regions based on Y and Z were found to provide more accurate quantile estimates than those based on the four homogeneous regions. Goodness-of-fit test results showed that the generalized extreme value (GEV) distribution is most suitable for the sub-regions; however, the three-parameter lognormal (LN3) distribution gave better performance in terms of robustness. The best-fitting regional frequency distribution from 2002 has now been updated with the latest flood data, but quantile estimates with the new data were not very different from those of the previous study. Overall, in terms of quantile estimation, in both Labrador and the Island of Newfoundland, the index-flood procedure based on L-moments is highly recommended, as it provided consistent and more accurate results than other techniques such as the regression-on-quantiles technique that is currently used by the government.
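The building block of this procedure is the set of sample L-moments at each gauged site. The following is a minimal sketch, assuming the standard unbiased estimators based on probability weighted moments and a hypothetical series of annual peak flows (the numbers are illustrative, not data from the thesis). Taking the site mean as the index flood is a common choice but is an assumption here.

```python
def sample_l_moments(data):
    """Return the first three sample L-moments (l1, l2, l3).

    Uses the unbiased probability weighted moments b0, b1, b2 of the
    ordered sample; x[j] below is the (j + 1)-th smallest observation.
    """
    x = sorted(data)
    n = len(x)
    if n < 3:
        raise ValueError("need at least three observations")
    b0 = sum(x) / n
    b1 = sum(j / (n - 1) * x[j] for j in range(1, n)) / n
    b2 = sum(j * (j - 1) / ((n - 1) * (n - 2)) * x[j] for j in range(2, n)) / n
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    return l1, l2, l3


# Hypothetical annual peak flows (m^3/s) at one gauged site.
peaks = [120.0, 95.0, 210.0, 150.0, 132.0, 310.0, 101.0, 175.0]

l1, l2, l3 = sample_l_moments(peaks)
l_cv = l2 / l1          # L-coefficient of variation
l_skew = l3 / l2        # L-skewness
rescaled = [q / l1 for q in peaks]  # flows divided by the index flood (site mean)
print(l1, l_cv, l_skew)
```

In an index-flood analysis the dimensionless ratios from all sites in a homogeneous region are pooled to fit one regional growth curve, which is then multiplied by each site's index flood to give quantile estimates.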
Abstract:
The known moss flora of Terra Nova National Park, eastern Newfoundland, comprises 210 species. Eighty-two percent of the moss species occurring in Terra Nova are widespread or widespread-sporadic in Newfoundland. Other Newfoundland distributional elements present in the Terra Nova moss flora are the northwestern, southern, southeastern, and disjunct elements, but four of the mosses occurring in Terra Nova appear to belong to a previously unrecognized northeastern element of the Newfoundland flora. The majority (70.9%) of Terra Nova's mosses are of boreal affinity and are widely distributed in the North American coniferous forest belt. An additional 10.5 percent of the Terra Nova mosses are cosmopolitan while 9.5 percent are temperate and 4.8 percent are arctic-montane species. The remaining 4.3 percent of the mosses are of montane affinity, and disjunct between eastern and western North America. In Terra Nova, temperate species at their northern limit are concentrated in balsam fir stands, while arctic-montane species are restricted to exposed cliffs, scree slopes, and coastal exposures. Montane species are largely confined to exposed or freshwater habitats. Inability to tolerate high summer temperatures limits the distributions of both arctic-montane and montane species. In Terra Nova, species of differing phytogeographic affinities co-occur on cliffs and scree slopes. The microhabitat relationships of five selected species from such habitats were evaluated by Discriminant Functions Analysis and Multiple Regression Analysis. The five mosses have distinct and different microhabitats on cliffs and scree slopes in Terra Nova, and abundance of all but one is associated with variation in at least one microhabitat variable. Micro-distribution of Grimmia torquata, an arctic-montane species at its southern limit, appears to be determined by sensitivity to high summer temperatures. 
Both southern mosses at their northern limit (Aulacomnium androgynum, Isothecium myosuroides) appear to be limited by water availability and, possibly, by low winter temperatures. The two species whose distributions extend both north and south of the study area (Encalypta procera, Eurhynchium pulchellum) show no clear relationship with microclimate. Dispersal factors have played a significant role in the development of the Terra Nova moss flora. Compared to the most likely colonizing source (i.e., the rest of the island of Newfoundland), species with small diaspores have colonized the study area to a proportionately much greater extent than have species with large diaspores. Hierarchical log-linear analysis indicates that this is so for all affinity groups present in Terra Nova. The apparent dispersal effects emphasize the comparatively recent glaciation of the area, and may also have been enhanced by anthropogenic influences. The restriction of some species to specific habitats, or to narrowly defined microhabitats, appears to strengthen selection for easily dispersed taxa.
Abstract:
The problems faced by scientists in charge of managing Atlantic salmon (Salmo salar) stocks are: (i) how to maintain spawning runs consisting of repeat spawners and large multi-sea-winter (MSW) adults in the face of selective homewater and distant commercial fisheries, and (ii) how to predict returns of adults more accurately. Using data from scales collected from maiden Atlantic salmon grilse from two locations on the Northern Peninsula of Newfoundland, St. Barbe Bay and Western Arm Brook, their length at smolting was back-calculated. These data were then used to examine whether the St. Barbe commercial fishery is selective for salmon of particular smolt age and/or size. Analysis indicated that the commercial fishery selected larger, but not necessarily older, adults than those escaping to Western Arm Brook over the period of this study, 1978-1987. It was determined that smolts of less than average size survived better than smolts of above average size. Selection for repeat spawners, large MSW salmon, and larger grilse has meant reductions in the proportions of these adults in the spawning runs on Western Arm Brook. This may impact the Western Arm Brook salmon stock by increasing the population instability. Sea survival was significantly correlated with selection by the commercial fishery. Characteristics of adults in Western Arm Brook during the period of study (1978-1987) did not help in explaining yearly variation in sea survival. The characteristics of smolts, however, when subjected to multiple regression analysis explained 57.2 percent of the yearly variation in sea survival.
Abstract:
Omnibus tests of significance in contingency tables use statistics of the chi-square type. When the null is rejected, residual analyses are conducted to identify cells in which observed frequencies differ significantly from expected frequencies. Residual analyses are thus conditioned on a significant omnibus test. Conditional approaches have been shown to substantially alter type I error rates in cases involving t tests conditional on the results of a test of equality of variances, or tests of regression coefficients conditional on the results of tests of heteroscedasticity. We show that residual analyses conditional on a significant omnibus test are also affected by this problem, yielding type I error rates that can be up to 6 times larger than nominal rates, depending on the size of the table and the form of the marginal distributions. We explored several unconditional approaches in search of a method that maintains the nominal type I error rate and found that a bootstrap correction for multiple testing achieved this goal. The validity of this approach is documented for two-way contingency tables in the contexts of tests of independence, tests of homogeneity, and fitting psychometric functions. Computer code in MATLAB and R to conduct these analyses is provided as Supplementary Material.
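The two quantities discussed here, the omnibus chi-square statistic and the cell residuals, can be sketched for a two-way table as follows. The abstract does not say which residual the authors use, so the adjusted standardized residual below is an assumption, the table is hypothetical, and the bootstrap correction from the paper is not reproduced.

```python
import math


def expected_counts(table):
    """Expected cell counts under independence of rows and columns."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    return expected, row_totals, col_totals, n


def chi_square(table):
    """Pearson chi-square statistic for a two-way table."""
    expected, _, _, _ = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


def adjusted_residuals(table):
    """Adjusted standardized residual for every cell.

    Each residual is (observed - expected) divided by its estimated
    standard error, so it is approximately standard normal under the null.
    """
    expected, row_totals, col_totals, n = expected_counts(table)
    residuals = []
    for i, row in enumerate(table):
        out = []
        for j, observed in enumerate(row):
            e = expected[i][j]
            variance = e * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            out.append((observed - e) / math.sqrt(variance))
        residuals.append(out)
    return residuals


# Hypothetical 2 x 2 table of observed frequencies.
table = [[10, 20], [20, 10]]
stat = chi_square(table)
resid = adjusted_residuals(table)
print(stat, resid)
```

Comparing each residual with a fixed normal critical value only after the omnibus test rejects is the conditional procedure whose error rates the abstract criticizes.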
Abstract:
Funding: Verity Watson acknowledges financial support from the Chief Scientist Office of the Scottish Government Health and Social Care Directorates. The funders had no role in the study design; in the collection, analysis and interpretation of data; in the writing of the report; and in the decision to submit the article for publication. Acknowledgements: We thank Marjon van der Pol, Mandy Ryan and Rainer Schulz for helpful comments and suggestions throughout the project. We also thank Karen Gerard and Tim Bolt for comparing the results of our systematic review with a similar systematic review they are conducting at the same time. We would like to thank Douglas Olley for excellent research assistance.
Abstract:
Background: Parental obesity is a predominant risk factor for childhood obesity. Family factors including socio-economic status (SES) play a role in determining parent weight. It is essential to unpick how shared family factors impact on child weight. This study aims to investigate the association between measured parent weight status, familial socio-economic factors and the risk of childhood obesity at age 9. Methodology/Principal Findings: Cross-sectional analysis of the first wave (2008) of the Growing Up in Ireland (GUI) study. GUI is a nationally representative study of 9-year-old children (N = 8,568). Schools were selected from the national total (response rate 82%) and age-eligible children (response rate 57%) were invited to participate. Children and their parents had height and weight measurements taken using standard methods. Data were reweighted to account for the sampling design. Childhood overweight and obesity prevalence were calculated using International Obesity Taskforce definitions. Multinomial logistic regression examined the association between parent weight status, indicators of SES and child weight. Overall, 25% of children were either overweight (19.3%) or obese (6.6%). Parental obesity was a significant predictor of child obesity. Of children with normal-weight parents, 14.4% were overweight or obese, whereas 46.2% of children with obese parents were overweight or obese. Maternal education and household class were more consistently associated with a child being in a higher body mass index category than household income. Adjusted regression indicated that female gender, one-parent family type, lower maternal education, lower household class and a heavier parent weight status significantly increased the odds of childhood obesity. Conclusions/Significance: Parental weight appears to be the most influential factor driving the childhood obesity epidemic in Ireland and is an independent predictor of child obesity across SES groups. 
Due to the high prevalence of obesity in parents and children, population-based interventions are required.
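To give a sense of scale for the two reported percentages, a crude odds ratio can be computed directly from them. This is a back-of-the-envelope sketch only: it uses the rounded proportions from the abstract, collapses the outcome to "overweight or obese" versus not, and ignores the survey weights and covariates, so it is not one of the adjusted estimates from the study's multinomial model.

```python
def odds(p):
    """Convert a proportion into odds."""
    return p / (1.0 - p)


# Proportions of children who were overweight or obese, as reported.
p_normal_weight_parents = 0.144
p_obese_parents = 0.462

crude_odds_ratio = odds(p_obese_parents) / odds(p_normal_weight_parents)
print(f"crude odds ratio = {crude_odds_ratio:.2f}")
```

A multinomial logistic regression generalizes this comparison by estimating a separate set of odds ratios for each outcome category (for example overweight and obese, each against normal weight) while adjusting for the other predictors.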