908 results for Analysis of multiple regression
Abstract:
Hydrophobicity, as measured by Log P, is an important molecular property related to toxicity and carcinogenicity. With increasing public health concerns about the effects of Disinfection By-Products (DBPs), there are considerable benefits in developing Quantitative Structure-Activity Relationship (QSAR) models capable of accurately predicting Log P. In this research, Log P values of 173 DBP compounds in 6 functional classes were used to develop QSAR models by Multiple Linear Regression (MLR) analysis, applying 3 molecular descriptors: Energy of the Lowest Unoccupied Molecular Orbital (ELUMO), Number of Chlorine atoms (NCl) and Number of Carbon atoms (NC). The QSAR models developed were validated based on the Organization for Economic Co-operation and Development (OECD) principles. The model Applicability Domain (AD) and mechanistic interpretation were explored. Considering the very complex nature of DBPs, the established QSAR models performed very well with respect to goodness-of-fit, robustness and predictability. The Log P values of DBPs predicted by the QSAR models were found to be significant, with a coefficient of determination R² from 81% to 98%. The Leverage Approach by Williams Plot was applied to detect and remove outliers, consequently increasing R² by approximately 2% to 13% for different DBP classes. The developed QSAR models were statistically validated for their predictive power by the Leave-One-Out (LOO) and Leave-Many-Out (LMO) cross-validation methods. Finally, Monte Carlo simulation was used to assess the variations and inherent uncertainties in the QSAR models of Log P and to determine the most influential parameters in connection with Log P prediction.
The developed QSAR models in this dissertation will have a broad applicability domain because the research data set covered six out of eight common DBP classes, including halogenated alkane, halogenated alkene, halogenated aromatic, halogenated aldehyde, halogenated ketone, and halogenated carboxylic acid, which have been brought to the attention of regulatory agencies in recent years. Furthermore, the QSAR models are suitable to be used for prediction of similar DBP compounds within the same applicability domain. The selection and integration of various methodologies developed in this research may also benefit future research in similar fields.
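The MLR workflow described in this abstract (fit, leverage screening, Leave-One-Out validation) can be illustrated with a minimal sketch in plain Python. The descriptor values and Log P values below are invented for illustration and are not taken from the dissertation; the function names and the leverage warning threshold h* = 3(k+1)/n, a convention often used with Williams plots, are assumptions rather than the author's stated procedure. The LOO statistic uses the standard identity that the leave-one-out residual equals e_i / (1 - h_ii) for ordinary least squares.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: descriptors are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit_mlr(descriptors, y):
    """Ordinary least squares with an intercept, plus leverage and LOO diagnostics."""
    X = [[1.0] + list(row) for row in descriptors]
    n, p = len(X), len(X[0])
    xtx = [[sum(X[k][i] * X[k][j] for k in range(n)) for j in range(p)]
           for i in range(p)]
    xty = [sum(X[k][i] * y[k] for k in range(n)) for i in range(p)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    # Leverage h_ii = x_i^T (X^T X)^-1 x_i, the horizontal axis of a Williams plot.
    leverage = [sum(row[i] * inv[i][j] * row[j]
                    for i in range(p) for j in range(p)) for row in X]
    mean_y = sum(y) / n
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - sum(e * e for e in resid) / ss_tot
    # PRESS: sum of squared leave-one-out residuals e_i / (1 - h_ii).
    press = sum((e / (1.0 - h)) ** 2 for e, h in zip(resid, leverage))
    q2_loo = 1.0 - press / ss_tot
    h_star = 3.0 * p / n  # assumed warning leverage, p = descriptors + intercept
    return {"beta": beta, "leverage": leverage, "r2": r2,
            "q2_loo": q2_loo, "h_star": h_star}


# Hypothetical (ELUMO, NCl, NC) rows and Log P values, for illustration only.
descriptors = [(0.52, 1, 1), (0.10, 2, 1), (-0.35, 3, 1), (0.44, 1, 2),
               (0.05, 2, 2), (-0.41, 3, 2), (-0.63, 4, 2), (0.21, 2, 3),
               (-0.12, 3, 3), (0.61, 1, 3)]
log_p = [1.0, 1.5, 2.2, 1.3, 1.8, 2.6, 3.0, 2.1, 2.7, 1.4]

result = fit_mlr(descriptors, log_p)
flagged = [i for i, h in enumerate(result["leverage"]) if h > result["h_star"]]
print("R2 =", round(result["r2"], 3), "Q2(LOO) =", round(result["q2_loo"], 3))
print("compounds above the leverage threshold:", flagged)
```

A Q² that falls well below R² would suggest the fit depends heavily on individual compounds, which is the kind of weakness the LOO and leverage checks are meant to expose.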
Abstract:
The maintenance and evolution of software systems have become a highly critical task in recent years because of the diversity of, and high demand for, features, devices and users. Understanding and analyzing how new changes affect the quality attributes of the architecture of such systems is an essential prerequisite for avoiding the deterioration of their quality during their evolution. This thesis proposes an automated approach for analyzing variation in the performance quality attribute in terms of execution time (response time). It is implemented by a framework that adopts dynamic analysis and software repository mining techniques to provide an automated way of revealing potential sources (commits and issues) of performance variation in scenarios during the evolution of software systems. The approach defines four phases: (i) preparation: choosing the scenarios and preparing the target releases; (ii) dynamic analysis: determining the performance of scenarios and methods by computing their execution times; (iii) variation analysis: processing and comparing the dynamic analysis results for different releases; and (iv) repository mining: identifying issues and commits associated with the detected performance variation. Empirical studies were carried out to evaluate the approach from different perspectives. An exploratory study analyzed the feasibility of applying the approach to systems from different domains to automatically identify source code elements with performance variation and the changes that affected those elements during an evolution. That study analyzed three systems: (i) SIGAA, a web system for academic management; (ii) ArgoUML, a UML modeling tool; and (iii) Netty, a framework for network applications. Another study performed an evolutionary analysis by applying the approach to multiple releases of Netty and of the web frameworks Wicket and Jetty.
In that study, 21 releases were analyzed (seven from each system), totaling 57 scenarios. In summary, 14 scenarios with significant performance variation were found for Netty, 13 for Wicket and 9 for Jetty. In addition, feedback was obtained from eight developers of these systems through an online form. Finally, in the last study, a regression model for performance was developed to indicate the properties of commits that are most likely to cause performance degradation. Overall, 997 commits were mined, of which 103 were retrieved from degraded source code elements and 19 from optimized ones, while 875 had no impact on execution time. The number of days before the release was made available and the day of the week proved to be the most relevant variables of commits that degrade performance in our model. The area under the receiver operating characteristic (ROC) curve of the regression model is 60%, which means that using the model to decide whether or not a commit will cause degradation is 10% better than a random decision.
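To make the ROC figure in this abstract concrete: the area under the ROC curve can be read as the probability that a randomly chosen degrading commit receives a higher model score than a randomly chosen non-degrading one, so a random classifier scores 0.5. The sketch below computes that probability directly. The labels and scores are made up for illustration, and the function name `roc_auc` is a hypothetical helper, not part of the thesis framework.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: 1 for a commit that degraded performance, 0 otherwise.
    scores: model scores, higher meaning "more likely to degrade".
    Ties between a positive and a negative count as half a win.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Illustrative data only: four commits, two of which degraded performance.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3]
print(roc_auc(labels, scores))  # 3 of the 4 positive/negative pairs are ordered correctly
```

On this reading, an area of 0.60 sits 10 percentage points above the 0.50 baseline of a random decision.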
Abstract:
The known moss flora of Terra Nova National Park, eastern Newfoundland, comprises 210 species. Eighty-two percent of the moss species occurring in Terra Nova are widespread or widespread-sporadic in Newfoundland. Other Newfoundland distributional elements present in the Terra Nova moss flora are the northwestern, southern, southeastern, and disjunct elements, but four of the mosses occurring in Terra Nova appear to belong to a previously unrecognized northeastern element of the Newfoundland flora. The majority (70.9%) of Terra Nova's mosses are of boreal affinity and are widely distributed in the North American coniferous forest belt. An additional 10.5 percent of the Terra Nova mosses are cosmopolitan, while 9.5 percent are temperate and 4.8 percent are arctic-montane species. The remaining 4.3 percent of the mosses are of montane affinity, and disjunct between eastern and western North America. In Terra Nova, temperate species at their northern limit are concentrated in balsam fir stands, while arctic-montane species are restricted to exposed cliffs, scree slopes, and coastal exposures. Montane species are largely confined to exposed or freshwater habitats. Inability to tolerate high summer temperatures limits the distributions of both arctic-montane and montane species. In Terra Nova, species of differing phytogeographic affinities co-occur on cliffs and scree slopes. The microhabitat relationships of five selected species from such habitats were evaluated by Discriminant Functions Analysis and Multiple Regression Analysis. The five mosses have distinct and different microhabitats on cliffs and scree slopes in Terra Nova, and abundance of all but one is associated with variation in at least one microhabitat variable. Micro-distribution of Grimmia torquata, an arctic-montane species at its southern limit, appears to be determined by sensitivity to high summer temperatures.
Both southern mosses at their northern limit (Aulacomnium androgynum, Isothecium myosuroides) appear to be limited by water availability and, possibly, by low winter temperatures. The two species whose distributions extend both north and south of the study area (Encalypta procera, Eurhynchium pulchellum) show no clear relationship with microclimate. Dispersal factors have played a significant role in the development of the Terra Nova moss flora. Compared to the most likely colonizing source (i.e. the rest of the island of Newfoundland), species with small diaspores have colonized the study area to a proportionately much greater extent than have species with large diaspores. Hierarchical log-linear analysis indicates that this is so for all affinity groups present in Terra Nova. The apparent dispersal effects emphasize the comparatively recent glaciation of the area, and may also have been enhanced by anthropogenic influences. The restriction of some species to specific habitats, or to narrowly defined microhabitats, appears to strengthen selection for easily dispersed taxa.
Abstract:
The problems faced by scientists in charge of managing Atlantic salmon (Salmo salar) stocks are: i) how to maintain spawning runs consisting of repeat spawners and large multi-sea-winter (MSW) adults in the face of selective homewater and distant commercial fisheries, and ii) how to more accurately predict returns of adults. Using data from scales collected from maiden Atlantic salmon grilse from two locations on the Northern Peninsula of Newfoundland, St. Barbe Bay and Western Arm Brook, their length at smolting was back-calculated. These data were then used to examine whether the St. Barbe commercial fishery is selective for salmon of particular smolt age and/or size. Analysis indicated that the commercial fishery selected larger, but not necessarily older, adults than those escaping to Western Arm Brook over the period of this study, 1978-1987. It was determined that smolts of below-average size survived better than smolts of above-average size. Selection for repeat spawners, large MSW salmon, and larger grilse has meant reductions in the proportions of these adults in the spawning runs on Western Arm Brook. This may impact the Western Arm Brook salmon stock by increasing population instability. Sea survival was significantly correlated with selection by the commercial fishery. Characteristics of adults in Western Arm Brook during the period of study (1978-1987) did not help in explaining yearly variation in sea survival. The characteristics of smolts, however, when subjected to multiple regression analysis, explained 57.2 percent of the yearly variation in sea survival.
Abstract:
Omnibus tests of significance in contingency tables use statistics of the chi-square type. When the null is rejected, residual analyses are conducted to identify cells in which observed frequencies differ significantly from expected frequencies. Residual analyses are thus conditioned on a significant omnibus test. Conditional approaches have been shown to substantially alter type I error rates in cases involving t tests conditional on the results of a test of equality of variances, or tests of regression coefficients conditional on the results of tests of heteroscedasticity. We show that residual analyses conditional on a significant omnibus test are also affected by this problem, yielding type I error rates that can be up to 6 times larger than nominal rates, depending on the size of the table and the form of the marginal distributions. We explored several unconditional approaches in search of a method that maintains the nominal type I error rate and found that a bootstrap correction for multiple testing achieved this goal. The validity of this approach is documented for two-way contingency tables in the contexts of tests of independence, tests of homogeneity, and fitting psychometric functions. Computer code in MATLAB and R to conduct these analyses is provided as Supplementary Material.
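For readers unfamiliar with the two quantities this abstract contrasts, the following is a minimal Python sketch of the omnibus Pearson chi-square statistic and the cell-wise adjusted (standardized) residuals for a two-way table. It is not the authors' MATLAB or R code and does not implement their bootstrap correction; the function name `adjusted_residuals` and the example counts are assumptions made for illustration. The sketch assumes every row and column total is positive and smaller than the grand total.

```python
from math import sqrt


def adjusted_residuals(table):
    """Return (chi-square statistic, matrix of adjusted residuals) for a two-way table.

    Expected counts follow the independence model: E_ij = row_i * col_j / N.
    Each adjusted residual is (O - E) / sqrt(E * (1 - row_i/N) * (1 - col_j/N)),
    which is approximately standard normal under the null hypothesis.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi_square = 0.0
    residuals = []
    for i, row in enumerate(table):
        out_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi_square += (observed - expected) ** 2 / expected
            scale = expected * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            out_row.append((observed - expected) / sqrt(scale))
        residuals.append(out_row)
    return chi_square, residuals


# Hypothetical 2x2 table of counts.
chi_square, residuals = adjusted_residuals([[10, 20], [20, 10]])
print(chi_square)
print(residuals)
```

Inspecting the residuals only after the omnibus statistic is significant is the conditional procedure whose inflated type I error rate the abstract describes.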
Abstract:
Peer reviewed
Abstract:
ACKNOWLEDGMENT We are thankful to RTE for financial support of this project.
Abstract:
Marine protected areas (MPAs) are commonly employed to protect ecosystems from threats like overfishing. Ideally, MPA design should incorporate movement data from multiple target species to ensure sufficient habitat is protected. We used long-term acoustic telemetry and network analysis to determine the fine-scale space use of five shark and one turtle species at a remote atoll in the Seychelles, Indian Ocean, and evaluate the efficacy of a proposed MPA. Results revealed strong, species-specific habitat use in both sharks and turtles, with corresponding variation in MPA use. Defining the MPA's boundary from the edge of the reef flat at low tide instead of the beach at high tide (the current best in Seychelles) significantly increased the MPA's coverage of predator movements by an average of 34%. Informed by these results, the larger MPA was adopted by the Seychelles government, demonstrating how telemetry data can improve shark spatial conservation by affecting policy directly.
Abstract:
We performed fluorescent in situ hybridization (FISH) for 16q23 abnormalities in 861 patients with newly diagnosed multiple myeloma and identified deletion of 16q [del(16q)] in 19.5%. In 467 cases in which demographic and survival data were available, del(16q) was associated with a worse overall survival (OS). It was an independent prognostic marker and conferred additional adverse survival impact in cases with the known poor-risk cytogenetic factors t(4;14) and del(17p). Gene expression profiling and gene mapping using 500K single-nucleotide polymorphism (SNP) mapping arrays revealed loss of heterozygosity (LOH) involving 3 regions: the whole of 16q, a region centered on 16q12 (the location of CYLD), and a region centered on 16q23 (the location of the WW domain-containing oxidoreductase gene WWOX). CYLD is a negative regulator of the NF-kappaB pathway, and cases with low expression of CYLD were used to define a "low-CYLD signature." Cases with 16q LOH or t(14;16) had significantly reduced WWOX expression. WWOX, the site of the translocation breakpoint in t(14;16) cases, is a known tumor suppressor gene involved in apoptosis, and we were able to generate a "low-WWOX signature" defined by WWOX expression. These 2 genes and their corresponding pathways provide an important insight into the potential mechanisms by which 16q LOH confers poor prognosis.
Abstract:
BACKGROUND AND OBJECTIVE: Molecular analysis by PCR of monoclonally rearranged immunoglobulin (Ig) genes can be used for diagnosis in B-cell lymphoproliferative disorders (LPD), as well as for monitoring minimal residual disease (MRD) after treatment. This technique has the risk of false-positive results due to the "background" amplification of similar rearrangements derived from polyclonal B-cells. This problem can be resolved in advance by additional analyses that discern between polyclonal and monoclonal PCR products, such as the heteroduplex analysis. A second problem is that PCR frequently fails to amplify the junction regions, mainly due to somatic mutations frequently present in mature (post-follicular) B-cell lymphoproliferations. The use of additional targets (e.g. Ig light chain genes) can avoid this problem. DESIGN AND METHODS: We studied the specificity of heteroduplex PCR analysis of several Ig junction regions to detect monoclonal products in samples from 84 MM patients and 24 patients with polyclonal B-cell disorders. RESULTS: Using two distinct VH consensus primers (FR3 and FR2) in combination with one JH primer, 79% of the MM cases displayed monoclonal products. The percentage of positive cases was increased by amplification of the Vlambda-Jlambda junction regions or kappa(de) rearrangements, using two or five pairs of consensus primers, respectively. After including these targets in the heteroduplex PCR analysis, 93% of MM cases displayed monoclonal products. None of the polyclonal samples analyzed resulted in monoclonal products. Dilution experiments showed that monoclonal rearrangements could be detected with a sensitivity of at least 10(-2) in a background with >30% polyclonal B-cells, the sensitivity increasing up to 10(-3) when the polyclonal background was
Abstract:
Purpose – The objective of this exploratory study is to investigate the “flow-through” or relationship between top-line measures of hotel operating performance (occupancy, average daily rate and revenue per available room) and bottom-line measures of profitability (gross operating profit and net operating income), before and during the recent great recession. Design/methodology/approach – This study uses data provided by PKF Hospitality Research for the period from 2007-2009. A total of 714 hotels were analyzed and various top-line and bottom-line profitability changes were computed using both absolute levels and percentages. Multiple regression analysis was used to examine the relationship between top and bottom line measures, and to derive flow-through ratios. Findings – The results show that average daily rate (ADR) and occupancy are significantly and positively related to gross operating profit per available room (GOPPAR) and net operating income per available room (NOIPAR). The evidence indicates that ADR, rather than occupancy, appears to be the stronger predictor and better measure of RevPAR growth and bottom-line profitability. The correlations and explained variances are also higher than those reported in prior research. Flow-through ratios range between 1.83 and 1.91 for NOIPAR, and between 1.55 and 1.65 for GOPPAR, across all chain-scales. Research limitations/implications – Limitations of this study include the limited number of years in the study period, limited number of hotels in a competitive set, and self-selection of hotels by the researchers. Practical implications – While ADR and occupancy work in combination to drive profitability, the authors' study shows that ADR is the stronger predictor of profitability. Hotel managers can use flow-through ratios to make financial forecasts, or use them as inputs in valuation models, to forecast future profitability. 
Originality/value – This paper extends prior research on the relationship between top-line measures and bottom-line profitability and serves to inform lodging owners, operators and asset managers about flow-through ratios, and how these ratios impact hotel profitability.
Abstract:
This research looks into forms of state crime taking place around the U.S.-Mexico border. On the Mexican side of the border, violent corruption and criminal activities stemming from state actors' complicity with drug trafficking organisations have produced widespread violence and human casualties while forcing many to cross the border, legally or illegally, in fear for their lives. Upon their arrival on the U.S. side of the border, these individuals are treated as criminal suspects. They are held in immigration detention facilities, interrogated and categorised as inadmissible ‘economic migrants’ or ‘drug offenders’, only to be denied asylum status and deported to dangerous and violent zones in Mexico. These individuals were persecuted and victimised by the state during the 2007-2012 counter-narcotic operations on one side of the border, while being criminalised and punished by a categorising anti-immigration regime on the other side. This thesis examines this border crisis, in which injurious actions against border residents have been carried out by the states in both legal and illegal forms, in violation of criminal law and human rights conventions. The ethnographic research uses data to develop a nuanced understanding of individuals’ experiences of state victimisation on both sides of the border. In contributing to state crime scholarship, it presents a multidimensional theoretical lens, using organised crime theoretical models and critical criminology concepts to explain the role of the state in producing multiple insecurities that exclude citizens and non-citizens through criminalisation processes.