941 results for Log-linear model
Abstract:
Generalized linear Poisson and logistic regression models were used to examine the relationship between temperature, precipitation, and cases of Saint Louis encephalitis virus in the Houston metropolitan area. The models were fitted with and without repeated measures, with a first-order autoregressive (AR1) correlation structure used for the repeated measures model. Both Poisson regression models, with and without the correlation structure, showed that a one-unit increase in temperature (degrees Fahrenheit) multiplies the occurrence of the virus by 1.7 and a one-unit increase in precipitation (inches) multiplies it by 1.5. Logistic regression did not show these covariates to be significant predictors of encephalitis activity in Houston under either correlation structure. This discrepancy for the logistic model could be attributed to the small data set. Keywords: Saint Louis Encephalitis; Generalized Linear Model; Poisson; Logistic; First Order Autoregressive; Temperature; Precipitation.
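The multiplicative reading of the Poisson results can be illustrated with a minimal sketch. It assumes the reported factors of 1.7 and 1.5 are rate ratios exp(β) from a log-link model with no interaction term (the abstract does not report the coefficients themselves); the function name is hypothetical.

```python
import math

def rate_multiplier(rate_ratio_per_unit: float, delta: float) -> float:
    """Multiplicative change in the expected count for a `delta`-unit change
    in a covariate, under a log-link model where exp(beta) = rate ratio."""
    beta = math.log(rate_ratio_per_unit)  # implied coefficient on the log scale
    return math.exp(beta * delta)

temp_rr = 1.7    # per degree Fahrenheit, from the abstract
precip_rr = 1.5  # per inch, from the abstract

# A 2-degree rise multiplies the expected count by 1.7 ** 2.
two_degree = rate_multiplier(temp_rr, 2.0)
# One degree warmer and one inch wetter together: effects multiply
# (assumes no temperature-by-precipitation interaction in the model).
combined = rate_multiplier(temp_rr, 1.0) * rate_multiplier(precip_rr, 1.0)
```

On the log scale the effects add, which is why they compound multiplicatively on the count scale.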
Abstract:
Objective: In this secondary data analysis, three statistical methodologies were implemented to handle cases with missing data in a motivational interviewing and feedback study. The aim was to evaluate the impact that these methodologies have on the data analysis. Methods: We first evaluated whether the assumption of missing completely at random held for this study. We then conducted a secondary data analysis using a mixed linear model, handling missing data with three methodologies: (a) complete case analysis; (b) multiple imputation with an explicit model containing outcome variables, time, and the interaction of time and treatment; and (c) multiple imputation with an explicit model containing outcome variables, time, the interaction of time and treatment, and additional covariates (e.g., age, gender, smoking, years in school, marital status, housing, race/ethnicity, and whether participants play on an athletic team). Several comparisons were conducted, including: 1) the motivational interviewing with feedback group (MIF) vs. the assessment only group (AO), the motivational interviewing group (MIO) vs. AO, and the feedback only group (FBO) vs. AO; 2) MIF vs. FBO; and 3) MIF vs. MIO. Results: We first evaluated the patterns of missingness in this study, which indicated that about 13% of participants showed monotone missing patterns and about 3.5% showed non-monotone missing patterns. We then evaluated the assumption of missing completely at random with Little's missing completely at random (MCAR) test: the chi-square test statistic was 167.8 with 125 degrees of freedom, and its associated p-value was p=0.006, which indicated that the data could not be assumed to be missing completely at random. After that, we compared whether the three strategies reached the same results.
For the comparison between MIF and AO as well as the comparison between MIF and FBO, only the multiple imputation with additional covariates by uncongenial and congenial models reached different results. For the comparison between MIF and MIO, all the methodologies for handling missing values obtained different results. Discussion: The study indicated that, first, missingness was crucial in this study. Second, understanding the assumptions of the model was important, since we could not determine whether the data were missing at random or missing not at random. Therefore, future research should focus on further sensitivity analyses under the missing not at random assumption.
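The reported result of Little's MCAR test (chi-square statistic 167.8 on 125 degrees of freedom) can be sanity-checked with a short sketch. It uses the Wilson-Hilferty normal approximation to the chi-square tail, which is a substitution made here so the example needs only the standard library; it is not the method the authors used, and the function name is hypothetical.

```python
import math

def chi2_upper_tail_wh(x: float, df: int) -> float:
    """Approximate P(X >= x) for X ~ chi-square(df) using the
    Wilson-Hilferty cube-root normal approximation."""
    h = 2.0 / (9.0 * df)
    z = ((x / df) ** (1.0 / 3.0) - (1.0 - h)) / math.sqrt(h)
    # Upper tail of the standard normal via the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Statistic and degrees of freedom as reported in the abstract.
p_value = chi2_upper_tail_wh(167.8, 125)
print(f"approximate p-value: {p_value:.4f}")
```

The approximation lands close to the reported p=0.006; an exact chi-square survival function would be preferable in a real analysis.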
Abstract:
Background: Past and recent evidence shows that radionuclides in drinking water may be a public health concern. Developmental thresholds for birth defects with respect to chronic low-level domestic radiation exposures, such as through drinking water, have not been definitively established, and there is a strong need to address this gap in information. In this study we examined the geographic distribution of orofacial cleft birth defects in and around the uranium mining district counties of South Texas (Atascosa, Bee, Brooks, Calhoun, Duval, Goliad, Hidalgo, Jim Hogg, Jim Wells, Karnes, Kleberg, Live Oak, McMullen, Nueces, San Patricio, Refugio, Starr, Victoria, Webb, and Zavala) from 1999 to 2007. The probable association of cleft birth defect rates by ZIP code, classified according to uranium and radium concentrations in drinking water supplies, was evaluated. Similar associations between orofacial cleft birth defects and radium/radon in drinking water were reported earlier by Cech and co-investigators in another part of the Gulf Coast region (Harris County, Texas) [50, 55]. Since substantial uranium mining activity existed and still exists in South Texas, contamination of drinking water sources with radiation and its relation to birth defects is a cause for concern. Methods: Residential addresses of orofacial cleft birth defect cases, as well as live births within the twenty counties during 1999-2007, were geocoded and mapped. Prevalence rates were calculated by ZIP code and mapped accordingly. Locations of drinking water supplies were also geocoded and mapped. ZIP codes were stratified as having high combined uranium (≥30 μg/L) vs. low combined uranium (<30 μg/L). Likewise, ZIP codes having the radium isotope Ra-226 in drinking water were stratified as having elevated radium (≥3 pCi/L) vs. low radium (<3 pCi/L). A linear regression was performed using the STATA® generalized linear model (GLM) program to evaluate the probable association between cleft birth defect rates by ZIP code and the concentration of uranium and radium in the domestic water supply. These rates were further adjusted for potentially confounding variables such as maternal age, education, occupation, and ethnicity. Results: This study showed higher rates of cleft births in ZIP codes classified as having high combined uranium than in ZIP codes having low combined uranium. The model was further improved by adding radium, stratified as explained above. Adjustment for maternal age and ethnicity did not substantially affect the statistical significance of uranium or radium concentrations in household water supplies. Conclusion: Although this study lacks individual exposure levels, the findings suggest a significant association between elevated uranium and radium concentrations in tap water and high orofacial birth defect rates by ZIP code. Future case-control studies that can measure individual exposure levels and adjust for competing risk factors could result in a better understanding of the exposure-disease association.
Abstract:
The infant mortality rate (IMR) is considered to be one of the most important indices of a country's well-being. Countries around the world and health organizations such as the World Health Organization are dedicating their resources, knowledge and energy to reducing infant mortality rates. The well-known Millennium Development Goal 4 (MDG 4), whose aim is to achieve a two-thirds reduction of the under-five mortality rate between 1990 and 2015, is an example of this commitment. In this study our goal is to model the trends of IMR from the 1950s to the 2010s for selected countries. We would like to know how the IMR is changing over time and how it differs across countries. IMR data collected over time form a time series. The repeated observations of an IMR time series are not statistically independent, so in modeling the trend of IMR it is necessary to account for these correlations. We propose to use the generalized least squares method in a general linear model setting to deal with the variance-covariance structure in our model. To estimate the variance-covariance matrix, we refer to time-series models, especially autoregressive and moving average models. Furthermore, we compare results from the general linear model with correlation structure to those from the ordinary least squares method, which does not take the correlation structure into account, to check how much the estimates change.
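The generalized least squares idea can be sketched for the simplest case. The example below assumes a straight-line trend with stationary AR(1) errors and a known autoregressive coefficient `rho`; the abstract does not state which time-series structure was finally chosen, so this is one illustrative choice. It uses the Prais-Winsten transformation, which is equivalent to GLS under that covariance structure. The function name and data values are hypothetical.

```python
import math

def gls_ar1_trend(t, y, rho):
    """GLS estimates (intercept, slope) for y = a + b*t + e, where e follows
    a stationary AR(1) process with known coefficient rho (|rho| < 1).
    Uses the Prais-Winsten transform, then ordinary least squares."""
    w = math.sqrt(1.0 - rho * rho)
    # Transformed intercept column, trend column, and response.
    c0 = [w] + [1.0 - rho] * (len(t) - 1)
    c1 = [w * t[0]] + [t[i] - rho * t[i - 1] for i in range(1, len(t))]
    ys = [w * y[0]] + [y[i] - rho * y[i - 1] for i in range(1, len(y))]
    # Normal equations for the two-column transformed design.
    s00 = sum(a * a for a in c0)
    s01 = sum(a * b for a, b in zip(c0, c1))
    s11 = sum(b * b for b in c1)
    r0 = sum(a * v for a, v in zip(c0, ys))
    r1 = sum(b * v for b, v in zip(c1, ys))
    det = s00 * s11 - s01 * s01
    intercept = (s11 * r0 - s01 * r1) / det
    slope = (s00 * r1 - s01 * r0) / det
    return intercept, slope

# Hypothetical, noise-free decade index and IMR-like values (not real data).
decades = [0, 1, 2, 3, 4, 5, 6]
imr = [120.0 - 15.0 * d for d in decades]
a_hat, b_hat = gls_ar1_trend(decades, imr, rho=0.6)
```

With noisy data the GLS and OLS point estimates generally differ somewhat, and their standard errors can differ considerably, which is the comparison the abstract describes.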
Abstract:
Cardiovascular disease (CVD) is a threat to public health. It has been reported to be the leading cause of death in the United States. The invention of next generation sequencing (NGS) technology has revolutionized biomedical research. Investigating NGS data on CVD-related quantitative traits would help address the unknown etiology and disease mechanism of CVD. NHLBI's Exome Sequencing Project (ESP) contains CVD-related phenotypes and their associated NGS exome sequence data. Initially, a subset of next generation sequencing data consisting of 13 CVD-related quantitative traits was investigated. Only 6 traits, systolic blood pressure (SBP), diastolic blood pressure (DBP), height, platelet counts, waist circumference, and weight, were analyzed by the functional linear model (FLM) and 7 currently existing methods. FLM outperformed all currently existing methods by identifying the highest number of significant genes, identifying 96, 139, 756, 1162, 1106, and 298 genes associated with SBP, DBP, height, platelet counts, waist circumference, and weight, respectively.
New methods for quantification and analysis of quantitative real-time polymerase chain reaction data
Abstract:
Quantitative real-time polymerase chain reaction (qPCR) is a sensitive gene quantitation method that has been widely used in the biological and biomedical fields. The currently used methods for PCR data analysis, including the threshold cycle (CT) method, linear and non-linear model fitting methods, all require subtracting background fluorescence. However, the removal of background fluorescence is usually inaccurate, and therefore can distort results. Here, we propose a new method, the taking-difference linear regression method, to overcome this limitation. Briefly, for each two consecutive PCR cycles, we subtracted the fluorescence in the former cycle from that in the later cycle, transforming the n cycle raw data into n-1 cycle data. Then linear regression was applied to the natural logarithm of the transformed data. Finally, amplification efficiencies and the initial DNA molecular numbers were calculated for each PCR run. To evaluate this new method, we compared it in terms of accuracy and precision with the original linear regression method with three background corrections, being the mean of cycles 1-3, the mean of cycles 3-7, and the minimum. Three criteria, including threshold identification, max R2, and max slope, were employed to search for target data points. Considering that PCR data are time series data, we also applied linear mixed models. Collectively, when the threshold identification criterion was applied and when the linear mixed model was adopted, the taking-difference linear regression method was superior as it gave an accurate estimation of initial DNA amount and a reasonable estimation of PCR amplification efficiencies. When the criteria of max R2 and max slope were used, the original linear regression method gave an accurate estimation of initial DNA amount. Overall, the taking-difference linear regression method avoids the error in subtracting an unknown background and thus it is theoretically more accurate and reliable. 
This method is easy to perform and the taking-difference strategy can be extended to all current methods for qPCR data analysis.
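The taking-difference step can be sketched as follows. The sketch assumes a simple exponential-phase model, F_n = B + F0 * E**n, where B is an unknown constant background, F0 the initial signal, and E the per-cycle amplification factor; this model and all names and numbers below are assumptions for illustration and are not taken from the abstract. Differencing consecutive cycles cancels B, and the log of the differences is linear in the cycle number.

```python
import math

def taking_difference_fit(fluorescence):
    """Estimate per-cycle amplification factor E and initial signal F0 from
    raw fluorescence F_1..F_N, assuming F_n = B + F0 * E**n with an unknown
    constant background B (an assumed model, not taken from the abstract)."""
    # D_n = F_(n+1) - F_n = F0 * E**n * (E - 1), so the background cancels.
    cycles = list(range(1, len(fluorescence)))
    log_diffs = [math.log(fluorescence[n] - fluorescence[n - 1]) for n in cycles]
    # Ordinary least squares of ln(D_n) on n.
    k = len(cycles)
    mean_x = sum(cycles) / k
    mean_y = sum(log_diffs) / k
    sxx = sum((x - mean_x) ** 2 for x in cycles)
    sxy = sum((x - mean_x) * (v - mean_y) for x, v in zip(cycles, log_diffs))
    slope = sxy / sxx                      # equals ln(E)
    intercept = mean_y - slope * mean_x    # equals ln(F0 * (E - 1))
    amp_factor = math.exp(slope)
    f0 = math.exp(intercept) / (amp_factor - 1.0)
    return amp_factor, f0

# Synthetic, noise-free run (hypothetical values): B = 5.0, F0 = 0.001, E = 1.9.
raw = [5.0 + 0.001 * 1.9 ** n for n in range(1, 21)]
e_hat, f0_hat = taking_difference_fit(raw)
```

Real runs would need the regression restricted to exponential-phase cycles with positive differences, which is where the point-selection criteria mentioned in the abstract come in.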
Abstract:
Complex diseases such as cancer result from multiple genetic changes and environmental exposures. Due to the rapid development of genotyping and sequencing technologies, we are now able to assess the causal effects of many genetic and environmental factors more accurately. Genome-wide association studies have localized many causal genetic variants predisposing to certain diseases. However, these studies explain only a small portion of the heritability of diseases. More advanced statistical models are urgently needed to identify and characterize additional genetic and environmental factors and their interactions, which will enable us to better understand the causes of complex diseases. In the past decade, thanks to increasing computational capabilities and novel statistical developments, Bayesian methods have been widely applied in genetics/genomics research and have demonstrated superiority over some standard approaches in certain research areas. Gene-environment and gene-gene interaction studies are among the areas where Bayesian methods can fully exert their advantages. This dissertation focuses on developing new Bayesian statistical methods for data analysis with complex gene-environment and gene-gene interactions, as well as extending some existing methods for gene-environment interactions to other related areas. It includes three sections: (1) deriving the Bayesian variable selection framework for hierarchical gene-environment and gene-gene interactions; (2) developing the Bayesian Natural and Orthogonal Interaction (NOIA) models for gene-environment interactions; and (3) extending the applications of two Bayesian statistical methods, which were developed for gene-environment interaction studies, to other related types of studies such as adaptive borrowing of historical data.
We propose a Bayesian hierarchical mixture model framework that allows us to investigate the genetic and environmental effects, gene by gene interactions (epistasis) and gene by environment interactions in the same model. It is well known that, in many practical situations, there exists a natural hierarchical structure between the main effects and interactions in the linear model. Here we propose a model that incorporates this hierarchical structure into the Bayesian mixture model, such that the irrelevant interaction effects can be removed more efficiently, resulting in more robust, parsimonious and powerful models. We evaluate both of the 'strong hierarchical' and 'weak hierarchical' models, which specify that both or one of the main effects between interacting factors must be present for the interactions to be included in the model. The extensive simulation results show that the proposed strong and weak hierarchical mixture models control the proportion of false positive discoveries and yield a powerful approach to identify the predisposing main effects and interactions in the studies with complex gene-environment and gene-gene interactions. We also compare these two models with the 'independent' model that does not impose this hierarchical constraint and observe their superior performances in most of the considered situations. The proposed models are implemented in the real data analysis of gene and environment interactions in the cases of lung cancer and cutaneous melanoma case-control studies. The Bayesian statistical models enjoy the properties of being allowed to incorporate useful prior information in the modeling process. Moreover, the Bayesian mixture model outperforms the multivariate logistic model in terms of the performances on the parameter estimation and variable selection in most cases. 
Our proposed models impose hierarchical constraints that further improve the Bayesian mixture model by reducing the proportion of false positive findings among the identified interactions while successfully identifying the reported associations. This is practically appealing for studies investigating causal factors among a moderate number of candidate genetic and environmental factors along with a relatively large number of interactions. The natural and orthogonal interaction (NOIA) models of genetic effects were previously developed to provide an analysis framework in which the estimates of effects for a quantitative trait are statistically orthogonal regardless of whether Hardy-Weinberg Equilibrium (HWE) holds within loci. Ma et al. (2012) recently developed a NOIA model for gene-environment interaction studies and showed the advantages of using the model for detecting the true main effects and interactions, compared with the usual functional model. In this project, we propose a novel Bayesian statistical model that combines the Bayesian hierarchical mixture model with the NOIA statistical model and the usual functional model. The proposed Bayesian NOIA model demonstrates more power at detecting the non-null effects, with higher marginal posterior probabilities. We also review two Bayesian statistical models (the Bayesian empirical shrinkage-type estimator and Bayesian model averaging) that were developed for gene-environment interaction studies. Inspired by these Bayesian models, we develop two novel statistical methods that can handle related problems such as borrowing data from historical studies. The proposed methods are analogous to the methods for gene-environment interactions in that they balance statistical efficiency and bias in a unified model.
Through extensive simulation studies, we compare the operating characteristics of the proposed models with those of existing models, including the hierarchical meta-analysis model. The results show that the proposed approaches adaptively borrow the historical data in a data-driven way. These novel models may have a broad range of statistical applications in both genetic/genomic and clinical studies.
Abstract:
Up to now, snow cover on Antarctic sea ice and its impact on radar backscatter, particularly after the onset of freeze/thaw processes, are not well understood. Here we present a combined analysis of in situ observations of snow properties from the landfast sea ice in Atka Bay, Antarctica, and high-resolution TerraSAR-X backscatter data, for the transition from austral spring (November 2012) to summer (January 2013). The physical changes in the seasonal snow cover during that time are reflected in the evolution of TerraSAR-X backscatter. We are able to explain 76-93% of the spatio-temporal variability of the TerraSAR-X backscatter signal with up to four snowpack parameters with a root-mean-squared error of 0.87-1.62 dB, using a simple multiple linear model. Over the complete study, and especially after the onset of early-melt processes and freeze/thaw cycles, the majority of variability in the backscatter is influenced by changes in snow/ice interface temperature, snow depth and top-layer grain size. This suggests it may be possible to retrieve snow physical properties over Antarctic sea ice from X-band SAR backscatter.
Abstract:
The hydraulic effect of asymmetric compound bedforms on tidal currents was assessed from field measurements of flow velocity in the Knudedyb tidal inlet, Denmark. Large asymmetric bedforms with smaller superimposed ones are a common feature of sandy shallow water environments and are known to act as hydraulic roughness elements in dependence with flow direction. The presence of a flow separation zone on the bedform lee was estimated through analysis of the measured velocity directions and the calculation of the flow separation line. The Law of the Wall was used to calculate roughness lengths and shear velocities from log-linear segments sought on transect-averaged and single-location velocity profiles. During the ebb tide a permanent flow separation zone was established over the steep (10-20°) lee sides of the ebb-oriented primary bedforms, which generated a consequent drag on the flow. During the flood, no flow separation was induced by the gentle (2°) lee side of the primary bedforms except over the steepest (10°) part of the lee side where a small separation zone was sometimes observed. As a result, hydraulic roughness was only due to the superimposed bedforms. The parameterized flow separation line was found to underestimate the length of the flow separation zone of the primary bedforms. A better estimation of the presence and shape of the flow separation zone over complex bedforms in a tidal environment still needs to be determined; in particular the relationship between flow separation zone and bedform geometry (asymmetry, relative height or slope of the lee side) is unclear. This would improve the prediction of complex bedform roughness in tidal flows.
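The Law of the Wall calculation on a log-linear profile segment can be sketched as follows. The sketch assumes the standard form u(z) = (u*/κ) ln(z/z0) with a von Kármán constant of 0.4; the abstract does not state the value used, and the function name and profile values are hypothetical.

```python
import math

VON_KARMAN = 0.4  # commonly used value of the von Karman constant (assumed here)

def law_of_the_wall_fit(heights_m, speeds_m_s):
    """Fit u(z) = (u_star / kappa) * ln(z / z0) by regressing u on ln(z).
    Returns (u_star, z0): shear velocity [m/s] and roughness length [m]."""
    x = [math.log(z) for z in heights_m]
    n = len(x)
    mean_x = sum(x) / n
    mean_u = sum(speeds_m_s) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxu = sum((xi - mean_x) * (ui - mean_u) for xi, ui in zip(x, speeds_m_s))
    slope = sxu / sxx                    # equals u_star / kappa
    intercept = mean_u - slope * mean_x  # equals -(u_star / kappa) * ln(z0)
    u_star = VON_KARMAN * slope
    z0 = math.exp(-intercept / slope)
    return u_star, z0

# Synthetic profile (hypothetical values): u_star = 0.05 m/s, z0 = 0.01 m.
heights = [0.2, 0.5, 1.0, 2.0, 4.0]
speeds = [(0.05 / VON_KARMAN) * math.log(z / 0.01) for z in heights]
u_star_hat, z0_hat = law_of_the_wall_fit(heights, speeds)
```

On measured profiles only the segment that is actually log-linear should be passed in, which is why the abstract speaks of log-linear segments being sought on each profile.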
Abstract:
Phycobiliproteins are a family of water-soluble pigment proteins that play an important role as accessory or antenna pigments and absorb in the green part of the light spectrum poorly used by chlorophyll a. The phycoerythrins (PEs) are one of four types of phycobiliproteins that are generally distinguished based on their absorption properties. As PEs are water soluble, they are generally not captured with conventional pigment analysis. Here we present a statistical model based on in situ measurements of three transatlantic cruises which allows us to derive relative PE concentration from standardized hyperspectral underwater radiance measurements (Lu). The model relies on Empirical Orthogonal Function (EOF) analysis of Lu spectra and, subsequently, a Generalized Linear Model with measured PE concentrations as the response variable and EOF loadings as predictor variables. The method is used to predict relative PE concentrations throughout the water column and to calculate integrated PE estimates based on those profiles.
Abstract:
Ocean acidification can have negative repercussions from the organism to ecosystem levels. Octocorals deposit high-magnesium calcite in their skeletons, and according to different models, they could be more susceptible to the depletion of carbonate ions than either calcite or aragonite-depositing organisms. This study investigated the response of the gorgonian coral Eunicea fusca to a range of CO2 concentrations from 285 to 4,568 ppm (pH range 8.1-7.1) over a 4-week period. Gorgonian growth and calcification were measured at each level of CO2 as linear extension rate and percent change in buoyant weight and calcein incorporation in individual sclerites, respectively. There was a significant negative relationship for calcification and CO2 concentration that was well explained by a linear model regression analysis for both buoyant weight and calcein staining. In general, growth and calcification did not stop in any of the concentrations of pCO2; however, some of the octocoral fragments experienced negative calcification at undersaturated levels of calcium carbonate (>4,500 ppm) suggesting possible dissolution effects. These results highlight the susceptibility of the gorgonian coral E. fusca to elevated levels of carbon dioxide but suggest that E. fusca could still survive well in mid-term ocean acidification conditions expected by the end of this century, which provides important information on the effects of ocean acidification on the dynamics of coral reef communities. Gorgonian corals can be expected to diversify and thrive in the Atlantic-Eastern Pacific; as scleractinian corals decline, it is likely to expect a shift in these reef communities from scleractinian coral dominated to octocoral/soft coral dominated under a "business as usual" scenario of CO2 emissions.
Abstract:
Question: How do tree species identity, microhabitat and water availability affect inter- and intra-specific interactions between juvenile and adult woody plants? Location: Continental Mediterranean forests in Alto Tajo Natural Park, Guadalajara, Spain. Methods: A total of 2066 juveniles and adults of four co-occurring tree species were mapped in 17 plots. The frequency of juveniles at different microhabitats and water availability levels was analysed using log-linear models. We used nearest-neighbour contingency table analysis of spatial segregation and J-functions to describe the spatial patterns. Results: We found a complex spatial pattern that varied according to species identity and microhabitat. Recruitment was more frequent in gaps for Quercus ilex, while the other three species recruited preferentially under shrubs or trees depending on the water availability level. Juveniles were not spatially associated with conspecific adults, and were segregated from them in many cases. Spatial associations, both positive and negative, were more common at higher water availability levels. Conclusions: Our results do not agree with expectations from the stress-gradient hypothesis, suggesting that positive interactions do not increase in importance with increasing aridity in the study ecosystem. Regeneration patterns are species-specific and depend on microhabitat characteristics and dispersal strategies. In general, juveniles do not seek the protection of conspecific adults. This work contributes to the understanding of species co-existence, showing the importance of a multispecies approach at several plots to overcome the limitations of simple pair-wise comparisons in a limited number of sites.
Abstract:
This study compiles new trends in earthquake-resistant design, focusing on the base isolation technique because it is the most effective, widespread and widely used, and analyses the advantages that a building applying this technique can have from the structural and economic points of view. The most common type of reinforced concrete building likely to be isolated is chosen, in this case a hospital, whose fixed-base model is subjected to several seismic codes, mainly comparing base shear forces and considering soil-structure interaction; to support this calculation, a program for beam elements with 6 degrees of freedom (dof) per node was developed in Matlab code. The isolated model includes the analysis of three combinations of isolator types (HDR, LPR and FPS), alternating simplified linear models with 1 and 3 dof per floor, evaluating differences in the structural response, and choosing the combination that gives the most favourable results; for the nonlinear modelling of each isolation system, the explicit central difference method is used. Finally, a comparative analysis of the expected damage in the event of the design earthquake is carried out, using the rapid method and taking the spectral displacement of the top floor as the reference, leading to conclusions and recommendations for the use of isolation systems.
Abstract:
Natural regeneration in managed stone pine (Pinus pinea L.) forests in the Spanish Northern Plateau is not achieved successfully under current silviculture practices, which is a major concern for forest managers. We modelled spatio-temporal features of primary dispersal to test whether (a) present low stand densities constrain natural regeneration success and (b) seed release is a climate-controlled process. The present study is based on data collected from a 6-year seed trap experiment considering different regeneration felling intensities. From a spatial perspective, we tested alternative established kernels under different data distribution assumptions to fit a spatial model able to predict P. pinea seed rain. Because of the umbrella-like crown of P. pinea, models were adapted to account for the crown effect by correcting the distances between potential seed arrival locations and seed sources. In addition, individual tree fecundity was assessed independently from existing models, improving the stability of parameter estimation. Seed rain simulation enabled the calculation of seed dispersal indexes for diverse silvicultural regeneration treatments. The selected spatial model of best fit (Weibull, Poisson assumption) predicted a highly clumped dispersal pattern that resulted in a proportion of gaps where no seed arrival is expected (dispersal limitation) between 0.25 and 0.30 for intermediate intensity regeneration fellings and over 0.50 for intense fellings. To describe the temporal pattern, the proportion of seeds released during monthly intervals was modelled as a function of climate variables (rainfall events) through a linear model that considered temporal autocorrelation, whereas cone opening took place above a temperature threshold. Our findings suggest the application of less intensive regeneration fellings, to be carried out after years of successful seedling establishment and, seasonally, subsequent to the main rainfall period (late fall).
This schedule would avoid dispersal limitation and would allow for a complete seed release. These modifications in present silviculture practices would produce a more efficient seed shadow in managed stands.
Abstract:
Recently, several researchers have pointed out that climate change is expected to increase temperatures and lower rainfall in Mediterranean regions, while simultaneously increasing the intensity of extreme rainfall events. These changes could have consequences for the rainfall regime, erosion, sediment transport and water quality, soil management, and new designs for diversion ditches. Climate change is expected to result in increasingly unpredictable and variable rainfall, in amount and timing, changing seasonal patterns and increasing the frequency of extreme weather events. Consequently, the evolution of the frequency and intensity of drought periods is of the utmost importance, as many processes in agro-ecosystems will be affected by them. Recognising the complex and important consequences of an increasing frequency of extreme droughts in the Ebro River basin, our aim is to study the evolution of drought events at this site statistically, with emphasis on their occurrence and intensity. For this purpose, fourteen meteorological stations were selected based on the length of the rainfall series and the climatic classification, to obtain a representative untreated dataset from the river basin. Daily rainfall series from 1957 to 2002 were obtained from each meteorological station, and the frequency of no-rain periods, measured as the number of consecutive dry days, was extracted. Based on these data, we study changes in the probability distribution over several sub-periods. Moreover, we used the Standardized Precipitation Index (SPI) to identify drought events at a yearly scale, and we then use this index to fit log-linear models to the contingency tables between the SPI index and the sub-periods; this fit is carried out with the help of ANOVA inference.
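Fitting a log-linear model to a contingency table of SPI class by sub-period can be sketched for the simplest case, the independence model log(m_ij) = u + a_i + b_j, whose fitted counts have a closed form. The abstract does not say which log-linear models were compared, so this is an illustrative choice; the function name and the counts are hypothetical and are not data from the study.

```python
import math

def independence_loglinear(table):
    """Fit the independence log-linear model log(m_ij) = u + a_i + b_j to a
    two-way table of counts. Returns (expected counts, G2 deviance, df)."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    total = sum(row_tot)
    # Closed-form maximum-likelihood fitted counts under independence.
    expected = [[r * c / total for c in col_tot] for r in row_tot]
    # Likelihood-ratio (deviance) statistic; zero cells contribute nothing.
    g2 = 2.0 * sum(
        obs * math.log(obs / fit)
        for obs_row, fit_row in zip(table, expected)
        for obs, fit in zip(obs_row, fit_row)
        if obs > 0
    )
    df = (len(row_tot) - 1) * (len(col_tot) - 1)
    return expected, g2, df

# Hypothetical counts: rows are SPI classes (dry, normal, wet),
# columns are two sub-periods. These are not data from the study.
counts = [[6, 10], [12, 9], [5, 4]]
fitted, g2, df = independence_loglinear(counts)
```

A large deviance relative to a chi-square distribution with `df` degrees of freedom would suggest that the distribution of SPI classes differs between sub-periods, which is the kind of change the study looks for.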