6 results for Linear and nonlinear methods
in DigitalCommons@The Texas Medical Center
Abstract:
Life expectancy has consistently increased over the last 150 years owing to improvements in nutrition, medicine, and public health. Several studies found that in many developed countries life expectancy continued to rise along a nearly linear trend, contrary to the common belief that the rate of improvement would decelerate and would be best described by an S-shaped curve. Using a sample of countries spanning a wide range of economic development levels, we explored the change in life expectancy over time with both nonlinear and linear models. We then examined whether estimates differed significantly between linear models fitted with and without an auto-correlated error structure. When the data did not have a sigmoidal shape, nonlinear growth models sometimes failed to provide meaningful parameter estimates; the inflection point and asymptotes built into these models made them inflexible for life expectancy data. Among the linear models, there was no significant difference in the estimated life expectancy growth rate or in future projections between ordinary least squares (OLS) and generalized least squares (GLS). However, the GLS model was more robust because the data were time series and the residuals were positively correlated.
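As an illustration of the OLS-versus-GLS comparison described above, the following is a minimal sketch, not the study's code, of fitting a linear life expectancy trend with ordinary least squares and with generalized least squares under an AR(1) error structure; the series is simulated purely for demonstration.

```python
# Minimal sketch: OLS vs. GLS with AR(1) errors for a linear life expectancy trend.
# The data below are simulated for illustration only; they are not the study's data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Simulated country series: linear trend plus AR(1)-correlated noise.
years = np.arange(1950, 2021)
t = years - years[0]
e = np.zeros(t.size)
for i in range(1, t.size):                      # AR(1) errors, rho = 0.7
    e[i] = 0.7 * e[i - 1] + rng.normal(scale=0.3)
life_exp = 55.0 + 0.25 * t + e                  # roughly 0.25 years gained per calendar year

X = sm.add_constant(t)

ols = sm.OLS(life_exp, X).fit()

# GLS with an AR(1) error structure; iterative_fit re-estimates rho from the residuals.
gls = sm.GLSAR(life_exp, X, rho=1).iterative_fit(maxiter=10)

print("OLS slope:", ols.params[1], "std. error:", ols.bse[1])
print("GLS slope:", gls.params[1], "std. error:", gls.bse[1])
```

With positively correlated residuals, the two slope estimates are typically close, but the OLS standard errors understate the uncertainty, which is consistent with the abstract's point that GLS is the more robust choice for these time-series data.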
Abstract:
Macromolecular interactions, such as protein-protein and protein-DNA interactions, play important roles in executing biological functions in cells. However, the complexity of such interactions often makes it very challenging to elucidate their structural details. In this thesis, two different research strategies were applied to two macromolecular systems: X-ray crystallography to the three tandem FF domains of the transcription regulator CA150, and electron microscopy to the STAT1-importin α5 complex. The results provide novel insights into the function-structure relationships of transcription-coupled RNA splicing mediated by CA150 and into the nuclear import step of the JAK-STAT signaling pathway.

The first project focused on the FF domain, a protein-protein interaction module that often occurs in tandem repeats. The crystallographic structure of the first three FF domains of human CA150 was determined to 2.7 Å resolution. It is, to date, the only crystal structure of an FF domain and the only structure of tandem FF domains, and it revealed a striking connectivity between one FF domain and the next. A peptide binding assay with a potential binding ligand of FF domains was performed using fluorescence polarization. Furthermore, for the first time, FF domains were found to potentially interact with DNA; DNA binding assays were performed, and the results support this newly proposed functionality of the FF domain.

The second project aimed at understanding the molecular mechanism of the nuclear import of the transcription factor STAT1. The first structural model of the pSTAT1-importin α5 complex in solution was built from negative-stain electron microscopy images. Two STAT1 molecules were observed to interact with one importin α5 molecule in an asymmetric manner, implying that STAT1 engages importin α5 through a mechanism different from canonical importin α-cargo interactions. Further in vitro binding assays were performed to obtain more details of the pSTAT1-importin α5 interaction.
Abstract:
This dissertation develops and tests a comparative effectiveness methodology based on a novel application of Data Envelopment Analysis (DEA) to health studies. The concept of performance tiers (PerT) is introduced as terminology for a relative risk class of individuals within a peer group, and the PerT calculation is implemented with operations research (DEA) and spatial algorithms. The DEA-PerT methodology discriminates the individual observations into relative risk classes. The performance of two distance measures, kNN (k-nearest neighbor) and Mahalanobis, was subsequently tested for classifying new entrants into the appropriate tier. The methods were applied to subject data for the 14-year-old cohort in the Project HeartBeat! study.

The concepts presented herein represent a paradigm shift in the potential for public health applications to identify and respond to individual health status. The resulting classification scheme provides descriptive, and potentially prescriptive, guidance for assessing and implementing treatments and strategies to improve the delivery and performance of health systems.
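As a toy illustration of the tier-assignment step, here is a minimal sketch, not the dissertation's implementation, of classifying a new entrant by its Mahalanobis distance to each tier's centroid; the feature values, tier labels, and the centroid-based rule are illustrative assumptions only.

```python
# Minimal sketch: assign a new entrant to a performance tier using the
# Mahalanobis distance to each tier's centroid (hypothetical data and rule).
import numpy as np
from scipy.spatial.distance import mahalanobis

# Reference observations already placed into tiers (hypothetical DEA-PerT output).
X = np.array([[0.80, 1.2], [0.85, 1.1], [0.60, 1.8],
              [0.55, 1.9], [0.30, 2.5], [0.35, 2.4]])
tiers = np.array([1, 1, 2, 2, 3, 3])

VI = np.linalg.inv(np.cov(X, rowvar=False))   # inverse covariance of the peer group

def assign_tier(x_new):
    """Return the tier whose centroid is nearest in Mahalanobis distance."""
    labels = np.unique(tiers)
    centroids = [X[tiers == k].mean(axis=0) for k in labels]
    d = [mahalanobis(x_new, c, VI) for c in centroids]
    return labels[int(np.argmin(d))]

print(assign_tier(np.array([0.58, 1.85])))    # expected to fall in tier 2
```

A kNN alternative would instead take a majority vote among the nearest reference observations; the abstract compares both, and the sketch above shows only the Mahalanobis side of that comparison.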
Abstract:
A population-based cross-sectional survey of socio-environmental factors associated with the prevalence of Dracunculus medinensis (guinea worm disease) was conducted during 1982 in Idere, a rural agricultural community in Ibarapa, Oyo State, Nigeria. The epidemiologic data were collected by household interviews of all 501 households. The environmental data were collected by analyzing water samples from all domestic water sources and from rainfall records.

The specific objectives of this research were to: (a) describe the prevalence of guinea worm disease in Idere during 1982 by age, sex, area of residence, drinking water source, religion, and the weekly amount of money spent by the household to obtain potable drinking water; (b) compare the characteristics of cases and non-cases of guinea worm in order to identify factors associated with a high risk of infection; (c) investigate domestic water sources for the distribution of Cyclops; (d) determine the extent of the potable water shortage, with a view to identifying factors responsible for the shortage in the community; and (e) describe the effects of guinea worm on school attendance during the 1980-1982 school years by class and by distance of the school from a piped water supply.

The findings indicate that during 1982, 31.8 percent of Idere's 6,527 residents experienced guinea worm infection, with higher prevalence recorded in males in their most productive years and in females in their teenage years. The contribution of sex and age to the risk of infection was explained in terms of water-related exposure and increased water intake due to dehydration from the physical occupational activities of these subgroups.

Potable water available to residents was considerably below the minimum recommended by WHO for tropical climates, with sixty-eight percent of residents' water needs met by unprotected surface water harbouring Cyclops, the obligatory intermediate host of Dracunculus medinensis. An association was found between periods of relatively high Cyclops density in domestic water and rainfall.

The impact of guinea worm infection on educational activities was considerable, and its implications are discussed, including the implications of the findings for the control of guinea worm disease in Ibarapa.
Abstract:
Accurate quantitative estimation of exposure using retrospective data has been one of the most challenging tasks in the exposure assessment field. To improve these estimates, models have been developed from published exposure databases and their corresponding exposure determinants. These models are designed to be applied to exposure determinants reported by study subjects, or to exposure levels assigned by an industrial hygienist, so that quantitative exposure estimates can be obtained.

In an effort to improve the prediction accuracy and generalizability of these models, and considering that the limitations encountered in previous studies might stem from limitations in the applicability of traditional statistical methods and concepts, this study proposed and explored computer science-derived data analysis methods, predominantly machine learning approaches.

The goal of this study was to develop a set of models using decision tree/ensemble and neural network methods to predict occupational exposure outcomes from literature-derived databases, and to compare, using cross-validation and data-splitting techniques, their prediction capacity to that of traditional regression models. Two cases were addressed: the categorical case, where the exposure level was measured as an exposure rating following the American Industrial Hygiene Association guidelines, and the continuous case, where the exposure is expressed as a concentration value. Previously developed literature-based exposure databases for 1,1,1-trichloroethane, methylene dichloride, and trichloroethylene were used.

Compared with regression estimates, the results showed better accuracy for decision tree/ensemble techniques in the categorical case, while neural networks were better for estimating continuous exposure values. Overrepresentation of classes and overfitting were the main causes of poor neural network performance and accuracy. Estimates based on literature-based databases using machine learning techniques might provide an advantage when applied within methodologies that combine 'expert inputs' with current exposure measurements, such as the Bayesian Decision Analysis tool. The use of machine learning techniques to more accurately estimate exposures from literature-based exposure databases might represent a starting point toward independence from expert judgment.
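As a rough illustration of the kind of comparison described for the categorical case, the sketch below pits a tree ensemble against a multinomial logistic regression under cross-validation; the exposure determinants, rating levels, and data are simulated placeholders, not the study's literature-derived databases.

```python
# Minimal sketch: ensemble classifier vs. logistic regression for a 4-level
# exposure rating, compared by 5-fold cross-validation on simulated data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n = 300
X = rng.normal(size=(n, 4))                        # placeholder exposure determinants
y = np.digitize(X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n),
                bins=[-1.0, 0.0, 1.0])             # placeholder 4-level exposure rating (0-3)

forest = RandomForestClassifier(n_estimators=200, random_state=0)
logit = LogisticRegression(max_iter=1000)

print("ensemble accuracy:  ", cross_val_score(forest, X, y, cv=5).mean())
print("regression accuracy:", cross_val_score(logit, X, y, cv=5).mean())
```

For the continuous case described in the abstract, the same cross-validation scaffold would simply swap in a regressor (for example a neural network) and score it by prediction error rather than classification accuracy.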
New methods for quantification and analysis of quantitative real-time polymerase chain reaction data
Abstract:
Quantitative real-time polymerase chain reaction (qPCR) is a sensitive gene quantitation method that has been widely used in the biological and biomedical fields. The methods currently used for qPCR data analysis, including the threshold cycle (CT) method and linear and nonlinear model-fitting methods, all require subtracting background fluorescence. However, the removal of background fluorescence is usually inaccurate and can therefore distort results. Here, we propose a new method, the taking-difference linear regression method, to overcome this limitation. Briefly, for each pair of consecutive PCR cycles, we subtracted the fluorescence of the earlier cycle from that of the later cycle, transforming the n-cycle raw data into n-1 differenced values. Linear regression was then applied to the natural logarithm of the transformed data, and amplification efficiencies and initial DNA molecule numbers were calculated for each PCR run. To evaluate the new method, we compared its accuracy and precision with the original linear regression method under three background corrections: the mean of cycles 1-3, the mean of cycles 3-7, and the minimum fluorescence. Three criteria, threshold identification, maximum R², and maximum slope, were employed to select the target data points. Because PCR data are time series, we also applied linear mixed models. Collectively, when the threshold identification criterion was applied and when the linear mixed model was adopted, the taking-difference linear regression method was superior, giving accurate estimates of the initial DNA amount and reasonable estimates of PCR amplification efficiencies. When the maximum R² and maximum slope criteria were used, the original linear regression method gave accurate estimates of the initial DNA amount. Overall, the taking-difference linear regression method avoids the error of subtracting an unknown background and is therefore theoretically more accurate and reliable. It is easy to perform, and the taking-difference strategy can be extended to all current methods for qPCR data analysis.
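To make the differencing idea concrete, here is a minimal sketch that assumes the usual exponential-phase model F_n = F0·E^n + B (this parameterization is an assumption for illustration; the thesis' exact formulation may differ). Subtracting consecutive cycles cancels the constant background B, since D_n = F_{n+1} - F_n = F0·(E-1)·E^n, so ln D_n = ln(F0·(E-1)) + n·ln E is linear in the cycle number.

```python
# Minimal sketch of the taking-difference regression on assumed exponential-phase
# data F_n = F0 * E**n + B; simulated values only, not the thesis' data.
import numpy as np

def taking_difference_fit(fluorescence, cycles):
    """Estimate amplification efficiency E and initial signal F0 from raw
    exponential-phase fluorescence without subtracting a background."""
    diffs = np.diff(fluorescence)                  # n readings -> n-1 differences
    keep = diffs > 0                               # the logarithm needs positive differences
    slope, intercept = np.polyfit(cycles[:-1][keep], np.log(diffs[keep]), 1)
    E = np.exp(slope)                              # amplification efficiency (ideally close to 2)
    F0 = np.exp(intercept) / (E - 1.0)             # initial fluorescence, proportional to initial DNA
    return E, F0

# Simulated exponential-phase readings with an unknown constant background of 5.0.
cycles = np.arange(10, 21)
raw = 1e-3 * 1.9 ** cycles + 5.0
print(taking_difference_fit(raw, cycles))          # recovers approximately (1.9, 1e-3)
```

The point of the sketch is that the background level never enters the fit: the regression is run directly on the logged differences, which is why the approach sidesteps the background-subtraction error discussed in the abstract.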