9 results for "free software environment for statistical computing and graphics R"

in DigitalCommons@The Texas Medical Center


Relevance:

100.00%

Publisher:

Abstract:

Software for use with patient records is challenging to design and difficult to evaluate because of the tremendous variability of patient circumstances. The authors devised a method to overcome a number of these difficulties. The method objectively evaluates and compares software products for use in emergency departments, and also compares software with conventional methods such as dictation and templated chart forms. The technique uses oral case simulation and video recording for analysis. This presentation discusses the methodology and the experience of executing a study using this case simulation.

Relevance:

100.00%

Publisher:

Abstract:

Nuclear morphometry (NM) uses image analysis to measure features of the cell nucleus, which are classified as bulk properties, shape or form, and DNA distribution. Studies have used these measurements as diagnostic and prognostic indicators of disease, with inconclusive results. The distributional properties of these variables have not been systematically investigated, although much medical data exhibit non-normal distributions. Measurements are made on several hundred cells per patient, so summary measures reflecting the underlying distribution are needed.

Distributional characteristics of 34 NM variables from prostate cancer cells were investigated using graphical and analytical techniques. Cells per sample ranged from 52 to 458. A small sample of patients with benign prostatic hyperplasia (BPH), representing non-cancer cells, was used for general comparison with the cancer cells.

Data transformations such as log, square root, and 1/x did not yield normality as measured by the Shapiro-Wilk test. A modulus transformation, used for distributions with abnormal kurtosis values, also did not produce normality.

Kernel density histograms of the 34 variables exhibited non-normality, and 18 variables also exhibited bimodality. A bimodality coefficient was calculated, and three variables (DNA concentration, shape, and elongation) showed the strongest evidence of bimodality and were studied further.

Two analytical approaches were used to obtain a summary measure for each variable for each patient: cluster analysis to determine significant clusters, and a mixture model analysis using a two-component model of Gaussian distributions with equal variances. The mixture component parameters were used to bootstrap the log-likelihood ratio to determine the significant number of components, 1 or 2. These summary measures were used as predictors of disease severity in several proportional-odds logistic regression models. The disease severity scale had 5 levels and was constructed from 3 components: extracapsular penetration (ECP), lymph node involvement (LN+), and seminal vesicle involvement (SV+), which represent surrogate measures of prognosis. The summary measures were not strong predictors of disease severity. There was some indication from the mixture model results of changes in the mean levels and proportions of the components at the lower severity levels.
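The two-component equal-variance Gaussian mixture described above can be fit with a short EM loop, and the resulting log-likelihood ratio against a single-Gaussian fit is the statistic that would be bootstrapped to choose between 1 and 2 components. The sketch below is illustrative only: the function names, the synthetic data, and the quartile-based initialization are assumptions, not the dissertation's implementation.

```python
import numpy as np

def fit_one_component(x):
    """Log-likelihood of a single Gaussian fit (MLE mean and variance)."""
    var = x.var()
    return -0.5 * len(x) * (np.log(2 * np.pi * var) + 1)

def fit_two_component(x, n_iter=200):
    """EM for a two-component Gaussian mixture with a shared variance.

    Returns (log-likelihood, component means, shared variance, weight)."""
    # Crude initialization from the quartiles of the pooled sample.
    mu = np.array([np.quantile(x, 0.25), np.quantile(x, 0.75)])
    var, w = x.var(), 0.5
    for _ in range(n_iter):
        # E-step: posterior responsibility of component 2 for each cell.
        d1 = w * np.exp(-0.5 * (x - mu[0]) ** 2 / var)
        d2 = (1 - w) * np.exp(-0.5 * (x - mu[1]) ** 2 / var)
        r = d2 / (d1 + d2)
        # M-step: update the weight, the means, and the shared variance.
        w = 1 - r.mean()
        mu[0] = np.sum((1 - r) * x) / np.sum(1 - r)
        mu[1] = np.sum(r * x) / np.sum(r)
        var = np.mean((1 - r) * (x - mu[0]) ** 2 + r * (x - mu[1]) ** 2)
    ll = np.sum(np.log(
        w * np.exp(-0.5 * (x - mu[0]) ** 2 / var)
        + (1 - w) * np.exp(-0.5 * (x - mu[1]) ** 2 / var)
    ) - 0.5 * np.log(2 * np.pi * var))
    return ll, mu, var, w

rng = np.random.default_rng(0)
# Synthetic "bimodal NM variable": two well-separated cell populations.
x = np.concatenate([rng.normal(0, 1, 200), rng.normal(5, 1, 200)])
ll1 = fit_one_component(x)
ll2, mu, var, w = fit_two_component(x)
lrt = 2 * (ll2 - ll1)  # would be compared with a bootstrap null distribution
print(round(lrt, 1), np.round(np.sort(mu), 1))
```

In practice the bootstrap step refits both models to many samples simulated from the one-component fit and compares the observed ratio with that null distribution.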

Relevance:

100.00%

Publisher:

Abstract:

Complex diseases such as cancer result from multiple genetic changes and environmental exposures. Owing to the rapid development of genotyping and sequencing technologies, we can now more accurately assess the causal effects of many genetic and environmental factors. Genome-wide association studies have localized many causal genetic variants predisposing to certain diseases; however, these studies explain only a small portion of the heritability of disease. More advanced statistical models are needed to identify and characterize additional genetic and environmental factors and their interactions, which will enable us to better understand the causes of complex diseases. In the past decade, thanks to increasing computational capability and novel statistical developments, Bayesian methods have been widely applied in genetics and genomics research, demonstrating superiority over some standard approaches in certain research areas. Gene-environment and gene-gene interaction studies are among the areas where Bayesian methods can fully exert their advantages. This dissertation focuses on developing new Bayesian statistical methods for data analysis with complex gene-environment and gene-gene interactions, and on extending existing methods for gene-environment interactions to other related areas. It includes three parts: (1) deriving a Bayesian variable selection framework for hierarchical gene-environment and gene-gene interactions; (2) developing Bayesian Natural and Orthogonal Interaction (NOIA) models for gene-environment interactions; and (3) extending two Bayesian statistical methods developed for gene-environment interaction studies to related problems such as adaptive borrowing of historical data.

We propose a Bayesian hierarchical mixture model framework that allows us to investigate genetic and environmental main effects, gene-gene interactions (epistasis), and gene-environment interactions in the same model. In many practical situations there is a natural hierarchical structure between the main effects and the interactions in a linear model. We propose a model that incorporates this hierarchical structure into the Bayesian mixture model, so that irrelevant interaction effects can be removed more efficiently, resulting in more robust, parsimonious, and powerful models. We evaluate both the 'strong hierarchical' and 'weak hierarchical' models, which require that both, or at least one, of the main effects of interacting factors be present for the interaction to be included in the model. Extensive simulations show that the proposed strong and weak hierarchical mixture models control the proportion of false positive discoveries and provide a powerful approach to identifying the predisposing main effects and interactions in studies with complex gene-environment and gene-gene interactions. We also compare these two models with an 'independent' model that does not impose the hierarchical constraint, and observe their superior performance in most of the situations considered. The proposed models are applied to real gene-environment interaction data from lung cancer and cutaneous melanoma case-control studies. Bayesian models can incorporate useful prior information into the modeling process; moreover, the Bayesian mixture model outperforms the multivariate logistic model in parameter estimation and variable selection in most cases.

By enforcing the hierarchical constraints, our proposed models further improve the Bayesian mixture model, reducing the proportion of false positive findings among the identified interactions while still recovering the reported associations. This is practically appealing for studies investigating causal factors among a moderate number of candidate genetic and environmental factors together with a relatively large number of interactions. The natural and orthogonal interaction (NOIA) models of genetic effects were previously developed to provide an analysis framework in which the estimated effects for a quantitative trait are statistically orthogonal regardless of whether Hardy-Weinberg Equilibrium (HWE) holds within loci. Ma et al. (2012) recently developed a NOIA model for gene-environment interaction studies and showed its advantages for detecting true main effects and interactions compared with the usual functional model. In this project, we propose a novel Bayesian statistical model that combines the Bayesian hierarchical mixture model with the NOIA model and the usual functional model. The proposed Bayesian NOIA model demonstrates more power to detect non-null effects, with higher marginal posterior probabilities. We also review two Bayesian statistical models (the Bayesian empirical shrinkage-type estimator and Bayesian model averaging) developed for gene-environment interaction studies. Inspired by these models, we develop two novel statistical methods that handle related problems such as borrowing data from historical studies. The proposed methods are analogous to the gene-environment interaction methods in how they balance statistical efficiency and bias within a unified model.

In extensive simulation studies, we compare the operating characteristics of the proposed models with those of existing models, including the hierarchical meta-analysis model. The results show that the proposed approaches adaptively borrow historical data in a data-driven way. These novel models may have a broad range of statistical applications in both genetic/genomic and clinical studies.
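The strong and weak hierarchy constraints described above can be made concrete by counting how much they shrink the model space. The sketch below, with two illustrative genes and one environmental exposure (names are assumptions, not from the dissertation), enumerates all candidate models and filters them by each rule:

```python
from itertools import chain, combinations

# Candidate factors: two genes and one environmental exposure (illustrative).
mains = ("G1", "G2", "E")
interactions = (("G1", "G2"), ("G1", "E"), ("G2", "E"))
terms = mains + interactions

def powerset(items):
    """All subsets of `items`, as tuples."""
    return chain.from_iterable(combinations(items, k) for k in range(len(items) + 1))

def obeys(model, rule):
    """Check a candidate model against a hierarchy rule.

    'strong':      an interaction requires BOTH of its main effects present.
    'weak':        an interaction requires AT LEAST ONE main effect present.
    'independent': no constraint."""
    for term in model:
        if isinstance(term, tuple):  # an interaction term
            present = sum(parent in model for parent in term)
            if rule == "strong" and present < 2:
                return False
            if rule == "weak" and present < 1:
                return False
    return True

counts = {rule: sum(obeys(set(m), rule) for m in powerset(terms))
          for rule in ("independent", "weak", "strong")}
print(counts)  # the hierarchy rules progressively shrink the model space
```

With 6 candidate terms the unconstrained space has 64 models; the weak and strong rules prune it further, which is exactly why the hierarchical mixture models can remove irrelevant interactions more efficiently.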

Relevance:

100.00%

Publisher:

Abstract:

In this paper, we present the Cellular Dynamic Simulator (CDS) for simulating diffusion and chemical reactions within crowded molecular environments. CDS is based on a novel event-driven algorithm specifically designed for precise calculation of the timing of collisions, reactions, and other events for each individual molecule in the environment. Generic mesh-based compartments allow the creation or importation of very simple or detailed cellular structures in a 3D environment. Multiple levels of compartments and static obstacles can be used to create a dense environment that mimics cellular boundaries and the intracellular space. The CDS algorithm takes into account volume exclusion and molecular crowding, which may impact signaling cascades in small sub-cellular compartments such as dendritic spines. With the CDS, we can simulate simple enzyme reactions, aggregation, and channel transport, as well as highly complicated chemical reaction networks of both freely diffusing and membrane-bound multi-protein complexes. Components of the CDS are defined generically, so the simulator can be applied to a wide range of environments in terms of scale and level of detail. Through an initialization GUI, a simple simulation environment can be created and populated within minutes, yet the tool is powerful enough to design complex 3D cellular architectures. The initialization tool allows visual confirmation of the environment construction prior to execution by the simulator. This paper describes the CDS algorithm and its design and implementation, provides an overview of the available features, and highlights their utility in demonstrations.
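The core of an event-driven simulator of this kind is a priority queue of per-molecule events: instead of ticking a fixed timestep, the loop always processes the globally earliest pending event and then schedules that molecule's next one. The sketch below shows only this scheduling pattern; the 1-D random walk, exponential waiting times, and all names are illustrative assumptions, not the CDS implementation.

```python
import heapq
import random

random.seed(1)

N_MOLECULES, T_END = 5, 10.0
positions = [0.0] * N_MOLECULES
times = []  # processing order, to check the time-ordering invariant

# Priority queue of (event_time, molecule_id); heapq pops the smallest time.
events = [(random.expovariate(1.0), i) for i in range(N_MOLECULES)]
heapq.heapify(events)

while events:
    t, i = heapq.heappop(events)  # globally earliest event
    if t > T_END:
        break
    times.append(t)
    positions[i] += random.choice((-1.0, 1.0))  # one diffusion step at time t
    # Schedule this molecule's next move with an exponential waiting time.
    heapq.heappush(events, (t + random.expovariate(1.0), i))

print(len(times), positions)
```

The payoff of this design is precision: each collision or reaction is handled at its exact computed time, rather than being snapped to the nearest fixed timestep.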

Relevance:

100.00%

Publisher:

Abstract:

Genetic anticipation is defined as a decrease in the age of onset, or an increase in severity, as a disorder is transmitted through subsequent generations. Anticipation has been noted in the literature for over a century. Recently, anticipation in several diseases, including Huntington's Disease, Myotonic Dystrophy, and Fragile X Syndrome, was shown to be caused by expansion of triplet repeats. Anticipation effects have also been observed in numerous mental disorders (e.g., Schizophrenia, Bipolar Disorder), cancers (Li-Fraumeni Syndrome, Leukemia), and other complex diseases.

Several statistical methods have been applied to determine whether anticipation is a true phenomenon in a particular disorder, including standard statistical tests and newly developed affected parent/affected child pair methods. These methods have been shown to be inappropriate for assessing anticipation for a variety of reasons, including familial correlation and low power. We have therefore developed family-based likelihood modeling approaches that model the underlying transmission of the disease gene and the penetrance function, and hence detect anticipation. These methods can be applied to extended families, improving the power to detect anticipation compared with existing methods based only on parents and children. The first proposed method is based on the regressive logistic hazard model and models anticipation with a generational covariate. The second method allows alleles to mutate as they are transmitted from parents to offspring, and is appropriate for modeling the known triplet repeat diseases, in which the disease alleles can become more deleterious as they are transmitted across generations.

To evaluate the new methods, we performed extensive simulation studies under different conditions to assess how effectively the algorithms detect genetic anticipation. Analysis by the first method yielded empirical power greater than 87%, based on the 5% type I error critical value identified in each simulation, depending on the method of data generation and the current-age criteria. Analysis by the second method was not possible due to the current formulation of the software. Application of the method to Huntington's Disease and Li-Fraumeni Syndrome data sets revealed evidence for a generation effect in both cases.
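The idea of modeling anticipation with a generational covariate can be sketched with a plain logistic regression, which is a simplified stand-in for the regressive logistic hazard model (the real model also conditions on familial structure and age). All data below are simulated under an assumed true log-odds increase of 1.0 per generation; a positive fitted coefficient is the signal of a generation effect.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated onset data: one row per person, outcome = affected (1/0),
# covariate = generation number (0, 1, 2); true effect = +1.0 per generation.
n = 4000
gen = rng.integers(0, 3, size=n)
logit = -3.0 + 1.0 * gen
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(float)

X = np.column_stack([np.ones(n), gen.astype(float)])

# Newton-Raphson fit of the logistic model (intercept + generation effect).
beta = np.zeros(2)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    grad = X.T @ (y - p)                           # score vector
    hess = (X * (p * (1 - p))[:, None]).T @ X      # observed information
    beta += np.linalg.solve(hess, grad)

print(np.round(beta, 2))  # estimated intercept and generation coefficient
```

A likelihood-ratio or Wald test on the generation coefficient would then decide whether the apparent anticipation is statistically supported.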

Relevance:

100.00%

Publisher:

Abstract:

The built environment is the part of the physical environment made by people, for people. Because it is such a ubiquitous component of the environment, the built environment acts as an important pathway in determining health outcomes. Zoning, a type of urban planning policy, is one of the most important mechanisms connecting the built environment to public health. This policy analysis research paper explores how zoning regulations in Austin, Texas promote or prohibit the development of a healthy built environment. A systematic literature review was obtained from Active Living Research, covering published work on the relationships between the built environment, physical activity, and health. These studies identified four components of the built environment associated with health: access to recreational facilities, sprawl and residential density, land use mix, and sidewalks and their walkability. A hierarchy analysis was then performed to demonstrate the association between these aspects of the built environment and health outcomes such as obesity, cardiovascular disease, and general health. Once these associations had been established, the components of the built environment were adapted into the evaluation criteria used to conduct a public health analysis of Austin's zoning ordinance. A total of eighty-eight regulations were identified as related to these components and their varying associations with human health. Eight regulations were projected to have a negative association with health, three to have both a positive and a negative association simultaneously, and nine were indeterminable with the information obtained through the literature review. The remaining sixty-eight regulations were projected to be beneficially associated with human health. It was therefore concluded that Austin's zoning ordinance would have an overwhelmingly positive impact on the public's health, based on the identified associations between the built environment and health outcomes.

Relevance:

100.00%

Publisher:

Abstract:

Background. Physical activity (PA) is central to the fight to reduce obesity rates, which are higher among Mexican Americans in the United States than in any other ethnic group. More than half of all Americans do not meet the daily PA recommendations, and 48% of Mexican Americans do not exercise. The built environment is believed to affect participation in physical activity, but its influence on physical activity levels in low-income Mexican Americans living along the Texas-Mexico border has not been investigated.

Purpose. The purpose of this secondary data analysis was threefold: (1) to determine the levels of self-reported PA in adults living in Brownsville, Texas; (2) to characterize this population's perceptions of the built environment; and (3) to determine the association between self-reported PA and the built environment in Mexican Americans living in Brownsville, Texas.

Methods. 400 participants from the Tu Salud ¡Sí Cuenta! (TSSC) community-wide campaign were included in this secondary data analysis. Percentages for level of physical activity and built environment perceptions were calculated using SPSS. Perceptions of the built environment were assessed by 14 items. Logistic regression was used to assess the relationship between physical activity and the built environment. All models were adjusted for age, gender, and level of education.

Results. Overall, 56.7% of participants (41.97% of men and 59% of women) did not meet the 2008 PA Guidelines for Americans. We analyzed 14 built environment variables to characterize participants' perceptions of the built environment. Odds ratios (OR) were computed for the association between meeting PA recommendations and the built environment factors ranked highest by mean score: neighborhood shops (OR 1.806, CI 1.074-3.038), bus stops (OR 1.436, CI 0.806-2.558), unattended stray dogs (OR 1.806, CI 1.074-3.038), sidewalk access (OR 0.858, CI 0.437-1.686), access to free parks (OR 0.549, CI 0.335-0.900), heavy neighborhood traffic (OR 0.802, CI 0.501-1.285), and crime rate (OR 0.779, CI 0.494-1.228). Overall, the associations between physical activity and the perceived built environment factors for Mexican Americans participating in the TSSC Study were weak.

Conclusions. This study provides evidence that PA levels are low in this Mexican American population. The built environment factors assessed here point to the need for further study of the variables seen as important to the Mexican American population. Lastly, because the association between PA levels and the built environment was weak overall, further studies of the built environment are recommended.
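The odds ratios and confidence intervals quoted above come from a standard calculation that is easy to show concretely. The sketch below uses made-up 2x2 counts purely for illustration (they are not from the TSSC data) and computes the OR with its 95% Wald interval:

```python
import math

# Hypothetical 2x2 table (counts are illustrative, NOT from the study):
#                met PA    did not meet PA
# exposed        a = 40        b = 60
# unexposed      c = 30        d = 70
a, b, c, d = 40, 60, 30, 70

odds_ratio = (a * d) / (b * c)                 # cross-product ratio
se_log_or = math.sqrt(1/a + 1/b + 1/c + 1/d)   # SE of log(OR), Woolf's method
lo = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
hi = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)

print(round(odds_ratio, 3), round(lo, 3), round(hi, 3))
```

When the interval contains 1 (as it does here), the association is not statistically significant at the 5% level, which mirrors several of the weak associations reported above.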

Relevance:

100.00%

Publisher:

Abstract:

An investigation was undertaken to chemically characterize inhalable particulate matter in the Houston area, with special emphasis on source identification and apportionment of outdoor and indoor atmospheric aerosols using multivariate statistical analyses.

Fine (<2.5 µm) particle aerosol samples were collected by means of dichotomous samplers at two fixed-site ambient monitoring stations (Clear Lake and Sunnyside) and one mobile monitoring van in the Houston area during June-October 1981 as part of the Houston Asthma Study. The mobile van allowed particulate sampling both inside and outside of twelve homes.

The samples, collected on 12-h schedules of 7 AM-7 PM and 7 PM-7 AM (CDT), were analyzed for mass, trace elements, and two anions. Mass was determined gravimetrically. An energy-dispersive X-ray fluorescence (XRF) spectrometer was used to determine elemental composition, and ion chromatography (IC) was used to determine sulfate and nitrate.

Average chemical compositions of the fine aerosol at each site are presented. Sulfate was found to be the largest single component of the fine-fraction mass, comprising approximately 30% of the fine mass outdoors and 12% indoors.

Principal components analysis (PCA) was applied to identify sources of aerosols and to assess the role of meteorological factors in the variation of the particulate samples. The results suggested that meteorological parameters were not associated with the sources of the aerosol samples collected at these Houston sites.

Source factor contributions to fine mass were calculated using a combination of PCA and stepwise multivariate regression analysis. Much of the total fine mass was apparently contributed by sulfate-related aerosols: on average, 56% of the Houston outdoor ambient fine particulate matter and 26% of the indoor fine particulate matter.

The characterization of indoor aerosol in residential environments was compared with the results for outdoor aerosols. Much of the indoor aerosol may be due to outdoor sources, but there may also be important contributions from common indoor sources in the home environment, such as smoking and gas cooking.
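The PCA step used above for source identification can be sketched in a few lines: a samples-by-species concentration matrix is standardized and decomposed, and the leading components indicate how many source factors dominate. The two synthetic "sources" below (a sulfate-like factor and a crustal-like factor) and all profile numbers are assumptions for demonstration, not the Houston data.

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic samples x species concentration matrix driven by two sources.
n_samples = 120
source1 = rng.gamma(2.0, 1.0, n_samples)   # e.g. sulfate-related strength
source2 = rng.gamma(2.0, 1.0, n_samples)   # e.g. soil/crustal strength
profile1 = np.array([3.0, 2.5, 0.2, 0.1, 0.0, 0.3])  # species signature 1
profile2 = np.array([0.1, 0.0, 2.0, 1.8, 1.5, 0.2])  # species signature 2
X = np.outer(source1, profile1) + np.outer(source2, profile2)
X += rng.normal(0, 0.05, X.shape)          # measurement noise

# Standardize columns, then PCA via SVD of the centered/scaled matrix.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
explained = s**2 / np.sum(s**2)            # variance fraction per component

print(np.round(explained, 3))
```

With two underlying sources, the first two components capture nearly all the variance; in the study, the retained components were then combined with stepwise regression to apportion the fine mass among the source factors.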

Relevance:

100.00%

Publisher:

Abstract:

Accurate quantitative estimation of exposure using retrospective data has been one of the most challenging tasks in the exposure assessment field. To improve these estimates, models have been developed using published exposure databases with their corresponding exposure determinants. These models are designed to be applied to reported exposure determinants obtained from study subjects, or to exposure levels assigned by an industrial hygienist, so that quantitative exposure estimates can be obtained.

In an effort to improve the prediction accuracy and generalizability of these models, and considering that the limitations encountered in previous studies might stem from the limits of traditional statistical methods and concepts, this study proposed and explored the use of data analysis methods derived from computer science, predominantly machine learning approaches.

The goal of this study was to develop a set of models using decision tree/ensemble and neural network methods to predict occupational exposure outcomes from literature-derived databases, and to compare, using cross-validation and data-splitting techniques, the resulting prediction capacity with that of traditional regression models. Two cases were addressed: the categorical case, where the exposure level was measured as an exposure rating following the American Industrial Hygiene Association guidelines, and the continuous case, where the exposure result is expressed as a concentration value. Previously developed literature-based exposure databases for 1,1,1-trichloroethane, methylene dichloride, and trichloroethylene were used.

When compared with regression estimates, the results showed better accuracy for decision tree/ensemble techniques in the categorical case, while neural networks were better for estimating continuous exposure values. Overrepresentation of classes and overfitting were the main causes of poor neural network performance and accuracy. Estimates from literature-based databases using machine learning techniques might provide an advantage when applied to other methodologies that combine 'expert inputs' with current exposure measurements, such as the Bayesian Decision Analysis tool. The use of machine learning techniques to more accurately estimate exposures from literature-based exposure databases might be the starting point for independence from expert judgment.