10 resultados para Reading and Interpretation of Statistical Graphs

em DigitalCommons@The Texas Medical Center


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Improvements in the analysis of microarray images are critical for accurately quantifying gene expression levels. The acquisition of accurate spot intensities directly influences the results and interpretation of statistical analyses. This dissertation discusses the implementation of a novel approach to the analysis of cDNA microarray images. We use a stellar photometric model, the Moffat function, to quantify microarray spots from nylon microarray images. The inherent flexibility of the Moffat shape model makes it ideal for quantifying microarray spots. We apply our novel approach to a Wilms' tumor microarray study and compare our results with a fixed-circle segmentation approach for spot quantification. Our results suggest that different spot feature extraction methods can have an impact on the ability of statistical methods to identify differentially expressed genes. We also used the Moffat function to simulate a series of microarray images under various experimental conditions. These simulations were used to validate the performance of various statistical methods for identifying differentially expressed genes. Our simulation results indicate that tests taking into account the dependency between mean spot intensity and variance estimation, such as the smoothened t-test, can better identify differentially expressed genes, especially when the number of replicates and mean fold change are low. The analysis of the simulations also showed that overall, a rank sum test (Mann-Whitney) performed well at identifying differentially expressed genes. Previous work has suggested the strengths of nonparametric approaches for identifying differentially expressed genes. We also show that multivariate approaches, such as hierarchical and k-means cluster analysis along with principal components analysis, are only effective at classifying samples when replicate numbers and mean fold change are high. Finally, we show how our stellar shape model approach can be extended to the analysis of 2D-gel images by adapting the Moffat function to take into account the elliptical nature of spots in such images. Our results indicate that stellar shape models offer a previously unexplored approach for the quantification of 2D-gel spots. ^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Essential biological processes are governed by organized, dynamic interactions between multiple biomolecular systems. Complexes are thus formed to enable the biological function and get dissembled as the process is completed. Examples of such processes include the translation of the messenger RNA into protein by the ribosome, the folding of proteins by chaperonins or the entry of viruses in host cells. Understanding these fundamental processes by characterizing the molecular mechanisms that enable then, would allow the (better) design of therapies and drugs. Such molecular mechanisms may be revealed trough the structural elucidation of the biomolecular assemblies at the core of these processes. Various experimental techniques may be applied to investigate the molecular architecture of biomolecular assemblies. High-resolution techniques, such as X-ray crystallography, may solve the atomic structure of the system, but are typically constrained to biomolecules of reduced flexibility and dimensions. In particular, X-ray crystallography requires the sample to form a three dimensional (3D) crystal lattice which is technically di‑cult, if not impossible, to obtain, especially for large, dynamic systems. Often these techniques solve the structure of the different constituent components within the assembly, but encounter difficulties when investigating the entire system. On the other hand, imaging techniques, such as cryo-electron microscopy (cryo-EM), are able to depict large systems in near-native environment, without requiring the formation of crystals. The structures solved by cryo-EM cover a wide range of resolutions, from very low level of detail where only the overall shape of the system is visible, to high-resolution that approach, but not yet reach, atomic level of detail. In this dissertation, several modeling methods are introduced to either integrate cryo-EM datasets with structural data from X-ray crystallography, or to directly interpret the cryo-EM reconstruction. Such computational techniques were developed with the goal of creating an atomic model for the cryo-EM data. The low-resolution reconstructions lack the level of detail to permit a direct atomic interpretation, i.e. one cannot reliably locate the atoms or amino-acid residues within the structure obtained by cryo-EM. Thereby one needs to consider additional information, for example, structural data from other sources such as X-ray crystallography, in order to enable such a high-resolution interpretation. Modeling techniques are thus developed to integrate the structural data from the different biophysical sources, examples including the work described in the manuscript I and II of this dissertation. At intermediate and high-resolution, cryo-EM reconstructions depict consistent 3D folds such as tubular features which in general correspond to alpha-helices. Such features can be annotated and later on used to build the atomic model of the system, see manuscript III as alternative. Three manuscripts are presented as part of the PhD dissertation, each introducing a computational technique that facilitates the interpretation of cryo-EM reconstructions. The first manuscript is an application paper that describes a heuristics to generate the atomic model for the protein envelope of the Rift Valley fever virus. The second manuscript introduces the evolutionary tabu search strategies to enable the integration of multiple component atomic structures with the cryo-EM map of their assembly. Finally, the third manuscript develops further the latter technique and apply it to annotate consistent 3D patterns in intermediate-resolution cryo-EM reconstructions. The first manuscript, titled An assembly model for Rift Valley fever virus, was submitted for publication in the Journal of Molecular Biology. The cryo-EM structure of the Rift Valley fever virus was previously solved at 27Å-resolution by Dr. Freiberg and collaborators. Such reconstruction shows the overall shape of the virus envelope, yet the reduced level of detail prevents the direct atomic interpretation. High-resolution structures are not yet available for the entire virus nor for the two different component glycoproteins that form its envelope. However, homology models may be generated for these glycoproteins based on similar structures that are available at atomic resolutions. The manuscript presents the steps required to identify an atomic model of the entire virus envelope, based on the low-resolution cryo-EM map of the envelope and the homology models of the two glycoproteins. Starting with the results of the exhaustive search to place the two glycoproteins, the model is built iterative by running multiple multi-body refinements to hierarchically generate models for the different regions of the envelope. The generated atomic model is supported by prior knowledge regarding virus biology and contains valuable information about the molecular architecture of the system. It provides the basis for further investigations seeking to reveal different processes in which the virus is involved such as assembly or fusion. The second manuscript was recently published in the of Journal of Structural Biology (doi:10.1016/j.jsb.2009.12.028) under the title Evolutionary tabu search strategies for the simultaneous registration of multiple atomic structures in cryo-EM reconstructions. This manuscript introduces the evolutionary tabu search strategies applied to enable a multi-body registration. This technique is a hybrid approach that combines a genetic algorithm with a tabu search strategy to promote the proper exploration of the high-dimensional search space. Similar to the Rift Valley fever virus, it is common that the structure of a large multi-component assembly is available at low-resolution from cryo-EM, while high-resolution structures are solved for the different components but lack for the entire system. Evolutionary tabu search strategies enable the building of an atomic model for the entire system by considering simultaneously the different components. Such registration indirectly introduces spatial constrains as all components need to be placed within the assembly, enabling the proper docked in the low-resolution map of the entire assembly. Along with the method description, the manuscript covers the validation, presenting the benefit of the technique in both synthetic and experimental test cases. Such approach successfully docked multiple components up to resolutions of 40Å. The third manuscript is entitled Evolutionary Bidirectional Expansion for the Annotation of Alpha Helices in Electron Cryo-Microscopy Reconstructions and was submitted for publication in the Journal of Structural Biology. The modeling approach described in this manuscript applies the evolutionary tabu search strategies in combination with the bidirectional expansion to annotate secondary structure elements in intermediate resolution cryo-EM reconstructions. In particular, secondary structure elements such as alpha helices show consistent patterns in cryo-EM data, and are visible as rod-like patterns of high density. The evolutionary tabu search strategy is applied to identify the placement of the different alpha helices, while the bidirectional expansion characterizes their length and curvature. The manuscript presents the validation of the approach at resolutions ranging between 6 and 14Å, a level of detail where alpha helices are visible. Up to resolution of 12 Å, the method measures sensitivities between 70-100% as estimated in experimental test cases, i.e. 70-100% of the alpha-helices were correctly predicted in an automatic manner in the experimental data. The three manuscripts presented in this PhD dissertation cover different computation methods for the integration and interpretation of cryo-EM reconstructions. The methods were developed in the molecular modeling software Sculptor (http://sculptor.biomachina.org) and are available for the scientific community interested in the multi-resolution modeling of cryo-EM data. The work spans a wide range of resolution covering multi-body refinement and registration at low-resolution along with annotation of consistent patterns at high-resolution. Such methods are essential for the modeling of cryo-EM data, and may be applied in other fields where similar spatial problems are encountered, such as medical imaging.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Detection of multidrug-resistant tuberculosis (MDR-TB), a frequent cause of treatment failure, takes 2 or more weeks to identify by culture. RIF-resistance is a hallmark of MDR-TB, and detection of mutations in the rpoB gene of Mycobacterium tuberculosis using molecular beacon probes with real-time quantitative polymerase chain reaction (qPCR) is a novel approach that takes ≤2 days. However, qPCR identification of resistant isolates, particularly for isolates with mixed RIF-susceptible and RIF-resistant bacteria, is reader dependent and limits its clinical use. The aim of this study was to develop an objective, reader-independent method to define rpoB mutants using beacon qPCR. This would facilitate the transition from a research protocol to the clinical setting, where high-throughput methods with objective interpretation are required. For this, DNAs from 107 M. tuberculosis clinical isolates with known susceptibility to RIF by culture-based methods were obtained from 2 regions where isolates have not previously been subjected to evaluation using molecular beacon qPCR: the Texas–Mexico border and Colombia. Using coded DNA specimens, mutations within an 81-bp hot spot region of rpoB were established by qPCR with 5 beacons spanning this region. Visual and mathematical approaches were used to establish whether the qPCR cycle threshold of the experimental isolate was significantly higher (mutant) compared to a reference wild-type isolate. Visual classification of the beacon qPCR required reader training for strains with a mixture of RIF-susceptible and RIF-resistant bacteria. Only then had the visual interpretation by an experienced reader had 100% sensitivity and 94.6% specificity versus RIF-resistance by culture phenotype and 98.1% sensitivity and 100% specificity versus mutations based on DNA sequence. The mathematical approach was 98% sensitive and 94.5% specific versus culture and 96.2% sensitive and 100% specific versus DNA sequence. Our findings indicate the mathematical approach has advantages over the visual reading, in that it uses a Microsoft Excel template to eliminate reader bias or inexperience, and allows objective interpretation from high-throughput analyses even in the presence of a mixture of RIF-resistant and RIF-susceptible isolates without the need for reader training.^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This pilot study compares the mental models of a patient constructed by nurses and physicians while reading an electronic medical record. Preliminary results suggest that the participants' summaries were both quantitatively and qualitatively different. The physician made more inferences and focused on deeper relationships in the record, whereas the nurse focused on the descriptive surface structure of the record.

Relevância:

100.00% 100.00%

Publicador:

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Nuclear morphometry (NM) uses image analysis to measure features of the cell nucleus which are classified as: bulk properties, shape or form, and DNA distribution. Studies have used these measurements as diagnostic and prognostic indicators of disease with inconclusive results. The distributional properties of these variables have not been systematically investigated although much of the medical data exhibit nonnormal distributions. Measurements are done on several hundred cells per patient so summary measurements reflecting the underlying distribution are needed.^ Distributional characteristics of 34 NM variables from prostate cancer cells were investigated using graphical and analytical techniques. Cells per sample ranged from 52 to 458. A small sample of patients with benign prostatic hyperplasia (BPH), representing non-cancer cells, was used for general comparison with the cancer cells.^ Data transformations such as log, square root and 1/x did not yield normality as measured by the Shapiro-Wilks test for normality. A modulus transformation, used for distributions having abnormal kurtosis values, also did not produce normality.^ Kernel density histograms of the 34 variables exhibited non-normality and 18 variables also exhibited bimodality. A bimodality coefficient was calculated and 3 variables: DNA concentration, shape and elongation, showed the strongest evidence of bimodality and were studied further.^ Two analytical approaches were used to obtain a summary measure for each variable for each patient: cluster analysis to determine significant clusters and a mixture model analysis using a two component model having a Gaussian distribution with equal variances. The mixture component parameters were used to bootstrap the log likelihood ratio to determine the significant number of components, 1 or 2. These summary measures were used as predictors of disease severity in several proportional odds logistic regression models. The disease severity scale had 5 levels and was constructed of 3 components: extracapsulary penetration (ECP), lymph node involvement (LN+) and seminal vesicle involvement (SV+) which represent surrogate measures of prognosis. The summary measures were not strong predictors of disease severity. There was some indication from the mixture model results that there were changes in mean levels and proportions of the components in the lower severity levels. ^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Diabetes in adults (type 2) has emerged as a world health problem. Prevalence and risk factors have been found to vary in different populations. The wide range of prevalence rates worldwide indicates the importance of genetic and environmental factors in the etiology of the disease. The few available studies suggest that Filipinos are among the higher-risk groups for developing diabetes. This cross-sectional study estimated the overall prevalence rate of type 2 diabetes among Filipino Americans, ages 20–74 years and residents of Houston Metropolitan Statistical Area, Texas, to be 16.1%. The observed high prevalence was associated with age, sex, family history of diabetes, obesity, region of birth; and, in women, gestational diabetes and income. The diabetic Filipino Americans had a higher proportion of parental history of diabetes, medical history of hypertension, and history of smoking; were physically less active, but generally non-obese, compared with the United States diabetic population. ^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper defines and compares several models for describing excess influenza pneumonia mortality in Houston. First, the methodology used by the Center for Disease Control is examined and several variations of this methodology are studied. All of the models examined emphasize the difficulty of omitting epidemic weeks.^ In an attempt to find a better method of describing expected and epidemic mortality, time series methods are examined. Grouping in four-week periods, truncating the data series to adjust epidemic periods, and seasonally-adjusting the series y(,t), by:^ (DIAGRAM, TABLE OR GRAPHIC OMITTED...PLEASE SEE DAI)^ is the best method examined. This new series w(,t) is stationary and a moving average model MA(1) gives a good fit for forecasting influenza and pneumonia mortality in Houston.^ Influenza morbidity, other causes of death, sex, race, age, climate variables, environmental factors, and school absenteeism are all examined in terms of their relationship to influenza and pneumonia mortality. Both influenza morbidity and ischemic heart disease mortality show a very high relationship that remains when seasonal trends are removed from the data. However, when jointly modeling the three series it is obvious that the simple time series MA(1) model of truncated, seasonally-adjusted four-week data gives a better forecast.^