5 results for classification aided by clustering
in DigitalCommons@The Texas Medical Center
Abstract:
The aim of this study was to determine cancer mortality rates for the United Arab Emirates (UAE) and to create an atlas of cancer mortality for the UAE. This atlas is the first of its kind in the Gulf countries and the Middle East. Death certificates were reviewed for the period from January 1, 1990 to December 31, 1999, and cancer deaths were identified. Cancer mortality cases were verified by comparison with medical records. Age-adjusted cancer mortality rates were calculated by gender, emirate/medical district and nationality (UAE nationals and overall UAE population). Individual rates for each emirate were compared to the overall rate of the corresponding population for the same cancer site and gender. Age-adjusted rates were mapped using MapInfo software. High rates for liver, lung and stomach cancer were observed in Abu Dhabi, Dubai and the northern emirates, respectively. Rates for UAE nationals were higher than those for the overall UAE population. Several factors were suggested that may account for the high rates of specific cancers observed in certain emirates. It is hoped that this atlas will provide leads that will guide further epidemiologic and public health activities aimed at preventing cancer.
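As an illustration of how an age-adjusted rate can be computed, here is a minimal sketch of direct standardization. The abstract does not say which standardization method or standard population the study used, so the direct method, the function name `age_adjusted_rate`, and all numbers below are assumptions chosen only for demonstration.

```python
def age_adjusted_rate(deaths, population, standard_weights, per=100_000):
    """Directly standardized rate: a weighted sum of age-specific rates.

    deaths[i] and population[i] describe age group i in the study area;
    standard_weights[i] is the size (or share) of age group i in the
    chosen standard population. All inputs here are hypothetical.
    """
    if not (len(deaths) == len(population) == len(standard_weights)):
        raise ValueError("all inputs must have one entry per age group")
    total_weight = sum(standard_weights)
    if total_weight <= 0:
        raise ValueError("standard weights must sum to a positive value")

    rate = 0.0
    for d, n, w in zip(deaths, population, standard_weights):
        if n <= 0:
            raise ValueError("each age group needs a positive population")
        # Age-specific rate, weighted by the group's share of the standard.
        rate += (d / n) * (w / total_weight)
    return rate * per


# Hypothetical two-group example (not data from the study).
example = age_adjusted_rate(deaths=[1, 2], population=[1000, 1000],
                            standard_weights=[0.5, 0.5])
```

Weighting every region's age-specific rates by the same standard population is what makes rates comparable across emirates with different age structures.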
Abstract:
The VirB/D4 type IV secretion system (T4SS) of Agrobacterium tumefaciens functions to transfer substrates to infected plant cells through assembly of a translocation channel and a surface structure termed a T-pilus. This thesis is focused on identifying contributions of VirB10 to substrate transfer and T-pilus formation through a mutational analysis. VirB10 is a bitopic protein with several domains, including a: (i) cytoplasmic N-terminus, (ii) single transmembrane (TM) α-helix, (iii) proline-rich region (PRR), and (iv) large C-terminal modified β-barrel. I introduced cysteine insertion and substitution mutations throughout the length of VirB10 in order to: (i) test a predicted transmembrane topology, (ii) identify residues/domains contributing to VirB10 stability, oligomerization, and function, and (iii) monitor structural changes accompanying energy activation or substrate translocation. These studies were aided by recent structural resolution of a periplasmic domain of a VirB10 homolog and a ‘core’ complex composed of homologs of VirB10 and two outer membrane associated subunits, VirB7 and VirB9. By use of the substituted cysteine accessibility method (SCAM), I confirmed the bitopic topology of VirB10. Through phenotypic studies of Ala-Cys insertion mutations, I identified “uncoupling” mutations in the TM and β-barrel domains that blocked T-pilus assembly but permitted substrate transfer. I showed that cysteine replacements in the C-terminal periplasmic domain yielded a variety of phenotypes in relation to protein accumulation, oligomerization, substrate transfer, and T-pilus formation. By SCAM, I also gained further evidence that VirB10 adopts different structural states during machine biogenesis. Finally, I showed that VirB10 supports substrate transfer even when its TM domain is extensively mutagenized or substituted with heterologous TM domains. 
By contrast, specific residues most probably involved in oligomerization of the TM domain are required for biogenesis of the T-pilus.
Abstract:
Lung cancer is a devastating disease with very poor prognosis. The design of better treatments for patients would be greatly aided by mouse models that closely resemble the human disease. The most common type of human lung cancer is adenocarcinoma with frequent metastasis. Unfortunately, current models for this tumor are inadequate due to the absence of metastasis. Based on the molecular findings in human lung cancer and the metastatic potential of osteosarcomas in mutant p53 mouse models, I hypothesized that mice with both K-ras and p53 missense mutations might develop metastatic lung adenocarcinomas. Therefore, I incorporated both K-rasLA1 and p53R172HΔg alleles into mouse lung cells to establish a more faithful model for human lung adenocarcinoma and for translational and mechanistic studies. Mice with both mutations (K-rasLA1/+ p53R172HΔg/+) developed advanced lung adenocarcinomas with histopathology similar to that of human tumors. These lung adenocarcinomas were highly aggressive and metastasized to multiple intrathoracic and extrathoracic sites in a pattern similar to that seen in lung cancer patients. This mouse model also showed gender differences in cancer-related death, and 23.2% of the mice developed pleural mesotheliomas. In a preclinical study, the new drug Erlotinib (Tarceva) decreased the number and size of lung lesions in this model. These data demonstrate that this mouse model most closely mimics human metastatic lung adenocarcinoma and provides an invaluable system for translational studies. To screen for important genes for metastasis, gene expression profiles of primary lung adenocarcinomas and metastases were analyzed. Microarray data showed that these two groups were segregated in gene expression and had 79 highly differentially expressed genes (more than 2.5-fold change and p < 0.001). Microarray data of Bub1b, Vimentin and CCAM1 were validated in tumors by quantitative real-time PCR (QPCR).
Bub1b, a mitotic checkpoint gene, was overexpressed in metastases and this correlated with more chromosomal abnormalities in metastatic cells. Vimentin, a marker of epithelial-mesenchymal transition (EMT), was also highly expressed in metastases. Interestingly, Twist, a key EMT inducer, was also highly upregulated in metastases by QPCR, and this significantly correlated with the overexpression of Vimentin in the same tumors. These data suggest EMT occurs in lung adenocarcinomas and is a key mechanism for the development of metastasis in K-rasLA1/+ p53R172HΔg/+ mice. Thus, this mouse model provides a unique system to further probe the molecular basis of metastatic lung cancer.
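The selection criterion quoted in this abstract (more than 2.5-fold change and p < 0.001) can be sketched as a simple filter. The abstract does not describe the statistical test, the normalization, or whether the fold-change cutoff was applied symmetrically to up- and down-regulated genes, so the symmetric treatment, the `GeneResult` type, and the gene names and values below are hypothetical.

```python
from typing import List, NamedTuple


class GeneResult(NamedTuple):
    gene: str
    fold_change: float  # metastasis / primary expression ratio, must be > 0
    p_value: float


def select_differential(results: List[GeneResult],
                        min_fold: float = 2.5,
                        max_p: float = 0.001) -> List[str]:
    """Return genes whose fold change (in either direction) exceeds
    min_fold and whose p-value is below max_p."""
    selected = []
    for r in results:
        if r.fold_change <= 0:
            raise ValueError(f"fold change must be positive: {r.gene}")
        # Treat a 0.25 ratio (4-fold down) like a 4.0 ratio (4-fold up).
        magnitude = max(r.fold_change, 1.0 / r.fold_change)
        if magnitude > min_fold and r.p_value < max_p:
            selected.append(r.gene)
    return selected


# Hypothetical values, not results from the study.
demo = [
    GeneResult("geneA", 3.1, 0.0002),   # up, significant: kept
    GeneResult("geneB", 0.25, 0.0005),  # 4-fold down, significant: kept
    GeneResult("geneC", 2.0, 0.0001),   # fold change too small: dropped
    GeneResult("geneD", 5.0, 0.01),     # not significant: dropped
]
kept = select_differential(demo)
```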
Organization of the inferotemporal cortex in the macaque monkey: Connections of areas PITv and CITvp
Abstract:
Visual cortex of macaque monkeys consists of a large number of cortical areas that span the occipital, parietal, temporal, and frontal lobes and occupy more than half of the cortical surface. Although considerable progress has been made in understanding the contributions of many occipital areas to visual perceptual processing, much less is known concerning the specific functional contributions of higher areas in the temporal and frontal lobes. Previous behavioral and electrophysiological investigations have demonstrated that the inferotemporal cortex (IT) is essential to the animal's ability to recognize and remember visual objects. While it is generally recognized that IT consists of a number of anatomically and functionally distinct visual-processing areas, there remains considerable controversy concerning the precise number, size, and location of these areas. Therefore, the precise delineation of the cortical subdivisions of inferotemporal cortex is critical for any significant progress in the understanding of the specific contributions of inferotemporal areas to visual processing. In this study, anterograde and/or retrograde neuroanatomical tracers were injected into two visual areas in the ventral posterior and central portions of IT (areas PITv and CITvp) to elucidate the corticocortical connections of these areas with well-known areas of occipital cortex and with less well understood regions of inferotemporal cortex. The locations of injection sites and the delineation of the borders of many occipital areas were aided by the pattern of interhemispheric connections, revealed following callosal transection and subsequent labeling with HRP. The resultant patterns of connections were represented on two-dimensional computational (CARET) and manual cortical maps, and the laminar characteristics and density of the projection fields were quantified.
The laminar and density features of these corticocortical connections demonstrate thirteen anatomically distinct subdivisions or areas distributed within the superior temporal sulcus and across the inferotemporal gyrus. These results serve to refine previous descriptions of inferotemporal areas, validate recently identified areas, and provide a new description of the hierarchical relationships among occipitotemporal cortical areas in macaques.
Abstract:
The first manuscript, entitled "Time-Series Analysis as Input for Clinical Predictive Modeling: Modeling Cardiac Arrest in a Pediatric ICU" lays out the theoretical background for the project. There are several core concepts presented in this paper. First, traditional multivariate models (where each variable is represented by only one value) provide single point-in-time snapshots of patient status: they are incapable of characterizing deterioration. Since deterioration is consistently identified as a precursor to cardiac arrests, we maintain that the traditional multivariate paradigm is insufficient for predicting arrests. We identify time series analysis as a method capable of characterizing deterioration in an objective, mathematical fashion, and describe how to build a general foundation for predictive modeling using time series analysis results as latent variables. Building a solid foundation for any given modeling task involves addressing a number of issues during the design phase. These include selecting the proper candidate features on which to base the model, and selecting the most appropriate tool to measure them. We also identified several unique design issues that are introduced when time series data elements are added to the set of candidate features. One such issue is in defining the duration and resolution of time series elements required to sufficiently characterize the time series phenomena being considered as candidate features for the predictive model. Once the duration and resolution are established, there must also be explicit mathematical or statistical operations that produce the time series analysis result to be used as a latent candidate feature. In synthesizing the comprehensive framework for building a predictive model based on time series data elements, we identified at least four classes of data that can be used in the model design. 
The first two classes are shared with traditional multivariate models: multivariate data and clinical latent features. Multivariate data is represented by the standard one value per variable paradigm and is widely employed in a host of clinical models and tools. These are often represented by a number present in a given cell of a table. Clinical latent features are derived, rather than directly measured, data elements that more accurately represent a particular clinical phenomenon than any of the directly measured data elements in isolation. The second two classes are unique to the time series data elements. The first of these is the raw data elements. These are represented by multiple values per variable, and constitute the measured observations that are typically available to end users when they review time series data. These are often represented as dots on a graph. The final class of data results from performing time series analysis. This class of data represents the fundamental concept on which our hypothesis is based. The specific statistical or mathematical operations are up to the modeler to determine, but we generally recommend that a variety of analyses be performed in order to maximize the likelihood that a representation of the time series data elements is produced that is able to distinguish between two or more classes of outcomes. The second manuscript, entitled "Building Clinical Prediction Models Using Time Series Data: Modeling Cardiac Arrest in a Pediatric ICU", provides a detailed description, start to finish, of the methods required to prepare the data, build, and validate a predictive model that uses the time series data elements determined in the first paper. One of the fundamental tenets of the second paper is that manual implementations of time series based models are unfeasible due to the relatively large number of data elements and the complexity of preprocessing that must occur before data can be presented to the model.
Each of the seventeen steps is analyzed from the perspective of how it may be automated, when necessary. We identify the general objectives and available strategies of each of the steps, and we present our rationale for choosing a specific strategy for each step in the case of predicting cardiac arrest in a pediatric intensive care unit. Another issue brought to light by the second paper is that the individual steps required to use time series data for predictive modeling are more numerous and more complex than those used for modeling with traditional multivariate data. Even after complexities attributable to the design phase (addressed in our first paper) have been accounted for, the management and manipulation of the time series elements (the preprocessing steps in particular) are issues that are not present in a traditional multivariate modeling paradigm. In our methods, we present the issues that arise from the time series data elements: defining a reference time; imputing and reducing time series data in order to conform to a predefined structure that was specified during the design phase; and normalizing variable families rather than individual variable instances. The final manuscript, entitled: "Using Time-Series Analysis to Predict Cardiac Arrest in a Pediatric Intensive Care Unit" presents the results that were obtained by applying the theoretical construct and its associated methods (detailed in the first two papers) to the case of cardiac arrest prediction in a pediatric intensive care unit. Our results showed that utilizing the trend analysis from the time series data elements reduced the number of classification errors by 73%. The area under the Receiver Operating Characteristic curve increased from a baseline of 87% to 98% by including the trend analysis. 
In addition to the performance measures, we were also able to demonstrate that adding raw time series data elements without their associated trend analyses improved classification accuracy as compared to the baseline multivariate model, but diminished classification accuracy as compared to when just the trend analysis features were added (i.e., without adding the raw time series data elements). We believe this phenomenon was largely attributable to overfitting, which is known to increase as the ratio of candidate features to class examples rises. Furthermore, although we employed several feature reduction strategies to counteract the overfitting problem, they failed to improve the performance beyond that which was achieved by exclusion of the raw time series elements. Finally, our data demonstrated that pulse oximetry and systolic blood pressure readings tend to start diminishing about 10-20 minutes before an arrest, whereas heart rates tend to diminish rapidly less than 5 minutes before an arrest.
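To make the idea of a trend analysis feature concrete, the following is a minimal sketch of one possible operation: summarizing a window of vital-sign readings by its least-squares slope. The abstract leaves the specific statistical operations to the modeler, so the slope, the function names `trend_slope` and `summarize_window`, and the sample readings are assumptions for illustration, not the authors' method.

```python
def trend_slope(times, values):
    """Ordinary least-squares slope of values against times
    (units of the value per unit of time)."""
    n = len(times)
    if n != len(values) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("all time stamps are identical")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx


def summarize_window(times, values):
    """Collapse a raw time series window into a few candidate features:
    the latest value (snapshot), the window mean, and the trend."""
    return {
        "last": values[-1],
        "mean": sum(values) / len(values),
        "slope": trend_slope(times, values),
    }


# Hypothetical pulse oximetry readings, one per minute (not study data).
minutes = [0, 1, 2, 3]
spo2 = [98, 97, 96, 95]
features = summarize_window(minutes, spo2)
```

Here the snapshot value alone (95) says little, while the slope of -1 per minute captures the deterioration that a single point-in-time value cannot express, and it does so with one feature rather than four raw readings.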