7 resultados para Data Generation

em DigitalCommons@The Texas Medical Center


Relevância:

70.00% 70.00%

Publicador:

Resumo:

High-throughput assays, such as yeast two-hybrid system, have generated a huge amount of protein-protein interaction (PPI) data in the past decade. This tremendously increases the need for developing reliable methods to systematically and automatically suggest protein functions and relationships between them. With the available PPI data, it is now possible to study the functions and relationships in the context of a large-scale network. To data, several network-based schemes have been provided to effectively annotate protein functions on a large scale. However, due to those inherent noises in high-throughput data generation, new methods and algorithms should be developed to increase the reliability of functional annotations. Previous work in a yeast PPI network (Samanta and Liang, 2003) has shown that the local connection topology, particularly for two proteins sharing an unusually large number of neighbors, can predict functional associations between proteins, and hence suggest their functions. One advantage of the work is that their algorithm is not sensitive to noises (false positives) in high-throughput PPI data. In this study, we improved their prediction scheme by developing a new algorithm and new methods which we applied on a human PPI network to make a genome-wide functional inference. We used the new algorithm to measure and reduce the influence of hub proteins on detecting functionally associated proteins. We used the annotations of the Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) as independent and unbiased benchmarks to evaluate our algorithms and methods within the human PPI network. We showed that, compared with the previous work from Samanta and Liang, our algorithm and methods developed in this study improved the overall quality of functional inferences for human proteins. By applying the algorithms to the human PPI network, we obtained 4,233 significant functional associations among 1,754 proteins. Further comparisons of their KEGG and GO annotations allowed us to assign 466 KEGG pathway annotations to 274 proteins and 123 GO annotations to 114 proteins with estimated false discovery rates of <21% for KEGG and <30% for GO. We clustered 1,729 proteins by their functional associations and made pathway analysis to identify several subclusters that are highly enriched in certain signaling pathways. Particularly, we performed a detailed analysis on a subcluster enriched in the transforming growth factor β signaling pathway (P<10-50) which is important in cell proliferation and tumorigenesis. Analysis of another four subclusters also suggested potential new players in six signaling pathways worthy of further experimental investigations. Our study gives clear insight into the common neighbor-based prediction scheme and provides a reliable method for large-scale functional annotations in this post-genomic era.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Genetic anticipation is defined as a decrease in age of onset or increase in severity as the disorder is transmitted through subsequent generations. Anticipation has been noted in the literature for over a century. Recently, anticipation in several diseases including Huntington's Disease, Myotonic Dystrophy and Fragile X Syndrome were shown to be caused by expansion of triplet repeats. Anticipation effects have also been observed in numerous mental disorders (e.g. Schizophrenia, Bipolar Disorder), cancers (Li-Fraumeni Syndrome, Leukemia) and other complex diseases. ^ Several statistical methods have been applied to determine whether anticipation is a true phenomenon in a particular disorder, including standard statistical tests and newly developed affected parent/affected child pair methods. These methods have been shown to be inappropriate for assessing anticipation for a variety of reasons, including familial correlation and low power. Therefore, we have developed family-based likelihood modeling approaches to model the underlying transmission of the disease gene and penetrance function and hence detect anticipation. These methods can be applied in extended families, thus improving the power to detect anticipation compared with existing methods based only upon parents and children. The first method we have proposed is based on the regressive logistic hazard model. This approach models anticipation by a generational covariate. The second method allows alleles to mutate as they are transmitted from parents to offspring and is appropriate for modeling the known triplet repeat diseases in which the disease alleles can become more deleterious as they are transmitted across generations. ^ To evaluate the new methods, we performed extensive simulation studies for data simulated under different conditions to evaluate the effectiveness of the algorithms to detect genetic anticipation. Results from analysis by the first method yielded empirical power greater than 87% based on the 5% type I error critical value identified in each simulation depending on the method of data generation and current age criteria. Analysis by the second method was not possible due to the current formulation of the software. The application of this method to Huntington's Disease and Li-Fraumeni Syndrome data sets revealed evidence for a generation effect in both cases. ^

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Next-generation DNA sequencing platforms can effectively detect the entire spectrum of genomic variation and is emerging to be a major tool for systematic exploration of the universe of variants and interactions in the entire genome. However, the data produced by next-generation sequencing technologies will suffer from three basic problems: sequence errors, assembly errors, and missing data. Current statistical methods for genetic analysis are well suited for detecting the association of common variants, but are less suitable to rare variants. This raises great challenge for sequence-based genetic studies of complex diseases.^ This research dissertation utilized genome continuum model as a general principle, and stochastic calculus and functional data analysis as tools for developing novel and powerful statistical methods for next generation of association studies of both qualitative and quantitative traits in the context of sequencing data, which finally lead to shifting the paradigm of association analysis from the current locus-by-locus analysis to collectively analyzing genome regions.^ In this project, the functional principal component (FPC) methods coupled with high-dimensional data reduction techniques will be used to develop novel and powerful methods for testing the associations of the entire spectrum of genetic variation within a segment of genome or a gene regardless of whether the variants are common or rare.^ The classical quantitative genetics suffer from high type I error rates and low power for rare variants. To overcome these limitations for resequencing data, this project used functional linear models with scalar response to develop statistics for identifying quantitative trait loci (QTLs) for both common and rare variants. To illustrate their applications, the functional linear models were applied to five quantitative traits in Framingham heart studies. ^ This project proposed a novel concept of gene-gene co-association in which a gene or a genomic region is taken as a unit of association analysis and used stochastic calculus to develop a unified framework for testing the association of multiple genes or genomic regions for both common and rare alleles. The proposed methods were applied to gene-gene co-association analysis of psoriasis in two independent GWAS datasets which led to discovery of networks significantly associated with psoriasis.^

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Next-generation sequencing (NGS) technology has become a prominent tool in biological and biomedical research. However, NGS data analysis, such as de novo assembly, mapping and variants detection is far from maturity, and the high sequencing error-rate is one of the major problems. . To minimize the impact of sequencing errors, we developed a highly robust and efficient method, MTM, to correct the errors in NGS reads. We demonstrated the effectiveness of MTM on both single-cell data with highly non-uniform coverage and normal data with uniformly high coverage, reflecting that MTM’s performance does not rely on the coverage of the sequencing reads. MTM was also compared with Hammer and Quake, the best methods for correcting non-uniform and uniform data respectively. For non-uniform data, MTM outperformed both Hammer and Quake. For uniform data, MTM showed better performance than Quake and comparable results to Hammer. By making better error correction with MTM, the quality of downstream analysis, such as mapping and SNP detection, was improved. SNP calling is a major application of NGS technologies. However, the existence of sequencing errors complicates this process, especially for the low coverage (

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Second-generation antipsychotics (SGAs) are increasingly prescribed to treat psychiatric symptoms in pediatric patients infected with HIV. We examined the relationship between prescribed SGAs and physical growth in a cohort of youth with perinatally acquired HIV-1 infection. Pediatric AIDS Clinical Trials Group (PACTG), Protocol 219C (P219C), a multicenter, longitudinal observational study of children and adolescents perinatally exposed to HIV, was conducted from September 2000 until May 2007. The analysis included P219C participants who were perinatally HIV-infected, 3-18 years old, prescribed first SGA for at least 1 month, and had available baseline data prior to starting first SGA. Each participant prescribed an SGA was matched (based on gender, age, Tanner stage, baseline body mass index [BMI] z score) with 1-3 controls without antipsychotic prescriptions. The main outcomes were short-term (approximately 6 months) and long-term (approximately 2 years) changes in BMI z scores from baseline. There were 236 participants in the short-term and 198 in the long-term analysis. In linear regression models, youth with SGA prescriptions had increased BMI z scores relative to youth without antipsychotic prescriptions, for all SGAs (short-term increase = 0.192, p = 0.003; long-term increase = 0.350, p < 0.001), and for risperidone alone (short-term = 0.239, p = 0.002; long-term = 0.360, p = 0.001). Participants receiving both protease inhibitors (PIs) and SGAs showed especially large increases. These findings suggest that growth should be carefully monitored in youth with perinatally acquired HIV who are prescribed SGAs. Future research should investigate the interaction between PIs and SGAs in children and adolescents with perinatally acquired HIV infection.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The respiratory central pattern generator is a collection of medullary neurons that generates the rhythm of respiration. The respiratory central pattern generator feeds phrenic motor neurons, which, in turn, drive the main muscle of respiration, the diaphragm. The purpose of this thesis is to understand the neural control of respiration through mathematical models of the respiratory central pattern generator and phrenic motor neurons. ^ We first designed and validated a Hodgkin-Huxley type model that mimics the behavior of phrenic motor neurons under a wide range of electrical and pharmacological perturbations. This model was constrained physiological data from the literature. Next, we designed and validated a model of the respiratory central pattern generator by connecting four Hodgkin-Huxley type models of medullary respiratory neurons in a mutually inhibitory network. This network was in turn driven by a simple model of an endogenously bursting neuron, which acted as the pacemaker for the respiratory central pattern generator. Finally, the respiratory central pattern generator and phrenic motor neuron models were connected and their interactions studied. ^ Our study of the models has provided a number of insights into the behavior of the respiratory central pattern generator and phrenic motor neurons. These include the suggestion of a role for the T-type and N-type calcium channels during single spikes and repetitive firing in phrenic motor neurons, as well as a better understanding of network properties underlying respiratory rhythm generation. We also utilized an existing model of lung mechanics to study the interactions between the respiratory central pattern generator and ventilation. ^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

InGen of Creative Production in the Health Sciences is a compendium of innovative thinking exercises for individuals and groups, derived from an eclectic array of practical guides for professionals in a variety of fields. Segmented into five subcategories across twenty two chapters, the effort seeks to make techniques for increasing innovative problem solving more accessible to a diverse audience of problem solvers. The chapters of Roberta Ness. Innovation Generation (2012, Oxford University Press) provide the themes for each of the chapters in the workbook. It is intended that those who read Ness. Innovation Generation will benefit from practicing the constructs of innovative thinking exemplified in each exercise.^ The methods used to gather data, in this case mostly innovative thinking exercises, included literature reviews of existing innovative thinking tools, classroom materials, and theory-driven exploration of exercises to fill in gaps in extant materials. Specifically, Google.com and Amazon.com searches were conducted using the terms “innovation,” “innovative,” “innovator,” “creative,” “novelty,” “thinking,” together with some variance of “book,” “workbook,” and “exercise.” The results were sorted thematically to show correspondence with the themes in Ness (2012) and compared to suggested best practices of 50 years of scientific research on innovative thinking. Where themes were suggested by Ness (2012) and peer-reviewed research on innovation but unavailable in published innovation thinking workbooks, new exercises were developed. The five type subcategories into which these results were organized are: individual direct, individual indirect, group direct, group indirect and probing question. It is anticipated that the five type subcategories and spectrum of themes will equip problem solvers in a variety of capacities.^