6 resultados para CpGV resistance baculovirus whole genome sequencing

em DigitalCommons@The Texas Medical Center


Relevância:

100.00% 100.00%

Publicador:

Resumo:

The genomes of Fusobacterium nucleatum subspecies polymorphum strain ATCC 10953, Rickettsia typhi strain Wilmington, and Francisella tularensis subspecies holarctica strain OSU18 were sequenced, annotated, and analyzed. Each genome was then compared to the sequenced genomes of closely related bacteria. The genome of F. nucleatum ATCC 10953 was compared to two additional F. nucleatum subspecies, subspecies nucleatum and subspecies vincentii. This analysis revealed substantial evidence of horizontal gene transfer along with considerable genetic diversity within the species of F. nucleatum. R. typhi was compared to R. prowazekii and R. conorii. This analysis uncovered a hotspot for chromosomal rearrangements in the Spotted Fever Group but not the Typhus Group Rickettsia and revealed the close genetic relationship between the Typhus Group rickettsial species. F. tularensis OSU18 was compared to two additional F. tularensis strains. These comparisons uncovered significant chromosomal rearrangements between F. tularensis subspecies due to recombination between insertion sequence elements. ^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In population studies, most current methods focus on identifying one outcome-related SNP at a time by testing for differences of genotype frequencies between disease and healthy groups or among different population groups. However, testing a great number of SNPs simultaneously has a problem of multiple testing and will give false-positive results. Although, this problem can be effectively dealt with through several approaches such as Bonferroni correction, permutation testing and false discovery rates, patterns of the joint effects by several genes, each with weak effect, might not be able to be determined. With the availability of high-throughput genotyping technology, searching for multiple scattered SNPs over the whole genome and modeling their joint effect on the target variable has become possible. Exhaustive search of all SNP subsets is computationally infeasible for millions of SNPs in a genome-wide study. Several effective feature selection methods combined with classification functions have been proposed to search for an optimal SNP subset among big data sets where the number of feature SNPs far exceeds the number of observations. ^ In this study, we take two steps to achieve the goal. First we selected 1000 SNPs through an effective filter method and then we performed a feature selection wrapped around a classifier to identify an optimal SNP subset for predicting disease. And also we developed a novel classification method-sequential information bottleneck method wrapped inside different search algorithms to identify an optimal subset of SNPs for classifying the outcome variable. This new method was compared with the classical linear discriminant analysis in terms of classification performance. Finally, we performed chi-square test to look at the relationship between each SNP and disease from another point of view. ^ In general, our results show that filtering features using harmononic mean of sensitivity and specificity(HMSS) through linear discriminant analysis (LDA) is better than using LDA training accuracy or mutual information in our study. Our results also demonstrate that exhaustive search of a small subset with one SNP, two SNPs or 3 SNP subset based on best 100 composite 2-SNPs can find an optimal subset and further inclusion of more SNPs through heuristic algorithm doesn't always increase the performance of SNP subsets. Although sequential forward floating selection can be applied to prevent from the nesting effect of forward selection, it does not always out-perform the latter due to overfitting from observing more complex subset states. ^ Our results also indicate that HMSS as a criterion to evaluate the classification ability of a function can be used in imbalanced data without modifying the original dataset as against classification accuracy. Our four studies suggest that Sequential Information Bottleneck(sIB), a new unsupervised technique, can be adopted to predict the outcome and its ability to detect the target status is superior to the traditional LDA in the study. ^ From our results we can see that the best test probability-HMSS for predicting CVD, stroke,CAD and psoriasis through sIB is 0.59406, 0.641815, 0.645315 and 0.678658, respectively. In terms of group prediction accuracy, the highest test accuracy of sIB for diagnosing a normal status among controls can reach 0.708999, 0.863216, 0.639918 and 0.850275 respectively in the four studies if the test accuracy among cases is required to be not less than 0.4. On the other hand, the highest test accuracy of sIB for diagnosing a disease among cases can reach 0.748644, 0.789916, 0.705701 and 0.749436 respectively in the four studies if the test accuracy among controls is required to be at least 0.4. ^ A further genome-wide association study through Chi square test shows that there are no significant SNPs detected at the cut-off level 9.09451E-08 in the Framingham heart study of CVD. Study results in WTCCC can only detect two significant SNPs that are associated with CAD. In the genome-wide study of psoriasis most of top 20 SNP markers with impressive classification accuracy are also significantly associated with the disease through chi-square test at the cut-off value 1.11E-07. ^ Although our classification methods can achieve high accuracy in the study, complete descriptions of those classification results(95% confidence interval or statistical test of differences) require more cost-effective methods or efficient computing system, both of which can't be accomplished currently in our genome-wide study. We should also note that the purpose of this study is to identify subsets of SNPs with high prediction ability and those SNPs with good discriminant power are not necessary to be causal markers for the disease.^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

To identify genetic susceptibility loci for severe diabetic retinopathy, 286 Mexican-Americans with type 2 diabetes from Starr County, Texas completed detailed physical and ophthalmologic examinations including fundus photography for diabetic retinopathy grading. 103 individuals with moderate-to-severe non-proliferative diabetic retinopathy or proliferative diabetic retinopathy were defined as cases for this study. DNA samples extracted from study subjects were genotyped using the Affymetrix GeneChip® Human Mapping 100K Set, which includes 116,204 single nucleotide polymorphisms (SNPs) across the whole genome. Single-marker allelic tests and 2- to 8-SNP sliding-window Haplotype Trend Regression implemented in HelixTreeTM were first performed with these direct genotypes to identify genes/regions contributing to the risk of severe diabetic retinopathy. An additional 1,885,781 HapMap Phase II SNPs were imputed from the direct genotypes to expand the genomic coverage for a more detailed exploration of genetic susceptibility to diabetic retinopathy. The average estimated allelic dosage and imputed genotypes with the highest posterior probabilities were subsequently analyzed for associations using logistic regression and Fisher's Exact allelic tests, respectively. To move beyond these SNP-based approaches, 104,572 directly genotyped and 333,375 well-imputed SNPs were used to construct genetic distance matrices based on 262 retinopathy candidate genes and their 112 related biological pathways. Multivariate distance matrix regression was then used to test hypotheses with genes and pathways as the units of inference in the context of susceptibility to diabetic retinopathy. This study provides a framework for genome-wide association analyses, and implicated several genes involved in the regulation of oxidative stress, inflammatory processes, histidine metabolism, and pancreatic cancer pathways associated with severe diabetic retinopathy. Many of these loci have not previously been implicated in either diabetic retinopathy or diabetes. In summary, CDC73, IL12RB2, and SULF1 had the best evidence as candidates to influence diabetic retinopathy, possibly through novel biological mechanisms related to VEGF-mediated signaling pathway or inflammatory processes. While this study uncovered some genes for diabetic retinopathy, a comprehensive picture of the genetic architecture of diabetic retinopathy has not yet been achieved. Once fully understood, the genetics and biology of diabetic retinopathy will contribute to better strategies for diagnosis, treatment and prevention of this disease.^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Linkage and association studies are major analytical tools to search for susceptibility genes for complex diseases. With the availability of large collection of single nucleotide polymorphisms (SNPs) and the rapid progresses for high throughput genotyping technologies, together with the ambitious goals of the International HapMap Project, genetic markers covering the whole genome will be available for genome-wide linkage and association studies. In order not to inflate the type I error rate in performing genome-wide linkage and association studies, multiple adjustment for the significant level for each independent linkage and/or association test is required, and this has led to the suggestion of genome-wide significant cut-off as low as 5 × 10 −7. Almost no linkage and/or association study can meet such a stringent threshold by the standard statistical methods. Developing new statistics with high power is urgently needed to tackle this problem. This dissertation proposes and explores a class of novel test statistics that can be used in both population-based and family-based genetic data by employing a completely new strategy, which uses nonlinear transformation of the sample means to construct test statistics for linkage and association studies. Extensive simulation studies are used to illustrate the properties of the nonlinear test statistics. Power calculations are performed using both analytical and empirical methods. Finally, real data sets are analyzed with the nonlinear test statistics. Results show that the nonlinear test statistics have correct type I error rates, and most of the studied nonlinear test statistics have higher power than the standard chi-square test. This dissertation introduces a new idea to design novel test statistics with high power and might open new ways to mapping susceptibility genes for complex diseases. ^

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Attention has recently been drawn to Enterococcus faecium because of an increasing number of nosocomial infections caused by this species and its resistance to multiple antibacterial agents. However, relatively little is known about the pathogenic determinants of this organism. We have previously identified a cell-wall-anchored collagen adhesin, Acm, produced by some isolates of E. faecium, and a secreted antigen, SagA, exhibiting broad-spectrum binding to extracellular matrix proteins. Here, we analysed the draft genome of strain TX0016 for potential microbial surface components recognizing adhesive matrix molecules (MSCRAMMs). Genome-based bioinformatics identified 22 predicted cell-wall-anchored E. faecium surface proteins (Fms), of which 15 (including Acm) had characteristics typical of MSCRAMMs, including predicted folding into a modular architecture with multiple immunoglobulin-like domains. Functional characterization of one [Fms10; redesignated second collagen adhesin of E. faecium (Scm)] revealed that recombinant Scm(65) (A- and B-domains) and Scm(36) (A-domain) bound to collagen type V efficiently in a concentration-dependent manner, bound considerably less to collagen type I and fibrinogen, and differed from Acm in their binding specificities to collagen types IV and V. Results from far-UV circular dichroism measurements of recombinant Scm(36) and of Acm(37) indicated that these proteins were rich in beta-sheets, supporting our folding predictions. Whole-cell ELISA and FACS analyses unambiguously demonstrated surface expression of Scm in most E. faecium isolates. Strikingly, 11 of the 15 predicted MSCRAMMs clustered in four loci, each with a class C sortase gene; nine of these showed similarity to Enterococcus faecalis Ebp pilus subunits and also contained motifs essential for pilus assembly. Antibodies against one of the predicted major pilus proteins, Fms9 (redesignated EbpC(fm)), detected a 'ladder' pattern of high-molecular-mass protein bands in a Western blot analysis of cell surface extracts from E. faecium, suggesting that EbpC(fm) is polymerized into a pilus structure. Further analysis of the transcripts of the corresponding gene cluster indicated that fms1 (ebpA(fm)), fms5 (ebpB(fm)) and ebpC(fm) are co-transcribed, a result consistent with those for pilus-encoding gene clusters of other Gram-positive bacteria. All 15 genes occurred frequently in 30 clinically derived diverse E. faecium isolates tested. The common occurrence of MSCRAMM- and pilus-encoding genes and the presence of a second collagen-binding protein may have important implications for our understanding of this emerging pathogen.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Mycobacterium tuberculosis, a bacillus known to cause disease in humans since ancient times, is the etiological agent of tuberculosis (TB). The infection is primarily pulmonary, although other organs may also be affected. The prevalence of pulmonary TB disease in the US is highest along the US-Mexico border, and of the four US states bordering Mexico, Texas had the second highest percentage of cases of TB disease among Mexico-born individuals in 1999 (CDC, 2001). Between the years of 1993 and 1998, the prevalence of drug-resistant (DR) TB was 9.1% among Mexican-born individuals and 4.4% among US-born individuals (CDC, 2001). In the same time period, the prevalence of multi-drug resistant (MDR) TB was 1.4% among Mexican-born individuals and 0.6% among US-born individuals (CDC, 2001). There is a renewed urgency in the quest for faster and more effective screening, diagnosis, and treatment methods for TB due to the resurgence of tuberculosis in the US during the mid-1980s and early 1990s (CDC, 2007a), and the emergence of drug-resistant, multidrug-resistant, and extremely drug-resistant tuberculosis worldwide. Failure to identify DR and MDR-TB quickly leads to poorer treatment outcomes (CDC, 2007b). The recent rise in TB/HIV comorbidity further complicates TB control efforts. The gold standard for identification of DR-TB requires mycobacterial growth in culture, a technique taking up to three weeks, during which time DR/MDR-TB individuals harboring resistant organisms may be receiving inappropriate treatment. The goal of this study was to determine the sensitivity and specificity of real-time quantitative polymerase chain reaction (qPCR) using molecular beacons in the Texas population. qPCR using molecular beacons is a novel approach to detect mycobacterial mutations conferring drug resistance. This technique is time-efficient and has been shown to have high sensitivity and specificity in several populations worldwide. Rifampin (RIF) susceptibility was chosen as the test parameter because strains of M. tuberculosis which are resistant to RIF are likely to also be MDR. Due to its status as a point of entry for many immigrants into the US, control efforts against TB and drug-resistant TB in Texas is a vital component of prevention efforts in the US as a whole. We show that qPCR using molecular beacons has high sensitivity and specificity when compared with culture (94% and 87%, respectively) and DNA sequencing (90% and 96%, respectively). We also used receiver operator curve analysis to calculate cutoff values for the objective determination of results obtained by qPCR using molecular beacons. ^