Digital Library

18 results for Patent sequence dowload

in DigitalCommons@The Texas Medical Center

Detailed analysis of sequence changes occurring during vlsE antigenic variation in the mouse model of Borrelia burgdorferi infection.

Relevance: 20.00%

Abstract:

Lyme disease Borrelia can infect humans and animals for months to years, despite the presence of an active host immune response. The vls antigenic variation system, which expresses the surface-exposed lipoprotein VlsE, plays a major role in B. burgdorferi immune evasion. Gene conversion between vls silent cassettes and the vlsE expression site occurs at high frequency during mammalian infection, resulting in sequence variation in the VlsE product. In this study, we examined vlsE sequence variation in B. burgdorferi B31 during mouse infection by analyzing 1,399 clones isolated from bladder, heart, joint, ear, and skin tissues of mice infected for 4 to 365 days. The median number of codon changes increased progressively in C3H/HeN mice from 4 to 28 days post infection, and no clones retained the parental vlsE sequence at 28 days. In contrast, the decrease in the number of clones with the parental vlsE sequence and the increase in the number of sequence changes occurred more gradually in severe combined immunodeficiency (SCID) mice. Clones containing a stop codon were isolated, indicating that continuous expression of full-length VlsE is not required for survival in vivo; also, these clones continued to undergo vlsE recombination. Analysis of clones with apparent single recombination events indicated that recombinations into vlsE are nonselective with regard to the silent cassette utilized, as well as the length and location of the recombination event. Sequence changes as small as one base pair were common. Fifteen percent of recovered vlsE variants contained "template-independent" sequence changes, which clustered in the variable regions of vlsE. We hypothesize that the increased frequency and complexity of vlsE sequence changes observed in clones recovered from immunocompetent mice (as compared with SCID mice) is due to rapid clearance of relatively invariant clones by variable region-specific anti-VlsE antibody responses.

Dissemination of methicillin-resistant Staphylococcus aureus USA300 sequence type 8 lineage in Latin America.

Relevance: 20.00%

Abstract:

BACKGROUND: Methicillin-resistant Staphylococcus aureus (MRSA) is an important nosocomial and community-associated (CA) pathogen. Recently, a variant of the MRSA USA300 clone emerged and disseminated in South America, causing important clinical problems. METHODS: S. aureus isolates were prospectively collected (2006-2008) from 32 tertiary hospitals in Colombia, Ecuador, Peru, and Venezuela. MRSA isolates were subjected to antimicrobial susceptibility testing and pulsed-field gel electrophoresis and were categorized as health care-associated (HA)-like or CA-like clones on the basis of genotypic characteristics and detection of genes encoding Panton-Valentine leukocidin and staphylococcal cassette chromosome (SCC) mec IV. In addition, multilocus sequence typing of representative isolates of each major CA-MRSA pulsotype was performed, and the presence of USA300-associated toxins and the arcA gene was investigated for all isolates categorized as CA-MRSA. RESULTS: A total of 1570 S. aureus isolates were included; 651 (41%) were MRSA, with the highest rate of MRSA isolation in Peru (62%) and the lowest in Venezuela (26%); 71%, 27%, and 2% were classified as HA-like, CA-like, and non-CA/HA-like clones, respectively. Only 9 MRSA isolates were confirmed to have reduced susceptibility to glycopeptides (glycopeptide-intermediate S. aureus phenotype). The most common pulsotype (designated ComA) among the CA-like MRSA strains was found in 96% of isolates, with the majority (81%) having a ≤6-band difference from the USA300-0114 strain. Representative isolates of this clone were sequence type 8; however, unlike the USA300-0114 strain, they harbored a different SCCmec IV subtype and lacked arcA (an indicator of the arginine catabolic mobile element). CONCLUSION: A variant CA-MRSA USA300 clone has become established in South America and, in some countries, is endemic in hospital settings.

Complete Genome Sequence of Treponema paraluiscuniculi, Strain Cuniculi A: The Loss of Infectivity to Humans Is Associated with Genome Decay.

Relevance: 20.00%

Abstract:

Treponema paraluiscuniculi is the causative agent of rabbit venereal spirochetosis. It is not infectious to humans, although its genome structure is very closely related to other pathogenic Treponema species including Treponema pallidum subspecies pallidum, the etiological agent of syphilis. In this study, the genome sequence of Treponema paraluiscuniculi, strain Cuniculi A, was determined by a combination of several high-throughput sequencing strategies. Whereas the overall size (1,133,390 bp), arrangement, and gene content of the Cuniculi A genome closely resembled those of the T. pallidum genome, the T. paraluiscuniculi genome contained a markedly higher number of pseudogenes and gene fragments (51). In addition to pseudogenes, 33 divergent genes were also found in the T. paraluiscuniculi genome. A set of 32 (out of 84) affected genes encoded proteins of known or predicted function in the Nichols genome. These proteins included virulence factors, gene regulators and components of DNA repair and recombination. The majority (52 or 61.9%) of the Cuniculi A pseudogenes and divergent genes were of unknown function. Our results indicate that T. paraluiscuniculi has evolved from a T. pallidum-like ancestor and adapted to a specialized host-associated niche (rabbits) during loss of infectivity to humans. The genes that are inactivated or altered in T. paraluiscuniculi are candidates for virulence factors important in the infectivity and pathogenesis of T. pallidum subspecies.

Genome sequence of Fusobacterium nucleatum subspecies polymorphum - a genetically tractable fusobacterium.

Relevance: 20.00%

Abstract:

Fusobacterium nucleatum is a prominent member of the oral microbiota and is a common cause of human infection. F. nucleatum includes five subspecies: polymorphum, nucleatum, vincentii, fusiforme, and animalis. F. nucleatum subsp. polymorphum ATCC 10953 has been well characterized phenotypically and, in contrast to previously sequenced strains, is amenable to gene transfer. We sequenced and annotated the 2,429,698 bp genome of F. nucleatum subsp. polymorphum ATCC 10953. Plasmid pFN3 from the strain was also sequenced and analyzed. When compared to the other two available fusobacterial genomes (F. nucleatum subsp. nucleatum and F. nucleatum subsp. vincentii), 627 open reading frames unique to F. nucleatum subsp. polymorphum ATCC 10953 were identified. A large percentage of these mapped within one of 28 regions or islands containing five or more genes. Seventeen percent of the clustered proteins that demonstrated similarity were most similar to proteins from the clostridia, with others being most similar to proteins from other gram-positive organisms such as Bacillus and Streptococcus. A ten-kilobase region homologous to the Salmonella typhimurium propanediol utilization locus was identified, as were a prophage and an integrated conjugal plasmid. The genome contains five composite ribozyme/transposons, similar to the CdISt IStrons described in Clostridium difficile. IStrons are not present in the other fusobacterial genomes. These findings indicate that F. nucleatum subsp. polymorphum is proficient at horizontal gene transfer and that exchange with the Firmicutes, particularly the Clostridia, is common.

A trilocus sequence typing scheme for hospital epidemiology and subspecies differentiation of an important nosocomial pathogen, Enterococcus faecalis.

Relevance: 20.00%

Abstract:

In this study, we present a trilocus sequence typing (TLST) scheme based on intragenic regions of two antigenic genes, ace and salA (encoding a collagen/laminin adhesin and a cell wall-associated antigen, respectively), and a gene associated with antibiotic resistance, lsa (encoding a putative ABC transporter), for subspecies differentiation of Enterococcus faecalis. Each of the alleles was analyzed using 50 E. faecalis isolates representing 42 diverse multilocus sequence types (ST(M); based on seven housekeeping genes) and four groups of clonally linked (by pulsed-field gel electrophoresis [PFGE]) isolates. The allelic profiles and/or concatenated sequences of the three genes agreed with multilocus sequence typing (MLST) results for typing of 49 of the 50 isolates; in addition to the one exception, two isolates were found to have identical TLST types but were single-locus variants (differing by a single nucleotide) by MLST and were therefore also classified as clonally related by MLST. TLST was also comparable to PFGE for establishing short-term epidemiological relationships, typing all isolates classified as clonally related by PFGE with the same type. TLST was then applied to representative isolates (of each PFGE subtype and isolation year) of a collection of 48 hospital isolates and demonstrated the same relationships between isolates of an outbreak strain as those found by MLST and PFGE. In conclusion, the TLST scheme described here was shown to be successful for investigating short-term epidemiology in a hospital setting and may provide an alternative to MLST for discriminating isolates.

Detection and sequence analysis of a nickel-induced mutation in a retroviral model system

Relevance: 20.00%

Abstract:

We have developed a novel way to assess the mutagenicity of environmentally important metal carcinogens, such as nickel, by creating a positive selection system based upon the conditional expression of a retroviral transforming gene. The target gene is the v-mos gene in MuSVts110, a murine retrovirus possessing a growth-temperature-dependent defect in expression of the transforming gene due to viral RNA splicing. In normal rat kidney cells infected with MuSVts110 (6m2 cells), splicing of the MuSVts110 RNA to form the mRNA from which the transforming protein, p85^gag-mos, is translated is growth-temperature dependent, occurring at 33 °C and below but not at 39 °C and above. This splicing "defect" is mediated by cis-acting viral sequences. Nickel chloride treatment of 6m2 cells followed by growth at 39 °C allowed the selection of "revertant" cells which constitutively express p85^gag-mos due to stable changes in the viral RNA splicing phenotype, suggesting that nickel, a carcinogen whose mutagenicity has not been well established, could induce mutations in mammalian genes. We also show by direct sequencing of PCR-amplified integrated MuSVts110 DNA from a 6m2 nickel-revertant cell line that the nickel-induced mutation affecting the splicing phenotype is a cis-acting 70-base duplication of a region of the viral DNA surrounding the 3′ splice site. These findings provide the first example of the molecular basis for a nickel-induced DNA lesion and establish the mutagenicity of this potent carcinogen.

Nuclear proteins interacting with an AP-1/CRE-like promoter sequence in the human TNFα gene

Relevance: 20.00%

Abstract:

Monocyte developmental heterogeneity is reflected at the cellular level by differential activation competence and at the molecular level by differential regulation of gene expression. LPS activates monocytes to produce tumor necrosis factor-α (TNF). Events occurring at the molecular level necessary for TNF regulation have not been elucidated, but depend both on activation signals and the maturation state of the cell: peripheral blood monocytes produce TNF upon LPS stimulation, but only within the first 72 hours of culture. Expression of c-fos is associated with monocytic differentiation and activation; the fos-associated protein, c-jun, is also expressed during monocyte activation. Increased cAMP levels are associated with downregulation of macrophage function, including LPS-induced TNF transcription. Due to these associations, we studied a region of the TNF promoter which resembles the binding sites for both AP-1 (fos/jun) and CRE-binding protein (or ATF) in order to identify potential molecular markers defining activation-competent populations of monocytic cells.

Nuclear protein binding studies using extracts from THP-1 monocytic cells treated with LPS, which stimulates TNF production, or with dexamethasone (Dex) or pentoxifylline (PTX), which inhibit it, suggest that a low-mobility doublet complex may be involved in regulation through this promoter region. PTX or Dex increase binding of these complexes equivalently over untreated cells; approximately two hours after LPS induction, the upper complex is undetectable. The upper complex is composed of ATF2 (CRE-BP1); the lower is a heterodimer of jun/ATF2. LPS induces c-jun and thus may enhance formation of jun-ATF2 complexes. The simultaneous presence of both complexes may reduce the amount of TNF transcription through competitive binding, while a loss of the upper (ATF2) complex and/or gain of the lower (jun-ATF2) complex may allow increased transcription. AP-1 elements generally transduce signals involving PKC; the CRE mediates a cAMP response, involving PKA. Thus, this element has the potential of receiving signals through divergent signalling pathways. Our findings also suggest that cAMP-induced inhibition of macrophage functions may occur via downregulation of activation-associated genes through competitive binding of particular cAMP-responsive nuclear protein complexes.

Models of DNA sequence evolution

Relevance: 20.00%

Abstract:

Models of DNA sequence evolution and methods for estimating evolutionary distances are needed for studying the rate and pattern of molecular evolution and for inferring the evolutionary relationships of organisms or genes. In this dissertation, several new models and methods are developed.

The rate variation among nucleotide sites: To obtain unbiased estimates of evolutionary distances, the rate heterogeneity among nucleotide sites of a gene should be considered. Commonly, it is assumed that the substitution rate varies among sites according to a gamma distribution (gamma model) or, more generally, an invariant+gamma model, which includes some invariable sites. A maximum likelihood (ML) approach was developed for estimating the shape parameter of the gamma distribution (α) and/or the proportion of invariable sites (θ). Computer simulation showed that (1) under the gamma model, α can be well estimated from 3 or 4 sequences if the sequences are long; and (2) the distance estimate is unbiased and robust against violations of the assumptions of the invariant+gamma model.

However, this ML method requires a huge amount of computational time and is useful only for fewer than 6 sequences. Therefore, I developed a fast method for estimating α, which is easy to implement and requires no knowledge of the tree. A computer program was developed for estimating α and evolutionary distances; it can handle as many as 30 sequences.

Evolutionary distances under the stationary, time-reversible (SR) model: The SR model is a general model of nucleotide substitution, which assumes (i) stationary nucleotide frequencies and (ii) time-reversibility. It can be extended to the SRV model, which allows rate variation among sites. I developed a method for estimating the distance under the SR or SRV model, as well as the variance-covariance matrix of distances. Computer simulation showed that the SR method is better than a simpler method when the sequence length L > 1,000 bp and is robust against deviations from time-reversibility. As expected, when the rate varies among sites, the SRV method is much better than the SR method.

The evolutionary distances under nonstationary nucleotide frequencies: The statistical properties of the paralinear and LogDet distances under nonstationary nucleotide frequencies were studied. First, I developed formulas for correcting the estimation biases of the paralinear and LogDet distances. The performances of these formulas and the formulas for sampling variances were examined by computer simulation. Second, I developed a method for estimating the variance-covariance matrix of the paralinear distance, so that statistical tests of phylogenies can be conducted when the nucleotide frequencies are nonstationary. Third, a new method for testing the molecular clock hypothesis was developed in the nonstationary case.
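To illustrate why rate heterogeneity among sites matters for distance estimation, the sketch below compares the ordinary Jukes-Cantor distance with its gamma-corrected form. This is only an illustration under a deliberately simple substitution model: it is not the SR/SRV method or the ML estimator of α described in the abstract, and the function names and the toy sequences are hypothetical.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of sites that differ between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(1 for x, y in zip(seq_a, seq_b) if x != y)
    return diffs / len(seq_a)


def jc_distance(p):
    """Jukes-Cantor distance assuming the same rate at every site."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def jc_gamma_distance(p, alpha):
    """Jukes-Cantor distance when rates vary among sites according to a
    gamma distribution with shape parameter alpha (assumed known here)."""
    return 0.75 * alpha * ((1.0 - 4.0 * p / 3.0) ** (-1.0 / alpha) - 1.0)


# Toy example (hypothetical sequences): 1 difference in 8 sites.
p = p_distance("ACGTACGT", "ACGTACGA")
print(p, jc_distance(p), jc_gamma_distance(p, alpha=0.5))
```

A small α (strong rate variation) gives a larger corrected distance than the uniform-rate formula, and the two converge as α grows large.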

REPSA: A method for determining the sequence specificity of DNA binding ligands

Relevance: 20.00%

Abstract:

We developed a novel combinatorial method termed restriction endonuclease protection selection and amplification (REPSA) to identify consensus binding sites of DNA-binding ligands. REPSA uses a unique enzymatic selection based on the inhibition of cleavage by a type IIS restriction endonuclease, an enzyme that cleaves DNA at a site distal from its recognition sequence. Sequences bound by a ligand are protected from cleavage while unprotected sequences are cleaved. This enzymatic selection occurs in solution under mild conditions and is dependent only on the DNA-binding ability of the ligand. Thus, REPSA is useful for a broad range of ligands including all classes of DNA-binding ligands, weakly binding ligands, mixed populations of ligands, and unknown ligands. Here I describe REPSA and the application of this method to select the consensus DNA-binding sequences of three representative DNA-binding ligands: a nucleic acid (triplex-forming single-stranded DNA), a protein (the TATA-binding protein), and a small molecule (Distamycin A). These studies generated new information regarding the specificity of these ligands in addition to establishing their DNA-binding sequences.

Fibrillins: Sequence, phylogeny and carboxy terminal domain proteolytic processing

Relevance: 20.00%

Abstract:

Fibrillin-1 and -2 are large secreted glycoproteins that are known to be components of extracellular matrix microfibrils located in the vasculature, basement membrane and various connective tissues. These microfibrils are often associated with a superstructure known as the elastic fiber. During the development of elastic tissues, fibrillin microfibrils precede the appearance of elastin and may provide a scaffolding for the deposition and crosslinking of elastin. Using RT-PCR, we cloned and sequenced 3.85 kbp of the FBN2 gene. Five differences were found between our contig sequence and that published by Zhang et al. (1995). Like many extracellular matrix proteins, the fibrillins are modular proteins. We compared analogous domains of the two fibrillins and also members of the latent TGF-β binding protein (LTBP) family to determine their phylogenetic relationship. We found that the two families are homologous. LTBP-2 is the most similar to the fibrillin family while FBN-1 is the most similar to the LTBP family. The fibrillin-1 carboxy terminal domain is proteolytically processed. Two eukaryotic protein expression systems, baculoviral and CHO-K1, were developed to examine the proteolytic processing of the carboxy terminal domain of the fibrillin-1 protein. Both expression systems successfully processed the domain and both processed a mutant less efficiently. In the CHO-K1 cells, processing occurred intracellularly.

Antigenic variation in Lyme disease spirochetes by segmental recombination of VMP-like sequence cassettes

Relevance: 20.00%

Abstract:

Lyme disease is a multisystemic disorder caused by tick-borne infection of humans or other mammalian hosts with Borrelia burgdorferi. If untreated, the spirochetes can persist in the mammalian host for months or years. The mechanisms by which Lyme disease spirochetes evade the immune response have not been determined. In this study, we have identified and characterized an elaborate genetic system in the Lyme disease spirochete B. burgdorferi that promotes extensive antigenic variation of a 34-kDa surface-exposed lipoprotein, VlsE. A 28-kilobase linear plasmid of B. burgdorferi B31 (lp28-1) was found to contain a vmp-like sequence (vls) locus that closely resembles the variable major protein (vmp) system for antigenic variation of relapsing fever organisms. The presence of lp28-1 correlates with the high-infectivity phenotype in B. burgdorferi strains tested. Segments of the 15 non-expressed (silent) vls cassette sequences located upstream of vlsE are able to recombine into the central vlsE cassette region during infection of C3H/HeN mice, resulting in antigenic variation of the expressed lipoprotein. When compared to parental VlsE, VlsE variants progressively accumulate sequence changes over 4, 7, 14, 21, and 28 days post infection in C3H/HeN mice. However, no recombination was detected during 28 days of in vitro culture, suggesting in vivo induction of VlsE antigenic variation. Adaptive immune responses do not appear to play a significant role in this induction, since similar recombination events were also observed in immunodeficient SCID mice. The 5′ and 3′ noncassette regions of vlsE are apparently not subject to recombination and sequence variation. The structure and sequence of the silent vls cassette locus are preserved during the process of VlsE antigenic variation, consistent with a nonreciprocal recombination mechanism. This combinatorial form of antigenic variation could potentially yield millions of VlsE variants in the mammalian host, and thereby contribute to immune evasion, long-term survival, and pathogenesis of B. burgdorferi.

THEORETICAL STUDIES ON THE METHODS OF RECONSTRUCTING PHYLOGENETIC TREES FROM DNA SEQUENCE DATA (MOLECULAR EVOLUTION, HOMINOID EVOLUTION, COMPUTER SIMULATION)

Relevance: 20.00%

Abstract:

(1) A mathematical theory for computing the probabilities of various nucleotide configurations is developed, and the probability of obtaining the correct phylogenetic tree (model tree) from sequence data is evaluated for six phylogenetic tree-making methods (UPGMA, distance Wagner method, transformed distance method, Fitch-Margoliash's method, maximum parsimony method, and compatibility method). The number of nucleotides (m*) necessary to obtain the correct tree with a probability of 95% is estimated with special reference to the human, chimpanzee, and gorilla divergence. m* is at least 4,200, but the availability of outgroup species greatly reduces m* for all methods except UPGMA. m* increases if transitions occur more frequently than transversions, as in the case of mitochondrial DNA. (2) A new tree-making method called the neighbor-joining method is proposed. This method is applicable to either distance data or character state data. Computer simulation has shown that the neighbor-joining method is generally better than UPGMA, Farris' method, Li's method, and the modified Farris method at recovering the true topology when distance data are used. A related method, the simultaneous partitioning method, is also discussed. (3) The maximum likelihood (ML) method for phylogeny reconstruction under the assumption of both constant and varying evolutionary rates is studied, and a new algorithm for obtaining the ML tree is presented. This method gives a tree similar to that obtained by UPGMA when a constant evolutionary rate is assumed, whereas it gives a tree similar to that obtained by the maximum parsimony method and the neighbor-joining method when a varying evolutionary rate is assumed.
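As a rough illustration of the neighbor-joining idea for distance data, the following minimal sketch repeatedly joins the pair of nodes that minimizes the commonly used Q criterion, Q(i, j) = (n - 2) d(i, j) - r(i) - r(j), where r(i) is the sum of distances from node i to all other nodes. It is a simplified topology-only version (no branch lengths) and is not claimed to reproduce the procedure as written in the dissertation; the function name, the data layout, and the five-taxon distance table are assumptions made for the example.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Return the sequence of joins made by neighbor-joining.

    names: list of distinct leaf labels.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    """
    nodes = list(names)
    d = dict(dist)  # work on a copy so the caller's table is untouched
    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        # r[i]: total distance from node i to every other current node
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i)
             for i in nodes}
        # choose the pair that minimizes the Q criterion
        a, b = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]],
        )
        new = f"({a},{b})"
        # distance from the new internal node to each remaining node
        for k in nodes:
            if k not in (a, b):
                d[frozenset((new, k))] = 0.5 * (
                    d[frozenset((a, k))]
                    + d[frozenset((b, k))]
                    - d[frozenset((a, b))]
                )
        nodes = [k for k in nodes if k not in (a, b)] + [new]
        joins.append((a, b))
    joins.append(tuple(nodes))  # the last two nodes are joined directly
    return joins


# Hypothetical five-taxon distance table, for illustration only.
raw = {
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3,
}
dist = {frozenset(pair): float(v) for pair, v in raw.items()}
joins = neighbor_joining(["a", "b", "c", "d", "e"], dist)
print(joins)
```

With this table the first pair joined is a and b. Ties in the Q criterion are broken by input order, which is an arbitrary choice of this sketch.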

The sequence comparison index: A novel method for comparing proteins and proteomes

Relevance: 20.00%

Abstract:

Historically, morphological features were used as the primary means to classify organisms. However, the age of molecular genetics has allowed us to approach this field from the perspective of the organism's genetic code. Early work used highly conserved sequences, such as ribosomal RNA. The increasing number of complete genomes in the public data repositories provides the opportunity to look not only at a single gene, but at organisms' entire parts list.

Here the Sequence Comparison Index (SCI) and the Organism Comparison Index (OCI), algorithms and methods to compare proteins and proteomes, are presented. The complete proteomes of 104 sequenced organisms were compared. Over 280 million full Smith-Waterman alignments were performed on sequence pairs which had a reasonable expectation of being related. From these alignments a whole proteome phylogenetic tree was constructed. This method was also used to compare the small subunit (SSU) rRNA from each organism, and a tree was constructed from these results. The SSU rRNA tree by the SCI/OCI method looks very much like accepted SSU rRNA trees from sources such as the Ribosomal Database Project, thus validating the method. The SCI/OCI proteome tree showed a number of small but significant differences when compared to the SSU rRNA tree and proteome trees constructed by other methods. Horizontal gene transfer does not appear to affect the SCI/OCI trees until the transferred genes make up a large portion of the proteome.

As part of this work, the Database of Related Local Alignments (DaRLA) was created and contains over 81 million rows of sequence alignment information. DaRLA, while primarily used to build the whole proteome trees, can also be applied to shared gene content analysis, gene order analysis, and creating individual protein trees.

Finally, the standard BLAST method for analyzing shared gene content was compared to the SCI method using 4 spirochetes. The SCI system performed flawlessly, finding all proteins from one organism against itself and finding all the ribosomal proteins between organisms. The BLAST system missed some proteins from its respective organism and failed to detect small ribosomal proteins between organisms.
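For readers unfamiliar with the Smith-Waterman alignments mentioned above, the sketch below computes the optimal local alignment score of two sequences by dynamic programming. It is a minimal version with assumed parameters: a flat match/mismatch score and a linear gap penalty stand in for the substitution matrix and gap model that a protein comparison would normally use, and nothing here reflects how the SCI or OCI values themselves are derived.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b.

    Scoring values are illustrative defaults, not those of the SCI method.
    Only two rows of the dynamic-programming table are kept in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # a local alignment may restart anywhere, hence the floor at 0
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("TTACGTTT", "GGACGGG"))  # shared "ACG" scores 6
```

Because every cell is floored at zero, unrelated flanking regions do not drag the score down, which is what distinguishes local from global alignment.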

Integrating sequence information in microarray data analysis by free energy modeling

Relevance: 20.00%

Abstract:

Microarray technology is a high-throughput method for genotyping and gene expression profiling. Limited sensitivity and specificity are among the essential problems of this technology. Most existing methods of microarray data analysis have an apparent limitation: they deal only with the numerical part of microarray data and make little use of gene sequence information. Because it is the gene sequences that precisely define the physical objects being measured by a microarray, it is natural to make the gene sequences an essential part of the data analysis. This dissertation focused on the development of free energy models to integrate sequence information in microarray data analysis. The models were used to characterize the mechanism of hybridization on microarrays and enhance the sensitivity and specificity of microarray measurements.

Cross-hybridization is a major obstacle to the sensitivity and specificity of microarray measurements. In this dissertation, we evaluated the scope of the cross-hybridization problem on short-oligo microarrays. The results showed that cross-hybridization on arrays is mostly caused by oligo fragments with a run of 10 to 16 nucleotides complementary to the probes. Furthermore, a free-energy-based model was proposed to quantify the amount of cross-hybridization signal on each probe. This model treats cross-hybridization as an integral effect of the interactions between a probe and various off-target oligo fragments. Using public spike-in datasets, the model showed high accuracy in predicting the cross-hybridization signals on those probes whose intended targets are absent in the sample.

Several prospective models were proposed to improve the Positional Dependent Nearest-Neighbor (PDNN) model for better quantification of gene expression and cross-hybridization.

The problem addressed in this dissertation is fundamental to microarray technology. We expect that this study will help us to understand the detailed mechanism that determines sensitivity and specificity on the microarrays. Consequently, this research will have a wide impact on how microarrays are designed and how the data are interpreted.
