14 results for GENOMIC SEQUENCES

at Brock University, Canada


Relevance: 40.00%

Abstract:

The ease of production and manipulation has made plasmid DNA a prime candidate for use in gene transfer technologies such as gene therapy and DNA vaccines. The major drawback of plasmid DNA, however, is its limited stability within mammalian cells: it is usually lost through cellular mechanisms or by simple dilution during mitosis. This study set out to search for mammalian genomic DNA sequences that would enhance the stability of plasmid DNA in mammalian cells. By creating a plasmid-based genomic DNA library, we were able to screen the human genome by transfecting the library into Human Embryonic Kidney (HEK 293) cells. Cells that contained plasmid DNA were selected using G418 for 14 days. The resulting population was then screened for the presence of biologically active plasmid DNA, using transformation as a detector. A commercially available plasmid DNA isolation kit was modified to extract plasmid DNA from mammalian cells. The standardized protocol had a detection limit of ~0.6 plasmids per cell in one million cells. This allowed for the detection of 45 plasmids that were maintained for 32 days in the HEK 293 cells. Sequencing of selected inserts revealed a significantly higher thymine content than that of the human genome. Sequences with high A/T content have been associated with Scaffold/Matrix Attachment Region (S/MAR) sequences in mammalian cells. Therefore, association with the nuclear matrix might be required for the stability of plasmids in mammalian cells.
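The base-composition comparison described in this abstract can be illustrated with a short calculation. The following is a minimal sketch, not the authors' analysis: the function names and the example sequence are hypothetical, and it simply reports the fraction of each base and the combined A/T fraction of a sequence.

```python
from collections import Counter


def base_fractions(seq: str) -> dict:
    """Return the fraction of A, C, G and T among the unambiguous bases."""
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {b: counts[b] / total for b in "ACGT"}


def at_content(seq: str) -> float:
    """Combined A + T fraction, the quantity associated with S/MAR regions."""
    fractions = base_fractions(seq)
    return fractions["A"] + fractions["T"]


# Hypothetical insert sequence, used only to show the call pattern.
example_insert = "AATTTGC"
print(base_fractions(example_insert))
print(at_content(example_insert))
```

A real comparison would compute these fractions for each sequenced insert and test them against a genome-wide background value, which this sketch does not attempt.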

Relevance: 30.00%

Abstract:

The regenerating urodele limb is a useful model system in which to study, in vivo, the controls of cell proliferation and differentiation. Techniques are available which enable one to experimentally manipulate mitogenic influences upon the blastema, as well as the morphogenesis of the regenerating limb. Although classical regeneration studies have generated a wealth of knowledge concerning tissue interactions, little is known about the process at the level of gene expression. The aim of this project was to clone potentially developmentally regulated genes from a newt genomic library for use in future studies of gene expression during limb regeneration. We decided to clone the cytoskeletal actin gene for the following reasons: (1) its expression reflects the proliferative and differentiative states of cells in other systems; (2) the high copy number of cytoplasmic actin pseudogenes in other vertebrates and the high degree of evolutionary sequence conservation among actin genes increased the chance of cloning one of the newt cytoplasmic actin genes; and (3) preliminary experiments indicated that a newt actin could probably be identified using an available chick β-actin gene as a molecular probe. Two independent recombinant phage clones, containing actin-homologous inserts, were isolated from a newt genomic library by hybridization with the chick actin probe. Restriction mapping identified actin-homologous sequences within the newt DNA inserts, which were subcloned into the plasmid pTZ19R. The recombinant plasmids were transformed into the Escherichia coli strain DH5α. Detailed restriction maps were produced of the 5.7 kb and 3.1 kb newt DNA inserts in the plasmids, designated pTNA1 and pTNA2. The short (<1.3 kb) length of the actin-homologous sequence in pTNA2 indicated that it was possibly a reverse-transcript pseudogene. Problems associated with molecular cloning of DNA sequences from N. viridescens are discussed with respect to the large genome size and abundant highly repetitive DNA sequences.

Relevance: 20.00%

Abstract:

Phascolomyces articulosus genomic DNA was isolated from 48-h-old hyphae and used to amplify a chitin synthase fragment by the polymerase chain reaction. The primers used in the amplification corresponded to two widely conserved amino acid regions found in the chitin synthases of many fungi. Amplification resulted in four bands (approximately 820, 900, 1000 and 1500 bp), as visualized in a 1.2% agarose gel. The lowest band (820 bp) was selected as a candidate for chitin synthase because most regions amplified from other fungi so far have exhibited similar sizes (600-750 bp). The selected fragment was extracted from the gel and cloned into the HincII site of pUC19. The derived plasmid and insert were designated pUC19-PaCHS and PaCHS, respectively. The plasmid pUC19-PaCHS was digested with several restriction enzymes and was found to contain BamHI and HincII sites. Sequencing of PaCHS revealed two intron sequences and a total open reading frame of 200 amino acids. The derived polypeptide was compared with other related sequences from the EMBL database (Heidelberg, Germany) and was matched to 36 other fully or partially sequenced fungal chitin synthase genes. The closest resemblance was with two genes (74.5% and 73.1% identity) from Rhizopus oligosporus. Southern hybridization with the cloned fragment as a probe to the PCR reaction showed a strong signal at the fragment selected for cloning and weaker signals at the other two fragments. Southern hybridization with partially digested Phascolomyces articulosus genomic DNA showed a single band. The amino acid sequence was compared with sequences from other chitin synthase gene classes using the CLUSTALW program. The chitin synthase fragment from Phascolomyces articulosus was initially grouped in class II along with chitin synthase fragments from Rhizopus oligosporus and Phycomyces blakesleeanus, which also belong to the same class, Zygomycetes.
Bootstrap analysis using the neighbor-joining method available in CLUSTALW verified this classification. Comparison of PaCHS revealed conservation of intron positions that are characteristic of chitin synthase gene fragments of zygomycetous fungi.

Relevance: 20.00%

Abstract:

Recombinant adenoviruses are currently under intense investigation as potential gene delivery and gene expression vectors with applications in human and veterinary medicine. As part of our efforts to develop a bovine adenovirus type 2 (BAV2) based vector system, the nucleotide sequence of BAV2 was determined. Sixty-six open reading frames (ORFs) were found with the potential to encode polypeptides at least 50 amino acid (aa) residues long. Thirty-one of the BAV2 polypeptide sequences were found to share homology with already identified adenovirus proteins. The arrangement of the genes revealed that the BAV2 genomic organization closely resembles that of well-characterized human adenoviruses. In the course of this study, continuous propagation of BAV2 over many generations in cell culture resulted in the isolation of a spontaneous BAV2 mutant in which the E3 region was deleted. Restriction enzyme, sequencing and PCR analyses produced concordant results that precisely located the deletion and revealed that its size was exactly 1299 bp. The E3-deleted virus was plaque-purified and further propagated in cell culture. Replication of this virus, which lacks a portion of the E3 region, did not appear to be affected, at least in cell culture. Attempts to rescue a recombinant BAV2 virus with the bacterial kanamycin resistance gene in the E3 region yielded a candidate, as verified by extensive Southern blotting and PCR analyses. Attempts to purify the recombinant virus were not successful, suggesting that this recombinant BAV2 was helper-dependent. Ten clones containing full-length BAV2 genomes in a pWE15 cosmid vector were constructed. The infectivity of these constructs was tested using different transfection methods. The BAV2 genomic clones appeared to be infectious only after an extended incubation period. This may be due to limitations of the various transfection methods tested, or to biological differences between virus-derived and E. coli-derived BAV2 DNA.
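The ORF survey mentioned in this abstract (reading frames able to encode at least 50 residues) can be sketched as a six-frame scan. This is an illustrative sketch only: the abstract does not say how ORFs were defined, so the ATG-to-stop definition, the function names, and the toy sequence below are all assumptions.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_strand(seq: str, min_aa: int) -> list:
    """Scan the three frames of one strand for ATG...stop ORFs.

    Returns (start, end, n_aa) tuples with 0-based, end-exclusive
    coordinates on the scanned strand; n_aa excludes the stop codon.
    """
    found = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                n_aa = (i - start) // 3
                if n_aa >= min_aa:
                    found.append((start, i + 3, n_aa))
                start = None
    return found


def find_orfs(seq: str, min_aa: int = 50) -> dict:
    """Scan both strands; assumed threshold of 50 residues as in the abstract."""
    seq = seq.upper()
    return {
        "+": orfs_in_strand(seq, min_aa),
        "-": orfs_in_strand(reverse_complement(seq), min_aa),
    }


# Toy sequence with a lowered threshold, only to show the call pattern.
print(find_orfs("CCATGAAATTTGGGTAACC", min_aa=3))
```

Coordinates reported for the minus strand refer to the reverse-complemented sequence, so they would need converting before being compared with plus-strand annotations.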

Relevance: 20.00%

Abstract:

Adenoviral vectors are currently the most widely used gene therapy vectors, but their inability to integrate into host chromosomal DNA shortens their transgene expression and limits their use in clinical trials. In this project, we initially planned to develop a technique to test the effect of the early region 1 (E1) on adenovirus integration by comparing the integration efficiencies of an E1-deleted adenoviral vector (SubE1) and an E1-containing vector (SubE3). However, we did not harvest any SubE3 virus, even though we repeated the transfection and successfully rescued the SubE1 virus (2/4 transfections generated viruses) and the positive control virus (6/6). The failure to rescue SubE3 could have been caused by the instability of the genomic plasmid pFG173, as it had frequent internal deletions when we were purifying it. Therefore, we developed techniques to test the effect of E1 on homologous recombination (HR), since the literature suggested that adenovirus integration is initiated by HR. We attempted to silence E1 in 293 cells by transfecting E1A/B-specific small interfering RNA (siRNA). However, no silenced phenotype was observed, even when we varied the concentration of E1A/B siRNA (from 30 nM to 270 nM) and checked the silencing effects at different time points (48, 72, 96 h). One possible explanation is that the E1A/B siRNA sequences are not potent enough to induce the silenced phenotype. For evaluating HR efficiencies, an HR assay system based on bacterial transformation was designed. We constructed two plasmids (designated pUC19-dl1 and pUC19-dl2) containing different defective lacZα cassettes (forming white colonies after transformation) that can generate a functional lacZα cassette (forming blue colonies) through HR after transfection into 293 cells. The HR efficiencies would be expressed as the percentage of blue colonies among all the colonies.
Unfortunately, after transformation of plasmid isolated from 293 cells, no colonies were found, even at a transformation efficiency of 1.8x10^ colonies/µg pUC19, suggesting that the sensitivity of this system was low. To enhance the sensitivity, PCR was used. We designed a set of primers that can only amplify the recombinant plasmid formed through HR. Therefore, the HR efficiencies among different treatments can be evaluated from the amplification results, and this system could be used to test the effect of the E1 region on adenovirus integration. In addition, to our knowledge there were no previous studies using PCR or real-time PCR to evaluate HR efficiency, so this system also provides a PCR-based method for carrying out HR assays.
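The blue/white readout described in this abstract reduces to a single ratio. The following is a minimal sketch of that calculation; the function name and the colony counts are hypothetical, and the `None` return for an empty plate is an assumption made to reflect the case the authors report, where no colonies were recovered.

```python
from typing import Optional


def hr_efficiency(blue: int, white: int) -> Optional[float]:
    """Percentage of blue (recombinant) colonies among all colonies.

    Returns None when no colonies were recovered, because the ratio is
    undefined in that case.
    """
    if blue < 0 or white < 0:
        raise ValueError("colony counts must be non-negative")
    total = blue + white
    if total == 0:
        return None
    return 100.0 * blue / total


# Hypothetical counts, for illustration only.
print(hr_efficiency(3, 97))
print(hr_efficiency(0, 0))
```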

Relevance: 20.00%

Abstract:

The construction of adenovirus vectors for cloning and foreign gene expression requires packaging cell lines that can complement missing viral functions caused by sequence deletions and/or replacement with foreign DNA sequences. In this study, packaging cell lines were designed to provide in trans the missing bovine adenovirus functions, so that recombinant viruses could be generated. Fetal bovine kidney and lung cells, acquired at the trimester term from a pregnant cow, were transfected with both digested wild-type BAV2 genomic DNA and pCMV-E1. The plasmid pCMV-E1 was specifically constructed to express E1 of BAV2 under the control of the cytomegalovirus (CMV) enhancer/promoter. Selection for "true" transformants by continuous passaging was unsuccessful in isolating immortalised cells, since the cells underwent crisis, resulting in complete cell death. Moreover, selection for G418 resistance, using the same cells, also did not result in the isolation of an immortalised cell line, and the same culture-collapse event was observed. The lack of success in establishing an immortalised cell line from fetal tissue prompted us to transfect a pre-established cell line. We began by transfecting MDBK (Madin-Darby bovine kidney) cells with pCMV-E1-neo, which contains the bacterial selectable marker gene neo. A series of MDBK-derived cell lines that constitutively express bovine adenoviral (BAV) early region 1 (E1) were then isolated. Cells selected for resistance to the drug G418 were isolated collectively for full characterisation to assess their suitability as packaging cell lines. Individual colonies were isolated by limiting dilution and further tested for E1 expression and efficiency of DNA uptake. Two cell lines, L-23 and L-24, out of 48 generated foci tested positive for E1 expression using Northern blot analysis.
DNA uptake studies, using both lipofectamine and calcium phosphate methods, were performed to compare these cells, their parental MDBK cells, and the unrelated human 293 cells as a benchmark. The results revealed that the new MDBK-derived clones were no more efficient than MDBK cells in the transient expression of transfected DNA and that they were inferior to 293 cells when using lacZ as the reporter gene. In view of the inherently poor transfection efficiency of MDBK cells and their derivatives, a number of other bovine cells were investigated for their potential as packaging cells. The cell line CCL40 was chosen for its high efficiency in DNA uptake and subsequently transfected with the plasmid vector pCMV-E1-neo. By selection with the drug G418, two cell lines were isolated, ProCell 1 and ProCell 2. These cell lines were tested for E1 expression, permissivity to BAV2 and DNA uptake efficiency, revealing a DNA uptake efficiency of 37%, comparable to that of CCL40. Attempts to rescue BAV2 mutants carrying the lacZ gene in place of E1 or E3 were carried out by co-transfecting wild-type viral DNA with either the plasmid pdlE1E-Z (which contains BAV2 sequences from 0% to 40.4% with the lacZ gene in place of the E1 region from 1.1% to 8.25%) or the plasmid pdlE3-5-Z (which contains BAV2 sequences from 64.8% to 100% with the lacZ gene in place of the E3 region from 75.8% to 81.4%). These co-transfections did not result in the generation of a viral mutant. The lack of mutant generation was thought to be caused by the relative inefficiency of DNA uptake. Consequently, cosBAV2, a cosmid vector carrying the BAV2 genome, was modified to carry the neo reporter gene in place of the E3 region from 75.8% to 81.4%. The use of a single cosmid vector carrying the whole genome would eliminate the need for homologous recombination in order to generate a viral vector. Unfortunately, the transfection of cosBAV2-neo also did not result in the generation of a viral mutant.
This may have been caused by the size of the E3 deletion, where additional sequences that are essential to the virus's survival might have been deleted. As an extension to this study, the spontaneous E3 deletion, accidentally discovered in our viral stock, could be used as a site of foreign gene insertion.

Relevance: 20.00%

Abstract:

Recombinant human adenovirus (Ad) vectors are being extensively explored for their use in gene therapy and recombinant vaccines. Ad vectors are attractive for many reasons, including the fact that (1) they are relatively safe, based on their use as live oral vaccines, (2) they can accept large transgene inserts, (3) they can infect dividing and postmitotic cells, and (4) they can be produced to high titers. However, there are also a number of major problems associated with Ad vectors, including transient foreign gene expression due to host cellular immune responses, problems with humoral immunity, and the creation of replication-competent adenoviruses (RCA). Most Ad vectors contain deletions in the E1 region that allow for insertion of a transgene. However, the E1 gene products are required for replication and thus must be supplied in trans by a helper cell line that will allow for the growth and packaging of the defective virus. For this purpose the 293 cell line (Graham et al., 1977) is used most often; however, homologous recombination between the vector and the cell line often results in the generation of RCA. The presence of RCA in batches of adenoviral vectors for clinical use is a safety risk because they may result in the mobilization and spread of the replication-defective vector viruses, and in significant tissue damage and pathogenicity. The present research focused on altering the 293 cell line such that RCA formation can be eliminated. The strategy to modify the 293 cells involved the removal of the first 380 bp of the adenovirus genome through homologous recombination. The first step towards this goal involved identifying and cloning the left-end cellular-viral junction from 293 cells to assemble the sequences required for homologous recombination. Polymerase chain reaction (PCR) was performed to clone the junction, and the clone was verified through sequencing.
The plasmid PAM2 was then constructed, which served as the targeting cassette used to modify the 293 cells. The cassette consisted of (1) the cellular-viral junction as the left-end region of homology, (2) the neo gene for positive selection upon transfection into 293 cells, (3) the adenoviral genome from bp 380 to bp 3438 as the right-end region of homology, and (4) the HSV-tk gene for negative selection. The plasmid PAM2 was linearized to produce a double-strand break outside the region of homology, and transfected into 293 cells using the calcium phosphate technique. Cells were first selected for their resistance to the drug G418, and subsequently for their resistance to the drug ganciclovir (GANC). From 17 transfections, 100 pools of G418-resistant and GANC-resistant cells were picked using cloning rings and expanded for screening. Genomic DNA was isolated from the pools and screened for the presence of the 380 bp using PCR. Ten of the most promising pools were diluted to single cells and expanded in order to isolate homogeneous cell lines. From these, an additional 100 G418-resistant and GANC-resistant foci were screened. These preliminary screening results appear promising for the detection of the desired cell line. Future work would include further cloning and purification of the promising cell lines that have potentially undergone homologous recombination, in order to isolate a homogeneous cell line of interest.

Relevance: 20.00%

Abstract:

The neuropeptide FMRFamide, with the sequence Phe-Met-Arg-Phe-amide, was originally isolated from the clam Macrocallista nimbosa (Price and Greenberg, 1977). Since its discovery, a large family of FMRFamide-related peptides, termed FaRPs, has been found to be present in all major animal phyla, with functions ranging from modulation of neuronal activity to alteration of muscular contractions. However, little is known about the genes encoding these peptides, especially in invertebrates. As FaRP-encoding genes have yet to be investigated in the invertebrate Malacostracean subphylum, the isolation and characterization of FaRP-encoding DNA and mRNA was pursued in this project. The immediate aims of this thesis were: (1) to amplify mRNA sequences of Procambarus clarkii using a degenerate oligonucleotide primer deduced from the common amino acid sequence of isolated Procambarus FaRPs, (2) to determine whether these amplification products encode FaRP gene sequences, and (3) to create a selective cDNA library of sequences recognized by the degenerate oligonucleotide primer. The polymerase chain reaction - rapid amplification of cDNA ends (PCR-RACE) is a procedure in which a single gene-specific primer is used in conjunction with a generalized 3' or 5' primer to amplify copies of the region between a single point in the transcript and the 3' or 5' end of the cDNA of interest (Frohman et al., 1988). PCR-RACE reactions were optimized with respect to primers used, buffer composition, cycle number, nature of the genetic substrate to be amplified, annealing, extension and denaturation temperatures and times, and use of reamplification procedures. Amplification products were cloned into plasmid vectors and recombinant products were isolated, as were the recombinant plaques formed in the selective cDNA library. Labeled amplification products were hybridized to recombinant bacteriophage to determine whether ligated amplification products were present.
When sequenced, the five isolated PCR-RACE amplification products were determined not to possess FaRP-encoding sequences. The 200 bp, 450 bp, and 1500 bp sequences showed homology to the Caenorhabditis elegans cosmid K09A11, which encodes cytochrome P450, transfer RNA, transposase, and tRNA-Tyr, while the 500 bp and 750 bp sequences showed homology with the complete genome of the Vaccinia virus. Under the employed amplification conditions, the degenerate oligonucleotide primer was observed to bind to and amplify sequences with identity at only 9 or 10 of its 17 bp. The selective cDNA library was observed to be of extremely low titre. When the library titre was increased, white plaques were isolated. Amplification analysis of eight isolated λgt11 sequences from these plaques indicated the absence of an insertion sequence. The degenerate 17-base oligonucleotide primer synthesized from the common amino acid sequence of isolated Procambarus FaRPs was thus determined to be non-specific in its binding under the conditions required for its use, and to be insufficient for the isolation and identification of FaRP-encoding sequences. A more specific primer of longer sequence, lower degeneracy, and higher melting temperature (Tm) is recommended for further investigation into the FaRP-encoding genes of Procambarus clarkii.
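The three primer properties named in this abstract (degeneracy, positions of identity with a binding site, and melting temperature) can each be computed in a few lines. This is a sketch under stated assumptions: the 12-base primer below is a hypothetical back-translation of Phe-Met-Arg-Phe using standard IUPAC ambiguity codes, not the 17-base primer used in the thesis, and the melting temperature uses the Wallace rule of thumb (2 °C per A/T, 4 °C per G/C), which is only a rough estimate for short, non-degenerate oligonucleotides.

```python
from math import prod

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())


def count_matches(primer: str, site: str) -> int:
    """Positions where the site base is allowed by the primer code.

    Both strings are written 5'->3' in the primer's orientation.
    """
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    return sum(s in IUPAC[p] for p, s in zip(primer.upper(), site.upper()))


def wallace_tm(oligo: str) -> int:
    """Wallace-rule melting temperature (deg C) for an unambiguous oligo."""
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


# Hypothetical primer: TTY (Phe) ATG (Met) MGN (Arg) TTY (Phe).
primer = "TTYATGMGNTTY"
print(degeneracy(primer))
print(count_matches(primer, "TTCATGGGATTT"))
print(wallace_tm("TTCATGCGATTT"))
```

Lower degeneracy and a higher melting temperature both push toward the more specific primer the abstract recommends, and `count_matches` gives the kind of "identical positions out of primer length" figure the authors report.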

Relevance: 20.00%

Abstract:

The nucleotide sequence of a genomic DNA fragment previously thought, by genetic criteria, to contain the dihydrofolate reductase gene (DFR1) of Saccharomyces cerevisiae was determined. This DNA fragment of 1784 base pairs contains a large open reading frame from position 800 to 1432, which encodes an enzyme with a predicted molecular weight of 24,229.8 Daltons. Analysis of the amino acid sequence of this protein revealed that the yeast polypeptide contained 211 amino acids, compared to the 186 residues commonly found in the polypeptides of other eukaryotes. The difference in size of the gene product can be attributed mainly to an insert in the yeast gene. Within this region, several consensus sequences required for processing of yeast nuclear and class II mitochondrial introns were identified, but they appear not to be sufficient for RNA splicing. The primary structure of the yeast DHFR protein has considerable sequence homology with analogous polypeptides from other organisms, especially in the consensus residues involved in cofactor and/or inhibitor binding. Analysis of the nucleotide sequence also revealed the presence of a number of canonical sequences identified in yeast as having some function in the regulation of gene expression. These include UAS elements (TGACTC) required for the amino acid general control response, and "TATA" boxes, as well as several consensus sequences thought to be required for transcriptional termination and polyadenylation. Analysis of the codon usage of the yeast DFR1 coding region revealed a codon bias index of 0.0083. This value, very close to zero, suggests that the gene is expressed at a relatively low level under normal physiological conditions. The information concerning the organization of DFR1 was used to construct a variety of fusions of its 5' regulatory region with the coding region of the lacZ gene of E. coli.
Some of these fused genes encoded a fusion product that was expressed in E. coli and/or in yeast under the control of the 5' regulatory elements of DFR1. Further studies with these fusion constructions revealed that the beta-galactosidase activity encoded on multicopy plasmids was stimulated transiently by prior exposure of yeast host cells to UV light. This suggests that the yeast DFR1 gene is induced by UV light and may imply a novel function of the DHFR protein in the cellular responses to DNA damage. Another novel feature of yeast DHFR was revealed during preliminary studies of a diploid strain containing a heterozygous DFR1 null allele. The strain was constructed by insertion of a URA3 gene within the coding region of DFR1. Sporulation of this diploid revealed that meiotic products segregated 2:0 for uracil prototrophy when spore clones were germinated on medium supplemented with 5-formyltetrahydrofolate (folinic acid). This finding suggests that, in addition to its catalytic activity, the DFR1 gene product may play some role in the anabolism of folinic acid. Alternatively, this result may indicate that Ura+ haploid segregants were inviable and suggest that the enzyme has an essential cellular function in this species.
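The coordinates and sizes quoted in this abstract can be cross-checked with simple arithmetic. The sketch below assumes the reading-frame positions are 1-based and inclusive; the abstract does not state whether the stop codon falls inside that interval, so the code only confirms that the span corresponds to a whole number of codons matching the reported residue count.

```python
# Values taken from the abstract; the inclusive-coordinate convention is an assumption.
orf_start, orf_end = 800, 1432
reported_residues = 211
reported_mass_da = 24229.8
typical_eukaryotic_residues = 186

orf_length_nt = orf_end - orf_start + 1          # inclusive span in nucleotides
codons, leftover = divmod(orf_length_nt, 3)      # whole codons and any remainder
extra_residues = reported_residues - typical_eukaryotic_residues
mean_residue_mass = reported_mass_da / reported_residues

print(orf_length_nt, codons, leftover)
print(extra_residues)
print(round(mean_residue_mass, 2))
```

The span divides evenly into 211 codons, and the extra 25 residues relative to other eukaryotic enzymes are what the abstract attributes mainly to an insert in the yeast gene.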

Relevance: 20.00%

Abstract:

One of the main objectives of the mid-Atlantic transect is to improve the dating resolution of sequences and unconformity surfaces. Dinoflagellate cysts from two Ocean Drilling Program boreholes, the onshore Leg 174AX Ocean View Site and the Leg 174A continental shelf Site 1071, are used to provide age estimates for sequences and unconformities formed on the New Jersey continental margin during the Miocene epoch. Despite the occasional lack of dinocysts in barren and oxidized sections, dinocyst biochronology still offers greater age control than that provided by other microfossils in marginal marine environments. An early Miocene to late Miocene chronology based on ages determined for the two study sites is presented. In addition, palynofacies are used to unravel the systems tract character of the Miocene sequences and provide insight into the effects of taphonomy and preservation of palynomorphs in marginal marine and shelf environments under different sea level conditions. More precise placement of maximum flooding surfaces is possible through the identification of condensed sections, and palynofacies shifts can also reveal subaerially exposed sections and surfaces not apparent in seismic or lithological analyses. The problems with the application of the pollen record in the interpretation of Miocene climate are also discussed. Palynomorphs provide evidence for a second-order lowering of sea level during the Miocene, onto which higher-order sea level fluctuations are superimposed. Correlation of sequences and unconformities is attempted between onshore boreholes and from the onshore Ocean View borehole to offshore Site 1071.

Relevance: 20.00%

Abstract:

Children were afforded the opportunity to control the order of repetitions for three novel spatiotemporal sequences. The following was predicted: (a) children and adults in the self-regulated (SELF) groups would produce faster movement times (MT) and reaction times (RT) and greater recall success (RS) during retention compared to the age-matched yoked (YOKE) groups; (b) children would choose to switch sequences less often than adults; (c) adults would produce faster MT and RT and greater RS than the children during acquisition and retention, independent of experimental group. During acquisition, no effects were seen for RS; for MT and RT, however, there were main effects for age and for block. During retention, a main effect for practice condition was seen for RS, but the effect failed to reach statistical significance for MT and RT, thus partially supporting our first and second hypotheses. The third hypothesis was not supported.

Relevance: 20.00%

Abstract:

Retrotransposons, which used to be considered “junk DNA”, have begun to reveal, through recent studies, their immense value to genome evolution and human biology. They constitute at least 45% of the human genome and make up a broadly similar share of other mammalian genomes. Retrotransposon elements (REs) are known to affect the human genome through many different mechanisms, such as generating insertion mutations, genomic instability, and alterations in gene expression. Previous studies have suggested that several RE subfamilies, such as Alu, L1, SVA and LTR, are currently active in the human genome, and that they are an important source of genetic diversity between humans and other primates, as well as among humans. Although several groups have used Retrotransposon Insertion Polymorphisms (RIPs) as markers in studying primate evolutionary history, no study has specifically focused on identifying Human-Specific Retrotransposon Elements (HS-REs) and their roles in human genome evolution. In this study, by computationally comparing the human genome to 4 primate genomes, we identified a total of 18,860 HS-REs, among which are 11,664 Alus, 4,887 L1s, 1,526 SVAs and 783 LTRs (222 full-length entries), representing the largest and most comprehensive list of HS-REs generated to date. Together, these HS-REs contributed a total sequence increase of 14.2 Mb from the inserted REs and Target Site Duplications (TSDs), a 71.6 kb increase from transductions, and a 268.2 kb sequence loss from insertion-mediated deletions, leading to a net increase of ~14 Mb of sequence in the human genome. Furthermore, we observed for the first time that the Y chromosome might be a hot target for new retrotransposon insertions in general and particularly for LTRs. The data also allowed, for the first time, a survey of the frequency of TE insertions inside other TEs in comparison with TE insertions into non-TE regions.
In summary, our data suggest that retrotransposon elements have played a significant role in the evolution of Homo sapiens.
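The totals reported in this abstract can be reproduced from its own figures. The following is a small worked check, assuming 1 Mb = 1000 kb; the variable names are illustrative and no data beyond the numbers quoted in the abstract are used.

```python
# Human-specific retrotransposon counts by family, as quoted in the abstract.
hs_re_counts = {"Alu": 11664, "L1": 4887, "SVA": 1526, "LTR": 783}
total_hs_res = sum(hs_re_counts.values())

# Sequence gained and lost, converted to Mb (assuming 1 Mb = 1000 kb).
insertion_gain_mb = 14.2            # inserted REs plus target site duplications
transduction_gain_mb = 71.6 / 1000  # 71.6 kb from transductions
deletion_loss_mb = 268.2 / 1000     # 268.2 kb lost to insertion-mediated deletion

net_gain_mb = insertion_gain_mb + transduction_gain_mb - deletion_loss_mb

print(total_hs_res)
print(round(net_gain_mb, 4))
```

The family counts sum to the reported 18,860 elements, and the net gain works out to just over 14.0 Mb, consistent with the "~14 Mb" stated in the abstract.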

Relevance: 20.00%

Abstract:

Variations in different types of genomes have been found to be responsible for a large degree of physical diversity, such as appearance and susceptibility to disease. Identification of genomic variations is difficult and can be facilitated through computational analysis of DNA sequences. Newly available technologies are able to sequence billions of DNA base pairs relatively quickly. These sequences can be used to identify variations within their specific genome but must first be mapped to a reference sequence. In order to align these sequences to a reference sequence, we require mapping algorithms that make use of approximate string matching and string indexing methods. To date, few mapping algorithms have been tailored to handle the massive amounts of output generated by newly available sequencing technologies. In order to handle this large amount of data, we modified the popular mapping software BWA to run in parallel using OpenMPI. Parallel BWA matches the efficiency of multithreaded BWA functions while providing efficient parallelism for BWA functions that do not currently support multithreading. Parallel BWA shows significant wall time speedup in comparison to multithreaded BWA on high-performance computing clusters, and will thus facilitate the analysis of genome sequencing data.
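The combination of string indexing and approximate matching mentioned in this abstract can be illustrated with a toy read mapper. This sketch deliberately swaps in a simpler technique: it uses a k-mer hash index with Hamming-distance verification, not the Burrows-Wheeler-transform index that BWA is built on, and it handles substitutions only. All names and sequences are hypothetical.

```python
from collections import defaultdict


def build_index(ref: str, k: int) -> dict:
    """Map every k-mer in the reference to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(ref) - k + 1):
        index[ref[i:i + k]].append(i)
    return index


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def map_read(read: str, ref: str, index: dict, k: int, max_mismatches: int) -> list:
    """Return reference start positions where the read aligns within the
    mismatch budget, using exact k-mer seeds to propose candidates."""
    hits = set()
    for offset in range(len(read) - k + 1):
        for pos in index.get(read[offset:offset + k], ()):
            start = pos - offset
            if start < 0 or start + len(read) > len(ref):
                continue
            if hamming(read, ref[start:start + len(read)]) <= max_mismatches:
                hits.add(start)
    return sorted(hits)


reference = "ACGTACGTTAGCCGATAGGCTA"
kmer_index = build_index(reference, k=4)
# The read differs from reference[8:16] at one position.
print(map_read("TAGCCGTT", reference, kmer_index, k=4, max_mismatches=1))
```

Because each read is mapped independently against a shared read-only index, the reads can be partitioned across workers, which is the property that makes distributing the work across processes attractive.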

Relevance: 20.00%

Abstract:

Human endogenous retroviruses (HERVs) are the result of ancient infections of human germ cells by exogenous retroviruses. HERVs belong to the long terminal repeat (LTR) group of retrotransposons, which comprise ~8% of the human genome. The majority of the HERVs documented have been truncated and/or have incurred lethal mutations and no longer encode functional genes; however, a very small number of HERVs seem to remain capable of making new copies by retrotransposition, as suggested by the identification of a handful of polymorphic HERV insertions in human populations. The objectives of this study were to identify novel HERV insertions via analysis of personal genomic data and to survey the polymorphism levels of new and known HERV insertions in the human genome. Specifically, this study involves the experimental validation of polymorphic HERV insertion candidates predicted by personal genome-based computational prediction and a survey of the polymorphism level within the human population based on a set of 30 diverse human DNA samples. Based on computational analysis of a limited number of personal genome sequences, PCR genotyping aided in the identification of 15 dimorphic, 2 trimorphic and 5 fixed full-length HERV-K insertions not previously investigated. These results suggest that the proliferation rate of HERV-Ks, and perhaps also of other ERVs, in the human genome may be much higher than previously appreciated, and that the recently inserted HERVs exhibit a high level of instability. Throughout this study we have observed the frequent presence of additional forms of genotypes for these HERV insertions, and we propose for the first time the establishment of a new genotype reporting nomenclature to reflect all possible combinations of the pre-integration site, solo-LTR and full-length HERV alleles.
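The "all possible combinations" of the three allele states named in this abstract can be enumerated directly. The sketch below assumes a diploid locus and unordered allele pairs; the slash-separated labels are a hypothetical notation for illustration and are not the nomenclature proposed in the study.

```python
from itertools import combinations_with_replacement

# The three allele states named in the abstract.
ALLELES = ("pre-integration", "solo-LTR", "full-length")


def diploid_genotypes(alleles=ALLELES) -> list:
    """List every unordered pair of alleles a diploid locus could carry."""
    return ["/".join(pair) for pair in combinations_with_replacement(alleles, 2)]


for genotype in diploid_genotypes():
    print(genotype)
```

Three allele states give six diploid genotypes (three homozygous and three heterozygous), which is why a presence/absence scheme with only two states cannot describe every observation at such a locus.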