242 results for Genomic sequence database
Abstract:
Molecular phylogenetic studies of homologous sequences of nucleotides often assume that the underlying evolutionary process was globally stationary, reversible, and homogeneous (SRH), and that a model of evolution with one or more site-specific and time-reversible rate matrices (e.g., the GTR rate matrix) is enough to accurately model the evolution of data over the whole tree. However, an increasing body of data suggests that evolution under these conditions is an exception, rather than the norm. To address this issue, several non-SRH models of molecular evolution have been proposed, but they either ignore heterogeneity in the substitution process across sites (HAS) or assume it can be modeled accurately using the Γ distribution. As an alternative to these models of evolution, we introduce a family of mixture models that approximate HAS without the assumption of an underlying predefined statistical distribution. This family of mixture models is combined with non-SRH models of evolution that account for heterogeneity in the substitution process across lineages (HAL). We also present two algorithms for searching model space and identifying an optimal model of evolution that is less likely to over- or underparameterize the data. The performance of the two new algorithms was evaluated using alignments of nucleotides with 10 000 sites simulated under complex non-SRH conditions on a 25-tipped tree. The algorithms were found to be very successful, identifying the correct HAL model with a 75% success rate (the average success rate for assigning rate matrices to the tree's 48 edges was 99.25%) and, for the correct HAL model, identifying the correct HAS model with a 98% success rate. Finally, parameter estimates obtained under the correct HAL-HAS model were found to be accurate and precise.
The merits of our new algorithms were illustrated with an analysis of 42 337 second codon sites extracted from a concatenation of 106 alignments of orthologous genes encoded by the nuclear genomes of Saccharomyces cerevisiae, S. paradoxus, S. mikatae, S. kudriavzevii, S. castellii, S. kluyveri, S. bayanus, and Candida albicans. Our results show that second codon sites in the ancestral genome of these species contained 49.1% invariable sites, 39.6% variable sites belonging to one rate category (V1), and 11.3% variable sites belonging to a second rate category (V2). The ancestral nucleotide content was found to differ markedly across these three sets of sites, and the evolutionary processes operating at the variable sites were found to be non-SRH and best modeled by a combination of eight edge-specific rate matrices (four for V1 and four for V2). The number of substitutions per site at the variable sites also differed markedly, with sites belonging to V1 evolving slower than those belonging to V2 along the lineages separating the seven species of Saccharomyces. Finally, sites belonging to V1 appeared to have ceased evolving along the lineages separating S. cerevisiae, S. paradoxus, S. mikatae, S. kudriavzevii, and S. bayanus, implying that they might have become so selectively constrained that they could be considered invariable sites in these species.
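As a rough illustration of the kind of mixture model described above, the alignment likelihood for a model with one invariable class and two variable rate categories can be written as below. The notation ($p_I$, $p_{V1}$, $p_{V2}$, $L_i^{(\cdot)}$) is assumed here for illustration and is not taken from the paper; only the three proportions come from the abstract.

```latex
% Sketch only: symbols are illustrative assumptions, not the authors' notation.
% L_i^{(c)} is the likelihood of site i under class c, where each variable
% class (V1, V2) has its own set of edge-specific rate matrices.
\[
  L(D) \;=\; \prod_{i=1}^{n}
  \Bigl[\, p_I \, L_i^{(I)} \;+\; p_{V1} \, L_i^{(V1)} \;+\; p_{V2} \, L_i^{(V2)} \Bigr],
  \qquad p_I + p_{V1} + p_{V2} = 1 .
\]
% With the proportions reported for the yeast data set:
\[
  \hat{p}_I = 0.491, \quad \hat{p}_{V1} = 0.396, \quad \hat{p}_{V2} = 0.113,
  \qquad 0.491 + 0.396 + 0.113 = 1.000 .
\]
```

Under this reading, no parametric distribution of rates across sites is imposed: the class weights and the per-class substitution processes are estimated directly.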
Abstract:
With new national targets for patient flow in public hospitals designed to increase efficiencies in patient care and resource use, better knowledge of events affecting length of stay will support improved bed management and scheduling of procedures. This paper presents a case study involving the integration of material from each of three databases in operation at one tertiary hospital and demonstrates it is possible to follow patient journeys from admission to discharge. What is known about this topic? At present, patient data at one Queensland tertiary hospital are assembled in three information systems: (1) the Hospital Based Corporate Information System (HBCIS), which tracks patients from in-patient admission to discharge; (2) the Emergency Department Information System (EDIS) containing patient data from presentation to departure from the emergency department; and (3) Operation Room Management Information System (ORMIS), which records surgical operations. What does this paper add? This paper describes how a new enquiry tool may be used to link the three hospital information systems for studying the hospital journey through different wards and/or operating theatres for both individual and groups of patients. What are the implications for practitioners? An understanding of the patients’ journeys provides better insight into patient flow and provides the tool for research relating to access block, as well as optimising the use of physical and human resources.
Abstract:
Many protocols have been used for extraction of DNA from Thraustochytrids. These generally involve the use of CTAB, phenol/chloroform and ethanol. They also feature mechanical grinding, sonication, N2 freezing or bead beating. However, the resulting chemical and physical damage to extracted DNA reduces its quality. The methods are also unsuitable for large numbers of samples. Commercially-available DNA extraction kits give better quality and yields but are expensive. Therefore, an optimized DNA extraction protocol was developed which is suitable for Thraustochytrids to both minimise expensive and time-consuming steps prior to DNA extraction and also to improve the yield. The most effective method is a combination of single bead in TissueLyser (Qiagen) and Proteinase K. Results were conclusive: both the quality and the yield of extracted DNA were higher than with any other method giving an average yield of 8.5 µg/100 mg biomass.
Abstract:
The koala (Phascolarctos cinereus) is an Australian marsupial that continues to experience significant population declines. Infectious diseases caused by pathogens such as Chlamydia are proposed to have a major role. Very few species-specific immunological reagents are available, severely hindering our ability to respond to the threat of infectious diseases in the koala. In this study, we utilise data from the sequencing of the koala transcriptome to identify key immunological markers of the koala adaptive immune response and cytokines known to be important in the host response to chlamydial infection in other species. This report describes the identification and preliminary sequence analysis of (1) T lymphocyte glycoprotein markers (CD4, CD8); (2) IL-4, a marker for the Th2 response; (3) cytokines such as IL-6, IL-12 and IL-1β, that have been shown to have a role in chlamydial clearance and pathology in other hosts; and (4) the sequences for the koala immunoglobulins, IgA, IgG, IgE and IgM. These sequences will enable the development of a range of immunological reagents for understanding the koala’s innate and adaptive immune responses, while also providing a resource that will enable continued investigations into the origin and evolution of the marsupial immune system.
Abstract:
Species identification based on short sequences of DNA markers, that is, DNA barcoding, has emerged as an integral part of modern taxonomy. However, software for the analysis of large and multilocus barcoding data sets is scarce. The Basic Local Alignment Search Tool (BLAST) is currently the fastest tool capable of handling large databases (e.g. >5000 sequences), but its accuracy is a concern, and it has been criticized for its local optimization. More accurate software, however, requires sequence alignment or complex calculations, which are time-consuming when dealing with large data sets during data preprocessing or during the search stage. Therefore, it is imperative to develop a practical program for both accurate and scalable species identification for DNA barcoding. In this context, we present VIP Barcoding: user-friendly software with a graphical user interface for rapid DNA barcoding. It adopts a hybrid, two-stage algorithm. First, an alignment-free composition vector (CV) method is utilized to reduce the search space by screening a reference database. The alignment-based K2P distance nearest-neighbour method is then employed to analyse the smaller data set generated in the first stage. In comparison with other software, we demonstrate that VIP Barcoding has (i) higher accuracy than Blastn and several alignment-free methods and (ii) higher scalability than alignment-based distance methods and character-based methods. These results suggest that this platform is able to deal with both large-scale and multilocus barcoding data with accuracy and can contribute to DNA barcoding for modern taxonomy. VIP Barcoding is free and available at http://msl.sls.cuhk.edu.hk/vipbarcoding/.
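The two-stage idea described above (an alignment-free screen followed by a K2P nearest-neighbour search on the shortlist) can be sketched as follows. This is a minimal illustration, not the VIP Barcoding implementation: the function names are hypothetical, plain k-mer frequencies with cosine similarity stand in for the composition vector method, and the sequences are assumed to be already aligned and of equal length for the K2P stage.

```python
import math
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def kmer_profile(seq, k=3):
    """Normalised k-mer frequencies (a simplified stand-in for a composition vector)."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def cosine_similarity(p, q):
    """Cosine similarity between two sparse frequency profiles."""
    dot = sum(v * q.get(key, 0.0) for key, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def k2p_distance(a, b):
    """Kimura two-parameter distance for two aligned, equal-length sequences."""
    pairs = [(x, y) for x, y in zip(a, b) if x in "ACGT" and y in "ACGT"]
    if not pairs:
        return math.inf
    differing = [(x, y) for x, y in pairs if x != y]
    transitions = sum(
        1 for x, y in differing if {x, y} <= PURINES or {x, y} <= PYRIMIDINES
    )
    p = transitions / len(pairs)                      # transition proportion
    q = (len(differing) - transitions) / len(pairs)   # transversion proportion
    inner, outer = 1 - 2 * p - q, 1 - 2 * q
    if inner <= 0 or outer <= 0:
        return math.inf  # distance undefined for saturated comparisons
    return -0.5 * math.log(inner * math.sqrt(outer))


def identify(query, references, k=3, shortlist=2):
    """Stage 1: keep the most similar references; stage 2: K2P nearest neighbour."""
    query_profile = kmer_profile(query, k)
    ranked = sorted(
        references,
        key=lambda name: cosine_similarity(query_profile, kmer_profile(references[name], k)),
        reverse=True,
    )
    candidates = ranked[:shortlist]
    return min(candidates, key=lambda name: k2p_distance(query, references[name]))


# Hypothetical toy reference set, for illustration only.
REFERENCES = {"sp1": "ACGTACGTAC", "sp2": "ACGTACGTAT", "sp3": "TTTTTTTTTT"}
best_match = identify("ACGTACGTAT", REFERENCES)
```

The point of the first stage is that the expensive alignment-based distance is computed only for the shortlist, so the cost of the second stage does not grow with the size of the reference database.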
Abstract:
Chlamydia pecorum is globally associated with several ovine diseases including keratoconjunctivitis and polyarthritis. The exact relationship between the variety of C. pecorum strains reported and the diseases described in sheep remains unclear, challenging efforts to accurately diagnose and manage infected flocks. In the present study, we applied C. pecorum multi-locus sequence typing (MLST) to C. pecorum positive samples collected from sympatric flocks of Australian sheep presenting with conjunctivitis, conjunctivitis with polyarthritis, polyarthritis only, or no clinical disease (NCD) in order to elucidate the exact relationships between the infecting strains and the range of diseases. Using Bayesian phylogenetic and cluster analyses on 62 C. pecorum positive ocular, vaginal and rectal swab samples from sheep presenting with a range of diseases, and in comparison with C. pecorum sequence types (STs) from other hosts, one ST (ST 23) was recognised as a globally distributed strain associated with ovine and bovine diseases such as polyarthritis and encephalomyelitis. A second ST (ST 69), presently only described in Australian animals, was detected in association with ovine as well as koala chlamydial infections. The majority of vaginal and rectal C. pecorum STs from animals with NCD, and/or from anatomical sites with no clinical signs of disease in diseased animals, clustered together in a separate group in both analyses. Furthermore, 8/13 detected STs were novel. This study provides a platform for strain selection for further research into the pathogenic potential of C. pecorum in animals and highlights targets for potential strain-specific diagnostic test development.
Abstract:
Thymine DNA glycosylase (TDG) functions in base excision repair, a DNA repair pathway that acts in a lesion-specific manner to correct individual damaged or altered bases. TDG preferentially catalyzes the removal of thymine and uracil paired with guanine, and is also active on 5-fluorouracil (5-FU) paired with adenine or guanine. The rs4135113 single nucleotide polymorphism (SNP) of TDG is found in 10% of the global population. This coding SNP results in the alteration of Gly199 to Ser. Gly199 is part of a loop responsible for stabilizing the flipped abasic nucleotide in the active site pocket. Biochemical analyses indicate that G199S exhibits tighter binding to both its substrate and abasic product. The persistent accumulation of abasic sites in cells expressing G199S leads to the induction of double-strand breaks (DSBs). Cells expressing the G199S variant also activate a DNA damage response. When expressed in cells, G199S induces genomic instability and cellular transformation. Together, these results suggest that individuals harboring the G199S variant may have increased risk for developing cancer.
Abstract:
Copy number variations (CNVs) as described in the healthy population are purported to contribute significantly to genetic heterogeneity. Recent studies have described CNVs using lymphoblastoid cell lines or by application of specifically developed algorithms to interrogate previously described data. However, the full extent of CNVs remains unclear. Using a high-density SNP array, we have undertaken a comprehensive investigation of chromosome 18 for CNV discovery and characterisation of distribution and association with chromosome architecture. We identified 399 CNVs, of which 98% are losses, 58% are less than 2.5 kb in size and 71% are intergenic. Intronic deletions account for the majority of copy number changes with gene involvement. Furthermore, one-third of CNVs do not have putative breakpoints within repetitive sequences. We conclude that replicative processes, mediated either by repetitive elements or microhomology, account for the majority of CNVs in the healthy population. Genomic instability involving the formation of a non-B structure is demonstrated in one region.
Abstract:
MicroRNAs (miRNAs) are small non-coding RNAs of 20 nt in length that are capable of modulating gene expression post-transcriptionally. Although miRNAs have been implicated in cancer, including breast cancer, the regulation of miRNA transcription and the role of defects in this process in cancer is not well understood. In this study we have mapped the promoters of 93 breast cancer-associated miRNAs, and then looked for associations between DNA methylation of 15 of these promoters and miRNA expression in breast cancer cells. The miRNA promoters with clearest association between DNA methylation and expression included a previously described and a novel promoter of the Hsa-mir-200b cluster. The novel promoter of the Hsa-mir-200b cluster, denoted P2, is located 2 kb upstream of the 5′ stemloop and maps within a CpG island. P2 has comparable promoter activity to the previously reported promoter (P1), and is able to drive the expression of miR-200b in its endogenous genomic context. DNA methylation of both P1 and P2 was inversely associated with miR-200b expression in eight out of nine breast cancer cell lines, and in vitro methylation of both promoters repressed their activity in reporter assays. In clinical samples, P1 and P2 were differentially methylated with methylation inversely associated with miR-200b expression. P1 was hypermethylated in metastatic lymph nodes compared with matched primary breast tumours whereas P2 hypermethylation was associated with loss of either oestrogen receptor or progesterone receptor. Hypomethylation of P2 was associated with gain of HER2 and androgen receptor expression. These data suggest an association between miR-200b regulation and breast cancer subtype and a potential use of DNA methylation of miRNA promoters as a component of a suite of breast cancer biomarkers.
Abstract:
Objectives. Strong genetic association of rheumatoid arthritis (RA) with PADI4 (peptidyl arginine deiminase) has previously been described in Japanese, although this was not confirmed in a subsequent study in the UK. We therefore undertook a further study of genetic association between PADI4 and RA in UK Caucasians and also studied expression of PADI4 in the peripheral blood of patients with RA. Methods. Seven single-nucleotide polymorphisms (SNPs) were genotyped using polymerase chain reaction (PCR)-restriction fragment length polymorphism in 111 RA cases and controls. A marker significantly associated with RA (PADI4_100, rs#2240339) in this first data set (P = 0.03) was then tested for association in a larger group of 439 RA patients and 428 controls. PADI4 transcription was also assessed by real-time quantitative PCR using RNA extracted from peripheral blood mononuclear cells from 13 RA patients and 11 healthy controls. Results. A single SNP was weakly associated with RA (P = 0.03) in the initial case-control study. A single SNP (PADI4_100) and a two-marker haplotype of that SNP and the neighbouring SNP (PADI4_04) were significantly associated with RA (P = 0.02 and P = 0.03, respectively). PADI4_100 was not associated with RA in a second sample set. PADI4 expression was four times greater in cases than controls (P = 0.004), but expression levels did not correlate with the levels of markers of inflammation. Conclusion. PADI4 is significantly overexpressed in the blood of RA patients but genetic variation within PADI4 is not a major risk factor for RA in Caucasians.
Abstract:
The authors report an in vivo human examination of carotid atheroma by using the inversion-recovery ON resonance (IRON) sequence, which is able to produce positive contrast after the infusion of an ultrasmall superparamagnetic iron oxide (USPIO) contrast medium. This technique provides a method of potentially identifying inflammatory burden within carotid atheroma. This may be particularly useful for further risk stratification of patients who currently do not meet criteria for intervention (ie, moderate symptomatic stenosis or <70% asymptomatic stenosis). A 63-year-old man was imaged at 1.5 T before and 36 hours after USPIO infusion by using the IRON sequence. Regions of interest showing profound signal loss at T2*-weighted imaging corresponded well with regions of positive contrast at IRON imaging after the administration of USPIO. These regions also showed a profound decrease in T2* measurements after USPIO infusion, whereas surrounding tissue did not. It has been shown that such strong signal loss on T2*-weighted images after USPIO infusion is indicative of USPIO uptake.
Abstract:
This study investigates the use of unsupervised features derived from word embedding approaches and novel sequence representation approaches for improving clinical information extraction systems. Our results corroborate previous findings that indicate that the use of word embeddings significantly improves the effectiveness of concept extraction models; however, we further determine the influence of the corpora used to generate such features. We also demonstrate the promise of sequence-based unsupervised features for further improving concept extraction.
Abstract:
This research examined the influence of tectonic activity on submarine sedimentation processes, through a deposit-based analysis of turbidites in outcrop. A comprehensive field study of the Miocene Whakataki Formation yielded significant data that were analysed using methods of process sedimentology, stratigraphy, and ichnology. Signatures of the tectonically active depositional environment were identifiable at very high resolution, from grain composition and texture to trace-fossil assemblages, as well as on a broader scale in stratigraphic stacking patterns and structural deformation. From these results and environmental interpretations, an original facies characterisation and conceptual depositional model have been established.
Abstract:
Background Fusion transcripts are found in many tissues and have the potential to create novel functional products. Here, we investigate the genomic sequences around fusion junctions to better understand the transcriptional mechanisms mediating fusion transcription/splicing. We analyzed data from prostate (cancer) cells as previous studies have shown extensively that these cells readily undergo fusion transcription. Results We used the FusionMap program to identify high-confidence fusion transcripts from RNAseq data. The RNAseq datasets were from our (N = 8) and other (N = 14) clinical prostate tumors with adjacent non-cancer cells, and from the LNCaP prostate cancer cell line that was mock-, androgen- (DHT), and anti-androgen- (bicalutamide, enzalutamide) treated. In total, 185 fusion transcripts were identified from all RNAseq datasets. The majority (76 %) of these fusion transcripts were ‘read-through chimeras’ derived from adjacent genes in the genome. Characterization of sequences at fusion loci was carried out using a combination of the FusionMap program, custom Perl scripts, and the RNAfold program. Our computational analysis indicated that most fusion junctions (76 %) use the consensus GT-AG intron donor-acceptor splice site, and most fusion transcripts (85 %) maintained the open reading frame. We assessed whether the parental genes of fusion transcripts have the potential to form complementary base pairing with each other, which might bring them into physical proximity. Our computational analysis of sequences flanking fusion junctions at parental loci indicates that these loci have a similar propensity to hybridize as non-fusion loci. The abundance of repetitive sequences at fusion and non-fusion loci was also investigated given that SINE repeats are involved in aberrant gene transcription. We found few instances of repetitive sequences at both fusion and non-fusion junctions.
Finally, RT-qPCR was performed on RNA from both clinical prostate tumors and adjacent non-cancer cells (N = 7), and LNCaP cells treated as above to validate the expression of seven fusion transcripts and their respective parental genes. We reveal that fusion transcript expression is similar to the expression of parental genes. Conclusions Fusion transcripts maintain the open reading frame, and likely use the same transcriptional machinery as non-fusion transcripts as they share many genomic features at splice/fusion junctions.
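The two junction-level checks reported above (use of the consensus GT-AG donor-acceptor pair and maintenance of the open reading frame) can be sketched as follows. This is an illustrative outline only, not the FusionMap output format or the authors' Perl scripts: the `Junction` record, its field names, and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Junction:
    """Hypothetical record for one fusion junction (field names are assumptions)."""
    donor_dinucleotide: str         # first two intronic bases after the 5' partner exon
    acceptor_dinucleotide: str      # last two intronic bases before the 3' partner exon
    upstream_coding_bases: int      # coding bases contributed by the 5' partner
    downstream_codon_position: int  # native codon position (0, 1 or 2) of the first 3' base


def uses_canonical_splice_site(junction):
    """True when the junction uses the consensus GT-AG donor-acceptor pair."""
    return (
        junction.donor_dinucleotide.upper() == "GT"
        and junction.acceptor_dinucleotide.upper() == "AG"
    )


def keeps_reading_frame(junction):
    """True when the 3' partner continues in its native codon phase."""
    return junction.upstream_coding_bases % 3 == junction.downstream_codon_position


def summarize(junctions):
    """Fractions of junctions that are canonical and that stay in frame."""
    total = len(junctions)
    canonical = sum(uses_canonical_splice_site(j) for j in junctions)
    in_frame = sum(keeps_reading_frame(j) for j in junctions)
    return {
        "canonical_fraction": canonical / total,
        "in_frame_fraction": in_frame / total,
    }


# Made-up junctions, for illustration only.
EXAMPLE = [
    Junction("GT", "AG", 300, 0),
    Junction("GT", "AG", 301, 0),
    Junction("GC", "AG", 302, 2),
]
summary = summarize(EXAMPLE)
```

The frame check relies on the fact that, after n coding bases from the 5' partner, the next base occupies codon position n mod 3, so the fusion stays in frame only when that matches the native position of the first 3' partner base.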
Abstract:
This paper describes a vision-only system for place recognition in environments that are traversed at different times of day, when changing conditions drastically affect visual appearance, and at different speeds, where places aren’t visited at a consistent linear rate. The major contribution is the removal of wheel-based odometry from the previously presented algorithm (SMART), allowing the technique to operate on any camera-based device; in our case a mobile phone. While we show that the direct application of visual odometry (VO) to our night-time datasets does not achieve a level of performance typically needed, the VO requirements of SMART are orthogonal to typical usage: firstly only the magnitude of the velocity is required, and secondly the calculated velocity signal only needs to be repeatable in any one part of the environment over day and night cycles, but not necessarily globally consistent. Our results show that the smoothing effect of motion constraints is highly beneficial for achieving a locally consistent, lighting-independent velocity estimate. We also show that the advantage of our patch-based technique used previously for frame recognition, surprisingly, does not transfer to VO, where SIFT demonstrates equally good performance. Nevertheless, we present the SMART system using only vision, which performs sequence-based place recognition in extreme low-light conditions where standard 6-DOF VO fails and that improves place recognition performance over odometry-less benchmarks, approaching that of wheel odometry.