Digital Library

183 results for GENOMIC SEQUENCE

Fusion transcript loci share many genomic features with non-fusion loci

Relevance: 20.00%

Abstract:

Background: Fusion transcripts are found in many tissues and have the potential to create novel functional products. Here, we investigate the genomic sequences around fusion junctions to better understand the transcriptional mechanisms mediating fusion transcription/splicing. We analyzed data from prostate (cancer) cells, as previous studies have shown extensively that these cells readily undergo fusion transcription.

Results: We used the FusionMap program to identify high-confidence fusion transcripts from RNAseq data. The RNAseq datasets were from our (N = 8) and other (N = 14) clinical prostate tumors with adjacent non-cancer cells, and from LNCaP prostate cancer cells that were mock-, androgen- (DHT), and anti-androgen- (bicalutamide, enzalutamide) treated. In total, 185 fusion transcripts were identified from all RNAseq datasets. The majority (76 %) of these fusion transcripts were ‘read-through chimeras’ derived from adjacent genes in the genome. Characterization of sequences at fusion loci was carried out using a combination of the FusionMap program, custom Perl scripts, and the RNAfold program. Our computational analysis indicated that most fusion junctions (76 %) use the consensus GT-AG intron donor-acceptor splice site, and most fusion transcripts (85 %) maintained the open reading frame. We assessed whether parental genes of fusion transcripts have the potential to form complementary base pairs with each other, which might bring them into physical proximity. Our computational analysis of sequences flanking fusion junctions at parental loci indicates that these loci have a similar propensity to hybridize as non-fusion loci. The abundance of repetitive sequences at fusion and non-fusion loci was also investigated, given that SINE repeats are involved in aberrant gene transcription. We found few instances of repetitive sequences at both fusion and non-fusion junctions. Finally, RT-qPCR was performed on RNA from both clinical prostate tumors and adjacent non-cancer cells (N = 7), and from LNCaP cells treated as above, to validate the expression of seven fusion transcripts and their respective parental genes. We reveal that fusion transcript expression is similar to the expression of parental genes.

Conclusions: Fusion transcripts maintain the open reading frame, and likely use the same transcriptional machinery as non-fusion transcripts, as they share many genomic features at splice/fusion junctions.
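The two junction properties reported in this abstract (use of the consensus GT-AG splice site and maintenance of the open reading frame) can be illustrated with a minimal Python sketch. The function names, arguments, and example sequences below are hypothetical and chosen only for illustration; this is not the FusionMap or custom Perl pipeline the authors used.

```python
def uses_gt_ag(intron: str) -> bool:
    """Return True if the intron starts with GT (donor) and ends with AG (acceptor)."""
    s = intron.upper()
    return len(s) >= 4 and s.startswith("GT") and s.endswith("AG")


def keeps_reading_frame(five_prime_coding_bases: int,
                        three_prime_coding_bases_skipped: int) -> bool:
    """Return True if the 3' partner is read in its native frame after fusion.

    five_prime_coding_bases: coding bases contributed by the 5' gene
        upstream of the junction (hypothetical input).
    three_prime_coding_bases_skipped: coding bases of the 3' gene that lie
        upstream of the junction in its native transcript and are lost.
    The frame is preserved when the two counts differ by a multiple of 3.
    """
    return (five_prime_coding_bases - three_prime_coding_bases_skipped) % 3 == 0


if __name__ == "__main__":
    print(uses_gt_ag("GTAAGTccccttttAG"))   # canonical donor/acceptor pair
    print(uses_gt_ag("GCAAGTccccttttAG"))   # non-canonical GC donor
    print(keeps_reading_frame(30, 12))      # difference of 18 bases: in frame
    print(keeps_reading_frame(31, 12))      # difference of 19 bases: frameshift
```

The frame test assumes both counts are measured from each gene's own start codon; a real analysis would derive them from transcript annotations.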

Repeatable Condition-Invariant Visual Odometry for Sequence-Based Place Recognition

Relevance: 20.00%

Abstract:

This paper describes a vision-only system for place recognition in environments that are traversed at different times of day, when changing conditions drastically affect visual appearance, and at different speeds, where places aren’t visited at a consistent linear rate. The major contribution is the removal of wheel-based odometry from the previously presented algorithm (SMART), allowing the technique to operate on any camera-based device; in our case a mobile phone. While we show that the direct application of visual odometry to our night-time datasets does not achieve a level of performance typically needed, the VO requirements of SMART are orthogonal to typical usage: firstly only the magnitude of the velocity is required, and secondly the calculated velocity signal only needs to be repeatable in any one part of the environment over day and night cycles, but not necessarily globally consistent. Our results show that the smoothing effect of motion constraints is highly beneficial for achieving a locally consistent, lighting-independent velocity estimate. We also show that the advantage of our patch-based technique used previously for frame recognition, surprisingly, does not transfer to VO, where SIFT demonstrates equally good performance. Nevertheless, we present the SMART system using only vision, which performs sequence-based place recognition in extreme low-light conditions where standard 6-DOF VO fails and that improves place recognition performance over odometry-less benchmarks, approaching that of wheel odometry.

Complex symbolic sequence clustering and multiple classifiers for predictive process monitoring

Relevance: 20.00%

Abstract:

This paper addresses the following predictive business process monitoring problem: Given the execution trace of an ongoing case, and given a set of traces of historical (completed) cases, predict the most likely outcome of the ongoing case. In this context, a trace refers to a sequence of events with corresponding payloads, where a payload consists of a set of attribute-value pairs. Meanwhile, an outcome refers to a label associated with completed cases, for example, a label indicating that a given case completed “on time” (with respect to a given desired duration) or “late”, or a label indicating whether a given case led to a customer complaint. The paper tackles this problem via a two-phased approach. In the first phase, prefixes of historical cases are encoded using complex symbolic sequences and clustered. In the second phase, a classifier is built for each of the clusters. To predict the outcome of an ongoing case at runtime given its (uncompleted) trace, we select the closest cluster(s) to the trace in question and apply the respective classifier(s), taking into account the Euclidean distance of the trace from the center of the clusters. We consider two families of clustering algorithms – hierarchical clustering and k-medoids – and use random forests for classification. The approach was evaluated on four real-life datasets.
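The runtime step described above (pick the closest cluster by Euclidean distance, then apply that cluster's classifier) can be sketched as follows. This is a simplified illustration under stated assumptions: traces are already encoded as numeric vectors, only the single closest cluster is used, and the per-cluster classifiers are plain Python callables standing in for the random forests mentioned in the abstract. All names are hypothetical.

```python
import math
from typing import Callable, Dict, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_outcome(trace_vec: Vector,
                    centers: Dict[str, Vector],
                    classifiers: Dict[str, Callable[[Vector], str]]) -> str:
    """Select the cluster whose center is nearest to the trace and
    return the label predicted by that cluster's classifier."""
    closest = min(centers, key=lambda name: euclidean(trace_vec, centers[name]))
    return classifiers[closest](trace_vec)


if __name__ == "__main__":
    # Hypothetical cluster centers and stand-in classifiers.
    centers = {"short_cases": [0.0, 0.0], "long_cases": [10.0, 10.0]}
    classifiers = {
        "short_cases": lambda v: "on time",
        "long_cases": lambda v: "late",
    }
    print(predict_outcome([1.0, 1.0], centers, classifiers))
    print(predict_outcome([9.0, 8.0], centers, classifiers))
```

Using several close clusters, as the abstract allows, would require combining the classifiers' outputs, for instance with distance-based weights; that variant is not shown here.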

Genomic inflation factors under polygenic inheritance

Relevance: 20.00%

Abstract:

Population structure, including population stratification and cryptic relatedness, can cause spurious associations in genome-wide association studies (GWAS). Usually, the scaled median or mean test statistic for association calculated from multiple single-nucleotide polymorphisms across the genome is used to assess such effects, and 'genomic control' can be applied subsequently to adjust test statistics at individual loci by a genomic inflation factor. Published GWAS have clearly shown that there are many loci underlying genetic variation for a wide range of complex diseases and traits, implying that a substantial proportion of the genome should show inflation of the test statistic. Here, we show by theory, simulation and analysis of data that in the absence of population structure and other technical artefacts, but in the presence of polygenic inheritance, substantial genomic inflation is expected. Its magnitude depends on sample size, heritability, linkage disequilibrium structure and the number of causal variants. Our predictions are consistent with empirical observations on height in independent samples of ~4000 and ~133,000 individuals.
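For readers unfamiliar with the quantity discussed here, a common median-based definition of the genomic inflation factor divides the median of the observed 1-df chi-square association statistics by the median of the chi-square distribution with one degree of freedom (approximately 0.4549). The sketch below uses that definition; the function names and the toy statistics are assumptions for illustration and do not reproduce the paper's analysis.

```python
import statistics
from typing import List, Sequence

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation_factor(chi2_stats: Sequence[float]) -> float:
    """Median-based lambda: observed median statistic over the expected median."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


def genomic_control(chi2_stats: Sequence[float]) -> List[float]:
    """Rescale every test statistic by the estimated inflation factor."""
    lam = genomic_inflation_factor(chi2_stats)
    return [x / lam for x in chi2_stats]


if __name__ == "__main__":
    toy_stats = [0.2, 0.9098728, 3.0]   # made-up 1-df chi-square statistics
    print(genomic_inflation_factor(toy_stats))
    print(genomic_control(toy_stats))
```

The abstract's point is that a value of this factor above 1 does not by itself indicate population structure, because polygenic inheritance alone is expected to raise it.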

Sequence variants in three loci influence monocyte counts and erythrocyte volume

Relevance: 20.00%

Abstract:

Blood cells participate in vital physiological processes, and their numbers are tightly regulated so that homeostasis is maintained. Disruption of key regulatory mechanisms underlies many blood-related Mendelian diseases but also contributes to more common disorders, including atherosclerosis. We searched for quantitative trait loci (QTL) for hematology traits through a whole-genome association study, because these could provide new insights into both hemopoietic and disease mechanisms. We tested 1.8 million variants for association with 13 hematology traits measured in 6015 individuals from the Australian and Dutch populations. These traits included hemoglobin composition, platelet counts, and red blood cell and white blood cell indices. We identified three regions of strong association that, to our knowledge, have not been previously reported in the literature. The first was located in an intergenic region of chromosome 9q31 near LPAR1, explaining 1.5% of the variation in monocyte counts (best SNP rs7023923, p = 8.9 × 10^-14). The second locus was located on chromosome 6p21 and associated with mean cell erythrocyte volume (rs12661667, p = 1.2 × 10^-9, 0.7% variance explained) in a region that spanned five genes, including CCND3, a member of the D-cyclin gene family that is involved in hematopoietic stem cell expansion. The third region was also associated with erythrocyte volume and was located in an intergenic region on chromosome 6q24 (rs592423, p = 5.3 × 10^-9, 0.6% variance explained). All three loci replicated in an independent panel of 1543 individuals (p values = 0.001, 9.9 × 10^-5, and 7 × 10^-5, respectively). The identification of these QTL provides new opportunities for furthering our understanding of the mechanisms regulating hemopoietic cell fate.

Multi-species sequence comparison reveals conservation of ghrelin gene-derived splice variants encoding a truncated ghrelin peptide

Relevance: 20.00%

Abstract:

The peptide hormone ghrelin is a potent orexigen produced predominantly in the stomach. It has a number of other biological actions, including roles in appetite stimulation, energy balance, the stimulation of growth hormone release and the regulation of cell proliferation. Recently, several ghrelin gene splice variants have been described. Here, we attempted to identify conserved alternative splicing of the ghrelin gene by cross-species sequence comparisons. We identified a novel human exon 2-deleted variant and provide preliminary evidence that this splice variant and in1-ghrelin encode a C-terminally truncated form of the ghrelin peptide, termed minighrelin. These variants are expressed in humans and mice, demonstrating conservation of alternative splicing spanning 90 million years. Minighrelin appears to have similar actions to full-length ghrelin, as treatment with exogenous minighrelin peptide stimulates appetite and feeding in mice. Forced expression of the exon 2-deleted preproghrelin variant mirrors the effect of the canonical preproghrelin, stimulating cell proliferation and migration in the PC3 prostate cancer cell line. This is the first study to characterise an exon 2-deleted preproghrelin variant and to demonstrate sequence conservation of ghrelin gene-derived splice variants that encode a truncated ghrelin peptide. This adds further impetus for studies into the alternative splicing of the ghrelin gene and the function of novel ghrelin peptides in vertebrates.

A transcriptomic analysis of the kidney tissue of Tra catfish (Pangasianodon hypophthalmus) reared in saline condition: De novo assembly, annotation, SNP discovery

Relevance: 20.00%

Abstract:

Pangasianodon hypophthalmus is a commercially important freshwater fish used in inland aquaculture in the Mekong Delta, Vietnam. The current study used Ion Torrent technology to generate EST resources from the kidney of Tra catfish reared at a salinity level of 9 ppt. After trimming and processing, we obtained 2,623,929 reads with an average length of 104 bp. De novo assemblies were generated using CLC Genomics Workbench, Trinity and Velvet/Oases, with the best overall contig performance resulting from the CLC assembly. De novo assembly using CLC yielded 29,940 contigs, allowing identification of 5,710 putative genes when compared with the NCBI non-redundant database. A large number of single nucleotide polymorphisms (SNPs) were also detected. The sequence collection generated in our study represents the most comprehensive transcriptomic resource for P. hypophthalmus available to date.

Molecular characterisation of the Vacuolating Autotransporter Toxin in Uropathogenic Escherichia coli

Relevance: 20.00%

Abstract:

The vacuolating autotransporter (AT) toxin (Vat) contributes to Uropathogenic Escherichia coli (UPEC) fitness during systemic infection. Here we characterised Vat and investigated its regulation in UPEC. We assessed the prevalence of vat in a collection of 45 UPEC urosepsis strains and showed that it was present in 31 (68%) of the isolates. The isolates containing the vat gene corresponded to three major E. coli sequence types (ST12, 73 and 95) and these strains secreted the Vat protein. Further analysis of the vat genomic locus identified a conserved gene located directly downstream of vat that encodes a putative MarR-like transcriptional regulator, which we termed vatX. The vat-vatX genes were present in the UPEC reference strain CFT073, and RT-PCR revealed that both genes are co-transcribed. Over-expression of vatX in CFT073 led to a 3-fold increase in vat gene transcription. The vat promoter region contained three putative nucleation sites for the global transcriptional regulator H-NS; thus the hns gene was mutated in CFT073 (to generate CFT073hns). Western blot analysis using a Vat-specific antibody revealed a significant increase in Vat expression in CFT073hns compared to wild-type CFT073. Direct H-NS binding to the vat promoter region was demonstrated using purified H-NS in combination with electrophoretic mobility shift assays. Finally, Vat-specific antibodies were detected in plasma samples from urosepsis patients infected by vat-containing UPEC strains, demonstrating that Vat is expressed during infection. Overall, this study has demonstrated that Vat is a highly prevalent and tightly regulated immunogenic SPATE secreted by UPEC during infection.

Characterization of sequence and structural features of the Candida krusei Enolase

Relevance: 20.00%

Abstract:

The incidence of human infections by the fungal pathogen Candida species has been increasing in recent years. Enolase is an essential protein in fungal metabolism. Sequence data is available for human and a number of medically important fungal species. An understanding of the structural and functional features of fungal enolases may provide the structural basis for their use as a target for the development of new anti-fungal drugs. We have obtained the sequence of the enolase of Candida krusei (C. krusei), as it is a medically important fungal pathogen. We then used multiple sequence alignments with various enolase isoforms to identify C. krusei-specific amino acid residues. The phylogenetic tree of enolases shows that the C. krusei enolase groups with the fungal genes. Importantly, C. krusei lacks four amino acids in the active site compared to human enolase, as revealed by multiple sequence alignments. These differences in the substrate binding site may be exploited for the design of new anti-fungal drugs to selectively block this enzyme. The lack of these important amino acids in the active site also indicates that C. krusei enolase might have evolved as a member of a mechanistically diverse enolase superfamily catalyzing somewhat different reactions.

Draft genome sequence of Caloramator mitchellensis, a thermoanaerobe isolated from the waters of the Great Artesian Basin

Relevance: 20.00%

Abstract:

The genome sequence of Caloramator mitchellensis strain VF08, a rod-shaped, heterotrophic, strictly anaerobic bacterium isolated from the free-flowing waters of a Great Artesian Basin (GAB) bore well located in Mitchell, an outback Queensland town in Australia, is reported here. The analysis of the 2.42-Mb genome sequence indicates that the attributes of the genome are consistent with its physiological and phenotypic traits.

Association of FOXE1 Polyalanine Repeat Region with Papillary Thyroid Cancer

Relevance: 20.00%

Abstract:

CONTEXT: Polyalanine tract variations in transcription factors have been identified for a wide spectrum of developmental disorders. The thyroid transcription factor forkhead factor E1 (FOXE1) contains a polymorphic polyalanine tract with 12-22 alanines. Single-nucleotide polymorphisms (SNP) close to this locus are associated with papillary thyroid cancer (PTC), and a strong linkage disequilibrium block extends across this region. OBJECTIVE: The objective of the study was to assess whether the FOXE1 polyalanine repeat region was associated with PTC and to assess the effect of polyalanine repeat region variants on protein expression, DNA binding, and transcriptional function on FOXE1-responsive promoters. DESIGN: This was a case-control study. SETTING: The study was conducted at a tertiary referral hospital. PATIENTS AND METHODS: The FOXE1 polyalanine repeat region and tag SNP were genotyped in 70 PTC, with a replication in a further 92 PTC, and compared with genotypes in 5767 healthy controls (including 5667 samples from the Wellcome Trust Case Control Consortium). In vitro studies were performed to examine the protein expression, DNA binding, and transcriptional function for FOXE1 variants of different polyalanine tract lengths. RESULTS: All the genotyped SNP were in tight linkage disequilibrium, including the FOXE1 polyalanine repeat region. We confirmed the strong association of rs1867277 with PTC (overall P = 1 × 10^-7, odds ratio 1.84, confidence interval 1.31-2.57). rs1867277 was in tight linkage disequilibrium with the FOXE1 polyalanine repeat region (r^2 = 0.95). FOXE1(16Ala) was associated with PTC with an odds ratio of 2.23 (confidence interval 1.42-3.50; P = 0.0005). Functional studies in vitro showed that FOXE1(16Ala) was transcriptionally impaired compared with FOXE1(14Ala), which was not due to differences in protein expression or DNA binding. CONCLUSIONS: We have confirmed the previous association of FOXE1 with PTC. Our data suggest that the coding polyalanine expansion in FOXE1 may be responsible for the observed association between FOXE1 and PTC.

Next-generation sequencing: A frameshift in skeletal dysplasia gene discovery

Relevance: 20.00%

Abstract:

In the last decade, huge breakthroughs in genetics - driven by new technology and different statistical approaches - have resulted in a plethora of new disease genes identified for both common and rare diseases. Massive parallel sequencing, commonly known as next-generation sequencing, is the latest advance in genetics, and has already facilitated the discovery of the molecular cause of many monogenic disorders. This article describes this new technology and reviews how this approach has been used successfully in patients with skeletal dysplasias. Moreover, this article illustrates how the study of rare diseases can inform understanding and therapeutic developments for common diseases such as osteoporosis. © International Osteoporosis Foundation and National Osteoporosis Foundation 2013.

Next generation sequencing identifies novel CACNA1A gene mutations in Episodic Ataxia type 2

Relevance: 20.00%

Abstract:

Episodic Ataxia type 2 (EA2) is a rare autosomal dominantly inherited neurological disorder characterized by recurrent disabling imbalance, vertigo and episodes of ataxia lasting minutes to hours. EA2 is caused most often by loss-of-function mutations of the calcium channel gene CACNA1A. In addition to EA2, mutations in CACNA1A are responsible for two other allelic disorders: familial hemiplegic migraine type 1 (FHM1) and spinocerebellar ataxia type 6 (SCA6). Herein, we have utilised Next Generation Sequencing (NGS) to screen the coding sequence, exon-intron boundaries and UTRs of five genes where mutation is known to produce symptoms related to EA2, including CACNA1A. We performed this screening in a group of 31 unrelated patients with EA2 symptoms. Both novel and known mutations were detected through NGS technology, and confirmed through Sanger sequencing. Genetic testing identified a total of 15 mutation-bearing patients (48%), of which 9 carried novel mutations (6 missense and 3 small frameshift deletion mutations) and 6 carried known mutations (4 missense and 2 nonsense). These results demonstrate the efficiency of our NGS panel for detecting known and novel mutations for EA2 in the CACNA1A gene; the panel also identified a novel missense mutation in ATP1A2, which is not a normal target for EA2 screening.

Complete genome sequence of a novel zantedeschia mild mosaic virus isolate: The first report from Australia and from Alocasia sp

Relevance: 20.00%

Abstract:

The complete genome of an Australian isolate of zantedeschia mild mosaic virus (ZaMMV) causing mosaic symptoms on Alocasia sp. (designated ZaMMV-AU) was cloned and sequenced. The genome comprises 9942 nucleotides (excluding the poly-A tail) and encodes a polyprotein of 3167 amino acids. The sequence is most closely related to a previously reported ZaMMV isolate from Taiwan (ZaMMV-TW), with 82 and 86 % identity at the nucleotide and amino acid level, respectively. Unlike the amino acid sequence of ZaMMV-TW, however, ZaMMV-AU does not contain a polyglutamine stretch at the N-terminus of the coat-protein-coding region upstream of the DAG motif. This is the first report of ZaMMV from Australia and from Alocasia sp.
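Identity figures such as the 82 % and 86 % quoted above are typically computed from a pairwise alignment. The sketch below shows one simple convention, assumed here for illustration: identity is the share of alignment columns with matching residues, with gap columns counted as mismatches. The abstract does not state which convention or tool the authors used, and the sequences in the example are made up.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Columns containing a gap in either sequence count as mismatches;
    the denominator is the full alignment length (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Toy aligned nucleotide sequences (hypothetical, not ZaMMV data).
    print(percent_identity("ACGT-A", "ACGTTA"))
    print(percent_identity("AAAA", "AAAT"))
```

Other conventions divide by the length of the shorter sequence or exclude gap columns, which can shift the reported percentage by a few points.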

Complete genome sequence of Colocasia bobone disease-associated virus, a putative cytorhabdovirus infecting taro

Relevance: 20.00%

Abstract:

We report the first genome sequence of a Colocasia bobone disease-associated virus (CBDaV) derived from bobone-affected taro [Colocasia esculenta L. Schott] from Solomon Islands. The negative-strand RNA genome is 12,193 nt long, with six major open reading frames (ORFs) with the arrangement 3′-N-P-P3-M-G-L-5′. Typical of all rhabdoviruses, the 3′ leader and 5′ trailer sequences show complementarity to each other. Phylogenetic analysis indicated that CBDaV is a member of the genus Cytorhabdovirus, supporting previous reports of virus particles within the cytoplasm of bobone-infected taro cells. The availability of the CBDaV genome sequence now makes it possible to assess the role of this virus in bobone, and possibly alomae disease of taro and confirm that this sequence is that of Colocasia bobone disease virus (CBDV).
