110 results for Antinutritional features
Abstract:
This study investigates the use of unsupervised features derived from word embedding approaches and novel sequence representation approaches for improving clinical information extraction systems. Our results corroborate previous findings that the use of word embeddings significantly improves the effectiveness of concept extraction models; we further determine the influence of the corpora used to generate such features. We also demonstrate the promise of sequence-based unsupervised features for further improving concept extraction.
Abstract:
Background: Fusion transcripts are found in many tissues and have the potential to create novel functional products. Here, we investigate the genomic sequences around fusion junctions to better understand the transcriptional mechanisms mediating fusion transcription/splicing. We analyzed data from prostate (cancer) cells, as previous studies have shown extensively that these cells readily undergo fusion transcription. Results: We used the FusionMap program to identify high-confidence fusion transcripts from RNAseq data. The RNAseq datasets were from our own (N = 8) and other (N = 14) clinical prostate tumors with adjacent non-cancer cells, and from LNCaP prostate cancer cells that were mock-treated, androgen-treated (DHT), or anti-androgen-treated (bicalutamide, enzalutamide). In total, 185 fusion transcripts were identified from all RNAseq datasets. The majority (76 %) of these fusion transcripts were 'read-through chimeras' derived from adjacent genes in the genome. Sequences at fusion loci were characterized using a combination of the FusionMap program, custom Perl scripts, and the RNAfold program. Our computational analysis indicated that most fusion junctions (76 %) use the consensus GT-AG intron donor-acceptor splice site, and most fusion transcripts (85 %) maintained the open reading frame. We assessed whether the parental genes of fusion transcripts have the potential to form complementary base pairs with each other, which might bring them into physical proximity. Our computational analysis of sequences flanking fusion junctions at parental loci indicates that these loci have a similar propensity to hybridize as non-fusion loci. The abundance of repetitive sequences at fusion and non-fusion loci was also investigated, given that SINE repeats are involved in aberrant gene transcription. We found few instances of repetitive sequences at both fusion and non-fusion junctions. Finally, RT-qPCR was performed on RNA from both clinical prostate tumors and adjacent non-cancer cells (N = 7), and from LNCaP cells treated as above, to validate the expression of seven fusion transcripts and their respective parental genes. We show that fusion transcript expression is similar to the expression of parental genes. Conclusions: Fusion transcripts maintain the open reading frame and likely use the same transcriptional machinery as non-fusion transcripts, as they share many genomic features at splice/fusion junctions.
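The two junction-level checks named in this abstract (use of the consensus GT-AG splice site, and maintenance of the open reading frame) can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which used FusionMap and custom Perl scripts; the function names, the inputs, and the simplified frame rule below are assumptions made only for illustration.

```python
def is_canonical_junction(donor_dinucleotide: str, acceptor_dinucleotide: str) -> bool:
    """Return True when the intronic bases flanking the junction are GT ... AG.

    donor_dinucleotide: first two genomic bases after the 5' partner's breakpoint.
    acceptor_dinucleotide: last two genomic bases before the 3' partner's breakpoint.
    (Hypothetical inputs; a real tool would extract them from the reference genome.)
    """
    return donor_dinucleotide.upper() == "GT" and acceptor_dinucleotide.upper() == "AG"


def is_in_frame(upstream_coding_bases: int, downstream_coding_bases_skipped: int) -> bool:
    """Simplified frame test (an assumption, not the paper's stated rule).

    The fusion keeps the 3' partner's reading frame when the number of coding
    bases contributed by the 5' partner and the number of coding bases of the
    3' partner that lie upstream of the junction leave the same remainder mod 3.
    """
    return upstream_coding_bases % 3 == downstream_coding_bases_skipped % 3


if __name__ == "__main__":
    # Toy values, not data from the study.
    print(is_canonical_junction("gt", "AG"))
    print(is_in_frame(301, 76))
```

Both functions operate on values that would have to be derived from an annotation and a reference genome; the sketch deliberately leaves that extraction step out.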
Abstract:
Age estimation from facial images is receiving increasing attention for applications such as age-based access control and age-adaptive targeted marketing. Since even humans can be led into error by the complex biological processes involved, finding a robust method remains a research challenge. In this paper, we propose a new framework that integrates Active Appearance Models (AAM), Local Binary Patterns (LBP), Gabor wavelets (GW) and Local Phase Quantization (LPQ) to obtain a highly discriminative feature representation able to model shape, appearance, wrinkles and skin spots. In addition, this paper proposes a novel flexible hierarchical age estimation approach consisting of a multi-class Support Vector Machine (SVM), which classifies a subject into an age group, followed by a Support Vector Regression (SVR), which estimates a specific age. Errors that may occur in the classification step, caused by the hard boundaries between age classes, are compensated in the specific age estimation by a flexible overlapping of the age ranges. The performance of the proposed approach was evaluated on the FG-NET Aging and MORPH Album 2 datasets, and mean absolute errors (MAE) of 4.50 and 5.86 years were achieved, respectively. The robustness of the proposed approach was also evaluated on a merge of both datasets, and an MAE of 5.20 years was achieved. Furthermore, we compared age estimation by humans with the proposed approach, and the results show that the machine outperforms humans. The proposed approach is competitive with the current state of the art and provides additional robustness to blur, lighting and expression variance, brought about by the local phase features.
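The overlapping of age ranges and the MAE metric described in this abstract can be sketched in a few lines of Python. The age-group boundaries, the margin value, and all function names below are hypothetical, since the abstract does not state them; the sketch only shows how each group-specific regressor could be given a widened age range so that a subject misclassified into a neighbouring group can still receive a plausible age.

```python
from typing import List, Sequence, Tuple

AgeRange = Tuple[int, int]


def overlapping_ranges(groups: Sequence[AgeRange], margin: int) -> List[AgeRange]:
    """Widen each hard age group by `margin` years on both sides,
    clipped to the overall minimum and maximum age."""
    lowest, highest = groups[0][0], groups[-1][1]
    return [(max(lowest, lo - margin), min(highest, hi + margin)) for lo, hi in groups]


def training_subset(ages: Sequence[int], age_range: AgeRange) -> List[int]:
    """Indices of the samples whose age falls inside the (widened) range;
    these would be used to train that group's regressor."""
    lo, hi = age_range
    return [i for i, age in enumerate(ages) if lo <= age <= hi]


def mean_absolute_error(true_ages: Sequence[float], predicted: Sequence[float]) -> float:
    """MAE in years: the average absolute difference between true and predicted age."""
    return sum(abs(t - p) for t, p in zip(true_ages, predicted)) / len(true_ages)


if __name__ == "__main__":
    hard_groups = [(0, 12), (13, 25), (26, 69)]  # hypothetical boundaries
    print(overlapping_ranges(hard_groups, margin=5))
    print(mean_absolute_error([20, 30], [24, 25]))
```

In a full system the classifier and the per-group regressors would be trained models (the abstract names an SVM and an SVR); here they are left out so that the range-overlap idea stays visible.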
Abstract:
The incidence of human infections by fungal pathogens of the genus Candida has been increasing in recent years. Enolase is an essential protein in fungal metabolism. Sequence data are available for human and a number of medically important fungal species. An understanding of the structural and functional features of fungal enolases may provide the structural basis for their use as targets for the development of new anti-fungal drugs. We have obtained the sequence of the enolase of Candida krusei (C. krusei), as it is a medically important fungal pathogen. We then used multiple sequence alignments with various enolase isoforms to identify C. krusei-specific amino acid residues. The phylogenetic tree of enolases shows that the C. krusei enolase clusters with the fungal genes. Importantly, C. krusei enolase lacks four amino acids in the active site compared to human enolase, as revealed by multiple sequence alignments. These differences in the substrate binding site may be exploited for the design of new anti-fungal drugs that selectively block this enzyme. The lack of these important amino acids in the active site also indicates that C. krusei enolase might have evolved as a member of a mechanistically diverse enolase superfamily catalyzing somewhat different reactions.
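As a rough illustration of how residues missing at active-site positions could be read off a pairwise slice of an alignment, the following Python sketch walks two gapped sequences in parallel. The function name, the toy sequences, and the site positions are all hypothetical and are not taken from the enolase data; the abstract does not describe the authors' actual procedure beyond multiple sequence alignment.

```python
from typing import List, Set, Tuple


def residues_missing_at_sites(
    ref_aligned: str, query_aligned: str, ref_sites: Set[int]
) -> List[Tuple[int, str]]:
    """Return (position, reference residue) for each site of interest where
    the query has a gap. Positions are 1-based in the ungapped reference."""
    missing = []
    ref_pos = 0
    for ref_char, query_char in zip(ref_aligned, query_aligned):
        if ref_char == "-":
            continue  # insertion in the query; no reference position consumed
        ref_pos += 1
        if ref_pos in ref_sites and query_char == "-":
            missing.append((ref_pos, ref_char))
    return missing


if __name__ == "__main__":
    # Toy aligned sequences, invented for illustration only.
    reference = "MSKE-HAG"
    query = "MS-EAH-G"
    print(residues_missing_at_sites(reference, query, {3, 5, 6}))
```

The sketch reports only gaps; substitutions at the same positions could be collected in the same loop by comparing the two characters.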
Abstract:
We have systematically analysed the ultrastructure of the early secretory pathway in Trichoderma reesei hyphae in the wild-type QM6a, the cellulase-overexpressing Rut-C30 strain, and a Rut-C30 transformant, BV47, overexpressing a recombinant BiP1-VenusYFP fusion protein with an endoplasmic reticulum (ER) retention signal. The hyphae were studied after 24 h of growth using transmission electron microscopy, confocal microscopy and quantitative stereological techniques. All three strains exhibited different spatial organisation of the ER at 24 h in both a cellulase-inducing medium and a minimal medium containing glycerol as a carbon source (non-cellulase-inducing medium). The wild type displayed a number of ER subdomains, including parallel tubular/cisternal ER, ER whorls, and ER-isolation membrane complexes with abundant autophagy vacuoles and dense bodies. Rut-C30 and its transformant BV47 overexpressing the BiP1-VenusYFP fusion protein also contained parallel tubular/cisternal ER, but no ER whorls; there were also very few autophagy vacuoles and an increasing number of punctate bodies, in which the recombinant BiP1-VenusYFP fusion protein in particular was localised. The early presence of distinct strain-specific features, such as the dominance of ER whorls in the wild type and of tubular/cisternal ER in Rut-C30, suggests that these are inherent traits and not solely the result of cellular response mechanisms of the high-secreting mutant to protein overload.