949 results for Readability, Text pre-processing


Relevance:

40.00% 40.00%

Publisher:

Abstract:

Uridine-rich small nuclear RNAs (U snRNAs) play essential roles in eukaryotic gene expression by facilitating the removal of introns from mRNA precursors and the processing of the replication-dependent histone pre-mRNAs. Formation of the 3’ end of these snRNAs is carried out by a poorly characterized, twelve-membered protein complex named Integrator Complex. In the effort to understand Integrator Complex function in the formation of the snRNA 3’ end, we performed a functional RNAi screen in Drosophila S2 cells to identify protein factors required for snRNA 3’ end formation. This screen was conducted by using a fluorescence-based reporter that elicits GFP expression in response to a deficiency in snRNA processing. Besides scoring the known Integrator subunits, we identified Asunder and CG4785 as additional core members of the Integrator Complex. We also found a conserved requirement for Cyclin C and Cdk8 in both fly and human snRNA 3’ end processing. We have further demonstrated that the kinase activity of Cdk8 is critical for snRNA 3’ end processing and is likely to function independently of its well-documented function within the Mediator Cdk8 module. Taken together, this work functionally defines the Drosophila Integrator Complex and demonstrates a novel function for Cyclin C/Cdk8 in snRNA 3’ end formation. This thesis work has also characterized an important functional interaction mediated by a microdomain within Integrator subunit 12 (IntS12) and IntS1 that is required for the activity of the Integrator Complex in processing the snRNA 3’ end. Through the development of a reporter-based functional RNAi-rescue assay in Drosophila S2 cells, we analyzed domains within IntS12 required for snRNA 3’ end formation. This analysis unexpectedly revealed that an N-terminal 30 amino acid region, and not the highly conserved central PHD finger domain, is required for snRNA 3’ end cleavage.
The IntS12 microdomain (1-45) functions autonomously and is sufficient to interact with and stabilize the putative scaffold protein IntS1. Our findings provide more details of the Integrator Complex for understanding the molecular mechanism of snRNA 3’ end processing. Moreover, these results lay the foundation for future studies of the complex through the identification of a novel functional domain within one subunit and the identification of additional subunits.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

One of the most elegant and tightly regulated mechanisms for control of gene expression is alternative pre-mRNA splicing. Despite the importance of regulated splicing in a variety of biological processes, relatively little is understood about the mechanisms by which specific alternative splice choices are made and regulated. The transformer-2 (tra-2) gene encodes a splicing regulator that controls the use of alternative splicing pathways in the sex determination cascade of D. melanogaster and is particularly interesting because it directs the splicing of several distinct pre-mRNAs in different manners. The tra-2 protein positively regulates the splicing of both doublesex (dsx) and fruitless (fru) pre-mRNAs. Additionally, tra-2 controls exuperantia (exu) by directing the choices between splicing and cleavage/polyadenylation and autoregulates the tra-2 pre-mRNA processing by repressing the removal of a specific intron (called M1). The goal of this study is to identify the molecular mechanisms by which TRA-2 protein affects the alternative splicing of pre-mRNA deriving from the tra-2 gene itself. The autoregulation of M1 splicing plays a key role in regulation of the relative levels of two functionally distinct TRA-2 protein isoforms expressed in the male germline. We have examined whether the structure, function, and regulation of tra-2 are conserved in Drosophila virilis, a species diverged from D. melanogaster by over 60 million years. We find that the D. virilis homolog of tra-2 produces alternatively spliced RNAs encoding a set of protein isoforms analogous to those found in D. melanogaster. When introduced into the genome of D. melanogaster, this homolog can functionally replace the endogenous tra-2 gene for both normal female sexual differentiation and spermatogenesis. Examination of alternative pre-mRNAs produced in D. virilis testes suggests that the germline-specific autoregulation of tra-2 function is accomplished by a strategy similar to that used in D. melanogaster. To identify elements necessary for regulation of tra-2 M1 splicing, we mutagenized evolutionarily conserved sequences within the tra-2 M1 intron and flanking exons. Constructs containing these mutations were used to generate transgenic fly lines that have been tested for their ability to carry out autoregulation. These transgenic fly experiments elucidated several elements that are necessary for setting up a context under which tissue-specific regulation of M1 splicing can occur. These elements include a suboptimal 3′ splice site, an element that has been conserved between D. virilis and D. melanogaster, and an element that resembles the 3′ portion of a dsx repeat and other splicing enhancers. Although important contextual features of the tra-2 M1 intron have been delineated in the transgenic fly experiments, the specific RNA sequences that interact directly with the TRA-2 protein were not identified. Using Drosophila nuclear extracts from Schneider cells, we have shown that recombinant TRA-2 protein represses M1 splicing in vitro. UV crosslinking analysis suggests that the TRA-2 protein binds to several different sites within and near the M1 intron.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

The conserved CDC5 family of Myb-related proteins performs an essential function in cell cycle control at G2/M. Although c-Myb and many Myb-related proteins act as transcription factors, herein, we implicate CDC5 proteins in pre-mRNA splicing. Mammalian CDC5 colocalizes with pre-mRNA splicing factors in the nuclei of mammalian cells, associates with core components of the splicing machinery in nuclear extracts, and interacts with the spliceosome throughout the splicing reaction in vitro. Furthermore, genetic depletion of the homolog of CDC5 in Saccharomyces cerevisiae, CEF1, blocks the first step of pre-mRNA processing in vivo. These data provide evidence that eukaryotic cells require CDC5 proteins for pre-mRNA splicing.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Three small nucleolar RNAs (snoRNAs), E1, E2 and E3, have been described that have unique sequences and interact directly with unique segments of pre-rRNA in vivo. In this report, injection of antisense oligodeoxynucleotides into Xenopus laevis oocytes was used to target the specific degradation of these snoRNAs. Specific disruptions of pre-rRNA processing were then observed, which were reversed by injection of the corresponding in vitro-synthesized snoRNA. Degradation of each of these three snoRNAs produced a unique rRNA maturation phenotype. E1 RNA depletion shut down 18S rRNA formation, without overaccumulation of 20S pre-rRNA. After E2 RNA degradation, production of 18S rRNA and 36S pre-rRNA stopped, and 38S pre-rRNA accumulated, without overaccumulation of 20S pre-rRNA. E3 RNA depletion induced the accumulation of 36S pre-rRNA. This suggests that each of these snoRNAs plays a different role in pre-rRNA processing and indicates that E1 and E2 RNAs are essential for 18S rRNA formation. The available data support the proposal that these snoRNAs are at least involved in pre-rRNA processing at the following pre-rRNA cleavage sites: E1 at the 5′ end and E2 at the 3′ end of 18S rRNA, and E3 at or near the 5′ end of 5.8S rRNA.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

We have examined the distribution of RNA transcription and processing factors in the amphibian oocyte nucleus or germinal vesicle. RNA polymerase I (pol I), pol II, and pol III occur in the Cajal bodies (coiled bodies) along with various components required for transcription and processing of the three classes of nuclear transcripts: mRNA, rRNA, and pol III transcripts. Among these components are transcription factor IIF (TFIIF), TFIIS, splicing factors, the U7 small nuclear ribonucleoprotein particle, the stem–loop binding protein, SR proteins, cleavage and polyadenylation factors, small nucleolar RNAs, nucleolar proteins that are probably involved in pre-rRNA processing, and TFIIIA. Earlier studies and data presented here show that several of these components are first targeted to Cajal bodies when injected into the oocyte and only subsequently appear in the chromosomes or nucleoli, where transcription itself occurs. We suggest that pol I, pol II, and pol III transcription and processing components are preassembled in Cajal bodies before transport to the chromosomes and nucleoli. Most components of the pol II transcription and processing pathway that occur in Cajal bodies are also found in the many hundreds of B-snurposomes in the germinal vesicle. Electron microscopic images show that B-snurposomes consist primarily, if not exclusively, of 20- to 30-nm particles, which closely resemble the interchromatin granules described from sections of somatic nuclei. We suggest the name pol II transcriptosome for these particles to emphasize their content of factors involved in synthesis and processing of mRNA transcripts. We present a model in which pol I, pol II, and pol III transcriptosomes are assembled in the Cajal bodies before export to the nucleolus (pol I), to the B-snurposomes and eventually to the chromosomes (pol II), and directly to the chromosomes (pol III). 
The key feature of this model is the preassembly of the transcription and processing machinery into unitary particles. An analogy can be made between ribosomes and transcriptosomes, ribosomes being unitary particles involved in translation and transcriptosomes being unitary particles for transcription and processing of RNA.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Efficient 3′-end processing of cell cycle-regulated mammalian histone premessenger RNAs (pre-mRNAs) requires an upstream stem–loop and a histone downstream element (HDE) that base pairs with the U7 small ribonucleoprotein. Insertions between these elements have two effects: the site of cleavage moves in concert with the HDE and processing efficiency declines. We used Xenopus oocytes to ask whether compensatory length insertions in the human U7 RNA could restore the fidelity and efficiency of processing of mouse histone insertion pre-mRNAs. An insertion of 5 nt into U7 RNA that extends its complementarity to the HDE compensated for both defects in processing of a 5-nt insertion substrate; a noncomplementary insertion into U7 did not. Yet, the noncomplementary insertion mutant U7 was shown to be active on insertion substrates further mutated to allow base pairing. Our results suggest that the histone pre-mRNA becomes rigidified upstream of its HDE, allowing the bound U7 small ribonucleoprotein to measure from the HDE to the cleavage site. Such a mechanism may be common to other RNA measuring systems. To our knowledge, this is the first demonstration of length suppression in an RNA processing system.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Research on semantic processing has focused mainly on isolated units in language, which does not reflect the complexity of language. In order to understand how semantic information is processed in a wider context, the first goal of this thesis was to determine whether Swedish pre-school children are able to comprehend semantic context and whether that context is semantically built up over time. The second goal was to investigate how the brain distributes attentional resources by means of brain activation amplitude and processing type. Swedish pre-school children were tested in a dichotic listening task with longer children’s narratives. The development of the event-related potential N400 component and its amplitude were used to investigate both goals. The decrease of the N400 in the attended and unattended channel indicated semantic comprehension and that semantic context was built up over time. The attended stimulus received more resources, was processed in more of a top-down manner and displayed a prominent N400 amplitude in contrast to the unattended stimulus. The N400 and the late positivity were more complex than expected since endings of utterances longer than nine words were not accounted for. More research on wider linguistic context is needed in order to understand how the human brain comprehends natural language.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Domain-specific information retrieval is increasingly in demand. Not only domain experts, but also average non-expert users are interested in searching domain-specific (e.g., medical and health) information from online resources. However, a typical problem for average users is that the search results are always a mixture of documents with different levels of readability. Non-expert users may want to see documents with higher readability at the top of the list. Consequently, the search results need to be re-ranked in descending order of readability. It is often not practical for domain experts to manually label the readability of documents for large databases, so computational models of readability need to be investigated. However, traditional readability formulas are designed for general-purpose text and are insufficient to deal with technical materials for domain-specific information retrieval. More advanced algorithms, such as the textual coherence model, are computationally expensive for re-ranking a large number of retrieved documents. In this paper, we propose an effective and computationally tractable concept-based model of text readability. In addition to textual genres of a document, our model also takes into account domain-specific knowledge, i.e., how the domain-specific concepts contained in the document affect the document’s readability. Three major readability formulas are proposed and applied to health and medical information retrieval. Experimental results show that our proposed readability formulas lead to remarkable improvements in terms of correlation with users’ readability ratings over four traditional readability measures.
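
As a rough illustration of the re-ranking step described in this abstract, the sketch below orders retrieved documents by descending readability. It does not implement the concept-based formulas the paper proposes, whose details are not given here; as a stand-in it uses the traditional Flesch Reading Ease score, and the syllable counter is a crude vowel-run heuristic. All function names are hypothetical.

```python
import re


def count_syllables(word: str) -> int:
    """Crude heuristic (an assumption): one syllable per run of vowels, minimum one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text: str) -> float:
    """Traditional Flesch Reading Ease; higher scores mean easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        return 0.0  # convention chosen for this sketch, not part of the formula
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


def rerank_by_readability(documents: list[str]) -> list[str]:
    """Return the documents sorted from most to least readable."""
    return sorted(documents, key=flesch_reading_ease, reverse=True)


if __name__ == "__main__":
    results = [
        "Pharmacokinetic characterization necessitates comprehensive evaluation.",
        "Take one pill a day. Drink water with it.",
    ]
    for doc in rerank_by_readability(results):
        print(round(flesch_reading_ease(doc), 1), doc)
```

A concept-based model such as the one the abstract describes would replace `flesch_reading_ease` with a score that also reflects the domain-specific concepts in each document; the sorting step would stay the same.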

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Ontology construction for any domain is a labour-intensive and complex process. Any methodology that can reduce the cost and increase efficiency has the potential to make a major impact in the life sciences. This paper describes an experiment in ontology construction from text for the animal behaviour domain. Our objective was to see how much could be done in a simple and relatively rapid manner using a corpus of journal papers. We used a sequence of pre-existing text processing steps, and here describe the different choices made to clean the input, to derive a set of terms and to structure those terms in a number of hierarchies. We describe some of the challenges, especially that of focusing the ontology appropriately given a starting point of a heterogeneous corpus. Results - Using mainly automated techniques, we were able to construct an 18055-term ontology-like structure with 73% recall of animal behaviour terms, but a precision of only 26%. We were able to clean unwanted terms from the nascent ontology using lexico-syntactic patterns that tested the validity of term inclusion within the ontology. We used the same technique to test for subsumption relationships between the remaining terms to add structure to the initially broad and shallow structure we generated. All outputs are available at http://thirlmere.aston.ac.uk/~kiffer/animalbehaviour/. Conclusion - We present a systematic method for the initial steps of ontology or structured vocabulary construction for scientific domains that requires limited human effort and can make a contribution both to ontology learning and maintenance. The method is useful both for the exploration of a scientific domain and as a stepping stone towards formally rigorous ontologies. The filtering of recognised terms from a heterogeneous corpus to focus upon those that are the topic of the ontology is identified as one of the main challenges for research in ontology learning.
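
The two ideas this abstract relies on, testing subsumption with lexico-syntactic patterns and scoring extracted terms by precision and recall, can be sketched as follows. The abstract does not list the patterns the authors used, so the three templates below are assumed, Hearst-style examples, and every name in the code is hypothetical.

```python
import re

# Assumed pattern templates (not taken from the paper). {hypo} is the candidate
# narrower term and {hyper} the candidate broader term.
PATTERNS = [
    r"\b{hyper}s? such as (?:\w+, )*(?:and |or )?{hypo}\b",
    r"\b{hypo}s? (?:is|are) (?:a|an) (?:kind|type|form) of {hyper}\b",
    r"\b{hypo}s? and other {hyper}\b",
]


def supports_subsumption(hypo: str, hyper: str, corpus: list[str]) -> bool:
    """True if any sentence matches a pattern suggesting that hypo is a kind of hyper."""
    for template in PATTERNS:
        pattern = re.compile(
            template.format(hypo=re.escape(hypo), hyper=re.escape(hyper)),
            re.IGNORECASE,
        )
        if any(pattern.search(sentence) for sentence in corpus):
            return True
    return False


def precision_recall(extracted: set[str], gold: set[str]) -> tuple[float, float]:
    """Precision and recall of an extracted term set against a reference term set."""
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    sentences = [
        "Social behaviours such as grooming are common in primates.",
        "Foraging and other feeding behaviour was recorded daily.",
    ]
    print(supports_subsumption("grooming", "social behaviour", sentences))
    print(precision_recall({"grooming", "foraging", "table"}, {"grooming", "mating"}))
```

High recall with low precision, as reported in the abstract, corresponds to an extracted set that contains most of the reference terms along with many terms that do not belong to the domain.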

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Sensory processing is a crucial underpinning of the development of social cognition, a function which is compromised to a variable degree in patients with pervasive developmental disorders (PDD). In this manuscript, we review some of the most recent and relevant contributions that have looked at auditory sensory processing derangement in PDD. The variability in the clinical characteristics of the samples studied so far, in terms of severity of the associated cognitive deficits and associated limited compliance, underlying aetiology and demographic features, makes a univocal interpretation arduous. We hypothesise that, in patients with severe mental deficits, the presence of impaired auditory sensory memory as expressed by the mismatch negativity could be a non-specific indicator of more diffuse cortical deficits rather than causally related to the clinical symptomatology. More consistent findings seem to emerge from studies on less severely impaired patients, in whom increased pitch perception has been interpreted as an indicator of increased local processing, probably as a compensatory mechanism for the lack of global processing (central coherence). This latter hypothesis seems extremely attractive, and future trials in larger cohorts of patients, possibly standardising the characteristics of the stimuli, are a much-needed development. Finally, specificity of the role of the auditory derangement as opposed to other sensory channels needs to be assessed more systematically using multimodal stimuli in the same patient group. (c) 2006 Elsevier B.V. All rights reserved.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

The activities of the Institute of Information Technologies in the area of automatic text processing are outlined. Major problems related to different steps of processing are pointed out together with the shortcomings of the existing solutions.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

The principal feature of the ontology, which is developed for text processing, is a wider knowledge representation of the external world due to the introduction of a three-level hierarchy. This makes it possible to improve the semantic interpretation of natural language texts.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

As one of the most popular deep learning models, the convolutional neural network (CNN) has achieved huge success in image information extraction. Traditionally, a CNN is trained by a supervised learning method with labeled data and used as a classifier by adding a classification layer at the end. Its capability of extracting image features is largely limited due to the difficulty of setting up a large training dataset. In this paper, we propose a new unsupervised learning CNN model, which uses a so-called convolutional sparse auto-encoder (CSAE) algorithm to pre-train the CNN. Instead of using labeled natural images for CNN training, the CSAE algorithm can be used to train the CNN with unlabeled artificial images, which enables easy expansion of training data and unsupervised learning. The CSAE algorithm is especially designed for extracting complex features from specific objects such as Chinese characters. After the features of artificial images are extracted by the CSAE algorithm, the learned parameters are used to initialize the first CNN convolutional layer, and then the CNN model is fine-trained by scene image patches with a linear classifier. The new CNN model is applied to Chinese scene text detection and is evaluated with a multilingual image dataset, which labels Chinese, English and numeral texts separately. A detection precision gain of more than 10% is observed over two CNN models.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

This research pursued the conceptualization, implementation, and verification of a system that enhances digital information displayed on an LCD panel to users with visual refractive errors. The target user groups for this system are individuals who have moderate to severe visual aberrations for which conventional means of compensation, such as glasses or contact lenses, do not improve their vision. This research is based on a priori knowledge of the user's visual aberration, as measured by a wavefront analyzer. With this information it is possible to generate images that, when displayed to this user, will counteract his/her visual aberration. The method described in this dissertation advances the development of techniques for providing such compensation by integrating spatial information in the image as a means to eliminate some of the shortcomings inherent in using display devices such as monitors or LCD panels. Additionally, physiological considerations are discussed and integrated into the method for providing said compensation. In order to provide a realistic sense of the performance of the methods described, they were tested by mathematical simulation in software, as well as by using a single-lens high resolution CCD camera that models an aberrated eye, and finally with human subjects having various forms of visual aberrations. Experiments were conducted on these systems and the data collected from these experiments was evaluated using statistical analysis. The experimental results revealed that the pre-compensation method resulted in a statistically significant improvement in vision for all of the systems. Although significant, the improvement was not as large as expected for the human subject tests. Further analysis suggests that even under the controlled conditions employed for testing with human subjects, the characterization of the eye may be changing. This would require real-time monitoring of relevant variables (e.g. pupil diameter) and continuous adjustment in the pre-compensation process to yield maximum viewing enhancement.