46 results for corpora, terminologia, termini, estrazione automatica

in QUB Research Portal - Research Directory and Institutional Repository for Queen's University Belfast


Relevance:

20.00% 20.00%

Publisher:

Abstract:

A novel phosphoramidite, N,N-diisopropylamino-2-cyanoethyl-ortho-methylbenzylphosphoramidite 1, was prepared. The reaction of 1 with DMTrT and subsequent derivatisation of the phosphite triester product under solution-phase, Michaelis–Arbuzov conditions was investigated. Coupling of 1 with the terminal hydroxyl groups of support-bound oligodeoxyribonucleotides and subsequent reaction with an activated disulfide yielded oligonucleotides bearing a terminal, phosphorothiolate-linked, lipophilic moiety. The oligomers were readily purified using RP-HPLC. Silver(I)-mediated cleavage of the phosphorothiolate linkage and desalting of the oligonucleotides were performed readily in one step to yield cleanly the corresponding phosphate monoester-terminated oligomers.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Experiments show that for a large corpus, Zipf’s law does not hold for all ranks of words: the frequencies fall below those predicted by Zipf’s law for ranks greater than about 5,000 word types in the English language and about 30,000 word types in the inflected languages Irish and Latin. It also does not hold for syllables or words in the syllable-based languages, Chinese or Vietnamese. However, when single words are combined together with word n-grams in one list and put in rank order, the frequency of tokens in the combined list extends Zipf’s law with a slope close to -1 on a log-log plot in all five languages. Further experiments have demonstrated the validity of this extension of Zipf’s law to n-grams of letters, phonemes or binary bits in English. It is shown theoretically that probability theory alone can predict this behavior in randomly created n-grams of binary bits.
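The combined rank-frequency list described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the choice of `max_n=3`, and the ordinary least-squares fit of the log-log slope are assumptions made for illustration only.

```python
from collections import Counter
import math


def combined_ngram_counts(tokens, max_n=3):
    """Count every word n-gram of length 1..max_n in one combined table.

    max_n=3 is an arbitrary illustrative choice, not a value from the abstract.
    """
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def rank_frequency(counts):
    """Return the frequencies in descending order, so index 0 is rank 1."""
    return sorted(counts.values(), reverse=True)


def loglog_slope(freqs):
    """Least-squares slope of log(frequency) against log(rank).

    Needs at least two frequencies; a slope near -1 matches Zipf's law.
    """
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den
```

On a real corpus one would tokenise the text, call `combined_ngram_counts`, and pass the output of `rank_frequency` to `loglog_slope`; a toy input is far too small to show the reported behaviour.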

Relevance:

20.00% 20.00%

Publisher:

Abstract:

To create smiling virtual characters, the different morphological and dynamic characteristics of the virtual characters' smiles and the impact of the virtual characters' smiling behavior on the users need to be identified. For this purpose, we have collected two corpora: one directly created by users and the other resulting from the interaction between virtual characters and users. We present these two corpora in detail in the article.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In a recent paper (Automatica 49 (2013) 2860–2866), the Wirtinger-based inequality has been introduced to derive tractable stability conditions for time-delay or sampled-data systems. We point out that there exist two errors in Theorem 8 for the stability analysis of sampled-data systems, and the correct theorem is presented.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We address the problem of mining interesting phrases from subsets of a text corpus where the subset is specified using a set of features such as keywords that form a query. Previous algorithms for the problem have proposed solutions that involve sifting through a phrase-dictionary-based index or a document-based index where the solution is linear in either the phrase dictionary size or the size of the document subset. We propose the usage of an independence assumption between query keywords given the top correlated phrases, wherein the pre-processing could be reduced to discovering phrases from among the top phrases for each feature in the query. We then outline an indexing mechanism where per-keyword phrase lists are stored either on disk or in memory, so that popular aggregation algorithms such as No Random Access and Sort-merge Join may be adapted to do the scoring in real time to identify the top interesting phrases. Though such an approach is expected to be approximate, we empirically illustrate that very high accuracies (of over 90%) are achieved against the results of exact algorithms. Due to the simplified list-aggregation, we are also able to provide response times that are orders of magnitude better than state-of-the-art algorithms. Interestingly, our disk-based approach outperforms the in-memory baselines by up to a hundred times and sometimes more, confirming the superiority of the proposed method.
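The idea of aggregating per-keyword phrase lists can be sketched as follows. This is a simplified illustration, not the method of the paper: the index layout (a dictionary mapping each keyword to phrase scores), the multiplication of scores under the independence assumption, and the name `top_phrases` are all assumptions, and a plain full-scan merge stands in for No Random Access or Sort-merge Join.

```python
import heapq


def top_phrases(index, query, k=3):
    """Return the k phrases with the highest combined score for the query.

    index maps keyword -> {phrase: score}. Treating keywords as independent,
    the combined score of a phrase is the product of its per-keyword scores.
    This scoring rule is an illustrative assumption, not the paper's formula.
    """
    lists = [index.get(keyword, {}) for keyword in query]
    if not lists:
        return []
    # A phrase missing from any keyword's list is dropped from the result.
    candidates = set(lists[0])
    for scores in lists[1:]:
        candidates &= set(scores)
    combined = {}
    for phrase in candidates:
        score = 1.0
        for scores in lists:
            score *= scores[phrase]
        combined[phrase] = score
    return heapq.nlargest(k, combined.items(), key=lambda item: item[1])


# Hypothetical toy index; the phrases and scores are made up for illustration.
toy_index = {
    "neural": {"deep learning": 0.8, "gradient descent": 0.5, "support vector": 0.1},
    "image": {"deep learning": 0.6, "gradient descent": 0.2},
}
best = top_phrases(toy_index, ["neural", "image"], k=1)
```

Because each keyword's list is precomputed, query-time work depends on the lengths of those lists rather than on the size of the phrase dictionary or of the document subset, which is the trade-off the abstract highlights.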

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The Ov/Br septin gene, which is also a fusion partner of MLL in acute myeloid leukaemia, is a member of a family of novel GTP binding proteins that have been implicated in cytokinesis and exocytosis. In this study, we describe the genomic and transcriptional organization of this gene, detailing seventeen exons distributed over 240 kb of sequence. Extensive database analyses identified orthologous rodent cDNAs that corresponded to new, unidentified 5' splice variants of the Ov/Br septin gene, increasing the total number of such variants to six. We report that splicing events, occurring at non-canonical sites within the body of the 3' terminal exon, remove either 1801 bp or 1849 bp of non-coding sequence and facilitate access to a secondary open reading frame of 44 amino acids maintained near the end of the 3' UTR. These events constitute a novel coding arrangement and represent the first report of such a design being implemented by a eukaryotic gene. The various Ov/Br proteins either differ minimally at their amino and carboxy termini or are equivalent to truncated versions of larger isoforms. Northern analysis with an Ov/Br septin 3' UTR probe reveals three transcripts of 4.4, 4 and 3 kb, the latter being restricted to a sub-set of the tissues tested. Investigation of the identified Ov/Br septin isoforms by RT-PCR confirms a complex transcriptional pattern, with several isoforms showing tissue-specific distribution. To date, none of the other human septins have demonstrated such transcriptional complexity.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Members of the evolutionarily conserved septin family of genes are emerging as key components of several cellular processes including membrane trafficking, cytokinesis, and cell-cycle control events. SEPT9 has been shown to have a complex genomic architecture, such that up to 15 different isoforms are possible by the shuffling of five alternate amino termini and three alternate carboxy termini. Genomic and transcriptional alterations of SEPT9 have been associated with neoplasia. The present study has used a Sept9-specific antibody to determine the pattern of isoform expression in a range of tumour cell lines. Western blot analysis indicated considerable variation in the relative amounts and isoform content of Sept9. Immunofluorescence studies showed a range of patterns of cytoplasmic localization ranging from mainly particulate to mainly filamentous. Expression constructs were also generated for each amino terminal isoform to investigate the patterns of localization of individual isoforms and the effects on cells of ectopic expression. The present study shows that the epsilon isoform appears filamentous in this overexpression system while the remaining isoforms are particulate and cytoplasmic. Transient transfection of individual constructs into tumour cell lines results in cell-cycle perturbation with a G2/M arrest and dramatic growth suppression, which was greatest in cell lines with the lowest amounts of endogenous Sept9. Similar phenotypic observations were made with GTP-binding mutants of all five N-terminal variants of Sept9. However, dramatic differences were observed in the kinetics of accumulation of wild-type versus mutant septin protein in transfected cells. In conclusion, the present study shows that the expression patterns of Sept9 protein are very varied in a panel of tumour cell lines and the functional studies are consistent with a model of septin function as a component of a molecular scaffold that contributes to diverse cellular functions. 
Alterations in the levels of Sept9 protein by overexpression of individual isoforms can clearly perturb cellular behaviour and may thus provide a mechanistic explanation for observations of deranged septin expression in neoplasia. Copyright © 2004 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This paper investigates two-stage stepwise identification for a class of nonlinear dynamic systems that can be described by linear-in-the-parameters models, where the model has to be built from a very large pool of basis functions or model terms. The main objective is to improve the compactness of the model that is obtained by the forward stepwise methods, while retaining the computational efficiency. The proposed algorithm first generates an initial model using a forward stepwise procedure. The significance of each selected term is then reviewed at the second stage and all insignificant ones are replaced, resulting in an optimised compact model with significantly improved performance. The main contribution of this paper is that these two stages are performed within a well-defined regression context, leading to significantly reduced computational complexity. The efficiency of the algorithm is confirmed by the computational complexity analysis, and its effectiveness is demonstrated by the simulation results.
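The two stages can be illustrated with a naive Python sketch. It deliberately refits a least-squares model from scratch for every candidate, so it does not reproduce the efficient regression-context computation that the abstract names as the paper's contribution. All function names and the toy data are assumptions, and the candidate columns are assumed to be linearly independent.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def sse(columns, y, chosen):
    """Sum of squared errors of the least-squares fit on the chosen columns."""
    gram = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in chosen]
            for i in chosen]
    rhs = [sum(p * t for p, t in zip(columns[i], y)) for i in chosen]
    theta = solve(gram, rhs)
    pred = [sum(t * columns[i][k] for t, i in zip(theta, chosen))
            for k in range(len(y))]
    return sum((a - b) ** 2 for a, b in zip(y, pred))


def forward_select(columns, y, n_terms):
    """Stage 1: greedily add the term that lowers the error the most."""
    chosen = []
    for _ in range(n_terms):
        rest = [i for i in range(len(columns)) if i not in chosen]
        chosen.append(min(rest, key=lambda i: sse(columns, y, chosen + [i])))
    return chosen


def review_terms(columns, y, chosen):
    """Stage 2: swap any selected term for an unused one that fits better."""
    chosen = list(chosen)
    improved = True
    while improved:
        improved = False
        for pos in range(len(chosen)):
            current = sse(columns, y, chosen)
            for i in range(len(columns)):
                if i in chosen:
                    continue
                trial = chosen[:pos] + [i] + chosen[pos + 1:]
                if sse(columns, y, trial) < current - 1e-12:
                    chosen, current, improved = trial, sse(columns, y, trial), True
    return sorted(chosen)


# Toy data (made up): y = x0 + x1, but x2 alone is the best single predictor.
cols = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [1.0, 1.0, 0.5, 0.0]]
target = [1.0, 1.0, 0.0, 0.0]
stage1 = forward_select(cols, target, 2)
stage2 = review_terms(cols, target, stage1)
```

In this toy case the forward stage selects the misleading term `x2` first and cannot drop it, while the review stage swaps it out and recovers the exact two-term model.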