Biblioteca Digital

9 results for Comparable corpora

Effect of Biaxial Stretching at Temperatures and Strain Histories Comparable to Injection Stretch Blow Moulding on Tensile Modulus for Polyethylene Terephthalate (PET)

Relevance: 20.00%

Use of imatinib mesylate in elderly patients in Northern Ireland: evidence of comparable haematological and molecular responses to younger patients.

Relevance: 20.00%

Extending Zipf’s law to n-grams for large corpora

Relevance: 20.00%

Abstract:

Experiments show that for a large corpus, Zipf’s law does not hold for all word ranks: the frequencies fall below those predicted by Zipf’s law for ranks greater than about 5,000 word types in English and about 30,000 word types in the inflected languages Irish and Latin. It also does not hold for syllables or words in the syllable-based languages Chinese and Vietnamese. However, when single words are combined with word n-grams in one list and put in rank order, the frequency of tokens in the combined list extends Zipf’s law with a slope close to -1 on a log-log plot in all five languages. Further experiments have demonstrated the validity of this extension of Zipf’s law to n-grams of letters, phonemes or binary bits in English. It is shown theoretically that probability theory alone can predict this behavior in randomly created n-grams of binary bits.
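The combined-list idea in this abstract can be illustrated with a small sketch. It is not the authors' code: the function names `combined_ngram_counts` and `loglog_slope`, the `max_n` cutoff, and the use of an ordinary least-squares fit are all assumptions made for illustration. The sketch counts single words and word n-grams in one shared list, then estimates the slope of log frequency against log rank, which the abstract reports as close to -1.

```python
from collections import Counter
import math


def combined_ngram_counts(tokens, max_n=3):
    """Count every word n-gram of length 1..max_n in one shared Counter.

    max_n is an illustrative cutoff, not a value taken from the paper.
    """
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def loglog_slope(frequencies):
    """Least-squares slope of log(frequency) against log(rank).

    Expects at least two positive frequencies; rank 1 is the most frequent.
    """
    ranked = sorted(frequencies, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(freq) for freq in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den
```

For a tokenized corpus `tokens`, `loglog_slope(combined_ngram_counts(tokens).values())` gives the fitted slope of the combined list. A toy text will not reproduce the reported behavior; the abstract's claim concerns large corpora.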

Editor Proceedings of the LREC Satellite Workshop Corpora for Research on Emotion and Affect Genoa

Relevance: 20.00%

Editor Proceedings of LREC Satellite Workshop on Corpora for Research on Emotion and Affect Marrakesh

Relevance: 20.00%

Smiling Virtual Characters Corpora

Relevance: 20.00%

Abstract:

To create smiling virtual characters, the different morphological and dynamic characteristics of the virtual characters' smiles and the impact of the virtual characters' smiling behavior on users need to be identified. For this purpose, we have collected two corpora: one directly created by users and the other resulting from the interaction between virtual characters and users. We present these two corpora in detail in the article.

Detection of avian influenza virus by fluorescent DNA barcode-based immunoassay with sensitivity comparable to PCR

Relevance: 20.00%

Abstract:

In this paper, a coupling of fluorophore-DNA barcode and bead-based immunoassay for detecting avian influenza virus (AIV) with PCR-like sensitivity is reported. The assay is based on the use of sandwich immunoassay and fluorophore-tagged oligonucleotides as representative barcodes. The detection involves the sandwiching of the target AIV between magnetic immunoprobes and barcode-carrying immunoprobes. Because each barcode-carrying immunoprobe is functionalized with a multitude of fluorophore-DNA barcode strands, many DNA barcodes are released for each positive binding event resulting in amplification of the signal. Using an inactivated H16N3 AIV as a model, a linear response over five orders of magnitude was obtained, and the sensitivity of the detection was comparable to conventional RT-PCR. Moreover, the entire detection required less than 2 hr. The results indicate that the method has great potential as an alternative for surveillance of epidemic outbreaks caused by AIV, other viruses and microorganisms.

Fast Mining of Interesting Phrases from Subsets of Text Corpora

Relevance: 20.00%

Abstract:

We address the problem of mining interesting phrases from subsets of a text corpus where the subset is specified using a set of features such as keywords that form a query. Previous algorithms for the problem have proposed solutions that involve sifting through a phrase-dictionary-based index or a document-based index, where the cost is linear in either the phrase dictionary size or the size of the document subset. We propose an independence assumption between query keywords given the top correlated phrases, under which the pre-processing can be reduced to discovering phrases from among the top phrases for each feature in the query. We then outline an indexing mechanism where per-keyword phrase lists are stored either on disk or in memory, so that popular aggregation algorithms such as No Random Access and Sort-merge Join may be adapted to do the scoring in real time to identify the top interesting phrases. Though such an approach is expected to be approximate, we empirically illustrate that very high accuracies (of over 90%) are achieved against the results of exact algorithms. Due to the simplified list aggregation, we are also able to provide response times that are orders of magnitude better than state-of-the-art algorithms. Interestingly, our disk-based approach outperforms the in-memory baselines by up to a hundred times and sometimes more, confirming the superiority of the proposed method.
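The list-aggregation step described in this abstract might look roughly like the sketch below. It is an illustration under stated assumptions, not the paper's method: the function `top_phrases`, the layout of `per_keyword_lists`, and the scoring rule (summing log scores, which corresponds to multiplying per-keyword scores under the independence assumption) are hypothetical, and a plain hash-based aggregation stands in for No Random Access or Sort-merge Join.

```python
import heapq
import math
from collections import defaultdict


def top_phrases(per_keyword_lists, query, k=3):
    """Aggregate truncated per-keyword phrase lists for a conjunctive query.

    per_keyword_lists maps keyword -> list of (phrase, score) pairs, where
    score is a hypothetical per-keyword interestingness value in (0, 1] and
    each phrase appears at most once per keyword.
    """
    log_scores = defaultdict(float)
    hits = defaultdict(int)
    for keyword in query:
        for phrase, score in per_keyword_lists.get(keyword, []):
            # Independence assumption: per-keyword scores multiply,
            # so their logarithms add.
            log_scores[phrase] += math.log(score)
            hits[phrase] += 1
    # Keep only phrases that occur in the list of every query keyword.
    candidates = [(s, p) for p, s in log_scores.items() if hits[p] == len(query)]
    return [phrase for _, phrase in heapq.nlargest(k, candidates)]
```

Because only the short per-keyword lists are scanned, the work depends on the list lengths and the number of query keywords, not on the full phrase dictionary or document subset. The result is approximate whenever a relevant phrase falls outside a truncated list.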

Applying Machine Learning Methods to Text Corpora and Case Bases

Relevance: 20.00%
