Digital Library

722 results for "corpora de aprendizes" (learner corpora)

Combining multiple high quality corpora for improving HMM-TTS

The application of the comparable corpora in Chinese-English cross-lingual information retrieval

Extending Zipf’s law to n-grams for large corpora

Abstract:

Experiments show that for a large corpus, Zipf’s law does not hold for all ranks of words: the frequencies fall below those predicted by Zipf’s law for ranks greater than about 5,000 word types in English and about 30,000 word types in the inflected languages Irish and Latin. It also does not hold for syllables or words in the syllable-based languages Chinese and Vietnamese. However, when single words are combined with word n-grams in one list and put in rank order, the frequency of tokens in the combined list extends Zipf’s law with a slope close to -1 on a log-log plot in all five languages. Further experiments have demonstrated the validity of this extension of Zipf’s law to n-grams of letters, phonemes or binary bits in English. It is shown theoretically that probability theory alone can predict this behavior in randomly created n-grams of binary bits.
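As a rough illustration of the combined rank list this abstract describes, the following Python sketch pools word n-grams of several lengths into one frequency list and estimates the slope on a log-log scale. The function names (`ngram_counts`, `rank_frequency`, `loglog_slope`), the whitespace tokenisation, and the ordinary least-squares fit are assumptions made for this sketch; the abstract does not say how the authors tokenised their corpora or fitted the slope.

```python
import math
from collections import Counter


def ngram_counts(tokens, max_n):
    """Count every word n-gram for n = 1..max_n in one shared Counter."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def rank_frequency(counts):
    """Return the frequencies in descending order (rank 1 first)."""
    return sorted(counts.values(), reverse=True)


def loglog_slope(freqs):
    """Least-squares slope of log(frequency) against log(rank).

    Needs at least two frequencies; a value near -1 is what Zipf's law predicts.
    """
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    # Toy input only; a meaningful slope needs a corpus of millions of tokens.
    tokens = "the cat sat on the mat and the cat ran".split()
    freqs = rank_frequency(ngram_counts(tokens, max_n=3))
    print(loglog_slope(freqs))
```

On a toy input the printed slope says little; the sketch is only meant to make the "one combined list, ranked by frequency" construction concrete.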

Editor, Proceedings of the LREC Satellite Workshop: Corpora for Research on Emotion and Affect, Genoa

Editor, Proceedings of LREC Satellite Workshop on Corpora for Research on Emotion and Affect, Marrakesh

Smiling Virtual Characters Corpora

Abstract:

To create smiling virtual characters, the different morphological and dynamic characteristics of the virtual characters' smiles and the impact of the virtual characters' smiling behavior on users need to be identified. For this purpose, we have collected two corpora: one directly created by users and the other resulting from the interaction between virtual characters and users. We present these two corpora in detail in the article.

Fast Mining of Interesting Phrases from Subsets of Text Corpora

Abstract:

We address the problem of mining interesting phrases from subsets of a text corpus, where the subset is specified using a set of features such as keywords that form a query. Previous algorithms for the problem involve sifting through a phrase-dictionary-based index or a document-based index, with a cost that is linear in either the phrase dictionary size or the size of the document subset. We propose an independence assumption between query keywords given the top correlated phrases, under which the pre-processing can be reduced to discovering phrases from among the top phrases for each feature in the query. We then outline an indexing mechanism where per-keyword phrase lists are stored either on disk or in memory, so that popular aggregation algorithms such as No Random Access and Sort-merge Join may be adapted to do the scoring in real time to identify the top interesting phrases. Though such an approach is expected to be approximate, we empirically illustrate that very high accuracies (of over 90%) are achieved against the results of exact algorithms. Due to the simplified list aggregation, we are also able to provide response times that are orders of magnitude better than state-of-the-art algorithms. Interestingly, our disk-based approach outperforms the in-memory baselines by up to a hundred times and sometimes more, confirming the superiority of the proposed method.

Applying Machine Learning Methods to Text Corpora and Case Bases

Coreference resolution for Portuguese using parallel corpora word alignment

Abstract:

The field of Information Extraction essentially aims to investigate methods and techniques for transforming the unstructured information present in natural-language texts into structured data. An important step in this process is coreference resolution, a task that identifies different noun phrases that refer to the same entity in the discourse. Coreference resolution has been researched extensively for English (Ng, 2010, lists a series of studies in the area), but it has received less attention in other languages. This is because the great majority of the approaches used in this research are based on machine learning and therefore require a large amount of annotated data.

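Contribuição para o estudo da construção e utilização de corpora no processo de terminologização [Contribution to the study of the construction and use of corpora in the terminologization process]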

Abstract:

Polissema: Revista de Letras do ISCAP 2002/N.º 2 Linguagens

O uso de corpora na análise da representação do discurso em tradução [The use of corpora in the analysis of discourse representation in translation]

Abstract:

This article presents research on the representation of fictional discourse, grounded in the systemic-functional grammar proposed by Halliday and in Corpus Linguistics, using the WordSmith Tools software. The analysis focuses on the ideational metafunction, realized by the transitivity system, with attention to mental processes and the logico-semantic relation of projection. The aim of the research was to observe how the thoughts of the characters in a fictional corpus are represented through the reporting verbs THINK and PENSAR, seeking to describe textual patterns in the three novels that make up the corpus.

Qual a função dos corpora na descrição do léxico? [What is the role of corpora in the description of the lexicon?]

Abstract:

In this article we reflect on the role of corpora in observing and analysing phenomena of a natural language, as well as in creating new resources for linguistic exploration, which information technologies have been enabling and making more effective.

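Polissemia nominal diacrônica. Do conceito ao linguístico: relações lexicais a partir dos corpora de especialidade [Diachronic nominal polysemy. From the conceptual to the linguistic: lexical relations based on specialized corpora]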

Abstract:

Thesis submitted in fulfilment of the requirements for the degree of Doctor in Linguistics (Lexicology, Lexicography and Terminology), and thesis submitted in fulfilment of the requirements for the degree of Doctor in Philology and Portuguese Language at the Faculdade de Filosofia, Letras e Ciências Humanas of the Universidade de São Paulo.

Corpora and the teaching of languages and linguistics.

Abstract:

This work analyses some recent research exploring the potential advantages and effective use of a method for teaching English as a foreign language and for teaching English linguistics.

Assessing the contribution of corpora to EAP practice
