3 results for Emails categorization
in Digital Peer Publishing
Abstract:
Audio-visual documents obtained from German TV news are classified according to the IPTC topic categorization scheme. To this end, standard text classification techniques are adapted to speech, video, and non-speech audio. For each of the three modalities, word analogues are generated: sequences of syllables for speech, “video words” based on low-level color features (color moments, color correlogram, and color wavelet), and “audio words” based on low-level spectral features (spectral envelope and spectral flatness) for non-speech audio. Such audio and video words provide a means to represent the different modalities in a uniform way. Audio-visual documents are represented by the frequencies of these word analogues, following the standard bag-of-words approach. Support vector machines are used for supervised classification in a 1 vs. n setting. Classification based on speech outperforms all other single modalities. Combining speech with non-speech audio improves classification. Classification is further improved by supplementing speech and non-speech audio with video words. Optimal F-scores range between 62% and 94%, corresponding to 50% to 84% above chance. The optimal combination of modalities depends on the category to be recognized. The construction of audio and video words from low-level features provides a good basis for the integration of speech, non-speech audio, and video.
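The bag-of-words representation and the F-score mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the token names (`ta`, `v17`, `a42`), the modality prefixes, and the helper names `bag_of_words` and `f_score` are hypothetical, and the support vector machine step is left out. The sketch only shows one plausible way to merge word analogues from several modalities into a single frequency vector and to score binary predictions for one category.

```python
from collections import Counter


def bag_of_words(modalities):
    """Merge per-modality token sequences into one relative-frequency vector.

    `modalities` maps a modality name to its list of word analogues.
    Prefixing each token with the modality name is an assumption made
    here so that the vocabularies of different modalities never collide.
    """
    counts = Counter()
    for name, tokens in modalities.items():
        counts.update(f"{name}:{tok}" for tok in tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {term: n / total for term, n in counts.items()}


def f_score(y_true, y_pred):
    """Balanced F-score (harmonic mean of precision and recall)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical document: syllables, video words, and audio words.
doc = {
    "speech": ["ta", "ges", "schau", "ta"],
    "video": ["v17", "v03"],
    "audio": ["a42", "a42"],
}
vec = bag_of_words(doc)

# Hypothetical gold labels and predictions for one category (1 = in category).
score = f_score([1, 1, 0, 0], [1, 0, 1, 0])
```

In a 1 vs. n setting, one such binary scoring would be computed per category, with vectors like `vec` serving as classifier input.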
Abstract:
While spoken codeswitching (CS) among Latinos has received significant scholarly attention, few studies have examined written CS, specifically naturally-occurring CS in email. This study contributes to an under-studied area of Latino linguistic practices by reporting the results of a study of CS in the emails of five Spanish-English bilingual Latinos. Methods are employed that are not often used in discourse analysis of email texts, namely multi-dimensional scaling and tree diagrams, to explore the contextual parameters of written Spanish-English CS systematically. Consistent with the findings of other studies of CS in CMC, English use was most associated with professional or formal contacts, and use of Spanish, the participants’ native language, was linked to intimacy, informality, and group identification. Switches to Spanish functioned to personalize otherwise transactional or work-related English-dominant emails. The article also discusses novel orthographic and linguistic forms specific to the CMC context.
Abstract:
Given arbitrary pictures, we explore the possibility of using new techniques from computer vision and artificial intelligence to create customized visual games on-the-fly. These include popular games such as coloring books, link-the-dot, and spot-the-difference. The feasibility of these systems is discussed, and we describe prototype implementations that work well in practice in an automatic or semi-automatic way.