996 results for Word Classification


Relevance:

100.00%

Publisher:

Abstract:

Major Depressive Disorder (MDD) has been associated with biased processing and abnormal regulation of negative and positive information, which may result from compromised coordinated activity of prefrontal and subcortical brain regions involved in evaluating emotional information. We tested whether patients with MDD show distributed changes in functional connectivity with a set of independently derived brain networks that have shown high correspondence with different task demands, including stimulus salience and emotional processing. We further explored if connectivity during emotional word processing related to the tendency to engage in positive or negative emotional states. In this study, 25 medication-free MDD patients without current or past comorbidity and matched controls (n=25) performed an emotional word-evaluation task during functional MRI. Using a dual regression approach, individual spatial connectivity maps representing each subject’s connectivity with each standard network were used to evaluate between-group differences and effects of positive and negative emotionality (extraversion and neuroticism, respectively, as measured with the NEO-FFI). Results showed decreased functional connectivity of the medial prefrontal cortex, ventrolateral prefrontal cortex, and ventral striatum with the fronto-opercular salience network in MDD patients compared to controls. In patients, abnormal connectivity was related to extraversion, but not neuroticism. These results confirm the hypothesis of a relative (para)limbic-cortical decoupling that may explain dysregulated affect in MDD. As connectivity of these regions with the salience network was related to extraversion, but not to general depression severity or negative emotionality, dysfunction of this network may be responsible for the failure to sustain engagement in rewarding behavior.

Relevance:

60.00%

Publisher:

Abstract:

This field study first addressed communication among people through verbal codes, the relations between language and thought, and the worldview that an individual or a community perceives through the use of language. It then examined the school system, which is committed to transmitting a particular worldview characteristic of a given nationality and/or to maintaining the values and ways of life that identify a certain community. These considerations justify the importance given to literacy and to mass literacy programs in developing countries such as Brazil. With these points in mind, the study set out to investigate the everyday vocabulary of thirty-seven MOBRAL students in Nova Friburgo, relating it to the informants' social indices (origin, years lived in the geographic area considered, age, sex, and occupation), to thematic variables (food, health/illness, occupation/chores, life expectations, life memories, leisure/entertainment) chosen after a preliminary survey of the informants' living conditions and interests, and to linguistic variables, that is, word classes. This exploratory study also sought to verify to what extent the written material in MOBRAL's continued-reading books and in class A (Jornal do Brasil) and class C (O Dia and Última Hora) newspapers relates to the vocabulary used by the MOBRAL students in Nova Friburgo. To survey the interviewees' vocabulary, fifty speech samples were recorded following the methodology used in sociolinguistic work. The data obtained from this recorded corpus were analyzed quantitatively using the computer program known as SPSS. The study of the relations among the variables (morphological classification, theme, age, sex, occupation) led to the construction of multivariate contingency tables.
The analysis of the results yielded several conclusions, such as the constant use of nouns and verbs in the utterances. Although the technique of capturing the available words during the interviews was introduced, the number of nouns in this study did not change, because the informants did not name things in isolation; they did so in complete utterances. Based on this result, some suggestions of pedagogical interest were proposed for MOBRAL: first, not to emphasize nouns (the static process of language) to the detriment of verbs (the dynamic process); second, to use sentences in literacy strategies. In examining the relations among the variables, the large variation detected was due to theme. When the variety of words used by the interviewees in Nova Friburgo was investigated, it was observed that the 11,337 occurrences of nouns comprised 2,222 different nouns and 1,590 vocabulary items; the 17,604 occurrences of verbs comprised 2,365 different verbs and 588 vocabulary items; and the 1,980 occurrences of adjectives comprised 660 different adjectives and 488 vocabulary items. It was concluded that this group's vocabulary may differ from that of other areas and of other social strata situated in other contexts, but it is neither limited nor restricted. It expresses their view and expectations of the world around them. A further recommendation to MOBRAL: the words to be taught should be gathered from the various communities where the literacy classes operate, and the motivation for their selection should be tied to the adults' everyday needs, with the generative word integrated into sentences.

Relevance:

40.00%

Publisher:

Abstract:

Internet traffic classification is a relevant and mature research field that is nonetheless of growing importance and still has open technical challenges, partly because of the pervasive presence of Internet-connected devices in everyday life. We claim the need for innovative traffic classification solutions that are lightweight, adopt a domain-based approach, and classify Internet traffic by subject rather than concentrating only on application-level protocol categorization. To this end, this paper proposes an original classification solution that leverages domain name information extracted from IPFIX summaries, DNS logs, and DHCP leases, and that can be applied to any kind of traffic. Our proposed solution is based on an extension of Word2vec unsupervised learning techniques running on a specialized Apache Spark cluster. In particular, learning techniques are leveraged to generate word embeddings from a mixed dataset composed of domain names and natural language corpora, in a lightweight way and with general applicability. The paper also reports lessons learnt from our implementation and deployment experience, which demonstrates that our solution can process 5500 IPFIX summaries per second on an Apache Spark cluster with 1 slave instance in Amazon EC2 at a cost of $3860 per year. Reported experimental results on Precision, Recall, F-Measure, Accuracy, and Cohen's Kappa show the feasibility and effectiveness of the proposal. The experiments show that words contained in domain names are related to the kind of traffic directed towards them; therefore, using specifically trained word embeddings, we are able to classify them into customizable categories. We also show that training word embeddings on larger natural language corpora leads to improvements in precision of up to 180%.
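The evaluation measures named in this abstract can be illustrated with a minimal sketch for the two-class case. The function name `binary_metrics` and the example counts are assumptions made for illustration only; they are not taken from the paper, which does not state how its metrics were computed.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute common evaluation measures from binary confusion counts.

    tp/fp/fn/tn are hypothetical counts of true positives, false positives,
    false negatives and true negatives.
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / total

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_yes = ((tp + fp) / total) * ((tp + fn) / total)
    p_no = ((fn + tn) / total) * ((fp + tn) / total)
    p_expected = p_yes + p_no
    kappa = (accuracy - p_expected) / (1 - p_expected)

    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "accuracy": accuracy,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Illustrative counts only, not results from the paper.
    for name, value in binary_metrics(tp=40, fp=10, fn=20, tn=30).items():
        print(f"{name}: {value:.3f}")
```

Kappa is reported alongside accuracy because it discounts the agreement a classifier would reach by chance given the class proportions.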

Relevance:

30.00%

Publisher:

Abstract:

Background: Schizophrenia has been associated with semantic memory impairment, and previous studies report a difficulty in accessing semantic category exemplars (Moelter et al. 2005 Schizophr Res 78:209–217). The anterior temporal cortex (ATC) has been implicated in the representation of semantic knowledge (Rogers et al. 2004 Psychol Rev 111(1):205–235). We conducted a high-field (4T) fMRI study with the Category Judgment and Substitution Task (CJAST), an analogue of the Hayling test. We hypothesised that differential activation of the temporal lobe would be observed in schizophrenia patients versus controls. Methods: Eight schizophrenia patients (7M : 1F) and eight matched controls performed the CJAST, involving a randomised series of 55 common nouns (from five semantic categories) across three conditions: semantic categorisation, anomalous categorisation and word reading. High-resolution 3D T1-weighted images and GE EPI with BOLD contrast and sparse temporal sampling were acquired on a 4T Bruker MedSpec system. Image processing and analyses were performed with SPM2. Results: Differential activation in the left ATC was found for anomalous categorisation relative to category judgment, in patients versus controls. Conclusions: We examined semantic memory deficits in schizophrenia using a novel fMRI task. Since the ATC corresponds to an area involved in accessing abstract semantic representations (Moelter et al. 2005), these results suggest that schizophrenia patients utilise the same neural network as healthy controls, but that it is compromised in the patients; the different ATC activity might be attributable to weakening of category-to-category associations.

Relevance:

30.00%

Publisher:

Abstract:

We investigate whether dimensionality reduction using a latent generative model is beneficial for the task of weakly supervised scene classification. In detail, we are given a set of labeled images of scenes (for example, coast, forest, city, river, etc.), and our objective is to classify a new image into one of these categories. Our approach consists of first discovering latent "topics" using probabilistic Latent Semantic Analysis (pLSA), a generative model from the statistical text literature here applied to a bag of visual words representation for each image, and subsequently, training a multiway classifier on the topic distribution vector for each image. We compare this approach to that of representing each image by a bag of visual words vector directly and training a multiway classifier on these vectors. To this end, we introduce a novel vocabulary using dense color SIFT descriptors and then investigate the classification performance under changes in the size of the visual vocabulary, the number of latent topics learned, and the type of discriminative classifier used (k-nearest neighbor or SVM). We achieve superior classification performance to recent publications that have used a bag of visual word representation, in all cases, using the authors' own data sets and testing protocols. We also investigate the gain in adding spatial information. We show applications to image retrieval with relevance feedback and to scene classification in videos.
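A bag of visual words representation of the kind this abstract mentions can be sketched as follows. This is a simplified illustration under stated assumptions: descriptors are plain tuples of numbers, the vocabulary is a given list of centroid vectors, and assignment uses squared Euclidean distance. The names `nearest_word` and `visual_word_histogram` are hypothetical, and the pLSA topic discovery step from the abstract is not shown.

```python
def nearest_word(descriptor, vocabulary):
    """Return the index of the vocabulary entry closest to the descriptor."""
    def squared_distance(index):
        return sum((d - v) ** 2 for d, v in zip(descriptor, vocabulary[index]))
    return min(range(len(vocabulary)), key=squared_distance)


def visual_word_histogram(descriptors, vocabulary):
    """Quantize each descriptor to a visual word and return normalized counts."""
    histogram = [0] * len(vocabulary)
    for descriptor in descriptors:
        histogram[nearest_word(descriptor, vocabulary)] += 1
    total = sum(histogram)
    # An image with no descriptors keeps an all-zero histogram.
    return [count / total for count in histogram] if total else histogram


if __name__ == "__main__":
    # Toy 2-D "descriptors"; real SIFT descriptors have many more dimensions.
    vocabulary = [(0.0, 0.0), (10.0, 10.0)]
    descriptors = [(1.0, 0.0), (9.0, 9.0), (0.0, 1.0), (8.0, 10.0)]
    print(visual_word_histogram(descriptors, vocabulary))
```

The resulting vector can be passed directly to a classifier, or used as input to a topic model that reduces it to a shorter topic distribution vector.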

Relevance:

30.00%

Publisher:

Abstract:

Suffix separation plays a vital role in improving the quality of training in statistical machine translation (SMT) from English into Malayalam. The morphological richness and the agglutinative nature of Malayalam make it necessary to retrieve the root word from its inflected form in the training process. The suffix separation process accomplishes this task by scrutinizing the Malayalam words and by applying sandhi rules. In this paper, various handcrafted rules designed for the suffix separation process in English-Malayalam SMT are presented. These rules are classified according to the Malayalam syllable preceding the suffix in the inflected form of the word (check_letter). Suffixes beginning with vowel sounds, such as ആല, ഉെെ, ഇല, etc., are mainly considered in this process. By examining the check_letter in a word, the suffix separation rules can be directly applied to extract the root words. The quick look-up table provided in this paper can be used as a guideline for implementing suffix separation in the Malayalam language.

Relevance:

30.00%

Publisher:

Abstract:

This investigation moves beyond the traditional studies of word reading to identify how the production complexity of words affects reading accuracy in an individual with deep dyslexia (JO). We examined JO’s ability to read words aloud while manipulating both the production complexity of the words and the semantic context. The classification of words as either phonetically simple or complex was based on the Index of Phonetic Complexity. The semantic context was varied using a semantic blocking paradigm (i.e., semantically blocked and unblocked conditions). In the semantically blocked condition, words were grouped by semantic categories (e.g., table, sit, seat, couch), whereas in the unblocked condition the same words were presented in a random order. JO’s performance on reading aloud was also compared to her performance on a repetition task using the same items. Results revealed a strong interaction between word complexity and semantic blocking for reading aloud but not for repetition. JO produced the greatest number of errors for phonetically complex words in the semantically blocked condition. This interaction suggests that semantic processes are constrained by output production processes, which are exaggerated when derived from visual rather than auditory targets. This complex relationship between orthographic, semantic, and phonetic processes highlights the need for word recognition models to explicitly account for production processes.

Relevance:

30.00%

Publisher:

Abstract:

Complex networks have been employed to model many real systems and as a modeling tool in a myriad of applications. In this paper, we apply the framework of complex networks to the problem of supervised classification in the word sense disambiguation task, which consists in deriving a function from the supervised (or labeled) training data of ambiguous words. Traditional supervised data classification takes into account only topological or physical features of the input data. On the other hand, the human (animal) brain performs both low- and high-level orders of learning and can readily identify patterns according to the semantic meaning of the input data. In this paper, we apply a hybrid technique which encompasses both types of learning in the field of word sense disambiguation and show that the high-level order of learning can improve the accuracy rate of the model. This evidence demonstrates that the internal structures formed by the words do present patterns that, generally, cannot be correctly unveiled by traditional techniques alone. Finally, we exhibit the behavior of the model for different weights of the low- and high-level classifiers by plotting decision boundaries. This study helps one to better understand the effectiveness of the model. Copyright (C) EPLA, 2012

Relevance:

30.00%

Publisher:

Abstract:

Audio-visual documents obtained from German TV news are classified according to the IPTC topic categorization scheme. To this end, usual text classification techniques are adapted to speech, video, and non-speech audio. For each of the three modalities, word analogues are generated: sequences of syllables for speech, “video words” based on low-level color features (color moments, color correlogram and color wavelet), and “audio words” based on low-level spectral features (spectral envelope and spectral flatness) for non-speech audio. Such audio and video words provide a means to represent the different modalities in a uniform way. The frequencies of the word analogues represent audio-visual documents: the standard bag-of-words approach. Support vector machines are used for supervised classification in a 1 vs. n setting. Classification based on speech outperforms all other single modalities. Combining speech with non-speech audio improves classification. Classification is further improved by supplementing speech and non-speech audio with video words. Optimal F-scores range between 62% and 94%, corresponding to 50% to 84% above chance. The optimal combination of modalities depends on the category to be recognized. The construction of audio and video words from low-level features provides a good basis for the integration of speech, non-speech audio and video.
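The bag-of-words representation described in this abstract can be sketched in a few lines. The token names below (syllables, `v17`, `a3`, and so on) are invented placeholders, and concatenating the per-modality vectors is an assumption about how modalities might be combined; the abstract does not specify the combination scheme.

```python
from collections import Counter


def bag_of_words(tokens, vocabulary):
    """Return the frequency of each vocabulary entry in the token sequence."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


def combine_modalities(*vectors):
    """Concatenate per-modality frequency vectors into one feature vector."""
    combined = []
    for vector in vectors:
        combined.extend(vector)
    return combined


if __name__ == "__main__":
    # Hypothetical word analogues for one news document.
    speech = bag_of_words(["ta", "ges", "schau", "ta"], ["ta", "ges", "schau"])
    video = bag_of_words(["v17", "v17", "v02"], ["v02", "v17"])
    audio = bag_of_words(["a3"], ["a1", "a3"])
    print(combine_modalities(speech, video, audio))
```

Because every modality is reduced to token frequencies, the same downstream classifier can be trained on any single modality or on a combination of them.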

Relevance:

30.00%

Publisher:

Abstract:

Reading and listening involve complex psychological processes that recruit many brain areas. The anatomy of processing English words has been studied by a variety of imaging methods. Although there is widespread agreement on the general anatomical areas involved in comprehending words, there are still disputes about the computations that go on in these areas. Examination of the time relations (circuitry) among these anatomical areas can aid in understanding their computations. In this paper, we concentrate on tasks that involve obtaining the meaning of a word in isolation or in relation to a sentence. Our current data support a finding in the literature that frontal semantic areas are active well before posterior areas. We use the subject’s attention to amplify relevant brain areas involved either in semantic classification or in judging the relation of the word to a sentence to test the hypothesis that frontal areas are concerned with lexical semantics and posterior areas are more involved in comprehension of propositions that involve several words.

Relevance:

30.00%

Publisher:

Abstract:

This study aimed to investigate the acute effects of mild Traumatic Brain Injury (mTBI) on the performance of a finger tapping and word repetition dual task in order to determine working memory impairment in mTBI. Sixty-four (50 male, 14 female) right-handed cases of mTBI and 26 (18 male and 8 female) right-handed cases of orthopaedic injuries were tested within 24 hours of injury. Patients with mTBI completed fewer correct taps in 10 seconds than patients with orthopaedic injuries, and female mTBI cases repeated fewer words. The size of the dual task decrement did not vary between groups. When added to a test battery including the Rapid Screen of Concussion (RSC; Comerford, Geffen, May, Medland, & Geffen, 2002) and the Digit Symbol Substitution Test (DSST), finger tapping speed accounted for 1% of between-groups variance and did not improve classification rates of male participants. While the addition of tapping rate did not improve the sensitivity and specificity of the RSC and DSST to mTBI in males, univariate analysis of motor performance in females indicated that dual task performance might be diagnostic. An increase in female sample size is warranted. These results confirm the view that there is a generalized slowing of processing ability following mTBI.

Relevance:

30.00%

Publisher:

Abstract:

We propose a novel framework where an initial classifier is learned by incorporating prior information extracted from an existing sentiment lexicon. Preferences on expectations of sentiment labels of those lexicon words are expressed using generalized expectation criteria. Documents classified with high confidence are then used as pseudo-labeled examples for automatic domain-specific feature acquisition. The word-class distributions of such self-learned features are estimated from the pseudo-labeled examples and are used to train another classifier by constraining the model's predictions on unlabeled instances. Experiments on both the movie review data and the multi-domain sentiment dataset show that our approach attains comparable or better performance than existing weakly-supervised sentiment classification methods despite using no labeled documents.
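The pseudo-labeling step in this abstract can be illustrated with a deliberately simplified sketch. It swaps in plain lexicon vote counting for the generalized expectation criteria used in the paper, and the lexicon, the `min_margin` confidence threshold and the function names are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy lexicon; a real sentiment lexicon would be far larger.
LEXICON = {"good": "pos", "great": "pos", "bad": "neg", "awful": "neg"}


def lexicon_label(tokens, min_margin=2):
    """Label a document by lexicon votes, or return None if not confident."""
    pos = sum(1 for token in tokens if LEXICON.get(token) == "pos")
    neg = sum(1 for token in tokens if LEXICON.get(token) == "neg")
    if abs(pos - neg) < min_margin:
        return None  # low confidence: do not use as a pseudo-labeled example
    return "pos" if pos > neg else "neg"


def word_class_distributions(documents, min_margin=2):
    """Estimate P(class | word) from confidently pseudo-labeled documents."""
    counts = defaultdict(Counter)
    for tokens in documents:
        label = lexicon_label(tokens, min_margin)
        if label is None:
            continue
        for token in set(tokens):  # count each word once per document
            counts[token][label] += 1
    return {
        word: {label: n / sum(by_class.values()) for label, n in by_class.items()}
        for word, by_class in counts.items()
    }


if __name__ == "__main__":
    docs = [
        ["good", "great", "plot"],
        ["bad", "awful", "plot"],
        ["good", "great", "acting"],
        ["good", "bad", "acting"],  # mixed votes, skipped
    ]
    print(word_class_distributions(docs))
```

Words outside the lexicon, such as "plot" or "acting" here, acquire class distributions from the pseudo-labeled documents, which is the sense in which domain-specific features are self-learned.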

Relevance:

30.00%

Publisher:

Abstract:

The joint sentiment-topic (JST) model was previously proposed to detect sentiment and topic simultaneously from text. The only supervision required by JST model learning is domain-independent polarity word priors. In this paper, we modify the JST model by incorporating word polarity priors through modifying the topic-word Dirichlet priors. We study the polarity-bearing topics extracted by JST and show that by augmenting the original feature space with polarity-bearing topics, the in-domain supervised classifiers learned from the augmented feature representation achieve state-of-the-art performance of 95% on the movie review data and an average of 90% on the multi-domain sentiment dataset. Furthermore, using feature augmentation and selection according to the information gain criteria for cross-domain sentiment classification, our proposed approach performs better than or comparably to previous approaches. Moreover, our approach is much simpler and does not require difficult parameter tuning.