996 results for Word Classification


Relevance:

20.00% 20.00%

Publisher:

Abstract:

Explaining the diversity of languages across the world is one of the central aims of typological, historical, and evolutionary linguistics. We consider the effect of language contact (the number of non-native speakers a language has) on the way languages change and evolve. By analysing hundreds of languages within and across language families, regions, and text types, we show that languages with greater levels of contact typically employ fewer word forms to encode the same information content (a property we refer to as lexical diversity). Based on three types of statistical analyses, we demonstrate that this variance can in part be explained by the impact of non-native speakers on information encoding strategies. Finally, we argue that languages are information encoding systems shaped by the varying needs of their speakers. Language evolution and change should be modeled as the co-evolution of multiple intertwined adaptive systems: on the one hand, the structure of human societies and human learning capabilities, and on the other, the structure of language.
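To make the idea of "fewer word forms for the same content" concrete, the sketch below computes two common proxies for lexical diversity on a token list: the type-token ratio and the unigram entropy. Both measures, the function names, and the naive tokenizer are assumptions chosen for illustration; the abstract does not state which measures the study actually used.

```python
from collections import Counter
import math


def tokenize(text):
    """Naive tokenizer (assumption): lowercase, split on whitespace, trim punctuation."""
    tokens = (tok.strip(".,;:!?\"'()").lower() for tok in text.split())
    return [tok for tok in tokens if tok]


def type_token_ratio(tokens):
    """Distinct word forms divided by total tokens; 0.0 for empty input."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def unigram_entropy(tokens):
    """Shannon entropy (in bits) of the word-form distribution."""
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    sample = tokenize("The cat saw the cats.")
    print(type_token_ratio(sample), unigram_entropy(sample))
```

Both measures depend on text length, so a comparison across languages would need texts of comparable size and content, for example parallel translations.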

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The objective of this article is to study the problem of pedestrian classification across different light spectrum domains (visible and far-infrared (FIR)) and modalities (intensity, depth and motion). In recent years, there have been a number of approaches for classifying and detecting pedestrians in both FIR and visible images, but the methods are difficult to compare, because either the datasets are not publicly available or they do not offer a comparison between the two domains. Our two primary contributions are the following: (1) we propose a public dataset, named RIFIR, containing both FIR and visible images collected in an urban environment from a moving vehicle during daytime; and (2) we compare the state-of-the-art features in a multi-modality setup (intensity, depth and flow) in the far-infrared and visible domains. The experiments show that the feature families, intensity self-similarity (ISS), local binary patterns (LBP), local gradient patterns (LGP) and histogram of oriented gradients (HOG), computed from FIR and visible domains are highly complementary, but their relative performance varies across different modalities. In our experiments, the FIR domain has proven superior to the visible one for the task of pedestrian classification, but the overall best results are obtained by a multi-domain multi-modality multi-feature fusion.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We know that from mid-childhood onwards most new words are learned implicitly via reading; however, most word learning studies have taught novel items explicitly. We examined incidental word learning during reading by focusing on the well-documented finding that words which are acquired early in life are processed more quickly than those acquired later. Novel words were embedded in meaningful sentences and were presented to adult readers early (day 1) or later (day 2) during a five-day exposure phase. At test, adults read the novel words in semantically neutral sentences. Participants’ eye movements were monitored throughout exposure and test. Adults also completed a surprise memory test in which they had to match each novel word with its definition. Results showed a decrease in reading times for all novel words over exposure, and significantly longer total reading times at test for early than late novel words. Early-presented novel words were also remembered better in the offline test. Our results show that order of presentation influences processing time early in the course of acquiring a new word, consistent with partial and incremental growth in knowledge occurring as a function of an individual’s experience with each word.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

While eye movements have been used widely to investigate how skilled adult readers process written language, relatively little research has used this methodology with children. This is unfortunate because, as we discuss here, eye-movement studies have significant potential to inform our understanding of children’s reading development. We consider some of the empirical and theoretical issues that arise when using this methodology with children, illustrating our points with data from an experiment examining word frequency effects in 8-year-old children’s sentence reading. Children showed significantly longer gaze durations to low- than high-frequency words, demonstrating that linguistic characteristics of text drive children’s eye movements as they read. We discuss these findings within the broader context of how eye-movement studies can inform our understanding of children’s reading, and can assist with the development of appropriately targeted interventions to support children as they learn to read.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The present study examined the effects of word length on children’s eye movement behaviour when other variables were carefully controlled. Importantly, the results showed that word length influenced children’s reading times and fixation positions on words. Furthermore, children exhibited stronger word length effects than adults in gaze durations and refixations. Adults and children generally did not differ in initial landing positions, but did differ in refixation behaviour. Overall, the results indicated that while adults and children show similar effects of word length for early measures of eye movement behaviour, differences emerge in later measures.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

An important application of Big Data Analytics is the real-time analysis of streaming data. Streaming data poses unique challenges for data mining algorithms, such as concept drift, the need to analyse the data on the fly because data streams are unbounded, and the need for scalable algorithms because of potentially high data throughput. Fast real-time classification algorithms that adapt to concept drift exist; however, most approaches are not naturally parallel and are thus limited in their scalability. This paper presents work on the Micro-Cluster Nearest Neighbour (MC-NN) classifier. MC-NN is based on an adaptive statistical data summary built from Micro-Clusters. MC-NN is very fast and adaptive to concept drift whilst maintaining the parallel properties of the base KNN classifier. MC-NN is also competitive with existing data stream classifiers in terms of accuracy and speed.
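The idea of classifying a stream against compact micro-cluster summaries rather than stored instances can be sketched as follows. This is a heavily simplified illustration, not the MC-NN algorithm as published: the class names, the update rule (absorb on a correct nearest match, otherwise open a new micro-cluster), and the absence of any forgetting or splitting mechanism are all assumptions made for brevity.

```python
import math


class MicroCluster:
    """Summary of absorbed instances: count, per-feature linear sum, class label."""

    def __init__(self, x, label):
        self.n = 1
        self.linear_sum = list(x)
        self.label = label

    def centroid(self):
        return [s / self.n for s in self.linear_sum]

    def absorb(self, x):
        # Incremental update: no raw instances are stored.
        self.n += 1
        self.linear_sum = [s + v for s, v in zip(self.linear_sum, x)]


class MicroClusterNN:
    """Hypothetical, simplified nearest-micro-cluster stream classifier."""

    def __init__(self):
        self.clusters = []

    def _nearest(self, x):
        if not self.clusters:
            return None
        return min(self.clusters, key=lambda c: math.dist(c.centroid(), x))

    def predict(self, x):
        nearest = self._nearest(x)
        return None if nearest is None else nearest.label

    def learn(self, x, label):
        nearest = self._nearest(x)
        if nearest is not None and nearest.label == label:
            nearest.absorb(x)  # nearest summary agrees: update it
        else:
            self.clusters.append(MicroCluster(x, label))  # otherwise start a new one


if __name__ == "__main__":
    clf = MicroClusterNN()
    stream = [((0, 0), "a"), ((0, 1), "a"), ((5, 5), "b"), ((6, 5), "b")]
    for x, y in stream:
        clf.learn(x, y)
    print(clf.predict((0.2, 0.3)), clf.predict((5, 4)))
```

Because each micro-cluster is updated independently from its own count and sum, the distance computations can in principle be spread across workers, which is the property the abstract highlights.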

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Parkinson's disease is a neurodegenerative disease in which tremor is the main symptom. This paper investigates the use of different classification methods to identify tremors experienced by Parkinsonian patients. Some previous research has focussed tremor analysis on external body signals (e.g., electromyography, accelerometer signals, etc.). Our advantage is that we have access to sub-cortical data, which facilitates the application of the obtained results to real medical devices since we are dealing with brain signals directly. Local field potentials (LFP) were recorded in the subthalamic nucleus of 7 Parkinsonian patients through the implanted electrodes of a deep brain stimulation (DBS) device prior to its internalization. Measured LFP signals were preprocessed by means of splinting, downsampling, filtering, normalization and rectification. Then, feature extraction was conducted through a multi-level decomposition via a wavelet transform. Finally, artificial intelligence techniques were applied to feature selection, clustering of tremor types, and tremor detection. The key contribution of this paper is to present initial results which indicate, to a high degree of certainty, that there appear to be two distinct subgroups of patients within the group-1 of patients according to the Consensus Statement of the Movement Disorder Society on Tremor. Such results may well lead to different resultant treatments for the patients involved, depending on how their tremor has been classified. Moreover, we propose a new approach for demand-driven stimulation, in which tremor detection is also based on the subtype of tremor the patient has. Applying this knowledge to the tremor detection problem, it can be concluded that the results improve when patient clustering is applied prior to detection.
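The feature-extraction step described above (a multi-level wavelet decomposition of a preprocessed signal) can be illustrated with a small sketch. It assumes the Haar wavelet and per-band energy as the feature, neither of which is stated in the abstract; the function names are hypothetical, and the filtering and other preprocessing steps are omitted.

```python
import math


def rectify(signal):
    """Full-wave rectification: absolute value of every sample."""
    return [abs(v) for v in signal]


def haar_step(signal):
    """One Haar level: returns (approximation, detail) coefficients.

    A trailing sample is dropped if the length is odd.
    """
    s = math.sqrt(2.0)
    approx, detail = [], []
    for a, b in zip(signal[0::2], signal[1::2]):
        approx.append((a + b) / s)
        detail.append((a - b) / s)
    return approx, detail


def haar_decompose(signal, levels):
    """Return [detail_1, ..., detail_k, approx_k] for up to `levels` levels."""
    bands = []
    approx = list(signal)
    for _ in range(levels):
        if len(approx) < 2:
            break
        approx, detail = haar_step(approx)
        bands.append(detail)
    bands.append(approx)
    return bands


def band_energies(bands):
    """Sum of squared coefficients per band, usable as a compact feature vector."""
    return [sum(c * c for c in band) for band in bands]


if __name__ == "__main__":
    x = rectify([1, -3, 2, -6, 5, -1, 0, 2])
    print(band_energies(haar_decompose(x, levels=3)))
```

With an orthonormal wavelet such as this one, the band energies sum to the energy of the input signal when its length is a power of two, which gives a quick sanity check on the decomposition.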

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Electronic word-of-mouth (eWOM) is recognised as a means of interpersonal communication and a powerful marketing tool. However, previous studies have focussed on related motivations, and limited attention has been given to understanding the antecedents of eWOM communication behaviour in the travel industry. This study proposes a full and partial mediation model, which brings together for the first time three key antecedents: adoption of electronic communication technology, consumer dis/satisfaction with travel consumption experience, and subjective norm. The model aims to understand the impact of these antecedents on travellers' attitude towards eWOM communication and intention to use eWOM communication media. The data were collected from international travellers (n = 524), and structural equation modelling is used to test the conceptual framework. The findings of the study suggest that overall attitude towards eWOM communication partially mediates the impact of the traveller's adoption of electronic communication technology and subjective norm, and fully mediates the impact of consumer dis/satisfaction with travel consumption experience on travellers' intention to use eWOM communication media.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This work investigates the problem of feature selection for neuroimaging features from structural MRI brain images for the classification of subjects as healthy controls, suffering from Mild Cognitive Impairment or Alzheimer’s Disease. A Genetic Algorithm wrapper method for feature selection is adopted in conjunction with a Support Vector Machine classifier. In very large feature sets, feature selection is found to be redundant as the accuracy is often worsened when compared to a Support Vector Machine with no feature selection. However, when just the hippocampal subfields are used, feature selection shows a significant improvement of the classification accuracy. Three-class Support Vector Machines and two-class Support Vector Machines combined with weighted voting are also compared with the former and found more useful. The highest accuracy achieved at classifying the test data was 65.5% using a genetic algorithm for feature selection with a three-class Support Vector Machine classifier.
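A genetic-algorithm wrapper searches over binary feature masks and scores each mask by the accuracy of a classifier trained on the selected features. The sketch below shows that loop under several assumptions: a leave-one-out nearest-centroid rule stands in for the Support Vector Machine so the example has no dependencies, and the selection scheme, operators, and default parameters are illustrative choices, not those of the study.

```python
import random


def nearest_centroid_accuracy(X, y, mask):
    """Stand-in fitness (assumption): leave-one-out accuracy on the masked features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    correct = 0
    for i in range(len(X)):
        sums, counts = {}, {}
        for k, (row, label) in enumerate(zip(X, y)):
            if k == i:
                continue  # hold out sample i
            acc = sums.setdefault(label, [0.0] * len(cols))
            for c, j in enumerate(cols):
                acc[c] += row[j]
            counts[label] = counts.get(label, 0) + 1

        def dist(label):
            return sum((X[i][j] - sums[label][c] / counts[label]) ** 2
                       for c, j in enumerate(cols))

        if min(sums, key=dist) == y[i]:
            correct += 1
    return correct / len(X)


def ga_select(X, y, fitness, pop_size=20, generations=30, mutation_rate=0.1, seed=0):
    """Return the best feature mask found (needs >= 2 features and pop_size >= 4)."""
    rng = random.Random(seed)
    n = len(X[0])
    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=lambda m: fitness(X, y, m), reverse=True)
        parents = scored[:pop_size // 2]      # truncation selection
        children = [scored[0][:]]             # elitism: keep the current best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)         # one-point crossover
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = children
    return max(pop, key=lambda m: fitness(X, y, m))


if __name__ == "__main__":
    X = [[0.0, 1.0], [0.2, -1.0], [10.0, -1.0], [10.2, 1.0]]
    y = [0, 0, 1, 1]
    print(ga_select(X, y, nearest_centroid_accuracy))
```

Swapping in a different classifier only requires another function with the same `(X, y, mask)` signature. Because the fitness is measured on the same data used for the search, a separate held-out test set is still needed to report accuracy.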

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The personalised conditioning system (PCS) is widely studied. Potentially, it is able to reduce energy consumption while securing occupants’ thermal comfort requirements. It has been suggested that automatic optimised operation schemes for PCS should be introduced to avoid energy wastage and discomfort caused by inappropriate operation. In certain automatic operation schemes, personalised thermal sensation models are applied as key components to help in setting targets for PCS operation. In this research, a novel personal thermal sensation modelling method based on the C-Support Vector Classification (C-SVC) algorithm has been developed for PCS control. The personal thermal sensation modelling has been regarded as a classification problem. During the modelling process, the method ‘learns’ an occupant’s thermal preferences from his/her feedback, environmental parameters and personal physiological and behavioural factors. The modelling method has been verified by comparing the actual thermal sensation vote (TSV) with the modelled one based on 20 individual cases. Furthermore, the accuracy of each individual thermal sensation model has been compared with the outcomes of the PMV model. The results indicate that the modelling method presented in this paper is an effective tool to model personal thermal sensations and could be integrated within the PCS for optimised system operation and control.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The subtle juncture cues in older varieties of English such as Received Pronunciation can be difficult for speakers of new English varieties to perceive. This study looks at the perception of word juncture characteristics in three varieties of English (British, Hong Kong and Singapore) among British, Hong Kong and Singaporean listeners in order to widen our understanding of English juncture characteristics in general. We find that, even though reaction time data indicates that listeners perform quickest in the variety they are most familiar with, not only are juncture differences in British English difficult for Hong Kong and Singaporean listeners to perceive, they are also the most difficult for British listeners. Juncture characteristics in Hong Kong English are the easiest to distinguish among the three varieties.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Sea-ice concentrations in the Laptev Sea simulated by the coupled North Atlantic-Arctic Ocean-Sea-Ice Model and Finite Element Sea-Ice Ocean Model are evaluated using sea-ice concentrations from Advanced Microwave Scanning Radiometer-Earth Observing System satellite data and a polynya classification method for winter 2007/08. While developed to simulate large-scale sea-ice conditions, both models are analysed here in terms of polynya simulation. The main modification of both models in this study is the implementation of a landfast-ice mask. Simulated sea-ice fields from different model runs are compared with emphasis placed on the impact of this prescribed landfast-ice mask. We demonstrate that sea-ice models are not able to simulate flaw polynyas realistically when used without fast-ice description. Our investigations indicate that without landfast ice and with coarse horizontal resolution the models overestimate the fraction of open water in the polynya. This is not because a realistic polynya appears but because of a larger-scale reduction of ice concentrations and smoothed ice-concentration fields. After implementation of a landfast-ice mask, the polynya location is realistically simulated but the total open-water area is still overestimated in most cases. The study shows that the fast-ice parameterization is essential for model improvements. However, further improvements are necessary in order to progress from the simulation of large-scale features in the Arctic towards a more detailed simulation of smaller-scale features (here polynyas) in an Arctic shelf sea.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Creating non-word lists is a necessary but time-consuming exercise often needed when conducting behavioural language tasks involving lexical decision-making or non-word reading. The following article describes the process whereby we created a list of 226 non-words matching 226 items of the Snodgrass picture set (Snodgrass & Vanderwart, 1980). The non-words were matched for number of syllables, stress pattern, number of phonemes, bigram count and presence and location of the target sound when relevant.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Epidendrum L. is the largest genus of Orchidaceae in the Neotropical region; it has an impressive morphological diversification, which imposes difficulties in delimitation of both infrageneric and interspecific boundaries. In this study, we review infrageneric boundaries within the subgenus Amphiglottium and try to contribute to the understanding of morphological diversification and taxa delimitation within this group. We tested the monophyly of the subgenus Amphiglottium sect. Amphiglottium, expanding previous phylogenetic investigations, and re-evaluated previously proposed infrageneric classifications. Sequence data from the trnL-trnF region were analyzed with both parsimony and maximum likelihood criteria. AFLP markers were also obtained and analyzed with phylogenetic and principal coordinate analyses. Additionally, we obtained chromosome numbers for representative species within the group. The results strengthen the monophyly of the subgenus Amphiglottium but do not support the current classification system proposed by previous authors. Only section Tuberculata comprises a well-supported monophyletic group, with sections Carinata and Integra not supported. Instead of morphology, biogeographical and ecological patterns are reflected in the phylogenetic signal in this group. This study also confirms the large variability of chromosome numbers for the subgenus Amphiglottium (numbers ranging from 2n = 24 to 2n = 240), suggesting that polyploidy and hybridization are probably important mechanisms of speciation within the group.