903 results for Audio-visual content classification


Relevance:

30.00% 30.00%

Publisher:

Abstract:

A novel framework for multimodal semantic-associative collateral image labelling, which aims to associate image regions with textual keywords, is described. Both the primary image and the collateral textual modalities are exploited in a cooperative and complementary fashion. The collateral content and context-based knowledge is used to bias the mapping from the low-level region-based visual primitives to the high-level visual concepts defined in a visual vocabulary. We introduce the notion of collateral context, which is represented as a co-occurrence matrix of the visual keywords. A collaborative mapping scheme is devised using statistical methods such as Gaussian distribution or Euclidean distance, together with a collateral content and context-driven inference mechanism. Finally, we use Self Organising Maps to examine the classification and retrieval effectiveness of the proposed high-level image feature vector model, which is constructed from the image labelling results.
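As a rough illustration of the co-occurrence representation mentioned above, the sketch below counts how often two keywords label the same image. The abstract does not say how the matrix is built, so the counting rule, the function name `cooccurrence`, and the sample keywords are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(keyword_sets):
    """Count how often each unordered keyword pair labels the same image.

    Assumption: one plausible way to build a keyword co-occurrence
    matrix; the original framework may define it differently.
    """
    counts = Counter()
    for keywords in keyword_sets:
        # Sort so each pair is stored under one canonical key.
        for a, b in combinations(sorted(set(keywords)), 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical keyword annotations for three images.
labels = [["sky", "sea", "sand"], ["sky", "sea"], ["sky", "grass"]]
matrix = cooccurrence(labels)
print(matrix[("sea", "sky")])
```

A sparse `Counter` keyed by keyword pairs is used instead of a dense array because most keyword pairs never occur together in a small vocabulary sample.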

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Embodied theories of cognition propose that neural substrates used in experiencing the referent of a word, for example perceiving upward motion, should be engaged in weaker form when that word, for example ‘rise’, is comprehended. Motivated by the finding that the perception of irrelevant background motion at near-threshold, but not supra-threshold, levels interferes with task execution, we assessed whether interference from near-threshold background motion was modulated by its congruence with the meaning of words (semantic content) when participants completed a lexical decision task (deciding if a string of letters is a real word or not). Reaction times for motion words, such as ‘rise’ or ‘fall’, were slower when the direction of visual motion and the ‘motion’ of the word were incongruent, but only when the visual motion was at near-threshold levels. When motion was supra-threshold, the distribution of error rates, not reaction times, implicated low-level motion processing in the semantic processing of motion words. As the perception of near-threshold signals is not likely to be influenced by strategies, our results support a close contact between semantic information and perceptual systems.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

A unified view on the interfacial instability in a model of aluminium reduction cells in the presence of a uniform, vertical, background magnetic field is presented. The classification of instability modes is based on the asymptotic theory for high values of parameter β, which characterises the ratio of the Lorentz force based on the disturbance current, and gravity. It is shown that the spectrum of the travelling waves consists of two parts independent of the horizontal cross-section of the cell: highly unstable wall modes and stable or weakly unstable centre, or Sele’s modes. The wall modes with the disturbance of the interface being localised at the sidewalls of the cell dominate the dynamics of instability. Sele’s modes are characterised by a distributed disturbance over the whole horizontal extent of the cell. As β increases these modes are stabilized by the field.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

We provide a unified framework for a range of linear transforms that can be used for the analysis of terahertz spectroscopic data, with particular emphasis on their application to the measurement of leaf water content. The use of linear transforms for filtering, regression, and classification is discussed. For illustration, a classification problem involving leaves at three stages of drought and a prediction problem involving simulated spectra are presented. Issues resulting from scaling the data set are discussed. Using Lagrange multipliers, we arrive at the transform that yields the maximum separation between the spectra and show that this optimal transform is equivalent to computing the Euclidean distance between the samples. The optimal linear transform is compared with the average for all the spectra as well as with the Karhunen–Loève transform to discriminate a wet leaf from a dry leaf. We show that taking several principal components into account is equivalent to defining new axes in which data are to be analyzed. The procedure shows that the coefficients of the Karhunen–Loève transform are well suited to the process of classification of spectra. This is in line with expectations, as these coefficients are built from the statistical properties of the data set analyzed.
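The link between the Lagrange-multiplier argument and the Euclidean distance can be sketched as follows. This is one plausible reading of the abstract, not the paper's own derivation: it assumes two spectra $\mathbf{x}_1$ and $\mathbf{x}_2$, a unit-norm linear transform $\mathbf{w}$, and separation measured as the difference of the projected values. All symbols are introduced here for illustration.

```latex
% Assumed setup: d = x_1 - x_2, maximise the projected separation w^T d
% subject to the normalisation w^T w = 1.
\begin{align}
  \mathcal{L}(\mathbf{w}, \lambda)
    &= \mathbf{w}^{\mathsf{T}}\mathbf{d}
       - \lambda\left(\mathbf{w}^{\mathsf{T}}\mathbf{w} - 1\right),
    \qquad \mathbf{d} = \mathbf{x}_1 - \mathbf{x}_2, \\
  \frac{\partial \mathcal{L}}{\partial \mathbf{w}}
    &= \mathbf{d} - 2\lambda\,\mathbf{w} = \mathbf{0}
    \;\Longrightarrow\;
    \mathbf{w} = \frac{\mathbf{d}}{2\lambda}, \\
  \mathbf{w}^{\mathsf{T}}\mathbf{w} = 1
    &\;\Longrightarrow\;
    \lambda = \tfrac{1}{2}\lVert \mathbf{d} \rVert
    \;\Longrightarrow\;
    \mathbf{w}^{\star} = \frac{\mathbf{d}}{\lVert \mathbf{d} \rVert}, \\
  \mathbf{w}^{\star\mathsf{T}}\mathbf{d}
    &= \lVert \mathbf{x}_1 - \mathbf{x}_2 \rVert .
\end{align}
```

Under these assumptions the maximum attainable separation equals the Euclidean distance between the two spectra, which is consistent with the equivalence stated in the abstract.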

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This essay traces the development of Otto Neurath’s ideas that led to the publication of one of the first series of children’s books produced by the Isotype Institute in the late 1940s, the Visual History of Mankind. Described in its publicity material as ‘new in content’ and ‘new in method’, it embodied much of Otto Neurath’s thinking about visual education, and also coincided with other educational ideas in the UK in the 1930s and 1940s. It exemplified the Isotype Institute’s approach: teamwork, thinking about the needs of younger readers, clear explanation, and accessible content. Further, drawing on correspondence, notes and drawings from the Otto and Marie Neurath Isotype Collection at the University of Reading, the essay presents insights into the making of the books and the people involved, the costs of production and their influence on design decisions, and how the books were received by teachers and children.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Background: The analysis of the Auditory Brainstem Response (ABR) is of fundamental importance to the investigation of the behaviour of the auditory system, though its interpretation is subjective because of the manual process employed in its study and the clinical experience required for its analysis. When analysing the ABR, clinicians are often interested in the identification of ABR signal components referred to as Jewett waves. In particular, the detection and study of the time when these waves occur (i.e., the wave latency) is a practical tool for the diagnosis of disorders affecting the auditory system. Significant differences in inter-examiner results may lead to completely distinct clinical interpretations of the state of the auditory system. In this context, the aim of this research was to evaluate the inter-examiner agreement and variability in the manual classification of ABR. Methods: A total of 160 ABR data samples were collected, for four different stimulus intensities (80 dBHL, 60 dBHL, 40 dBHL and 20 dBHL), from 10 normal-hearing subjects (5 men and 5 women, aged 20 to 52 years). Four examiners with expertise in the manual classification of ABR components participated in the study. The Bland-Altman statistical method was employed for the assessment of inter-examiner agreement and variability. The mean, standard deviation and error for the bias, which is the difference between examiners’ annotations, were estimated for each pair of examiners. Scatter plots and histograms were employed for data visualization and analysis. Results: In most comparisons the differences between examiners’ annotations were below 0.1 ms, which is clinically acceptable. In four cases, a large error and standard deviation (>0.1 ms) were found, which indicates the presence of outliers and thus discrepancies between examiners. Conclusions: Our results quantify the inter-examiner agreement and variability of the manual analysis of ABR data, and they also allow for the determination of different patterns of manual ABR analysis.
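The Bland-Altman quantities named in the Methods (mean, standard deviation and standard error of the bias for a pair of examiners) can be sketched as below. The latency values are made up for illustration and are not data from the study; the function name `bland_altman` and the conventional 1.96 multiplier for the limits of agreement are assumptions.

```python
from math import sqrt
from statistics import mean, stdev


def bland_altman(a, b):
    """Return bias, SD, standard error and 95% limits of agreement.

    `a` and `b` are paired annotations (for example wave latencies in
    ms) from two examiners. The bias is the mean paired difference.
    """
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)                # sample standard deviation
    se = sd / sqrt(len(diffs))       # standard error of the bias
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return bias, sd, se, limits


# Hypothetical latencies (ms) marked by two examiners on four traces.
examiner_a = [5.60, 5.72, 5.81, 6.10]
examiner_b = [5.58, 5.75, 5.80, 6.02]
bias, sd, se, limits = bland_altman(examiner_a, examiner_b)
print(f"bias={bias:.3f} ms, sd={sd:.3f} ms, se={se:.3f} ms")
```

With four examiners, the same function would be called once per pair of examiners, giving six comparisons per wave and stimulus intensity.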

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Background: Since their inception, Twitter and related microblogging systems have provided a rich source of information for researchers and have attracted interest in their affordances and use. Since 2009 PubMed has included 123 journal articles on medicine and Twitter, but no overview exists as to how the field uses Twitter in research. // Objective: This paper aims to identify published work relating to Twitter indexed by PubMed, and then to classify it. This classification will provide a framework in which future researchers will be able to position their work, and to provide an understanding of the current reach of research using Twitter in medical disciplines. Limiting the study to papers indexed by PubMed ensures the work provides a reproducible benchmark. // Methods: Papers, indexed by PubMed, on Twitter and related topics were identified and reviewed. The papers were then qualitatively classified based on the paper’s title and abstract to determine their focus. The work that was Twitter focused was studied in detail to determine what data, if any, it was based on, and from this a categorization of the data set size used in the studies was developed. Using open coded content analysis additional important categories were also identified, relating to the primary methodology, domain and aspect. // Results: As of 2012, PubMed comprises more than 21 million citations from biomedical literature, and from these a corpus of 134 potentially Twitter related papers was identified, eleven of which were subsequently found not to be relevant. There were no papers prior to 2009 relating to microblogging, a term first used in 2006. Of the remaining 123 papers which mentioned Twitter, thirty were focussed on Twitter (the others referring to it tangentially). The early Twitter focussed papers introduced the topic and highlighted the potential, not carrying out any form of data analysis. The majority of published papers used analytic techniques to sort through thousands, if not millions, of individual tweets, often depending on automated tools to do so. Our analysis demonstrates that researchers are starting to use knowledge discovery methods and data mining techniques to understand vast quantities of tweets: the study of Twitter is becoming quantitative research. // Conclusions: This work is to the best of our knowledge the first overview study of medical related research based on Twitter and related microblogging. We have used five dimensions to categorise published medical related research on Twitter. This classification provides a framework within which researchers studying development and use of Twitter within medical related research, and those undertaking comparative studies of research relating to Twitter in the area of medicine and beyond, can position and ground their work.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

In The Conduct of Inquiry in International Relations, Patrick Jackson situates methodologies in International Relations in relation to their underlying philosophical assumptions. One of his aims is to map International Relations debates in a way that ‘capture[s] current controversies’ (p. 40). This ambition is overstated: whilst Jackson’s typology is useful as a clarificatory tool, (re)classifying existing scholarship in International Relations is more problematic. One problem with Jackson’s approach is that he tends to run together the philosophical assumptions which decisively differentiate his methodologies (by stipulating a distinctive warrant for knowledge claims) and the explanatory strategies that are employed to generate such knowledge claims, suggesting that the latter are entailed by the former. In fact, the explanatory strategies which Jackson associates with each methodology reflect conventional practice in International Relations just as much as they reflect philosophical assumptions. This makes it more difficult to identify each methodology at work than Jackson implies. I illustrate this point through a critical analysis of Jackson’s controversial reclassification of Waltz as an analyticist, showing that whilst Jackson’s typology helps to expose inconsistencies in Waltz’s approach, it does not fully support the proposed reclassification. The conventional aspect of methodologies in International Relations also raises questions about the limits of Jackson’s ‘engaged pluralism’.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Prism is a modular classification rule generation method based on the ‘separate and conquer’ approach, an alternative to rule induction using decision trees, which is also known as ‘divide and conquer’. Prism often achieves a similar level of classification accuracy compared with decision trees, but tends to produce a more compact, noise-tolerant set of classification rules. As with other classification rule generation methods, a principal problem arising with Prism is that of overfitting due to over-specialised rules. In addition, over-specialised rules increase the associated computational complexity. These problems can be solved by pruning methods. For the Prism method, two pruning algorithms have been introduced recently for reducing overfitting of classification rules: J-pruning and Jmax-pruning. Both algorithms are based on the J-measure, an information theoretic means for quantifying the theoretical information content of a rule. Jmax-pruning attempts to exploit the J-measure to its full potential because J-pruning does not actually achieve this and may even lead to underfitting. A series of experiments has shown that Jmax-pruning may outperform J-pruning in reducing overfitting. However, Jmax-pruning is computationally relatively expensive and may also lead to underfitting. This paper reviews the Prism method and the two existing pruning algorithms above. It also proposes a novel pruning algorithm called Jmid-pruning. The latter is based on the J-measure and it reduces overfitting to a similar level as the other two algorithms but is better at avoiding underfitting and unnecessary computational effort. The authors conduct an experimental study on the performance of the Jmid-pruning algorithm in terms of classification accuracy and computational efficiency. The algorithm is also evaluated comparatively with the J-pruning and Jmax-pruning algorithms.
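For readers unfamiliar with the J-measure, the sketch below computes it for a single rule of the form IF Y = y THEN X = x, using the standard definition due to Smyth and Goodman. It illustrates only the measure itself, not any of the three pruning algorithms; the function and argument names are chosen here for illustration.

```python
from math import log2


def j_measure(p_y, p_x, p_x_given_y):
    """J-measure of the rule IF Y=y THEN X=x, in bits.

    p_y          probability that the rule's left-hand side fires
    p_x          prior probability of the class x (must be in (0, 1))
    p_x_given_y  probability of x given that the rule fires
    """
    def term(posterior, prior):
        # By convention 0 * log(0) is taken as 0.
        return 0.0 if posterior == 0 else posterior * log2(posterior / prior)

    # Cross-entropy between posterior and prior class distributions.
    j_inner = term(p_x_given_y, p_x) + term(1 - p_x_given_y, 1 - p_x)
    return p_y * j_inner


# A rule that fires half the time and always predicts correctly,
# for a class with prior 0.5, carries 0.5 bits.
print(j_measure(0.5, 0.5, 1.0))
```

A rule whose posterior equals the prior yields a J-measure of zero, which matches the intuition that such a rule carries no information about the class.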

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Visual motion cues play an important role in animal and human locomotion without the need to extract actual ego-motion information. This paper demonstrates a method for estimating the visual motion parameters, namely the Time-To-Contact (TTC), Focus of Expansion (FOE), and image angular velocities, from a sparse optical flow estimate registered from a downward-looking camera. The presented method is capable of estimating the visual motion parameters under complicated six-degrees-of-freedom motion, in real time, and with suitable accuracy for the visual navigation of mobile robots.
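To make the FOE and TTC quantities concrete, here is a minimal sketch for the much simpler case of pure translation towards a flat surface, where every flow vector points away from the FOE. It is not the paper's 6-degrees-of-freedom method; the least-squares formulation, the function names, and the synthetic flow field are assumptions for illustration.

```python
from math import hypot


def estimate_foe(points, flows):
    """Least-squares FOE for a purely expanding sparse flow field.

    Each flow vector (u, v) at pixel (x, y) lies on a line through
    the FOE (x0, y0), giving v*x0 - u*y0 = v*x - u*y. The 2x2 normal
    equations of that linear system are solved directly.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (x, y), (u, v) in zip(points, flows):
        rhs = v * x - u * y
        a11 += v * v
        a12 -= u * v
        a22 += u * u
        b1 += v * rhs
        b2 -= u * rhs
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("flow vectors are parallel; FOE is undetermined")
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det


def time_to_contact(points, flows, foe):
    """Mean of distance-from-FOE over flow magnitude, in frames."""
    ratios = [hypot(x - foe[0], y - foe[1]) / hypot(u, v)
              for (x, y), (u, v) in zip(points, flows) if hypot(u, v) > 0]
    return sum(ratios) / len(ratios)


# Synthetic flow: FOE at (10, 20), contact in 5 frames.
points = [(0, 0), (30, 20), (10, 50), (40, 60)]
flows = [((x - 10) / 5, (y - 20) / 5) for x, y in points]
foe = estimate_foe(points, flows)
ttc = time_to_contact(points, flows, foe)
```

Rotational motion adds a flow component that does not radiate from the FOE, so a real implementation would need to estimate or remove the angular velocities before applying this kind of fit.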

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This work presents a method of information fusion involving data captured by both a standard CCD camera and a ToF camera to be used in the detection of the proximity between a manipulator robot and a human. Both cameras are assumed to be located above the work area of an industrial robot. The fusion of colour images and time-of-flight information makes it possible to know the 3D localization of objects with respect to a world coordinate system. At the same time, it makes their colour information available. Considering that the ToF information given by the range camera contains inaccuracies including distance error, border error, and pixel saturation, some corrections to the ToF information are proposed and developed to improve the results. The proposed fusion method uses the calibration parameters of both cameras to reproject 3D ToF points, expressed in a common coordinate system for both cameras and a robot arm, into 2D colour images. In addition, using the 3D information, motion detection in an industrial robot environment is achieved, and the fusion of information is applied to the foreground objects previously detected. This combination of information results in a matrix that links colour and 3D information, giving the possibility of characterising the object by its colour in addition to its 3D localization. Further development of these methods will make it possible to identify objects and their position in the real world, and to use this information to prevent possible collisions between the robot and such objects.
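The reprojection step described above can be illustrated with an ideal pinhole model. This is a minimal sketch that ignores lens distortion and the ToF corrections the abstract mentions; the function name `project_point`, the parameter names, and the numeric calibration values are hypothetical, not taken from the paper.

```python
def project_point(point_world, rotation, translation, fx, fy, cx, cy):
    """Project a 3D world point into pixel coordinates of a colour camera.

    rotation (3x3 nested lists) and translation (3 values) are the
    camera extrinsics mapping world to camera coordinates; fx, fy,
    cx, cy are pinhole intrinsics. Lens distortion is ignored.
    """
    # World -> camera coordinates: p_cam = R * p_world + t
    xc, yc, zc = (
        sum(r * p for r, p in zip(row, point_world)) + t
        for row, t in zip(rotation, translation)
    )
    if zc <= 0:
        raise ValueError("point is behind the camera")
    # Perspective division followed by the intrinsic mapping.
    return fx * xc / zc + cx, fy * yc / zc + cy


# Hypothetical calibration: camera aligned with the world frame.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
u, v = project_point((0.5, -0.25, 2.0), identity, (0.0, 0.0, 0.0),
                     fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(u, v)
```

Looking up the colour image at the returned pixel for each ToF point is one way to obtain a structure that links colour and 3D position, in the spirit of the matrix the abstract describes.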

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This work presents a method of information fusion involving data captured by both a standard charge-coupled device (CCD) camera and a time-of-flight (ToF) camera to be used in the detection of the proximity between a manipulator robot and a human. Both cameras are assumed to be located above the work area of an industrial robot. The fusion of colour images and time-of-flight information makes it possible to know the 3D localization of objects with respect to a world coordinate system. At the same time, it makes their colour information available. Considering that the ToF information given by the range camera contains inaccuracies including distance error, border error, and pixel saturation, some corrections to the ToF information are proposed and developed to improve the results. The proposed fusion method uses the calibration parameters of both cameras to reproject 3D ToF points, expressed in a common coordinate system for both cameras and a robot arm, into 2D colour images. In addition, using the 3D information, motion detection in an industrial robot environment is achieved, and the fusion of information is applied to the foreground objects previously detected. This combination of information results in a matrix that links colour and 3D information, giving the possibility of characterising the object by its colour in addition to its 3D localisation. Further development of these methods will make it possible to identify objects and their position in the real world and to use this information to prevent possible collisions between the robot and such objects.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Information was collated on the seed storage behaviour of 67 tree species native to the Amazon rainforest of Brazil; 38 appeared to show orthodox, 23 recalcitrant and six intermediate seed storage behaviour. A double-criteria key based on thousand-seed weight and seed moisture content at shedding to estimate likely seed storage behaviour, developed previously, showed good agreement with the above classifications. The key can aid seed storage behaviour identification considerably.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Objective. Interferences from spatially adjacent non-target stimuli are known to evoke event-related potentials (ERPs) during non-target flashes and, therefore, lead to false positives. This phenomenon is commonly seen in visual attention-based brain–computer interfaces (BCIs) using conspicuous stimuli and is known to adversely affect the performance of BCI systems. Although users try to focus on the target stimulus, they cannot help but be affected by conspicuous changes of the stimuli (such as flashes or presenting images) which are adjacent to the target stimulus. Furthermore, subjects have reported that conspicuous stimuli made them tired and annoyed. In view of this, the aim of this study was to reduce adjacent interference, annoyance and fatigue using a new stimulus presentation pattern based upon facial expression changes. Our goal was not to design a new pattern which could evoke larger ERPs than the face pattern, but to design a new pattern which could reduce adjacent interference, annoyance and fatigue, and evoke ERPs as good as those observed during the face pattern. Approach. Positive facial expressions could be changed to negative facial expressions by minor changes to the original facial image. Although the changes are minor, the contrast is big enough to evoke strong ERPs. In this paper, a facial expression change pattern between positive and negative facial expressions was used to attempt to minimize interference effects. This was compared against two different conditions: a shuffled pattern containing the same shapes and colours as the facial expression change pattern, but without the semantic content associated with a change in expression, and a face versus no face pattern. Comparisons were made in terms of classification accuracy and information transfer rate as well as user-supplied subjective measures. Main results. The results showed that interferences from adjacent stimuli, annoyance and the fatigue experienced by the subjects could be reduced significantly (p < 0.05) by using the facial expression change patterns in comparison with the face pattern. The offline results show that the classification accuracy of the facial expression change pattern was significantly better than that of the shuffled pattern (p < 0.05) and the face pattern (p < 0.05). Significance. The facial expression change pattern presented in this paper reduced interference from adjacent stimuli and decreased the fatigue and annoyance experienced by BCI users significantly (p < 0.05) compared to the face pattern.
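Information transfer rate in BCI studies is often computed with the Wolpaw formula, which combines the number of selectable targets with the classification accuracy. The abstract does not state which definition was used, so the sketch below is an assumption, and the example figures (36 targets, 90% accuracy, 4 selections per minute) are hypothetical.

```python
from math import log2


def bits_per_selection(n_targets, accuracy):
    """Wolpaw information transfer per selection, in bits.

    B = log2(N) + P*log2(P) + (1 - P)*log2((1 - P) / (N - 1))
    Terms with a zero coefficient are skipped (0 * log 0 = 0).
    """
    if n_targets < 2:
        raise ValueError("need at least two targets")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    bits = log2(n_targets)
    if accuracy > 0:
        bits += accuracy * log2(accuracy)
    if accuracy < 1:
        bits += (1 - accuracy) * log2((1 - accuracy) / (n_targets - 1))
    return bits


def information_transfer_rate(n_targets, accuracy, selections_per_min):
    """Bits per minute for a given selection speed."""
    return bits_per_selection(n_targets, accuracy) * selections_per_min


# Hypothetical speller: 36 targets, 90% accuracy, 4 selections/min.
itr = information_transfer_rate(36, 0.9, 4)
print(f"{itr:.2f} bits/min")
```

The formula assumes equally probable targets and errors spread evenly over the wrong targets, which is why reported ITR values are usually read alongside the raw classification accuracy.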

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Interferences from spatially adjacent non-target stimuli evoke ERPs during non-target sub-trials and lead to false positives. This phenomenon is commonly seen in visual attention-based BCIs and affects the performance of BCI systems. Although users or subjects tried to focus on the target stimulus, they still could not help being affected by conspicuous changes of the stimuli (flashes or presenting images) which were adjacent to the target stimulus. In view of this, the aim of this study is to reduce the adjacent interference using a new stimulus presentation pattern based on facial expression changes. Positive facial expressions can be changed to negative facial expressions by minor changes to the original facial image. Although the changes are minor, the contrast will be big enough to evoke strong ERPs. In this paper, two different conditions (Pattern_1, Pattern_2) were compared across objective measures such as classification accuracy and information transfer rate as well as subjective measures. Pattern_1 was a “flash-only” pattern and Pattern_2 was a facial expression change of a dummy face. In the facial expression change patterns, the background is a positive facial expression and the stimulus is a negative facial expression. The results showed that the interferences from adjacent stimuli could be reduced significantly (p < 0.05) by using the facial expression change patterns. The online performance of the BCI system using the facial expression change patterns was significantly better than that using the “flash-only” patterns in terms of classification accuracy (p < 0.01), bit rate (p < 0.01), and practical bit rate (p < 0.01). Subjects reported that the annoyance and fatigue could be significantly decreased (p < 0.05) using the new stimulus presentation pattern presented in this paper.