936 resultados para Image recognition
Resumo:
The proliferation of news reports published in online websites and news information sharing among social media users necessitates effective techniques for analysing the image, text and video data related to news topics. This paper presents the first study to classify affective facial images on emerging news topics. The proposed system dynamically monitors and selects the current hot (of great interest) news topics with strong affective interestingness using textual keywords in news articles and social media discussions. Images from the selected hot topics are extracted and classified into three categorized emotions, positive, neutral and negative, based on facial expressions of subjects in the images. Performance evaluations on two facial image datasets collected from real-world resources demonstrate the applicability and effectiveness of the proposed system in affective classification of facial images in news reports. Facial expression shows high consistency with the affective textual content in news reports for positive emotion, while only low correlation has been observed for neutral and negative. The system can be directly used for applications, such as assisting editors in choosing photos with a proper affective semantic for a certain topic during news report preparation.
Resumo:
State and local governments frequently look to flagship cultural projects to improve the city image and catalyze tourism but, in the process, often overlook their potential to foster local arts development. To better understand this role, the article examines if and how cultural institutions in Los Angeles and San Francisco attract and support arts-related activity. The analysis reveals that cultural flagships have mixed success in generating arts-based development and that their ability may be improved through attention to the local context, facility and institutional characteristics, and the approach of the sponsoring agencies. Such knowledge is useful for planners to enhance their revitalization efforts, particularly as the economic development potential of arts organizations and artists has become more apparent.
Resumo:
Dealing with digital medical images is raising many new security problems with legal and ethical complexities for local archiving and distant medical services. These include image retention and fraud, distrust and invasion of privacy. This project was a significant step forward in developing a complete framework for systematically designing, analyzing, and applying digital watermarking, with a particular focus on medical image security. A formal generic watermarking model, three new attack models, and an efficient watermarking technique for medical images were developed. These outcomes contribute to standardizing future research in formal modeling and complete security and computational analysis of watermarking schemes.
Resumo:
While formal definitions and security proofs are well established in some fields like cryptography and steganography, they are not as evident in digital watermarking research. A systematic development of watermarking schemes is desirable, but at present their development is usually informal, ad hoc, and omits the complete realization of application scenarios. This practice not only hinders the choice and use of a suitable scheme for a watermarking application, but also leads to debate about the state-of-the-art for different watermarking applications. With a view to the systematic development of watermarking schemes, we present a formal generic model for digital image watermarking. Considering possible inputs, outputs, and component functions, the initial construction of a basic watermarking model is developed further to incorporate the use of keys. On the basis of our proposed model, fundamental watermarking properties are defined and their importance exemplified for different image applications. We also define a set of possible attacks using our model showing different winning scenarios depending on the adversary capabilities. It is envisaged that with a proper consideration of watermarking properties and adversary actions in different image applications, use of the proposed model would allow a unified treatment of all practically meaningful variants of watermarking schemes.
Resumo:
RNA polymerase II (pol II) transcription termination requires co‐transcriptional recognition of a functional polyadenylation signal, but the molecular mechanisms that transduce this signal to pol II remain unclear. We show that Yhh1p/Cft1p, the yeast homologue of the mammalian AAUAAA interacting protein CPSF 160, is an RNA‐binding protein and provide evidence that it participates in poly(A) site recognition. Interestingly, RNA binding is mediated by a central domain composed of predicted β‐propeller‐forming repeats, which occurs in proteins of diverse cellular functions. We also found that Yhh1p/Cft1p bound specifically to the phosphorylated C‐terminal domain (CTD) of pol II in vitro and in a two‐hybrid test in vivo. Furthermore, transcriptional run‐on analysis demonstrated that yhh1 mutants were defective in transcription termination, suggesting that Yhh1p/Cft1p functions in the coupling of transcription and 3′‐end formation. We propose that direct interactions of Yhh1p/Cft1p with both the RNA transcript and the CTD are required to communicate poly(A) site recognition to elongating pol II to initiate transcription termination.
Resumo:
In increasingly competitive labour markets, attracting and retaining talent has become a prime concern of organisations. Employers need to understand the range of factors that influence career decision making and the role of employer branding in attracting human capital that best fits and contributes to the strategic aims of an organisation. This chapter identifies the changing factors that attract people to certain employment and industries and discusses the importance of aligning employer branding with employee branding to create a strong, genuine and lasting employer brand. Whilst organisations have long used marketing and branding practices to engender loyalty in customers, they are increasingly expanding this activity to differentiate organisations and make them attractive from an employee perspective. This chapter discusses employer branding and industry image as two important components of attraction strategies and describes ways companies can maximise their brand awareness in the employment market to both current and future employees.
Resumo:
For robots operating in outdoor environments, a number of factors, including weather, time of day, rough terrain, high speeds, and hardware limitations, make performing vision-based simultaneous localization and mapping with current techniques infeasible due to factors such as image blur and/or underexposure, especially on smaller platforms and low-cost hardware. In this paper, we present novel visual place-recognition and odometry techniques that address the challenges posed by low lighting, perceptual change, and low-cost cameras. Our primary contribution is a novel two-step algorithm that combines fast low-resolution whole image matching with a higher-resolution patch-verification step, as well as image saliency methods that simultaneously improve performance and decrease computing time. The algorithms are demonstrated using consumer cameras mounted on a small vehicle in a mixed urban and vegetated environment and a car traversing highway and suburban streets, at different times of day and night and in various weather conditions. The algorithms achieve reliable mapping over the course of a day, both when incrementally incorporating new visual scenes from different times of day into an existing map, and when using a static map comprising visual scenes captured at only one point in time. Using the two-step place-recognition process, we demonstrate for the first time single-image, error-free place recognition at recall rates above 50% across a day-night dataset without prior training or utilization of image sequences. This place-recognition performance enables topologically correct mapping across day-night cycles.
Resumo:
This paper investigates the effect of topic dependent language models (TDLM) on phonetic spoken term detection (STD) using dynamic match lattice spotting (DMLS). Phonetic STD consists of two steps: indexing and search. The accuracy of indexing audio segments into phone sequences using phone recognition methods directly affects the accuracy of the final STD system. If the topic of a document in known, recognizing the spoken words and indexing them to an intermediate representation is an easier task and consequently, detecting a search word in it will be more accurate and robust. In this paper, we propose the use of TDLMs in the indexing stage to improve the accuracy of STD in situations where the topic of the audio document is known in advance. It is shown that using TDLMs instead of the traditional general language model (GLM) improves STD performance according to figure of merit (FOM) criteria.
Resumo:
There is substantial evidence for facial emotion recognition (FER) deficits in autism spectrum disorder (ASD). The extent of this impairment, however, remains unclear, and there is some suggestion that clinical groups might benefit from the use of dynamic rather than static images. High-functioning individuals with ASD (n = 36) and typically developing controls (n = 36) completed a computerised FER task involving static and dynamic expressions of the six basic emotions. The ASD group showed poorer overall performance in identifying anger and disgust and were disadvantaged by dynamic (relative to static) stimuli when presented with sad expressions. Among both groups, however, dynamic stimuli appeared to improve recognition of anger. This research provides further evidence of specific impairment in the recognition of negative emotions in ASD, but argues against any broad advantages associated with the use of dynamic displays.
Resumo:
Representation of facial expressions using continuous dimensions has shown to be inherently more expressive and psychologically meaningful than using categorized emotions, and thus has gained increasing attention over recent years. Many sub-problems have arisen in this new field that remain only partially understood. A comparison of the regression performance of different texture and geometric features and investigation of the correlations between continuous dimensional axes and basic categorized emotions are two of these. This paper presents empirical studies addressing these problems, and it reports results from an evaluation of different methods for detecting spontaneous facial expressions within the arousal-valence dimensional space (AV). The evaluation compares the performance of texture features (SIFT, Gabor, LBP) against geometric features (FAP-based distances), and the fusion of the two. It also compares the prediction of arousal and valence, obtained using the best fusion method, to the corresponding ground truths. Spatial distribution, shift, similarity, and correlation are considered for the six basic categorized emotions (i.e. anger, disgust, fear, happiness, sadness, surprise). Using the NVIE database, results show that the fusion of LBP and FAP features performs the best. The results from the NVIE and FEEDTUM databases reveal novel findings about the correlations of arousal and valence dimensions to each of six basic emotion categories.
Resumo:
This thesis investigates face recognition in video under the presence of large pose variations. It proposes a solution that performs simultaneous detection of facial landmarks and head poses across large pose variations, employs discriminative modelling of feature distributions of faces with varying poses, and applies fusion of multiple classifiers to pose-mismatch recognition. Experiments on several benchmark datasets have demonstrated that improved performance is achieved using the proposed solution.
Resumo:
While existing multi-biometic Dempster-Shafer the- ory fusion approaches have demonstrated promising perfor- mance, they do not model the uncertainty appropriately, sug- gesting that further improvement can be achieved. This research seeks to develop a unified framework for multimodal biometric fusion to take advantage of the uncertainty concept of Dempster- Shafer theory, improving the performance of multi-biometric authentication systems. Modeling uncertainty as a function of uncertainty factors affecting the recognition performance of the biometric systems helps to address the uncertainty of the data and the confidence of the fusion outcome. A weighted combination of quality measures and classifiers performance (Equal Error Rate) are proposed to encode the uncertainty concept to improve the fusion. We also found that quality measures contribute unequally to the recognition performance, thus selecting only significant factors and fusing them with a Dempster-Shafer approach to generate an overall quality score play an important role in the success of uncertainty modeling. The proposed approach achieved a competitive performance (approximate 1% EER) in comparison with other Dempster-Shafer based approaches and other conventional fusion approaches.
Resumo:
This paper evaluates the performance of different text recognition techniques for a mobile robot in an indoor (university campus) environment. We compared four different methods: our own approach using existing text detection methods (Minimally Stable Extremal Regions detector and Stroke Width Transform) combined with a convolutional neural network, two modes of the open source program Tesseract, and the experimental mobile app Google Goggles. The results show that a convolutional neural network combined with the Stroke Width Transform gives the best performance in correctly matched text on images with single characters whereas Google Goggles gives the best performance on images with multiple words. The dataset used for this work is released as well.
Resumo:
Problem addressed Wrist-worn accelerometers are associated with greater compliance. However, validated algorithms for predicting activity type from wrist-worn accelerometer data are lacking. This study compared the activity recognition rates of an activity classifier trained on acceleration signal collected on the wrist and hip. Methodology 52 children and adolescents (mean age 13.7 +/- 3.1 year) completed 12 activity trials that were categorized into 7 activity classes: lying down, sitting, standing, walking, running, basketball, and dancing. During each trial, participants wore an ActiGraph GT3X+ tri-axial accelerometer on the right hip and the non-dominant wrist. Features were extracted from 10-s windows and inputted into a regularized logistic regression model using R (Glmnet + L1). Results Classification accuracy for the hip and wrist was 91.0% +/- 3.1% and 88.4% +/- 3.0%, respectively. The hip model exhibited excellent classification accuracy for sitting (91.3%), standing (95.8%), walking (95.8%), and running (96.8%); acceptable classification accuracy for lying down (88.3%) and basketball (81.9%); and modest accuracy for dance (64.1%). The wrist model exhibited excellent classification accuracy for sitting (93.0%), standing (91.7%), and walking (95.8%); acceptable classification accuracy for basketball (86.0%); and modest accuracy for running (78.8%), lying down (74.6%) and dance (69.4%). Potential Impact Both the hip and wrist algorithms achieved acceptable classification accuracy, allowing researchers to use either placement for activity recognition.