395 results for Binary Image Representation
Abstract:
In this paper we present a robust method to detect handwritten text in unconstrained drawings on normal whiteboards. Unlike printed text in documents, free-form handwritten text has no pattern in terms of size, orientation and font, and it is often mixed with other drawings such as lines and shapes. Unlike handwriting on paper, handwriting on a normal whiteboard cannot be scanned, so detection has to be based on photos. Our work traces straight edges in photos of the whiteboard and builds a graph representation of connected components. We use geometric properties such as edge density, graph density, aspect ratio and neighborhood similarity to differentiate handwritten text from other drawings. The experimental results show that our method achieves satisfactory precision and recall. Furthermore, the method is robust and efficient enough to be deployed on a mobile device. This is an important enabler of business applications that support whiteboard-centric visual meetings in enterprise scenarios. © 2012 IEEE.
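The abstract does not include an implementation, but the named geometric properties are simple to state. Below is a minimal Python sketch, with hypothetical inputs, of how such features might be computed for one connected component; it illustrates the idea and is not the authors' code.

```python
import numpy as np

def component_features(edge_mask, n_nodes, n_edges):
    """Geometric features for one connected component.

    edge_mask        : 2D boolean array, True at traced edge pixels inside the
                       component's bounding box (hypothetical input format).
    n_nodes, n_edges : size of the component's graph representation.
    """
    h, w = edge_mask.shape
    edge_density = edge_mask.sum() / float(h * w)   # edge pixels per unit area
    aspect_ratio = w / float(h)                     # bounding-box width over height
    # Density of an undirected graph: actual edges over possible edges.
    graph_density = (2.0 * n_edges / (n_nodes * (n_nodes - 1))
                     if n_nodes > 1 else 0.0)
    return edge_density, aspect_ratio, graph_density
```

Handwritten components would be expected to score differently on such features than lines and shapes, which is what the differentiation step in the paper exploits.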
Abstract:
The inverse temperature hyperparameter of the hidden Potts model governs the strength of spatial cohesion and therefore has a substantial influence over the resulting model fit. The difficulty arises from the dependence of an intractable normalising constant on the value of the inverse temperature; thus there is no closed-form solution for sampling from the distribution directly. We review three computational approaches for addressing this issue, namely pseudolikelihood, path sampling, and the approximate exchange algorithm. We compare the accuracy and scalability of these methods using a simulation study.
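Of the three approaches, pseudolikelihood is the simplest to state: the intractable joint normalising constant is replaced by a product of tractable full conditionals. The following Python sketch evaluates the log-pseudolikelihood of a q-state Potts labelling on a 4-neighbour lattice; the implementation choices are assumptions for illustration, not taken from the paper.

```python
import numpy as np

def log_pseudolikelihood(z, beta, q):
    """Log-pseudolikelihood of a Potts labelling `z` (2D int array with
    labels 0..q-1) at inverse temperature `beta`, on a 4-neighbour lattice.
    Each full conditional is tractable even though the joint normalising
    constant is not."""
    h, w = z.shape
    ll = 0.0
    for i in range(h):
        for j in range(w):
            counts = np.zeros(q)                       # neighbours per label
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < h and 0 <= nj < w:
                    counts[z[ni, nj]] += 1
            # log p(z_ij | neighbours) = beta*n_k - log sum_k' exp(beta*n_k')
            ll += beta * counts[z[i, j]] - np.log(np.exp(beta * counts).sum())
    return ll
```

Maximising this quantity over beta gives the pseudolikelihood estimator; path sampling and the approximate exchange algorithm avoid this approximation at greater computational cost, which is the accuracy/scalability trade-off the study examines.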
Abstract:
There is limited evidence evaluating the performance and use of photographic and image-based dietary records among adults with a chronic disease. This study evaluated the performance of a mobile phone image-based dietary record, the Nutricam Dietary Assessment Method (NuDAM), in adults with type 2 diabetes mellitus (T2DM). Criterion validity was determined by comparing energy intake (EI) with total energy expenditure (TEE) measured by the doubly-labelled water technique. Relative validity was established by comparison to a weighed food record (WFR). Inter-rater reliability was assessed by comparing estimates of intake from three dietitians. Ten adults (6 males, age=61.2±6.9 years, BMI=31.0±4.5 kg/m2) participated. Compared to TEE, mean EI was under-reported using both methods, with a mean EI:TEE ratio of 0.76±0.20 for the NuDAM and 0.76±0.17 for the WFR. There were moderate to high correlations between the NuDAM and WFR for energy (r=0.57), carbohydrate (r=0.63, p<0.05), protein (r=0.78, p<0.01) and alcohol (rs=0.85, p<0.01), with a weaker relationship for fat (r=0.24). Agreement between dietitians for nutrient intake from the 3-day NuDAM (ICC=0.77-0.99) was marginally lower than for the 3-day WFR (ICC=0.82-0.99). All subjects preferred using the NuDAM and were willing to use it again for longer recording periods.
Abstract:
The latest generation of Deep Convolutional Neural Networks (DCNNs) has dramatically advanced challenging computer vision tasks, especially object detection and object classification, achieving state-of-the-art performance in several areas including text recognition, sign recognition, face recognition and scene understanding. The depth of these supervised networks has enabled the learning of deeper and more hierarchical feature representations. In parallel, unsupervised deep learning such as the Convolutional Deep Belief Network (CDBN) has also achieved state-of-the-art results in many computer vision tasks. However, there is very limited research on jointly exploiting the strengths of the two approaches. In this paper, we investigate the learning capability of both methods. We compare the outputs of individual layers and show that many learnt filters and the outputs of corresponding layers are very similar for both approaches. Stacking the DCNN on top of unsupervised layers, or replacing layers in the DCNN with the corresponding learnt layers in the CDBN, can improve recognition/classification accuracy and reduce training computational expense. We demonstrate the validity of the proposal on the ImageNet dataset.
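The layer-wise comparison described above can be illustrated with a small sketch. The following Python function (an assumption for illustration, not the authors' code) matches each learnt filter in one bank, e.g. a DCNN's first convolutional layer, to its most correlated counterpart in another, e.g. the corresponding CDBN layer.

```python
import numpy as np

def filter_similarity(filters_a, filters_b):
    """For each filter in bank A, the absolute normalised correlation with
    its best-matching filter in bank B. Both inputs are hypothetical
    (n_filters, height, width) arrays of learnt filters."""
    a = filters_a.reshape(len(filters_a), -1)
    b = filters_b.reshape(len(filters_b), -1)
    a = (a - a.mean(1, keepdims=True)) / a.std(1, keepdims=True)
    b = (b - b.mean(1, keepdims=True)) / b.std(1, keepdims=True)
    corr = a @ b.T / a.shape[1]       # pairwise correlation matrix
    return np.abs(corr).max(axis=1)   # best match per filter in A
```

Values close to 1 would indicate that the supervised and unsupervised models have converged on near-identical filters, which is the observation that motivates swapping layers between the two architectures.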
Abstract:
In the field of face recognition, sparse representation (SR) has received considerable attention during the past few years, with a focus on holistic descriptors in closed-set identification applications. The underlying assumption in such SR-based methods is that each class in the gallery has sufficient samples and the query lies on the subspace spanned by the gallery of the same class. Unfortunately, such an assumption is easily violated in the face verification scenario, where the task is to determine if two faces (where one or both have not been seen before) belong to the same person. In this study, the authors propose an alternative approach to SR-based face verification, where SR encoding is performed on local image patches rather than the entire face. The obtained sparse signals are pooled via averaging to form multiple region descriptors, which then form an overall face descriptor. Owing to the deliberate loss of spatial relations within each region (caused by averaging), the resulting descriptor is robust to misalignment and various image deformations. Within the proposed framework, they evaluate several SR encoding techniques: l1-minimisation, Sparse Autoencoder Neural Network (SANN) and an implicit probabilistic technique based on Gaussian mixture models. Thorough experiments on AR, FERET, exYaleB, BANCA and ChokePoint datasets show that the local SR approach obtains considerably better and more robust performance than several previous state-of-the-art holistic SR methods, on both the traditional closed-set identification task and the more applicable face verification task. The experiments also show that l1-minimisation-based encoding has a considerably higher computational cost when compared with SANN-based and probabilistic encoding, but leads to higher recognition rates.
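A minimal sketch of the encode-then-pool step for one face region, using scikit-learn's SparseCoder for the l1-minimisation variant; the dictionary, patch format and regularisation weight are hypothetical placeholders, not the paper's settings.

```python
import numpy as np
from sklearn.decomposition import SparseCoder

def region_descriptor(patches, dictionary, alpha=0.1):
    """Sparse-encode local patches of one region and average-pool the codes.

    patches    : (n_patches, patch_dim) array of vectorised local patches.
    dictionary : (n_atoms, patch_dim) learnt dictionary (hypothetical).
    """
    coder = SparseCoder(dictionary=dictionary,
                        transform_algorithm='lasso_lars',  # l1-minimisation
                        transform_alpha=alpha)
    codes = coder.transform(patches)   # (n_patches, n_atoms) sparse codes
    return codes.mean(axis=0)          # pooled region descriptor
```

Averaging deliberately discards the spatial layout of patches within the region, which is the source of the robustness to misalignment noted in the abstract; the per-region descriptors are then combined into the overall face descriptor.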
Abstract:
The process of spray drying is applied in a number of contexts. One such application is the production of a synthetic rock used for storage of nuclear waste. To establish a framework for a model of the spray drying process for this application, here we develop a model describing evaporation from droplets of pure water, such that the model may be extended to account for the presence of colloid within the droplet. We develop a spherically-symmetric model and formulate continuum equations describing mass, momentum, and energy balance in both the liquid and gas phases from first principles. We establish appropriate boundary conditions at the surface of the droplet, including a generalised Clapeyron equation that accurately describes the temperature at the surface of the droplet. To account for the experimental design, we introduce a simplified model of the platinum ball and wire into the system via a thin-wire problem. The resulting system of equations is transformed in order to simplify a finite volume solution scheme. The results from numerical simulation are compared with data collected for validation, and the sensitivity of the model to variations in key parameters, and to the use of the Clausius–Clapeyron and generalised Clapeyron equations, is investigated. Good agreement is found between the model and experimental data, despite the simplicity of the platinum phase model.
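The abstract contrasts the Clausius–Clapeyron and generalised Clapeyron surface conditions. For reference, the standard Clausius–Clapeyron relation, under the usual assumptions of constant latent heat L and an ideal vapour with specific gas constant R_v, is

```latex
\frac{dP_{\mathrm{sat}}}{dT} = \frac{L \, P_{\mathrm{sat}}}{R_v T^2},
\qquad
P_{\mathrm{sat}}(T) = P_{\mathrm{ref}}
\exp\!\left[\frac{L}{R_v}\left(\frac{1}{T_{\mathrm{ref}}} - \frac{1}{T}\right)\right].
```

The generalised Clapeyron equation used in the paper relaxes such assumptions to describe the droplet surface temperature more accurately; its exact form is not given in the abstract and is not reproduced here.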
Abstract:
The impact of disease and treatment on a young adult's self-image and sexuality has been largely overlooked. This is surprising given that establishing social and romantic relationships is a normal occurrence in young adulthood. This article describes three female patients' cancer journeys and demonstrates how their experiences have affected their psychosocial function and self-regard. The themes of body image, self-esteem, and identity formation are explored in relation to implications for relationship-building and moving beyond a cancer diagnosis. This article has been written by young cancer survivors, Danielle Tindle, Kelly Denver, and Faye Lilley, in an effort to elucidate the ongoing struggle to reconcile cancer with a normal young adult's life.
Abstract:
Frog species have been declining worldwide at unprecedented rates in recent decades. There are many reasons for this decline, including pollution, habitat loss, and invasive species [1]. To preserve, protect, and restore frog biodiversity, it is important to monitor and assess frog species. In this paper, a novel method using image processing techniques for analysing Australian frog vocalisations is proposed. An FFT is applied to audio data to produce a spectrogram. Acoustic events are then detected and isolated into corresponding segments through image processing techniques applied to the spectrogram. For each segment, spectral peak tracks are extracted from selected seeds, and a region growing technique is utilised to obtain the contour of each frog vocalisation. Based on the spectral peak tracks and the contour of each frog vocalisation, six feature sets are extracted. Principal component analysis reduces each feature set to six principal components, which are tested for classification performance with a k-nearest neighbour classifier. This experiment tests the proposed method on fourteen frog species which are geographically well distributed throughout Queensland, Australia. The experimental results show that the best average classification accuracy for the fourteen frog species reaches 87%.
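The front and back ends of the pipeline are straightforward to sketch in Python; the segmentation and spectral-peak-track stages are omitted here, and all parameter values below are illustrative guesses rather than the paper's settings.

```python
import numpy as np
from scipy.signal import spectrogram
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

def audio_to_spectrogram(audio, fs):
    """FFT front end: the spectrogram that the image-processing stages
    (event detection, region growing) operate on."""
    f, t, sxx = spectrogram(audio, fs=fs, nperseg=512, noverlap=256)
    return f, t, 10 * np.log10(sxx + 1e-12)   # dB scale

def classify(train_feats, train_labels, test_feats, k=5):
    """Reduce one feature set to six principal components, as in the
    abstract, then classify with k-nearest neighbours (k is a guess)."""
    pca = PCA(n_components=6).fit(train_feats)
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(pca.transform(train_feats), train_labels)
    return knn.predict(pca.transform(test_feats))
```

In the paper this classification step is repeated for each of the six feature sets, and performance is compared across them.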
Abstract:
There is a well-founded ethical concern in the present regarding the question 'How can we include everybody's voice equally in the framing of reviews?' This paper is a response to the complexities that inhere in that question. It is not about Review of Educational Research (RER) as a specific site but about the systems of reasoning that construct the opening question about reviews and that suggest possible answers, including the response: 'What is voice?'
Abstract:
Although robotics research has seen advances over the last decades, robots are still not in widespread use outside industrial applications. Yet a range of proposed scenarios have robots working together, helping and coexisting with humans in daily life. All of these create a clear need to deal with a more unstructured, changing environment. I herein present a system that aims to overcome the limitations of highly complex robotic systems in terms of autonomy and adaptation. The main focus of the research is to investigate the use of visual feedback for improving the reaching and grasping capabilities of complex robots. To facilitate this, an integration of computer vision and machine learning techniques is employed. From a robot vision point of view, combining domain knowledge from both image processing and machine learning can expand the capabilities of robots. I present a novel framework called Cartesian Genetic Programming for Image Processing (CGP-IP). CGP-IP can be trained to detect objects in incoming camera streams and has been successfully demonstrated on many different problem domains. The approach is fast, scalable and robust, and requires only a few training images (it was tested with 5 to 10 images per experiment). Additionally, it can generate human-readable programs that can be further customised and tuned. While CGP-IP is a supervised-learning technique, I show an integration on the iCub that allows for the autonomous learning of object detection and identification. Finally, this dissertation includes two proof-of-concepts that integrate the motion and action sides. First, reactive reaching and grasping is shown. It allows the robot to avoid obstacles detected in the visual stream while reaching for the intended target object. Furthermore, the integration enables us to use the robot in non-static environments, i.e. the reaching is adapted on the fly from the visual feedback received, e.g. when an obstacle is moved into the trajectory. The second integration highlights the capabilities of these frameworks by improving visual detection through object manipulation actions.
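CGP-IP itself is not reproduced here, but the core idea of an evolvable image-processing program can be conveyed with a toy Python sketch; the function set, genome layout and fitness measure below are assumptions for illustration only.

```python
import numpy as np

# Toy CGP-style image pipeline: each node applies one primitive from a small
# function set to the outputs of earlier nodes (or the input image).
FUNCS = [
    lambda a, b: (a + b) / 2.0,                 # blend
    lambda a, b: np.abs(a - b),                 # absolute difference
    lambda a, b: np.maximum(a, b),              # pixel-wise max
    lambda a, b: (a > a.mean()).astype(float),  # adaptive threshold (ignores b)
]

def evaluate(genome, image):
    """genome: list of (func_idx, in1, in2) triples; node i may only read
    from the input image (index 0) or from earlier nodes."""
    values = [image]
    for func_idx, in1, in2 in genome:
        values.append(FUNCS[func_idx](values[in1], values[in2]))
    return values[-1]                # last node is the detector output

def fitness(genome, image, target_mask):
    """Pixel-wise agreement with a hand-labelled mask; a CGP system evolves
    genomes by mutation, keeping the fittest (e.g. a 1+lambda strategy)."""
    return ((evaluate(genome, image) > 0.5) == target_mask).mean()
```

Because a genome is just a short list of integers, the resulting detector can be decoded into a human-readable program, which is the property the dissertation highlights.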
Abstract:
Transitioning the personal brand from one representation to another is sometimes necessary, particularly within the public eye. Marketing literature regarding celebrities focuses on brand endorsement (see for example Till, 1998 or Erdogan, 1999) rather than the positioning of a celebrity brand. Furthermore, the role of social media in strengthening the celebrity brand has received limited attention in the literature outside of political marketing (see for example Crawford, 2009 and Grant, Moon and Grant, 2010). This study focuses on the brand of "Elizabeth Gilbert," author of the bestselling memoir Eat, Pray, Love (2006). Through critical discourse analysis, the way the author has used social media to reposition her celebrity brand at the time of the launch of her new novel, The Signature of All Things (2013), is examined. More broadly, the study considers the use of social media by celebrities to strengthen the celebrity brand.
Abstract:
The primary objective of this paper is to study the use of medical image-based finite element (FE) modelling in subject-specific midsole design and optimisation for heel pressure reduction, using a midsole plug under the calcaneus area (UCA). Plugs with different dimensions relative to the size of the subject's calcaneus have been incorporated in the heel region of the midsole. The FE foot model was validated by comparing the numerically predicted plantar pressure with biomechanical tests conducted on the same subject. For each UCA midsole plug design, the effect of material properties and plug thickness on the plantar pressure distribution and peak pressure level during the heel strike phase of normal walking was systematically studied. The results showed that the UCA midsole insert could effectively modify the pressure distribution, and its effect is directly associated with the ratio of the plug dimension to the size of the subject's calcaneus bone. A medium hardness plug with a size of 95% of the calcaneus achieved the best performance for relieving the peak pressure in comparison with the pressure level for a solid midsole without a plug, whereas a smaller, very soft plug with a size of 65% of the calcaneus showed minimal beneficial effect for pressure relief.
Abstract:
Developing accurate and reliable crop detection algorithms is an important step towards harvesting automation in horticulture. This paper presents a novel approach to visual detection of highly-occluded fruits. We use a conditional random field (CRF) on multi-spectral image data (colour and Near-Infrared Reflectance, NIR) to model two classes: crop and background. To describe these two classes, we explore a range of visual-texture features, including local binary patterns, histograms of oriented gradients, and learnt auto-encoder features. The proposed methods are evaluated using hand-labelled images from a dataset captured on a commercial capsicum farm. Experimental results are presented, and performance is evaluated in terms of the Area Under the Curve (AUC) of the precision-recall curves. Our current results achieve a maximum performance of 0.81 AUC when combining all of the texture features in conjunction with colour information.
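A brief Python sketch of the evaluation metric and two of the named texture descriptors, using scikit-image and scikit-learn; the parameter choices are illustrative, not the paper's.

```python
import numpy as np
from skimage.feature import hog, local_binary_pattern
from sklearn.metrics import auc, precision_recall_curve

def pr_auc(y_true, scores):
    """Area under the precision-recall curve for per-pixel crop/background
    scores (hypothetical inputs), as reported in the abstract."""
    precision, recall, _ = precision_recall_curve(y_true, scores)
    return auc(recall, precision)

def texture_features(gray_patch):
    """LBP histogram plus HOG vector for one grayscale patch."""
    lbp = local_binary_pattern(gray_patch, P=8, R=1, method='uniform')
    lbp_hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
    hog_vec = hog(gray_patch, orientations=9, pixels_per_cell=(8, 8),
                  cells_per_block=(2, 2))
    return np.concatenate([lbp_hist, hog_vec])
```

In the paper such per-pixel features feed the potentials of the CRF; that step, and the auto-encoder features, are not sketched here.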
Abstract:
In this paper we tackle the problem of efficient video event detection. We argue that linear detection functions should be preferred in this regard due to their scalability and efficiency during estimation and evaluation. A popular approach is to represent a sequence using a bag of words (BOW) representation due to: (i) its fixed dimensionality irrespective of the sequence length, and (ii) its ability to compactly model the statistics of the sequence. A drawback of the BOW representation, however, is the intrinsic destruction of temporal ordering information. In this paper we propose a new representation that leverages the uncertainty in relative temporal alignments between pairs of sequences while not destroying temporal ordering. Our representation, like BOW, has a fixed dimensionality, making it easy to integrate with a linear detection function. Extensive experiments on the CK+, 6DMG, and UvA-NEMO databases show significant performance improvements across both isolated and continuous event detection tasks.
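The baseline the paper improves upon is easy to sketch. A plain BOW encoder (hypothetical inputs; this is not the proposed representation) makes both the fixed-dimensionality property and the loss of temporal order concrete:

```python
import numpy as np

def bag_of_words(frame_descriptors, codebook):
    """Quantise each frame descriptor to its nearest codeword and histogram
    the counts.

    frame_descriptors : (n_frames, d) array, one descriptor per frame.
    codebook          : (k, d) array of codewords (e.g. from k-means).
    """
    # squared distance from every frame to every codeword
    d2 = ((frame_descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()   # fixed length k, regardless of n_frames
```

Shuffling the frames leaves this histogram unchanged, which is exactly the destruction of temporal ordering that the proposed representation is designed to avoid while keeping the fixed dimensionality needed by a linear detector.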
Abstract:
This paper presents an approach, based on the Lean production philosophy, for rationalising the processes involved in producing specification documents for construction projects. Current construction literature erroneously depicts the process of creating construction specifications as a linear one. This traditional understanding of the specification process often culminates in process waste. On the contrary, the evidence suggests that, though generalised, the activities involved in producing specification documents are nonlinear. Drawing on the outcome of participant observation, this paper presents an optimised approach for representing construction specifications. The actors typically involved in producing specification documents are identified, the processes suitable for automation are highlighted, and the central role of tacit knowledge is integrated into a conceptual template of construction specifications. By applying the transformation, flow, value (TFV) theory of Lean production, the paper argues that value creation can be realised by eliminating the wastes associated with the traditional preparation of specification documents, with a view to integrating specifications into digital models such as Building Information Models (BIM). The paper thus presents an approach for applying the TFV theory as a method for optimising current approaches to generating construction specifications, based on a revised specification-writing model.