761 resultados para activity recognition
em Queensland University of Technology - ePrints Archive
Resumo:
Accurate and detailed measurement of an individual's physical activity is a key requirement for helping researchers understand the relationship between physical activity and health. Accelerometers have become the method of choice for measuring physical activity due to their small size, low cost, convenience and their ability to provide objective information about physical activity. However, interpreting accelerometer data once it has been collected can be challenging. In this work, we applied machine learning algorithms to the task of physical activity recognition from triaxial accelerometer data. We employed a simple but effective approach of dividing the accelerometer data into short non-overlapping windows, converting each window into a feature vector, and treating each feature vector as an i.i.d training instance for a supervised learning algorithm. In addition, we improved on this simple approach with a multi-scale ensemble method that did not need to commit to a single window size and was able to leverage the fact that physical activities produced time series with repetitive patterns and discriminative features for physical activity occurred at different temporal scales.
Resumo:
Problem addressed Wrist-worn accelerometers are associated with greater compliance. However, validated algorithms for predicting activity type from wrist-worn accelerometer data are lacking. This study compared the activity recognition rates of an activity classifier trained on acceleration signal collected on the wrist and hip. Methodology 52 children and adolescents (mean age 13.7 +/- 3.1 year) completed 12 activity trials that were categorized into 7 activity classes: lying down, sitting, standing, walking, running, basketball, and dancing. During each trial, participants wore an ActiGraph GT3X+ tri-axial accelerometer on the right hip and the non-dominant wrist. Features were extracted from 10-s windows and inputted into a regularized logistic regression model using R (Glmnet + L1). Results Classification accuracy for the hip and wrist was 91.0% +/- 3.1% and 88.4% +/- 3.0%, respectively. The hip model exhibited excellent classification accuracy for sitting (91.3%), standing (95.8%), walking (95.8%), and running (96.8%); acceptable classification accuracy for lying down (88.3%) and basketball (81.9%); and modest accuracy for dance (64.1%). The wrist model exhibited excellent classification accuracy for sitting (93.0%), standing (91.7%), and walking (95.8%); acceptable classification accuracy for basketball (86.0%); and modest accuracy for running (78.8%), lying down (74.6%) and dance (69.4%). Potential Impact Both the hip and wrist algorithms achieved acceptable classification accuracy, allowing researchers to use either placement for activity recognition.
Resumo:
This paper presents an effective classification method based on Support Vector Machines (SVM) in the context of activity recognition. Local features that capture both spatial and temporal information in activity videos have made significant progress recently. Efficient and effective features, feature representation and classification plays a crucial role in activity recognition. For classification, SVMs are popularly used because of their simplicity and efficiency; however the common multi-class SVM approaches applied suffer from limitations including having easily confused classes and been computationally inefficient. We propose using a binary tree SVM to address the shortcomings of multi-class SVMs in activity recognition. We proposed constructing a binary tree using Gaussian Mixture Models (GMM), where activities are repeatedly allocated to subnodes until every new created node contains only one activity. Then, for each internal node a separate SVM is learned to classify activities, which significantly reduces the training time and increases the speed of testing compared to popular the `one-against-the-rest' multi-class SVM classifier. Experiments carried out on the challenging and complex Hollywood dataset demonstrates comparable performance over the baseline bag-of-features method.
Resumo:
Many conventional statistical machine learning al- gorithms generalise poorly if distribution bias ex- ists in the datasets. For example, distribution bias arises in the context of domain generalisation, where knowledge acquired from multiple source domains need to be used in a previously unseen target domains. We propose Elliptical Summary Randomisation (ESRand), an efficient domain generalisation approach that comprises of a randomised kernel and elliptical data summarisation. ESRand learns a domain interdependent projection to a la- tent subspace that minimises the existing biases to the data while maintaining the functional relationship between domains. In the latent subspace, ellipsoidal summaries replace the samples to enhance the generalisation by further removing bias and noise in the data. Moreover, the summarisation enables large-scale data processing by significantly reducing the size of the data. Through comprehensive analysis, we show that our subspace-based approach outperforms state-of-the-art results on several activity recognition benchmark datasets, while keeping the computational complexity significantly low.
Resumo:
This paper presents an effective feature representation method in the context of activity recognition. Efficient and effective feature representation plays a crucial role not only in activity recognition, but also in a wide range of applications such as motion analysis, tracking, 3D scene understanding etc. In the context of activity recognition, local features are increasingly popular for representing videos because of their simplicity and efficiency. While they achieve state-of-the-art performance with low computational requirements, their performance is still limited for real world applications due to a lack of contextual information and models not being tailored to specific activities. We propose a new activity representation framework to address the shortcomings of the popular, but simple bag-of-words approach. In our framework, first multiple instance SVM (mi-SVM) is used to identify positive features for each action category and the k-means algorithm is used to generate a codebook. Then locality-constrained linear coding is used to encode the features into the generated codebook, followed by spatio-temporal pyramid pooling to convey the spatio-temporal statistics. Finally, an SVM is used to classify the videos. Experiments carried out on two popular datasets with varying complexity demonstrate significant performance improvement over the base-line bag-of-feature method.
Resumo:
This paper proposes a semi-supervised intelligent visual surveillance system to exploit the information from multi-camera networks for the monitoring of people and vehicles. Modules are proposed to perform critical surveillance tasks including: the management and calibration of cameras within a multi-camera network; tracking of objects across multiple views; recognition of people utilising biometrics and in particular soft-biometrics; the monitoring of crowds; and activity recognition. Recent advances in these computer vision modules and capability gaps in surveillance technology are also highlighted.
Resumo:
In this paper we investigate the effectiveness of class specific sparse codes in the context of discriminative action classification. The bag-of-words representation is widely used in activity recognition to encode features, and although it yields state-of-the art performance with several feature descriptors it still suffers from large quantization errors and reduces the overall performance. Recently proposed sparse representation methods have been shown to effectively represent features as a linear combination of an over complete dictionary by minimizing the reconstruction error. In contrast to most of the sparse representation methods which focus on Sparse-Reconstruction based Classification (SRC), this paper focuses on a discriminative classification using a SVM by constructing class-specific sparse codes for motion and appearance separately. Experimental results demonstrates that separate motion and appearance specific sparse coefficients provide the most effective and discriminative representation for each class compared to a single class-specific sparse coefficients.
Resumo:
For several reasons, the Fourier phase domain is less favored than the magnitude domain in signal processing and modeling of speech. To correctly analyze the phase, several factors must be considered and compensated, including the effect of the step size, windowing function and other processing parameters. Building on a review of these factors, this paper investigates a spectral representation based on the Instantaneous Frequency Deviation, but in which the step size between processing frames is used in calculating phase changes, rather than the traditional single sample interval. Reflecting these longer intervals, the term delta-phase spectrum is used to distinguish this from instantaneous derivatives. Experiments show that mel-frequency cepstral coefficients features derived from the delta-phase spectrum (termed Mel-Frequency delta-phase features) can produce broadly similar performance to equivalent magnitude domain features for both voice activity detection and speaker recognition tasks. Further, it is shown that the fusion of the magnitude and phase representations yields performance benefits over either in isolation.
Resumo:
In the present study, items pre-exposed in a familiarization series were included in a list discrimination task to manipulate memory strength. At test, participants were required to discriminate strong targets and strong lures from weak targets and new lures. This resulted in a concordant pattern of increased "old" responses to strong targets and lures. Model estimates attributed this pattern to either equivalent increases in memory strength across the two types of items (unequal variance signal detection model) or equivalent increases in both familiarity and recollection (dual process signal detection [DPSD] model). Hippocampal activity associated with strong targets and lures showed equivalent increases compared with missed items. This remained the case when analyses were restricted to high-confidence responses considered by the DPSD model to reflect predominantly recollection. A similar pattern of activity was observed in parahippocampal cortex for high-confidence responses. The present results are incompatible with "noncriterial" or "false" recollection being reflected solely in inflated DPSD familiarity estimates and support a positive correlation between hippocampal activity and memory strength irrespective of the accuracy of list discrimination, consistent with the unequal variance signal detection model account.
Resumo:
There is increased recognition that determinants of health should be investigated in a life-course perspective. Retirement is a major transition in the life course and offers opportunities for changes in physical activity that may improve health in the aging population. The authors examined the effect of retirement on changes in physical activity in the GLOBE Study, a prospective cohort study known by the Dutch acronym for "Health and Living Conditions of the Population of Eindhoven and surroundings," 1991–2004. They followed respondents (n = 971) by postal questionnaire who were employed and aged 40–65 years in 1991 for 13 years, after which they were still employed (n = 287) or had retired (n = 684). Physical activity included 1) work-related transportation, 2) sports participation, and 3) nonsports leisure-time physical activity. Multinomial logistic regression analyses indicated that retirement was associated with a significantly higher odds for a decline in physical activity from work-related transportation (odds ratio (OR) = 3.03, 95% confidence interval (CI): 1.97, 4.65), adjusted for sex, age, marital status, chronic diseases, and education, compared with remaining employed. Retirement was not associated with an increase in sports participation (OR = 1.12, 95% CI: 0.71, 1.75) or nonsports leisure-time physical activity (OR = 0.80, 95% CI: 0.54, 1.19). In conclusion, retirement introduces a reduction in physical activity from work-related transportation that is not compensated for by an increase in sports participation or an increase in nonsports leisure-time physical activity.
Resumo:
Two archaeal Holliday junction resolving enzymes, Holliday junction cleavage (Hjc) and Holliday junction endonuclease (Hje), have been characterized. Both are members of a nuclease superfamily that includes the type II restriction enzymes, although their DNA cleaving activity is highly specific for four-way junction structure and not nucleic acid sequence. Despite 28% sequence identity, Hje and Hjc cleave junctions with distinct cutting patterns—they cut different strands of a four-way junction, at different distances from the junction centre. We report the high-resolution crystal structure of Hje from Sulfolobus solfataricus. The structure provides a basis to explain the differences in substrate specificity of Hje and Hjc, which result from changes in dimer organization, and suggests a viral origin for the Hje gene. Structural and biochemical data support the modelling of an Hje:DNA junction complex, highlighting a flexible loop that interacts intimately with the junction centre. A highly conserved serine residue on this loop is shown to be essential for the enzyme's activity, suggesting a novel variation of the nuclease active site. The loop may act as a conformational switch, ensuring that the active site is completed only on binding a four-way junction, thus explaining the exquisite specificity of these enzymes.
Resumo:
Previous studies have demonstrated that pattern recognition approaches to accelerometer data reduction are feasible and moderately accurate in classifying activity type in children. Whether pattern recognition techniques can be used to provide valid estimates of physical activity (PA) energy expenditure in youth remains unexplored in the research literature. Purpose: The objective of this study is to develop and test artificial neural networks (ANNs) to predict PA type and energy expenditure (PAEE) from processed accelerometer data collected in children and adolescents. Methods: One hundred participants between the ages of 5 and 15 yr completed 12 activity trials that were categorized into five PA types: sedentary, walking, running, light-intensity household activities or games, and moderate-to-vigorous intensity games or sports. During each trial, participants wore an ActiGraph GTIM on the right hip, and (V) Over dotO(2) was measured using the Oxycon Mobile (Viasys Healthcare, Yorba Linda, CA) portable metabolic system. ANNs to predict PA type and PAEE (METs) were developed using the following features: 10th, 25th, 50th, 75th, and 90th percentiles and the lag one autocorrelation. To determine the highest time resolution achievable, we extracted features from 10-, 15-, 20-, 30-, and 60-s windows. Accuracy was assessed by calculating the percentage of windows correctly classified and root mean square en-or (RMSE). Results: As window size increased from 10 to 60 s, accuracy for the PA-type ANN increased from 81.3% to 88.4%. RMSE for the MET prediction ANN decreased from 1.1 METs to 0.9 METs. At any given window size, RMSE values for the MET prediction ANN were 30-40% lower than the conventional regression-based approaches. Conclusions: ANNs can be used to predict both PA type and PAEE in children and adolescents using count data from a single waist mounted accelerometer.
Resumo:
Objectives Recent research has shown that machine learning techniques can accurately predict activity classes from accelerometer data in adolescents and adults. The purpose of this study is to develop and test machine learning models for predicting activity type in preschool-aged children. Design Participants completed 12 standardised activity trials (TV, reading, tablet game, quiet play, art, treasure hunt, cleaning up, active game, obstacle course, bicycle riding) over two laboratory visits. Methods Eleven children aged 3–6 years (mean age = 4.8 ± 0.87; 55% girls) completed the activity trials while wearing an ActiGraph GT3X+ accelerometer on the right hip. Activities were categorised into five activity classes: sedentary activities, light activities, moderate to vigorous activities, walking, and running. A standard feed-forward Artificial Neural Network and a Deep Learning Ensemble Network were trained on features in the accelerometer data used in previous investigations (10th, 25th, 50th, 75th and 90th percentiles and the lag-one autocorrelation). Results Overall recognition accuracy for the standard feed forward Artificial Neural Network was 69.7%. Recognition accuracy for sedentary activities, light activities and games, moderate-to-vigorous activities, walking, and running was 82%, 79%, 64%, 36% and 46%, respectively. In comparison, overall recognition accuracy for the Deep Learning Ensemble Network was 82.6%. For sedentary activities, light activities and games, moderate-to-vigorous activities, walking, and running recognition accuracy was 84%, 91%, 79%, 73% and 73%, respectively. Conclusions Ensemble machine learning approaches such as Deep Learning Ensemble Network can accurately predict activity type from accelerometer data in preschool children.
Resumo:
Local spatio-temporal features with a Bag-of-visual words model is a popular approach used in human action recognition. Bag-of-features methods suffer from several challenges such as extracting appropriate appearance and motion features from videos, converting extracted features appropriate for classification and designing a suitable classification framework. In this paper we address the problem of efficiently representing the extracted features for classification to improve the overall performance. We introduce two generative supervised topic models, maximum entropy discrimination LDA (MedLDA) and class- specific simplex LDA (css-LDA), to encode the raw features suitable for discriminative SVM based classification. Unsupervised LDA models disconnect topic discovery from the classification task, hence yield poor results compared to the baseline Bag-of-words framework. On the other hand supervised LDA techniques learn the topic structure by considering the class labels and improve the recognition accuracy significantly. MedLDA maximizes likelihood and within class margins using max-margin techniques and yields a sparse highly discriminative topic structure; while in css-LDA separate class specific topics are learned instead of common set of topics across the entire dataset. In our representation first topics are learned and then each video is represented as a topic proportion vector, i.e. it can be comparable to a histogram of topics. Finally SVM classification is done on the learned topic proportion vector. We demonstrate the efficiency of the above two representation techniques through the experiments carried out in two popular datasets. Experimental results demonstrate significantly improved performance compared to the baseline Bag-of-features framework which uses kmeans to construct histogram of words from the feature vectors.
Resumo:
CoMFA and CoMSIA analysis were utilized in this investigation to define the important interacting regions in paclitaxel/tubulin binding site and to develop selective paclitaxel-like active compounds. The starting geometry of paclitaxel analogs was taken from the crystal structure of docetaxel. A total of 28 derivatives of paclitaxel were divided into two groups—a training set comprising of 19 compounds and a test set comprising of nine compounds. They were constructed and geometrically optimized using SYBYL v6.6. CoMFA studies provided a good predictability (q2 = 0.699, r2 = 0.991, PC = 6, S.E.E. = 0.343 and F = 185.910). They showed the steric and electrostatic properties as the major interacting forces whilst the lipophilic property contribution was a minor factor for recognition forces of the binding site. These results were in agreement with the experimental data of the binding activities of these compounds. Five fields in CoMSIA analysis (steric, electrostatic, hydrophobic, hydrogen-bond acceptor and donor properties) were considered contributors in the ligand–receptor interactions. The results obtained from the CoMSIA studies were: q2 = 0.535, r2 = 0.983, PC = 5, S.E.E. = 0.452 and F = 127.884. The data obtained from both CoMFA and CoMSIA studies were interpreted with respect to the paclitaxel/tubulin binding site. This intuitively suggested where the most significant anchoring points for binding affinity are located. This information could be used for the development of new compounds having paclitaxel-like activity with new chemical entities to overcome the existing pharmaceutical barriers and the economical problem associated with the synthesis of the paclitaxel analogs. These will boost the wide use of this useful class of compounds, i.e. in brain tumors as the most of the present active compounds have poor blood–brain barrier crossing ratios and also, various tubulin isotypes has shown resistance to taxanes and other antimitotic agents.