351 results for Automatic Recognition
Abstract:
The mining industry presents us with a number of ideal applications for sensor-based machine control because of the unstructured environment that exists within each mine. The aim of the research presented here is to increase the productivity of existing large compliant mining machines by retrofitting them with enhanced sensing and control technology. The current research focusses on the automatic control of the swing motion cycle of a dragline and on an automated roof bolting system. We have achieved: (i) closed-loop swing control of a one-tenth scale model dragline; and (ii) single-degree-of-freedom closed-loop visual control of a laboratory electro-hydraulic manipulator built from standard components.
Abstract:
The QUT-NOISE-SRE protocol is designed to mix the large QUT-NOISE database, consisting of over 10 hours of background noise collected across 10 unique locations covering 5 common noise scenarios, with commonly used speaker recognition datasets such as Switchboard, Mixer and the speaker recognition evaluation (SRE) datasets provided by NIST. By allowing common, clean speech corpora to be mixed with a wide variety of noise conditions, environmental reverberant responses, and signal-to-noise ratios, this protocol provides a solid basis for the development, evaluation and benchmarking of robust speaker recognition algorithms, and is freely available to download alongside the QUT-NOISE database. In this work, we use the QUT-NOISE-SRE protocol to evaluate a state-of-the-art PLDA i-vector speaker recognition system, demonstrating the importance of designing voice-activity-detection front-ends specifically for speaker recognition, rather than aiming for perfect coherence with the true speech/non-speech boundaries.
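The core operation behind mixing clean speech with background noise at a chosen signal-to-noise ratio can be sketched as follows. This is a minimal illustration only, not the QUT-NOISE-SRE protocol itself: the function names `power` and `mix_at_snr` are hypothetical, signals are plain lists of samples, and reverberation is not modelled. The noise is scaled by a gain g chosen so that 10·log10(P_speech / (g²·P_noise)) equals the target SNR in dB.

```python
import math


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so that the speech-to-noise power ratio is snr_db.

    Returns (mixed, scaled_noise). Hypothetical helper for illustration;
    the noise is simply truncated to the length of the speech.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech, p_noise = power(speech), power(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must both have non-zero power")
    # Gain that brings the noise power to p_speech / 10^(snr_db / 10).
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    scaled = [gain * n for n in noise]
    mixed = [s + n for s, n in zip(speech, scaled)]
    return mixed, scaled


if __name__ == "__main__":
    import random

    rng = random.Random(0)
    speech = [math.sin(0.1 * i) for i in range(1000)]     # stand-in for clean speech
    noise = [rng.uniform(-1, 1) for _ in range(1500)]     # stand-in for background noise
    mixed, scaled = mix_at_snr(speech, noise, snr_db=10.0)
    print(10 * math.log10(power(speech) / power(scaled)))  # close to 10.0
```

Measuring the SNR over the whole utterance, as here, is one possible convention; a protocol may instead measure speech power only over active speech regions.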
Abstract:
Semantic priming occurs when a subject is faster in recognising a target word when it is preceded by a related word compared to an unrelated word. The effect is attributed to automatic or controlled processing mechanisms elicited by short or long interstimulus intervals (ISIs) between primes and targets. We employed event-related functional magnetic resonance imaging (fMRI) to investigate blood oxygen level dependent (BOLD) responses associated with automatic semantic priming using an experimental design identical to that used in standard behavioural priming tasks. Prime-target semantic strength was manipulated by using lexical ambiguity primes (e.g., bank) and target words related to dominant or subordinate meaning of the ambiguity. Subjects made speeded lexical decisions (word/nonword) on dominant related, subordinate related, and unrelated word pairs presented randomly with a short ISI. The major finding was a pattern of reduced activity in middle temporal and inferior prefrontal regions for dominant versus unrelated and subordinate versus unrelated comparisons, respectively. These findings are consistent with both a dual process model of semantic priming and recent repetition priming data that suggest that reductions in BOLD responses represent neural priming associated with automatic semantic activation and implicate the left middle temporal cortex and inferior prefrontal cortex in more automatic aspects of semantic processing.
Abstract:
We used event-related fMRI to investigate the neural correlates of encoding strength and word frequency effects in recognition memory. At test, participants made Old/New decisions to intermixed low (LF) and high frequency (HF) words that had been presented once or twice at study and to new, unstudied words. The Old/New effect for all hits vs. correctly rejected unstudied words was associated with differential activity in multiple cortical regions, including the anterior medial temporal lobe (MTL), hippocampus, left lateral parietal cortex and anterior left inferior prefrontal cortex (LIPC). Items repeated at study had superior hit rates (HR) compared to items presented once and were associated with reduced activity in the right anterior MTL. By contrast, other regions that had shown conventional Old/New effects did not demonstrate modulation according to memory strength. A mirror effect for word frequency was demonstrated, with the LF word HR advantage associated with increased activity in the left lateral temporal cortex. However, none of the regions that had demonstrated Old/New item retrieval effects showed modulation according to word frequency. These findings are interpreted as supporting single-process memory models proposing a unitary strength-like memory signal and models attributing the LF word HR advantage to the greater lexico-semantic context-noise associated with HF words due to their being experienced in many pre-experimental contexts.
Abstract:
In the present study, items pre-exposed in a familiarization series were included in a list discrimination task to manipulate memory strength. At test, participants were required to discriminate strong targets and strong lures from weak targets and new lures. This resulted in a concordant pattern of increased "old" responses to strong targets and lures. Model estimates attributed this pattern to either equivalent increases in memory strength across the two types of items (unequal variance signal detection model) or equivalent increases in both familiarity and recollection (dual process signal detection [DPSD] model). Hippocampal activity associated with strong targets and lures showed equivalent increases compared with missed items. This remained the case when analyses were restricted to high-confidence responses considered by the DPSD model to reflect predominantly recollection. A similar pattern of activity was observed in parahippocampal cortex for high-confidence responses. The present results are incompatible with "noncriterial" or "false" recollection being reflected solely in inflated DPSD familiarity estimates and support a positive correlation between hippocampal activity and memory strength irrespective of the accuracy of list discrimination, consistent with the unequal variance signal detection model account.
Abstract:
To understand factors that affect brain connectivity and integrity, it is beneficial to automatically cluster white matter (WM) fibers into anatomically recognizable tracts. Whole brain tractography, based on diffusion-weighted MRI, generates vast sets of fibers throughout the brain; clustering them into consistent and recognizable bundles can be difficult as there are wide individual variations in the trajectory and shape of WM pathways. Here we introduce a novel automated tract clustering algorithm based on label fusion - a concept from traditional intensity-based segmentation. Streamline tractography generates many incorrect fibers, so our top-down approach extracts tracts consistent with known anatomy, by mapping multiple hand-labeled atlases into a new dataset. We fuse clustering results from different atlases, using a mean distance fusion scheme. We reliably extracted the major tracts from 105-gradient high angular resolution diffusion images (HARDI) of 198 young normal twins. To compute population statistics, we use a pointwise correspondence method to match, compare, and average WM tracts across subjects. We illustrate our method in a genetic study of white matter tract heritability in twins.
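The idea of fusing labels from several atlases by mean distance can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' algorithm: the atlases are assumed to be already mapped into the subject's space, fibres are lists of 3D points, the fibre-to-fibre metric (symmetric mean closest-point distance) and the rejection threshold `max_dist` are choices made here for the example, and the names `fibre_distance` and `fuse_label` are hypothetical.

```python
import math


def fibre_distance(a, b):
    """Symmetric mean closest-point distance between two fibres (lists of 3D points)."""
    def one_way(src, dst):
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)
    return 0.5 * (one_way(a, b) + one_way(b, a))


def fuse_label(fibre, atlases, max_dist):
    """Assign a tract label to one fibre using several hand-labelled atlases.

    atlases: list of dicts mapping tract name -> list of labelled fibres.
    For each tract, the distance to the nearest labelled fibre is computed in
    every atlas and averaged across atlases. The tract with the smallest mean
    distance wins; fibres farther than max_dist from every tract get None,
    which discards streamlines inconsistent with the atlases.
    """
    tracts = set.intersection(*(set(atlas) for atlas in atlases))
    best_label, best_dist = None, float("inf")
    for tract in sorted(tracts):
        mean_dist = sum(
            min(fibre_distance(fibre, ref) for ref in atlas[tract])
            for atlas in atlases
        ) / len(atlases)
        if mean_dist < best_dist:
            best_label, best_dist = tract, mean_dist
    if best_dist > max_dist:
        return None, best_dist
    return best_label, best_dist
```

Averaging across atlases means that a single atlas with an atypical tract shape has limited influence on the final label, which is the motivation for fusion in the first place.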
Abstract:
Automatic labeling of white matter fibres in diffusion-weighted brain MRI is vital for comparing brain integrity and connectivity across populations, but is challenging. Whole brain tractography generates a vast set of fibres throughout the brain, but it is hard to cluster them into anatomically meaningful tracts, due to wide individual variations in the trajectory and shape of white matter pathways. We propose a novel automatic tract labeling algorithm that fuses information from tractography and multiple hand-labeled fibre tract atlases. As streamline tractography can generate a large number of false positive fibres, we developed a top-down approach to extract tracts consistent with known anatomy, based on a distance metric to multiple hand-labeled atlases. Clustering results from different atlases were fused, using a multi-stage fusion scheme. Our "label fusion" method reliably extracted the major tracts from 105-gradient HARDI scans of 100 young normal adults. © 2012 Springer-Verlag.
Abstract:
We introduce a framework for population analysis of white matter tracts based on diffusion-weighted images of the brain. The framework enables extraction of fibers from high angular resolution diffusion images (HARDI); clustering of the fibers based partly on prior knowledge from an atlas; representation of the fiber bundles compactly using a path following points of highest density (maximum density path; MDP); and registration of these paths together using geodesic curve matching to find local correspondences across a population. We demonstrate our method on 4-Tesla HARDI scans from 565 young adults to compute localized statistics across 50 white matter tracts based on fractional anisotropy (FA). Experimental results show increased sensitivity in the determination of genetic influences on principal fiber tracts compared to the tract-based spatial statistics (TBSS) method. Our results show that the MDP representation reveals important parts of the white matter structure and considerably reduces the dimensionality over comparable fiber matching approaches.
Abstract:
Spoken term detection (STD) is the task of looking up a spoken term in a large volume of speech segments. To provide fast search, speech segments are first indexed into an intermediate representation using speech recognition engines that provide multiple hypotheses for each speech segment. Approximate matching techniques are usually applied at the search stage to compensate for the poor performance of automatic speech recognition engines during indexing. Recently, using visual information in addition to audio information has been shown to improve phone recognition performance, particularly in noisy environments. In this paper, we make use of visual information in the form of lip movements of the speaker in the indexing stage and investigate its effect on STD performance. In particular, we investigate whether gains in phone recognition accuracy carry through the approximate matching stage to provide similar gains in the final audio-visual STD system over a traditional audio-only approach. We also investigate the effect of using visual information on STD performance in different noise environments.
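Approximate matching at the search stage can be illustrated with minimum edit distance over phone sequences. The abstract does not say which matching technique is used, so this is only one common choice, shown as a sketch: the index is assumed to hold a single phone sequence per segment (rather than multiple hypotheses), all edit costs are 1, and the names `edit_distance` and `search` are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (unit insert/delete/substitute costs)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def search(query, index, max_cost):
    """Find windows in indexed phone sequences within max_cost edits of the query.

    index: dict mapping segment id -> list of phones.
    Returns a list of (segment id, start position, cost), best matches first.
    """
    hits = []
    n = len(query)
    for seg_id, phones in index.items():
        for start in range(max(1, len(phones) - n + 1)):
            cost = edit_distance(query, phones[start:start + n])
            if cost <= max_cost:
                hits.append((seg_id, start, cost))
    return sorted(hits, key=lambda hit: hit[2])
```

Allowing a non-zero `max_cost` lets the search tolerate phone recognition errors made during indexing, at the price of more false alarms, which is why gains in phone accuracy do not automatically translate into equal gains in detection performance.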
Abstract:
Speech recognition can be improved by using visual information in the form of lip movements of the speaker in addition to audio information. To date, state-of-the-art techniques for audio-visual speech recognition continue to use audio and visual data from the same database for training their models. In this paper, we present a new approach that makes use of one modality of an external dataset in addition to a given audio-visual dataset. By so doing, it is possible to create more powerful models from other extensive audio-only databases and adapt them to our comparatively smaller multi-stream databases. Results show that the presented approach outperforms the widely adopted synchronous hidden Markov models (HMM) trained jointly on audio and visual data of a given audio-visual database for phone recognition by 29% relative. It also outperforms the external audio models trained on extensive external audio datasets and the internal audio models by 5.5% and 46% relative, respectively. We also show that the proposed approach is beneficial in noisy environments where the audio source is affected by environmental noise.
Abstract:
As critical infrastructure such as transportation hubs continues to grow in complexity, greater importance is placed on monitoring these facilities to ensure their secure and efficient operation. To achieve these goals, technology continues to evolve in response to the needs of various types of infrastructure. To date, however, surveillance technology has focused primarily on security, and little attention has been paid to assisting operations and monitoring performance in real time. Consequently, solutions have emerged that provide real-time measurements of queues and crowding in spaces, but they have been installed as system add-ons (rather than making better use of existing infrastructure), resulting in expensive infrastructure outlay for the owner/operator and an overload of surveillance systems, which in itself creates further complexity. Given that many critical infrastructure sites already have camera networks installed, it is much more desirable to better utilise these networks to address operational monitoring as well as security needs. Recently, a growing number of approaches have been proposed to monitor operational aspects such as pedestrian throughput, crowd size and dwell times. In this paper, we explore how these techniques relate to and complement the more commonly seen security analytics, and demonstrate the value that operational analytics can add by evaluating their performance on airport surveillance data. We explore how multiple analytics and systems can be combined to better leverage the large amount of data that is available, and we discuss the applicability and resulting benefits of the proposed framework for the ongoing operation of airports and airport networks.
Abstract:
Purpose: Traditional construction planning relies upon the critical path method (CPM) and bar charts. Both of these methods suffer from visualization and timing issues that could be addressed by 4D technology specifically geared to meet the needs of the construction industry. This paper proposes a new construction planning approach based on simulation using a game engine. Design/methodology/approach: A 4D automatic simulation tool was developed and a case study was carried out. The proposed tool was used to simulate and optimize the plans for the installation of a temporary platform for piling in a civil construction project in Hong Kong. The tool simulated the result of the construction process with three variables: 1) equipment, 2) site layout and 3) schedule. Through this, the construction team was able to repeatedly simulate a range of options. Findings: The results indicate that the proposed approach can provide a user-friendly 4D simulation platform for the construction industry. The simulation can also identify the solution being sought by the construction team. The paper also identifies directions for further development of 4D technology as an aid in construction planning and decision-making. Research limitations/implications: The tests on the tool are limited to a single case study, and further research is needed to test the use of game engines for construction planning in different construction projects to verify its effectiveness. Future research could also explore the use of alternative game engines and compare their performance and results. Originality/value: The authors propose the use of a game engine to simulate the construction process based on resources, working space and construction schedule. The developed tool can be used by end-users without simulation experience.
Abstract:
Purpose: Optical blur and ageing are known to affect driving performance, but their effects on drivers' eye movements are poorly understood. This study examined the effects of optical blur and age on eye movement patterns and performance on the DriveSafe slide recognition test, which is purported to predict fitness to drive. Methods: Twenty young (27.1 ± 4.6 years) and 20 older (73.3 ± 5.7 years) visually normal drivers performed the DriveSafe under two visual conditions: best-corrected vision and with +2.00 DS blur. The DriveSafe is a Visual Recognition Slide Test that consists of brief presentations of static, real-world driving scenes containing different road users (pedestrians, bicycles and vehicles). Participants reported the types, relative positions and direction of travel of the road users in each image; the score was the number of correctly reported items (maximum score of 128). Eye movements were recorded while participants performed the DriveSafe test using a Tobii TX300 eye tracking system. Results: There was a significant main effect of blur on DriveSafe scores (best-corrected: 114.9 vs blur: 93.2; p < 0.001). There was also a significant age and blur interaction on the DriveSafe scores (p < 0.001) such that the young drivers were more negatively affected by blur than the older drivers (reductions of 22% and 13% respectively; p < 0.001): with best-corrected vision, the young drivers performed better than the older drivers (DriveSafe scores: 118.4 vs 111.5; p = 0.001), while with blur, the young drivers performed worse than the older drivers (88.6 vs 95.9; p = 0.009). For the eye movement patterns, blur significantly reduced the number of fixations on road users (best-corrected: 5.1 vs blur: 4.5; p < 0.001), fixation duration on road users (2.0 s vs 1.8 s; p < 0.001) and saccade amplitudes (7.4° vs 6.7°; p < 0.001). A main effect of age on eye movements was also found, where older drivers made smaller saccades than the young drivers (6.7° vs 7.4°; p < 0.001). Conclusions: Blur reduced DriveSafe scores for both age groups and this effect was greater for the young drivers. The decrease in number of fixations and fixation duration on road users, as well as the reduction in saccade amplitudes under the blurred condition, highlights the difficulty experienced in performing the task in the presence of optical blur, which suggests that uncorrected refractive errors may have a detrimental impact on aspects of driving performance.
Abstract:
Introduction: There is a large range in the reported prevalence of end plate lesions (EPLs), sometimes referred to as Schmorl's nodes, in the general population (3.8-76%). One possible reason for this large range is the differences in definitions used by authors. Previous research has suggested that EPLs may potentially be a primary disturbance of growth plates that leads to the onset of scoliosis. The aim of this study was to develop a technique to measure the size, prevalence and location of EPLs on Computed Tomography (CT) images of scoliosis patients in a consistent manner. Methods: A detection algorithm was developed and applied to measure EPLs for five adolescent females with idiopathic scoliosis (average age 15.1 years, average major Cobb angle 60°). In this algorithm, the EPL definition was based on the lesion depth, the distance from the edge of the vertebral body and the gradient of the lesion edge. Existing low-dose CT scans of the patients' spines were segmented semi-automatically to extract 3D vertebral endplate morphology. Manual sectioning of any attachments between posterior elements of adjacent vertebrae and, if necessary, endplates was carried out before the automatic algorithm was used to determine the presence and position of EPLs. Results: EPLs were identified in 15 of the 170 (8.8%) endplates analysed, with an average depth of 3.1 mm. 73% of the EPLs were seen in the lumbar spine (11/15). A sensitivity study demonstrated that the algorithm was most sensitive to changes in the minimum gradient required at the lesion edge. Conclusion: An imaging analysis technique for consistent measurement of the prevalence, location and size of EPLs on CT images has been developed. Although the technique was tested on scoliosis patients, it can be used to analyse other populations without observer errors in EPL definitions.
Abstract:
Highly efficient loading of bone morphogenetic protein-2 (BMP-2) onto carriers with desirable performance is still a major challenge in the field of bone regeneration. Until now, the nanoscale surface-induced changes in the structure and bioactivity of BMP-2 have remained poorly understood. Here, the effect of the nanoscale surface on the adsorption and bioactivity of BMP-2 was investigated with a series of hydroxyapatite surfaces (HAPs): HAP crystal-coated surface (HAP), HAP crystal-coated polished surface (HAP-Pol), and sintered HAP crystal-coated surface (HAP-Sin). The adsorption dynamics of recombinant human BMP-2 (rhBMP-2) and the accessibility of the binding epitopes of adsorbed rhBMP-2 for BMP receptors (BMPRs) were examined by a quartz crystal microbalance with dissipation. Moreover, the bioactivity of adsorbed rhBMP-2 and the BMP-induced Smad signaling were investigated with C2C12 model cells. A noticeably high mass uptake of rhBMP-2 and enhanced recognition of adsorbed rhBMP-2 by BMPR-IA were found on the HAP-Pol surface. For the rhBMP-2-adsorbed HAPs, both ALP activity and Smad signaling increased in the order HAP-Sin < HAP < HAP-Pol. Furthermore, hybrid molecular dynamics and steered molecular dynamics simulations indicated that BMP-2 anchored tightly on the HAP-Pol surface with a relatively loosened conformation, whereas the HAP-Sin surface induced a compact conformation of BMP-2. In conclusion, the nanostructured HAPs can modulate the mode of adsorption of rhBMP-2, and thus the recognition by BMPR-IA and the bioactivity of rhBMP-2. These findings can provide insightful suggestions for the future design and fabrication of rhBMP-2-based scaffolds/implants.