927 results for 3D object recognition


Relevance:

20.00%

Publisher:

Abstract:

Despite being positioned as a standard for data exchange for operation and maintenance data, the database heritage of the MIMOSA OSA-EAI is clearly evident from the use of a relational model at its core. The XML schema (XSD) definitions, which are used for communication between asset management systems, are based on the MIMOSA common relational information schema (CRIS), a relational model, and consequently many database concepts permeate the communications layer. The adoption of a relational model leads to several deficiencies and overlooks advances in object-oriented modelling. For an upcoming version of the specification, the common conceptual object model (CCOM) sees a transition to fully utilising object-oriented features for the standard. Unified modelling language (UML) is used as a medium for documentation as well as for facilitating XSD code generation. This paper details some of the decisions faced in developing the CCOM and provides a glimpse into the future of asset management and data exchange models.

Relevance:

20.00%

Publisher:

Abstract:

Since the availability of 3D full body scanners and the associated software systems for operations with large point clouds, 3D anthropometry has been marketed as a breakthrough and milestone in ergonomic design. The assumptions made by the representatives of the 3D paradigm need to be critically reviewed, though. 3D anthropometry has advantages as well as shortfalls, which need to be carefully considered. While it is apparent that the measurement of a full body point cloud allows for easier storage of raw data and improves quality control, the difficulties in calculating standardized measurements from the point cloud are widely underestimated. Early studies that used 3D point clouds to derive anthropometric dimensions showed unacceptable deviations from the standardized results measured manually. While 3D human point clouds provide a valuable tool to replicate specific single persons for further virtual studies, or to personalize garments, their use in ergonomic design must be critically assessed. Ergonomic, volumetric problems are defined by their two-dimensional boundaries or one-dimensional sections. A 1D/2D approach is therefore sufficient to solve an ergonomic design problem. As a consequence, all modern 3D human manikins are defined by the underlying anthropometric girths (2D) and lengths/widths (1D), which can be measured efficiently using manual techniques. Traditionally, ergonomists have taken a statistical approach to design for generalized percentiles of the population rather than for a single user. The underlying method is based on the distribution function of meaningful one- and two-dimensional anthropometric variables. Compared to these variables, the distribution of human volume has no ergonomic relevance. On the other hand, if volume is to be seen as a two-dimensional integral or distribution function of length and girth, the calculation of combined percentiles (a common ergonomic requirement) is undefined.
Consequently, we suggest critically reviewing the cost and use of 3D anthropometry. We also recommend making proper use of widely available one- and two-dimensional anthropometric data in ergonomic design.
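As a concrete illustration of the one-dimensional statistical approach described above, a short sketch might look like the following (the stature sample, its mean and its spread are purely hypothetical, not taken from the text):

```python
import numpy as np

# Hypothetical sample of a 1D anthropometric variable (stature, in mm).
rng = np.random.default_rng(0)
stature = rng.normal(1755, 70, size=2000)  # illustrative population sample

# Percentile-based design: accommodate the 5th-95th percentile range
# of the single variable, rather than working with a 3D point cloud.
p5, p95 = np.percentile(stature, [5, 95])
print(f"design range: {p5:.0f}-{p95:.0f} mm")
```

The same distribution-function machinery extends to 2D variables such as girths; the undefined case the authors point to arises only when combined percentiles of several variables are requested at once.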

Relevance:

20.00%

Publisher:

Abstract:

Accurate and detailed road models play an important role in a number of geospatial applications, such as infrastructure planning, traffic monitoring, and driver assistance systems. In this thesis, an integrated approach for the automatic extraction of precise road features from high resolution aerial images and LiDAR point clouds is presented. A framework of road information modeling has been proposed for rural and urban scenarios respectively, and an integrated system has been developed to deal with road feature extraction using image and LiDAR analysis. For road extraction in rural regions, a hierarchical image analysis is first performed to maximize the exploitation of road characteristics in different resolutions. The rough locations and directions of roads are provided by the road centerlines detected in low resolution images, both of which can be further employed to facilitate the road information generation in high resolution images. The histogram thresholding method is then chosen to classify road details in high resolution images, where color space transformation is used for data preparation. After the road surface detection, anisotropic Gaussian and Gabor filters are employed to enhance road pavement markings while suppressing other ground objects, such as vegetation and houses. Afterwards, pavement markings are obtained from the filtered image using Otsu's clustering method. The final road model is generated by superimposing the lane markings on the road surfaces, where the digital terrain model (DTM) produced from LiDAR data can also be combined to obtain the 3D road model. As the extraction of roads in urban areas is greatly affected by buildings, shadows, vehicles, and parking lots, we combine high resolution aerial images and dense LiDAR data to fully exploit the precise spectral and horizontal spatial resolution of aerial images and the accurate vertical information provided by airborne LiDAR.
Object-oriented image analysis methods are employed to perform feature classification and road detection in aerial images. In this process, we first utilize an adaptive mean shift (MS) segmentation algorithm to segment the original images into meaningful object-oriented clusters. Then the support vector machine (SVM) algorithm is further applied to the MS segmented image to extract road objects. The road surface detected in LiDAR intensity images is taken as a mask to remove the effects of shadows and trees. In addition, the normalized DSM (nDSM) obtained from LiDAR is employed to filter out other above-ground objects, such as buildings and vehicles. The proposed road extraction approaches are tested using rural and urban datasets respectively. The rural road extraction method is performed using pan-sharpened aerial images of the Bruce Highway, Gympie, Queensland. The road extraction algorithm for urban regions is tested using the datasets of Bundaberg, which combine aerial imagery and LiDAR data. Quantitative evaluation of the extracted road information for both datasets has been carried out. The experiments and the evaluation results using the Gympie datasets show that more than 96% of the road surfaces and over 90% of the lane markings are accurately reconstructed, and the false alarm rates for road surfaces and lane markings are below 3% and 2% respectively. For the urban test sites of Bundaberg, more than 93% of the road surface is correctly reconstructed, and the mis-detection rate is below 10%.
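The Otsu clustering step used to extract pavement markings from the filtered image can be sketched in pure NumPy; the toy "road" image below (dark asphalt with a band of bright markings) is hypothetical:

```python
import numpy as np

def otsu_threshold(img):
    """Otsu's method: pick the grey level maximising between-class variance."""
    hist = np.bincount(img.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    omega = np.cumsum(p)                   # probability of class 0 up to each level
    mu = np.cumsum(p * np.arange(256))     # cumulative mean
    mu_t = mu[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1.0 - omega))
    return int(np.nanargmax(sigma_b))

# Toy filtered image: asphalt around grey level 40, markings around 220.
rng = np.random.default_rng(1)
img = np.clip(rng.normal(40, 10, (64, 64)), 0, 255).astype(np.uint8)
img[28:36, :] = np.clip(rng.normal(220, 10, (8, 64)), 0, 255).astype(np.uint8)

t = otsu_threshold(img)
markings = img > t   # binary lane-marking mask
```

In the thesis pipeline this binarisation is applied after the anisotropic Gaussian and Gabor filtering, so the bimodal histogram assumption is reasonable.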

Relevance:

20.00%

Publisher:

Abstract:

Contact lenses are a common method for the correction of refractive errors of the eye. While there have been significant advancements in contact lens designs and materials over the past few decades, the lenses still represent a foreign object in the ocular environment and may lead to physiological as well as mechanical effects on the eye. When contact lenses are placed in the eye, the ocular anatomical structures behind and in front of the lenses are directly affected. This thesis presents a series of experiments that investigate the mechanical and physiological effects of the short-term use of contact lenses on anterior and posterior corneal topography, corneal thickness, the eyelids, tarsal conjunctiva and tear film surface quality. The experimental paradigm used in these studies was a repeated measures, cross-over study design where subjects wore various types of contact lenses on different days and the lenses were varied in one or more key parameters (e.g. material or design). Both old and newer lens materials were investigated: soft and rigid lenses, high and low oxygen permeability materials, toric and spherical lens designs, high and low powers, and small and large diameter lenses were all used in the studies. To establish the natural variability in the ocular measurements used in the studies, each experiment also contained at least one "baseline" day where an identical measurement protocol was followed, with no contact lenses worn. In this way, changes associated with contact lens wear were considered in relation to those changes that occurred naturally during the 8 hour period of the experiment. In the first study, the regional distribution and magnitude of change in corneal thickness and topography were investigated in the anterior and posterior cornea after short-term use of soft contact lenses in 12 young adults using the Pentacam.
Four different types of contact lenses (Silicone Hydrogel/Spherical/–3D, Silicone Hydrogel/Spherical/–7D, Silicone Hydrogel/Toric/–3D and HEMA/Toric/–3D) of different materials, designs and powers were worn for 8 hours each, on 4 different days. The natural diurnal changes in corneal thickness and curvature were measured on two separate days before any contact lens wear. Significant diurnal changes in corneal thickness and curvature within the duration of the study were observed and these were taken into consideration for calculating the contact lens induced corneal changes. Corneal thickness changed significantly with lens wear and the greatest corneal swelling was seen with the hydrogel (HEMA) toric lens, with a noticeable regional swelling of the cornea beneath the stabilization zones, the thickest regions of the lenses. The anterior corneal surface generally showed a slight flattening with lens wear. All contact lenses resulted in central posterior corneal steepening, which correlated with the relative degree of corneal swelling. The corneal swelling induced by the silicone hydrogel contact lenses was typically less than the natural diurnal thinning of the cornea over this same period (i.e. net thinning). This highlights why it is important to consider the natural diurnal variations in corneal thickness observed from morning to afternoon to accurately interpret contact lens induced corneal swelling. In the second experiment, the relative influence of lenses of different rigidity (polymethyl methacrylate, PMMA; rigid gas permeable, RGP; and silicone hydrogel, SiHy) and diameters (9.5, 10.5 and 14.0 mm) on corneal thickness, topography, refractive power and wavefront error was investigated. Four different types of contact lenses (PMMA/9.5, RGP/9.5, RGP/10.5, SiHy/14.0) were worn by 14 young healthy adults for a period of 8 hours on 4 different days. There was a clear association between fluorescein fitting pattern characteristics (i.e.
regions of minimum clearance in the fluorescein pattern) and the resulting corneal shape changes. PMMA lenses resulted in significant corneal swelling (more in the centre than periphery) along with anterior corneal steepening and posterior flattening. RGP lenses, on the other hand, caused less corneal swelling (more in the periphery than centre) along with opposite effects on corneal curvature: anterior corneal flattening and posterior steepening. RGP lenses also resulted in a clinically and statistically significant decrease in corneal refractive power (ranging from 0.99 to 0.01 D), large enough to affect vision and require adjustment of the lens power. Wavefront analysis also showed a significant increase in higher order aberrations after PMMA lens wear, which may partly explain previous reports of "spectacle blur" following PMMA lens wear. We further explored corneal curvature, thickness and refractive changes with back surface toric and spherical RGP lenses in a group of 6 subjects with toric corneas. The lenses were worn for 8 hours and measurements were taken before and after lens wear, as in previous experiments. Both lens types caused anterior corneal flattening and a decrease in corneal refractive power, but the changes were greater with the spherical lens. The spherical lens also caused a significant decrease in WTR astigmatism (WTR astigmatism is defined as a major axis within 30 degrees of horizontal). Both lenses caused slight posterior corneal steepening and corneal swelling, with a greater effect in the periphery compared to the central cornea. Eyelid position, lid-wiper and tarsal conjunctival staining were also measured in Experiment 2 after short-term use of the rigid and SiHy contact lenses. Digital photos of the external eyes were captured for lid position analysis. The lid-wiper region of the marginal conjunctiva was stained using fluorescein and lissamine green dyes and digital photos were graded by an independent masked observer.
A grading scale was developed in order to describe the tarsal conjunctival staining. A significant decrease in the palpebral aperture height (blepharoptosis) was found after wearing of PMMA/9.5 and RGP/10.5 lenses. All three rigid contact lenses caused a significant increase in lid-wiper and tarsal staining after 8 hours of lens wear. There was also a significant diurnal increase in tarsal staining, even without contact lens wear. These findings highlight the need for better contact lens edge designs to minimise the interactions between the lid and contact lens edge during blinking, and for more lubricious contact lens surfaces to reduce ocular surface micro-trauma due to friction. Tear film surface quality (TFSQ) was measured using a high-speed videokeratoscopy technique in Experiment 2. TFSQ was worse with all the lenses (PMMA/9.5, RGP/9.5, RGP/10.5, and SiHy/14.0) compared to baseline in the afternoon (after 8 hours) during normal and suppressed blinking conditions. The reduction in TFSQ was similar with all the contact lenses used, irrespective of their material and diameter. An unusual pattern of change in TFSQ in suppressed blinking conditions was also found. TFSQ with contact lenses was found to decrease until a certain time, after which it improved to a value even better than that of the bare eye. This is likely to be due to the tear film drying completely over the surface of the contact lenses. The findings of this study also show that there is still scope for improvement in contact lens materials in terms of better wettability and hydrophilicity in order to improve TFSQ and patient comfort. These experiments showed that a variety of changes can occur in the anterior eye as a result of the short-term use of a range of commonly used contact lens types.
The greatest corneal changes occurred with lenses manufactured from older HEMA and PMMA lens materials, whereas modern SiHy and rigid gas permeable materials caused more subtle changes in corneal shape and thickness. All lenses caused signs of micro-trauma to the eyelid wiper and palpebral conjunctiva, although rigid lenses appeared to cause more significant changes. Tear film surface quality was also significantly reduced with all types of contact lenses. These short-term changes in the anterior eye are potential markers for further long term changes and the relative differences between lens types that we have identified provide an indication of areas of contact lens design and manufacture that warrant further development.
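The baseline-day correction used throughout these experiments amounts to subtracting the natural diurnal change from the change measured on a lens-wearing day. A minimal sketch with hypothetical thickness readings (all values illustrative, not from the thesis):

```python
# Hypothetical central corneal thickness readings (micrometres).
baseline_morning = 540.0
baseline_afternoon = 533.0   # natural diurnal thinning on the no-lens day
lens_morning = 541.0
lens_afternoon = 537.0       # after 8 hours of lens wear

diurnal_change = baseline_afternoon - baseline_morning   # -7.0 um (thinning)
measured_change = lens_afternoon - lens_morning          # -4.0 um

# Lens-induced swelling is the measured change corrected for diurnal change.
lens_induced = measured_change - diurnal_change          # +3.0 um of swelling
print(f"net lens-induced swelling: {lens_induced:+.1f} um")
```

Note how a raw afternoon measurement would show net thinning even though the lens actually induced swelling, which is exactly the interpretation pitfall the thesis highlights.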

Relevance:

20.00%

Publisher:

Abstract:

The design of pre-contoured fracture fixation implants (plates and nails) that correctly fit the anatomy of a patient utilises 3D models of long bones with accurate geometric representation. 3D data is usually available from computed tomography (CT) scans of human cadavers that generally represent the above 60 year old age group. Thus, despite the fact that half of the seriously injured population comes from the 30 year age group and below, virtually no data exists from these younger age groups to inform the design of implants that optimally fit patients from these groups. Hence, relevant bone data from these age groups is required. The current gold standard for acquiring such data, CT, involves ionising radiation and cannot be used to scan healthy human volunteers. Magnetic resonance imaging (MRI) has been shown to be a potential alternative in previous studies conducted using small bones (tarsal bones) and parts of long bones. However, in order to use MRI effectively for 3D reconstruction of human long bones, further validations using long bones and appropriate reference standards are required. Accurate reconstruction of 3D models from CT or MRI data sets requires an accurate image segmentation method. Currently available sophisticated segmentation methods involve complex programming and mathematics that researchers are not trained to perform. Therefore, an accurate but relatively simple segmentation method is required for segmentation of CT and MRI data. Furthermore, some of the limitations of 1.5T MRI, such as very long scanning times and poor contrast in articular regions, can potentially be reduced by using higher field 3T MRI. However, a quantification of the signal-to-noise ratio (SNR) gain at the bone-soft tissue interface should be performed; this has not been reported in the literature. As MRI scanning of long bones has very long scanning times, the acquired images are more prone to motion artefacts due to random movements of the subject's limbs.
One of the artefacts observed is the step artefact, believed to arise from random movements of the volunteer during a scan. This needs to be corrected before the models can be used for implant design. As the first aim, this study investigated two segmentation methods, intensity thresholding and Canny edge detection, as accurate but simple segmentation methods for segmentation of MRI and CT data. The second aim was to investigate the usability of MRI as a radiation free imaging alternative to CT for reconstruction of 3D models of long bones. The third aim was to use 3T MRI to improve on the poor contrast in articular regions and long scanning times of current MRI. The fourth and final aim was to minimise the step artefact using 3D modelling techniques. The segmentation methods were investigated using CT scans of five ovine femora. Single-level thresholding was performed using a visually selected threshold level to segment the complete femur. For multilevel thresholding, multiple threshold levels calculated from the threshold selection method were used for the proximal, diaphyseal and distal regions of the femur. Canny edge detection was used by delineating the outer and inner contours of 2D images and then combining them to generate the 3D model. Models generated from these methods were compared to the reference standard generated from mechanical contact scans of the denuded bone. The second aim was achieved using CT and MRI scans of five ovine femora and segmenting them using the multilevel threshold method. A surface geometric comparison was conducted between CT based, MRI based and reference models. To quantitatively compare the 1.5T images to the 3T MRI images, the right lower limbs of five healthy volunteers were scanned using scanners from the same manufacturer. The images obtained using identical protocols were compared by means of SNR and contrast-to-noise ratio (CNR) of muscle, bone marrow and bone.
In order to correct the step artefact in the final 3D models, the step was simulated in five ovine femora scanned with a 3T MRI scanner. The step was corrected using an aligning method based on the iterative closest point (ICP) algorithm. The present study demonstrated that the multi-threshold approach in combination with the threshold selection method can generate 3D models of long bones with an average deviation of 0.18 mm; the corresponding figure for the single threshold method was 0.24 mm. There was a statistically significant difference between the accuracy of the models generated by the two methods. In comparison, the Canny edge detection method generated an average deviation of 0.20 mm. MRI based models exhibited 0.23 mm average deviation in comparison to the 0.18 mm average deviation of CT based models; the differences were not statistically significant. 3T MRI improved the contrast at the bone-muscle interfaces of most anatomical regions of femora and tibiae, potentially improving the inaccuracies conferred by poor contrast of the articular regions. Using the robust ICP algorithm to align the 3D surfaces, the step artefact caused by the volunteer moving the leg was corrected, with errors of 0.32 ± 0.02 mm when compared with the reference standard. The study concludes that magnetic resonance imaging, together with simple multilevel thresholding segmentation, is able to produce 3D models of long bones with accurate geometric representations. The method is, therefore, a potential alternative to the current gold standard CT imaging.
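The ICP-based alignment used to correct the step artefact can be illustrated with a generic point-to-point ICP; this is a minimal sketch, not the thesis's implementation, and the "bone surface" cloud and its misalignment are simulated:

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t mapping src onto dst (Kabsch)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))      # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cd - R @ cs

def icp(src, dst, iters=30):
    """Point-to-point ICP with brute-force nearest-neighbour correspondences."""
    cur = src.copy()
    for _ in range(iters):
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(axis=-1)
        nearest = dst[d2.argmin(axis=1)]        # closest dst point per src point
        R, t = best_rigid_transform(cur, nearest)
        cur = cur @ R.T + t
    return cur

# Toy surface point cloud and a slightly misaligned copy, standing in for
# two scan segments separated by a simulated step artefact.
rng = np.random.default_rng(2)
surface = rng.normal(size=(200, 3))
theta = np.radians(3.0)
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
moved = surface @ Rz.T + np.array([0.10, -0.05, 0.05])

aligned = icp(moved, surface)
rms = float(np.sqrt(((aligned - surface) ** 2).sum(axis=1).mean()))
```

A production implementation would use a spatial index (e.g. a k-d tree) for the nearest-neighbour search rather than the brute-force distance matrix shown here.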

Relevance:

20.00%

Publisher:

Abstract:

This paper describes a new system, dubbed Continuous Appearance-based Trajectory Simultaneous Localisation and Mapping (CAT-SLAM), which augments sequential appearance-based place recognition with local metric pose filtering to improve the frequency and reliability of appearance-based loop closure. As in other approaches to appearance-based mapping, loop closure is performed without calculating global feature geometry or performing 3D map construction. Loop-closure filtering uses a probabilistic distribution of possible loop closures along the robot's previous trajectory, represented as a linked list of previously visited locations connected by odometric information. Sequential appearance-based place recognition and local metric pose filtering are evaluated simultaneously using a Rao–Blackwellised particle filter, which weights particles based on appearance matching over sequential frames and the similarity of robot motion along the trajectory. The particle filter explicitly models both the likelihood of revisiting previous locations and of exploring new locations. A modified resampling scheme counters particle deprivation and allows loop-closure updates to be performed in constant time for a given environment. We compare the performance of CAT-SLAM with FAB-MAP (a state-of-the-art appearance-only SLAM algorithm) using multiple real-world datasets, demonstrating an increase in the number of correct loop closures detected by CAT-SLAM.
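The resampling step can be illustrated with a standard systematic (low-variance) resampler; this is a generic sketch rather than the paper's modified scheme, and the appearance-match weights are hypothetical:

```python
import numpy as np

def systematic_resample(weights, rng):
    """Systematic resampling: low-variance selection of particle indices.

    Each particle with normalised weight w is guaranteed at least
    floor(n * w) copies, which limits the variance of the resampling step.
    """
    n = len(weights)
    positions = (rng.random() + np.arange(n)) / n   # one uniform offset, n strata
    cumsum = np.cumsum(weights)
    cumsum[-1] = 1.0                                # guard against rounding error
    return np.searchsorted(cumsum, positions)

rng = np.random.default_rng(3)
# Toy normalised weights for 5 trajectory particles; particle 2 matches best.
w = np.array([0.05, 0.05, 0.6, 0.25, 0.05])
idx = systematic_resample(w, rng)
```

Countering particle deprivation, as CAT-SLAM's modified scheme does, typically means also reserving probability mass for unexplored locations so no hypothesis class dies out entirely.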

Relevance:

20.00%

Publisher:

Abstract:

Large margin learning approaches, such as support vector machines (SVM), have been successfully applied to numerous classification tasks, especially automatic facial expression recognition. The risk of such approaches, however, is their sensitivity to large margin losses arising from noisy training examples and outliers, which are a common problem in the area of affective computing (i.e., manual coding at the frame level is tedious, so coarse labels are normally assigned). In this paper, we leverage the relaxation of the parallel-hyperplanes constraint and propose the use of modified correlation filters (MCF). The MCF is similar in spirit to SVMs and correlation filters, but with the key difference of optimizing only a single hyperplane. We demonstrate the superiority of MCF over current techniques on a battery of experiments.
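The "single hyperplane" idea can be sketched with a regularised least-squares classifier, the closed-form solution behind classic correlation filters; this is an illustrative stand-in under hypothetical data, not the paper's exact MCF formulation:

```python
import numpy as np

# Toy two-class data standing in for frame-level expression features.
rng = np.random.default_rng(4)
pos = rng.normal(+1.0, 1.0, size=(100, 20))
neg = rng.normal(-1.0, 1.0, size=(100, 20))
X = np.vstack([pos, neg])
y = np.concatenate([np.ones(100), -np.ones(100)])

# Single-hyperplane, correlation-filter-style solution via ridge regression:
#   w = (X^T X + lam I)^{-1} X^T y
# Unlike an SVM, every example contributes a squared-error term, so there is
# no hinge loss to be dominated by a few noisy, coarsely labelled frames.
lam = 1.0
w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

train_acc = float(((X @ w > 0) == (y > 0)).mean())
```

The single hyperplane `w` plays the role of the filter; classification reduces to the sign of a correlation (dot product) with it.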

Relevance:

20.00%

Publisher:

Abstract:

Volume measurements are useful in many branches of science and medicine. They are usually accomplished by acquiring a sequence of cross sectional images through the object using an appropriate scanning modality, for example X-ray computed tomography (CT), magnetic resonance (MR) or ultrasound (US). In the cases of CT and MR, a dividing cubes algorithm can be used to describe the surface as a triangle mesh. However, such algorithms are not suitable for US data, especially when the image sequence is multiplanar (as it usually is). This problem may be overcome by manually tracing regions of interest (ROIs) on the registered multiplanar images and connecting the points into a triangular mesh. In this paper we describe and evaluate a new discrete form of Gauss' theorem which enables the calculation of the volume of any enclosed surface described by a triangular mesh. The volume is calculated by summing the product of the centroid, area and normal of each surface triangle. The algorithm was tested on computer-generated objects, US-scanned balloons, livers and kidneys, and CT-scanned clay rocks. The results, expressed as the mean percentage difference ± one standard deviation, were 1.2 ± 2.3, 5.5 ± 4.7, 3.0 ± 3.2 and −1.2 ± 3.2% for balloons, livers, kidneys and rocks respectively. The results compare favourably with other volume estimation methods such as planimetry and tetrahedral decomposition.
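The discrete Gauss' theorem computation described above, summing for each triangle the scalar product of its centroid with its vector area (area times unit normal) and dividing by three, can be sketched directly; the unit tetrahedron used as a check is my own test case:

```python
import numpy as np

def mesh_volume(vertices, faces):
    """Volume enclosed by a triangle mesh via a discrete form of Gauss'
    theorem: V = (1/3) * sum over faces of centroid . (area * normal),
    assuming consistently outward-wound faces."""
    v = vertices[faces]                             # (n_tri, 3 corners, 3 coords)
    centroid = v.mean(axis=1)
    vec_area = 0.5 * np.cross(v[:, 1] - v[:, 0], v[:, 2] - v[:, 0])
    return float((centroid * vec_area).sum() / 3.0)

# Unit right tetrahedron with outward-wound faces; exact volume is 1/6.
verts = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
faces = np.array([[0, 2, 1],    # base in z=0 plane, normal -z
                  [0, 1, 3],    # face in y=0 plane, normal -y
                  [0, 3, 2],    # face in x=0 plane, normal -x
                  [1, 2, 3]])   # slanted face, normal +(1,1,1)
vol = mesh_volume(verts, faces)
```

The same function applies unchanged to a mesh built by connecting manually traced ROI contours, provided the triangle winding is consistent.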

Relevance:

20.00%

Publisher:

Abstract:

“The Relevance of Religion” is the title of a recent address delivered by The Honourable Chief Justice Murray Gleeson of the High Court of Australia.1 In making the point “about the continuing public importance of religion”, the Chief Justice referenced Lord Devlin’s contention that “no society has yet solved the problem of how to teach morality without religion”....

Relevance:

20.00%

Publisher:

Abstract:

While researchers strive to improve automatic face recognition performance, the relationship between image resolution and face recognition performance has not received much attention. This relationship is examined systematically, and a framework is developed so that results from super-resolution techniques can be compared. Three super-resolution techniques are compared using the Eigenface and Elastic Bunch Graph Matching face recognition engines. The parameter ranges over which these techniques provide better recognition performance than interpolated images are determined.

Relevance:

20.00%

Publisher:

Abstract:

In the cancer research field, most in vitro studies still rely on two-dimensional (2D) cultures. However, the trend is rapidly shifting towards using three-dimensional (3D) culture systems. This is because 3D models better recapitulate the microenvironment of cells and therefore yield cellular and molecular responses that more accurately describe the pathophysiology of cancer. By adopting technology platforms established by the tissue engineering discipline, it is now possible to grow cancer cells in extracellular matrix (ECM)-like environments and dictate the biophysical and biochemical properties of the matrix. In addition, 3D models can be modified to recapitulate different stages of cancer progression, for instance from the initial development of a tumor to metastasis. Inevitably, recapitulating a heterotypic condition comprising more than one cell type requires a more complex 3D model. To date, 3D models for studying prostate cancer (CaP)-bone interactions are still lacking. Therefore, the aim of this study is to establish a co-culture model that allows investigation of direct and indirect CaP-bone interactions. First, 3D polyethylene glycol (PEG)-based hydrogel cultures for CaP cells were developed and growth conditions were optimised. Characterization of the 3D hydrogel cultures shows that LNCaP cells form a multicellular mass that resembles an avascular tumor. Besides the difference in cell morphology, the response of LNCaP cells to stimulation with the androgen analogue R1881 differs from that of cells in 2D cultures. This discrepancy between 2D and 3D cultures is likely associated with cell-cell contact, density and ligand-receptor interactions. Following the 3D monoculture study, a 3D direct co-culture model of CaP cells and a human tissue engineered bone construct (hTEBC) was developed.
Interactions between the CaP cells and human osteoblasts (hOBs) resulted in elevation of matrix metalloproteinase 9 (MMP9) for PC-3 cells and prostate specific antigen (PSA) for LNCaP cells. To further investigate the paracrine interaction of CaP cells and hOBs, a 3D indirect co-culture model was developed, where LNCaP cells embedded within PEG hydrogels were co-cultured with the hTEBC. It was found that the cellular changes observed reflect the early events of CaP colonizing the bone site. Interestingly, even in the absence of androgens, up-regulation of PSA and other kallikreins was detected in the co-culture compared to the LNCaP monoculture. This non-androgenic stimulation could be triggered by soluble factors secreted by the hOBs, such as interleukin-6. A decrease in alkaline phosphatase (ALP) activity and a down-regulation of hOB genes not previously described were also observed when hOBs were co-cultured with LNCaP cells. These genes include transforming growth factor β1 (TGFβ1), osteocalcin and vimentin. However, no changes to epithelial markers (e.g. E-cadherin, cytokeratin 8) were observed in either cell type in the co-culture. Some of these intriguing, previously undescribed changes observed in the co-cultures enrich the basic knowledge of the CaP cell-bone interaction. From this study, we have shown evidence of the feasibility and versatility of our established 3D models. These models can be adapted to test various hypotheses in studies of the underlying mechanisms of bone metastasis and could provide a vehicle for anticancer drug screening in the future.

Relevance:

20.00%

Publisher:

Abstract:

Calcium silicate (CaSiO3, CS) ceramics have received significant attention for application in bone regeneration due to their excellent in vitro apatite-mineralization ability; however, preparing porous CS scaffolds with a controllable pore structure for bone tissue engineering remains a challenge. Conventional methods cannot efficiently control the pore structure and mechanical strength of CS scaffolds, resulting in unstable in vivo osteogenesis. This study sets out to solve these problems by applying a modified 3D-printing method to prepare highly uniform CS scaffolds with controllable pore structure and improved mechanical strength. The in vivo osteogenesis of the prepared 3D-printed CS scaffolds was further investigated by implanting them in femur defects of rats. The results show that the CS scaffolds prepared by the modified 3D-printing method have uniform scaffold morphology. The pore size and pore structure of the CS scaffolds can be efficiently adjusted. The compressive strength of 3D-printed CS scaffolds is around 120 times that of conventional polyurethane-templated CS scaffolds. 3D-printed CS scaffolds possess excellent apatite-mineralization ability in simulated body fluids. Micro-CT analysis has shown that 3D-printed CS scaffolds play an important role in assisting the regeneration of bone defects in vivo. The healing level of bone defects implanted with 3D-printed CS scaffolds is markedly higher than that of 3D-printed β-tricalcium phosphate (β-TCP) scaffolds at both 4 and 8 weeks. Hematoxylin and eosin (H&E) staining shows that 3D-printed CS scaffolds induce a higher quality of newly formed bone than 3D-printed β-TCP scaffolds. Immunohistochemical analyses have further shown stronger expression of human type I collagen (COL1) and alkaline phosphatase (ALP) in the bone matrix in the 3D-printed CS scaffolds than in the 3D-printed β-TCP scaffolds.
Considering these important advantages, such as controllable structural architecture, significantly improved mechanical strength, excellent in vivo osteogenesis and the absence of a second sintering step, the prepared 3D-printed CS scaffolds are a promising material for application in bone regeneration.

Relevance:

20.00%

Publisher:

Abstract:

This paper investigates the effects of limited speech data in the context of speaker verification using a probabilistic linear discriminant analysis (PLDA) approach. Being able to reduce the length of required speech data is important to the development of automatic speaker verification systems in real world applications. When sufficient speech is available, previous research has shown that heavy-tailed PLDA (HTPLDA) modeling of speakers in the i-vector space provides state-of-the-art performance; however, the robustness of HTPLDA to limited speech resources in development, enrolment and verification is an important issue that has not yet been investigated. In this paper, we analyze speaker verification performance with regard to the duration of utterances used for speaker evaluation (enrolment and verification) and for score normalization and PLDA modeling during development. Two different approaches to total-variability representation are analyzed within the PLDA approach to show improved performance in short-utterance mismatched evaluation conditions and in conditions for which insufficient speech resources are available for adequate system development. The results presented in this paper, using the NIST 2008 Speaker Recognition Evaluation dataset, suggest that the HTPLDA system continues to achieve better performance than Gaussian PLDA (GPLDA) as evaluation utterance lengths are decreased. We also highlight the importance of matching durations for score normalization and PLDA modeling to the expected evaluation conditions. Finally, we found that a pooled total-variability approach to PLDA modeling can achieve better performance than the traditional concatenated total-variability approach for short utterances in mismatched evaluation conditions and in conditions for which insufficient speech resources are available for adequate system development.
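Score normalization of the kind discussed here can be illustrated with a generic symmetric s-norm sketch; the cohort scores and trial scores below are hypothetical, and this is not the paper's exact normalization setup:

```python
import numpy as np

def s_norm(score, enrol_cohort_scores, test_cohort_scores):
    """Symmetric score normalisation: average of the raw score z-normalised
    against enrolment-side and test-side impostor cohort score statistics."""
    ze = (score - enrol_cohort_scores.mean()) / enrol_cohort_scores.std()
    zt = (score - test_cohort_scores.mean()) / test_cohort_scores.std()
    return 0.5 * (ze + zt)

rng = np.random.default_rng(5)
# Hypothetical raw PLDA scores of each trial side against an impostor cohort.
enrol_cohort = rng.normal(-5.0, 2.0, 500)
test_cohort = rng.normal(-4.0, 2.5, 500)

target = s_norm(3.0, enrol_cohort, test_cohort)     # genuine-looking trial
impostor = s_norm(-4.5, enrol_cohort, test_cohort)  # impostor-looking trial
```

The paper's duration-matching observation translates here to computing the cohort statistics from utterances whose lengths match the expected evaluation conditions.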

Relevance:

20.00%

Publisher:

Abstract:

In this paper we use a sequence-based visual localization algorithm to reveal surprising answers to the question: how much visual information is actually needed to conduct effective navigation? The algorithm actively searches for the best local image matches within a sliding window of short route segments or 'sub-routes', and matches sub-routes by searching for coherent sequences of local image matches. In contrast to many existing techniques, this technique requires no pre-training or camera parameter calibration. We compare the algorithm's performance to the state-of-the-art FAB-MAP 2.0 algorithm on a 70 km benchmark dataset. Performance matches or exceeds the state-of-the-art feature-based localization technique using images as small as 4 pixels, fields of view reduced by a factor of 250, and pixel bit depths reduced to 2 bits. We present further results demonstrating the system localizing in an office environment with near 100% precision using two 7-bit Lego light sensors, as well as using 16 and 32 pixel images from a motorbike race and a mountain rally car stage. By demonstrating how little image information is required to achieve localization along a route, we hope to stimulate future 'low fidelity' approaches to visual navigation that complement probabilistic feature-based techniques.
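The sub-route matching idea, scoring coherent alignments of short frame sequences rather than single frames, can be sketched as follows; the toy low-resolution frames are hypothetical and this is an illustration of the principle rather than the paper's algorithm:

```python
import numpy as np

def sequence_match(query, database, seq_len=5):
    """Match a short query sub-route against a database of low-resolution
    frames by summing frame differences along candidate alignments."""
    # Pairwise sum-of-absolute-differences between mean-centred tiny frames.
    q = query - query.mean(axis=1, keepdims=True)
    d = database - database.mean(axis=1, keepdims=True)
    diff = np.abs(q[:, None, :] - d[None, :, :]).sum(axis=-1)  # (n_query, n_db)

    # Score each database start position by its coherent diagonal sequence:
    # query frame i must match database frame start + i.
    n_db = database.shape[0]
    scores = np.array([diff[np.arange(seq_len), s + np.arange(seq_len)].sum()
                       for s in range(n_db - seq_len + 1)])
    return int(scores.argmin())

rng = np.random.default_rng(6)
db = rng.random((50, 16))                           # 50 frames of 16 "pixels" each
query = db[20:25] + rng.normal(0, 0.02, (5, 16))    # noisy revisit of frames 20-24

best = sequence_match(query, db)                    # best-matching start index
```

Even with frames this small, the coherence requirement across the sequence makes the correct alignment stand out, which is the paper's central point about low-fidelity localization.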