973 resultados para visual odometry algorithms
Resumo:
Magnetoencephalography (MEG) can be used to reconstruct neuronal activity with high spatial and temporal resolution. However, this reconstruction problem is ill-posed, and requires the use of prior constraints in order to produce a unique solution. At present there are a multitude of inversion algorithms, each employing different assumptions, but one major problem when comparing the accuracy of these different approaches is that often the true underlying electrical state of the brain is unknown. In this study, we explore one paradigm, retinotopic mapping in the primary visual cortex (V1), for which the ground truth is known to a reasonable degree of accuracy, enabling the comparison of MEG source reconstructions with the true electrical state of the brain. Specifically, we attempted to localize, using a beanforming method, the induced responses in the visual cortex generated by a high contrast, retinotopically varying stimulus. Although well described in primate studies, it has been an open question whether the induced gamma power in humans due to high contrast gratings derives from V1 rather than the prestriate cortex (V2). We show that the beanformer source estimate in the gamma and theta bands does vary in a manner consistent with the known retinotopy of V1. However, these peak locations, although retinotopically organized, did not accurately localize to the cortical surface. We considered possible causes for this discrepancy and suggest that improved MEG/magnetic resonance imaging co-registration and the use of more accurate source models that take into account the spatial extent and shape of the active cortex may, in future, improve the accuracy of the source reconstructions.
Resumo:
Intermittent photic stimulation (IPS) is a common procedure performed in the electroencephalography (EEG) laboratory in children and adults to detect abnormal epileptogenic sensitivity to flickering light (i.e., photosensitivity). In practice, substantial variability in outcome is anecdotally found due to the many different methods used per laboratory and country. We believe that standardization of procedure, based on scientific and clinical data, should permit reproducible identification and quantification of photosensitivity. We hope that the use of our new algorithm will help in standardizing the IPS procedure, which in turn may more clearly identify and assist monitoring of patients with epilepsy and photosensitivity. Our algorithm goes far beyond that published in 1999 (Epilepsia, 1999a, 40, 75; Neurophysiol Clin, 1999b, 29, 318): it has substantially increased content, detailing technical and logistical aspects of IPS testing and the rationale for many of the steps in the IPS procedure. Furthermore, our latest algorithm incorporates the consensus of repeated scientific meetings of European experts in this field over a period of 6 years with feedback from general neurologists and epileptologists to improve its validity and utility. Accordingly, our European group has provided herein updated algorithms for two different levels of methodology: (1) requirements for defining photosensitivity in patients and in family members of known photosensitive patients and (2) requirements for tailored studies in patients with a clear history of visually induced seizures or complaints, and in those already known to be photosensitive.
Resumo:
Most existing color-based tracking algorithms utilize the statistical color information of the object as the tracking clues, without maintaining the spatial structure within a single chromatic image. Recently, the researches on the multilinear algebra provide the possibility to hold the spatial structural relationship in a representation of the image ensembles. In this paper, a third-order color tensor is constructed to represent the object to be tracked. Considering the influence of the environment changing on the tracking, the biased discriminant analysis (BDA) is extended to the tensor biased discriminant analysis (TBDA) for distinguishing the object from the background. At the same time, an incremental scheme for the TBDA is developed for the tensor biased discriminant subspace online learning, which can be used to adapt to the appearance variant of both the object and background. The experimental results show that the proposed method can track objects precisely undergoing large pose, scale and lighting changes, as well as partial occlusion. © 2009 Elsevier B.V.
Resumo:
The polyparametric intelligence information system for diagnostics human functional state in medicine and public health is developed. The essence of the system consists in polyparametric describing of human functional state with the unified set of physiological parameters and using the polyparametric cognitive model developed as the tool for a system analysis of multitude data and diagnostics of a human functional state. The model is developed on the basis of general principles geometry and symmetry by algorithms of artificial intelligence systems. The architecture of the system is represented. The model allows analyzing traditional signs - absolute values of electrophysiological parameters and new signs generated by the model – relationships of ones. The classification of physiological multidimensional data is made with a transformer of the model. The results are presented to a physician in a form of visual graph – a pattern individual functional state. This graph allows performing clinical syndrome analysis. A level of human functional state is defined in the case of the developed standard (“ideal”) functional state. The complete formalization of results makes it possible to accumulate physiological data and to analyze them by mathematics methods.
Resumo:
Background: Vigabatrin (VGB) is an anti-epileptic medication which has been linked to peripheral constriction of the visual field. Documenting the natural history associated with continued VGB exposure is important when making decisions about the risk and benefits associated with the treatment. Due to its speed the Swedish Interactive Threshold Algorithm (SITA) has become the algorithm of choice when carrying out Full Threshold automated static perimetry. SITA uses prior distributions of normal and glaucomatous visual field behaviour to estimate threshold sensitivity. As the abnormal model is based on glaucomatous behaviour this algorithm has not been validated for VGB recipients. We aim to assess the clinical utility of the SITA algorithm for accurately mapping VGB attributed field loss. Methods: The sample comprised one randomly selected eye of 16 patients diagnosed with epilepsy, exposed to VGB therapy. A clinical diagnosis of VGB attributed visual field loss was documented in 44% of the group. The mean age was 39.3 years∈±∈14.5 years and the mean deviation was -4.76 dB ±4.34 dB. Each patient was examined with the Full Threshold, SITA Standard and SITA Fast algorithm. Results: SITA Standard was on average approximately twice as fast (7.6 minutes) and SITA Fast approximately 3 times as fast (4.7 minutes) as examinations completed using the Full Threshold algorithm (15.8 minutes). In the clinical environment, the visual field outcome with both SITA algorithms was equivalent to visual field examination using the Full Threshold algorithm in terms of visual inspection of the grey scale plots, defect area and defect severity. Conclusions: Our research shows that both SITA algorithms are able to accurately map visual field loss attributed to VGB. As patients diagnosed with epilepsy are often vulnerable to fatigue, the time saving offered by SITA Fast means that this algorithm has a significant advantage for use with VGB recipients.
Resumo:
This paper considers the problem of low-dimensional visualisation of very high dimensional information sources for the purpose of situation awareness in the maritime environment. In response to the requirement for human decision support aids to reduce information overload (and specifically, data amenable to inter-point relative similarity measures) appropriate to the below-water maritime domain, we are investigating a preliminary prototype topographic visualisation model. The focus of the current paper is on the mathematical problem of exploiting a relative dissimilarity representation of signals in a visual informatics mapping model, driven by real-world sonar systems. A realistic noise model is explored and incorporated into non-linear and topographic visualisation algorithms building on the approach of [9]. Concepts are illustrated using a real world dataset of 32 hydrophones monitoring a shallow-water environment in which targets are present and dynamic.
Resumo:
lmage super-resolution is defined as a class of techniques that enhance the spatial resolution of images. Super-resolution methods can be subdivided in single and multi image methods. This thesis focuses on developing algorithms based on mathematical theories for single image super resolution problems. lndeed, in arder to estimate an output image, we adopta mixed approach: i.e., we use both a dictionary of patches with sparsity constraints (typical of learning-based methods) and regularization terms (typical of reconstruction-based methods). Although the existing methods already per- form well, they do not take into account the geometry of the data to: regularize the solution, cluster data samples (samples are often clustered using algorithms with the Euclidean distance as a dissimilarity metric), learn dictionaries (they are often learned using PCA or K-SVD). Thus, state-of-the-art methods still suffer from shortcomings. In this work, we proposed three new methods to overcome these deficiencies. First, we developed SE-ASDS (a structure tensor based regularization term) in arder to improve the sharpness of edges. SE-ASDS achieves much better results than many state-of-the- art algorithms. Then, we proposed AGNN and GOC algorithms for determining a local subset of training samples from which a good local model can be computed for recon- structing a given input test sample, where we take into account the underlying geometry of the data. AGNN and GOC methods outperform spectral clustering, soft clustering, and geodesic distance based subset selection in most settings. Next, we proposed aSOB strategy which takes into account the geometry of the data and the dictionary size. The aSOB strategy outperforms both PCA and PGA methods. Finally, we combine all our methods in a unique algorithm, named G2SR. Our proposed G2SR algorithm shows better visual and quantitative results when compared to the results of state-of-the-art methods.
Resumo:
SANTANA, André M.; SANTIAGO, Gutemberg S.; MEDEIROS, Adelardo A. D. Real-Time Visual SLAM Using Pre-Existing Floor Lines as Landmarks and a Single Camera. In: CONGRESSO BRASILEIRO DE AUTOMÁTICA, 2008, Juiz de Fora, MG. Anais... Juiz de Fora: CBA, 2008.
Resumo:
SANTANA, André M.; SANTIAGO, Gutemberg S.; MEDEIROS, Adelardo A. D. Real-Time Visual SLAM Using Pre-Existing Floor Lines as Landmarks and a Single Camera. In: CONGRESSO BRASILEIRO DE AUTOMÁTICA, 2008, Juiz de Fora, MG. Anais... Juiz de Fora: CBA, 2008.
Resumo:
Increasing the size of training data in many computer vision tasks has shown to be very effective. Using large scale image datasets (e.g. ImageNet) with simple learning techniques (e.g. linear classifiers) one can achieve state-of-the-art performance in object recognition compared to sophisticated learning techniques on smaller image sets. Semantic search on visual data has become very popular. There are billions of images on the internet and the number is increasing every day. Dealing with large scale image sets is intense per se. They take a significant amount of memory that makes it impossible to process the images with complex algorithms on single CPU machines. Finding an efficient image representation can be a key to attack this problem. A representation being efficient is not enough for image understanding. It should be comprehensive and rich in carrying semantic information. In this proposal we develop an approach to computing binary codes that provide a rich and efficient image representation. We demonstrate several tasks in which binary features can be very effective. We show how binary features can speed up large scale image classification. We present learning techniques to learn the binary features from supervised image set (With different types of semantic supervision; class labels, textual descriptions). We propose several problems that are very important in finding and using efficient image representation.
Resumo:
Visual recognition is a fundamental research topic in computer vision. This dissertation explores datasets, features, learning, and models used for visual recognition. In order to train visual models and evaluate different recognition algorithms, this dissertation develops an approach to collect object image datasets on web pages using an analysis of text around the image and of image appearance. This method exploits established online knowledge resources (Wikipedia pages for text; Flickr and Caltech data sets for images). The resources provide rich text and object appearance information. This dissertation describes results on two datasets. The first is Berg’s collection of 10 animal categories; on this dataset, we significantly outperform previous approaches. On an additional set of 5 categories, experimental results show the effectiveness of the method. Images are represented as features for visual recognition. This dissertation introduces a text-based image feature and demonstrates that it consistently improves performance on hard object classification problems. The feature is built using an auxiliary dataset of images annotated with tags, downloaded from the Internet. Image tags are noisy. The method obtains the text features of an unannotated image from the tags of its k-nearest neighbors in this auxiliary collection. A visual classifier presented with an object viewed under novel circumstances (say, a new viewing direction) must rely on its visual examples. This text feature may not change, because the auxiliary dataset likely contains a similar picture. While the tags associated with images are noisy, they are more stable when appearance changes. The performance of this feature is tested using PASCAL VOC 2006 and 2007 datasets. This feature performs well; it consistently improves the performance of visual object classifiers, and is particularly effective when the training dataset is small. With more and more collected training data, computational cost becomes a bottleneck, especially when training sophisticated classifiers such as kernelized SVM. This dissertation proposes a fast training algorithm called Stochastic Intersection Kernel Machine (SIKMA). This proposed training method will be useful for many vision problems, as it can produce a kernel classifier that is more accurate than a linear classifier, and can be trained on tens of thousands of examples in two minutes. It processes training examples one by one in a sequence, so memory cost is no longer the bottleneck to process large scale datasets. This dissertation applies this approach to train classifiers of Flickr groups with many group training examples. The resulting Flickr group prediction scores can be used to measure image similarity between two images. Experimental results on the Corel dataset and a PASCAL VOC dataset show the learned Flickr features perform better on image matching, retrieval, and classification than conventional visual features. Visual models are usually trained to best separate positive and negative training examples. However, when recognizing a large number of object categories, there may not be enough training examples for most objects, due to the intrinsic long-tailed distribution of objects in the real world. This dissertation proposes an approach to use comparative object similarity. The key insight is that, given a set of object categories which are similar and a set of categories which are dissimilar, a good object model should respond more strongly to examples from similar categories than to examples from dissimilar categories. This dissertation develops a regularized kernel machine algorithm to use this category dependent similarity regularization. Experiments on hundreds of categories show that our method can make significant improvement for categories with few or even no positive examples.
Resumo:
Sequences of timestamped events are currently being generated across nearly every domain of data analytics, from e-commerce web logging to electronic health records used by doctors and medical researchers. Every day, this data type is reviewed by humans who apply statistical tests, hoping to learn everything they can about how these processes work, why they break, and how they can be improved upon. To further uncover how these processes work the way they do, researchers often compare two groups, or cohorts, of event sequences to find the differences and similarities between outcomes and processes. With temporal event sequence data, this task is complex because of the variety of ways single events and sequences of events can differ between the two cohorts of records: the structure of the event sequences (e.g., event order, co-occurring events, or frequencies of events), the attributes about the events and records (e.g., gender of a patient), or metrics about the timestamps themselves (e.g., duration of an event). Running statistical tests to cover all these cases and determining which results are significant becomes cumbersome. Current visual analytics tools for comparing groups of event sequences emphasize a purely statistical or purely visual approach for comparison. Visual analytics tools leverage humans' ability to easily see patterns and anomalies that they were not expecting, but is limited by uncertainty in findings. Statistical tools emphasize finding significant differences in the data, but often requires researchers have a concrete question and doesn't facilitate more general exploration of the data. Combining visual analytics tools with statistical methods leverages the benefits of both approaches for quicker and easier insight discovery. Integrating statistics into a visualization tool presents many challenges on the frontend (e.g., displaying the results of many different metrics concisely) and in the backend (e.g., scalability challenges with running various metrics on multi-dimensional data at once). I begin by exploring the problem of comparing cohorts of event sequences and understanding the questions that analysts commonly ask in this task. From there, I demonstrate that combining automated statistics with an interactive user interface amplifies the benefits of both types of tools, thereby enabling analysts to conduct quicker and easier data exploration, hypothesis generation, and insight discovery. The direct contributions of this dissertation are: (1) a taxonomy of metrics for comparing cohorts of temporal event sequences, (2) a statistical framework for exploratory data analysis with a method I refer to as high-volume hypothesis testing (HVHT), (3) a family of visualizations and guidelines for interaction techniques that are useful for understanding and parsing the results, and (4) a user study, five long-term case studies, and five short-term case studies which demonstrate the utility and impact of these methods in various domains: four in the medical domain, one in web log analysis, two in education, and one each in social networks, sports analytics, and security. My dissertation contributes an understanding of how cohorts of temporal event sequences are commonly compared and the difficulties associated with applying and parsing the results of these metrics. It also contributes a set of visualizations, algorithms, and design guidelines for balancing automated statistics with user-driven analysis to guide users to significant, distinguishing features between cohorts. This work opens avenues for future research in comparing two or more groups of temporal event sequences, opening traditional machine learning and data mining techniques to user interaction, and extending the principles found in this dissertation to data types beyond temporal event sequences.