931 results for large data sets


Relevance:

100.00%

Publisher:

Abstract:

The main problem of pedestrian dead-reckoning (PDR) using only a body-attached inertial measurement unit is the accumulation of heading errors. The heading provided by magnetometers in indoor buildings is in general not reliable and therefore is commonly not used. Recently, a new method called heuristic drift elimination (HDE) was proposed that minimises the heading error when navigating in buildings. It assumes that the majority of buildings have their corridors parallel to each other, or intersecting at right angles, and that consequently most of the time the person walks along a straight path with a heading constrained to one of four possible directions. In this article we study the performance of HDE-based methods in complex buildings, i.e. with pathways also oriented at 45°, long curved corridors, and wide areas where non-oriented motion is possible. We explain how the performance of the original HDE method can deteriorate in complex buildings, and how severe errors can appear in the case of false matches with the building's dominant directions. Although magnetic compassing indoors behaves chaotically, in this article we analyse large data sets to study the potential of magnetic compassing for estimating the absolute yaw angle of a walking person. Apart from these analyses, this article also proposes an improved HDE method called Magnetically-aided Improved Heuristic Drift Elimination (MiHDE), implemented over a PDR framework that uses foot-mounted inertial navigation with an extended Kalman filter (EKF). The EKF is fed with the MiHDE-estimated orientation error, gyro bias corrections, and the confidence in those corrections. We experimentally evaluated the performance of the proposed MiHDE-based PDR method, comparing it with the original HDE implementation. Results show that both methods perform very well in ideal orthogonal narrow-corridor buildings, and that MiHDE outperforms HDE for non-ideal trajectories (e.g. curved paths) and is robust against potential false dominant-direction matches.
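The abstract describes feeding an EKF with an estimated orientation error and a confidence in that correction. The Python sketch below illustrates one possible dominant-direction matching step; the function name, the 10° mismatch threshold, and the confidence formula are illustrative assumptions, not the authors' implementation.

import numpy as np

# Dominant directions every 45°, since complex buildings may have 45° pathways.
DOMINANT_DIRS = np.deg2rad(np.arange(0, 360, 45))

def heading_correction(heading_rad, max_mismatch_deg=10.0):
    """Return (heading_error, confidence), or (None, 0.0) if no safe match."""
    diffs = np.angle(np.exp(1j * (DOMINANT_DIRS - heading_rad)))  # wrapped to [-pi, pi]
    i = int(np.argmin(np.abs(diffs)))
    error = float(diffs[i])
    mismatch = abs(np.rad2deg(error))
    if mismatch > max_mismatch_deg:
        # likely a curved corridor or open area: avoid a false dominant-direction match
        return None, 0.0
    confidence = 1.0 - mismatch / max_mismatch_deg
    return error, confidence  # e.g. used as a heading pseudo-measurement for an EKF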

Relevance:

100.00%

Publisher:

Abstract:

Background: Gray-scale images make up the bulk of data in bio-medical image analysis, and hence the main focus of many image processing tasks lies in the processing of these monochrome images. With ever-improving acquisition devices, spatial and temporal image resolution increases and data sets become very large. Various image processing frameworks exist that make the development of new algorithms easy by using high-level programming languages or visual programming. These frameworks are also accessible to researchers who have little or no background in software development, because they take care of otherwise complex tasks. Specifically, the management of working memory is handled automatically, usually at the price of requiring more of it. As a result, processing large data sets with these tools becomes increasingly difficult on workstation-class computers. One alternative to using these high-level processing tools is the development of new algorithms in a language like C++, which gives the developer full control over how memory is handled, but the resulting workflow for prototyping new algorithms is rather time-intensive and also not appropriate for a researcher with little or no knowledge of software development. Another alternative is to use command-line tools that run image processing tasks, use the hard disk to store intermediate results, and provide automation through shell scripts. Although not as convenient as, e.g., visual programming, this approach is still accessible to researchers without a background in computer science. However, only a few tools exist that provide this kind of processing interface; they are usually quite task-specific and do not provide a clear path from a prototype shell script to a new command-line tool. Results: The proposed framework, MIA, provides a combination of command-line tools, plug-ins, and libraries that make it possible to run image processing tasks interactively in a command shell and to prototype using the corresponding shell scripting language. Since the hard disk serves as the temporary storage, memory management is usually a non-issue in the prototyping phase. By using string-based descriptions for filters, optimizers, and the like, the transition from shell scripts to full-fledged programs implemented in C++ is also made easy. In addition, its design based on atomic plug-ins and single-task command-line tools makes it easy to extend MIA, usually without the need to touch or recompile existing code. Conclusion: In this article we describe the general design of MIA, a general-purpose framework for gray-scale image processing. We demonstrate the applicability of the software with example applications from three different research scenarios, namely motion compensation in myocardial perfusion imaging, the processing of high-resolution image data that arises in virtual anthropology, and retrospective analysis of treatment outcome in orthognathic surgery. With MIA, prototyping algorithms using shell scripts that combine small, single-task command-line tools is a viable alternative to the use of high-level languages, an approach that is especially useful when large data sets need to be processed.
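To illustrate the idea of string-based descriptions of filters mentioned above, here is a small Python sketch of a plug-in registry addressed by strings of the form name:key=value. It is purely illustrative of the concept; the registry, filter names, and parameter syntax are assumptions and not MIA's actual API.

import numpy as np
from scipy.ndimage import gaussian_filter

FILTERS = {}  # registry of filters addressed by short strings such as "gauss:sigma=2"

def register(name):
    def deco(fn):
        FILTERS[name] = fn
        return fn
    return deco

@register("gauss")
def gauss(img, sigma=1.0):
    return gaussian_filter(img, float(sigma))

@register("thresh")
def thresh(img, t=0.5):
    return (img > float(t)).astype(img.dtype)

def apply_filter(img, spec):
    """Apply a filter described as 'name:key=value,key=value'."""
    name, _, params = spec.partition(":")
    kwargs = dict(p.split("=") for p in params.split(",")) if params else {}
    return FILTERS[name](img, **kwargs)

img = np.random.rand(64, 64)
out = apply_filter(apply_filter(img, "gauss:sigma=2"), "thresh:t=0.5")

Because the same descriptive strings can be passed on a command line, embedded in a shell script, or parsed by a compiled program, prototypes and production tools can share one configuration vocabulary, which is the design point the abstract emphasises.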

Relevance:

100.00%

Publisher:

Abstract:

Thesis (Master's)--University of Washington, 2016-06

Relevance:

100.00%

Publisher:

Abstract:

The n-tuple pattern recognition method has been tested on a selection of 11 large data sets from the European Community StatLog project, so that the results could be compared with those reported for the 23 other algorithms the project tested. The results indicate that this ultra-fast memory-based method is a viable competitor to the others, which include optimisation-based neural network algorithms, even though memory-based neural computing is less highly developed in terms of statistical theory.
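For readers unfamiliar with the method, the Python sketch below shows the basic memory-based n-tuple scheme: each tuple reads a fixed random subset of the binary input, training writes into per-class memories, and classification picks the class whose memories respond most often. Tuple size, tuple count, and the binary encoding are illustrative assumptions, not the StatLog configuration.

import numpy as np

class NTupleClassifier:
    def __init__(self, n_bits, n_classes, n_tuples=100, tuple_size=8, seed=0):
        rng = np.random.default_rng(seed)
        # each tuple samples a fixed random subset of the input bit positions
        self.tuples = [rng.choice(n_bits, tuple_size, replace=False)
                       for _ in range(n_tuples)]
        # one set of "seen addresses" per (class, tuple): the memory nodes
        self.memory = [[set() for _ in self.tuples] for _ in range(n_classes)]

    def _addresses(self, x):
        return [tuple(x[idx]) for idx in self.tuples]

    def fit(self, X, y):
        for x, c in zip(X, y):
            for mem, addr in zip(self.memory[c], self._addresses(x)):
                mem.add(addr)                       # training = writing 1s into RAM

    def predict(self, X):
        preds = []
        for x in X:
            addrs = self._addresses(x)
            scores = [sum(addr in mem for mem, addr in zip(class_mem, addrs))
                      for class_mem in self.memory]
            preds.append(int(np.argmax(scores)))    # class with most responding tuples
        return np.array(preds)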

Relevance:

100.00%

Publisher:

Abstract:

We develop an approach for a sparse representation of Gaussian Process (GP) models in order to overcome the limitations of GPs caused by large data sets. The method is based on a combination of a Bayesian online algorithm with a sequential construction of a relevant subsample of the data which fully specifies the prediction of the model. Experimental results on toy examples and large real-world data sets indicate the efficiency of the approach.

Relevance:

100.00%

Publisher:

Abstract:

We develop an approach for sparse representations of Gaussian Process (GP) models (which are Bayesian types of kernel machines) in order to overcome their limitations for large data sets. The method is based on a combination of a Bayesian online algorithm with a sequential construction of a relevant subsample of the data which fully specifies the prediction of the GP model. By using an appealing parametrisation and projection techniques that use the RKHS norm, recursions for the effective parameters and a sparse Gaussian approximation of the posterior process are obtained. This allows for the propagation both of predictions and of Bayesian error measures. The significance and robustness of our approach is demonstrated on a variety of experiments.
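As a rough illustration of the general idea of specifying a GP by a small relevant subsample, the sketch below greedily keeps only points that the current basis set cannot explain and then predicts from that subset. This is a simplified subset-of-data style sketch, not the authors' Bayesian online update or RKHS projection; kernel, thresholds, and noise level are assumptions.

import numpy as np

def rbf(A, B, ls=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls ** 2)

def select_basis(X, max_basis=50, tol=1e-3, ls=1.0):
    """Greedily keep points that the current basis cannot explain (a 'novelty' test)."""
    idx = [0]
    for i in range(1, len(X)):
        Kbb = rbf(X[idx], X[idx], ls) + 1e-8 * np.eye(len(idx))
        kib = rbf(X[i:i + 1], X[idx], ls)
        gamma = float(rbf(X[i:i + 1], X[i:i + 1], ls)
                      - kib @ np.linalg.solve(Kbb, kib.T))
        if gamma > tol and len(idx) < max_basis:
            idx.append(i)
    return idx

def gp_predict(Xb, yb, Xs, ls=1.0, noise=1e-2):
    """Predictive mean of a GP that uses only the retained subsample (Xb, yb)."""
    Kbb = rbf(Xb, Xb, ls) + noise * np.eye(len(Xb))
    return rbf(Xs, Xb, ls) @ np.linalg.solve(Kbb, yb)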

Relevance:

100.00%

Publisher:

Abstract:

We have recently developed a principled approach to interactive non-linear hierarchical visualization [8] based on the Generative Topographic Mapping (GTM). Hierarchical plots are needed when a single visualization plot is not sufficient (e.g. when dealing with large quantities of data). In this paper we extend our system by giving the user a choice of initializing the child plots of the current plot in either an interactive or an automatic mode. In the interactive mode the user interactively selects "regions of interest" as in [8], whereas in the automatic mode an unsupervised minimum message length (MML)-driven construction of a mixture of GTMs is used. The latter is particularly useful when the plots are covered with dense clusters of highly overlapping data projections, making it difficult to use the interactive mode. Such a situation often arises when visualizing large data sets. We illustrate our approach on a data set of 2300 18-dimensional points and mention an extension of our system to accommodate discrete data types.

Relevance:

100.00%

Publisher:

Abstract:

We develop an approach for sparse representations of Gaussian Process (GP) models (which are Bayesian types of kernel machines) in order to overcome their limitations for large data sets. The method is based on a combination of a Bayesian online algorithm with a sequential construction of a relevant subsample of the data which fully specifies the prediction of the GP model. By using an appealing parametrisation and projection techniques that use the RKHS norm, recursions for the effective parameters and a sparse Gaussian approximation of the posterior process are obtained. This allows for the propagation both of predictions and of Bayesian error measures. The significance and robustness of our approach is demonstrated on a variety of experiments.

Relevance:

100.00%

Publisher:

Abstract:

Automatically generating maps of a measured variable of interest can be problematic. In this work we focus on the monitoring network context, where observations are collected and reported by a network of sensors and are then transformed into interpolated maps for use in decision making. Using traditional geostatistical methods, estimating the covariance structure of data collected in an emergency situation can be difficult. Variogram determination, whether by method-of-moments estimators or by maximum likelihood, is very sensitive to extreme values. Even when a monitoring network is in a routine mode of operation, sensors can sporadically malfunction and report extreme values. If these extreme data destabilise the model, causing the covariance structure of the observed data to be incorrectly estimated, the generated maps will be of little value, and the uncertainty estimates in particular will be misleading. Marchant and Lark [2007] propose a REML estimator for the covariance, which is shown to work on small data sets with a manual selection of the damping parameter in the robust likelihood. We show how this can be extended to allow the treatment of large data sets together with an automated approach to all parameter estimation. The projected process kriging framework of Ingram et al. [2007] is extended to allow the use of robust likelihood functions, including the two-component Gaussian and the Huber function. We show how our algorithm is further refined to reduce the computational complexity while at the same time minimising any loss of information. To show the benefits of this method, we use data collected from radiation monitoring networks across Europe. We compare our results to those obtained from traditional kriging methodologies and include comparisons with Box-Cox transformations of the data. We discuss the issue of whether to treat or ignore extreme values, making the distinction between the robust methods, which ignore outliers, and transformation methods, which treat them as part of the (transformed) process. Using a case study based on an extreme radiological event over a large area, we show how radiation data collected from monitoring networks can be analysed automatically and then used to generate reliable maps to inform decision making. We show the limitations of the methods and discuss potential extensions to remedy these.
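As a toy illustration of the role of the Huber function named above, the Python sketch below shows how residuals beyond a damping threshold contribute only linearly rather than quadratically, so sporadic extreme sensor readings cannot dominate a fit. The REML machinery and the paper's automated selection of the damping parameter are not reproduced here; the default value of c is a common textbook choice, not the authors'.

import numpy as np

def huber_loss(r, c=1.345):
    """Quadratic for small residuals, linear beyond the damping parameter c."""
    a = np.abs(r)
    return np.where(a <= c, 0.5 * r ** 2, c * a - 0.5 * c ** 2)

def robust_objective(residuals, c=1.345):
    # summed Huber loss over standardised residuals: outliers get bounded influence
    return float(huber_loss(np.asarray(residuals), c).sum())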

Relevance:

100.00%

Publisher:

Abstract:

Development of mass spectrometry techniques to detect protein oxidation, which contributes to signalling and inflammation, is important. Label-free approaches have the advantage of reduced sample manipulation, but are challenging in complex samples owing to undirected analysis of large data sets using statistical search engines. To identify oxidised proteins in biological samples, we previously developed a targeted approach involving precursor ion scanning for diagnostic MS3 ions from oxidised residues. Here, we tested this approach for other oxidations, and compared it with an alternative approach involving the use of extracted ion chromatograms (XICs) generated from high-resolution MSMS data using very narrow mass windows. This accurate mass XIC data methodology was effective at identifying nitrotyrosine, chlorotyrosine, and oxidative deamination of lysine, and for tyrosine oxidations highlighted more modified peptide species than precursor ion scanning or statistical database searches. Although some false positive peaks still occurred in the XICs, these could be identified by comparative assessment of the peak intensities. The method has the advantage that a number of different modifications can be analysed simultaneously in a single LC-MSMS run. This article is part of a Special Issue entitled: Posttranslational Protein modifications in biology and Medicine. Biological significance: The use of accurate mass extracted product ion chromatograms to detect oxidised peptides could improve the identification of oxidatively damaged proteins in inflammatory conditions. © 2013 Elsevier B.V.
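The sketch below shows, in hedged form, what building an extracted ion chromatogram with a very narrow mass window involves, assuming the spectra are available as (retention time, m/z array, intensity array) tuples and using an illustrative 5 ppm tolerance; both the data layout and the tolerance are assumptions, not the paper's exact settings.

import numpy as np

def xic(scans, target_mz, tol_ppm=5.0):
    """scans: iterable of (rt, mz_array, intensity_array) tuples, one per MS scan."""
    tol = target_mz * tol_ppm * 1e-6           # narrow, accurate-mass window
    rts, intensities = [], []
    for rt, mz, inten in scans:
        mask = np.abs(np.asarray(mz) - target_mz) <= tol
        rts.append(rt)
        intensities.append(float(np.asarray(inten)[mask].sum()))
    return np.array(rts), np.array(intensities)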

Relevance:

100.00%

Publisher:

Abstract:

Purpose: To describe and validate bespoke software designed to extract morphometric data from ciliary muscle Visante Anterior Segment Optical Coherence Tomography (AS-OCT) images. Method: Initially, to ensure the software was capable of appropriately applying tiered refractive index corrections and accurately measuring orthogonal and oblique parameters, 5 sets of custom-made rigid gas-permeable lenses aligned to simulate the sclera and ciliary muscle were imaged by the Visante AS-OCT and analysed by the software. Human temporal ciliary muscle data from 50 participants, extracted via the internal Visante AS-OCT caliper method and via the software, were compared. The repeatability of the software was also investigated by imaging the temporal ciliary muscle of 10 participants on 2 occasions. Results: The mean difference between the software measurements and the absolute thickness of the rigid gas-permeable lenses was not statistically significantly different from 0 (t = -1.458, p = 0.151). Good correspondence was observed between human ciliary muscle measurements obtained by the software and by the internal Visante AS-OCT calipers (maximum thickness t = -0.864, p = 0.392; total length t = 0.860, p = 0.394). The software extracted highly repeatable ciliary muscle measurements (variability ≤6% of the mean value). Conclusion: The bespoke software is capable of extracting accurate and repeatable ciliary muscle measurements and is suitable for analysing large data sets.

Relevance:

100.00%

Publisher:

Abstract:

The selected publications are focused on the relations between users, eGames and the educational context, and how they interact together so that both learning and user performance are improved through feedback provision. A key part of this analysis is the identification of behavioural, anthropological patterns, so that users can be clustered based on their actions and the steps taken in the system (e.g. a social network, online community, or virtual campus). In doing so, we can analyse large data sets of information produced by a broad user sample, which will provide more accurate statistical reports and readings. Furthermore, this research is focused on how users can be clustered based on individual and group behaviour, so that personalized support through feedback is provided and the personal learning process is improved, as well as the group interaction. We take inputs from every person and from the group they belong to, cluster the contributions, find behavioural patterns, and provide personalized feedback to the individual and the group, based on personal and group findings. And we do all this in the context of educational games integrated in learning communities and learning management systems. To carry out this research we designed a set of research questions across the 10 years of published work presented in this thesis. We ask whether users can be clustered based on the inputs provided by them and their groups; if and how these data are useful to improve learner performance and group interaction; if and how feedback becomes a useful tool for such a pedagogical goal; if and how eGames become a powerful context to deploy the pedagogical methodology and the various research methods and activities that make use of that feedback to encourage learning and interaction; and if and how a game design and a learning design must be defined and implemented to achieve these objectives and to facilitate the productive authoring and integration of eGames in pedagogical contexts and frameworks. We conclude that educational games are a resourceful tool to provide a user experience that leads to better personalized learning performance and enhanced group interaction along the way. To do so, eGames, while integrated in an educational context, must follow a specific set of user and technical requirements, so that the playful context supports the underlying pedagogical model. We also conclude that, while playing, users can be clustered based on their personal behaviour and interaction with others, thanks to pattern identification. Based on this information, a set of recommendations is provided to the user and the group in the form of personalized feedback, timely managed for an optimum impact on learning performance and group interaction level. In this research, Digital Anthropology is introduced as a concept at a late stage to provide a backbone across various academic fields including Social Science, Cognitive Science, Behavioural Science, educational games and, of course, technology-enhanced learning. Although only recently described as an evolution of traditional anthropology, this approach to digital behaviour and social structure facilitates understanding amongst fields and a comprehensive view towards a combined approach.
This research takes forward the already existing work and published research on users and eGames for learning, and turns the focus to the next step: the clustering of users based on their behaviour, and the provision of proper, personalized feedback to the user based on that clustering rather than just on isolated inputs from every user. Indeed, this pattern recognition in the described context of eGames in educational settings, aimed at personalized counselling of the user and the group through feedback, is something that has not been accomplished before.
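Purely as an illustration of the kind of behaviour-based user clustering described above, the Python sketch below groups users from simple per-user activity features with k-means. The features, the number of clusters, and the use of k-means itself are assumptions for demonstration, not the thesis's actual method.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)
# hypothetical columns: sessions played, messages posted, quiz score, minutes in game
user_features = rng.random((200, 4))

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(user_features)
labels = kmeans.labels_             # cluster id per user, e.g. to target group feedback
centres = kmeans.cluster_centers_   # "typical" behaviour profile of each cluster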

Relevance:

100.00%

Publisher:

Abstract:

Asymptomatic Plasmodium infection carriers represent a major threat to malaria control worldwide as they are silent natural reservoirs and do not seek medical care. There are no standard criteria for asymptomatic Plasmodium infection; therefore, its diagnosis relies on the presence of the parasite during a specific period of symptomless infection. The antiparasitic immune response can result in a reduced Plasmodium sp. load with control of disease manifestations, which leads to asymptomatic infection. Both the innate and adaptive immune responses seem to play major roles in asymptomatic Plasmodium infection; T regulatory cell activity (through the production of interleukin-10 and transforming growth factor-β) and B-cells (with a broad antibody response) both play prominent roles. Furthermore, molecules involved in the haem detoxification pathway (such as haptoglobin and haem oxygenase-1) and iron metabolism (ferritin and activated c-Jun N-terminal kinase) have emerged in recent years as potential biomarkers and thus are helping to unravel the immune response underlying asymptomatic Plasmodium infection. The acquisition of large data sets and the use of robust statistical tools, including network analysis, associated with well-designed malaria studies will likely help elucidate the immune mechanisms responsible for asymptomatic infection.

Relevance:

100.00%

Publisher:

Abstract:

A camera maps 3-dimensional (3D) world space to a 2-dimensional (2D) image space. In the process it loses the depth information, i.e., the distance from the camera focal point to the imaged objects. It is impossible to recover this information from a single image. However, by using two or more images from different viewing angles this information can be recovered, which in turn can be used to obtain the pose (position and orientation) of the camera. Using this pose, a 3D reconstruction of the imaged objects in the world can be computed. Numerous algorithms have been proposed and implemented to solve this problem; they are commonly called Structure from Motion (SfM). State-of-the-art SfM techniques have been shown to give promising results. However, unlike a Global Positioning System (GPS) or an Inertial Measurement Unit (IMU), which directly give the position and orientation respectively, the camera system estimates them only after running SfM as described above. This makes the pose obtained from a camera highly sensitive to the images captured and to other effects, such as low lighting conditions, poor focus or improper viewing angles. In some applications, for example an Unmanned Aerial Vehicle (UAV) inspecting a bridge or a robot mapping an environment using Simultaneous Localization and Mapping (SLAM), it is often difficult to capture images under ideal conditions. This report examines the use of SfM methods in such applications and the role of combining multiple sensors, viz. sensor fusion, to achieve more accurate and usable position and reconstruction information. This project investigates the role of sensor fusion in accurately estimating the pose of a camera for the application of 3D reconstruction of a scene. The first set of experiments is conducted in a motion capture room. These results are taken as ground truth in order to evaluate the strengths and weaknesses of each sensor and to map their coordinate systems. Then a number of scenarios are targeted where SfM fails. The pose estimates obtained from SfM are replaced by those obtained from other sensors and the 3D reconstruction is completed. Quantitative and qualitative comparisons are made between the 3D reconstruction obtained by using only a camera versus that obtained by using the camera along with a LIDAR and/or an IMU. Additionally, the project also addresses the performance issue faced while handling large data sets of high-resolution images by implementing the system on the Superior high-performance computing cluster at Michigan Technological University.
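The sketch below shows a standard two-view pose-recovery and triangulation step with OpenCV, which is the kind of camera-only estimate the report says can be replaced by other sensors when imaging conditions are poor. It is a generic illustration, not the report's pipeline; the matched points, intrinsics, and parameter values are assumed inputs.

import numpy as np
import cv2

def two_view_pose(pts1, pts2, K):
    """pts1, pts2: Nx2 float32 arrays of matched image points; K: 3x3 camera intrinsics."""
    E, _ = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                                prob=0.999, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)
    # If this camera-only estimate is unreliable (poor lighting, focus, viewpoint),
    # R and t could instead come from an IMU or other sensor, per the fusion idea above.
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R, t])
    X_h = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
    return R, t, (X_h[:3] / X_h[3]).T   # 3D points in the first camera's frame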

Relevance:

100.00%

Publisher:

Abstract:

The study of acoustic communication in animals often requires not only the recognition of species-specific acoustic signals but also the identification of individual subjects, all against a complex acoustic background. Moreover, when very long recordings are to be analyzed, automatic recognition and identification processes are invaluable tools to extract the relevant biological information. A pattern recognition methodology based on hidden Markov models is presented, inspired by the successful results obtained with the most widely known and complex acoustic communication signal: human speech. This methodology was applied here for the first time to the detection and recognition of fish acoustic signals, specifically in a stream of round-the-clock recordings of Lusitanian toadfish (Halobatrachus didactylus) in their natural estuarine habitat. The results show that this methodology is able not only to detect the mating sounds (boatwhistles) but also to identify individual male toadfish, reaching an identification rate of ca. 95%. Moreover, this method also proved to be a powerful tool to assess signal durations in large data sets. However, the system failed to recognize other sound types.
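A hedged Python sketch of the general HMM-based identification idea described above: one Gaussian HMM per individual, trained on spectral features of its calls, with an unknown sound assigned to the best-scoring model. The feature choice (MFCCs), model sizes, sample rate, and data layout are assumptions for illustration, not the authors' configuration.

import numpy as np
from hmmlearn import hmm
import librosa

def mfcc_features(wav_path, sr=4000):
    y, sr = librosa.load(wav_path, sr=sr)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=12).T   # frames x coefficients

def train_models(training_files):
    """training_files: dict mapping individual id -> list of wav paths of its calls."""
    models = {}
    for fish_id, paths in training_files.items():
        feats = [mfcc_features(p) for p in paths]
        m = hmm.GaussianHMM(n_components=5, covariance_type="diag", n_iter=50)
        m.fit(np.vstack(feats), lengths=[len(f) for f in feats])
        models[fish_id] = m
    return models

def identify(models, wav_path):
    feats = mfcc_features(wav_path)
    # pick the individual whose HMM gives the highest log-likelihood for this call
    return max(models, key=lambda fid: models[fid].score(feats))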