893 results for data driven approach
Abstract:
Brain tumor is one of the most aggressive types of cancer in humans, with an estimated median survival time of 12 months and only 4% of the patients surviving more than 5 years after disease diagnosis. Until recently, brain tumor prognosis has been based only on clinical information such as tumor grade and patient age, but there are reports indicating that molecular profiling of gliomas can reveal subgroups of patients with distinct survival rates. We hypothesize that coupling molecular profiling of brain tumors with clinical information might improve predictions of patient survival time and, consequently, better guide future treatment decisions. In order to evaluate this hypothesis, the general goal of this research is to build models for survival prediction of glioma patients using DNA molecular profiles (U133 Affymetrix gene expression microarrays) along with clinical information. First, a predictive Random Forest model is built for binary outcomes (i.e. short vs. long-term survival) and a small subset of genes whose expression values can be used to predict survival time is selected. Next, a new statistical methodology is developed for predicting time-to-death outcomes using Bayesian ensemble trees. Because of the large heterogeneity observed within the prognostic classes obtained by the Random Forest model, prediction can be improved by relating time-to-death to the gene expression profile directly. We propose a Bayesian ensemble model for survival prediction which is appropriate for high-dimensional data such as gene expression data. Our approach is based on the ensemble "sum-of-trees" model, which is flexible enough to incorporate additive and interaction effects between genes. We specify a fully Bayesian hierarchical approach and illustrate our methodology for the CPH, Weibull, and AFT survival models. We overcome the lack of conjugacy using a latent variable formulation to model the covariate effects, which decreases computation time for model fitting.
Also, our proposed models provide a model-free way to select important predictive prognostic markers based on controlling false discovery rates. We compare the performance of our methods with baseline reference survival methods and apply our methodology to an unpublished data set of brain tumor survival times and gene expression data, selecting genes potentially related to the development of the disease under study. A closing discussion compares the results obtained by the Random Forest and Bayesian ensemble methods from biological and clinical perspectives and highlights the statistical advantages and disadvantages of the new methodology in the context of DNA microarray data analysis.
Abstract:
Correct predictions of future blood glucose levels in individuals with Type 1 Diabetes (T1D) can be used to provide early warning of upcoming hypo-/hyperglycemic events and thus to improve the patient's safety. To increase prediction accuracy and efficiency, various approaches have been proposed which combine multiple predictors to produce superior results compared to single predictors. Three methods for model fusion are presented and comparatively assessed. Data from 23 T1D subjects under sensor-augmented pump (SAP) therapy were used in two adaptive data-driven models (an autoregressive model with output correction - cARX, and a recurrent neural network - RNN). Data fusion techniques based on i) Dempster-Shafer Evidential Theory (DST), ii) Genetic Algorithms (GA), and iii) Genetic Programming (GP) were used to merge the complementary performances of the prediction models. The fused output is used in a warning algorithm to issue alarms of upcoming hypo-/hyperglycemic events. The fusion schemes showed improved performance with lower root mean square errors, lower time lags, and higher correlation. In the warning algorithm, median daily false alarms (DFA) of 0.25%, and 100% correct alarms (CA) were obtained for both event types. The detection times (DT) before occurrence of events were 13.0 and 12.1 min for hypo- and hyperglycemic events, respectively. Compared to the cARX and RNN models, and a linear fusion of the two, the proposed fusion schemes represent a significant improvement.
Abstract:
In this paper we make a further step towards a dispersive description of the hadronic light-by-light (HLbL) tensor, which should ultimately lead to a data-driven evaluation of its contribution to (g − 2)μ. We first provide a Lorentz decomposition of the HLbL tensor performed according to the general recipe by Bardeen, Tung, and Tarrach, generalizing and extending our previous approach, which was constructed in terms of a basis of helicity amplitudes. Such a tensor decomposition has several advantages: the role of gauge invariance and crossing symmetry becomes fully transparent; the scalar coefficient functions are free of kinematic singularities and zeros, and thus fulfill a Mandelstam double-dispersive representation; and the explicit relation for the HLbL contribution to (g − 2)μ in terms of the coefficient functions simplifies substantially. We demonstrate explicitly that the dispersive approach defines both the pion-pole and the pion-loop contribution unambiguously and in a model-independent way. The pion loop, dispersively defined as pion-box topology, is proven to coincide exactly with the one-loop scalar QED amplitude, multiplied by the appropriate pion vector form factors.
Abstract:
GOAL: We present a newly developed X-ray calibration phantom and its integration for 2-D/3-D pelvis reconstruction and subsequent automatic cup planning. Two different planning strategies were applied and evaluated with clinical data. METHODS: Two different cup planning methods were investigated. The first planning strategy is based on a combined pelvis and cup statistical atlas: the pelvis part of the combined atlas is matched to the reconstructed pelvis model, resulting in an optimized cup planning. The second planning strategy analyzes the morphology of the reconstructed pelvis model to determine the best fitting cup implant. RESULTS: The first planning strategy was compared to 3-D CT-based planning. Digitally reconstructed radiographs of THA patients with pathologies of varying severity were used to evaluate the accuracy of predicting the cup size and position. Within a discrepancy of one cup size, the size was correctly identified in 100% of the cases for Crowe type I datasets and in 77.8% of the cases for Crowe type II, III, and IV datasets. The second planning strategy was analyzed with respect to the eventually implanted cup size. In seven patients, the estimated cup diameter was correct within one cup size, while the estimation for the remaining five patients differed by two cup sizes. CONCLUSION: While both planning strategies showed the same prediction rate with a discrepancy of one cup size (87.5%), the prediction of the exact cup size was higher for the statistical atlas-based strategy (56%) than for the anatomically driven approach (37.5%). SIGNIFICANCE: The proposed approach demonstrated the clinical validity of using a 2-D/3-D reconstruction technique for cup planning.
Abstract:
The largest uncertainties in the Standard Model calculation of the anomalous magnetic moment of the muon (g − 2)μ come from hadronic contributions. In particular, it can be expected that in a few years the subleading hadronic light-by-light (HLbL) contribution will dominate the theory uncertainty. We present a dispersive description of the HLbL tensor, which is based on unitarity, analyticity, crossing symmetry, and gauge invariance. Such a model-independent approach opens up an avenue towards a data-driven determination of the HLbL contribution to (g − 2)μ.
Abstract:
A wide variety of spatial data collection efforts are ongoing throughout local, state and federal agencies, private firms and non-profit organizations. Each effort is established for a different purpose, but organizations and individuals often collect and maintain the same or similar information. The United States federal government has undertaken many initiatives such as the National Spatial Data Infrastructure, the National Map and Geospatial One-Stop to reduce duplicative spatial data collection and promote the coordinated use, sharing, and dissemination of spatial data nationwide. A key premise in most of these initiatives is that no national government will be able to gather and maintain more than a small percentage of the geographic data that users want. Thus, national initiatives typically depend on the cooperation of those already gathering spatial data and those using GIS to meet specific needs to help construct and maintain these spatial data infrastructures and geo-libraries for their nations (Onsrud 2001). Some of the impediments to widespread spatial data sharing are well known from directly asking GIS data producers why they are not currently involved in creating datasets that are of common or compatible formats, documenting their datasets in a standardized metadata format or making their datasets more readily available to others through Data Clearinghouses or geo-libraries. The research described in this thesis addresses the impediments to wide-scale spatial data sharing faced by GIS data producers and explores a new conceptual data-sharing approach, the Public Commons for Geospatial Data, that supports user-friendly metadata creation, open access licenses, archival services and documentation of parent lineage of the contributors and value-adders of digital spatial data sets.
Abstract:
Complex diseases such as cancer result from multiple genetic changes and environmental exposures. Due to the rapid development of genotyping and sequencing technologies, we are now able to more accurately assess causal effects of many genetic and environmental factors. Genome-wide association studies have been able to localize many causal genetic variants predisposing to certain diseases. However, these studies only explain a small portion of variations in the heritability of diseases. More advanced statistical models are urgently needed to identify and characterize some additional genetic and environmental factors and their interactions, which will enable us to better understand the causes of complex diseases. In the past decade, thanks to increasing computational capabilities and novel statistical developments, Bayesian methods have been widely applied in genetics/genomics research and have demonstrated superiority over some regular approaches in certain research areas. Gene-environment and gene-gene interaction studies are among the areas where Bayesian methods may fully exert their functionality and advantages. This dissertation focuses on developing new Bayesian statistical methods for data analysis with complex gene-environment and gene-gene interactions, as well as extending some existing methods for gene-environment interactions to other related areas. It includes three sections: (1) deriving the Bayesian variable selection framework for hierarchical gene-environment and gene-gene interactions; (2) developing the Bayesian Natural and Orthogonal Interaction (NOIA) models for gene-environment interactions; and (3) extending the applications of two Bayesian statistical methods, which were developed for gene-environment interaction studies, to other related types of studies such as adaptive borrowing of historical data.
We propose a Bayesian hierarchical mixture model framework that allows us to investigate the genetic and environmental effects, gene by gene interactions (epistasis) and gene by environment interactions in the same model. It is well known that, in many practical situations, there exists a natural hierarchical structure between the main effects and interactions in the linear model. Here we propose a model that incorporates this hierarchical structure into the Bayesian mixture model, such that the irrelevant interaction effects can be removed more efficiently, resulting in more robust, parsimonious and powerful models. We evaluate both the 'strong hierarchical' and 'weak hierarchical' models, which specify that both or one of the main effects between interacting factors must be present for the interactions to be included in the model. The extensive simulation results show that the proposed strong and weak hierarchical mixture models control the proportion of false positive discoveries and yield a powerful approach to identify the predisposing main effects and interactions in studies with complex gene-environment and gene-gene interactions. We also compare these two models with the 'independent' model, which does not impose this hierarchical constraint, and observe their superior performance in most of the considered situations. The proposed models are applied to real data analysis of gene and environment interactions in lung cancer and cutaneous melanoma case-control studies. The Bayesian statistical models have the advantage of allowing useful prior information to be incorporated in the modeling process. Moreover, the Bayesian mixture model outperforms the multivariate logistic model in terms of parameter estimation and variable selection in most cases.
Our proposed models impose hierarchical constraints that further improve the Bayesian mixture model by reducing the proportion of false positive findings among the identified interactions while successfully identifying the reported associations. This is practically appealing for studies investigating the causal factors from a moderate number of candidate genetic and environmental factors along with a relatively large number of interactions. The natural and orthogonal interaction (NOIA) models of genetic effects have previously been developed to provide an analysis framework in which the estimates of effects for a quantitative trait are statistically orthogonal regardless of the existence of Hardy-Weinberg Equilibrium (HWE) within loci. Ma et al. (2012) recently developed a NOIA model for gene-environment interaction studies and have shown the advantages of using the model for detecting the true main effects and interactions, compared with the usual functional model. In this project, we propose a novel Bayesian statistical model that combines the Bayesian hierarchical mixture model with the NOIA statistical model and the usual functional model. The proposed Bayesian NOIA model demonstrates more power at detecting the non-null effects with higher marginal posterior probabilities. Also, we review two Bayesian statistical models (Bayesian empirical shrinkage-type estimator and Bayesian model averaging), which were developed for gene-environment interaction studies. Inspired by these Bayesian models, we develop two novel statistical methods that are able to handle related problems such as borrowing data from historical studies. The proposed methods are analogous to the methods for gene-environment interactions in that they successfully balance statistical efficiency and bias in a unified model.
Through extensive simulation studies, we compare the operating characteristics of the proposed models with the existing models, including the hierarchical meta-analysis model. The results show that the proposed approaches adaptively borrow the historical data in a data-driven way. These novel models may have a broad range of statistical applications in both genetic/genomic and clinical studies.
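The 'strong', 'weak' and 'independent' hierarchy rules described in this abstract can be illustrated with a small sketch. It only shows which pairwise interactions each rule would admit given a set of selected main effects; it is not the authors' Bayesian mixture model, and the function name and factor labels (`G1`, `E1`, and so on) are hypothetical.

```python
from itertools import combinations


def allowed_interactions(selected_main, candidates, rule="strong"):
    """Return the candidate pairs whose interaction may enter the model.

    strong:      both parent main effects must be selected.
    weak:        at least one parent main effect must be selected.
    independent: no hierarchical constraint is imposed.
    """
    selected = set(selected_main)
    allowed = []
    for a, b in combinations(candidates, 2):
        n_parents = (a in selected) + (b in selected)
        if rule == "strong" and n_parents == 2:
            allowed.append((a, b))
        elif rule == "weak" and n_parents >= 1:
            allowed.append((a, b))
        elif rule == "independent":
            allowed.append((a, b))
    return allowed


# Hypothetical factors: three genes and one environmental exposure.
factors = ["G1", "G2", "G3", "E1"]
main_effects = ["G1", "E1"]  # main effects assumed to be in the model

strong = allowed_interactions(main_effects, factors, "strong")
weak = allowed_interactions(main_effects, factors, "weak")
independent = allowed_interactions(main_effects, factors, "independent")
```

With these inputs the strong rule admits only the G1 x E1 interaction, the weak rule admits every pair except G2 x G3, and the independent rule admits all six pairs, which is why the constrained models search a smaller interaction space.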
Abstract:
We introduce two probabilistic, data-driven models that predict a ship's speed and the situations where a ship is likely to get stuck in ice, based on the joint effect of ice features such as the thickness and concentration of level ice, ice ridges, and rafted ice; ice compression is also considered. To develop the models, two datasets were utilized. First, the data from the Automatic Identification System about the performance of a selected ship was used. Second, a numerical ice model HELMI, developed in the Finnish Meteorological Institute, provided information about the ice field. The relations between the ice conditions and ship movements were established using Bayesian learning algorithms. The case study presented in this paper considers a single and unassisted trip of an ice-strengthened bulk carrier between two Finnish ports in the presence of challenging ice conditions, which varied in time and space. The obtained results show good predictive power of the models: on average 80% for predicting the ship's speed within specified bins, and above 90% for predicting cases where a ship may get stuck in ice. We expect this new approach to facilitate safe and effective route selection in ice-covered waters, where the ship performance is reflected in the objective function.
Abstract:
The uptake of Linked Data (LD) has promoted the proliferation of datasets and their associated ontologies for describing different domains. According to LD principles, developers should reuse as many available terms as possible to describe their data. Importing ontologies or referring to their terms’ URIs are the two main ways to reuse knowledge from available ontologies. In this paper, we have analyzed 18589 terms appearing within 196 ontologies included in the Linked Open Vocabularies (LOV) registry with the aim of understanding the current state of ontology reuse in the LD context. In order to characterize the landscape of ontology reuse in this context, we have extracted statistics about currently reused elements, calculated ratios for reuse, and drawn graphs about imports and references between ontologies. Keywords: ontology, vocabulary, reuse, linked data, ontology import
Abstract:
The uptake of Linked Data (LD) has promoted the proliferation of datasets and their associated ontologies for describing different domains. Particular LD development characteristics such as agility and web-based architecture necessitate the revision, adaptation, and lightening of existing methodologies for ontology development. This thesis proposes a lightweight method for ontology development in an LD context which will be based on data-driven agile developments, existing resources to be reused, and the evaluation of the obtained products considering both classical ontological engineering principles and LD characteristics.
Abstract:
"System identification deals with the problem of building mathematical models of dynamical systems based on observed data from the system" [1]. In the context of civil engineering, the system refers to a large scale structure such as a building, bridge, or an offshore structure, and identification mostly involves the determination of modal parameters (the natural frequencies, damping ratios, and mode shapes). This paper presents some modal identification results obtained using a state-of-the-art time domain system identification method (data-driven stochastic subspace algorithms [2]) applied to the output-only data measured in a steel arch bridge. First, a three dimensional finite element model was developed for the numerical analysis of the structure using ANSYS. Modal analysis was carried out and modal parameters were extracted in the frequency range of interest, 0-10 Hz. The results obtained from the finite element modal analysis were used to determine the location of the sensors. After that, ambient vibration tests were conducted during April 23-24, 2009. The response of the structure was measured using eight accelerometers. Two stations of three sensors were formed (triaxial stations). These sensors were held stationary for reference during the test. The two remaining sensors were placed at the different measurement points along the bridge deck, at which only vertical and transversal measurements were conducted (biaxial stations). Point and interval estimation were carried out in the state space model using these ambient vibration measurements. In the case of parametric models (like state space), the dynamic behaviour of a system is described using mathematical models. Then, mathematical relationships can be established between modal parameters and estimated point parameters (thus, it is common to use experimental modal analysis as a synonym for system identification). Stable modal parameters are found using a stabilization diagram.
Furthermore, this paper proposes a method for assessing the precision of estimates of the parameters of state-space models (confidence interval). This approach employs the nonparametric bootstrap procedure [3] and is applied to the subspace parameter estimation algorithm. Using bootstrap results, a plot similar to a stabilization diagram is developed. These graphics differentiate system modes from spurious noise modes for a system of a given order. Additionally, using the modal assurance criterion, the experimental modes obtained have been compared with those evaluated from a finite element analysis. Good agreement between numerical and experimental results is observed.
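Two of the tools named in this abstract, the nonparametric bootstrap and the modal assurance criterion (MAC), can be sketched generically. The block below is a minimal illustration under stated assumptions, not the paper's procedure: it resamples a list of scalar estimates rather than re-running a subspace identification, it uses the percentile method for the interval, and it handles real-valued mode shapes only. The function names and the frequency values are hypothetical.

```python
import random
import statistics


def bootstrap_percentile_ci(sample, stat=statistics.mean, n_boot=2000,
                            alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(sample)."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement and recompute the statistic each time.
    reps = sorted(
        stat([rng.choice(sample) for _ in range(n)]) for _ in range(n_boot)
    )
    lo = reps[int((alpha / 2) * n_boot)]
    hi = reps[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


def mac(phi_a, phi_b):
    """Modal assurance criterion between two real mode-shape vectors."""
    dot = sum(a * b for a, b in zip(phi_a, phi_b))
    norm_a = sum(a * a for a in phi_a)
    norm_b = sum(b * b for b in phi_b)
    return dot ** 2 / (norm_a * norm_b)


# Hypothetical natural-frequency estimates (Hz) from repeated identifications.
freq_hz = [2.31, 2.29, 2.33, 2.30, 2.32, 2.28, 2.31, 2.30]
ci_low, ci_high = bootstrap_percentile_ci(freq_hz)

# Hypothetical experimental and finite element mode shapes.
mac_value = mac([1.0, 0.8, 0.3], [0.9, 0.85, 0.25])
```

A MAC value near 1 indicates that two mode shapes are nearly proportional, and a value near 0 indicates that they are nearly orthogonal; the criterion does not depend on how each vector is scaled.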
Abstract:
Our brain contains about 10^14 neuronal synapses. This enormous number of connections provides an ideal environment in which different groups of neurons synchronize transiently to give rise to cognitive functions such as perception, learning or thinking. Understanding the organization of this complex brain network on the basis of neurophysiological data is one of the most important and exciting challenges in neuroscience. Several measures have recently been proposed to evaluate how the different parts of the brain communicate at various scales (single cells, cortical columns, or brain areas). They can be classified, according to their symmetry, into two groups: on the one hand, symmetric measures, such as correlation, coherence or phase synchronization, which evaluate functional connectivity (FC); on the other hand, asymmetric measures, such as Granger causality or transfer entropy, which can detect the direction of the interaction, referred to as effective connectivity (EC). In modern neuroscience, interest in the study of functional brain networks has grown, largely owing to the appearance of these new algorithms for analyzing the interdependence between time series, together with the emerging theory of complex networks and the introduction of novel techniques, such as magnetoencephalography (MEG), for recording neurophysiological data with high resolution. However, this is a new field that still has several unresolved methodological questions, some of which this thesis attempts to address. First, the growing number of approaches for determining the existence of FC/EC between two or more time series, together with the mathematical complexity of the analysis tools, makes it desirable to organize them all into an intuitive, easy-to-use software package.
Here I present HERMES (http://hermes.ctb.upm.es), a toolbox for Matlab® designed precisely for this purpose. I believe that this tool will be of great help to all researchers working in the emerging field of brain connectivity analysis and will be of great value to the scientific community. The second practical question addressed is the study of the sensitivity to deep brain sources of two types of MEG sensors, planar gradiometers and magnetometers. This is combined with a methodological approach using two phase synchronization indexes: phase locking value (PLV) and phase lag index (PLI), the latter being less sensitive to the volume conduction effect. Their behavior in the study of brain networks is therefore compared, showing that magnetometers and PLV yield, respectively, more densely connected networks than planar gradiometers and PLI, because of the artificial values created by the volume conduction problem. However, when characterizing epileptic networks, PLV gives better results, owing to the great sparsity of the networks obtained with PLI. Complex network analysis has provided new concepts that improve the characterization of the interaction of dynamical systems. A network is considered to be composed of nodes, which symbolize systems, whose interactions are represented by links, and its behavior and topology can be characterized by a large number of measures. There is theoretical and empirical evidence that many of them are strongly correlated with one another. Therefore, a small group has been selected that effectively characterizes these networks and condenses the redundant information.
For the analysis of functional networks, selecting an appropriate threshold to decide whether a given connectivity value of the FC matrix is significant and should be included in subsequent analysis becomes a crucial step. In this thesis, more accurate results were obtained by using a data-based surrogate test to evaluate each link individually than by setting a fixed threshold for the connection density a priori. Finally, all these questions were applied to the study of epilepsy, a practical case in which resting-state MEG functional networks of two groups of epileptic patients (idiopathic generalized and frontal focal) are analyzed in comparison with healthy control subjects. Epilepsy is one of the most common neurological disorders, with more than 55 million people affected worldwide. The disease is characterized by a predisposition to generate epileptic seizures of abnormal and excessive or synchronous neuronal activity, and it is therefore the perfect setting for this type of analysis, while being of great interest from both the clinical and the research point of view. The results show specific alterations in connectivity and a change in network topology in epileptic brains, shifting the importance from the ‘focus’ to the ‘network’, an approach that is gaining relevance in recent epilepsy research. ABSTRACT There are about 10^14 neuronal synapses in the human brain. This huge number of connections provides the substrate for neuronal ensembles to become transiently synchronized, producing the emergence of cognitive functions such as perception, learning or thinking. Understanding the complex brain network organization on the basis of neuroimaging data represents one of the most important and exciting challenges for systems neuroscience.
Several measures have been recently proposed to evaluate at various scales (single cells, cortical columns, or brain areas) how the different parts of the brain communicate. We can classify them, according to their symmetry, into two groups: symmetric measures, such as correlation, coherence or phase synchronization indexes, evaluate functional connectivity (FC); and on the other hand, the asymmetric ones, such as Granger causality or transfer entropy, are able to detect effective connectivity (EC), revealing the direction of the interaction. In modern neurosciences, the interest in functional brain networks has increased strongly with the onset of new algorithms to study interdependence between time series, the advent of modern complex network theory and the introduction of powerful techniques to record neurophysiological data, such as magnetoencephalography (MEG). However, when analyzing neurophysiological data with this approach several questions arise. In this thesis, I intend to tackle some of the practical open problems in the field. First of all, the increase in the number of time series analysis algorithms to study brain FC/EC, along with their mathematical complexity, creates the necessity of arranging them into a single, unified toolbox that allows neuroscientists, neurophysiologists and researchers from related fields to easily access and make use of them. I developed such a toolbox, named HERMES (http://hermes.ctb.upm.es), which encompasses several of the most common indexes for the assessment of FC and EC and runs in the Matlab® environment. I believe that this toolbox will be very helpful to all the researchers working in the emerging field of brain connectivity analysis and will be of great value to the scientific community.
The second important practical issue tackled in this thesis is the evaluation of the sensitivity to deep brain sources of two different MEG sensors, planar gradiometers and magnetometers, in combination with the related methodological approach, using two phase synchronization indexes: phase locking value (PLV) and phase lag index (PLI), the latter being less sensitive to the volume conduction effect. Thus, I compared their performance when studying brain networks, finding that magnetometer sensors and PLV presented higher artificial values as compared with planar gradiometers and PLI, respectively. However, when it came to characterizing epileptic networks, it was PLV that gave better results, as PLI FC networks were very sparse. Complex network analysis has provided new concepts which improved characterization of interacting dynamical systems. With this background, networks could be considered composed of nodes, symbolizing systems, whose interactions with each other are represented by edges. A growing number of network measures is being applied in network analysis. However, there is theoretical and empirical evidence that many of these indexes are strongly correlated with each other. Therefore, in this thesis I reduced them to a small set, which could more efficiently characterize networks. Within this framework, selecting an appropriate threshold to decide whether a certain connectivity value of the FC matrix is significant and should be included in the network analysis becomes a crucial step. In this thesis, I used surrogate data tests to make an individual data-driven evaluation of the significance of each edge and obtained more accurate results than when just setting the density of connections to a fixed value. All these methodologies were applied to the study of epilepsy, analysing resting-state MEG functional networks in two groups of epileptic patients (generalized and focal epilepsy) that were compared to matching control subjects.
Epilepsy is one of the most common neurological disorders, with more than 55 million people affected worldwide. It is characterized by a predisposition to generate epileptic seizures of abnormal excessive or synchronous neuronal activity, and this scenario and analysis are therefore of great interest from both the clinical and the research perspective. Results revealed specific disruptions in connectivity and showed that network topology is changed in epileptic brains, supporting the shift from ‘focus’ to ‘networks’ which is gaining importance in modern epilepsy research.
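The two phase synchronization indexes compared in this thesis abstract, PLV and PLI, can be sketched from their usual textbook definitions. The block below is an illustration only and is not taken from HERMES: it assumes the instantaneous phases of the two signals are already available (in practice they are often obtained from band-passed signals via the Hilbert transform), and the function names and synthetic phases are hypothetical.

```python
import cmath
import math


def plv(phase_x, phase_y):
    """Phase locking value: modulus of the mean unit phasor of the phase difference."""
    n = len(phase_x)
    total = sum(cmath.exp(1j * (px - py)) for px, py in zip(phase_x, phase_y))
    return abs(total) / n


def pli(phase_x, phase_y):
    """Phase lag index: modulus of the mean sign of sin(phase difference)."""
    def sign(v):
        return (v > 0) - (v < 0)

    n = len(phase_x)
    total = sum(sign(math.sin(px - py)) for px, py in zip(phase_x, phase_y))
    return abs(total) / n


# Synthetic phases (radians): a reference, a zero-lag copy, and a lagged copy.
reference = [0.1 * k for k in range(200)]
zero_lag = list(reference)
lagged = [p - math.pi / 4 for p in reference]

plv_zero, pli_zero = plv(reference, zero_lag), pli(reference, zero_lag)
plv_lag, pli_lag = plv(reference, lagged), pli(reference, lagged)
```

In this toy case PLV is high for both pairs, whereas PLI is zero for the zero-lag pair and high for the lagged pair. Discarding zero-lag coupling is what makes PLI less sensitive to volume conduction, and it is also consistent with the sparser PLI networks reported above.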
Abstract:
We propose a new Bayesian framework for automatically determining the position (location and orientation) of an uncalibrated camera using the observations of moving objects and a schematic map of the passable areas of the environment. Our approach takes advantage of static and dynamic information on the scene structures through prior probability distributions for object dynamics. The proposed approach restricts plausible positions where the sensor can be located while taking into account the inherent ambiguity of the given setting. The proposed framework samples from the posterior probability distribution for the camera position via data-driven MCMC, guided by an initial geometric analysis that restricts the search space. A Kullback-Leibler divergence analysis is then used to yield the final camera position estimate, while explicitly isolating ambiguous settings. The proposed approach is evaluated in synthetic and real environments, showing satisfactory performance in both ambiguous and unambiguous settings.
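For reference, the Kullback-Leibler divergence mentioned in this abstract has a simple form for discrete distributions. The sketch below shows only that standard definition; how the authors apply it to the sampled camera positions is not stated in the abstract, and the function name and example distributions are hypothetical.

```python
import math


def kl_divergence(p, q):
    """D_KL(p || q) in nats for two discrete distributions on the same support.

    Assumes q[i] > 0 wherever p[i] > 0; terms with p[i] == 0 contribute zero.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical distributions over two candidate positions.
uniform = [0.5, 0.5]
peaked = [0.9, 0.1]

same = kl_divergence(uniform, uniform)
different = kl_divergence(uniform, peaked)
```

The divergence is zero when the two distributions coincide and positive otherwise, and it is not symmetric in its arguments.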
Abstract:
Purely data-driven approaches for machine learning present difficulties when data are scarce relative to the complexity of the model or when the model is forced to extrapolate. On the other hand, purely mechanistic approaches need to identify and specify all the interactions in the problem at hand (which may not be feasible) and still leave the issue of how to parameterize the system. In this paper, we present a hybrid approach using Gaussian processes and differential equations to combine data-driven modeling with a physical model of the system. We show how different, physically inspired, kernel functions can be developed through sensible, simple, mechanistic assumptions about the underlying system. The versatility of our approach is illustrated with three case studies from motion capture, computational biology, and geostatistics.
Abstract:
The monkey anterior intraparietal area (AIP) encodes visual information about three-dimensional object shape that is used to shape the hand for grasping. In robotics a similar role has been played by modules that fit point cloud data to the superquadric family of shapes and its various extensions. We developed a model of shape tuning in AIP based on cosine tuning to superquadric parameters. However, the model did not fit the data well, and we also found that it was difficult to accurately reproduce these parameters using neural networks with the appropriate inputs (modelled on the caudal intraparietal area, CIP). The latter difficulty was related to the fact that there are large discontinuities in the superquadric parameters between very similar shapes. To address these limitations we adopted an alternative shape parameterization based on an Isomap nonlinear dimension reduction. The Isomap was built using gradients and curvatures of object surface depth. This alternative parameterization was low-dimensional (like superquadrics), but data-driven (similar to an alternative clustering approach that is also sometimes used in robotics) and lacked large discontinuities. Isomaps with 16 or more dimensions reproduced the AIP data fairly well. Moreover, we found that the Isomap parameters could be approximated from CIP-like input much more accurately than the superquadric parameters. We conclude that Isomaps, or perhaps alternative dimension reductions of CIP signals, provide a promising model of AIP tuning. We have now started to integrate our model with a robot hand, to explore the efficacy of Isomap shape reductions in grasp planning. Future work will consider dynamics of spike responses and integration with related visual and motor area models.