18 results for Data Driven Modeling
at Universidad Politécnica de Madrid
Abstract:
Advances in information technology in general, and in databases in particular, have made data storage devices cheaper and data processing faster. As a result, organizations tend to store large volumes of data that hold potentially valuable information. Decision Support Systems (DSS) try to use the stored data to obtain valuable information for organizations. In this paper, we use both data models and use cases to represent the functionality of data processing in DSS, following Software Engineering processes. We propose a methodology for developing DSS in the Analysis phase, with respect to data processing modeling. As a starting point, we have used a data model adapted to the semantics involved in multidimensional databases or data warehouses (DW). We have also taken an algorithm that provides all the possible ways to automatically cross-check multidimensional model data. Using these, we propose diagrams and descriptions of use cases, which can be considered patterns representing the DSS functionality with regard to the processing of the DW data on which DSS are based. We highlight the reusability and automation benefits that can be achieved, and we think this study can serve as a guide in the development of DSS.
Abstract:
Purely data-driven approaches for machine learning present difficulties when data are scarce relative to the complexity of the model or when the model is forced to extrapolate. On the other hand, purely mechanistic approaches need to identify and specify all the interactions in the problem at hand (which may not be feasible) and still leave the issue of how to parameterize the system. In this paper, we present a hybrid approach using Gaussian processes and differential equations to combine data-driven modeling with a physical model of the system. We show how different, physically inspired, kernel functions can be developed through sensible, simple, mechanistic assumptions about the underlying system. The versatility of our approach is illustrated with three case studies from motion capture, computational biology, and geostatistics.
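To make the idea of a physically inspired kernel concrete, here is a minimal sketch. It does not reproduce the kernels developed in the paper. It assumes the simplest mechanistic model, a first-order linear differential equation driven by white noise, dx/dt = -x/ell + noise, whose stationary covariance is the exponential (Ornstein-Uhlenbeck) kernel k(t, t') = variance * exp(-|t - t'| / ell). The function names, parameter values, and toy observations are all hypothetical.

```python
import math


def ou_kernel(t1, t2, variance=1.0, ell=1.0):
    """Covariance implied by dx/dt = -x/ell + white noise (assumed toy model)."""
    return variance * math.exp(-abs(t1 - t2) / ell)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def gp_posterior_mean(train_t, train_y, test_t, noise=1e-6, **kernel_args):
    """Posterior mean of a zero-mean GP: k_*^T (K + noise * I)^{-1} y."""
    n = len(train_t)
    gram = [[ou_kernel(train_t[i], train_t[j], **kernel_args)
             + (noise if i == j else 0.0) for j in range(n)] for i in range(n)]
    alpha = solve(gram, train_y)
    return [sum(ou_kernel(t, train_t[i], **kernel_args) * alpha[i]
                for i in range(n)) for t in test_t]


# Hypothetical toy data: three observations of a decaying signal.
train_t = [0.0, 1.0, 2.0]
train_y = [1.0, 0.4, 0.1]
mean = gp_posterior_mean(train_t, train_y, [0.0, 0.5, 1.0, 5.0], ell=1.0)
```

In this sketch the mechanistic assumption enters only through the kernel: far from the data (t = 5.0) the prediction decays towards the prior mean of zero at the rate set by `ell`, which is the behaviour the assumed differential equation dictates when the model has to extrapolate.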
Abstract:
The uptake of Linked Data (LD) has promoted the proliferation of datasets and their associated ontologies for describing different domains. According to LD principles, developers should reuse as many available terms as possible to describe their data. Importing ontologies or referring to their terms’ URIs are the two main ways to reuse knowledge from available ontologies. In this paper, we have analyzed 18589 terms appearing within 196 ontologies included in the Linked Open Vocabularies (LOV) registry with the aim of understanding the current state of ontology reuse in the LD context. In order to characterize the landscape of ontology reuse in this context, we have extracted statistics about currently reused elements, calculated ratios for reuse, and drawn graphs about imports and references between ontologies. Keywords: ontology, vocabulary, reuse, linked data, ontology import
Abstract:
The uptake of Linked Data (LD) has promoted the proliferation of datasets and their associated ontologies for describing different domains. Particular characteristics of LD development, such as agility and web-based architecture, necessitate the revision, adaptation, and lightening of existing methodologies for ontology development. This thesis proposes a lightweight method for ontology development in an LD context, based on data-driven agile development, the reuse of existing resources, and the evaluation of the obtained products considering both classical ontological engineering principles and LD characteristics.
Abstract:
Enterprises are increasingly using a wide range of heterogeneous information systems for executing and governing their business activities. Even if the adoption of service orientation has improved loose coupling and reusability, applications are still isolated data silos whose integration requires complex transformations and mediations. However, by leveraging Linked Data principles, those data silos can now be seamlessly integrated, which opens the door to new data-driven approaches for Enterprise Application Integration (EAI). In this paper, we present LDP4j, an open source Java-based framework for the development of interoperable read-write Linked Data applications, based on the W3C Linked Data Platform (LDP) specification.
Abstract:
Ozone stomatal fluxes were modeled for a 3-year period following different approaches for a commercial variety of durum wheat (Triticum durum Desf. cv. Camacho) at the phenological stage of anthesis. All models performed in the same range, although not all of them afforded equally significant results. Nevertheless, all of them suggest that stomatal conductance would account for the main percentage of ozone deposition fluxes. A new modeling approach was tested, based on a 3-D architectural model of the wheat canopy, and fairly accurate results were obtained. Plant species-specific measurements, as well as measurements of stomatal conductance and environmental parameters, were required. The proposed method calculates ozone stomatal fluxes (FO(3_3-D)) from experimental gs data and models them as a function of certain environmental parameters in conjunction with the YPLANT model. It seems to be adequate: it provides realistic estimates of the canopy FO(3_3-D) and integrates, rather than neglects, the contribution of the lower leaves with respect to the flag leaf. Further development of this model is nevertheless needed.
Abstract:
This paper presents empirical evidence of user bias within a laboratory-oriented evaluation of a Spoken Dialog System. Specifically, we addressed user bias in satisfaction judgements. We question the reliability of these data for modeling user emotion, focusing on contentment and frustration in a spoken dialog system. This bias is detected through machine learning experiments conducted on two datasets, users and annotators, which were then compared in order to assess their reliability. The target was the satisfaction rating and the predictors were conversational/dialog features. Our results indicated that standard classifiers were significantly more successful in discriminating frustration and contentment, and the intensities of these emotions (reflected by user satisfaction ratings), from annotator data than from user data. Indirectly, the results showed that conversational features are reliable predictors of these two emotions.
Abstract:
"System identification deals with the problem of building mathematical models of dynamical systems based on observed data from the system" [1]. In the context of civil engineering, the system refers to a large-scale structure such as a building, a bridge, or an offshore structure, and identification mostly involves the determination of modal parameters (the natural frequencies, damping ratios, and mode shapes). This paper presents some modal identification results obtained using a state-of-the-art time domain system identification method (data-driven stochastic subspace algorithms [2]) applied to the output-only data measured on a steel arch bridge. First, a three-dimensional finite element model was developed for the numerical analysis of the structure using ANSYS. Modal analysis was carried out and modal parameters were extracted in the frequency range of interest, 0-10 Hz. The results obtained from the finite element modal analysis were used to determine the location of the sensors. After that, ambient vibration tests were conducted on April 23-24, 2009. The response of the structure was measured using eight accelerometers. Two stations of three sensors were formed (triaxial stations). These sensors were held stationary for reference during the test. The two remaining sensors were placed at the different measurement points along the bridge deck, at which only vertical and transversal measurements were taken (biaxial stations). Point and interval estimation have been carried out in the state space model using these ambient vibration measurements. In the case of parametric models (like state space), the dynamic behaviour of a system is described using mathematical models. Mathematical relationships can then be established between modal parameters and estimated point parameters (thus, it is common to use experimental modal analysis as a synonym for system identification). Stable modal parameters are found using a stabilization diagram.
Furthermore, this paper proposes a method for assessing the precision of the estimates of the parameters of state-space models (confidence intervals). This approach employs the nonparametric bootstrap procedure [3] and is applied to the subspace parameter estimation algorithm. Using the bootstrap results, a plot similar to a stabilization diagram is developed. These plots differentiate system modes from spurious noise modes for a system of a given order. Additionally, using the modal assurance criterion, the experimental modes obtained have been compared with those evaluated from a finite element analysis. Fairly good agreement between numerical and experimental results is observed.
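The comparison between experimental and finite element modes mentioned above relies on the modal assurance criterion (MAC). As a minimal sketch, assuming its standard definition MAC(a, b) = |a^H b|^2 / ((a^H a)(b^H b)), the snippet below computes a MAC matrix for two sets of mode shapes. The mode shape values are hypothetical and are not taken from the bridge measurements.

```python
def mac(phi_a, phi_b):
    """Modal assurance criterion between two mode shapes (real or complex)."""
    cross = sum(a.conjugate() * b for a, b in zip(phi_a, phi_b))
    norm_a = sum(abs(a) ** 2 for a in phi_a)
    norm_b = sum(abs(b) ** 2 for b in phi_b)
    return abs(cross) ** 2 / (norm_a * norm_b)


def mac_matrix(modes_a, modes_b):
    """MAC value for every pair of modes from the two sets."""
    return [[mac(a, b) for b in modes_b] for a in modes_a]


# Hypothetical mode shapes sampled at three measurement points.
experimental = [[1.0, 2.0, 1.0], [1.0, 0.0, -1.0]]
numerical = [[2.0, 4.0, 2.0], [0.9, 0.1, -1.1]]
table = mac_matrix(experimental, numerical)
```

Values close to 1 on the diagonal of `table` indicate that an experimental mode and its numerical counterpart have nearly the same shape. The criterion is insensitive to scaling, so the first pair above (one shape is twice the other) gives a MAC of 1.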
Abstract:
Our brain contains about 10^14 neuronal synapses. This enormous number of connections provides an ideal environment in which different groups of neurons synchronize transiently to give rise to cognitive functions such as perception, learning, or thinking. Understanding the organization of this complex brain network on the basis of neurophysiological data is one of the most important and exciting challenges in neuroscience. Several measures have recently been proposed to evaluate how the different parts of the brain communicate at various scales (single cells, cortical columns, or brain areas). We can classify them, according to their symmetry, into two groups: on the one hand, symmetric measures, such as correlation, coherence, or phase synchronization, which evaluate functional connectivity (FC); on the other hand, asymmetric measures, such as Granger causality or transfer entropy, which can detect the direction of the interaction, known as effective connectivity (EC). In modern neuroscience, interest in the study of functional brain networks has grown, largely because of the emergence of these new algorithms for analyzing the interdependence between time series, together with the emerging theory of complex networks and the introduction of novel techniques, such as magnetoencephalography (MEG), for recording neurophysiological data with high resolution. However, this is a new field that still presents several unresolved methodological questions, some of which this thesis attempts to address. First, the growing number of approaches for determining the existence of FC/EC between two or more time series, together with the mathematical complexity of the analysis tools, makes it desirable to organize them all in an intuitive, easy-to-use software package.
Here I present HERMES (http://hermes.ctb.upm.es), a toolbox for MATLAB® designed precisely for this purpose. I believe this tool will be of great help to all researchers working in the emerging field of brain connectivity analysis and will be of great value to the scientific community. The second practical question addressed is the sensitivity to deep brain sources of two types of MEG sensors, planar gradiometers and magnetometers. This is combined with a methodological approach that uses two phase synchronization indexes: phase locking value (PLV) and phase lag index (PLI), the latter being less sensitive to the volume conduction effect. Their behavior is compared when studying brain networks, and magnetometers and PLV are found to yield, respectively, more densely connected networks than planar gradiometers and PLI, because of the artificial values created by the volume conduction problem. However, when it comes to characterizing epileptic networks, PLV gives better results, because the networks obtained with PLI are very sparse. Complex network analysis has provided new concepts that improve the characterization of interacting dynamical systems. A network is considered to be composed of nodes, which symbolize systems, whose interactions are represented by links, and its behavior and topology can be characterized by a large number of measures. There is theoretical and empirical evidence that many of these measures are strongly correlated with each other. Therefore, a small group has been selected that characterizes these networks effectively and condenses the redundant information.
For the analysis of functional networks, selecting an appropriate threshold to decide whether a given connectivity value of the FC matrix is significant and should be included in subsequent analysis becomes a crucial step. In this thesis, more accurate results were obtained by using a data-driven surrogate test to evaluate each link individually than by setting a fixed threshold for the connection density a priori. Finally, all these questions were applied to the study of epilepsy, a practical case in which the resting-state MEG functional networks of two groups of epileptic patients (idiopathic generalized and frontal focal) are analyzed in comparison with healthy control subjects. Epilepsy is one of the most common neurological disorders, with more than 55 million people affected worldwide. The disease is characterized by a predisposition to generate epileptic seizures of abnormal, excessive or synchronous neuronal activity. It is therefore the perfect scenario for this type of analysis, and it is of great interest from both the clinical and the research points of view. The results show specific alterations in connectivity and a change in network topology in epileptic brains, shifting the importance from the ‘focus’ to the ‘network’, an approach that is gaining relevance in recent epilepsy research.
ABSTRACT
There are about 10^14 neuronal synapses in the human brain. This huge number of connections provides the substrate for neuronal ensembles to become transiently synchronized, producing the emergence of cognitive functions such as perception, learning, or thinking. Understanding the complex organization of the brain network on the basis of neuroimaging data represents one of the most important and exciting challenges for systems neuroscience.
Several measures have recently been proposed to evaluate how the different parts of the brain communicate at various scales (single cells, cortical columns, or brain areas). We can classify them, according to their symmetry, into two groups: symmetric measures, such as correlation, coherence, or phase synchronization indexes, evaluate functional connectivity (FC), whereas asymmetric ones, such as Granger causality or transfer entropy, are able to detect effective connectivity (EC), revealing the direction of the interaction. In modern neuroscience, interest in functional brain networks has increased strongly with the advent of new algorithms to study interdependence between time series, the rise of modern complex network theory, and the introduction of powerful techniques to record neurophysiological data, such as magnetoencephalography (MEG). However, when analyzing neurophysiological data with this approach, several questions arise. In this thesis, I intend to tackle some of the practical open problems in the field. First of all, the growing number of time series analysis algorithms for studying brain FC/EC, along with their mathematical complexity, creates the need to arrange them into a single, unified toolbox that allows neuroscientists, neurophysiologists, and researchers from related fields to access and use them easily. I developed such a toolbox, named HERMES (http://hermes.ctb.upm.es), which encompasses several of the most common indexes for the assessment of FC and EC and runs in the MATLAB® environment. I believe that this toolbox will be very helpful to all researchers working in the emerging field of brain connectivity analysis and will be of great value to the scientific community.
The second important practical issue tackled in this thesis is the evaluation of the sensitivity to deep brain sources of two different types of MEG sensors, planar gradiometers and magnetometers, in combination with the related methodological approach, using two phase synchronization indexes: phase locking value (PLV) and phase lag index (PLI), the latter being less sensitive to the volume conduction effect. I compared their performance when studying brain networks and found that magnetometers and PLV presented higher artificial values than planar gradiometers and PLI, respectively. However, when it came to characterizing epileptic networks, PLV gave better results, as the PLI FC networks were very sparse. Complex network analysis has provided new concepts that improve the characterization of interacting dynamical systems. Within this framework, networks are considered to be composed of nodes, symbolizing systems, whose interactions with each other are represented by edges. A growing number of network measures are being applied in network analysis. However, there is theoretical and empirical evidence that many of these indexes are strongly correlated with each other. Therefore, in this thesis I reduced them to a small set that can characterize networks more efficiently. Selecting an appropriate threshold to decide whether a certain connectivity value of the FC matrix is significant and should be included in the network analysis then becomes a crucial step. In this thesis, I used surrogate data tests to make an individual, data-driven evaluation of the significance of each edge, and obtained more accurate results than when simply fixing the density of connections. All these methodologies were applied to the study of epilepsy, analysing resting-state MEG functional networks in two groups of epileptic patients (generalized and focal epilepsy) that were compared with matched control subjects.
Epilepsy is one of the most common neurological disorders, with more than 55 million people affected worldwide. It is characterized by a predisposition to generate epileptic seizures of abnormal, excessive or synchronous neuronal activity, so this scenario and analysis are of great interest from both the clinical and the research perspective. The results revealed specific disruptions in connectivity and showed that network topology is changed in epileptic brains, supporting the shift from ‘focus’ to ‘networks’ that is gaining importance in modern epilepsy research.
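The two phase synchronization indexes compared in this thesis can be illustrated with a minimal sketch. It uses their usual definitions, PLV = |mean(exp(i * dphi))| and PLI = |mean(sign(sin(dphi)))|, where dphi is the instantaneous phase difference between two signals. It is not code from HERMES, which is a MATLAB toolbox. The phases are assumed to have been extracted already (for example with a Hilbert transform), and the toy phase series are hypothetical.

```python
import cmath
import math


def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def plv(phase_x, phase_y):
    """Phase locking value: length of the mean phase-difference vector."""
    total = sum(cmath.exp(1j * (px - py)) for px, py in zip(phase_x, phase_y))
    return abs(total) / len(phase_x)


def pli(phase_x, phase_y):
    """Phase lag index: asymmetry of the phase-difference distribution."""
    total = sum(sign(math.sin(px - py)) for px, py in zip(phase_x, phase_y))
    return abs(total) / len(phase_x)


# Hypothetical phase series: one signal and a copy lagging by 0.5 rad.
phases = [0.1 * k for k in range(200)]
lagged = [p - 0.5 for p in phases]

zero_lag = (plv(phases, phases), pli(phases, phases))
constant_lag = (plv(phases, lagged), pli(phases, lagged))
```

With zero lag, PLV is 1 while PLI is 0, which shows why PLI is less sensitive to volume conduction: instantaneous mixing of a common source produces zero-lag coupling, and PLI ignores it. With a constant non-zero lag both indexes are close to 1.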
Abstract:
We propose a new Bayesian framework for automatically determining the position (location and orientation) of an uncalibrated camera using the observations of moving objects and a schematic map of the passable areas of the environment. Our approach takes advantage of static and dynamic information on the scene structures through prior probability distributions for object dynamics. The proposed approach restricts plausible positions where the sensor can be located while taking into account the inherent ambiguity of the given setting. The proposed framework samples from the posterior probability distribution for the camera position via data driven MCMC, guided by an initial geometric analysis that restricts the search space. A Kullback-Leibler divergence analysis is then used that yields the final camera position estimate, while explicitly isolating ambiguous settings. The proposed approach is evaluated in synthetic and real environments, showing its satisfactory performance in both ambiguous and unambiguous settings.
Abstract:
In daily life, today's society interacts constantly through electronic devices and telecommunication services, such as the telephone, e-mail, bank transactions, or online social networks. Without knowing it, we leave massive traces of our activity in the databases of service providers. These new data sources have the dimensions needed to observe patterns of human behavior at large scales. As a result, there has been an unprecedented recent explosion of studies of social systems driven by data analysis and computational processes. In this thesis we develop computational and mathematical methods to analyze social systems through the combined study of data derived from human activity and the theory of complex networks. Our goal is to characterize and understand the systems emerging from social interactions in the new technological spaces, such as the social network Twitter and mobile telephony. We analyze the systems by constructing complex networks and time series, studying their structure, functioning, and evolution over time. We also investigate the nature of the observed patterns through the mechanisms that govern the interactions between individuals, and we measure the impact of critical events on the behavior of the system. To this end, we have proposed models that explain the global structures and the emergent dynamics with which information flows in the system. For the studies of the social network Twitter, we have based our analyses on specific conversations, such as political protests, major events, or electoral processes. From the messages of the conversations, we identify the participating users and build networks of interactions among them.
Specifically, we build one network to represent who receives whose messages and another to represent who propagates whose messages. In general, we have found that these structures have complex properties, such as explosive growth and scale-free degree distributions. Based on the topology of these networks, we have identified three types of users that determine the flow of information according to their activity and influence. To measure the influence of users in the conversations, we have introduced a new measure called user efficiency. Efficiency is defined as the number of retransmissions obtained per message sent, and it measures the effect of individual efforts on the collective reaction. We have observed that the distribution of this property is ubiquitous across several Twitter conversations, regardless of their size or context. We therefore suggest that there is universality in the relationship between individual efforts and collective reactions on Twitter. To explain the factors that determine the emergence of the efficiency distribution, we have developed a computational model that simulates the propagation of messages in the Twitter social network, based on the independent cascade mechanism. This model allows us to measure the effect on the efficiency distribution of both the topology of the underlying social network and the way in which users send messages. The results indicate that the emergence of a select group of highly efficient users depends on the heterogeneity of the underlying network and not on individual behavior. In addition, we have developed techniques to infer the degree of political polarization in social networks. We propose a methodology to estimate opinions in social networks and to measure the degree of polarization in the opinions obtained.
We have designed a model in which we study the effect of the opinion of a small group of influential users, called the elite, on the opinions of the majority of users. The model yields a distribution of opinions on which we measure the degree of polarization. We apply our methodology to measure polarization in message diffusion networks during a Twitter conversation in a politically polarized society. The results obtained correspond closely to offline data. With this study, we have shown that the proposed methodology can determine different degrees of polarization depending on the structure of the network. Finally, we have studied human behavior from mobile phone data. On the one hand, we have characterized the impact of natural disasters, such as floods, on collective behavior. We find that communication patterns are abruptly altered in the areas affected by the catastrophe. We thereby show that the impact on the region could be measured almost in real time and without deploying efforts on the ground. On the other hand, we have studied patterns of human activity and mobility to characterize the interactions between regions of a developing country. We find that the networks of calls and human trajectories have community structures associated with regions and urban centers. In summary, we have shown that it is possible to understand complex social processes through the analysis of human activity data and the theory of complex networks. Throughout the thesis, we have verified that social phenomena such as influence, political polarization, or reaction to critical events are reflected in the structural and dynamical patterns of the networks built from data on conversations in online social networks or mobile telephony.
ABSTRACT
During daily routines, we are constantly interacting with electronic devices and telecommunication services. Unconsciously, we leave massive traces of our activity in the service providers’ databases. These new data sources have the dimensions required to enable the observation of human behavioral patterns at large scales. As a result, there has been an unprecedented explosion of data-driven social research. In this thesis, we develop computational and mathematical methods to analyze social systems by means of the combined study of human activity data and the theory of complex networks. Our goal is to characterize and understand the systems emerging from human interactions in the new technological spaces, such as the online social network Twitter and mobile phones. We analyze systems by constructing complex networks and time series, studying their structure, functioning, and temporal evolution. We also investigate the nature of the observed patterns, by means of the mechanisms that rule the interactions among individuals, as well as the impact of critical events on the system’s behavior. For this purpose, we have proposed models that explain the global structures and the emergent dynamics of information flow in the system. In the studies of the online social network Twitter, we have based our analysis on specific conversations, such as political protests, important announcements, and electoral processes. From the messages related to the conversations, we identify the participating users and build networks of interactions among them. We specifically build one network to represent who-receives-whose-messages and another to represent who-propagates-whose-messages. In general, we have found that these structures have complex properties, such as explosive growth and scale-free degree distributions.
Based on the topological properties of these networks, we have identified three types of user behavior that determine the information flow dynamics through their influence. In order to measure the users’ influence on the conversations, we have introduced a new measure called user efficiency. It is defined as the number of retransmissions obtained per message posted, and it measures the effect of individual activity on the collective reactions. We have observed that the probability distribution of this property is ubiquitous across several Twitter conversations, regardless of their dimension or social context. Therefore, we suggest that there is a universal behavior in the relationship between individual efforts and collective reactions on Twitter. In order to explain the different factors that determine the user efficiency distribution, we have developed a computational model to simulate the diffusion of messages on Twitter, based on the mechanism of independent cascades. This model allows us to measure the impact on the emergent efficiency distribution of the underlying network topology, as well as of the way in which users post messages. The results indicate that the emergence of an exclusive group of highly efficient users depends upon the heterogeneity of the underlying network rather than on individual behavior. Moreover, we have also developed techniques to infer the degree of polarization in social networks. We propose a methodology to estimate opinions in social networks and to measure the degree of polarization in the obtained opinions. We have designed a model to study the effects of the opinions of a small group of influential users, called the elite, on the opinions of the majority of users. The model results in a distribution of opinions on which we measure the degree of polarization. We apply our methodology to measure the polarization on graphs from the message diffusion process during a Twitter conversation in a polarized society.
The results are in very good agreement with offline and contextual data. With this study, we have shown that our methodology is capable of detecting several degrees of polarization depending on the structure of the networks. Finally, we have also inferred human behavior from mobile phone data. On the one hand, we have characterized the impact of natural disasters, like flooding, on collective behavior. We found that the communication patterns are abruptly altered in the areas affected by the catastrophe. Therefore, we demonstrate that we could measure the impact of the disaster on the region almost in real time and without needing to deploy further efforts. On the other hand, we have studied human activity and mobility patterns in order to characterize regional interactions in a developing country. We found that the call and trajectory networks present community structure associated with regional and urban areas. In summary, we have shown that it is possible to understand complex social processes by means of analyzing human activity data and the theory of complex networks. Throughout the thesis, we have demonstrated that social phenomena, like influence, polarization, and reaction to critical events, are reflected in the structural and dynamical patterns of the networks constructed from data regarding conversations on online social networks and mobile phones.
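The user efficiency measure and the independent cascade mechanism described in this thesis can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's model: the function names, the single retransmission probability `prob`, the follower graph, and the activity counts are all hypothetical.

```python
import random
from collections import deque


def user_efficiency(messages_posted, retransmissions):
    """Retransmissions obtained per message posted, for each active user."""
    return {user: retransmissions.get(user, 0) / count
            for user, count in messages_posted.items() if count > 0}


def independent_cascade(followers, seed, prob, rng):
    """Spread one message from `seed` through a follower graph.

    Each user who posts the message exposes their followers once, and each
    exposed follower retransmits it with probability `prob`.
    Returns the set of users who ended up posting the message.
    """
    active = {seed}
    queue = deque([seed])
    while queue:
        user = queue.popleft()
        for follower in followers.get(user, []):
            if follower not in active and rng.random() < prob:
                active.add(follower)
                queue.append(follower)
    return active


# Hypothetical follower graph: user -> list of users who follow them.
followers = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
reached_all = independent_cascade(followers, "a", 1.0, random.Random(0))
reached_none = independent_cascade(followers, "a", 0.0, random.Random(0))

# Hypothetical activity counts for one conversation.
efficiency = user_efficiency({"a": 4, "b": 2, "c": 0}, {"a": 10, "b": 1})
```

Running the cascade many times on networks with different degree distributions, and recording the retransmissions each seed obtains, is one way to explore how network heterogeneity shapes the efficiency distribution, as the abstract reports.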
Resumo:
Podemos definir la sociedad como un sistema complejo que emerge de la cooperación y coordinación de billones de individuos y centenares de países. En este sentido no vivimos en una isla sino que estamos integrados en redes sociales que influyen en nuestro comportamiento. En esta tesis doctoral, presentamos un modelo analítico y una serie de estudios empíricos en los que analizamos distintos procesos sociales dinámicos desde una perspectiva de la teoría de redes complejas. En primer lugar, introducimos un modelo para explorar el impacto que las redes sociales en las que vivimos inmersos tienen en la actividad económica que transcurre sobre ellas, y mas concretamente en hasta qué punto la estructura de estas redes puede limitar la meritocracia de una sociedad. Como concepto contrario a meritocracia, en esta tesis, introducimos el término topocracia. Definimos un sistema como topocrático cuando la influencia o el poder y los ingresos de los individuos vienen principalmente determinados por la posición que ocupan en la red. Nuestro modelo es perfectamente meritocrático para redes completamente conectadas (todos los nodos están enlazados con el resto de nodos). Sin embargo nuestro modelo predice una transición hacia la topocracia a medida que disminuye la densidad de la red, siendo las redes poco densascomo las de la sociedad- topocráticas. En este modelo, los individuos por un lado producen y venden contenidos, pero por otro lado también distribuyen los contenidos producidos por otros individuos mediando entre comprador y vendedor. La producción y distribución de contenidos definen dos medios por los que los individuos reciben ingresos. El primero de ellos es meritocrático, ya que los individuos ingresan de acuerdo a lo que producen. Por el contrario el segundo es topocrático, ya que los individuos son compensados de acuerdo al número de cadenas mas cortas de la red que pasan a través de ellos. En esta tesis resolvemos el modelo computacional y analíticamente. 
The results indicate that a system is meritocratic only if the average connectivity of individuals is greater than a root of the number of individuals in the system. Therefore, in the light of our results, the structure of the social network can represent a limitation for the meritocracy of a society. The second part of this thesis presents a series of empirical studies that analyze data extracted from the social network Twitter to characterize and model human behavior. In particular, we focus on analyzing political conversations, such as those that take place during electoral campaigns. Our results indicate that collective attention is distributed very heterogeneously, with a minority of extremely influential accounts. Moreover, the ability of individuals to disseminate information on Twitter is limited by the structure of the follower network and the position they occupy in it. Therefore, according to our observations, online social networks do not make it possible for the majority to be heard by the majority. In fact, our results imply that Twitter is topocratic, since only a minority of accounts located in privileged positions in the follower network manage to have their messages spread across the whole social network. In political conversations, this minority of influential accounts consists mainly of politicians and media outlets. Politicians are the most mentioned, since people address them and refer to them in their tweets, while media outlets are the sources from which people propagate information. In a world in which personal data are recorded and become more abundant and precise every day, the results of the model presented in this thesis can be used to encourage measures that promote meritocracy.
Moreover, the results of the empirical studies on Twitter presented in the second part of this thesis are of vital importance for understanding the new "digital society" that is emerging. Specifically, we have presented relevant results that characterize human behavior on the Internet and that can be used to build future models. Abstract Society can be defined as a complex system that emerges from the cooperation and coordination of billions of individuals and hundreds of countries. Thus, we do not live in a social vacuum, and the social networks in which we are embedded inevitably shape our behavior. Here, we present an analytical model and several empirical studies in which we analyze dynamical social systems through a network science perspective. First, we introduce a model to explore how the structure of the social networks underlying society can limit the meritocracy of economies. As the opposite of meritocracy, in this work we introduce the term topocracy. We say that a system is topocratic if the compensation and power available to an individual are determined primarily by her position in a network. Our model is perfectly meritocratic for fully connected networks but becomes topocratic for sparse networks, like the ones in society. In the model, individuals produce and sell content, but also distribute the content produced by others when they belong to the shortest path connecting a buyer and a seller. The production and distribution of content define two channels of compensation: a meritocratic channel, where individuals are compensated for the content they produce, and a topocratic channel, where individual compensation is based on the number of shortest paths that go through them in the network. We solve the model analytically and show that the distribution of payoffs is meritocratic only if the average degree of the nodes is larger than a root of the total number of nodes.
Hence, in the light of our model, the sparsity and structure of networks represent a fundamental constraint on the meritocracy of societies. Next, we present several empirical studies that use data gathered from Twitter to analyze online human behavioral patterns. In particular, we focus on political conversations such as electoral campaigns. We found that collective attention is distributed highly heterogeneously, as there is a minority of extremely influential accounts. In fact, the ability of individuals to propagate messages or ideas through the platform is constrained by the structure of the follower network underlying the social media and the position they occupy in it. Hence, although people have argued that social media can allow more voices to be heard, our results suggest that Twitter is highly topocratic, as only the minority of well-positioned users are widely heard. This minority of influential accounts belongs mostly to politicians and traditional media. Politicians tend to be the most mentioned, while media are the sources of information from which people propagate messages. We also propose a methodology to study and measure the emergence of political polarization from social interactions. To this end, we first propose a model to estimate opinions in which a minority of influential individuals propagate their opinions through a social network. The result of the model is an opinion probability density function. Next, we propose an index to quantify the extent to which the resulting distribution is polarized. Finally, we illustrate our methodology by applying it to Twitter data. In a world where personal data is increasingly available, the results of the analytical model introduced in this work can be used to enhance meritocracy and promote policies that help to build more meritocratic societies. Moreover, the results obtained in the latter part, where we have analyzed Twitter, are key to understanding the new data-driven society that is emerging.
In particular, we have presented relevant information that can be used to benchmark future models for online communication systems or can be used as empirical rules characterizing our online behavior.
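The topocratic channel described in this abstract depends on how many shortest paths pass through each individual. A minimal sketch of that quantity follows, assuming an unweighted, undirected network and using the standard betweenness count (Brandes' algorithm) as a stand-in; the function name `shortest_path_load` and the two toy networks are hypothetical, and the thesis's actual payoff rule is not reproduced here.

```python
from collections import deque


def shortest_path_load(adj):
    """Betweenness of every node in an unweighted, undirected graph.

    adj maps each node to a list of its neighbours. For every node the
    result is the number of shortest paths between *other* pairs of nodes
    that pass through it, with ties between equally short paths split
    equally (Brandes' algorithm). Illustrative stand-in only.
    """
    load = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                load[w] += delta[w]
    # Each undirected pair was visited from both endpoints.
    return {v: c / 2.0 for v, c in load.items()}


# Hypothetical toy networks: a fully connected one and a sparse star.
complete = {i: [j for j in range(5) if j != i] for i in range(5)}
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}

complete_load = shortest_path_load(complete)
star_load = shortest_path_load(star)
print(complete_load)
print(star_load)
```

In the fully connected network no node mediates any pair, so only the production channel could pay; in the star, every path between two leaves runs through the hub, which is the kind of positional advantage the abstract calls topocratic.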
Resumo:
The monkey anterior intraparietal area (AIP) encodes visual information about three-dimensional object shape that is used to shape the hand for grasping. In robotics a similar role has been played by modules that fit point cloud data to the superquadric family of shapes and its various extensions. We developed a model of shape tuning in AIP based on cosine tuning to superquadric parameters. However, the model did not fit the data well, and we also found that it was difficult to accurately reproduce these parameters using neural networks with the appropriate inputs (modelled on the caudal intraparietal area, CIP). The latter difficulty was related to the fact that there are large discontinuities in the superquadric parameters between very similar shapes. To address these limitations we adopted an alternative shape parameterization based on an Isomap nonlinear dimension reduction. The Isomap was built using gradients and curvatures of object surface depth. This alternative parameterization was low-dimensional (like superquadrics), but data-driven (similar to an alternative clustering approach that is also sometimes used in robotics) and lacked large discontinuities. Isomaps with 16 or more dimensions reproduced the AIP data fairly well. Moreover, we found that the Isomap parameters could be approximated from CIP-like input much more accurately than the superquadric parameters. We conclude that Isomaps, or perhaps alternative dimension reductions of CIP signals, provide a promising model of AIP tuning. We have now started to integrate our model with a robot hand, to explore the efficacy of Isomap shape reductions in grasp planning. Future work will consider dynamics of spike responses and integration with related visual and motor area models.
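For readers unfamiliar with the superquadric family mentioned in this abstract, the sketch below evaluates the standard superquadric inside-outside function for an axis-aligned shape centred at the origin. The function name `superquadric_f` and the example parameter values are illustrative assumptions; the abstract does not say which parameterization or fitting procedure the authors used.

```python
def superquadric_f(x, y, z, a1, a2, a3, e1, e2):
    """Inside-outside function of an origin-centred superquadric.

    a1, a2, a3 are the semi-axis lengths and e1, e2 the shape exponents.
    The value is below 1 inside the shape, 1 on its surface and above 1
    outside. Absolute values keep the fractional powers well defined.
    """
    xy = (abs(x / a1) ** (2.0 / e2) + abs(y / a2) ** (2.0 / e2)) ** (e2 / e1)
    return xy + abs(z / a3) ** (2.0 / e1)


# Hypothetical parameter sets: e1 = e2 = 1 gives an ellipsoid (here a
# unit sphere); small exponents give a box-like shape.
point = (0.9, 0.9, 0.9)
sphere_value = superquadric_f(*point, 1.0, 1.0, 1.0, 1.0, 1.0)
box_value = superquadric_f(*point, 1.0, 1.0, 1.0, 0.1, 0.1)
print(sphere_value, box_value)
```

The same point lies outside the unit sphere but inside the box-like shape, which shows how strongly the two exponents control the shape that the eight-argument function describes.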
Resumo:
We are witnessing a fundamental transformation in how the Internet of Things (IoT) is affecting the experience users have with data-driven devices, smart appliances, and connected products. The experience of any place is commonly defined as the result of a series of user engagements with a surrounding place in order to carry out daily activities (Golledge, 2002). Knowing about users' experiences becomes vital to the process of designing a map. In the near future, a user will be able to interact directly with any IoT device placed in his surroundings, yet very little is known about what kinds of interactions and experiences a map might offer (Roth, 2015). The main challenge is to develop an experience design process to devise maps capable of supporting different user experience dimensions such as cognitive, sensory-physical, affective, and social (Tussyadiah and Zach, 2012). For example, in a smart city of the future, IoT devices allowing multimodal interaction with a map could help tourists in the assimilation of their knowledge about points of interest (cognitive experience), their association of sounds and smells with these places (sensory-physical experience), their emotional connection to them (affective experience) and their relationships with other nearby tourists (social experience). This paper aims to describe a conceptual framework for developing a Mapping Experience Design (MXD) process for building maps for smart connected places of the future. Our MXD process is focussed on the cognitive dimension of an experience in which a person perceives a place as a "living entity" that uses and feeds through his experiences. We want to help people to undergo a meaningful experience of a place by mapping what is being communicated during their interactions with the IoT devices situated in this place. Our purpose is to understand how maps can support a person's experience in making better decisions in real-time.
Resumo:
It is easy to get frustrated at spoken conversational agents (SCAs), perhaps because they seem to be callous. By and large, the quality of human-computer interaction suffers from the inability of SCAs to recognise and adapt to the user's emotional state. Now, with the mass appeal of artificially-mediated communication, there is an increasing need for SCAs to be socially and emotionally intelligent, that is, to infer and adapt to their human interlocutors’ emotions on the fly, in order to ensure an affective, empathetic and naturalistic interaction. An enhanced quality of interaction would reduce users’ frustration and consequently increase their satisfaction. These reasons have motivated the development of SCAs towards including socio-emotional elements, turning them into affective and socially-sensitive interfaces. One barrier to the creation of such interfaces has been the lack of methods for modelling emotions in a task-independent environment. Most emotion models for spoken dialog systems are task-dependent and thus cannot be used “as-is” in different applications. This thesis addresses that gap: it concerns the computational modelling of emotion, personality and their interrelationship for task-independent autonomous SCAs. The generation of emotion is driven by needs, inspired by human motivational systems. The work in this thesis is organised in three stages, each one with its own contribution. The first stage involved defining, integrating and quantifying the psychology-based motivational and emotional models. These were then transformed into a computational model by implementing them as software entities. The computational model was then incorporated into and tested with an existing SCA host, a HiFi-control agent. The second stage concerned the automatic prediction of affect, which has been the main challenge towards the greater aim of infusing social intelligence into the HiFi agent.
In recent years, studies on affect detection from voice have moved on to using realistic, non-acted data, in which emotions are subtler. However, subtler emotions are more challenging to perceive, as is demonstrated in tasks such as labelling and machine prediction. In this stage, we attempted to address part of this challenge by considering user satisfaction ratings as the target and conversational/dialog features as the predictors in discriminating contentment and frustration, two types of emotions that are known to be prevalent within spoken human-computer interaction. The final stage concerned the evaluation of the emotional model through the HiFi agent. A series of user studies with 70 subjects was conducted in a real-time environment, each in a different phase and with its own conditions. All the studies involved comparisons between the baseline, non-modified agent and the modified agent. The findings have gone some way towards enhancing our understanding of the utility of emotion in spoken dialog systems in several ways. First, an SCA should not express its emotions blindly, even if they are positive. Rather, it should adapt its emotions to user states. Second, low performance in an SCA may be compensated for by the exploitation of emotion. Third, the expression of emotion through the exploitation of prosody could improve users’ perceptions of an SCA more than expressing emotions through lexical content alone. Taken together, these findings not only support the success of the emotional model, but also provide substantial evidence of the benefits of adding emotion to an SCA, especially in mitigating users’ frustration and ultimately improving their satisfaction. Summary It is relatively easy to experience some frustration when interacting with spoken conversational agents (SCAs), often because they seem somewhat insensitive.
In general, the quality of person-agent interaction is to some extent affected by the inability of SCAs to identify and adapt to the emotional state of their users. Currently, owing to the growing appeal of and interest in such agents, there is a need to make SCAs increasingly social and emotionally intelligent, that is, able to infer and adapt to the emotions of their human interlocutors on the fly, so that the interaction becomes more affective, empathetic and, ultimately, natural. An interaction improved in this sense would reduce users' possible frustration and, consequently, improve the level of satisfaction they reach. These arguments justify and motivate the development of new SCAs with socio-emotional capabilities, equipped with affective and socially sensitive interfaces. One of the barriers to the creation of such interfaces has been the lack of methods for modelling emotions in task-independent environments. Most of the emotional models used by current spoken dialog systems are task-dependent and therefore cannot be used "as is" in different domains or applications. This thesis focuses precisely on improving this aspect: the definition of computational models of emotions, personality and their interrelationship for autonomous, task-independent SCAs. Inspired by human motivational systems as studied in psychology, the thesis proposes a needs-based model of emotion generation/production. The work carried out in this thesis is organised in three distinct stages, each with its own contribution. The first stage included the definition, integration and quantification of the initial motivational models and of the emotional models derived from them.
Subsequently, these emotional models were turned into a computational model through their software implementation. This computational model was incorporated into and tested in an existing host SCA, an agent able to control a HiFi (high-fidelity) system. The second stage was oriented towards automatic emotion recognition, an aspect that has constituted the main challenge in relation to the greater aim of infusing social intelligence into the HiFi agent. In recent years, studies on emotion recognition from voice have moved from using acted data to using real data in which the presence or observation of emotions occurs in a much subtler way. Emotion recognition under these conditions is much more complicated, and this difficulty becomes apparent in tasks such as labelling and machine learning. In this stage, the problem of recognising the user's emotions was addressed using features or metrics derived from the user-agent dialog itself. Thanks to these metrics, used as predictors or indicators of the degree or level of satisfaction reached by the user, it was possible to discriminate between satisfaction and frustration, the two prevalent emotions during user-agent interaction. The final stage corresponds fundamentally to the evaluation of the emotional model by means of the HiFi agent. For this purpose, a series of studies was carried out with real users, 70 subjects, interacting with different versions of the HiFi agent in real time, each in a different phase and with its own emotional characteristics or capabilities. In particular, all the studies examined in depth the comparison between a reference version of the agent with no emotional behaviour or characteristics, and a version of the agent suitably modified with the proposed emotional model.
The results obtained have allowed us to better understand and assess the usefulness of emotions in spoken dialog systems. This usefulness depends on several aspects. First, an SCA should not express its emotions blindly or arbitrarily, even if they are positive. Rather, it should adapt its emotions to the different states of its users. Second, relatively poor performance by an SCA could be compensated for, to some extent, by endowing the SCA with emotional behaviour and capabilities. Third, using prosody as a vehicle for expressing emotions, as a complement to messages with specific emotional content from both the lexical and the semantic point of view, helps to improve users' perception of an SCA. Taken together, the results achieved not only confirm the success of the emotional model, but also constitute decisive evidence of the benefits of incorporating emotions into an SCA, especially with regard to reducing users' level of frustration and, ultimately, improving their satisfaction.