8 results for text vector space model
in AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Abstract:
Social interactions have been the focus of social science research for a century, but their study has recently been revolutionized by novel data sources and by methods from computer science, network science, and complex systems science. The study of social interactions is crucial for understanding complex societal behaviours. Social interactions are naturally represented as networks, which have emerged as a unifying mathematical language to understand structural and dynamical aspects of socio-technical systems. Networks are, however, highly dimensional objects, especially when considering the scales of real-world systems and the need to model the temporal dimension. Hence the study of empirical data from social systems is challenging both from a conceptual and a computational standpoint. A possible approach to tackling such a challenge is to use dimensionality reduction techniques that represent network entities in a low-dimensional feature space, preserving some desired properties of the original data. Low-dimensional vector space representations, also known as network embeddings, have been extensively studied, also as a way to feed network data to machine learning algorithms. Network embeddings were initially developed for static networks and then extended to incorporate temporal network data. We focus on dimensionality reduction techniques for time-resolved social interaction data modelled as temporal networks. We introduce a novel embedding technique that models the temporal and structural similarities of events rather than nodes. Using empirical data on social interactions, we show that this representation captures information relevant for the study of dynamical processes unfolding over the network, such as epidemic spreading. We then turn to another large-scale dataset on social interactions: a popular Web-based crowdfunding platform. 
We show that tensor-based representations of the data and dimensionality reduction techniques such as tensor factorization allow us to uncover the structural and temporal aspects of the system and to relate them to geographic and temporal activity patterns.
Abstract:
Multi-phase electrical drives are potential candidates for use in innovative electric vehicle powertrains, in response to the demand for high efficiency and reliability in this type of application. Alongside multi-phase technology, multilevel technology has also been developed in recent decades. These two technologies are somewhat complementary, since both allow the power rating of the system to be increased without increasing the current and voltage ratings of the individual power switches of the inverter. In this thesis, several topics concerning the inverter, the motor and the fault diagnosis of an electric vehicle powertrain are addressed. In particular, attention is focused on multi-phase and multilevel technologies and their potential advantages with respect to traditional technologies. First of all, the mathematical models of two multi-phase machines, a five-phase induction machine and an asymmetrical six-phase permanent magnet synchronous machine, are developed using the Vector Space Decomposition approach. Then, a new modulation technique for multi-phase multilevel T-type inverters is developed, which solves the voltage balancing problem of the DC-link capacitors and ensures flexible management of the capacitor voltages. The technique is based on the proper selection of the zero-sequence component of the modulating signals. Subsequently, a diagnostic technique for detecting the state of health of the rotor magnets in a six-phase permanent magnet synchronous machine is established. The technique is based on analysing the electromotive force induced in the stator windings by the rotor magnets. Furthermore, an innovative algorithm is presented that extends the linear modulation region for five-phase inverters by taking advantage of the multiple degrees of freedom available in multi-phase systems. Finally, the mathematical model of an eighteen-phase squirrel cage induction motor is defined.
This activity aims to develop a motor drive able to change the number of poles of the machine during the machine operation.
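As an illustration of the Vector Space Decomposition idea mentioned in this abstract, the following is a minimal sketch for a symmetrical five-phase system. It is not taken from the thesis: the function names, the amplitude-invariant 2/5 scaling, and the choice of the third harmonic for the second plane are assumptions of this sketch. It shows how five phase quantities can be mapped onto two decoupled planes (alpha-beta and x-y) plus a zero-sequence component.

```python
import math
from typing import List, Sequence

N_PHASES = 5
ALPHA = 2 * math.pi / N_PHASES  # displacement between adjacent phases


def vsd_transform(phase_values: Sequence[float]) -> List[float]:
    """Map five phase quantities to [alpha, beta, x, y, zero] components.

    Assumption of this sketch: amplitude-invariant scaling (2/5) and an
    x-y plane built on the third harmonic of the phase displacement.
    """
    if len(phase_values) != N_PHASES:
        raise ValueError("expected five phase values")
    scale = 2 / N_PHASES
    alpha = scale * sum(v * math.cos(k * ALPHA) for k, v in enumerate(phase_values))
    beta = scale * sum(v * math.sin(k * ALPHA) for k, v in enumerate(phase_values))
    x = scale * sum(v * math.cos(3 * k * ALPHA) for k, v in enumerate(phase_values))
    y = scale * sum(v * math.sin(3 * k * ALPHA) for k, v in enumerate(phase_values))
    zero = sum(phase_values) / N_PHASES
    return [alpha, beta, x, y, zero]


def balanced_set(theta: float, harmonic: int = 1) -> List[float]:
    """Unit-amplitude balanced five-phase set at angle theta (hypothetical helper)."""
    return [math.cos(harmonic * (theta - k * ALPHA)) for k in range(N_PHASES)]
```

Under these assumptions, a balanced fundamental set falls entirely in the alpha-beta plane, while a balanced third-harmonic set falls entirely in the x-y plane, which is the decoupling that makes the decomposition useful for modelling multi-phase machines.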
Abstract:
The study of random probability measures is a lively research topic that has attracted interest from different fields in recent years. In this thesis, we consider random probability measures in the context of Bayesian nonparametrics, where the law of a random probability measure is used as prior distribution, and in the context of distributional data analysis, where the goal is to perform inference given a sample from the law of a random probability measure. The contributions contained in this thesis can be subdivided according to three different topics: (i) the use of almost surely discrete repulsive random measures (i.e., whose support points are well separated) for Bayesian model-based clustering, (ii) the proposal of new laws for collections of random probability measures for Bayesian density estimation of partially exchangeable data subdivided into different groups, and (iii) the study of principal component analysis and regression models for probability distributions seen as elements of the 2-Wasserstein space. Specifically, for point (i) above we propose an efficient Markov chain Monte Carlo algorithm for posterior inference, which sidesteps the need for split-merge reversible jump moves typically associated with poor performance; we propose a model for clustering high-dimensional data by introducing a novel class of anisotropic determinantal point processes; and we study the distributional properties of the repulsive measures, shedding light on important theoretical results which enable more principled prior elicitation and more efficient posterior simulation algorithms. For point (ii) above, we consider several models suitable for clustering homogeneous populations, inducing spatial dependence across groups of data, and extracting the characteristic traits common to all the data groups, and we propose a novel vector autoregressive model to study the growth curves of Singaporean children.
Finally, for point (iii), we propose a novel class of projected statistical methods for distributional data analysis for measures on the real line and on the unit circle.
Abstract:
The study of ancient, undeciphered scripts presents unique challenges that depend both on the nature of the problem and on the peculiarities of each writing system. In this thesis, I present two computational approaches that are tailored to two different tasks and writing systems. The first of these methods is aimed at the decipherment of the Linear A fraction signs, in order to discover their numerical values. This is achieved with a combination of constraint programming, ad hoc metrics and paleographic considerations. The second main contribution of this thesis is the creation of an unsupervised deep learning model which uses drawings of signs from ancient writing systems to learn to distinguish different graphemes in the vector space. This system, which is based on techniques used in the field of computer vision, is adapted to the study of ancient writing systems by incorporating information about sequences in the model, mirroring what is often done in natural language processing. In order to develop this model, the Cypriot Greek Syllabary is used as a target, since this is a deciphered writing system. Finally, this unsupervised model is adapted to the undeciphered Cypro-Minoan and is used to answer open questions about this script. In particular, by reconstructing multiple allographs that are not agreed upon by paleographers, it supports the idea that Cypro-Minoan is a single script and not a collection of three scripts, as has been proposed in the literature. These results on two different tasks show that computational methods can be applied to undeciphered scripts, despite the relatively small amount of available data, paving the way for further advances in paleography using these methods.
Abstract:
Context-aware computing is currently considered the most promising approach to overcome information overload and to speed up access to relevant information and services. Context-awareness may be derived from many sources, including user profile and preferences, network information, and sensor analysis; usually context-awareness relies on the ability of computing devices to interact with the physical world, i.e. with the natural and artificial objects hosted within the “environment”. Ideally, context-aware applications should not be intrusive and should be able to react according to the user’s context, with minimum user effort. Context is an application-dependent multidimensional space, and location has been an important part of it from the very beginning. Location can be used to guide applications in providing the information or functions that are most appropriate for a specific position. Hence location systems play a crucial role. There are several technologies and systems for computing location to a varying degree of accuracy, each tailored to a specific space model, i.e. indoors or outdoors, structured spaces or unstructured spaces. The research challenge faced by this thesis is related to pedestrian positioning in heterogeneous environments. In particular, the focus is on pedestrian identification, localization, orientation and activity recognition. This research was mainly carried out within the “mobile and ambient systems” workgroup of EPOCH, a 6FP NoE on the application of ICT to Cultural Heritage. Therefore applications in Cultural Heritage sites were the main target of the context-aware services discussed. Cultural Heritage sites are considered significant test-beds in context-aware computing for many reasons. For example, building a smart environment in museums or in protected sites is a challenging task, because localization and tracking are usually based on technologies that are difficult to hide or harmonize within the environment.
Therefore it is expected that the experience gained with this research may also be useful in domains other than Cultural Heritage. This work presents three different approaches to pedestrian identification, positioning and tracking: pedestrian navigation by means of a wearable inertial sensing platform assisted by the vision-based tracking system for initial settings and real-time calibration; pedestrian navigation by means of a wearable inertial sensing platform augmented with GPS measurements; and pedestrian identification and tracking, combining the vision-based tracking system with WiFi localization. The proposed localization systems have been mainly used to enhance Cultural Heritage applications in providing information and services depending on the user’s actual context, in particular depending on the user’s location.
Abstract:
Two of the main features of today's complex software systems, like pervasive computing systems and Internet-based applications, are distribution and openness. Distribution revolves around three orthogonal dimensions: (i) distribution of control: systems are characterised by several independent computational entities and devices, each representing an autonomous and proactive locus of control; (ii) spatial distribution: entities and devices are physically distributed and connected in a global (such as the Internet) or local network; and (iii) temporal distribution: interacting system components come and go over time, and are not required to be available for interaction at the same time. Openness deals with the heterogeneity and dynamism of system components: complex computational systems are open to the integration of diverse components, heterogeneous in terms of architecture and technology, and are dynamic since they allow components to be updated, added, or removed while the system is running. The engineering of open and distributed computational systems mandates the adoption of a software infrastructure whose underlying model and technology can provide the required level of uncoupling among system components. This is the main motivation behind current research trends in the area of coordination middleware to exploit tuple-based coordination models in the engineering of complex software systems, since they intrinsically provide coordinated components with communication uncoupling. An additional daunting challenge for tuple-based models comes from knowledge-intensive application scenarios, namely, scenarios where most of the activities are based on knowledge in some form, and where knowledge becomes the prominent means by which systems get coordinated.
Handling knowledge in tuple-based systems induces problems in terms of syntax (e.g., two tuples containing the same data may not match due to differences in the tuple structure) and, mostly, of semantics (e.g., two tuples representing the same information may not match because a different syntax was adopted). Until now, the problem has been faced by exploiting tuple-based coordination within a middleware for knowledge-intensive environments: e.g., experiments with tuple-based coordination within a Semantic Web middleware (surveys analogous approaches). However, these approaches appear to be designed to tackle the design of coordination for specific application contexts like Semantic Web and Semantic Web Services, and they result in a rather involved extension of the tuple space model. The main goal of this thesis was to conceive a more general approach to semantic coordination. In particular, the model and technology of semantic tuple centres were developed. The tuple centre model is adopted as the main coordination abstraction to manage system interactions. A tuple centre can be seen as a programmable tuple space, i.e. an extension of a Linda tuple space, where the behaviour of the tuple space can be programmed so as to react to interaction events. By encapsulating coordination laws within coordination media, tuple centres promote coordination uncoupling among coordinated components. Then, the tuple centre model was semantically enriched: a main design choice in this work was not to completely redesign the existing syntactic tuple space model, but rather to provide a smooth extension that, although supporting semantic reasoning, keeps tuples and tuple matching as simple as possible. By encapsulating the semantic representation of the domain of discourse within coordination media, semantic tuple centres promote semantic uncoupling among coordinated components.
The main contributions of the thesis are: (i) the design of the semantic tuple centre model; (ii) the implementation and evaluation of the model based on an existing coordination infrastructure; (iii) a view of the application scenarios in which semantic tuple centres seem to be suitable as coordination media.
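The notions of syntactic tuple matching and of a programmable tuple space described in this abstract can be illustrated with a toy sketch. It is not the infrastructure used in the thesis: the class name `TupleCentre`, the wildcard `ANY`, and the non-blocking `rd` and `in_` operations are all assumptions of this sketch (Linda's own `rd` and `in` block until a match is available).

```python
from typing import Any, Callable, List, Optional, Tuple

ANY = object()  # wildcard for template fields (hypothetical name)


def matches(template: Tuple[Any, ...], tup: Tuple[Any, ...]) -> bool:
    """Purely syntactic match: same arity, each field equal or a wildcard."""
    if len(template) != len(tup):
        return False
    return all(t is ANY or t == v for t, v in zip(template, tup))


class TupleCentre:
    """Toy tuple space whose behaviour can be programmed with reactions."""

    def __init__(self) -> None:
        self._tuples: List[Tuple[Any, ...]] = []
        self._reactions: List[Tuple[Tuple[Any, ...], Callable]] = []

    def add_reaction(self, template: Tuple[Any, ...], action: Callable) -> None:
        """Register an action to run whenever a matching tuple is inserted."""
        self._reactions.append((template, action))

    def out(self, tup: Tuple[Any, ...]) -> None:
        """Insert a tuple, then fire every reaction whose template matches it."""
        self._tuples.append(tup)
        for template, action in list(self._reactions):
            if matches(template, tup):
                action(self, tup)

    def rd(self, template: Tuple[Any, ...]) -> Optional[Tuple[Any, ...]]:
        """Return the first matching tuple without removing it, or None."""
        for tup in self._tuples:
            if matches(template, tup):
                return tup
        return None

    def in_(self, template: Tuple[Any, ...]) -> Optional[Tuple[Any, ...]]:
        """Remove and return the first matching tuple, or None."""
        tup = self.rd(template)
        if tup is not None:
            self._tuples.remove(tup)
        return tup


def raise_alarm(centre: TupleCentre, tup: Tuple[Any, ...]) -> None:
    """Example coordination law: emit an alarm tuple for readings above 30."""
    if tup[2] > 30:
        centre.out(("alarm", tup[1]))
```

With the reaction `raise_alarm` registered on the template `("temp", ANY, ANY)`, inserting `("temp", "kitchen", 35)` also produces `("alarm", "kitchen")`. A template such as `("temp", ANY)` finds nothing even though the data is present, because the arity differs; this is the kind of syntactic mismatch the abstract refers to.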
Abstract:
This thesis analyses micro and macro aspects of applied fiscal policy issues. The first chapter investigates the extent to which local budget spending composition reacts to fiscal rules variations. I consider the budget of Italian municipalities and exploit specific changes in the Domestic Stability Pact’s rules to perform a difference-in-discontinuities analysis. The results show that imposing a cap on the total amount of consumption and investment is not as binding as two caps, one for consumption and a different one for investment. More specifically, consumption is triggered by changes in wages and services spending, while investment relies on infrastructure movements. In addition, there is evidence that when an increase in investment is achieved, there is also a higher budget deficit level. The second chapter analyzes the extent to which fiscal policy shocks are able to affect macroeconomic variables during business cycle fluctuations, differentiating among three intervention channels: public taxation, consumption and investment. The econometric methodology implemented is a Panel Vector Autoregressive model with a structural characterization. The results show that fiscal shocks have different multipliers in relation to expansion or contraction periods: output does not react during good times, while there are significant effects in bad ones. The third chapter evaluates the effects of fiscal policy announcements by the Italian government on the long-term sovereign bond spread of Italy relative to Germany. After collecting data on relevant fiscal policy announcements, we perform an econometric comparative analysis between the three cabinets that followed one another during the period 2009-2013.
The results suggest that only fiscal policy announcements made by members of Monti’s cabinet have been effective in influencing significantly the Italian spread in the expected direction, revealing a remarkable credibility gap between Berlusconi’s and Letta’s governments with respect to Monti’s administration.
Abstract:
The need for a convergence between semi-structured data management and Information Retrieval techniques is manifest to the scientific community. In order to fulfil this growing request, W3C has recently proposed XQuery Full Text, an IR-oriented extension of XQuery. However, the issue of query optimization requires the study of important properties like query equivalence and containment; to this aim, a formal representation of documents and queries is needed. The goal of this thesis is to establish such a formal background. We define a data model for XML documents and propose an algebra able to represent most XQuery Full-Text expressions. We show how an XQuery Full-Text expression can be translated into an algebraic expression and how an algebraic expression can be optimized.