10 results for Constrained clustering
in AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Abstract:
The present work proposes a method based on CLV (Clustering around Latent Variables) for identifying groups of consumers in L-shape data. This kind of data structure is very common in consumer studies, where a panel of consumers is asked to assess the global liking of a certain number of products and the preference scores are arranged in a two-way table Y. External information on both products (physical-chemical description or sensory attributes) and consumers (socio-demographic background, purchase behaviour or consumption habits) may be available in a row descriptor matrix X and in a column descriptor matrix Z, respectively. The aim of the method is to automatically provide a consumer segmentation in which all three matrices play an active role in the classification, yielding groups that are homogeneous from every point of view: preference, products and consumer characteristics. The proposed clustering method is illustrated on data from preference studies on food products: juices based on berry fruits and traditional cheeses from Trentino. The hedonic ratings given by the consumer panel on the products under study are explained in terms of the products' chemical compounds and sensory evaluation and of the consumers' socio-demographic information, purchase behaviour and consumption habits.
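For readers unfamiliar with the CLV idea, the sketch below shows a minimal latent-variable segmentation of consumers (columns of a preference table Y): each group is summarized by the first principal component of its members' scores, and consumers are iteratively reassigned to the latent variable they covary with most. It only illustrates the plain CLV mechanism on toy data, not the L-shape extension proposed in the thesis; the matrix name Y follows the abstract, while the toy data and the number of groups are assumptions.

```python
import numpy as np

# Illustrative sketch of a CLV-style segmentation of consumers (columns of Y),
# not the L-shape extension developed in the thesis: each group is summarized
# by a latent variable (first principal component of its members' preference
# scores), and consumers are reassigned to the latent variable they covary
# with most. Y (products x consumers) follows the abstract; the toy data and
# the number of groups K are assumptions.

rng = np.random.default_rng(0)
n_products, n_consumers, K = 12, 40, 3
Y = rng.normal(size=(n_products, n_consumers))   # hedonic ratings (toy data)
Yc = Y - Y.mean(axis=0)                          # center each consumer's scores

labels = rng.integers(K, size=n_consumers)       # random initial segmentation
for _ in range(50):
    latents = np.zeros((n_products, K))
    for k in range(K):
        members = Yc[:, labels == k]
        if members.shape[1] == 0:
            continue
        # first principal component of the group's preference scores
        u, _, _ = np.linalg.svd(members, full_matrices=False)
        latents[:, k] = u[:, 0]
    # reassign each consumer to the latent variable it covaries with most
    new_labels = np.argmax(np.abs(Yc.T @ latents), axis=1)
    if np.array_equal(new_labels, labels):
        break
    labels = new_labels

print("group sizes:", np.bincount(labels, minlength=K))
```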
Abstract:
The intensity of regional specialization in specific activities, and conversely the level of industrial concentration in specific locations, has been used as complementary evidence for the existence and significance of externalities. Economists have mainly focused the debate on disentangling the sources of specialization and concentration processes along three vectors: natural advantages, internal scale economies and external scale economies. The arbitrariness of spatial partitions plays a key role in capturing these effects, since the chosen partition should reflect the actual characteristics of the economy. The identification of spatial boundaries for measuring specialization therefore becomes critical: the model will most likely have to be adapted to different scales of distance and will be influenced by different types of externalities or agglomeration economies, each based on interaction mechanisms with particular requirements of spatial proximity. This work analyses the spatial dimension of economic specialization, using the manufacturing industry as a case study. The main objective is to propose, for both discrete and continuous space: i) a measure of global specialization; ii) a local disaggregation of the global measure; and iii) a spatial clustering method for the identification of specialized agglomerations.
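As a point of reference for the kind of measure discussed above, the sketch below computes a standard Krugman-style regional specialization index on a toy employment matrix (regions × industries). The global and local measures and the spatial clustering method actually proposed in the thesis are different and are defined there; the data and index choice are illustrative assumptions.

```python
import numpy as np

# Baseline regional specialization index (Krugman-style), shown only as an
# example of the kind of measure the abstract discusses; the thesis proposes
# its own global/local measures and spatial clustering method.
# The employment matrix below (regions x industries) is toy data.

emp = np.array([[120.0,  30.0,  50.0],   # region A employment by industry
                [ 40.0,  90.0,  10.0],   # region B
                [ 60.0,  60.0,  80.0]])  # region C

region_shares = emp / emp.sum(axis=1, keepdims=True)   # industry mix within each region
national_shares = emp.sum(axis=0) / emp.sum()          # national industry mix

# Krugman index: how far a region's industry mix departs from the national mix
krugman = np.abs(region_shares - national_shares).sum(axis=1)
for r, k in zip("ABC", krugman):
    print(f"region {r}: specialization index = {k:.3f}")
```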
Abstract:
There are different ways to perform cluster analysis of categorical data in the literature, and the choice among them is strongly related to the aim of the researcher, leaving aside time and economic constraints. The main approaches to clustering are usually distinguished into model-based and distance-based methods: the former assume that objects belonging to the same class are similar in the sense that their observed values come from the same probability distribution, whose parameters are unknown and need to be estimated; the latter evaluate distances among objects by a defined dissimilarity measure and, based on it, allocate units to the closest group. In clustering, one may be interested in classifying similar objects into groups, or in finding observations that come from the same true homogeneous distribution. But do both of these aims lead to the same clustering? And how good are clustering methods designed to fulfil one of these aims in terms of the other? To answer these questions, two approaches, namely a latent class model (a mixture of multinomial distributions) and a partitioning-around-medoids approach, are evaluated and compared by the Adjusted Rand Index, Average Silhouette Width and Pearson-Gamma indexes in a fairly wide simulation study. Simulation outcomes are plotted in two-dimensional graphs via Multidimensional Scaling; the size of each point is proportional to the number of overlapping points, and colours indicate cluster membership.
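The evaluation side of such a comparison can be sketched as follows: given two partitions of the same categorical data set (for instance, one from a latent class model and one from PAM), agreement with the true classes is measured by the Adjusted Rand Index and internal quality by the Average Silhouette Width under Hamming distance. The toy data and the two placeholder partitions below are assumptions, not the simulation design of the thesis.

```python
import numpy as np
from sklearn.metrics import adjusted_rand_score, silhouette_score

# Sketch of the evaluation framework only: compare two candidate partitions of
# categorical data against the true classes (ARI) and by internal quality
# (Average Silhouette Width with Hamming distance). Data and partitions are
# toy placeholders, not the thesis' simulation study.

rng = np.random.default_rng(1)
true_classes = np.repeat([0, 1], 50)
# categorical data coded as integers; class 1 tends to pick higher levels
X = np.where(rng.random((100, 5)) < 0.3 + 0.4 * true_classes[:, None], 2, 0) \
    + rng.integers(0, 2, size=(100, 5))

partition_lca = true_classes.copy()                                   # stand-in for a latent-class result
partition_pam = np.where(X.sum(axis=1) > X.sum(axis=1).mean(), 1, 0)  # stand-in for a PAM result

for name, part in [("LCA-like", partition_lca), ("PAM-like", partition_pam)]:
    ari = adjusted_rand_score(true_classes, part)
    asw = silhouette_score(X, part, metric="hamming")
    print(f"{name}: ARI = {ari:.3f}, ASW = {asw:.3f}")
```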
Abstract:
Power electronic converters are extensively adopted for the solution of timely issues, such as power quality improvement in industrial plants, energy management in hybrid electrical systems, and control of electrical generators for renewables. Besides nonlinearity, these systems are typically characterized by hard constraints on the control inputs and sometimes on the state variables. In this respect, control laws able to handle input saturation are crucial to formally characterize the systems' stability and performance properties. From a practical viewpoint, proper saturation management allows the systems' transient and steady-state operating ranges to be extended, improving their reliability and availability. The main topic of this thesis concerns saturated control methodologies, based on modern approaches, applied to power electronics and electromechanical systems. The objective is to provide formal results under any saturation scenario, overcoming the drawbacks of the classic solutions commonly applied to cope with saturation of power converters, and enhancing performance. For this purpose two main approaches are exploited and extended to deal with power electronic applications: modern anti-windup strategies, providing formal results and systematic design rules for the anti-windup compensator devoted to handling control saturation, and “one step” saturated feedback design techniques, relying on a suitable characterization of the saturation nonlinearity and on less conservative extensions of standard absolute stability theory results. The first part of the thesis presents and develops a novel general anti-windup scheme, which is then specifically applied to a class of power converters adopted for power quality enhancement in industrial plants. In the second part a polytopic differential inclusion representation of the saturation nonlinearity is presented and extended to deal with a class of multiple-input power converters used to manage hybrid electrical energy sources. The third part regards the design of adaptive observers for robust estimation of the parameters required for high-performance control of power systems.
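As background for the saturation problem addressed above, the sketch below simulates a discrete-time PI loop with classic back-calculation anti-windup on a first-order plant; it only illustrates the baseline mechanism whose drawbacks the thesis aims to overcome with modern anti-windup and one-step saturated designs. The plant model, gains and actuator limits are arbitrary assumptions.

```python
import numpy as np

# Background illustration of the input-saturation problem: a discrete-time PI
# controller with classic back-calculation anti-windup on a first-order plant.
# This is not the modern anti-windup scheme developed in the thesis; plant,
# gains and limits are assumptions chosen for illustration.

dt, kp, ki, kaw = 0.01, 2.0, 5.0, 10.0   # sample time, PI gains, anti-windup gain
u_min, u_max = -1.0, 1.0                 # actuator (converter) limits
a, b = -1.0, 1.0                         # plant dynamics: x' = a*x + b*u

x, integ, ref = 0.0, 0.0, 2.0            # reference large enough to saturate the input
for _ in range(2000):
    e = ref - x
    u_unsat = kp * e + ki * integ
    u = np.clip(u_unsat, u_min, u_max)   # input saturation
    # back-calculation: bleed the integrator while the actuator is saturated
    integ += dt * (e + kaw * (u - u_unsat))
    x += dt * (a * x + b * u)

print(f"final output x = {x:.3f} (ref = {ref}, input limited to [{u_min}, {u_max}])")
```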
Abstract:
Over the last few decades, bioinformatics has played a fundamental role in making sense of the huge amount of data produced. Once the complete sequence of a genome is available, the major problem is to learn as much as possible about its coding regions. Protein sequence annotation is challenging and, due to the size of the problem, only computational approaches can provide a feasible solution. As recently pointed out by the Critical Assessment of Function Annotations (CAFA), the most accurate methods are those based on the transfer-by-homology approach, and the most incisive contribution is given by cross-genome comparisons. This thesis describes a non-hierarchical sequence clustering method for automatic large-scale protein annotation, called “The Bologna Annotation Resource Plus” (BAR+). The method is based on an all-against-all alignment of more than 13 million protein sequences characterized by a very stringent metric. BAR+ can safely transfer functional features (Gene Ontology and Pfam terms) inside clusters by means of a statistical validation, even in the case of multi-domain proteins. Within BAR+ clusters it is also possible to transfer the three-dimensional structure (when a template is available). This is possible by means of cluster-specific HMM profiles, which can be used to compute reliable template-to-target alignments even in the case of distantly related proteins (sequence identity < 30%). Other BAR+-based applications developed during my doctorate include the prediction of magnesium-binding sites in human proteins, the classification of the ABC transporter superfamily and the functional prediction (GO terms) of the CAFA targets. Remarkably, in the CAFA assessment, BAR+ placed among the ten most accurate methods. At present, BAR+ is freely available as a web server for functional and structural protein sequence annotation at http://bar.biocomp.unibo.it/bar2.0.
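The clustering step of an all-against-all annotation pipeline can be sketched as a graph problem: sequences are nodes, an edge joins two sequences whose pairwise alignment passes a stringent threshold, and clusters are the connected components within which annotation is then transferred. The identity and coverage thresholds and the toy alignment list below are illustrative assumptions, not the actual BAR+ metric.

```python
import networkx as nx

# Minimal sketch of graph-based sequence clustering: nodes are proteins, edges
# join pairs whose alignment passes a stringent filter, clusters are connected
# components. Thresholds and alignments below are placeholders, not the BAR+
# parameters described in the thesis.

pairwise = [                               # (seq1, seq2, identity %, coverage %)
    ("P1", "P2", 62.0, 95.0),
    ("P2", "P3", 45.0, 92.0),
    ("P4", "P5", 80.0, 60.0),              # high identity but low coverage: rejected
    ("P5", "P6", 41.0, 97.0),
]

g = nx.Graph()
g.add_nodes_from({s for rec in pairwise for s in rec[:2]})
for s1, s2, ident, cov in pairwise:
    if ident >= 40.0 and cov >= 90.0:      # stringent acceptance criterion (assumed)
        g.add_edge(s1, s2)

clusters = [sorted(c) for c in nx.connected_components(g)]
print(clusters)   # GO / Pfam terms would then be transferred within each cluster
```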
Abstract:
This dissertation studies the geometric static problem of under-constrained cable-driven parallel robots (CDPRs) supported by n cables, with n ≤ 6. The task consists of determining the overall robot configuration when a set of n variables is assigned. When variables relating to the platform posture are assigned, an inverse geometric static problem (IGP) must be solved; whereas, when cable lengths are given, a direct geometric static problem (DGP) must be considered. Both problems are challenging, as the robot continues to preserve some degrees of freedom even after n variables are assigned, with the final configuration determined by the applied forces. Hence, kinematics and statics are coupled and must be resolved simultaneously. In this dissertation, a general methodology is presented for modelling the aforementioned scenario with a set of algebraic equations. An elimination procedure is provided, aimed at solving the governing equations analytically and obtaining a least-degree univariate polynomial in the corresponding ideal for any value of n. Although an analytical procedure based on elimination is important from a mathematical point of view, providing an upper bound on the number of solutions in the complex field, it is not practical to compute these solutions as it would be very time-consuming. Thus, for the efficient computation of the solution set, a numerical procedure based on homotopy continuation is implemented. A continuation algorithm is also applied to find a set of robot parameters with the maximum number of real assembly modes for a given DGP. Finally, the end-effector pose depends on the applied load and may change due to external disturbances. An investigation into equilibrium stability is therefore performed.
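A toy version of the coupling between kinematics and statics in a DGP can be sketched as follows: for a point-mass platform suspended from two cables of assigned length, the equilibrium pose minimizes gravitational potential energy subject to the cable-length constraints (assuming inextensible, taut cables). The sketch only conveys why assigning cable lengths does not by itself fix the pose; the anchor points, lengths and solver are assumptions, and the thesis treats the general under-constrained case with elimination and homotopy continuation instead.

```python
import numpy as np
from scipy.optimize import minimize

# Toy direct geometric-static problem: a point mass hangs from two cables of
# given length; among the poses compatible with the lengths, equilibrium is
# the one minimizing gravitational potential energy (taut, inextensible
# cables assumed). Anchors and lengths are arbitrary illustrative values.

anchors = np.array([[0.0, 0.0, 2.0],     # cable exit points on the frame (assumed)
                    [2.0, 0.0, 2.0]])
lengths = np.array([1.5, 1.5])           # assigned cable lengths

def potential(p):                        # platform height drives the equilibrium
    return p[2]

cons = [{"type": "eq",
         "fun": lambda p, i=i: np.linalg.norm(p - anchors[i]) - lengths[i]}
        for i in range(2)]

res = minimize(potential, x0=np.array([1.0, 0.5, 0.5]), constraints=cons)
print("equilibrium position:", np.round(res.x, 3))
```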
Abstract:
This thesis deals principally with the development of different chemical protocols, ranging from environmentally sustainable peptide synthesis, to the asymmetric synthesis of modified tryptophans, to a series of straightforward procedures for constraining peptide backbones without the need for a pre-formed scaffold. Much effort has been dedicated to structural analysis in a biomimetic environment, which is fundamental for predicting the in vivo conformation of compounds as well as for rationalizing the experimentally determined bioactivity. Conformational analysis in solution was carried out mostly by NMR (2D gCOSY, ROESY, VT, titration experiments, molecular dynamics, etc.), FT-IR and ECD spectroscopy. As a practical application, 3D rigid scaffolds have been employed for the synthesis of biologically active compounds based on peptidomimetic and retro-mimetic structures. These mimics have been investigated for their potential as anti-inflammatory agents, and the results obtained are very promising. Moreover, the synthesis of the Amo ring enabled the development of an alternative, highly effective synthetic pathway to the antibiotic Linezolid. The final section is dedicated to the construction of a new biosensor based on zeolite L SAMs functionalized with the integrin ligand c[RGDfK], which showed high efficiency in the selective detection of tumor cells. Such a sensor could enable convenient, non-invasive detection and diagnosis of cancer at early stages from a few drops of a patient's blood or other biological fluids. In conclusion, the research described herein demonstrates that the peptidomimetic approach to well-defined 3D structures allows unambiguous investigation of structure-activity relationships, giving access to a wide range of bioactive compounds of pharmaceutical interest for use not only as potential drugs but also in diagnostic and theranostic applications.
Abstract:
In the framework of industrial problems, Constrained Optimization is known for its very good modeling capability and performance, and it stands as one of the most powerful, explored, and exploited tools to address prescriptive tasks. The number of applications is huge, ranging from logistics to transportation, packing, production, telecommunications, scheduling, and much more. The main reason behind this success lies in the remarkable effort made over the last decades by the OR community to develop realistic models and devise exact or approximate methods for the largest variety of constrained and combinatorial optimization problems, together with the spread of computational power and easily accessible OR software and resources. On the other hand, technological advancements have led to a wealth of data never seen before and increasingly push towards methods able to extract useful knowledge from it; among data-driven methods, Machine Learning techniques appear to be among the most promising, thanks to their successes in domains like Image Recognition, Natural Language Processing and game playing, as well as the amount of research involved. The purpose of the present research is to study how Machine Learning and Constrained Optimization can be used together to build systems able to leverage the strengths of both: this would open the way to exploiting decades of research on resolution techniques for COPs while constructing models able to adapt and learn from available data. In the first part of this work, we survey the existing techniques and classify them according to the type, method, or scope of the integration; subsequently, we introduce Moving Target, a novel and general algorithm devised to inject knowledge into learning models through constraints. In the last part of the thesis, two applications stemming from real-world projects, carried out in collaboration with Optit, are presented.
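To give a concrete flavour of constraint injection into learning models, the sketch below trains a least-squares regressor by gradient descent while penalizing violations of an example domain constraint (non-negative predictions on the training set). It is a generic penalty-based approach, not the Moving Target algorithm introduced in the thesis; the data, the chosen constraint and the penalty weight are assumptions.

```python
import numpy as np

# Generic sketch of constraint-aware learning: linear regression trained by
# gradient descent with an extra penalty term pushing predictions to satisfy
# an example constraint (non-negativity on the training set). Not the Moving
# Target algorithm of the thesis; data and hyperparameters are assumptions.

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)

w, lam, lr = np.zeros(3), 5.0, 0.01
for _ in range(500):
    pred = X @ w
    grad_loss = X.T @ (pred - y) / len(y)     # gradient of the squared-error loss
    violation = np.minimum(pred, 0.0)         # negative predictions violate the constraint
    grad_pen = X.T @ violation / len(y)       # gradient of 0.5 * ||min(pred, 0)||^2
    w -= lr * (grad_loss + lam * grad_pen)

print("weights:", np.round(w, 3),
      "| fraction of constraint-violating predictions:",
      float(np.mean(X @ w < 0)))
```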
Abstract:
This doctoral thesis focuses on the study of historical shallow landslide activity over time in response to anthropogenic forcing on land use, through the compilation of multi-temporal landslide inventories. The study areas, located in contrasting settings and characterized by different histories of land-cover change, are the Sillaro River basin (Italy) and the Tsitika and Eve River basins (coastal British Columbia). The Sillaro River basin belongs to a clay-dominated setting, characterized by extensive badland development and dominated by earth slides and earthflows. Here, forest removal began in the Roman period and has been followed by agricultural land abandonment and natural revegetation in recent times. By contrast, the Tsitika and Eve River basins are characterized by granitic and basaltic lithologies and dominated by debris slides, debris flows and debris avalanches. In this setting, anthropogenic impacts started in the 1960s and have involved logging operations. The thesis begins with an introductory chapter, followed by a methodological section in which a multi-temporal mapping approach is proposed and tested at four landslide sites in the Sillaro River basin. Results, in terms of inventory completeness in time and space, are compared against the existing region-wide Emilia-Romagna inventory. The approach is then applied at the scale of the Sillaro River basin, where the resulting multi-temporal inventory is used to investigate landslide activity in relation to historical land-cover changes across geologic domains and to hydro-meteorological forcing. The impact of timber harvesting and road construction on landslide activity and sediment transfer in the Tsitika and Eve River basins is then investigated, with a focus on the controls that interactions between landscape morphometry and cutblock location may exert on landslide size-frequency relations. The thesis ends with a summary of the main findings and a discussion of the advantages and limitations associated with compiling multi-temporal inventories in the two settings during different periods of human-driven land-cover dynamics.
Abstract:
The thesis aims to present a comprehensive and holistic overview of cybersecurity and privacy & data protection aspects related to IoT resource-constrained devices. Chapter 1 introduces the current technical landscape by providing a working definition and an architecture taxonomy of ‘Internet of Things’ and ‘resource-constrained devices’, coupled with a threat landscape in which each specific attack is linked to a layer of the taxonomy. Chapter 2 lays down the theoretical foundations for an interdisciplinary approach and a unified, holistic vision of cybersecurity, safety and privacy, justified by the ‘IoT revolution’, through the so-called infraethical perspective. Chapter 3 investigates whether, and to what extent, the fast-evolving European cybersecurity regulatory framework addresses the security challenges brought about by the IoT by allocating legal responsibilities to the right parties. Chapters 4 and 5 focus, on the other hand, on ‘privacy’, understood as a proxy term that includes EU data protection. In particular, Chapter 4 addresses three legal challenges that ubiquitous IoT data and metadata processing poses to the EU privacy and data protection legal frameworks, i.e., the ePrivacy Directive and the GDPR. Chapter 5 casts light on the risk management tool enshrined in EU data protection law, namely the Data Protection Impact Assessment (DPIA), and proposes an original DPIA methodology for connected devices, building on the CNIL (French data protection authority) model.