10 results for CLASSIFIER
in AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Abstract:
One of the problems in the analysis of nucleus-nucleus collisions is obtaining information on the value of the impact parameter b. This work applies pattern recognition techniques to associate values of b with groups of events. To this end, a support vector machine (SVM) classifier is adopted to analyze multifragmentation reactions. This method allows the values of b to be backtraced through a particular multidimensional analysis. The SVM classification consists of two main phases. In the first, known as the training phase, the classifier learns to discriminate the events generated by two different models, Classical Molecular Dynamics (CMD) and Heavy-Ion Phase-Space Exploration (HIPSE), for the reaction 58Ni + 48Ca at 25 AMeV. In the second, known as the test phase, what has been learned is tested on new events generated by the same models. These new results have been compared with those obtained through other techniques for backtracing the impact parameter. Our tests show that, with this approach, central and peripheral collisions for the CMD events are always better classified than by the other backtracing techniques. Finally, we have performed the SVM classification on the experimental data measured by the NUCL-EX collaboration with the CHIMERA apparatus for the same reaction.
Abstract:
In the present study we use multivariate analysis techniques to discriminate signal from background in the fully hadronic decay channel of ttbar events. We give a brief introduction to the role of the top quark in the Standard Model and a general description of the CMS experiment at the LHC. We used the CMS computing and software infrastructure to generate and prepare the data samples used in this analysis. We tested the performance of three different classifiers applied to our data samples and used the selection obtained with the Multi-Layer Perceptron classifier to estimate the statistical and systematic uncertainties on the cross-section measurement.
Abstract:
Optimizing water distribution networks is a challenging problem because of the size and complexity of these systems. Since the second half of the twentieth century this field has been investigated by many authors. Recently, to overcome the discrete nature of the variables and the non-linearity of the equations, research has focused on the development of heuristic algorithms. These algorithms do not require continuity and linearity of the problem functions because they are linked to an external hydraulic simulator that solves the equations of mass continuity and energy conservation for the network. In this work, NSGA-II (Non-dominated Sorting Genetic Algorithm II) has been used. This is a heuristic multi-objective genetic algorithm based on the analogy of evolution in nature. Starting from an initial random set of solutions, called the population, it evolves them towards a front of solutions that minimize, separately and simultaneously, all the objectives. This can be very useful in practical problems, where multiple conflicting goals are common. Usually, one of the main drawbacks of these algorithms is computation time: because the search is stochastic, many solutions must be analyzed before good ones are found. The results of this thesis on the classical optimal design problem show that it is possible to improve the results by modifying the mathematical definition of the objective functions and the survival criterion, inserting good solutions created by a Cellular Automaton, and using rules created by a classifier algorithm (C4.5). This part has been tested using the version of NSGA-II supplied by the Centre for Water Systems (University of Exeter, UK) in the MATLAB® environment. Even if steering the search can constrain the algorithm, with the risk of not finding the optimal set of solutions, it can greatly improve the results.
Subsequently, with the help of CINECA, a version of NSGA-II has been implemented in C and parallelized: results for the global parallelization show the speed-up, while results for the island parallelization show that communication among islands can improve the optimization. Finally, some tests on the optimization of pump scheduling have been carried out. In this case, good results are found for a small network, while the solutions of a large problem are affected by the lack of constraints on the number of pump switches. Possible future research concerns the insertion of further constraints and guidance of the evolution. In conclusion, the optimization of water distribution systems is still far from a definitive solution, but improvements in this field can be very useful in reducing the cost of solutions to practical problems, where the high number of variables makes them very difficult to manage from a human point of view.
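The idea of a "front of solutions" in the abstract above can be illustrated with a small sketch. It is not the thesis implementation: it ranks candidates by repeatedly peeling off the non-dominated set, which is a naive stand-in for the faster bookkeeping used inside NSGA-II. The two objectives (say, cost and pressure deficit) and the sample values are hypothetical, and both objectives are assumed to be minimized.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_sort(points):
    """Split candidate solutions into successive fronts (lists of indices).

    Front 0 holds the solutions dominated by nobody; front 1 holds those
    dominated only by members of front 0; and so on.
    """
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Hypothetical (cost, pressure deficit) pairs for six candidate designs.
designs = [(1.0, 9.0), (2.0, 7.0), (3.0, 8.0), (4.0, 4.0), (5.0, 5.0), (6.0, 1.0)]
fronts = non_dominated_sort(designs)
print(fronts)  # [[0, 1, 3, 5], [2, 4]]
```

Designs 2 and 4 fall into the second front because each is beaten on both objectives by some other design. The remaining four are mutually incomparable trade-offs.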
Abstract:
Machine learning comprises a series of techniques for the automatic extraction of meaningful information from large collections of noisy data. In many real-world applications, data is naturally represented in structured form. Since traditional methods in machine learning deal with vectorial information, they require an a priori form of preprocessing. Among all the learning techniques for dealing with structured data, kernel methods are recognized to have a strong theoretical background and to be effective approaches. They do not require an explicit vectorial representation of the data in terms of features, but rely on a measure of similarity between any pair of objects of a domain, the kernel function. Designing fast and good kernel functions is a challenging problem. In the case of tree-structured data two issues become relevant: kernels for trees should not be sparse and should be fast to compute. The sparsity problem arises when, given a dataset and a kernel function, most structures of the dataset are completely dissimilar to one another. In those cases the classifier has too little information for making correct predictions on unseen data. In fact, it tends to produce a discriminating function that behaves like the nearest-neighbour rule. Sparsity is likely to arise for some standard tree kernel functions, such as the subtree and subset tree kernels, when they are applied to datasets with node labels belonging to a large domain. A second drawback of using tree kernels is the time complexity required in both the learning and classification phases. Such complexity can sometimes prevent the application of the kernel in scenarios involving large amounts of data. This thesis proposes three contributions for resolving the above issues of kernels for trees. The first contribution aims at creating kernel functions that adapt to the statistical properties of the dataset, thus reducing its sparsity with respect to traditional tree kernel functions.
Specifically, we propose to encode the input trees by an algorithm able to project the data onto a lower-dimensional space with the property that similar structures are mapped similarly. By building kernel functions on the lower-dimensional representation, we are able to perform inexact matchings between different inputs in the original space. A second contribution is the proposal of a novel kernel function based on the convolution kernel framework. A convolution kernel measures the similarity of two objects in terms of the similarities of their subparts. Most convolution kernels are based on counting the number of shared substructures, partially discarding information about their position in the original structure. The kernel function we propose is, instead, especially focused on this aspect. A third contribution is devoted to reducing the computational burden related to the calculation of a kernel function between a tree and a forest of trees, which is a typical operation in the classification phase and, for some algorithms, also in the learning phase. We propose a general methodology applicable to convolution kernels. Moreover, we show an instantiation of our technique when kernels such as the subtree and subset tree kernels are employed. In those cases, Directed Acyclic Graphs can be used to compactly represent shared substructures in different trees, thus reducing the computational burden and storage requirements.
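To make "counting the number of shared substructures" concrete, here is a minimal sketch of a plain subtree kernel. It is one textbook-style formulation, not any of the kernels proposed in the thesis. Trees are assumed to be ordered and written as nested tuples `(label, child, ...)`, and the example trees and labels are hypothetical. The kernel value is the number of node pairs, one from each tree, that root identical complete subtrees.

```python
from collections import Counter


def subtrees(node):
    """Yield every complete subtree of an ordered tree given as (label, child, ...)."""
    yield node
    for child in node[1:]:
        yield from subtrees(child)


def subtree_kernel(t1, t2):
    """Count pairs of nodes (one per tree) that root identical complete subtrees."""
    c1, c2 = Counter(subtrees(t1)), Counter(subtrees(t2))
    return sum(n * c2[s] for s, n in c1.items())


# Hypothetical parse-like trees.
t1 = ("S", ("NP", ("D",), ("N",)), ("VP", ("V",), ("NP", ("D",), ("N",))))
t2 = ("S", ("NP", ("D",), ("N",)), ("VP", ("V",)))
t3 = ("X", ("Y",))  # shares no label with t1 or t2

print(subtree_kernel(t1, t2))  # 7
print(subtree_kernel(t1, t3))  # 0
```

The zero for `t3` is the sparsity problem in miniature: when labels come from a large domain, most pairs of trees share nothing and the kernel carries no information. Collapsing repeated subtrees into one counter entry is loosely related in spirit to sharing substructures in a graph, though the DAG technique in the abstract is a separate and more general method.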
Abstract:
Ambient Intelligence (AmI) envisions a world where smart electronic environments are aware of and responsive to their context. People moving into these settings engage many computational devices and systems simultaneously, even if they are not aware of their presence. AmI stems from the convergence of three key technologies: ubiquitous computing, ubiquitous communication and natural interfaces. The dependence on a large number of fixed and mobile sensors embedded into the environment makes Wireless Sensor Networks (WSNs) one of the most relevant enabling technologies for AmI. WSNs are complex systems made up of a number of sensor nodes, simple devices that typically embed a low-power computational unit (microcontrollers, FPGAs etc.), a wireless communication unit, one or more sensors and some form of energy supply (either batteries or energy scavenger modules). Low cost, low computational power, low energy consumption and small size are characteristics that must be taken into consideration when designing and dealing with WSNs. In order to handle the large amount of data generated by a WSN, several multisensor data fusion techniques have been developed. The aim of multisensor data fusion is to combine data to achieve better accuracy and inferences than could be achieved by the use of a single sensor alone. In this dissertation we present our results in building several AmI applications suitable for a WSN implementation. The work can be divided into two main areas: Multimodal Surveillance and Activity Recognition. Novel techniques to handle data from a network of low-cost, low-power Pyroelectric InfraRed (PIR) sensors are presented. Such techniques allow the detection of the number of people moving in the environment, their direction of movement and their position. We discuss how a mesh of PIR sensors can be integrated with a video surveillance system to increase its performance in people tracking.
Furthermore, we embed a PIR sensor within the design of a Wireless Video Sensor Node (WVSN) to extend its lifetime. Activity recognition is a fundamental block in natural interfaces. A challenging objective is to design an activity recognition system that is able to exploit a redundant but unreliable WSN. We present our work on building a novel activity recognition architecture for such a dynamic system. The architecture has a hierarchical structure in which simple nodes perform gesture classification and a high-level meta-classifier fuses a changing number of classifier outputs. We demonstrate the benefit of such an architecture in terms of increased recognition performance and robustness to faults and noise. Furthermore, we show how network lifetime can be extended through a performance-power trade-off. Smart objects can enhance user experience within smart environments. We present our work in extending the capabilities of the Smart Micrel Cube (SMCube), a smart object used as a tangible interface within a tangible computing framework, through the development of a gesture recognition algorithm suitable for this device with limited computational power. Finally, the development of activity recognition techniques can greatly benefit from the availability of shared datasets. We report our experience in building a dataset for activity recognition. This dataset is freely available to the scientific community for research purposes and can be used as a testbench for developing, testing and comparing different activity recognition techniques.
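The abstract does not say how the meta-classifier fuses node outputs, so the sketch below swaps in a simple plurality vote purely to illustrate fusing a changing number of classifier outputs. Nodes that are asleep or have failed report `None` and are ignored. The function name, gesture labels and tie-breaking rule are assumptions.

```python
from collections import Counter


def fuse(node_outputs):
    """Plurality vote over the nodes that actually reported a label.

    node_outputs: one entry per node, either a label or None when the node
    is unavailable. Returns None if no node reported. Ties are broken
    alphabetically so the result is deterministic.
    """
    available = [label for label in node_outputs if label is not None]
    if not available:
        return None
    counts = Counter(available)
    top = max(counts.values())
    return min(label for label, n in counts.items() if n == top)


print(fuse(["wave", "wave", "circle", None]))  # wave
print(fuse(["circle", None, None, None]))      # circle (only one node alive)
print(fuse([None, None]))                      # None
```

Because the vote is taken over whichever nodes respond, the same fusion rule keeps working as nodes drop out. The same property would allow nodes to be switched off deliberately to save power at some cost in accuracy.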
Abstract:
Healthcare, Human Computer Interfaces (HCI), Security and Biometry are the most promising application scenarios directly involved in the evolution of Body Area Networks (BANs). Both wearable devices and sensors directly integrated in garments envision a world in which each of us is supervised by an invisible assistant monitoring our health and daily-life activities. New opportunities are enabled by improvements in sensor miniaturization and in the transmission efficiency of wireless protocols, which have made it possible to integrate high computational power aboard independent, energy-autonomous, small form factor devices. The applications serve various purposes: (I) data collection to achieve off-line knowledge discovery; (II) notification to users of their activities or of a danger; (III) biofeedback rehabilitation; (IV) remote alarm activation in case the subject needs assistance; (V) introduction of a more natural interaction with the surrounding computerized environment; (VI) user identification by physiological or behavioral characteristics. Telemedicine and mHealth [1] are two of the leading concepts directly related to healthcare. The ability to wear unobtrusive objects supports users’ autonomy. A new sense of freedom is offered to the user, supported not only by psychological help but also by a real improvement in safety. Furthermore, the medical community aims to introduce new devices to innovate patient treatments, in particular by extending ambulatory analysis to real-life scenarios through continuous acquisition. The wide diffusion of emerging portable wellness equipment has extended the usability of wearable devices to fitness and training, by monitoring user performance on the task at hand. Learning the right execution techniques for work, sport and music can be supported by an electronic trainer furnishing adequate aid.
HCIs made real the concepts of Ubiquitous and Pervasive Computing and Calm Technology introduced in 1988 by Mark Weiser and John Seely Brown. They promote the creation of pervasive environments, enhancing the human experience. Context-aware, adaptive and proactive environments serve and help people by becoming sensitive and reactive to their presence, since electronics is ubiquitous and deployed everywhere. In this thesis we pay attention to the integration of all the aspects involved in the development of a BAN. Starting from the choice of sensors, we design the node, configure the radio network, implement real-time data analysis and provide feedback to the user. We present algorithms to be implemented in a wearable assistant for posture and gait analysis and to provide assistance in different walking conditions, preventing falls. Our aim of contributing to the development of non-proprietary solutions drove us to integrate commercial and standard solutions in our devices. We use sensors available on the market and avoid designing specialized sensors in ASIC technologies. We employ standard radio protocols and open source projects wherever possible. The specific contributions of the PhD research activities are presented and discussed in the following.
• We have designed and built several wireless sensor nodes providing both sensing and actuation capabilities, focusing on flexibility, small form factor and low power consumption. The key idea was to develop a simple and general-purpose architecture for rapid analysis, prototyping and deployment of BAN solutions. Two different sensing units are integrated: kinematic (3D accelerometer and 3D gyroscopes) and kinetic (foot-floor contact pressure forces). Two kinds of feedback were implemented: audio and vibrotactile.
• Since the system built is a suitable platform for testing and measuring the features and the constraints of a sensor network (radio communication, network protocols, power consumption and autonomy), we compared Bluetooth and ZigBee performance in terms of throughput and energy efficiency. Field tests evaluated the usability in the fall detection scenario.
• To prove the flexibility of the architecture designed, we have implemented a wearable system for human posture rehabilitation. The application was developed in conjunction with biomedical engineers who provided the audio algorithms to furnish biofeedback to the user about his/her stability.
• We explored off-line gait analysis of collected data, developing an algorithm to detect foot inclination in the sagittal plane during walking.
• In collaboration with the Wearable Lab – ETH, Zurich, we developed an algorithm to monitor the user during several walking conditions in which the user carries a load.
The remainder of the thesis is organized as follows. Chapter I gives an overview of Body Area Networks (BANs), illustrating the relevant features of this technology and the key challenges still open. It concludes with a short list of the real solutions and prototypes proposed by academic research and manufacturers. The domain of posture and gait analysis, the methodologies, and the technologies used to provide real-time feedback on detected events are illustrated in Chapter II. Chapters III and IV present BANs developed to detect falls and to monitor gait, respectively, taking advantage of two inertial measurement units and baropodometric insoles. Chapter V reports an audio-biofeedback system to improve balance based on the information provided by the user's centre of mass. A walking assistant based on the KNN classifier, which detects walking alterations during load carriage, is described in Chapter VI.
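As a reminder of how a KNN classifier of the kind mentioned for Chapter VI decides, here is a minimal sketch. The two features (step time in seconds and a normalized peak insole pressure), the training values and the class names are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Label a query point by majority vote among its k nearest training samples.

    train: list of (feature_tuple, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical (step time [s], normalized peak pressure) samples.
train = [
    ((1.00, 0.20), "normal"),
    ((1.05, 0.25), "normal"),
    ((0.95, 0.22), "normal"),
    ((1.30, 0.60), "loaded"),
    ((1.35, 0.55), "loaded"),
    ((1.28, 0.65), "loaded"),
]

print(knn_predict(train, (1.02, 0.21)))  # normal
print(knn_predict(train, (1.32, 0.58)))  # loaded
```

In practice the features would be scaled to comparable ranges before computing distances, and an odd `k` avoids ties between two classes.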
Abstract:
Tracking activities during daily life and assessing movement parameters is essential for complementing the information gathered in confined environments, such as clinical and physical activity laboratories, for the assessment of mobility. Inertial measurement units (IMUs) are used to monitor human movement for prolonged periods of time and without space limitations. The focus of this study was to provide a robust, low-cost and unobtrusive solution for evaluating human motion using a single IMU. The first part of the study focused on monitoring and classifying daily life activities. A simple method that analyses the variations in the signal was developed to distinguish two types of activity intervals: active and inactive. A neural classifier was used to classify active intervals; the angle with respect to gravity was used to classify inactive intervals. The second part of the study focused on the extraction of gait parameters using a single IMU attached to the pelvis. Two complementary methods were proposed for gait parameter estimation. The first was a wavelet-based method developed for the estimation of gait events. The second was developed for estimating step and stride length during level walking using the estimates of the first method. A special integration algorithm was extended to operate on each gait cycle using a specially designed Kalman filter. The developed methods were also applied in various scenarios. The activity monitoring method was used in a PRIN’07 project to assess the mobility levels of individuals living in an urban area. The same method was applied to volleyball players to analyze their fitness levels by monitoring their daily life activities. The methods proposed in these studies provided a simple, unobtrusive and low-cost solution for monitoring and assessing activities outside of controlled environments.
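The first part of the method described above can be sketched roughly as follows. The abstract does not give the actual variation measure, window length or thresholds, so this sketch assumes a per-window standard deviation of the acceleration magnitude compared against a fixed threshold, and it assumes the sensor z axis is aligned with gravity when the wearer is upright. All names and numbers are illustrative.

```python
import math
from statistics import pstdev


def label_windows(signal, window, threshold):
    """Label consecutive non-overlapping windows as 'active' or 'inactive'.

    A window is 'active' when the standard deviation of its samples
    exceeds the threshold.
    """
    labels = []
    for start in range(0, len(signal) - window + 1, window):
        chunk = signal[start:start + window]
        labels.append("active" if pstdev(chunk) > threshold else "inactive")
    return labels


def tilt_angle_deg(ax, ay, az):
    """Angle in degrees between the sensor z axis and the measured gravity vector."""
    norm = math.sqrt(ax * ax + ay * ay + az * az)
    cosine = max(-1.0, min(1.0, az / norm))  # clamp against rounding error
    return math.degrees(math.acos(cosine))


# Hypothetical acceleration magnitude in g: still, moving, still.
magnitude = [1.0, 1.0, 1.01, 0.99, 1.0, 1.8, 0.3, 1.5, 1.0, 1.0, 1.0, 1.0]
print(label_windows(magnitude, window=4, threshold=0.1))
# ['inactive', 'active', 'inactive']

print(tilt_angle_deg(0.0, 0.0, 1.0))  # 0.0, sensor axis along gravity
```

For an inactive window, the tilt angle of the mean acceleration could then separate postures such as upright and lying. Active windows would go to a trained classifier, which is not sketched here.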
Abstract:
This thesis aimed at addressing some of the issues that, at the state of the art, prevent P300-based brain computer interface (BCI) systems from moving from research laboratories to end users’ homes. An innovative asynchronous classifier has been defined and validated. It relies on the introduction of a set of thresholds in the classifier, and these thresholds have been assessed considering the distributions of score values relating to target stimuli, non-target stimuli and epochs of voluntary no-control. With the asynchronous classifier, a P300-based BCI system can adapt its speed to the current state of the user and can automatically suspend the control when the user diverts his attention from the stimulation interface. Since EEG signals are non-stationary and show inherent variability, in order to make long-term use of BCI possible, it is important to track changes in ongoing EEG activity and to adapt BCI model parameters accordingly. To this aim, the asynchronous classifier has been subsequently improved by introducing a self-calibration algorithm for the continuous and unsupervised recalibration of the subjective control parameters. Finally, an index for the online monitoring of the EEG quality has been defined and validated in order to detect potential problems and system failures. This thesis ends with the description of a translational work involving end users (people with amyotrophic lateral sclerosis, ALS). Focusing on the concepts of the user-centered design approach, the phases relating to the design, the development and the validation of an innovative assistive device have been described. The proposed assistive technology (AT) has been specifically designed to meet the needs of people with ALS during the different phases of the disease (i.e. the degree of impairment of motor abilities). Indeed, the AT can be accessed with several input devices, either conventional (mouse, touchscreen) or alternative (switches, headtracker), up to a P300-based BCI.
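One way to picture a threshold-based asynchronous decision is sketched below. The abstract does not specify the rule, so the accumulation scheme, the function name and every number here are assumptions. Classifier scores for each selectable item are summed over stimulation rounds, a selection is issued as soon as an item's total reaches its threshold, and no selection is issued (no-control) if the round limit is reached first.

```python
def asynchronous_decision(score_rounds, thresholds, max_rounds):
    """Return (selected_item, rounds_used); selected_item is None for no-control.

    score_rounds: list of dicts mapping item -> classifier score for one round.
    thresholds:   dict mapping item -> accumulated score needed to select it.
    """
    totals = {item: 0.0 for item in thresholds}
    used = 0
    for scores in score_rounds[:max_rounds]:
        used += 1
        for item, score in scores.items():
            totals[item] += score
        above = [item for item in totals if totals[item] >= thresholds[item]]
        if above:
            # Pick the item that exceeds its own threshold by the widest margin.
            best = max(above, key=lambda item: totals[item] - thresholds[item])
            return best, used
    return None, used


thresholds = {"A": 3.0, "B": 3.0}  # hypothetical per-item thresholds

attending = [{"A": 1.2, "B": 0.1}, {"A": 1.1, "B": -0.2}, {"A": 1.0, "B": 0.3}]
focused = [{"A": 3.5, "B": 0.0}]
distracted = [{"A": 0.2, "B": 0.1}, {"A": -0.1, "B": 0.3}, {"A": 0.1, "B": -0.2}]

print(asynchronous_decision(attending, thresholds, max_rounds=3))   # ('A', 3)
print(asynchronous_decision(focused, thresholds, max_rounds=3))     # ('A', 1)
print(asynchronous_decision(distracted, thresholds, max_rounds=3))  # (None, 3)
```

The three calls mirror the behaviours named in the abstract: stronger responses reach the threshold in fewer rounds, so the speed adapts to the user, and weak scores never reach it, so control is suspended.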
Abstract:
Intelligent systems are now an inherent part of society, supporting a synergistic human-machine collaboration. Beyond economic and climate factors, energy consumption is strongly affected by the performance of computing systems. Poor software quality may invalidate any attempt at improvement. In addition, data-driven machine learning algorithms are the basis for human-centered applications, and their interpretability is one of the most important features of computational systems. Software maintenance is a critical discipline for supporting automatic and life-long system operation. As most software registers its inner events by means of logs, log analysis is an approach to maintaining system operation. Logs are characterized as Big Data assembled in large-flow streams, being unstructured, heterogeneous, imprecise, and uncertain. This thesis addresses fuzzy and neuro-granular methods to provide maintenance solutions applied to anomaly detection (AD) and log parsing (LP), dealing with data uncertainty and identifying ideal time periods for detailed software analyses. LP provides a deeper semantic interpretation of the anomalous occurrences. The solutions evolve over time and are general-purpose, being highly applicable, scalable, and maintainable. Granular classification models, namely the Fuzzy set-Based evolving Model (FBeM), the evolving Granular Neural Network (eGNN), and the evolving Gaussian Fuzzy Classifier (eGFC), are compared on the AD problem. The evolving Log Parsing (eLP) method is proposed for the automatic parsing of system logs. All the methods use recursive mechanisms to create, update, merge, and delete information granules according to the behavior of the data. For the first time in the evolving intelligent systems literature, the proposed method, eLP, is able to process streams of words and sentences.
Regarding AD accuracy, FBeM achieved (85.64 ± 3.69)%; eGNN reached (96.17 ± 0.78)%; eGFC obtained (92.48 ± 1.21)%; and eLP reached (96.05 ± 1.04)%. Besides being competitive, eLP also generates a log grammar and offers a higher level of model interpretability.
Abstract:
In this work, we explore and demonstrate the potential for modeling and classification using quantile-based distributions, that is, distributions defined by their quantile function. In the first part we formalize a least squares estimation framework for the class of linear quantile functions, leading to unbiased and asymptotically normal estimators. Among the distributions with a linear quantile function, we focus on the flattened generalized logistic distribution (fgld), which offers a wide range of distributional shapes. A novel naïve-Bayes classifier is proposed that utilizes the fgld estimated via least squares, and through simulations and applications, we demonstrate its competitiveness against state-of-the-art alternatives. In the second part we consider the Bayesian estimation of quantile-based distributions. We introduce a factor model with independent latent variables, which are distributed according to the fgld. Similar to the independent factor analysis model, this approach accommodates flexible factor distributions while using fewer parameters. The model is presented within a Bayesian framework, an MCMC algorithm for its estimation is developed, and its effectiveness is illustrated with data from the European Social Survey. The third part focuses on depth functions, which extend the concept of quantiles to multivariate data by imposing a center-outward ordering in the multivariate space. We investigate the recently introduced integrated rank-weighted (IRW) depth function, which is based on the distribution of random spherical projections of the multivariate data. This depth function proves to be computationally efficient, and to increase its flexibility we propose different methods to explicitly model the projected univariate distributions.
Its usefulness is shown in classification tasks: the maximum depth classifier based on the IRW depth is proven to be asymptotically optimal under certain conditions, and classifiers based on the IRW depth are shown to perform well in simulated and real data experiments.
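A maximum depth classifier assigns a point to the class in which it lies deepest. The sketch below is a Monte Carlo approximation under stated assumptions, not the thesis procedure: directions are drawn uniformly on the sphere, each projected distribution is replaced by its empirical version (the thesis proposes modelling it explicitly), and the per-direction depth is taken as the smaller of the fractions of projected points at or below and at or above the query. Data and names are hypothetical.

```python
import math
import random


def random_directions(n_dirs, dim, rng):
    """Draw unit vectors uniformly on the sphere by normalizing Gaussian vectors."""
    directions = []
    for _ in range(n_dirs):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in v))
        directions.append([c / norm for c in v])
    return directions


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def irw_depth(x, sample, directions):
    """Average over directions of the univariate depth of the projected point."""
    total = 0.0
    for u in directions:
        px = dot(u, x)
        projected = [dot(u, s) for s in sample]
        at_or_below = sum(p <= px for p in projected) / len(projected)
        at_or_above = sum(p >= px for p in projected) / len(projected)
        total += min(at_or_below, at_or_above)
    return total / len(directions)


def max_depth_classify(x, classes, directions):
    """Assign x to the class in which its approximate IRW depth is largest."""
    return max(classes, key=lambda label: irw_depth(x, classes[label], directions))


# Two small hypothetical classes in the plane.
classes = {
    "a": [(-1.0, 0.0), (1.0, 0.0), (0.0, -1.0), (0.0, 1.0), (0.0, 0.0)],
    "b": [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (4.0, 5.0), (5.0, 4.0)],
}
directions = random_directions(200, 2, random.Random(0))

print(max_depth_classify((0.2, 0.1), classes, directions))  # a
print(max_depth_classify((5.1, 4.9), classes, directions))  # b
```

Each direction needs only a one-dimensional rank computation, which is where the computational efficiency mentioned in the abstract comes from. More directions give a better approximation of the integral over the sphere.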