904 resultados para Supervector kernel
Resumo:
Fast spreading unknown viruses have caused major damage on computer systems upon their initial release. Current detection methods have lacked capabilities to detect unknown viruses quickly enough to avoid mass spreading and damage. This dissertation has presented a behavior based approach to detecting known and unknown viruses based on their attempt to replicate. Replication is the qualifying fundamental characteristic of a virus and is consistently present in all viruses making this approach applicable to viruses belonging to many classes and executing under several conditions. A form of replication called self-reference replication, (SR-replication), has been formalized as one main type of replication which specifically replicates by modifying or creating other files on a system to include the virus itself. This replication type was used to detect viruses attempting replication by referencing themselves which is a necessary step to successfully replicate files. The approach does not require a priori knowledge about known viruses. Detection was accomplished at runtime by monitoring currently executing processes attempting to replicate. Two implementation prototypes of the detection approach called SRRAT were created and tested on the Microsoft Windows operating systems focusing on the tracking of user mode Win32 API system calls and Kernel mode system services. The research results showed SR-replication capable of distinguishing between file infecting viruses and benign processes with little or no false positives and false negatives. ^
Resumo:
In this dissertation, I investigate three related topics on asset pricing: the consumption-based asset pricing under long-run risks and fat tails, the pricing of VIX (CBOE Volatility Index) options and the market price of risk embedded in stock returns and stock options. These three topics are fully explored in Chapter II through IV. Chapter V summarizes the main conclusions. In Chapter II, I explore the effects of fat tails on the equilibrium implications of the long run risks model of asset pricing by introducing innovations with dampened power law to consumption and dividends growth processes. I estimate the structural parameters of the proposed model by maximum likelihood. I find that the stochastic volatility model with fat tails can, without resorting to high risk aversion, generate implied risk premium, expected risk free rate and their volatilities comparable to the magnitudes observed in data. In Chapter III, I examine the pricing performance of VIX option models. The contention that simpler-is-better is supported by the empirical evidence using actual VIX option market data. I find that no model has small pricing errors over the entire range of strike prices and times to expiration. In general, Whaley’s Black-like option model produces the best overall results, supporting the simpler-is-better contention. However, the Whaley model does under/overprice out-of-the-money call/put VIX options, which is contrary to the behavior of stock index option pricing models. In Chapter IV, I explore risk pricing through a model of time-changed Lvy processes based on the joint evidence from individual stock options and underlying stocks. I specify a pricing kernel that prices idiosyncratic and systematic risks. This approach to examining risk premia on stocks deviates from existing studies. The empirical results show that the market pays positive premia for idiosyncratic and market jump-diffusion risk, and idiosyncratic volatility risk. However, there is no consensus on the premium for market volatility risk. It can be positive or negative. The positive premium on idiosyncratic risk runs contrary to the implications of traditional capital asset pricing theory.
Resumo:
An iterative travel time forecasting scheme, named the Advanced Multilane Prediction based Real-time Fastest Path (AMPRFP) algorithm, is presented in this dissertation. This scheme is derived from the conventional kernel estimator based prediction model by the association of real-time nonlinear impacts that caused by neighboring arcs’ traffic patterns with the historical traffic behaviors. The AMPRFP algorithm is evaluated by prediction of the travel time of congested arcs in the urban area of Jacksonville City. Experiment results illustrate that the proposed scheme is able to significantly reduce both the relative mean error (RME) and the root-mean-squared error (RMSE) of the predicted travel time. To obtain high quality real-time traffic information, which is essential to the performance of the AMPRFP algorithm, a data clean scheme enhanced empirical learning (DCSEEL) algorithm is also introduced. This novel method investigates the correlation between distance and direction in the geometrical map, which is not considered in existing fingerprint localization methods. Specifically, empirical learning methods are applied to minimize the error that exists in the estimated distance. A direction filter is developed to clean joints that have negative influence to the localization accuracy. Synthetic experiments in urban, suburban and rural environments are designed to evaluate the performance of DCSEEL algorithm in determining the cellular probe’s position. The results show that the cellular probe’s localization accuracy can be notably improved by the DCSEEL algorithm. Additionally, a new fast correlation technique for overcoming the time efficiency problem of the existing correlation algorithm based floating car data (FCD) technique is developed. The matching process is transformed into a 1-dimensional (1-D) curve matching problem and the Fast Normalized Cross-Correlation (FNCC) algorithm is introduced to supersede the Pearson product Moment Correlation Co-efficient (PMCC) algorithm in order to achieve the real-time requirement of the FCD method. The fast correlation technique shows a significant improvement in reducing the computational cost without affecting the accuracy of the matching process.
Resumo:
Fast spreading unknown viruses have caused major damage on computer systems upon their initial release. Current detection methods have lacked capabilities to detect unknown virus quickly enough to avoid mass spreading and damage. This dissertation has presented a behavior based approach to detecting known and unknown viruses based on their attempt to replicate. Replication is the qualifying fundamental characteristic of a virus and is consistently present in all viruses making this approach applicable to viruses belonging to many classes and executing under several conditions. A form of replication called self-reference replication, (SR-replication), has been formalized as one main type of replication which specifically replicates by modifying or creating other files on a system to include the virus itself. This replication type was used to detect viruses attempting replication by referencing themselves which is a necessary step to successfully replicate files. The approach does not require a priori knowledge about known viruses. Detection was accomplished at runtime by monitoring currently executing processes attempting to replicate. Two implementation prototypes of the detection approach called SRRAT were created and tested on the Microsoft Windows operating systems focusing on the tracking of user mode Win32 API system calls and Kernel mode system services. The research results showed SR-replication capable of distinguishing between file infecting viruses and benign processes with little or no false positives and false negatives.
Resumo:
In this dissertation, I investigate three related topics on asset pricing: the consumption-based asset pricing under long-run risks and fat tails, the pricing of VIX (CBOE Volatility Index) options and the market price of risk embedded in stock returns and stock options. These three topics are fully explored in Chapter II through IV. Chapter V summarizes the main conclusions. In Chapter II, I explore the effects of fat tails on the equilibrium implications of the long run risks model of asset pricing by introducing innovations with dampened power law to consumption and dividends growth processes. I estimate the structural parameters of the proposed model by maximum likelihood. I find that the stochastic volatility model with fat tails can, without resorting to high risk aversion, generate implied risk premium, expected risk free rate and their volatilities comparable to the magnitudes observed in data. In Chapter III, I examine the pricing performance of VIX option models. The contention that simpler-is-better is supported by the empirical evidence using actual VIX option market data. I find that no model has small pricing errors over the entire range of strike prices and times to expiration. In general, Whaley’s Black-like option model produces the best overall results, supporting the simpler-is-better contention. However, the Whaley model does under/overprice out-of-the-money call/put VIX options, which is contrary to the behavior of stock index option pricing models. In Chapter IV, I explore risk pricing through a model of time-changed Lévy processes based on the joint evidence from individual stock options and underlying stocks. I specify a pricing kernel that prices idiosyncratic and systematic risks. This approach to examining risk premia on stocks deviates from existing studies. The empirical results show that the market pays positive premia for idiosyncratic and market jump-diffusion risk, and idiosyncratic volatility risk. However, there is no consensus on the premium for market volatility risk. It can be positive or negative. The positive premium on idiosyncratic risk runs contrary to the implications of traditional capital asset pricing theory.
Resumo:
The aim of this study was to develop a practical, versatile and fast dosimetry and radiobiological model for calculation of the 3D dose distribution and radiobiological effectiveness of radioactive stents. The algorithm was written in Matlab 6.5 programming language and is based on the dose point kernel convolution. The dosimetry and radiobiological model was applied for evaluation of the 3D dose distribution of 32P, 90Y, 188Re and 177Lu stents. Of the four, 32P delivers the highest dose, while 90Y, 188Re and 177Lu require high levels of activity to deliver a significant therapeutic dose in the range of 15-30 Gy. Results of the radiobiological model demonstrated that the same physical dose delivered by different radioisotopes produces significantly different radiobiological effects. This type of theoretical dose calculation can be useful in the development of new stent designs, the planning of animal studies and clinical trials, and clinical decisions involving individualized treatment plans.
Resumo:
The great interest in nonlinear system identification is mainly due to the fact that a large amount of real systems are complex and need to have their nonlinearities considered so that their models can be successfully used in applications of control, prediction, inference, among others. This work evaluates the application of Fuzzy Wavelet Neural Networks (FWNN) to identify nonlinear dynamical systems subjected to noise and outliers. Generally, these elements cause negative effects on the identification procedure, resulting in erroneous interpretations regarding the dynamical behavior of the system. The FWNN combines in a single structure the ability to deal with uncertainties of fuzzy logic, the multiresolution characteristics of wavelet theory and learning and generalization abilities of the artificial neural networks. Usually, the learning procedure of these neural networks is realized by a gradient based method, which uses the mean squared error as its cost function. This work proposes the replacement of this traditional function by an Information Theoretic Learning similarity measure, called correntropy. With the use of this similarity measure, higher order statistics can be considered during the FWNN training process. For this reason, this measure is more suitable for non-Gaussian error distributions and makes the training less sensitive to the presence of outliers. In order to evaluate this replacement, FWNN models are obtained in two identification case studies: a real nonlinear system, consisting of a multisection tank, and a simulated system based on a model of the human knee joint. The results demonstrate that the application of correntropy as the error backpropagation algorithm cost function makes the identification procedure using FWNN models more robust to outliers. However, this is only achieved if the gaussian kernel width of correntropy is properly adjusted.
Resumo:
My thesis examines fine-scale habitat use and movement patterns of age 1 Greenland cod (Gadus macrocephalus ogac) tracked using acoustic telemetry. Recent advances in tracking technologies such as GPS and acoustic telemetry have led to increasingly large and detailed datasets that present new opportunities for researchers to address fine-scale ecological questions regarding animal movement and spatial distribution. There is a growing demand for home range models that will not only work with massive quantities of autocorrelated data, but that can also exploit the added detail inherent in these high-resolution datasets. Most published home range studies use radio-telemetry or satellite data from terrestrial mammals or avian species, and most studies that evaluate the relative performance of home range models use simulated data. In Chapter 2, I used actual field-collected data from age-1 Greenland cod tracked with acoustic telemetry to evaluate the accuracy and precision of six home range models: minimum convex polygons, kernel densities with plug-in bandwidth selection and the reference bandwidth, adaptive local convex hulls, Brownian bridges, and dynamic Brownian bridges. I then applied the most appropriate model to two years (2010-2012) of tracking data collected from 82 tagged Greenland cod tracked in Newman Sound, Newfoundland, Canada, to determine diel and seasonal differences in habitat use and movement patterns (Chapter 3). Little is known of juvenile cod ecology, so resolving these relationships will provide valuable insight into activity patterns, habitat use, and predator-prey dynamics, while filling a knowledge gap regarding the use of space by age 1 Greenland cod in a coastal nursery habitat. By doing so, my thesis demonstrates an appropriate technique for modelling the spatial use of fish from acoustic telemetry data that can be applied to high-resolution, high-frequency tracking datasets collected from mobile organisms in any environment.
Resumo:
This thesis stems from the project with real-time environmental monitoring company EMSAT Corporation. They were looking for methods to automatically ag spikes and other anomalies in their environmental sensor data streams. The problem presents several challenges: near real-time anomaly detection, absence of labeled data and time-changing data streams. Here, we address this problem using both a statistical parametric approach as well as a non-parametric approach like Kernel Density Estimation (KDE). The main contribution of this thesis is extending the KDE to work more effectively for evolving data streams, particularly in presence of concept drift. To address that, we have developed a framework for integrating Adaptive Windowing (ADWIN) change detection algorithm with KDE. We have tested this approach on several real world data sets and received positive feedback from our industry collaborator. Some results appearing in this thesis have been presented at ECML PKDD 2015 Doctoral Consortium.
Resumo:
Rapid development in industry have contributed to more complex systems that are prone to failure. In applications where the presence of faults may lead to premature failure, fault detection and diagnostics tools are often implemented. The goal of this research is to improve the diagnostic ability of existing FDD methods. Kernel Principal Component Analysis has good fault detection capability, however it can only detect the fault and identify few variables that have contribution on occurrence of fault and thus not precise in diagnosing. Hence, KPCA was used to detect abnormal events and the most contributed variables were taken out for more analysis in diagnosis phase. The diagnosis phase was done in both qualitative and quantitative manner. In qualitative mode, a networked-base causality analysis method was developed to show the causal effect between the most contributing variables in occurrence of the fault. In order to have more quantitative diagnosis, a Bayesian network was constructed to analyze the problem in probabilistic perspective.
Resumo:
We provide new evidence on sea surface temperature (SST) variations and paleoceanographic/paleoenvironmental changes over the past 1500 years for the north Aegean Sea (NE Mediterranean). The reconstructions are based on multiproxy analyses, obtained from the high resolution (decadal to multi-decadal) marine record M2 retrieved from the Athos basin. Reconstructed SSTs show an increase from ca. 850 to 950 AD and from ca. 1100 to 1300 AD. A cooling phase of almost 1.5 °C is observed from ca. 1600 AD to 1700 AD. This seems to have been the starting point of a continuous SST warming trend until the end of the reconstructed period, interrupted by two prominent cooling events at 1832 ± 15 AD and 1995 ± 1 AD. Application of an adaptive Kernel smoothing suggests that the current warming in the reconstructed SSTs of the north Aegean might be unprecedented in the context of the past 1500 years. Internal variability in atmospheric/oceanic circulations systems as well as external forcing as solar radiation and volcanic activity could have affected temperature variations in the north Aegean Sea over the past 1500 years. The marked temperature drop of approximately ~2 °C at 1832 ± 15 yr AD could be related to the 1809 ?D 'unknown' and the 1815 AD Tambora volcanic eruptions. Paleoenvironmental proxy-indices of the M2 record show enhanced riverine/continental inputs in the northern Aegean after ca. 1450 AD. The paleoclimatic evidence derived from the M2 record is combined with a socio-environmental study of the history of the north Aegean region. We show that the cultivation of temperature-sensitive crops, i.e. walnut, vine and olive, co-occurred with stable and warmer temperatures, while its end coincided with a significant episode of cooler temperatures. Periods of agricultural growth in Macedonia coincide with periods of warmer and more stable SSTs, but further exploration is required in order to identify the causal links behind the observed phenomena. The Black Death likely caused major changes in agricultural activity in the north Aegean region, as reflected in the pollen data from land sites of Macedonia and the M2 proxy-reconstructions. Finally, we conclude that the early modern peaks in mountain vegetation in the Rhodope and Macedonia highlands, visible also in the M2 record, were very likely climate-driven.
Resumo:
Software bug analysis is one of the most important activities in Software Quality. The rapid and correct implementation of the necessary repair influence both developers, who must leave the fully functioning software, and users, who need to perform their daily tasks. In this context, if there is an incorrect classification of bugs, there may be unwanted situations. One of the main factors to be assigned bugs in the act of its initial report is severity, which lives up to the urgency of correcting that problem. In this scenario, we identified in datasets with data extracted from five open source systems (Apache, Eclipse, Kernel, Mozilla and Open Office), that there is an irregular distribution of bugs with respect to existing severities, which is an early sign of misclassification. In the dataset analyzed, exists a rate of about 85% bugs being ranked with normal severity. Therefore, this classification rate can have a negative influence on software development context, where the misclassified bug can be allocated to a developer with little experience to solve it and thus the correction of the same may take longer, or even generate a incorrect implementation. Several studies in the literature have disregarded the normal bugs, working only with the portion of bugs considered severe or not severe initially. This work aimed to investigate this portion of the data, with the purpose of identifying whether the normal severity reflects the real impact and urgency, to investigate if there are bugs (initially classified as normal) that could be classified with other severity, and to assess if there are impacts for developers in this context. For this, an automatic classifier was developed, which was based on three algorithms (Näive Bayes, Max Ent and Winnow) to assess if normal severity is correct for the bugs categorized initially with this severity. The algorithms presented accuracy of about 80%, and showed that between 21% and 36% of the bugs should have been classified differently (depending on the algorithm), which represents somewhere between 70,000 and 130,000 bugs of the dataset.
Resumo:
A number of studies in the areas of Biomedical Engineering and Health Sciences have employed machine learning tools to develop methods capable of identifying patterns in different sets of data. Despite its extinction in many countries of the developed world, Hansen’s disease is still a disease that affects a huge part of the population in countries such as India and Brazil. In this context, this research proposes to develop a method that makes it possible to understand in the future how Hansen’s disease affects facial muscles. By using surface electromyography, a system was adapted so as to capture the signals from the largest possible number of facial muscles. We have first looked upon the literature to learn about the way researchers around the globe have been working with diseases that affect the peripheral neural system and how electromyography has acted to contribute to the understanding of these diseases. From these data, a protocol was proposed to collect facial surface electromyographic (sEMG) signals so that these signals presented a high signal to noise ratio. After collecting the signals, we looked for a method that would enable the visualization of this information in a way to make it possible to guarantee that the method used presented satisfactory results. After identifying the method's efficiency, we tried to understand which information could be extracted from the electromyographic signal representing the collected data. Once studies demonstrating which information could contribute to a better understanding of this pathology were not to be found in literature, parameters of amplitude, frequency and entropy were extracted from the signal and a feature selection was made in order to look for the features that better distinguish a healthy individual from a pathological one. After, we tried to identify the classifier that best discriminates distinct individuals from different groups, and also the set of parameters of this classifier that would bring the best outcome. It was identified that the protocol proposed in this study and the adaptation with disposable electrodes available in market proved their effectiveness and capability of being used in different studies whose intention is to collect data from facial electromyography. The feature selection algorithm also showed that not all of the features extracted from the signal are significant for data classification, with some more relevant than others. The classifier Support Vector Machine (SVM) proved itself efficient when the adequate Kernel function was used with the muscle from which information was to be extracted. Each investigated muscle presented different results when the classifier used linear, radial and polynomial kernel functions. Even though we have focused on Hansen’s disease, the method applied here can be used to study facial electromyography in other pathologies.
Resumo:
A number of studies in the areas of Biomedical Engineering and Health Sciences have employed machine learning tools to develop methods capable of identifying patterns in different sets of data. Despite its extinction in many countries of the developed world, Hansen’s disease is still a disease that affects a huge part of the population in countries such as India and Brazil. In this context, this research proposes to develop a method that makes it possible to understand in the future how Hansen’s disease affects facial muscles. By using surface electromyography, a system was adapted so as to capture the signals from the largest possible number of facial muscles. We have first looked upon the literature to learn about the way researchers around the globe have been working with diseases that affect the peripheral neural system and how electromyography has acted to contribute to the understanding of these diseases. From these data, a protocol was proposed to collect facial surface electromyographic (sEMG) signals so that these signals presented a high signal to noise ratio. After collecting the signals, we looked for a method that would enable the visualization of this information in a way to make it possible to guarantee that the method used presented satisfactory results. After identifying the method's efficiency, we tried to understand which information could be extracted from the electromyographic signal representing the collected data. Once studies demonstrating which information could contribute to a better understanding of this pathology were not to be found in literature, parameters of amplitude, frequency and entropy were extracted from the signal and a feature selection was made in order to look for the features that better distinguish a healthy individual from a pathological one. After, we tried to identify the classifier that best discriminates distinct individuals from different groups, and also the set of parameters of this classifier that would bring the best outcome. It was identified that the protocol proposed in this study and the adaptation with disposable electrodes available in market proved their effectiveness and capability of being used in different studies whose intention is to collect data from facial electromyography. The feature selection algorithm also showed that not all of the features extracted from the signal are significant for data classification, with some more relevant than others. The classifier Support Vector Machine (SVM) proved itself efficient when the adequate Kernel function was used with the muscle from which information was to be extracted. Each investigated muscle presented different results when the classifier used linear, radial and polynomial kernel functions. Even though we have focused on Hansen’s disease, the method applied here can be used to study facial electromyography in other pathologies.
Resumo:
En el mundo de la simulación existen varios tipos de sistemas reales, entre los que se encuentran los sistemas de eventos discretos. Para poder simular estos sistemas se pueden utilizar, entre otras, herramientas basadas en el formalismo DEVS (Discrete EVents system Specification), como la utilizada en este proyecto: xDEVS. La simulación posee una importancia muy elevada en campos como la educación y la ciencia, y en ocasiones es necesario incluir datos del medio físico o sacar información al exterior del simulador. Por ello es necesario contar con herramientas que puedan realizar simulaciones utilizando sensores, actuadores, circuitos externos, etc., o lo que es lo mismo, que puedan realizar co-simulaciones entre software y hardware. De esta forma se puede facilitar el desarrollo de sistemas por medio de modelado y simulación, pudiendo extraer el hardware gradualmente y analizar los resultados en cada etapa. Este proyecto es de carácter incremental, y trata de extender la funcionalidad de la plataforma xDEVS para poder realizar co-simulaciones entre hardware y software sobre una Raspberry Pi. Para ello se van a utilizar circuitos lógicos como hardware externo y se enlazarán al simulador a través de ficheros de dispositivo, gestionados por módulos del kernel de Linux. Como caso de estudio se desarrolla la co-simulación entre hardware y software completa de un ascensor de siete plantas para mostrar el uso y funcionamiento en xDEVS, extrayendo los circuitos integrados de uno en uno.