855 results for Statistics|Electrical engineering|Computer science
                                
Abstract:
The bispectrum and third-order moment can be viewed as equivalent tools for testing for the presence of nonlinearity in stationary time series. This is because the bispectrum is the Fourier transform of the third-order moment. An advantage of the bispectrum is that its estimator comprises terms that are asymptotically independent at distinct bifrequencies under the null hypothesis of linearity. An advantage of the third-order moment is that its values in any subset of joint lags can be used in the test, whereas when using the bispectrum the entire (or truncated) third-order moment is required to construct the Fourier transform. In this paper, we propose a test for nonlinearity based upon the estimated third-order moment. We use the phase scrambling bootstrap method to give a nonparametric estimate of the variance of our test statistic under the null hypothesis. Using a simulation study, we demonstrate that the test obtains its target significance level, with large power, when compared to an existing standard parametric test that uses the bispectrum. Further we show how the proposed test can be used to identify the source of nonlinearity due to interactions at specific frequencies. We also investigate implications for heuristic diagnosis of nonstationarity.
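As an illustration of the quantity this abstract builds its test on, here is a minimal sketch of a sample third-order moment at a pair of joint lags. The function name, the restriction to non-negative lags and the division by n are assumptions made for the example; the abstract does not state the exact estimator used in the paper.

```python
def third_order_moment(x, r, s):
    """Biased sample third-order moment of x at joint lags (r, s).

    Assumption: r and s are non-negative and smaller than len(x).
    """
    n = len(x)
    mean = sum(x) / n
    z = [v - mean for v in x]      # remove the sample mean first
    upper = n - max(r, s)          # keep t, t + r and t + s inside the series
    total = sum(z[t] * z[t + r] * z[t + s] for t in range(upper))
    return total / n               # divide by n, as in the biased autocovariance


series = [0.0, 0.0, 3.0]
print(third_order_moment(series, 0, 0))
print(third_order_moment(series, 1, 0))
```

A test of the kind described would evaluate such estimates over a chosen subset of lags (r, s) and compare them with their variability under the null hypothesis of linearity.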
                                
Abstract:
All signals that appear to be periodic have some sort of variability from period to period, regardless of how stable they appear to be in a data plot. A true sinusoidal time series is a deterministic function of time that never changes and thus has zero bandwidth around the sinusoid's frequency. A zero bandwidth is impossible in nature since all signals have some intrinsic variability over time. Deterministic sinusoids are used to model cycles as a mathematical convenience. Hinich [IEEE J. Oceanic Eng. 25 (2) (2000) 256-261] introduced a parametric statistical model, called the randomly modulated periodicity (RMP), that allows one to capture the intrinsic variability of a cycle. As with a deterministic periodic signal, the RMP can have a number of harmonics. The likelihood ratio test for this model when the amplitudes and phases are known is given in [M.J. Hinich, Signal Processing 83 (2003) 1349-1352]. A method for detecting an RMP whose amplitudes and phases are unknown random processes, observed together with a stationary noise process, is addressed in this paper. The only assumption on the additive noise is that it has finite dependence and finite moments. Using simulations based on a simple RMP model, we show a case where the new method can detect the signal when the signal is not detectable in a standard waterfall spectrogram display. (c) 2005 Elsevier B.V. All rights reserved.
                                
Abstract:
A set of techniques referred to as circular statistics has been developed for the analysis of directional and orientational data. The unit of measure for such data is angular (usually in either degrees or radians), and the statistical distributions underlying the techniques are characterised by their cyclic nature; for example, angles of 359.9 degrees are considered close to angles of 0 degrees. In this paper, we assert that such approaches can be easily adapted to analyse time-of-day and time-of-week data, and in particular daily cycles in the numbers of incidents reported to the police. We begin the paper by describing circular statistics. We then discuss how these may be modified, and demonstrate the approach with some examples for reported incidents in the Cardiff area of Wales. (c) 2005 Elsevier Ltd. All rights reserved.
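To make the time-of-day adaptation concrete, the sketch below maps clock times onto angles and computes the standard circular mean and mean resultant length. The function name and the 24-hour mapping are assumptions for illustration; the paper's own modifications are not described in the abstract.

```python
import math


def circular_mean_hour(hours):
    """Circular mean of times of day, given in hours on a 24-hour clock.

    Returns (mean_hour, resultant_length). A resultant length near 1 means the
    times are tightly clustered; a value near 0 means they are spread around
    the clock.
    """
    angles = [2 * math.pi * h / 24 for h in hours]
    s = sum(math.sin(a) for a in angles) / len(angles)
    c = sum(math.cos(a) for a in angles) / len(angles)
    mean_angle = math.atan2(s, c) % (2 * math.pi)
    return mean_angle * 24 / (2 * math.pi), math.hypot(s, c)


# Incidents at 23:00 and 01:00: the arithmetic mean is 12:00 (noon),
# while the circular mean falls at midnight (up to rounding, 0 or 24).
print(circular_mean_hour([23.0, 1.0]))
```

The same mapping works for time-of-week data by replacing 24 with 168 hours.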
                                
Abstract:
Aston University offers a Foundation year in Engineering and Applied Science. The purpose of this programme is to provide people with the skills and knowledge required to enrol on an undergraduate programme in Engineering and Applied Science. It is acknowledged that there are many misconceptions as to what engineering is. This is further compounded by the lack of knowledge of the different engineering disciplines among both pre-university students and careers teachers [1]. In order to address this lack of knowledge, Aston University offers a unique programme where students are given the opportunity to have a "taste" of four Engineering Disciplines: Mechanical Engineering, Electrical Engineering, Chemical Engineering and Computer Science. Alongside these "taster" sessions, the students study a Professional Skills module where they are expected to keep a portfolio of skills. In their portfolios they comment on their strengths and weaknesses in relation to six skill areas: independent enquirer, self-manager, effective participator, creative thinker, reflective learner and team worker. The portfolio gives them the opportunity to perform a self-skills audit and identify areas where they have strengths and areas which require work if they are to become competent professional engineers. They also have talks from engineers who discuss with them their careers and the different aspects of engineering. The purpose of the "taster" sessions, portfolio and talks is to encourage the students to critically examine their career aspirations and choose an engineering undergraduate programme which best suits their ambitions and potential skills. The feedback from students has been very positive. The "taster" sessions have enabled them to make an informed choice as to the undergraduate programme they would like to study. 
The programme has given them the technical skills and knowledge to enrol on an undergraduate programme and also the skills and knowledge to be a successful learner.
                                
Abstract:
The need to provide computers with the ability to distinguish the affective state of their users is a major requirement for the practical implementation of affective computing concepts. This dissertation proposes the application of signal processing methods to physiological signals to extract from them features that can be processed by learning pattern recognition systems to provide cues about a person's affective state. In particular, combining physiological information sensed from a user's left hand in a non-invasive way with the pupil diameter information from an eye-tracking system may provide a computer with an awareness of its user's affective responses in the course of human-computer interactions. In this study an integrated hardware-software setup was developed to achieve automatic assessment of the affective status of a computer user. A computer-based "Paced Stroop Test" was designed as a stimulus to elicit emotional stress in the subject during the experiment. Four signals, the Galvanic Skin Response (GSR), the Blood Volume Pulse (BVP), the Skin Temperature (ST) and the Pupil Diameter (PD), were monitored and analyzed to differentiate affective states in the user. Several signal processing techniques were applied to the collected signals to extract their most relevant features. These features were analyzed with learning classification systems to accomplish the affective state identification. Three learning algorithms, Naïve Bayes, Decision Tree and Support Vector Machine, were applied to this identification process and their levels of classification accuracy were compared. The results achieved indicate that the physiological signals monitored do, in fact, have a strong correlation with the changes in the emotional states of the experimental subjects. These results also revealed that the inclusion of pupil diameter information significantly improved the performance of the emotion recognition system.
                                
Abstract:
Subspaces and manifolds are two powerful models for high dimensional signals. Subspaces model linear correlation and are a good fit to signals generated by physical systems, such as frontal images of human faces and multiple sources impinging at an antenna array. Manifolds model sources that are not linearly correlated, but where signals are determined by a small number of parameters. Examples are images of human faces under different poses or expressions, and handwritten digits with varying styles. However, there will always be some degree of model mismatch between the subspace or manifold model and the true statistics of the source. This dissertation exploits subspace and manifold models as prior information in various signal processing and machine learning tasks.
A near-low-rank Gaussian mixture model measures proximity to a union of linear or affine subspaces. This simple model can effectively capture the signal distribution when each class is near a subspace. This dissertation studies how the pairwise geometry between these subspaces affects classification performance. When model mismatch is vanishingly small, the probability of misclassification is determined by the product of the sines of the principal angles between subspaces. When the model mismatch is more significant, the probability of misclassification is determined by the sum of the squares of the sines of the principal angles. Reliability of classification is derived in terms of the distribution of signal energy across principal vectors. Larger principal angles lead to smaller classification error, motivating a linear transform that optimizes principal angles. This linear transformation, termed TRAIT, also preserves some specific features in each class, being complementary to a recently developed Low Rank Transform (LRT). Moreover, when the model mismatch is more significant, TRAIT shows superior performance compared to LRT.
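For reference, the two geometric quantities named in the paragraph above can be written out as follows. The notation is an assumption introduced here: $U$ and $V$ are matrices with orthonormal columns spanning two $d$-dimensional class subspaces, and $\sigma_i(\cdot)$ denotes the $i$-th singular value. The abstract states only which quantity governs the misclassification probability in each regime, not the exact bounds, so none are given.

```latex
% Principal angles between span(U) and span(V), with U^T U = V^T V = I_d
\cos\theta_i = \sigma_i\!\left(U^{\mathsf T} V\right), \qquad
0 \le \theta_1 \le \dots \le \theta_d \le \tfrac{\pi}{2}

% Quantity governing misclassification when model mismatch is vanishingly small
\prod_{i=1}^{d} \sin\theta_i

% Quantity governing misclassification when model mismatch is more significant
\sum_{i=1}^{d} \sin^{2}\theta_i
```

Both quantities grow as the principal angles grow, which is consistent with the statement that larger principal angles lead to smaller classification error.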
The manifold model enforces a constraint on the freedom of data variation. Learning features that are robust to data variation is very important, especially when the size of the training set is small. A learning machine with a large number of parameters, e.g., a deep neural network, can describe a very complicated data distribution well. However, it is also more likely to be sensitive to small perturbations of the data, and to suffer from degraded performance when generalizing to unseen (test) data.
From the perspective of complexity of function classes, such a learning machine has a huge capacity (complexity), which tends to overfit. The manifold model provides us with a way of regularizing the learning machine, so as to reduce the generalization error and thereby mitigate overfitting. Two different approaches to preventing overfitting are proposed, one from the perspective of data variation, the other from capacity/complexity control. In the first approach, the learning machine is encouraged to make decisions that vary smoothly for data points in local neighborhoods on the manifold. In the second approach, a graph adjacency matrix is derived for the manifold, and the learned features are encouraged to be aligned with the principal components of this adjacency matrix. Experimental results on benchmark datasets are presented, showing a clear advantage of the proposed approaches when the training set is small.
Stochastic optimization makes it possible to track a slowly varying subspace underlying streaming data. By approximating local neighborhoods using affine subspaces, a slowly varying manifold can be efficiently tracked as well, even with corrupted and noisy data. The more local neighborhoods are used, the better the approximation, but the higher the computational complexity. A multiscale approximation scheme is proposed, where the local approximating subspaces are organized in a tree structure. Splitting and merging of the tree nodes then allows efficient control of the number of neighborhoods. Deviation (of each datum) from the learned model is estimated, yielding a series of statistics for anomaly detection. This framework extends the classical changepoint detection technique, which only works for one-dimensional signals. Simulations and experiments highlight the robustness and efficacy of the proposed approach in detecting an abrupt change in an otherwise slowly varying low-dimensional manifold.
                                
Abstract:
Stealthy attackers move patiently through computer networks, taking days, weeks or months to accomplish their objectives in order to avoid detection. As networks scale up in size and speed, monitoring for such attack attempts is increasingly a challenge. This paper presents an efficient monitoring technique for stealthy attacks. It investigates the feasibility of the proposed method under a number of different test cases and examines how the design of the network affects detection. A methodical way of tracing anonymous stealthy activities to their approximate sources is also presented. Bayesian fusion, along with traffic sampling, is employed as a data reduction method. The proposed method is able to monitor stealthy activities using sampling rates of 10-20% without degrading the quality of detection.
                                
Abstract:
Finding rare events in multidimensional data is an important detection problem that has applications in many fields, such as risk estimation in the insurance industry, finance, flood prediction, medical diagnosis, quality assurance, security, or safety in transportation. The occurrence of such anomalies is so infrequent that there is usually not enough training data to learn an accurate statistical model of the anomaly class. In some cases, such events may have never been observed, so the only information that is available is a set of normal samples and an assumed pairwise similarity function. Such a metric may only be known up to a certain number of unspecified parameters, which would either need to be learned from training data, or fixed by a domain expert. Sometimes, the anomalous condition may be formulated algebraically, such as a measure exceeding a predefined threshold, but nuisance variables may complicate the estimation of such a measure. Change detection methods used in time series analysis are not easily extendable to the multidimensional case, where discontinuities are not localized to a single point. On the other hand, in higher dimensions, data exhibits more complex interdependencies, and there is redundancy that could be exploited to adaptively model the normal data. In the first part of this dissertation, we review the theoretical framework for anomaly detection in images and previous anomaly detection work done in the context of crack detection and detection of anomalous components in railway tracks. In the second part, we propose new anomaly detection algorithms. The fact that curvilinear discontinuities in images are sparse with respect to the frame of shearlets allows us to pose this anomaly detection problem as basis pursuit optimization. Therefore, we pose the problem of detecting curvilinear anomalies in noisy textured images as a blind source separation problem under sparsity constraints, and propose an iterative shrinkage algorithm to solve it. 
Taking advantage of the parallel nature of this algorithm, we describe how this method can be accelerated using graphical processing units (GPU). Then, we propose a new method for finding defective components on railway tracks using cameras mounted on a train. We describe how to extract features and use a combination of classifiers to solve this problem. Then, we scale anomaly detection to bigger datasets with complex interdependencies. We show that the anomaly detection problem naturally fits in the multitask learning framework. The first task consists of learning a compact representation of the good samples, while the second task consists of learning the anomaly detector. Using deep convolutional neural networks, we show that it is possible to train a deep model with a limited number of anomalous examples. In sequential detection problems, the presence of time-variant nuisance parameters affects the detection performance. In the last part of this dissertation, we present a method for adaptively estimating the threshold of sequential detectors using Extreme Value Theory in a Bayesian framework. Finally, conclusions on the results obtained are provided, followed by a discussion of possible future work.
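The iterative shrinkage algorithm mentioned in this abstract is not specified further. As a rough illustration of the shrinkage step that such algorithms commonly apply to transform coefficients, here is a minimal soft-thresholding sketch. The function name and the threshold parameter `lam` are assumptions, and the shearlet transform itself is not shown.

```python
def soft_threshold(coeffs, lam):
    """Shrink each coefficient toward zero by lam.

    Coefficients with magnitude at most lam are set to zero, which is what
    promotes sparsity in the transform domain.
    """
    out = []
    for c in coeffs:
        if c > lam:
            out.append(c - lam)
        elif c < -lam:
            out.append(c + lam)
        else:
            out.append(0.0)
    return out


print(soft_threshold([3.0, -0.5, -2.0, 0.2], 1.0))
```

Each coefficient is processed independently of the others, which is one reason shrinkage-based methods map well onto GPUs.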
                                
Abstract:
Thanks to recent advances in molecular biology, allied to an ever increasing amount of experimental data, the functional state of thousands of genes can now be extracted simultaneously by using methods such as cDNA microarrays and RNA-Seq. Particularly important related investigations are the modeling and identification of gene regulatory networks from expression data sets. Such knowledge is fundamental for many applications, such as disease treatment, therapeutic intervention strategies and drug design, as well as for planning new high-throughput experiments. Methods have been developed for gene network modeling and identification from expression profiles. However, an important open problem is how to validate such approaches and their results. This work presents an objective approach for validation of gene network modeling and identification which comprises the following three main aspects: (1) Artificial Gene Networks (AGNs) model generation through theoretical models of complex networks, which is used to simulate temporal expression data; (2) a computational method for gene network identification from the simulated data, which is founded on a feature selection approach where a target gene is fixed and the expression profile is observed for all other genes in order to identify a relevant subset of predictors; and (3) validation of the identified AGN-based network through comparison with the original network. The proposed framework allows several types of AGNs to be generated and used in order to simulate temporal expression data. The results of the network identification method can then be compared to the original network in order to estimate its properties and accuracy. Some of the most important theoretical models of complex networks have been assessed: the uniformly-random Erdos-Renyi (ER), the small-world Watts-Strogatz (WS), the scale-free Barabasi-Albert (BA), and geographical networks (GG). 
The experimental results indicate that the inference method was sensitive to variation in the average degree k, with its network recovery rate decreasing as k increased. The signal size was important for the accuracy of the network identification rate, with very good results obtained for small expression profiles. However, the adopted inference method was not sensitive enough to recognize distinct structures of interaction among genes, presenting a similar behavior when applied to different network topologies. In summary, the proposed framework, though simple, was adequate for the validation of the inferred networks by identifying some properties of the evaluated method, and it can be extended to other inference methods.
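The validation step in this abstract, comparing an identified network with the original one, can be sketched as follows. Everything here is an illustrative assumption: the graph is treated as undirected, the generator is a plain Erdos-Renyi G(n, p) model, and precision and recall stand in for whatever recovery measure the authors actually used.

```python
import random


def erdos_renyi_edges(n, p, rng):
    """Undirected Erdos-Renyi graph G(n, p) as a set of (i, j) pairs, i < j."""
    return {(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p}


def recovery_scores(true_edges, inferred_edges):
    """Precision and recall of the inferred edges against the original network."""
    hits = len(true_edges & inferred_edges)
    precision = hits / len(inferred_edges) if inferred_edges else 0.0
    recall = hits / len(true_edges) if true_edges else 0.0
    return precision, recall


rng = random.Random(0)
original = erdos_renyi_edges(20, 0.1, rng)
# Stand-in for the output of an inference method: here, a second random graph.
inferred = erdos_renyi_edges(20, 0.1, rng)
print(recovery_scores(original, inferred))
```

Repeating such a comparison over networks with different average degree or topology is the kind of experiment the abstract summarises.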
                                
Abstract:
An implementation of a computational tool to generate new summaries from new source texts is presented, by means of the connectionist approach (artificial neural networks). Among other contributions that this work intends to bring to natural language processing research, the use of a more biologically plausible connectionist architecture and training for automatic summarization is emphasized. The choice relies on the expectation that it may bring an increase in computational efficiency when compared to the so-called biologically implausible algorithms.
                                
Abstract:
An experimental study of Polarization Dependent Loss (PDL) is performed in an Optical Recirculating Loop (RCL). The RCL makes it possible to simulate transmission through various optical links at low cost, using just one optical fiber spool, one in-line amplifier, and some optical filters and devices. Due to its statistical nature, the total PDL in a recirculating loop differs from the simple sum of the PDL of each element of the loop, because the alignment of the PDL elements varies with time, depending on environmental conditions such as fiber stress and temperature. In this paper, theoretical studies are also performed using the formalism of Jones and Mueller matrices in order to represent the different optical elements in the recirculating loop. The PDL must be correctly characterized in order to properly evaluate its impact on the performance of next-generation DWDM systems. A comparison of theoretical and experimental results shows that a depolarization of 7% occurs in the experimental setup, probably caused by the optical amplifier due to the depolarized nature of the amplified spontaneous emission.
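The remark that total PDL is not the simple sum of the element PDLs can be illustrated with a small Jones-matrix calculation. This is a sketch under simplifying assumptions that are not taken from the paper: each element is modelled as an ideal partial polarizer `diag(1, a)`, the two elements are separated by a pure rotation by angle `theta` with no retardance, and PDL is defined as 10 log10 of the ratio of maximum to minimum power transmission.

```python
import math


def combined_pdl_db(pdl1_db, pdl2_db, theta):
    """PDL in dB of two partial polarizers whose axes differ by angle theta."""
    a1 = 10 ** (-pdl1_db / 20)   # weak-axis amplitude transmission, element 1
    a2 = 10 ** (-pdl2_db / 20)   # weak-axis amplitude transmission, element 2
    c, s = math.cos(theta), math.sin(theta)
    # Entries of G = M^T M for M = diag(1, a2) @ R(theta) @ diag(1, a1).
    g11 = c * c + (a2 * s) ** 2
    g22 = a1 * a1 * (s * s + (a2 * c) ** 2)
    g12 = a1 * s * c * (a2 * a2 - 1.0)
    # Eigenvalues of the symmetric 2x2 matrix G are the extreme transmissions.
    half_gap = math.sqrt((g11 - g22) ** 2 + 4 * g12 * g12) / 2
    mid = (g11 + g22) / 2
    return 10 * math.log10((mid + half_gap) / (mid - half_gap))


for degrees in (0, 45, 90):
    print(degrees, combined_pdl_db(1.0, 1.0, math.radians(degrees)))
```

With aligned axes the two 1 dB elements add to 2 dB, with crossed axes they cancel, and intermediate alignments give values in between, which is why a time-varying alignment makes the loop's total PDL a statistical quantity.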
                                
Abstract:
A fuzzy control strategy for voltage regulation in electric power distribution systems is introduced in this article. This real-time controller would act on power transformers equipped with under-load tap changers. The fuzzy system was employed to turn the voltage-control relays into adaptive devices. The scope of the present study has been limited to the power distribution substation, and both the voltage measurements and control actions are carried out on the secondary bus. The capacity of fuzzy systems to handle approximate data, together with their unique ability to interpret qualitative information, makes it possible to design voltage control strategies that satisfy both the requirements of the Brazilian regulatory bodies and the real concerns of the electric power distribution companies. A prototype based on the fuzzy control strategy proposed in this paper has also been implemented for validation purposes, and its experimental results were highly satisfactory.
                                
Abstract:
This paper presents an approach for allocating active transmission losses among the agents of the system. The approach uses the primal and dual variable information of the Optimal Power Flow in the loss allocation strategy. The allocation coefficients are determined via Lagrange multipliers. The paper emphasizes the need to consider the operational constraints and parameters of the system in the problem solution. An example for a 3-bus system is presented in detail, as well as a comparative test with the main allocation methods. Case studies on the IEEE 14-bus system are carried out to verify the influence of the constraints and parameters of the system on the loss allocation.
                                
Abstract:
This paper describes the modeling of a weed infestation risk inference system that implements a collaborative inference scheme based on rules extracted from two Bayesian network classifiers. The first Bayesian classifier infers a categorical variable value for the weed-crop competitiveness, using as input categorical variables for the total density of weeds and the corresponding proportions of narrow and broad-leaved weeds. The inferred categorical variable values for the weed-crop competitiveness, along with three other categorical variables extracted from estimated maps for the weed seed production and weed coverage, are then used as input for a second Bayesian network classifier to infer categorical variable values for the risk of infestation. Weed biomass and yield loss data samples are used to learn the probability relationship among the nodes of the first and second Bayesian classifiers in a supervised fashion, respectively. For comparison purposes, two types of Bayesian network structures are considered, namely an expert-based Bayesian classifier and a naive Bayes classifier. The inference system focuses on knowledge interpretation by translating a Bayesian classifier into a set of classification rules. The results obtained for the risk inference in a corn-crop field are presented and discussed. (C) 2009 Elsevier Ltd. All rights reserved.
                                
Abstract:
This paper presents a small-area CMOS current-steering segmented digital-to-analog converter (DAC) design intended for RF transmitters in 2.45 GHz Bluetooth applications. The current-source design strategy is based on an iterative scheme whose variables are adjusted in a simple way, minimizing the area and the power consumption, and meeting the design specifications. A theoretical analysis of static-dynamic requirements and a new layout strategy to attain a small-area current-steering DAC are included. The DAC was designed and implemented in 0.35 μm CMOS technology, requiring an active area of just 200 μm × 200 μm. Experimental results, with a full-scale output current of 700 μA and a 3.3 V power supply, showed a spurious-free dynamic range of 58 dB for a 1 MHz output sine wave and sampling frequency of 50 MHz, with differential and integral nonlinearity of 0.3 and 0.37 LSB, respectively.
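For readers unfamiliar with the static figures quoted at the end of this abstract, the sketch below shows one common way to compute differential and integral nonlinearity from measured output levels. The endpoint-fit definition of the LSB and the function name are assumptions; the abstract does not say which definition the authors used, and the sample levels are made up for illustration.

```python
def dnl_inl(levels):
    """DNL and INL in LSB from measured DAC outputs, one level per input code.

    Assumes an endpoint fit: 1 LSB is the average step between the first and
    last measured levels.
    """
    n = len(levels)
    lsb = (levels[-1] - levels[0]) / (n - 1)
    # DNL: deviation of each actual step from the ideal 1 LSB step.
    dnl = [(levels[k + 1] - levels[k]) / lsb - 1.0 for k in range(n - 1)]
    # INL: deviation of each level from the straight line through the endpoints.
    inl = [(levels[k] - levels[0]) / lsb - k for k in range(n)]
    return dnl, inl


# Hypothetical 2-bit DAC whose third level sits half an LSB too high.
dnl, inl = dnl_inl([0.0, 1.0, 2.5, 3.0])
print(max(abs(d) for d in dnl), max(abs(i) for i in inl))
```

Single figures such as those quoted in the abstract typically correspond to the largest absolute DNL and INL over all codes.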
 
                    