53 results for Convolutional Neural Network (CNN)
Abstract:
One major drawback of coherent optical orthogonal frequency-division multiplexing (CO-OFDM) that remains unsolved is its vulnerability to nonlinear fiber effects due to its high peak-to-average power ratio. Several digital signal processing techniques have been investigated for the compensation of fiber nonlinearities, e.g., digital back-propagation, nonlinear pre- and post-compensation, and nonlinear equalizers (NLEs) based on the inverse Volterra-series transfer function (IVSTF). Alternatively, nonlinearities can be mitigated using nonlinear decision classifiers such as artificial neural networks (ANNs) based on a multilayer perceptron. In this paper, an ANN-NLE is presented for a 16QAM CO-OFDM system. The capability of the proposed approach to compensate the fiber nonlinearities is numerically demonstrated for up to 100-Gb/s and over 1000 km and compared to the benchmark IVSTF-NLE. Results show that in terms of Q-factor, for 100-Gb/s at 1000 km of transmission, ANN-NLE outperforms linear equalization and IVSTF-NLE by 3.2 dB and 1 dB, respectively.
Abstract:
Lifelong surveillance is not cost-effective after endovascular aneurysm repair (EVAR), but is required to detect aortic complications which are fatal if untreated (type 1/3 endoleak, sac expansion, device migration). Aneurysm morphology determines the probability of aortic complications and therefore the need for surveillance, but existing analyses have proven incapable of identifying patients at sufficiently low risk to justify abandoning surveillance. This study aimed to improve the prediction of aortic complications through the application of machine-learning techniques. Patients undergoing EVAR at 2 centres between 2004 and 2010 were studied. Aneurysm morphology had previously been studied to derive the SGVI Score for predicting aortic complications. Bayesian Neural Networks were designed using the same data, to dichotomise patients into groups at low or high risk of aortic complications. Network training was performed only on patients treated at centre 1. External validation was performed by assessing network performance independently of network training, on patients treated at centre 2. Discrimination was assessed by Kaplan-Meier analysis to compare aortic complications in predicted low-risk versus predicted high-risk patients. 761 patients aged 75 ± 7 years underwent EVAR in 2 centres. Mean follow-up was 36 ± 20 months. Neural networks were created incorporating neck angulation/length/diameter/volume; AAA diameter/area/volume/length/tortuosity; and common iliac tortuosity/diameter. A 19-feature network predicted aortic complications with excellent discrimination and external validation (5-year freedom from aortic complications in predicted low-risk vs predicted high-risk patients: 97.9% vs. 63%; p < 0.0001). A Bayesian Neural-Network algorithm can identify patients in whom it may be safe to abandon surveillance after EVAR. This proposal requires prospective study.
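The Kaplan-Meier analysis mentioned in this abstract can be illustrated with a minimal sketch. The function name `kaplan_meier` and the toy follow-up data are hypothetical and are not taken from the study; the code only shows how "freedom from complication" is estimated when some patients are censored.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event was observed.

    times  -- follow-up time for each patient (e.g. months)
    events -- 1 if a complication was observed at that time, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    # Walk through the distinct follow-up times in increasing order.
    for t in sorted(set(times)):
        observed = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)
        if observed > 0:
            # Product-limit update: fraction of at-risk patients who stay event-free.
            survival *= 1.0 - observed / at_risk
            curve.append((t, survival))
        # Patients with an event or censored at t leave the risk set.
        at_risk -= leaving
    return curve


# Hypothetical toy data: three patients, the second one censored at month 2.
toy_curve = kaplan_meier([1, 2, 3], [1, 0, 1])
```

Computing one such curve for the predicted low-risk group and one for the predicted high-risk group gives the kind of comparison the abstract reports.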
Abstract:
We propose a novel low-complexity artificial neural network (ANN)-based nonlinear equalizer (NLE) for coherent optical orthogonal frequency-division multiplexing (CO-OFDM) and compare it with the recent inverse Volterra-series transfer function (IVSTF)-based NLE over up to 1000 km of uncompensated links. Demonstration of ANN-NLE at 80-Gb/s CO-OFDM using 16-quadrature amplitude modulation reveals a Q-factor improvement after 1000-km transmission of 3 and 1 dB with respect to the linear equalization and IVSTF-NLE, respectively.
Abstract:
A novel artificial neural network (ANN)-based nonlinear equalizer (NLE) of low complexity is demonstrated for 40-Gb/s CO-OFDM at 2000 km, revealing ∼1.5 dB enhancement in Q-factor compared to inverse Volterra-series transfer function based NLE.
Abstract:
In product reviews, the distribution of polarity ratings over reviews written by different users or about different products is often skewed in the real world. As such, incorporating user and product information would be helpful for the task of sentiment classification of reviews. However, existing approaches ignore the temporal nature of reviews posted by the same user or written about the same product. We argue that the temporal relations of reviews might be useful for learning user and product embeddings, and thus propose employing a sequence model to embed these temporal relations into user and product representations so as to improve the performance of document-level sentiment analysis. Specifically, we first learn a distributed representation of each review with a one-dimensional convolutional neural network. Then, taking these representations as pretrained vectors, we use a recurrent neural network with gated recurrent units to learn distributed representations of users and products. Finally, we feed the user, product and review representations into a machine learning classifier for sentiment classification. Our approach has been evaluated on three large-scale review datasets from IMDB and Yelp. Experimental results show that: (1) sequence modeling for the purposes of distributed user and product representation learning can improve the performance of document-level sentiment classification; (2) the proposed approach achieves state-of-the-art results on these benchmark datasets.
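As a rough illustration of the first step described above (a one-dimensional convolution over the words of a review), the following sketch computes one feature per filter using ReLU and max-over-time pooling. The pooling choice, the function name `conv1d_max_pool` and the toy numbers are assumptions made for illustration; the abstract does not specify the architecture.

```python
def conv1d_max_pool(embeddings, filters, bias):
    """One convolution layer followed by ReLU and max-over-time pooling.

    embeddings -- list of word vectors, one per token (T vectors of size D)
    filters    -- list of filters; each filter is a list of K weight vectors of size D
    bias       -- one bias value per filter
    Returns a list with one pooled feature per filter.
    """
    features = []
    for weights, b in zip(filters, bias):
        k = len(weights)
        best = 0.0  # ReLU output is never below zero
        # Slide the filter over every window of k consecutive tokens.
        for start in range(len(embeddings) - k + 1):
            window = embeddings[start:start + k]
            activation = b + sum(
                w * x
                for w_vec, x_vec in zip(weights, window)
                for w, x in zip(w_vec, x_vec)
            )
            best = max(best, activation)
        features.append(best)
    return features


# Hypothetical toy review: three tokens with 2-dimensional embeddings.
toy_review = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_filters = [
    [[1.0, 0.0], [0.0, 1.0]],      # responds to the pattern (token A, token B)
    [[-1.0, -1.0], [-1.0, -1.0]],  # never activates on non-negative inputs
]
review_vector = conv1d_max_pool(toy_review, toy_filters, [0.0, 0.0])
```

In the approach described, vectors of this kind would then be fed, in temporal order, to the recurrent network that builds the user and product representations.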
Abstract:
Intersubjectivity is an important concept in psychology and sociology. It refers to sharing conceptualizations through social interactions in a community and using such shared conceptualizations as a resource to interpret things that happen in everyday life. In this work, we use intersubjectivity as the basis to model shared stance and subjectivity for sentiment analysis. We construct an intersubjectivity network which links review writers, the terms they use, and the polarities of those terms. Based on this network model, we propose a method to learn writer embeddings which are subsequently incorporated into a convolutional neural network for sentiment analysis. Evaluations on the IMDB, Yelp 2013 and Yelp 2014 datasets show that the proposed approach achieves state-of-the-art performance.
Abstract:
In recent years, learning word vector representations has attracted much interest in Natural Language Processing. Word representations or embeddings learned using unsupervised methods help address the problem of traditional bag-of-words approaches, which fail to capture contextual semantics. In this paper we go beyond vector representations at the word level and propose a novel framework that learns higher-level feature representations of n-grams, phrases and sentences using a deep neural network built from stacked Convolutional Restricted Boltzmann Machines (CRBMs). These representations have been shown to map syntactically and semantically related n-grams to nearby locations in the hidden feature space. We have additionally experimented with incorporating these higher-level features into supervised classifier training for two sentiment analysis tasks: subjectivity classification and sentiment classification. Our results demonstrate the success of the proposed framework, with a 4% improvement in accuracy for subjectivity classification and improved results for sentiment classification over models trained without our higher-level features.
Abstract:
This paper presents results from the first use of neural networks for the real-time feedback control of high-temperature plasmas in a Tokamak fusion experiment. The Tokamak is currently the principal experimental device for research into the magnetic confinement approach to controlled fusion. In the Tokamak, hydrogen plasmas, at temperatures of up to 100 million K, are confined by strong magnetic fields. Accurate control of the position and shape of the plasma boundary requires real-time feedback control of the magnetic field structure on a time-scale of a few tens of microseconds. Software simulations have demonstrated that a neural network approach can give significantly better performance than the linear technique currently used on most Tokamak experiments. The practical application of the neural network approach requires high-speed hardware, for which a fully parallel implementation of the multi-layer perceptron, using a hybrid of digital and analogue technology, has been developed.
Abstract:
Introductory accounts of artificial neural networks often rely for motivation on analogies with models of information processing in biological networks. One limitation of such an approach is that it offers little guidance on how to find optimal algorithms, or how to verify the correct performance of neural network systems. A central goal of this paper is to draw attention to a quite different viewpoint in which neural networks are seen as algorithms for statistical pattern recognition based on a principled, i.e. theoretically well-founded, framework. We illustrate the concept of a principled viewpoint by considering a specific issue concerned with the interpretation of the outputs of a trained network. Finally, we discuss the relevance of such an approach to the issue of the validation and verification of neural network systems.
Abstract:
Neural networks have often been motivated by superficial analogy with biological nervous systems. Recently, however, it has become widely recognised that the effective application of neural networks requires instead a deeper understanding of the theoretical foundations of these models. Insight into neural networks comes from a number of fields including statistical pattern recognition, computational learning theory, statistics, information geometry and statistical mechanics. As an illustration of the importance of understanding the theoretical basis for neural network models, we consider their application to the solution of multi-valued inverse problems. We show how a naive application of the standard least-squares approach can lead to very poor results, and how an appreciation of the underlying statistical goals of the modelling process allows the development of a more general and more powerful formalism which can tackle the problem of multi-modality.
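The failure of least squares on a multi-valued inverse problem can be seen in a small sketch. The forward model `y = t**2` and all names below are hypothetical choices made for illustration and do not come from the paper; the only fact relied on is that the sum-of-squares minimiser for a given input is the mean of the targets observed for that input.

```python
# Hypothetical forward model: y = t**2, so every y > 0 has two valid
# inverses, +sqrt(y) and -sqrt(y).
ts = [-2.0, -1.0, 1.0, 2.0]
pairs = [(t * t, t) for t in ts]  # (input y, target t) for the inverse problem


def least_squares_prediction(y, data):
    """Best sum-of-squares prediction for input y: the mean of its targets."""
    targets = [t for yi, t in data if yi == y]
    return sum(targets) / len(targets)


# For y = 4 the valid answers are -2 and +2, but least squares averages them.
pred = least_squares_prediction(4.0, pairs)
# Pushing the prediction back through the forward model shows it is not a solution.
forward_error = abs(pred * pred - 4.0)
```

The averaged prediction lies between the two branches and satisfies neither, which is why a model of the full conditional distribution, rather than only its mean, is needed for multi-modal problems.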
Abstract:
Deformable models are an attractive approach to recognizing objects which have considerable within-class variability such as handwritten characters. However, there are severe search problems associated with fitting the models to data which could be reduced if a better starting point for the search were available. We show that by training a neural network to predict how a deformable model should be instantiated from an input image, such improved starting points can be obtained. This method has been implemented for a system that recognizes handwritten digits using deformable models, and the results show that the search time can be significantly reduced without compromising recognition performance. © 1997 Academic Press.
Abstract:
This thesis is a study of the generation of topographic mappings - dimension-reducing transformations of data that preserve some element of geometric structure - with feed-forward neural networks. As an alternative to established methods, a transformational variant of Sammon's method is proposed, where the projection is effected by a radial basis function neural network. This approach is related to the statistical field of multidimensional scaling, and from that the concept of a 'subjective metric' is defined, which permits the exploitation of additional prior knowledge concerning the data in the mapping process. This then enables the generation of more appropriate feature spaces for the purposes of enhanced visualisation or subsequent classification. A comparison with established methods for feature extraction is given for data taken from the 1992 Research Assessment Exercise for higher educational institutions in the United Kingdom. This is a difficult high-dimensional dataset, and illustrates well the benefit of the new topographic technique. A generalisation of the proposed model is considered for implementation of the classical multidimensional scaling (MDS) routine. This is related to Oja's principal subspace neural network, whose learning rule is shown to descend the error surface of the proposed MDS model. Some of the technical issues concerning the design and training of topographic neural networks are investigated. It is shown that neural network models can be less sensitive to entrapment in the sub-optimal local minima that badly affect the standard Sammon algorithm, and tend to exhibit good generalisation as a result of implicit weight decay in the training process. It is further argued that for ideal structure retention, the network transformation should be perfectly smooth for all inter-data directions in input space. Finally, there is a critique of optimisation techniques for topographic mappings, and a new training algorithm is proposed. A convergence proof is given, and the method is shown to produce lower-error mappings more rapidly than previous algorithms.
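For orientation, the error measure minimised by the standard Sammon mapping can be sketched as follows. This is the usual textbook form of Sammon's stress, not the thesis's neural network variant; the function name `sammon_stress` and the toy points are hypothetical, and the code assumes that no two input points coincide.

```python
import math


def sammon_stress(inputs, outputs):
    """Sammon's stress between input-space and map-space distances.

    inputs  -- list of high-dimensional points
    outputs -- list of their low-dimensional images, in the same order
    Assumes all input points are distinct (non-zero pairwise distances).
    """
    n = len(inputs)
    weighted_error = 0.0
    scale = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d_in = math.dist(inputs[i], inputs[j])
            d_out = math.dist(outputs[i], outputs[j])
            scale += d_in
            # Dividing by d_in emphasises the preservation of small distances.
            weighted_error += (d_in - d_out) ** 2 / d_in
    return weighted_error / scale


# Hypothetical toy data: three collinear 2-D points mapped to a line.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
perfect_map = [(0.0,), (5.0,), (10.0,)]    # all distances preserved
collapsed_map = [(0.0,), (0.0,), (0.0,)]   # all structure lost
```

In a transformational variant such as the one proposed, the map-space points would be produced by a parametric function of the inputs, so that the stress is minimised over network weights rather than over the output coordinates directly.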
Abstract:
It is well known that one of the obstacles to effective forecasting of exchange rates is heteroscedasticity (non-stationary conditional variance). The autoregressive conditional heteroscedastic (ARCH) model and its variants have been used to estimate a time dependent variance for many financial time series. However, such models are essentially linear in form and we can ask whether a non-linear model for variance can improve results just as non-linear models (such as neural networks) for the mean have done. In this paper we consider two neural network models for variance estimation. Mixture Density Networks (Bishop 1994, Nix and Weigend 1994) combine a Multi-Layer Perceptron (MLP) and a mixture model to estimate the conditional data density. They are trained using a maximum likelihood approach. However, it is known that maximum likelihood estimates are biased and lead to a systematic under-estimate of variance. More recently, a Bayesian approach to parameter estimation has been developed (Bishop and Qazaz 1996) that shows promise in removing the maximum likelihood bias. However, up to now, this model has not been used for time series prediction. Here we compare these algorithms with two other models to provide benchmark results: a linear model (from the ARIMA family), and a conventional neural network trained with a sum-of-squares error function (which estimates the conditional mean of the time series with a constant variance noise model). This comparison is carried out on daily exchange rate data for five currencies.
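The under-estimation of variance by maximum likelihood can be shown in its simplest setting, a single Gaussian fitted to a sample. This is only an analogue of the conditional-variance case discussed in the abstract; the function names and the toy sample are hypothetical.

```python
def ml_variance(xs):
    """Maximum likelihood variance estimate: divide by N."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def unbiased_variance(xs):
    """Bias-corrected variance estimate: divide by N - 1."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)


# Hypothetical sample of four observations.
sample = [1.0, 2.0, 3.0, 4.0]
ratio = ml_variance(sample) / unbiased_variance(sample)  # equals (N - 1) / N
```

Because the mean is fitted to the same data, the maximum likelihood estimate is smaller by a factor of (N - 1)/N; with a flexible model of the conditional mean the effect can be larger, which is the motivation given for the Bayesian treatment.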
Abstract:
In developing neural network techniques for real world applications it is still very rare to see estimates of confidence placed on the neural network predictions. This is a major deficiency, especially in safety-critical systems. In this paper we explore three distinct methods of producing point-wise confidence intervals using neural networks. We compare and contrast Bayesian, Gaussian Process and Predictive error bars evaluated on real data. The problem domain is concerned with the calibration of a real automotive engine management system for both air-fuel ratio determination and on-line ignition timing. This problem requires real-time control and is a good candidate for exploring the use of confidence predictions due to its safety-critical nature.
Abstract:
The ERS-1 Satellite was launched in July 1991 by the European Space Agency into a polar orbit at about 800 km, carrying a C-band scatterometer. A scatterometer measures the amount of radar backscatter generated by small ripples on the ocean surface induced by instantaneous local winds. Operational methods that extract wind vectors from satellite scatterometer data are based on the local inversion of a forward model, mapping scatterometer observations to wind vectors, by the minimisation of a cost function in the scatterometer measurement space. This report uses mixture density networks, a principled method for modelling conditional probability density functions, to model the joint probability distribution of the wind vectors given the satellite scatterometer measurements in a single cell (the 'inverse' problem). The complexity of the mapping and the structure of the conditional probability density function are investigated by varying the number of units in the hidden layer of the multi-layer perceptron and the number of kernels in the Gaussian mixture model of the mixture density network respectively. The optimal model for networks trained per trace has twenty hidden units and four kernels. Further investigation shows that models trained with incidence angle as an input have results comparable to those models trained by trace. A hybrid mixture density network that incorporates geophysical knowledge of the problem confirms other results that the conditional probability distribution is dominantly bimodal. The wind retrieval results improve on previous work at Aston, but do not match other neural network techniques that use spatial information in the inputs, which is to be expected given the ambiguity of the inverse problem. Current work uses the local inverse model for autonomous ambiguity removal in a principled Bayesian framework. Future directions in which these models may be improved are given.
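How a mixture density network turns raw network outputs into a conditional density can be sketched as below. The sketch assumes isotropic Gaussian kernels, a softmax for the mixing coefficients and an exponential for the kernel widths, which is a common formulation but is not confirmed by the report; all names and numbers are hypothetical.

```python
import math


def mdn_neg_log_likelihood(raw_mix, raw_log_sigma, centres, target):
    """Negative log-likelihood of one target under a Gaussian mixture.

    raw_mix       -- unnormalised mixing outputs, one per kernel
    raw_log_sigma -- log of each kernel width, one per kernel
    centres       -- kernel centres, one vector per kernel
    target        -- observed target vector (e.g. the two wind components)
    """
    # Softmax keeps the mixing coefficients positive and summing to one.
    shift = max(raw_mix)
    exps = [math.exp(z - shift) for z in raw_mix]
    total = sum(exps)
    priors = [e / total for e in exps]

    dim = len(target)
    density = 0.0
    for prior, log_sigma, centre in zip(priors, raw_log_sigma, centres):
        sigma = math.exp(log_sigma)  # exponential keeps the width positive
        sq_dist = sum((t - c) ** 2 for t, c in zip(target, centre))
        norm = (2.0 * math.pi * sigma ** 2) ** (dim / 2.0)
        density += prior * math.exp(-sq_dist / (2.0 * sigma ** 2)) / norm
    return -math.log(density)


# Hypothetical bimodal case: two equally weighted kernels at opposite wind vectors.
bimodal_centres = [(3.0, 4.0), (-3.0, -4.0)]
nll_a = mdn_neg_log_likelihood([0.0, 0.0], [0.0, 0.0], bimodal_centres, (3.0, 4.0))
nll_b = mdn_neg_log_likelihood([0.0, 0.0], [0.0, 0.0], bimodal_centres, (-3.0, -4.0))
```

With two symmetric kernels, both opposite wind vectors receive the same likelihood, which is the kind of directional ambiguity that a single-valued regression output cannot represent.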