30 resultados para Nonparametric Bayes
em Indian Institute of Science - Bangalore - Índia
Resumo:
Simultaneous recordings of spike trains from multiple single neurons are becoming commonplace. Understanding the interaction patterns among these spike trains remains a key research area. A question of interest is the evaluation of information flow between neurons through the analysis of whether one spike train exerts causal influence on another. For continuous-valued time series data, Granger causality has proven an effective method for this purpose. However, the basis for Granger causality estimation is autoregressive data modeling, which is not directly applicable to spike trains. Various filtering options distort the properties of spike trains as point processes. Here we propose a new nonparametric approach to estimate Granger causality directly from the Fourier transforms of spike train data. We validate the method on synthetic spike trains generated by model networks of neurons with known connectivity patterns and then apply it to neurons limultaneously recorded from the thalamus and the primary somatosensory cortex of a squirrel monkey undergoing tactile stimulation.
Resumo:
Multielectrode neurophysiological recording and high-resolution neuroimaging generate multivariate data that are the basis for understanding the patterns of neural interactions. How to extract directions of information flow in brain networks from these data remains a key challenge. Research over the last few years has identified Granger causality as a statistically principled technique to furnish this capability. The estimation of Granger causality currently requires autoregressive modeling of neural data. Here, we propose a nonparametric approach based on widely used Fourier and wavelet transforms to estimate both pairwise and conditional measures of Granger causality, eliminating the need of explicit autoregressive data modeling. We demonstrate the effectiveness of this approach by applying it to synthetic data generated by network models with known connectivity and to local field potentials recorded from monkeys performing a sensorimotor task.
Resumo:
Our ability to infer the protein quaternary structure automatically from atom and lattice information is inadequate, especially for weak complexes, and heteromeric quaternary structures. Several approaches exist, but they have limited performance. Here, we present a new scheme to infer protein quaternary structure from lattice and protein information, with all-around coverage for strong, weak and very weak affinity homomeric and heteromeric complexes. The scheme combines naive Bayes classifier and point group symmetry under Boolean framework to detect quaternary structures in crystal lattice. It consistently produces >= 90% coverage across diverse benchmarking data sets, including a notably superior 95% coverage for recognition heteromeric complexes, compared with 53% on the same data set by current state-of-the-art method. The detailed study of a limited number of prediction-failed cases offers interesting insights into the intriguing nature of protein contacts in lattice. The findings have implications for accurate inference of quaternary states of proteins, especially weak affinity complexes.
Resumo:
This paper considers sequential hypothesis testing in a decentralized framework. We start with two simple decentralized sequential hypothesis testing algorithms. One of which is later proved to be asymptotically Bayes optimal. We also consider composite versions of decentralized sequential hypothesis testing. A novel nonparametric version for decentralized sequential hypothesis testing using universal source coding theory is developed. Finally we design a simple decentralized multihypothesis sequential detection algorithm.
Resumo:
We consider nonparametric or universal sequential hypothesis testing when the distribution under the null hypothesis is fully known but the alternate hypothesis corresponds to some other unknown distribution. These algorithms are primarily motivated from spectrum sensing in Cognitive Radios and intruder detection in wireless sensor networks. We use easily implementable universal lossless source codes to propose simple algorithms for such a setup. The algorithms are first proposed for discrete alphabet. Their performance and asymptotic properties are studied theoretically. Later these are extended to continuous alphabets. Their performance with two well known universal source codes, Lempel-Ziv code and KT-estimator with Arithmetic Encoder are compared. These algorithms are also compared with the tests using various other nonparametric estimators. Finally a decentralized version utilizing spatial diversity is also proposed and analysed.
Resumo:
We consider nonparametric sequential hypothesis testing problem when the distribution under the null hypothesis is fully known but the alternate hypothesis corresponds to a general family of distributions. We propose a simple algorithm to address the problem. Its performance is analysed and asymptotic properties are proved. The simulated and analysed performance of the algorithm is compared with an earlier algorithm addressing the same problem with similar assumptions. Finally, we provide a justification for our model motivated by a Cognitive Radio scenario and modify the algorithm for optimizing performance when information about the prior probabilities of occurrence of the two hypotheses is available.
Resumo:
We propose a distributed sequential algorithm for quick detection of spectral holes in a Cognitive Radio set up. Two or more local nodes make decisions and inform the fusion centre (FC) over a reporting Multiple Access Channel (MAC), which then makes the final decision. The local nodes use energy detection and the FC uses mean detection in the presence of fading, heavy-tailed electromagnetic interference (EMI) and outliers. The statistics of the primary signal, channel gain and the EMI is not known. Different nonparametric sequential algorithms are compared to choose appropriate algorithms to be used at the local nodes and the Fe. Modification of a recently developed random walk test is selected for the local nodes for energy detection as well as at the fusion centre for mean detection. We show via simulations and analysis that the nonparametric distributed algorithm developed performs well in the presence of fading, EMI and outliers. The algorithm is iterative in nature making the computation and storage requirements minimal.
Resumo:
1 Species-accumulation curves for woody plants were calculated in three tropical forests, based on fully mapped 50-ha plots in wet, old-growth forest in Peninsular Malaysia, in moist, old-growth forest in central Panama, and in dry, previously logged forest in southern India. A total of 610 000 stems were identified to species and mapped to < Im accuracy. Mean species number and stem number were calculated in quadrats as small as 5 m x 5 m to as large as 1000 m x 500 m, for a variety of stem sizes above 10 mm in diameter. Species-area curves were generated by plotting species number as a function of quadrat size; species-individual curves were generated from the same data, but using stem number as the independent variable rather than area. 2 Species-area curves had different forms for stems of different diameters, but species-individual curves were nearly independent of diameter class. With < 10(4) stems, species-individual curves were concave downward on log-log plots, with curves from different forests diverging, but beyond about 104 stems, the log-log curves became nearly linear, with all three sites having a similar slope. This indicates an asymptotic difference in richness between forests: the Malaysian site had 2.7 times as many species as Panama, which in turn was 3.3 times as rich as India. 3 Other details of the species-accumulation relationship were remarkably similar between the three sites. Rectangular quadrats had 5-27% more species than square quadrats of the same area, with longer and narrower quadrats increasingly diverse. Random samples of stems drawn from the entire 50 ha had 10-30% more species than square quadrats with the same number of stems. At both Pasoh and BCI, but not Mudumalai. species richness was slightly higher among intermediate-sized stems (50-100mm in diameter) than in either smaller or larger sizes, These patterns reflect aggregated distributions of individual species, plus weak density-dependent forces that tend to smooth the species abundance distribution and 'loosen' aggregations as stems grow. 4 The results provide support for the view that within each tree community, many species have their abundance and distribution guided more by random drift than deterministic interactions. The drift model predicts that the species-accumulation curve will have a declining slope on a log-log plot, reaching a slope of O.1 in about 50 ha. No other model of community structure can make such a precise prediction. 5 The results demonstrate that diversity studies based on different stem diameters can be compared by sampling identical numbers of stems. Moreover, they indicate that stem counts < 1000 in tropical forests will underestimate the percentage difference in species richness between two diverse sites. Fortunately, standard diversity indices (Fisher's sc, Shannon-Wiener) captured diversity differences in small stem samples more effectively than raw species richness, but both were sample size dependent. Two nonparametric richness estimators (Chao. jackknife) performed poorly, greatly underestimating true species richness.
Resumo:
The concept of feature selection in a nonparametric unsupervised learning environment is practically undeveloped because no true measure for the effectiveness of a feature exists in such an environment. The lack of a feature selection phase preceding the clustering process seriously affects the reliability of such learning. New concepts such as significant features, level of significance of features, and immediate neighborhood are introduced which result in meeting implicitly the need for feature slection in the context of clustering techniques.
Resumo:
The concept of feature selection in a nonparametric unsupervised learning environment is practically undeveloped because no true measure for the effectiveness of a feature exists in such an environment. The lack of a feature selection phase preceding the clustering process seriously affects the reliability of such learning. New concepts such as significant features, level of significance of features, and immediate neighborhood are introduced which result in meeting implicitly the need for feature slection in the context of clustering techniques.
Resumo:
The problem of time variant reliability analysis of existing structures subjected to stationary random dynamic excitations is considered. The study assumes that samples of dynamic response of the structure, under the action of external excitations, have been measured at a set of sparse points on the structure. The utilization of these measurements m in updating reliability models, postulated prior to making any measurements, is considered. This is achieved by using dynamic state estimation methods which combine results from Markov process theory and Bayes' theorem. The uncertainties present in measurements as well as in the postulated model for the structural behaviour are accounted for. The samples of external excitations are taken to emanate from known stochastic models and allowance is made for ability (or lack of it) to measure the applied excitations. The future reliability of the structure is modeled using expected structural response conditioned on all the measurements made. This expected response is shown to have a time varying mean and a random component that can be treated as being weakly stationary. For linear systems, an approximate analytical solution for the problem of reliability model updating is obtained by combining theories of discrete Kalman filter and level crossing statistics. For the case of nonlinear systems, the problem is tackled by combining particle filtering strategies with data based extreme value analysis. In all these studies, the governing stochastic differential equations are discretized using the strong forms of Ito-Taylor's discretization schemes. The possibility of using conditional simulation strategies, when applied external actions are measured, is also considered. The proposed procedures are exemplifiedmby considering the reliability analysis of a few low-dimensional dynamical systems based on synthetically generated measurement data. The performance of the procedures developed is also assessed based on a limited amount of pertinent Monte Carlo simulations. (C) 2010 Elsevier Ltd. All rights reserved.
Resumo:
A method for determining the mutual nearest neighbours (MNN) and mutual neighbourhood value (mnv) of a sample point, using the conventional nearest neighbours, is suggested. A nonparametric, hierarchical, agglomerative clustering algorithm is developed using the above concepts. The algorithm is simple, deterministic, noniterative, requires low storage and is able to discern spherical and nonspherical clusters. The method is applicable to a wide class of data of arbitrary shape, large size and high dimensionality. The algorithm can discern mutually homogenous clusters. Strong or weak patterns can be discerned by properly choosing the neighbourhood width.
Resumo:
A nonparametric, hierarchical, disaggregative clustering algorithm is developed using a novel similarity measure, called the mutual neighborhood value (MNV), which takes into account the conventional nearest neighbor ranks of two samples with respect to each other. The algorithm is simple, noniterative, requires low storage, and needs no specification of the expected number of clusters. The algorithm appears very versatile as it is capable of discerning spherical and nonspherical clusters, linearly nonseparable clusters, clusters with unequal populations, and clusters with lowdensity bridges. Changing of the neighborhood size enables discernment of strong or weak patterns.
Resumo:
Relative geometric arrangements of the sample points, with reference to the structure of the imbedding space, produce clusters. Hence, if each sample point is imagined to acquire a volume of a small M-cube (called pattern-cell), depending on the ranges of its (M) features and number (N) of samples; then overlapping pattern-cells would indicate naturally closer sample-points. A chain or blob of such overlapping cells would mean a cluster and separate clusters would not share a common pattern-cell between them. The conditions and an analytic method to find such an overlap are developed. A simple, intuitive, nonparametric clustering procedure, based on such overlapping pattern-cells is presented. It may be classified as an agglomerative, hierarchical, linkage-type clustering procedure. The algorithm is fast, requires low storage and can identify irregular clusters. Two extensions of the algorithm, to separate overlapping clusters and to estimate the nature of pattern distributions in the sample space, are also indicated.
Resumo:
We are concerned with the situation in which a wireless sensor network is deployed in a region, for the purpose of detecting an event occurring at a random time and at a random location. The sensor nodes periodically sample their environment (e.g., for acoustic energy),process the observations (in our case, using a CUSUM-based algorithm) and send a local decision (which is binary in nature) to the fusion centre. The fusion centre collects these local decisions and uses a fusion rule to process the sensors’ local decisions and infer the state of nature, i.e., if an event has occurred or not. Our main contribution is in analyzing two local detection rules in combination with a simple fusion rule. The local detection algorithms are based on the nonparametric CUSUMprocedure from sequential statistics. We also propose two ways to operate the local detectors after an alarm. These alternatives when combined in various ways yield several approaches. Our contribution is to provide analytical techniques to calculate false alarm measures, by the use of which the local detector thresholds can be set. Simulation results are provided to evaluate the accuracy of our analysis. As an illustration we provide a design example. We also use simulations to compare the detection delays incurred in these algorithms.