930 resultados para sparse matrices


Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper investigates the distribution of the condition number of complex Wishart matrices. Two closely related measures are considered: the standard condition number (SCN) and the Demmel condition number (DCN), both of which have important applications in the context of multiple-input multipleoutput (MIMO) communication systems, as well as in various branches of mathematics. We first present a novel generic framework for the SCN distribution which accounts for both central and non-central Wishart matrices of arbitrary dimension. This result is a simple unified expression which involves only a single scalar integral, and therefore allows for fast and efficient computation. For the case of dual Wishart matrices, we derive new exact polynomial expressions for both the SCN and DCN distributions. We also formulate a new closed-form expression for the tail SCN distribution which applies for correlated central Wishart matrices of arbitrary dimension and demonstrates an interesting connection to the maximum eigenvalue moments of Wishart matrices of smaller dimension. Based on our analytical results, we gain valuable insights into the statistical behavior of the channel conditioning for various MIMO fading scenarios, such as uncorrelated/semi-correlated Rayleigh fading and Ricean fading. © 2010 IEEE.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, we introduce an application of matrix factorization to produce corpus-derived, distributional
models of semantics that demonstrate cognitive plausibility. We find that word representations
learned by Non-Negative Sparse Embedding (NNSE), a variant of matrix factorization, are sparse,
effective, and highly interpretable. To the best of our knowledge, this is the first approach which
yields semantic representation of words satisfying these three desirable properties. Though extensive
experimental evaluations on multiple real-world tasks and datasets, we demonstrate the superiority
of semantic models learned by NNSE over other state-of-the-art baselines.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A practical machine-vision-based system is developed for fast detection of defects occurring on the surface of bottle caps. This system can be used to extract the circular region as the region of interests (ROI) from the surface of a bottle cap, and then use the circular region projection histogram (CRPH) as the matching features. We establish two dictionaries for the template and possible defect, respectively. Due to the requirements of high-speed production as well as detecting quality, a fast algorithm based on a sparse representation is proposed to speed up the searching. In the sparse representation, non-zero elements in the sparse factors indicate the defect's size and position. Experimental results in industrial trials show that the proposed method outperforms the orientation code method (OCM) and is able to produce promising results for detecting defects on the surface of bottle caps.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In a Bayesian learning setting, the posterior distribution of a predictive model arises from a trade-off between its prior distribution and the conditional likelihood of observed data. Such distribution functions usually rely on additional hyperparameters which need to be tuned in order to achieve optimum predictive performance; this operation can be efficiently performed in an Empirical Bayes fashion by maximizing the posterior marginal likelihood of the observed data. Since the score function of this optimization problem is in general characterized by the presence of local optima, it is necessary to resort to global optimization strategies, which require a large number of function evaluations. Given that the evaluation is usually computationally intensive and badly scaled with respect to the dataset size, the maximum number of observations that can be treated simultaneously is quite limited. In this paper, we consider the case of hyperparameter tuning in Gaussian process regression. A straightforward implementation of the posterior log-likelihood for this model requires O(N^3) operations for every iteration of the optimization procedure, where N is the number of examples in the input dataset. We derive a novel set of identities that allow, after an initial overhead of O(N^3), the evaluation of the score function, as well as the Jacobian and Hessian matrices, in O(N) operations. We prove how the proposed identities, that follow from the eigendecomposition of the kernel matrix, yield a reduction of several orders of magnitude in the computation time for the hyperparameter optimization problem. Notably, the proposed solution provides computational advantages even with respect to state of the art approximations that rely on sparse kernel matrices.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Classification methods with embedded feature selection capability are very appealing for the analysis of complex processes since they allow the analysis of root causes even when the number of input variables is high. In this work, we investigate the performance of three techniques for classification within a Monte Carlo strategy with the aim of root cause analysis. We consider the naive bayes classifier and the logistic regression model with two different implementations for controlling model complexity, namely, a LASSO-like implementation with a L1 norm regularization and a fully Bayesian implementation of the logistic model, the so called relevance vector machine. Several challenges can arise when estimating such models mainly linked to the characteristics of the data: a large number of input variables, high correlation among subsets of variables, the situation where the number of variables is higher than the number of available data points and the case of unbalanced datasets. Using an ecological and a semiconductor manufacturing dataset, we show advantages and drawbacks of each method, highlighting the superior performance in term of classification accuracy for the relevance vector machine with respect to the other classifiers. Moreover, we show how the combination of the proposed techniques and the Monte Carlo approach can be used to get more robust insights into the problem under analysis when faced with challenging modelling conditions.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Gels obtained by complexation of octablock star polyethylene oxide/polypropylene oxide copolymers (Tetronic 90R4) with -cyclodextrin (-CD) were evaluated as matrices for drug release. Both molecules are biocompatible so they can be potentially applied to drug delivery systems. Two different types of matrices of Tetronic 90R4 and -CD were evaluated: gels and tablets. These gels are capable to gelifying in situ and show sustained erosion kinetics in aqueous media. Tablets were prepared by freeze-drying and comprising the gels. Using these two different matrices, the release of two model molecules, L-tryptophan (Trp), and a protein, bovine serum albumin (BSA), was evaluated. The release profiles of these molecules from gels and tablets prove that they are suitable for sustained delivery. Mathematical models were applied to the release curves from tablets to elucidate the drug delivery mechanism. Good correlations were found for the fittings of the release curves to different equations. The results point that the release of Trp from different tablets is always governed by Fickian diffusion, whereas the release of BSA is governed by a combination of diffusion and tablet erosion. 

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, we propose a sparse signal modulation (SSM) method for precoded orthogonal frequency division multiplexing (OFDM) systems and study the signal detection. Although a receiver is able to exploit a path diversity gain with random precoding in OFDM, the complexity of the receiver is usually high as the orthogonality is not retained due to precoding. However, with SSM, we can derive a low-complexity detector that can provide reasonably good performances with a low sparsity ratio based on the notion of compressive sensing (CS). An important feature of a CS detector is that it can estimate SSM signals with a small fraction of the received signals over sub-carriers. This feature can allow us to build a low cost receiver with a small number of demodulators.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

How can we correlate the neural activity in the human brain as it responds to typed words, with properties of these terms (like ‘edible’, ‘fits in hand’)? In short, we want to find latent variables, that jointly explain both the brain activity, as well as the behavioral responses. This is one of many settings of the Coupled Matrix-Tensor Factorization (CMTF) problem.

Can we accelerate any CMTF solver, so that it runs within a few minutes instead of tens of hours to a day, while maintaining good accuracy? We introduce Turbo-SMT, a meta-method capable of doing exactly that: it boosts the performance of any CMTF algorithm, by up to 200x, along with an up to 65 fold increase in sparsity, with comparable accuracy to the baseline.

We apply Turbo-SMT to BrainQ, a dataset consisting of a (nouns, brain voxels, human subjects) tensor and a (nouns, properties) matrix, with coupling along the nouns dimension. Turbo-SMT is able to find meaningful latent variables, as well as to predict brain activity with competitive accuracy.




Relevância:

20.00% 20.00%

Publicador:

Resumo:

In the study of complex genetic diseases, the identification of subgroups of patients sharing similar genetic characteristics represents a challenging task, for example, to improve treatment decision. One type of genetic lesion, frequently investigated in such disorders, is the change of the DNA copy number (CN) at specific genomic traits. Non-negative Matrix Factorization (NMF) is a standard technique to reduce the dimensionality of a data set and to cluster data samples, while keeping its most relevant information in meaningful components. Thus, it can be used to discover subgroups of patients from CN profiles. It is however computationally impractical for very high dimensional data, such as CN microarray data. Deciding the most suitable number of subgroups is also a challenging problem. The aim of this work is to derive a procedure to compact high dimensional data, in order to improve NMF applicability without compromising the quality of the clustering. This is particularly important for analyzing high-resolution microarray data. Many commonly used quality measures, as well as our own measures, are employed to decide the number of subgroups and to assess the quality of the results. Our measures are based on the idea of identifying robust subgroups, inspired by biologically/clinically relevance instead of simply aiming at well-separated clusters. We evaluate our procedure using four real independent data sets. In these data sets, our method was able to find accurate subgroups with individual molecular and clinical features and outperformed the standard NMF in terms of accuracy in the factorization fitness function. Hence, it can be useful for the discovery of subgroups of patients with similar CN profiles in the study of heterogeneous diseases.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Models of complex systems with n components typically have order n<sup>2</sup> parameters because each component can potentially interact with every other. When it is impractical to measure these parameters, one may choose random parameter values and study the emergent statistical properties at the system level. Many influential results in theoretical ecology have been derived from two key assumptions: that species interact with random partners at random intensities and that intraspecific competition is comparable between species. Under these assumptions, community dynamics can be described by a community matrix that is often amenable to mathematical analysis. We combine empirical data with mathematical theory to show that both of these assumptions lead to results that must be interpreted with caution. We examine 21 empirically derived community matrices constructed using three established, independent methods. The empirically derived systems are more stable by orders of magnitude than results from random matrices. This consistent disparity is not explained by existing results on predator-prey interactions. We investigate the key properties of empirical community matrices that distinguish them from random matrices. We show that network topology is less important than the relationship between a species’ trophic position within the food web and its interaction strengths. We identify key features of empirical networks that must be preserved if random matrix models are to capture the features of real ecosystems.