Digital Library

813 results for Euclidean Distance

Robust clustering of multi-type relational data via a heterogeneous manifold ensemble

Relevance:

60.00%

Abstract:

High-Order Co-Clustering (HOCC) methods have attracted considerable attention in recent years because of their ability to cluster multiple types of objects simultaneously using all available information. During the clustering process, HOCC methods exploit object co-occurrence information, i.e., inter-type relationships amongst different types of objects, as well as object affinity information, i.e., intra-type relationships amongst objects of the same type. However, it is difficult to learn accurate intra-type relationships in the presence of noise and outliers. Existing HOCC methods consider the p nearest neighbours based on Euclidean distance for the intra-type relationships, which leads to incomplete and inaccurate intra-type relationships. In this paper, we propose a novel HOCC method that incorporates multiple subspace learning with a heterogeneous manifold ensemble to learn complete and accurate intra-type relationships. Multiple subspace learning reconstructs the similarity between any pair of objects that belong to the same subspace. The heterogeneous manifold ensemble is created based on two types of intra-type relationships, learnt using a p-nearest-neighbour graph and multiple subspace learning. Moreover, to ensure the robustness of the clustering process, we introduce a sparse error matrix into the matrix decomposition and develop a novel iterative algorithm. Empirical experiments show that the proposed method achieves improved results over state-of-the-art HOCC methods in terms of FScore and NMI.
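The p-nearest-neighbour construction that this abstract attributes to existing HOCC methods can be sketched roughly as follows. This is only an illustration: the function names and the toy points are assumptions, not taken from the paper, and the proposed subspace-learning ensemble is not reproduced here.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def p_nearest_neighbours(points, p):
    """Map each point index to the indices of its p nearest neighbours."""
    graph = {}
    for i, a in enumerate(points):
        # Sort all other points by distance (ties broken by index).
        others = sorted((euclidean(a, b), j) for j, b in enumerate(points) if j != i)
        graph[i] = [j for _, j in others[:p]]
    return graph

# Hypothetical toy data: three nearby objects and one outlier.
points = [(0, 0), (1, 0), (2, 0), (50, 50)]
graph = p_nearest_neighbours(points, 2)
for i, neighbours in graph.items():
    print(i, neighbours)
```

Note that the outlier at index 3 is still assigned p neighbours even though every other point is far away, which is one way a purely distance-based neighbourhood can yield inaccurate intra-type relationships in noisy data.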

Efficient minimax strategies for square loss games

Relevance:

60.00%

Abstract:

We consider online prediction problems where the loss between the prediction and the outcome is measured by the squared Euclidean distance and its generalization, the squared Mahalanobis distance. We derive the minimax solutions for the case where the prediction and action spaces are the simplex (this setup is sometimes called the Brier game) and the \ell_2 ball (this setup is related to Gaussian density estimation). We show that in both cases the value of each sub-game is a quadratic function of a simple statistic of the state, with coefficients that can be efficiently computed using an explicit recurrence relation. The resulting deterministic minimax strategy and randomized maximin strategy are linear functions of the statistic.
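For reference, the two loss functions named in this abstract can be written down directly. The sketch below assumes the usual definition of the squared Mahalanobis distance with a symmetric positive-definite matrix `W`; the function names and the example values are illustrative assumptions, and the minimax strategies themselves are not reproduced.

```python
def squared_euclidean(a, y):
    """Squared Euclidean distance ||a - y||^2."""
    return sum((ai - yi) ** 2 for ai, yi in zip(a, y))

def squared_mahalanobis(a, y, W):
    """Squared Mahalanobis distance (a - y)^T W (a - y).

    W is assumed to be a symmetric positive-definite matrix given as a
    list of rows; with W equal to the identity this reduces to the
    squared Euclidean distance.
    """
    d = [ai - yi for ai, yi in zip(a, y)]
    n = len(d)
    return sum(d[i] * W[i][j] * d[j] for i in range(n) for j in range(n))

# Brier-game style example: a prediction on the simplex and a one-hot outcome.
prediction = (0.5, 0.5)
outcome = (1.0, 0.0)
identity = [[1.0, 0.0], [0.0, 1.0]]
weights = [[2.0, 0.0], [0.0, 1.0]]  # hypothetical weighting matrix

print(squared_euclidean(prediction, outcome))
print(squared_mahalanobis(prediction, outcome, weights))
```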

Complex symbolic sequence clustering and multiple classifiers for predictive process monitoring

Relevance:

60.00%

Abstract:

This paper addresses the following predictive business process monitoring problem: given the execution trace of an ongoing case, and given a set of traces of historical (completed) cases, predict the most likely outcome of the ongoing case. In this context, a trace refers to a sequence of events with corresponding payloads, where a payload consists of a set of attribute-value pairs. Meanwhile, an outcome refers to a label associated with completed cases, for example, a label indicating that a given case completed “on time” (with respect to a given desired duration) or “late”, or a label indicating whether a given case led to a customer complaint. The paper tackles this problem via a two-phase approach. In the first phase, prefixes of historical cases are encoded using complex symbolic sequences and clustered. In the second phase, a classifier is built for each of the clusters. To predict the outcome of an ongoing case at runtime given its (uncompleted) trace, we select the closest cluster(s) to the trace in question and apply the respective classifier(s), taking into account the Euclidean distance of the trace from the center of the clusters. We consider two families of clustering algorithms – hierarchical clustering and k-medoids – and use random forests for classification. The approach was evaluated on four real-life datasets.
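The runtime step described in this abstract (pick the closest cluster, then apply that cluster's classifier) might look roughly like the sketch below. It assumes the trace prefix has already been encoded as a numeric vector; the cluster centers and the per-cluster classifiers are hypothetical stand-ins (simple functions rather than trained random forests).

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def closest_cluster(trace_vec, centers):
    """Index of the cluster center nearest to the encoded trace."""
    return min(range(len(centers)), key=lambda k: euclidean(trace_vec, centers[k]))

def predict_outcome(trace_vec, centers, classifiers):
    """Apply the classifier attached to the nearest cluster."""
    k = closest_cluster(trace_vec, centers)
    return classifiers[k](trace_vec)

# Hypothetical cluster centers and per-cluster classifiers.
centers = [(0.0, 0.0), (10.0, 10.0)]
classifiers = [
    lambda v: "on time",  # stand-in for the classifier trained on cluster 0
    lambda v: "late",     # stand-in for the classifier trained on cluster 1
]

print(predict_outcome((9.0, 8.0), centers, classifiers))
```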

A Framework for Detecting Glaucomatous Progression in the Optic Nerve Head of an Eye Using Proper Orthogonal Decomposition

Relevance:

60.00%

Abstract:

Glaucoma is the second leading cause of blindness worldwide. Often, glaucomatous damage to the optic nerve head (ONH) and ONH changes occur prior to visual field loss and are observable in vivo. Thus, digital image analysis is a promising choice for detecting the onset and/or progression of glaucoma. In this paper, we present a new framework for detecting glaucomatous changes in the ONH of an eye using the method of proper orthogonal decomposition (POD). A baseline topograph subspace was constructed for each eye using POD to describe the structure of the ONH of the eye at a reference/baseline condition. Any glaucomatous changes in the ONH of the eye present during a follow-up exam were estimated by comparing the follow-up ONH topography with its baseline topograph subspace representation. Image correspondence measures of L1-norm and L2-norm, correlation, and image Euclidean distance (IMED) were used to quantify the ONH changes. An ONH topographic library built from the Louisiana State University Experimental Glaucoma study was used to evaluate the performance of the proposed method. The area under the receiver operating characteristic curve (AUC) was used to compare the diagnostic performance of the POD-induced parameters with the parameters of the topographic change analysis (TCA) method. The IMED and L2-norm parameters in the POD framework provided the highest AUC, 0.94 at a 10° field of imaging and 0.91 at a 15° field of imaging, compared to the TCA parameters with AUCs of 0.86 and 0.88, respectively. The proposed POD framework captures the instrument measurement variability and inherent structure variability and shows promise for improving our ability to detect glaucomatous change over time in glaucoma management.

Comparison of Prediction Based LSF Quantization Methods using Split VQ

Relevance:

60.00%

Abstract:

Further improvement in performance, to achieve near-transparent quality LSF quantization, is shown to be possible by using a higher-order two-dimensional (2-D) prediction in the coefficient domain. The prediction is performed in a closed-loop manner so that the LSF reconstruction error is the same as the quantization error of the prediction residual. We show that an optimum 2-D predictor, exploiting both inter-frame and intra-frame correlations, performs better than existing predictive methods. A computationally efficient split vector quantization technique is used to implement the proposed 2-D prediction-based method. We show further improvement in performance by using a weighted Euclidean distance.
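A split vector quantizer with a weighted Euclidean distance can be sketched in a few lines. Everything concrete below (the codebooks, the split points, and the weights) is a made-up toy example, and the 2-D predictor from the abstract is left out; the sketch only shows how the weights can change which codeword is selected.

```python
def weighted_sq_distance(x, c, w):
    """Weighted squared Euclidean distance: sum_i w_i * (x_i - c_i)^2."""
    return sum(wi * (xi - ci) ** 2 for xi, ci, wi in zip(x, c, w))

def nearest_codeword(x, codebook, w):
    """Index of the codeword minimizing the weighted distance to x."""
    return min(range(len(codebook)),
               key=lambda i: weighted_sq_distance(x, codebook[i], w))

def split_vq_encode(x, codebooks, splits, w):
    """Quantize each sub-vector x[start:end] with its own codebook."""
    return [nearest_codeword(x[s:e], cb, w[s:e])
            for (s, e), cb in zip(splits, codebooks)]

# Hypothetical 4-dimensional vector split into two 2-dimensional parts.
x = [0.30, 0.52, 1.05, 1.45]
splits = [(0, 2), (2, 4)]
codebooks = [
    [[0.30, 0.60], [0.40, 0.50]],
    [[1.00, 1.50], [1.20, 1.40]],
]

unweighted = split_vq_encode(x, codebooks, splits, [1, 1, 1, 1])
weighted = split_vq_encode(x, codebooks, splits, [1, 10, 1, 10])
print(unweighted, weighted)
```

Splitting keeps each search small: with two codebooks of size K the encoder compares 2K codewords instead of searching one joint codebook of size K squared.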

Low correlation sequences over AM-PSK and QAM alphabets

Relevance:

60.00%

Abstract:

A construction for a family of sequences over the 8-ary AM-PSK constellation that has maximum nontrivial correlation magnitude bounded as $\theta_{\max} \lesssim \sqrt{N}$ is presented here. The family is asymptotically optimal with respect to the Welch bound on maximum magnitude of correlation. The 8-ary AM-PSK constellation is a subset of the 16-QAM constellation. We also construct two families of sequences over 16-QAM with $\theta_{\max} \lesssim \sqrt{2}\sqrt{N}$. These families are constructed by interleaving sets of sequences. A construction for a family of low-correlation sequences over a QAM alphabet of size $2^{2m}$ is presented with maximum nontrivial normalized correlation parameter bounded as $\lesssim a\sqrt{N}$, where $N$ is the period of the sequences in the family and where $a$ ranges from 1.61 in the case of 16-QAM modulation to 2.76 for large $m$. When used in a CDMA setting, the family will permit each user to modulate the code sequence with $2m$ bits of data. Interestingly, the construction permits users on the reverse link of the CDMA channel to communicate using varying data rates by switching between sequence families associated with different values of the parameter $m$. Other features of the sequence families are improved Euclidean distance between different data symbols in comparison with PSK signaling and compatibility of the QAM sequence families with sequences belonging to the large quaternary sequence families $\{S(p)\}$.

Coding for Two-User Gaussian MAC with PSK and PAM Signal Sets

Relevance:

60.00%

Abstract:

Constellation Constrained (CC) capacity regions of a two-user Gaussian Multiple Access Channel (GMAC) have been recently reported. For such a channel, code pairs based on trellis coded modulation are proposed in this paper with M-PSK and M-PAM alphabet pairs, for arbitrary values of M, to achieve sum rates close to the CC sum capacity of the GMAC. In particular, the structure of the sum alphabets of M-PSK and M-PAM alphabet pairs is exploited to prove that, for certain angles of rotation between the alphabets, Ungerboeck labelling on the trellis of each user maximizes the guaranteed squared Euclidean distance of the sum trellis. Hence, such a labelling scheme can be used systematically to construct trellis code pairs that achieve sum rates close to the CC sum capacity. More importantly, it is shown for the first time that ML decoding complexity at the destination is significantly reduced when M-PAM alphabet pairs are employed, with almost no loss in the sum capacity.

Assessment of genetic diversity and identification of core collection in sandalwood germplasm using RAPDS

Relevance:

60.00%

Abstract:

Sandalwood is an economically important aromatic tree belonging to the family Santalaceae. The trees are used mainly for their fragrant heartwood and oil, which have immense potential for foreign exchange. Very little information is available on the genetic diversity in this species. Hence studies were initiated and genetic diversity was estimated using RAPD markers in 51 genotypes of Santalum album procured from different geographical regions of India and three exotic lines of S. spicatum from Australia. Eleven selected Operon primers (10-mer) generated a total of 156 consistent and unambiguous amplification products ranging from 200 bp to 4 kb. Rare and genotype-specific bands were identified which could be effectively used to distinguish the genotypes. Genetic relationships within the genotypes were evaluated by generating a dissimilarity matrix based on Ward's method (squared Euclidean distance). The phenetic dendrogram and the Principal Component Analysis separated the 51 Indian genotypes from the three Australian lines. The cluster analysis indicated that sandalwood germplasm within India constitutes a broad genetic base, with values of genetic dissimilarity ranging from 15 to 91%. A core collection of 21 selected individuals captured the same diversity as the entire population. The results show that RAPD analysis is an efficient marker technology for estimating genetic diversity and relatedness, thereby enabling the formulation of appropriate strategies for conservation, germplasm management, and selection of diverse parents for sandalwood improvement programmes.

A Low ML-Decoding Complexity, Full-Diversity, Full-Rate MIMO Precoder

Relevance:

60.00%

Abstract:

Precoding for multiple-input multiple-output (MIMO) antenna systems is considered with perfect channel knowledge available at both the transmitter and the receiver. For two transmit antennas and QAM constellations, a real-valued precoder which is approximately optimal (with respect to the minimum Euclidean distance between points in the received signal space) among real-valued precoders based on the singular value decomposition (SVD) of the channel is proposed. The proposed precoder is easily obtained for arbitrary QAM constellations, unlike the known complex-valued optimal precoder by Collin et al. for two transmit antennas, which exists for 4-QAM alone and is extremely hard to obtain for larger QAM constellations. The proposed precoding scheme is extended to a higher number of transmit antennas along the lines of the $E$-$d_{\min}$ precoder for 4-QAM by Vrigneau et al., which is an extension of the complex-valued optimal precoder for 4-QAM. The proposed precoder's ML-decoding complexity as a function of the constellation size $M$ is only $O(\sqrt{M})$, while that of the $E$-$d_{\min}$ precoder is $O(M\sqrt{M})$ ($M = 4$). Compared to the recently proposed X- and Y-precoders, the error performance of the proposed precoder is significantly better, while being only marginally worse than that of the $E$-$d_{\min}$ precoder for 4-QAM. It is argued that the proposed precoder provides full diversity for QAM constellations, and this is supported by simulation plots of the word error probability for 2 x 2, 4 x 4 and 8 x 8 systems.

Theory and Algorithms for Hop-Count-Based Localization with Random Geometric Graph Models of Dense Sensor Networks

Relevance:

60.00%

Abstract:

Wireless sensor networks can often be viewed in terms of a uniform deployment of a large number of nodes in a region of Euclidean space. Following deployment, the nodes self-organize into a mesh topology with a key aspect being self-localization. Having obtained a mesh topology in a dense, homogeneous deployment, a frequently used approximation is to take the hop distance between nodes to be proportional to the Euclidean distance between them. In this work, we analyze this approximation through two complementary analyses. We assume that the mesh topology is a random geometric graph on the nodes; and that some nodes are designated as anchors with known locations. First, we obtain high probability bounds on the Euclidean distances of all nodes that are h hops away from a fixed anchor node. In the second analysis, we provide a heuristic argument that leads to a direct approximation for the density function of the Euclidean distance between two nodes that are separated by a hop distance h. This approximation is shown, through simulation, to very closely match the true density function. Localization algorithms that draw upon the preceding analyses are then proposed and shown to perform better than some of the well-known algorithms present in the literature. Belief-propagation-based message-passing is then used to further enhance the performance of the proposed localization algorithms. To our knowledge, this is the first usage of message-passing for hop-count-based self-localization.
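The relationship between hop distance and Euclidean distance discussed in this abstract can be explored with a small simulation. The sketch below is an assumption-laden toy (unit square, a hypothetical radius and node count, a single anchor at index 0) and does not reproduce the paper's bounds or localization algorithms. It only relies on two facts that follow from the graph definition: a node h hops away lies within h times the radius of the anchor, and a node two or more hops away lies farther than one radius.

```python
import math
import random
from collections import deque

RADIUS = 0.15   # hypothetical communication radius
NUM_NODES = 300  # hypothetical number of nodes in the unit square

def random_geometric_graph(n, radius, rng):
    """Place n nodes uniformly in the unit square; link pairs within radius."""
    pts = [(rng.random(), rng.random()) for _ in range(n)]
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(pts[i], pts[j]) <= radius:
                adj[i].append(j)
                adj[j].append(i)
    return pts, adj

def hop_counts(adj, source):
    """Breadth-first search: hop distance from source to each reachable node."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in hops:
                hops[v] = hops[u] + 1
                queue.append(v)
    return hops

rng = random.Random(0)
pts, adj = random_geometric_graph(NUM_NODES, RADIUS, rng)
hops = hop_counts(adj, 0)  # node 0 plays the role of an anchor

# Average Euclidean distance from the anchor, grouped by hop count.
by_hop = {}
for v, h in hops.items():
    by_hop.setdefault(h, []).append(math.dist(pts[0], pts[v]))
for h in sorted(by_hop):
    print(h, sum(by_hop[h]) / len(by_hop[h]))
```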

Fast Likelihood Computation in Speech Recognition using Matrices

Relevance:

60.00%

Abstract:

Acoustic modeling using mixtures of multivariate Gaussians is the prevalent approach for many speech processing problems. Computing likelihoods against a large set of Gaussians is required as a part of many speech processing systems and it is the computationally dominant phase for Large Vocabulary Continuous Speech Recognition (LVCSR) systems. We express the likelihood computation as a multiplication of matrices representing augmented feature vectors and Gaussian parameters. The computational gain of this approach over traditional methods comes from exploiting the structure of these matrices and from an efficient implementation of their multiplication. In particular, we explore direct low-rank approximation of the Gaussian parameter matrix and indirect derivation of low-rank factors of the Gaussian parameter matrix by optimum approximation of the likelihood matrix. We show that both methods lead to similar speedups but the latter has far less impact on the recognition accuracy. Experiments on the 1,138-word vocabulary RM1 task and the 6,224-word vocabulary TIMIT task using the Sphinx 3.7 system show that, for a typical case, the matrix multiplication based approach leads to an overall speedup of 46% on the RM1 task and 115% on the TIMIT task. Our low-rank approximation methods provide a way of trading off recognition accuracy for a further increase in computational performance, extending overall speedups up to 61% for RM1 and 119% for TIMIT, for an increase of word error rate (WER) from 3.2 to 3.5% for RM1 and for no increase in WER for TIMIT. We also express pairwise Euclidean distance computation phase in Dynamic Time Warping (DTW) in terms of matrix multiplication leading to saving of approximately of computational operations. In our experiments using an efficient implementation of matrix multiplication, this leads to a speedup of 5.6 in computing the pairwise Euclidean distances and an overall speedup of up to 3.25 for DTW.
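One standard way to turn pairwise squared Euclidean distances into a single matrix product uses augmented vectors, since ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x·y. Whether the paper uses exactly this augmentation is an assumption; the helper names below are illustrative, and a plain-Python `matmul` stands in for an optimized matrix-multiplication routine.

```python
def matmul(A, B):
    """Plain matrix product of two lists-of-rows matrices."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def pairwise_sq_distances(X, Y):
    """D[i][j] = ||X[i] - Y[j]||^2, computed as one matrix product.

    Each x is augmented to [x, ||x||^2, 1] and each y to [-2y, 1, ||y||^2],
    so their inner product equals ||x||^2 + ||y||^2 - 2 x.y.
    """
    X_aug = [list(x) + [sum(v * v for v in x), 1] for x in X]
    Y_aug = [[-2 * v for v in y] + [1, sum(v * v for v in y)] for y in Y]
    Y_aug_T = [list(col) for col in zip(*Y_aug)]
    return matmul(X_aug, Y_aug_T)

# Two short hypothetical feature sequences, as would be compared in DTW.
X = [[0, 0], [3, 4]]
Y = [[0, 0], [6, 8]]
D = pairwise_sq_distances(X, Y)
print(D)
```

Taking the square root of each entry gives the Euclidean distances themselves; in floating point, small negative values caused by rounding would need to be clamped to zero first.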

Antenna selection in spatial modulation systems

Relevance:

60.00%

Abstract:

Novel transmit antenna selection techniques are conceived for Spatial Modulation (SM) systems and their symbol error rate (SER) performance is investigated. Specifically, low-complexity Euclidean Distance optimized Antenna Selection (EDAS) and Capacity Optimized Antenna Selection (COAS) are studied. It is observed that the COAS scheme gives a better SER performance than the EDAS scheme. We show that the proposed antenna selection based SM systems are capable of attaining a significant gain in signal-to-noise ratio (SNR) compared to conventional SM systems, and also outperform the conventional MIMO systems employing antenna selection at both low and medium SNRs.

Adaptive constellation rotation scheme for two-user fading MAC with quantized fade state feedback

Relevance:

60.00%

Abstract:

With no Channel State Information (CSI) at the users, transmission over the two-user Gaussian Multiple Access Channel with fading and finite constellations at the input will have high error rates due to multiple access interference (MAI). However, perfect CSI at the users is an unrealistic assumption in the wireless scenario, as it would involve extremely large feedback overheads. In this paper we propose a scheme which removes the adverse effect of MAI using only quantized knowledge of the fade state at the transmitters, such that the associated overhead is nominal. One of the users rotates its constellation relative to the other without varying the transmit power to adapt to the existing channel conditions, in order to meet a certain predetermined minimum Euclidean distance requirement in the equivalent constellation at the destination. The optimal rotation scheme is described for the case when both users use symmetric M-PSK constellations at the input, where $M = 2^{\lambda}$, $\lambda$ being a positive integer. The strategy is illustrated by considering the example where both users use QPSK signal sets at the input. The case when the users use PSK constellations of different sizes is also considered. It is shown that the proposed scheme has considerably better error performance compared to the conventional non-adaptive scheme, at the cost of a feedback overhead of just $\lceil \log_2(M^2/8 - M/4 + 2) \rceil + 1$ bits for the M-PSK case.
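The effect of a relative rotation on the minimum Euclidean distance of the sum constellation can be illustrated with a brute-force search. This is only a sketch: the channel gains `h1` and `h2`, the grid of candidate angles, and the exhaustive search are assumptions for illustration, and the paper's optimal rotation rule and fade-state quantization are not reproduced.

```python
import cmath
import math
from itertools import product

def psk(M):
    """Unit-energy M-PSK constellation points."""
    return [cmath.exp(2j * math.pi * k / M) for k in range(M)]

def min_distance(h1, h2, theta, M):
    """Minimum Euclidean distance of the received sum constellation
    {h1*a + h2*exp(j*theta)*b} over all pairs (a, b) of M-PSK symbols."""
    rot = cmath.exp(1j * theta)
    pts = [h1 * a + h2 * rot * b for a, b in product(psk(M), repeat=2)]
    return min(abs(p - q) for i, p in enumerate(pts) for q in pts[i + 1:])

def best_rotation(h1, h2, M, num_angles):
    """Search a uniform grid of angles for the one maximizing the
    minimum distance (a brute-force stand-in for the optimal rule)."""
    angles = [2 * math.pi * k / num_angles for k in range(num_angles)]
    return max(angles, key=lambda t: min_distance(h1, h2, t, M))

# Equal gains and no rotation: distinct QPSK symbol pairs collide at the receiver.
d_unrotated = min_distance(1, 1, 0.0, 4)
theta = best_rotation(1, 1, 4, 16)
d_rotated = min_distance(1, 1, theta, 4)
print(d_unrotated, theta, d_rotated)
```

Because the rotation multiplies one user's symbols by a unit-magnitude factor, the transmit power is unchanged, which matches the constraint stated in the abstract.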

An adaptive modulation scheme for two-user fading MAC with quantized fade state feedback

Relevance:

60.00%

Abstract:

For transmission over the two-user Gaussian Multiple Access Channel with fading and finite constellations at the inputs, we propose a scheme which uses only quantized knowledge of the fade state at the users, with the feedback overhead being nominal. One of the users rotates its constellation without varying the transmit power to adapt to the existing channel conditions, in order to meet a certain predetermined minimum Euclidean distance requirement in the equivalent constellation at the destination. The optimal modulation scheme is described for the case when both users use symmetric M-PSK constellations at the input, where $M = 2^{\lambda}$, $\lambda$ being a positive integer. The strategy is illustrated by considering examples where both users use the QPSK signal set at the input. It is shown that the proposed scheme has considerably better error performance compared to the conventional non-adaptive scheme, at the cost of a feedback overhead of just $\lceil \log_2(M^2/8 - M/4 + 2) \rceil + 1$ bits for the M-PSK case.

End-to-End BER Analysis of Space Shift Keying in Decode-and-Forward Cooperative Relaying

Relevance:

60.00%

Abstract:

Space shift keying (SSK) is a special case of spatial modulation (SM), a relatively new modulation technique that is gaining recognition as attractive for multi-antenna communications. Our new contribution in this paper is an analytical derivation of an exact closed-form expression for the end-to-end bit error rate (BER) performance of SSK in decode-and-forward (DF) cooperative relaying. An incremental relaying (IR) scheme with selection combining (SC) at the destination is considered. In SSK, since the information is carried by the transmit antenna index, traditional selection combining methods based on instantaneous SNRs cannot be directly used. To overcome this problem, we propose to select between the direct and relayed paths based on the Euclidean distance between columns of the channel matrix. With this selection metric, an exact analytical expression for the end-to-end BER is derived in closed form. Analytical results are shown to match simulation results.
