47 resultados para Alphabet, 1783
em Indian Institute of Science - Bangalore - Índia
Resumo:
Encoding protein 3D structures into 1D string using short structural prototypes or structural alphabets opens a new front for structure comparison and analysis. Using the well-documented 16 motifs of Protein Blocks (PBs) as structural alphabet, we have developed a methodology to compare protein structures that are encoded as sequences of PBs by aligning them using dynamic programming which uses a substitution matrix for PBs. This methodology is implemented in the applications available in Protein Block Expert (PBE) server. PBE addresses common issues in the field of protein structure analysis such as comparison of proteins structures and identification of protein structures in structural databanks that resemble a given structure. PBE-T provides facility to transform any PDB file into sequences of PBs. PBE-ALIGNc performs comparison of two protein structures based on the alignment of their corresponding PB sequences. PBE-ALIGNm is a facility for mining SCOP database for similar structures based on the alignment of PBs. Besides, PBE provides an interface to a database (PBE-SAdb) of preprocessed PB sequences from SCOP culled at 95% and of all-against-all pairwise PB alignments at family and superfamily levels. PBE server is freely available at http://bioinformatics.univ-reunion.fr/ PBE/.
Resumo:
We consider the problem of compression via homomorphic encoding of a source having a group alphabet. This is motivated by the problem of distributed function computation, where it is known that if one is only interested in computing a function of several sources, then one can at times improve upon the compression rate required by the Slepian-Wolf bound. The functions of interest are those which could be represented by the binary operation in the group. We first consider the case when the source alphabet is the cyclic Abelian group, Zpr. In this scenario, we show that the set of achievable rates provided by Krithivasan and Pradhan [1], is indeed the best possible. In addition to that, we provide a simpler proof of their achievability result. In the case of a general Abelian group, an improved achievable rate region is presented than what was obtained by Krithivasan and Pradhan. We then consider the case when the source alphabet is a non-Abelian group. We show that if all the source symbols have non-zero probability and the center of the group is trivial, then it is impossible to compress such a source if one employs a homomorphic encoder. Finally, we present certain non-homomorphic encoders, which also are suitable in the context of function computation over non-Abelian group sources and provide rate regions achieved by these encoders.
Resumo:
The three dimensional structure of a protein provides major insights into its function. Protein structure comparison has implications in functional and evolutionary studies. A structural alphabet (SA) is a library of local protein structure prototypes that can abstract every part of protein main chain conformation. Protein Blocks (PBS) is a widely used SA, composed of 16 prototypes, each representing a pentapeptide backbone conformation defined in terms of dihedral angles. Through this description, the 3D structural information can be translated into a 1D sequence of PBs. In a previous study, we have used this approach to compare protein structures encoded in terms of PBs. A classical sequence alignment procedure based on dynamic programming was used, with a dedicated PB Substitution Matrix (SM). PB-based pairwise structural alignment method gave an excellent performance, when compared to other established methods for mining. In this study, we have (i) refined the SMs and (ii) improved the Protein Block Alignment methodology (named as iPBA). The SM was normalized in regards to sequence and structural similarity. Alignment of protein structures often involves similar structural regions separated by dissimilar stretches. A dynamic programming algorithm that weighs these local similar stretches has been designed. Amino acid substitutions scores were also coupled linearly with the PB substitutions. iPBA improves (i) the mining efficiency rate by 6.8% and (ii) more than 82% of the alignments have a better quality. A higher efficiency in aligning multi-domain proteins could be also demonstrated. The quality of alignment is better than DALI and MUSTANG in 81.3% of the cases. Thus our study has resulted in an impressive improvement in the quality of protein structural alignment. (C) 2011 Elsevier Masson SAS. All rights reserved.
Resumo:
In this paper, we investigate the achievable rate region of Gaussian multiple access channels (MAC) with finite input alphabet and quantized output. With finite input alphabet and an unquantized receiver, the two-user Gaussian MAC rate region was studied. In most high throughput communication systems based on digital signal processing, the analog received signal is quantized using a low precision quantizer. In this paper, we first derive the expressions for the achievable rate region of a two-user Gaussian MAC with finite input alphabet and quantized output. We show that, with finite input alphabet, the achievable rate region with the commonly used uniform receiver quantizer has a significant loss in the rate region compared. It is observed that this degradation is due to the fact that the received analog signal is densely distributed around the origin, and is therefore not efficiently quantized with a uniform quantizer which has equally spaced quantization intervals. It is also observed that the density of the received analog signal around the origin increases with increasing number of users. Hence, the loss in the achievable rate region due to uniform receiver quantization is expected to increase with increasing number of users. We, therefore, propose a novel non-uniform quantizer with finely spaced quantization intervals near the origin. For a two-user Gaussian MAC with a given finite input alphabet and low precision receiver quantization, we show that the proposed non-uniform quantizer has a significantly larger rate region compared to what is achieved with a uniform quantizer.
Resumo:
In this letter, we compute the secrecy rate of decode-and-forward (DF) relay beamforming with finite input alphabet of size M. Source and relays operate under a total power constraint. First, we observe that the secrecy rate with finite-alphabet input can go to zero as the total power increases, when we use the source power and the relay weights obtained assuming Gaussian input. This is because the capacity of an eavesdropper can approach the finite-alphabet capacity of 1/2 log(2) M with increasing total power, due to the inability to completely null in the direction of the eavesdropper. We then propose a transmit power control scheme where the optimum source power and relay weights are obtained by carrying out transmit power (source power plus relay power) control on DF with Gaussian input using semi-definite programming, and then obtaining the corresponding source power and relay weights which maximize the secrecy rate for DF with finite-alphabet input. The proposed power control scheme is shown to achieve increasing secrecy rates with increasing total power with a saturation behavior at high total powers.
Achievable rate region of gaussian broadcast channel with finite input alphabet and quantized output
Resumo:
In this paper, we study the achievable rate region of two-user Gaussian broadcast channel (GBC) when the messages to be transmitted to both the users take values from finite signal sets and the received signal is quantized at both the users. We refer to this channel as quantized broadcast channel (QBC). We first observe that the capacity region defined for a GBC does not carry over as such to QBC. Also, we show that the optimal decoding scheme for GBC (i.e., high SNR user doing successive decoding and low SNR user decoding its message alone) is not optimal for QBC. We then propose an achievable rate region for QBC based on two different schemes. We present achievable rate region results for the case of uniform quantization at the receivers. We find that rotation of one of the user's input alphabet with respect to the other user's alphabet marginally enlarges the achievable rate region of QBC when almost equal powers are allotted to both the users.
Resumo:
The increasing number of available protein structures requires efficient tools for multiple structure comparison. Indeed, multiple structural alignments are essential for the analysis of function, evolution and architecture of protein structures. For this purpose, we proposed a new web server called multiple Protein Block Alignment (mulPBA). This server implements a method based on a structural alphabet to describe the backbone conformation of a protein chain in terms of dihedral angles. This sequence-like' representation enables the use of powerful sequence alignment methods for primary structure comparison, followed by an iterative refinement of the structural superposition. This approach yields alignments superior to most of the rigid-body alignment methods and highly comparable with the flexible structure comparison approaches. We implement this method in a web server designed to do multiple structure superimpositions from a set of structures given by the user. Outputs are given as both sequence alignment and superposed 3D structures visualized directly by static images generated by PyMol or through a Jmol applet allowing dynamic interaction. Multiple global quality measures are given. Relatedness between structures is indicated by a distance dendogram. Superimposed structures in PDB format can be also downloaded, and the results are quickly obtained. mulPBA server can be accessed at www.dsimb.inserm.fr/dsimb_tools/mulpba/.
Resumo:
The structural annotation of proteins with no detectable homologs of known 3D structure identified using sequence-search methods is a major challenge today. We propose an original method that computes the conditional probabilities for the amino-acid sequence of a protein to fit to known protein 3D structures using a structural alphabet, known as Protein Blocks (PBs). PBs constitute a library of 16 local structural prototypes that approximate every part of protein backbone structures. It is used to encode 3D protein structures into 1D PB sequences and to capture sequence to structure relationships. Our method relies on amino acid occurrence matrices, one for each PB, to score global and local threading of query amino acid sequences to protein folds encoded into PB sequences. It does not use any information from residue contacts or sequence-search methods or explicit incorporation of hydrophobic effect. The performance of the method was assessed with independent test datasets derived from SCOP 1.75A. With a Z-score cutoff that achieved 95% specificity (i.e., less than 5% false positives), global and local threading showed sensitivity of 64.1% and 34.2%, respectively. We further tested its performance on 57 difficult CASP10 targets that had no known homologs in PDB: 38 compatible templates were identified by our approach and 66% of these hits yielded correctly predicted structures. This method scales-up well and offers promising perspectives for structural annotations at genomic level. It has been implemented in the form of a web-server that is freely available at http://www.bo-protscience.fr/forsa.
Resumo:
In this paper, we consider the problem of power allocation in MIMO wiretap channel for secrecy in the presence of multiple eavesdroppers. Perfect knowledge of the destination channel state information (CSI) and only the statistical knowledge of the eavesdroppers CSI are assumed. We first consider the MIMO wiretap channel with Gaussian input. Using Jensen's inequality, we transform the secrecy rate max-min optimization problem to a single maximization problem. We use generalized singular value decomposition and transform the problem to a concave maximization problem which maximizes the sum secrecy rate of scalar wiretap channels subject to linear constraints on the transmit covariance matrix. We then consider the MIMO wiretap channel with finite-alphabet input. We show that the transmit covariance matrix obtained for the case of Gaussian input, when used in the MIMO wiretap channel with finite-alphabet input, can lead to zero secrecy rate at high transmit powers. We then propose a power allocation scheme with an additional power constraint which alleviates this secrecy rate loss problem, and gives non-zero secrecy rates at high transmit powers.
Resumo:
Lanthanide coordination complexes with unidentate and bidentate amide ligands have been widely reported in the literature[l].In contrast, however, coordination compounds with tridentate ligands and with ligands containing ether oxygen as donor atoms to lanthanides have received little attention. In this paper we report the preparation and characterization of complexes formed by the interaction of the lanthanide perchlorates with N, N, N', N'- tetramethyloxydiacetamide (TMODA). The new complexes have been characterized by analysis, conductance, IR and electronic spectra. In addition, ~H and '3C NMR spectra of the ligand and its diamagnetic La ~+ and y3+ complexes are also discussed.
Resumo:
In this paper, we present numerical evidence that supports the notion of minimization in the sequence space of proteins for a target conformation. We use the conformations of the real proteins in the Protein Data Bank (PDB) and present computationally efficient methods to identify the sequences with minimum energy. We use edge-weighted connectivity graph for ranking the residue sites with reduced amino acid alphabet and then use continuous optimization to obtain the energy-minimizing sequences. Our methods enable the computation of a lower bound as well as a tight upper bound for the energy of a given conformation. We validate our results by using three different inter-residue energy matrices for five proteins from protein data bank (PDB), and by comparing our energy-minimizing sequences with 80 million diverse sequences that are generated based on different considerations in each case. When we submitted some of our chosen energy-minimizing sequences to Basic Local Alignment Search Tool (BLAST), we obtained some sequences from non-redundant protein sequence database that are similar to ours with an E-value of the order of 10(-7). In summary, we conclude that proteins show a trend towards minimizing energy in the sequence space but do not seem to adopt the global energy-minimizing sequence. The reason for this could be either that the existing energy matrices are not able to accurately represent the inter-residue interactions in the context of the protein environment or that Nature does not push the optimization in the sequence space, once it is able to perform the function.
Resumo:
We consider the problem of transmission of correlated discrete alphabet sources over a Gaussian Multiple Access Channel (GMAC). A distributed bit-to-Gaussian mapping is proposed which yields jointly Gaussian codewords. This can guarantee lossless transmission or lossy transmission with given distortions, if possible. The technique can be extended to the system with side information at the encoders and decoder.
Resumo:
A construction for a family of sequences over the 8-ary AM-PSK constellation that has maximum nontrivial correlation magnitude bounded as theta(max) less than or similar to root N is presented here. The famfly is asymptotically optimal with respect to the Welch bound on maximum magnitude of correlation. The 8-ary AM-PSK constellation is a subset of the 16-QAM constellation. We also construct two families of sequences over 16-QAM with theta(max) less than or similar to root 2 root N. These families are constructed by interleaving sets of sequences. A construction for a famBy of low-correlation sequences over QAM alphabet of size 2(2m) is presented with maximum nontrivial normalized correlation parameter bounded above by less than or similar to a root N, where N is the period of the sequences in the family and where a ranges from 1.61 in the case of 16-QAM modulation to 2.76 for large m. When used in a CDMA setting, the family will permit each user to modulate the code sequence with 2m bits of data. Interestingly, the construction permits users on the reverse link of the CDMA channel to communicate using varying data rates by switching between sequence famflies; associated to different values of the parameter m. Other features of the sequence families are improved Euclidean distance between different data symbols in comparison with PSK signaling and compatibility of the QAM sequence families with sequences belonging to the large quaternary sequence families {S(p)}.
Resumo:
The stability of scheduled multiaccess communication with random coding and independent decoding of messages is investigated. The number of messages that may be scheduled for simultaneous transmission is limited to a given maximum value, and the channels from transmitters to receiver are quasistatic, flat, and have independent fades. Requests for message transmissions are assumed to arrive according to an i.i.d. arrival process. Then, we show the following: (1) in the limit of large message alphabet size, the stability region has an interference limited information-theoretic capacity interpretation, (2) state-independent scheduling policies achieve this asymptotic stability region, and (3) in the asymptotic limit corresponding to immediate access, the stability region for non-idling scheduling policies is shown to be identical irrespective of received signal powers.
Resumo:
We study the problem of guessing the realization of a finite alphabet source, when some side information is provided, in a setting where the only knowledge the guesser has about the source and the correlated side information is that the joint source is one among a family. We define a notion of redundancy, identify a quantity that measures this redundancy, and study its properties. We then identify good guessing strategies that minimize the supremum redundancy (over the family). The minimum value measures the richness of the uncertainty class.