58 resultados para Alphabets
Resumo:
We consider the problem of transmission of several discrete sources over a multiple access channel (MAC) with side information at the sources and the decoder. Source-channel separation does not hold for this channel. Sufficient conditions are provided for transmission of sources with a given distortion. The channel could have continuous alphabets (Gaussian MAC is a special case). Various previous results are obtained as special cases.
Resumo:
Encoding protein 3D structures into 1D string using short structural prototypes or structural alphabets opens a new front for structure comparison and analysis. Using the well-documented 16 motifs of Protein Blocks (PBs) as structural alphabet, we have developed a methodology to compare protein structures that are encoded as sequences of PBs by aligning them using dynamic programming which uses a substitution matrix for PBs. This methodology is implemented in the applications available in Protein Block Expert (PBE) server. PBE addresses common issues in the field of protein structure analysis such as comparison of proteins structures and identification of protein structures in structural databanks that resemble a given structure. PBE-T provides facility to transform any PDB file into sequences of PBs. PBE-ALIGNc performs comparison of two protein structures based on the alignment of their corresponding PB sequences. PBE-ALIGNm is a facility for mining SCOP database for similar structures based on the alignment of PBs. Besides, PBE provides an interface to a database (PBE-SAdb) of preprocessed PB sequences from SCOP culled at 95% and of all-against-all pairwise PB alignments at family and superfamily levels. PBE server is freely available at http://bioinformatics.univ-reunion.fr/ PBE/.
Resumo:
Constellation Constrained (CC) capacity regions of a two-user Gaussian Multiple Access Channel(GMAC) have been recently reported. For such a channel, code pairs based on trellis coded modulation are proposed in this paper with MPSK and M-PAM alphabet pairs, for arbitrary values of M,toachieve sum rates close to the CC sum capacity of the GMAC. In particular, the structure of the sum alphabets of M-PSK and M-PAMmalphabet pairs are exploited to prove that, for certain angles of rotation between the alphabets, Ungerboeck labelling on the trellis of each user maximizes the guaranteed squared Euclidean distance of the sum trellis. Hence, such a labelling scheme can be used systematically,to construct trellis code pairs to achieve sum rates close to the CC sum capacity. More importantly, it is shown for the first time that ML decoding complexity at the destination is significantly reduced when M-PAM alphabet pairs are employed with almost no loss in the sum capacity.
Resumo:
Capacity region for two-user Gaussian Broadcast Channels (GBC) is well known with the optimal input being Gaussian. In this paper we explore the capacity region for GBC when the users' symbols are taken from finite complex alphabets (like M-QAM, M-PSK). When the alphabets for both the users are the same we show that rotation of one of the alphabets enlarges the capacity region. We arrive at an optimal angle of rotation by simulation. The effect of rotation on the capacity region at different SNRs is also studied using simulation results. Using the setup of Fading Broadcast Channel (FBC) given by [Li and Goldsmith, 2001], we study the ergodic capacity region with inputs from finite complex alphabets. It is seen that, using the procedure for optimum power allocation obtained in [Li and Goldsmith, 2001] for Gaussian inputs, to allocate power to symbols from finite complex alphabets, relative rotation between the alphabets does not improve the capacity region. Simulation results for a modified heuristic power allocation procedure for finite-constellation case, show that Constellation Constrained capacity region enlarges with rotation.
Resumo:
Feature extraction in bilingual OCR is handicapped by the increase in the number of classes or characters to be handled. This is evident in the case of Indian languages whose alphabet set is large. It is expected that the complexity of the feature extraction process increases with the number of classes. Though the determination of the best set of features that could be used cannot be ascertained through any quantitative measures, the characteristics of the scripts can help decide on the feature extraction procedure. This paper describes a hierarchical feature extraction scheme for recognition of printed bilingual (Tamil and Roman) text. The scheme divides the combined alphabet set of both the scripts into subsets by the extraction of certain spatial and structural features. Three features viz geometric moments, DCT based features and Wavelet transform based features are extracted from the grouped symbols and a linear transformation is performed on them for the purpose of efficient representation in the feature space. The transformation is obtained by the maximization of certain criterion functions. Three techniques : Principal component analysis, maximization of Fisher's ratio and maximization of divergence measure have been employed to estimate the transformation matrix. It has been observed that the proposed hierarchical scheme allows for easier handling of the alphabets and there is an appreciable rise in the recognition accuracy as a result of the transformations.
Resumo:
The capacity region of a two-user Gaussian Multiple Access Channel (GMAC) with complex finite input alphabets and continuous output alphabet is studied. When both the users are equipped with the same code alphabet, it is shown that, rotation of one of the user’s alphabets by an appropriate angle can make the new pair of alphabets not only uniquely decodable, but will result in enlargement of the capacity region. For this set-up, we identify the primary problem to be finding appropriate angle(s) of rotation between the alphabets such that the capacity region is maximally enlarged. It is shown that the angle of rotation which provides maximum enlargement of the capacity region also minimizes the union bound on the probability of error of the sumalphabet and vice-verse. The optimum angle(s) of rotation varies with the SNR. Through simulations, optimal angle(s) of rotation that gives maximum enlargement of the capacity region of GMAC with some well known alphabets such as M-QAM and M-PSK for some M are presented for several values of SNR. It is shown that for large number of points in the alphabets, capacity gains due to rotations progressively reduce. As the number of points N tends to infinity, our results match the results in the literature wherein the capacity region of the Gaussian code alphabet doesn’t change with rotation for any SNR.
Resumo:
The constant increase in the number of solved protein structures is of great help in understanding the basic principles behind protein folding and evolution. 3-D structural knowledge is valuable in designing and developing methods for comparison, modelling and prediction of protein structures. These approaches for structure analysis can be directly implicated in studying protein function and for drug design. The backbone of a protein structure favours certain local conformations which include alpha-helices, beta-strands and turns. Libraries of limited number of local conformations (Structural Alphabets) were developed in the past to obtain a useful categorization of backbone conformation. Protein Block (PB) is one such Structural Alphabet that gave a reasonable structure approximation of 0.42 angstrom. In this study, we use PB description of local structures to analyse conformations that are preferred sites for structural variations and insertions, among group of related folds. This knowledge can be utilized in improving tools for structure comparison that work by analysing local structure similarities. Conformational differences between homologous proteins are known to occur often in the regions comprising turns and loops. Interestingly, these differences are found to have specific preferences depending upon the structural classes of proteins. Such class-specific preferences are mainly seen in the all-beta class with changes involving short helical conformations and hairpin turns. A test carried out on a benchmark dataset also indicates that the use of knowledge on the class specific variations can improve the performance of a PB based structure comparison approach. The preference for the indel sites also seem to be confined to a few backbone conformations involving beta-turns and helix C-caps. These are mainly associated with short loops joining the regular secondary structures that mediate a reversal in the chain direction. Rare beta-turns of type I' and II' are also identified as preferred sites for insertions.
Resumo:
In the two-user Gaussian Strong Interference Channel (GSIC) with finite constellation inputs, it is known that relative rotation between the constellations of the two users enlarges the Constellation Constrained (CC) capacity region. In this paper, a metric for finding the approximate angle of rotation to maximally enlarge the CC capacity is presented. It is shown that for some portion of the Strong Interference (SI) regime, with Gaussian input alphabets, the FDMA rate curve touches the capacity curve of the GSIC. Even as the Gaussian alphabet FDMA rate curve touches the capacity curve of the GSIC, at high powers, with both the users using the same finite constellation, we show that the CC FDMA rate curve lies strictly inside the CC capacity curve for the constellations BPSK, QPSK, 8-PSK, 16-QAM and 64-QAM. It is known that, with Gaussian input alphabets, the FDMA inner-bound at the optimum sum-rate point is always better than the simultaneous-decoding inner-bound throughout the Weak Interference (WI) regime. For a portion of the WI regime, it is shown that, with identical finite constellation inputs for both the users, the simultaneous-decoding inner-bound enlarged by relative rotation between the constellations can be strictly better than the FDMA inner-bound.
Resumo:
We consider nonparametric or universal sequential hypothesis testing when the distribution under the null hypothesis is fully known but the alternate hypothesis corresponds to some other unknown distribution. These algorithms are primarily motivated from spectrum sensing in Cognitive Radios and intruder detection in wireless sensor networks. We use easily implementable universal lossless source codes to propose simple algorithms for such a setup. The algorithms are first proposed for discrete alphabet. Their performance and asymptotic properties are studied theoretically. Later these are extended to continuous alphabets. Their performance with two well known universal source codes, Lempel-Ziv code and KT-estimator with Arithmetic Encoder are compared. These algorithms are also compared with the tests using various other nonparametric estimators. Finally a decentralized version utilizing spatial diversity is also proposed and analysed.
Resumo:
Conformational changes in proteins are extremely important for their biochemical functions. Correlation between inherent conformational variations in a protein and conformational differences in its homologues of known structure is still unclear. In this study, we have used a structural alphabet called Protein Blocks (PBs). PBs are used to perform abstraction of protein 3-D structures into a 1-D strings of 16 alphabets (a-p) based on dihedral angles of overlapping pentapeptides. We have analyzed the variations in local conformations in terms of PBs represented in the ensembles of 801 protein structures determined using NMR spectroscopy. In the analysis of concatenated data over all the residues in all the NMR ensembles, we observe that the overall nature of inherent local structural variations in NMR ensembles is similar to the nature of local structural differences in homologous proteins with a high correlation coefficient of .94. High correlation at the alignment positions corresponding to helical and beta-sheet regions is only expected. However, the correlation coefficient by considering only the loop regions is also quite high (.91). Surprisingly, segregated position-wise analysis shows that this high correlation does not hold true to loop regions at the structurally equivalent positions in NMR ensembles and their homologues of known structure. This suggests that the general nature of local structural changes is unique; however most of the local structural variations in loop regions of NMR ensembles do not correlate to their local structural differences at structurally equivalent positions in homologues.
Resumo:
In protein sequence alignment, residue similarity is usually evaluated by substitution matrix, which scores all possible exchanges of one amino acid with another. Several matrices are widely used in sequence alignment, including PAM matrices derived from homologous sequence and BLOSUM matrices derived from aligned segments of BLOCKS. However, most matrices have not addressed the high-order residue-residue interactions that are vital to the bioproperties of protein.With consideration for the inherent correlation in residue triplet, we present a new scoring scheme for sequence alignment. Protein sequence is treated as overlapping and successive 3-residue segments. Two edge residues of a triplet are clustered into hydrophobic or polar categories, respectively. Protein sequence is then rewritten into triplet sequence with 2 · 20 · 2 = 80 alphabets. Using a traditional approach, we construct a new scoring scheme named TLESUMhp (TripLEt SUbstitution Matrices with hydropobic and polar information) for pairwise substitution of triplets, which characterizes the similarity of residue triplets. The applications of this matrix led to marked improvements in multiple sequence alignment and in searching structurally alike residue segments. The reason for the occurrence of the ‘‘twilight zone,’’ i.e., structure explosion of lowidentity sequences, is also discussed.
Resumo:
Sims-Williams, P. (2002). The Celtic Inscriptions of Britain: Phonology and Chronology, c. 400-1200. Publications of the Philological Society, 37. Oxford: Blackwell Publishing. RAE2008
Resumo:
The origins of Sephardic press date back to the mid-20th century, when the influence of the Western world spread across the Sephardim communities of the East. The content of these newspapers was diverse: pieces of general interest, but also scientific, literary and humorous works, with various political orientations. These papers were published in different languages, writing styles and alphabets. Those to be analysed here, however, were published in aljamiado Judeo-Spanish: three papers from Smyrna and one from Salonica. Throughout this work we will focus on the different obstacles and difficulties the editors and publishers of this Ottoman Sephardic press had to face to bring their publications to light.
Resumo:
Based on an algorithm for pattern matching in character strings, we implement a pattern matching machine that searches for occurrences of patterns in multidimensional time series. Before the search process takes place, time series are encoded in user-designed alphabets. The patterns, on the other hand, are formulated as regular expressions that are composed of letters from these alphabets and operators. Furthermore, we develop a genetic algorithm to breed patterns that maximize a user-defined fitness function. In an application to financial data, we show that patterns bred to predict high exchange rates volatility in training samples retain statistically significant predictive power in validation samples.
Resumo:
Nabile Farès is a key author within postcolonial studies, due in particular to his uniquely expressive writing style. This article discusses writing style in his 1982 text, L’état perdu: précédé du discours pratique de l’immigré. Throughout the mainly French text, different alphabets are woven, alluding to the complexity of Algerian linguistic history and the importance of language in the construction and expression of identity. Meanwhile, the grammar and structure of the French language seems confused and at times illogical, raising further questions about use of a colonial language in a postcolonial context. Farès’s writing style is avant-garde in nature, and deliberate intertextuality with the Surrealists situates the text within an avant-garde tradition in the French language, developing new ideas surrounding the effect of this written genre in the aftermath of colonialism.