937 results for mtDNA COI sequences
Abstract:
This study examines the population genetic structure of Asian elephants (Elephas maximus) across India, which harbours over half the world's population of this endangered species. Mitochondrial DNA control region sequences and allele frequencies at six nuclear DNA microsatellite markers obtained from the dung of free-ranging elephants reveal low mtDNA and typical microsatellite diversity. Both known divergent clades of mtDNA haplotypes in the Asian elephant are present in India, with southern and central India exhibiting exclusively the β clade of Fernando et al. (2000), northern India exhibiting exclusively the α clade and northeastern India exhibiting both, but predominantly the α clade. A nested clade analysis revealed isolation by distance as the principal mechanism responsible for the observed haplotype distributions within the α and β clades. Analyses of molecular variance and pairwise population FST tests based on both mitochondrial and microsatellite DNA suggest that northern-northeastern India, central India, Nilgiris (in southern India) and Anamalai-Periyar (in southern India) are four demographically autonomous population units and should be managed separately. In addition, evidence for female philopatry, male-mediated gene flow and two possible historical biogeographical barriers is described.
Abstract:
A low correlation interleaved QAM sequence family is presented here. In a CDMA setting, these sequences have the ability to transport a large amount of data as well as enable variable-rate signaling on the reverse link. The new interleaved selected family INQ has period $N$ and normalized maximum correlation parameter $\bar{\theta}_{\max}$ bounded above by $\lesssim a\sqrt{N}$, where $a$ ranges from 1.17 in the 16-QAM case to 1.99 for large $M^2$-QAM, where $M = 2^m$, $m \geq 2$. Each user is enabled to transfer $m + 1$ bits of data per period of the spreading sequence. These constructions have the lowest known value of maximum correlation of any sequence family with the same alphabet.
Abstract:
We present an improved language modeling technique for a Lempel-Ziv-Welch (LZW) based language identification (LID) scheme. The previous approach to LID using the LZW algorithm prepares the language pattern table with the LZW algorithm itself. Because of the sequential nature of the LZW algorithm, several language-specific patterns were missing from the pattern table. To overcome this, we build a universal pattern table, which contains all patterns of different lengths. For each language, its corresponding language-specific pattern table is constructed by retaining the patterns of the universal table whose frequency of appearance in the training data is above the threshold. This approach reduces the classification score (compression ratio [LZW-CR] or weighted discriminant score [LZW-WDS]) for non-native languages and increases the LID performance considerably.
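The universal-table idea can be illustrated with a minimal Python sketch. It assumes that the training data are plain character strings, that patterns are substrings of up to `max_len` symbols, and that the threshold is a raw count. The function names, the toy data, and these choices are hypothetical and are not taken from the paper.

```python
from collections import Counter


def count_patterns(text, max_len):
    """Count every substring of length 1..max_len in one text."""
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def universal_pattern_table(training, max_len=3):
    """Pool the patterns of all languages into one table (assumed construction)."""
    universal = Counter()
    for text in training.values():
        universal.update(count_patterns(text, max_len))
    return universal


def language_pattern_table(universal, text, threshold, max_len=3):
    """Retain universal patterns whose count in this language exceeds the threshold."""
    counts = count_patterns(text, max_len)
    return {p for p in universal if counts[p] > threshold}


# Toy training data; "xx" is a made-up second language.
training = {"en": "the then there", "xx": "ala alla lala"}
universal = universal_pattern_table(training)
english = language_pattern_table(universal, training["en"], threshold=2)
```

With this toy data, `english` keeps frequent English substrings such as `"the"` and drops patterns such as `"ala"` that appear only in the other language's text.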
Abstract:
In phase-encoded optical CDMA (OCDMA), spreading is achieved by encoding the phase of the signal spectrum. Here, a mathematical model for the output signal of a phase-encoded OCDMA system is first derived. This is shown to lead to a performance metric for the design of spreading sequences for asynchronous transmission. Generalized bent functions are used to construct a family of efficient phase-encoding sequences. It is shown how M-ary modulation of these spreading sequences is possible. The problem of designing efficient phase-encoded sequences is then related to the problem of minimizing PMEPR (peak-to-mean envelope power ratio) in an OFDM communication system.
Abstract:
The sum capacity of a symbol-synchronous CDMA system having processing gain N and supporting K power-constrained users is achieved by employing any set of N orthogonal sequences if a few users are allowed to signal along multiple dimensions. Analogously, the minimum received power (energy per chip) on a symbol-synchronous CDMA system supporting K users that demand specified data rates is attained by employing any set of N orthogonal sequences. At most (N - 1) users need to be split, and if there are no oversized users, these split users need to signal in only two dimensions each. These results show that sum capacity or minimum sum power can be achieved with minimal downlink signaling.
Abstract:
In phase-encoding optical CDMA (OCDMA), spreading is achieved by encoding the phase of the signal spectrum. In this paper we first derive a mathematical model for the output of phase-encoding OCDMA systems. Based on this model, we introduce a metric for designing spreading sequences for asynchronous transmission. We then connect the phase-encoding sequence design problem to the OFDM PMEPR (peak-to-mean envelope power ratio) problem. Using this connection, we conclude that designing sequences with good properties for samples of the timing delay guarantees that the same sequences are good for all timing delays. Finally, using generalized bent functions, we construct a family of sequences that are good for asynchronous phase-encoding OCDMA systems, and using these sequences we introduce an M-ary modulation scheme for phase-encoding OCDMA.
Abstract:
A three-level inverter produces six active vectors at each of the normalized magnitudes 1, 0.866, and 0.5, besides a zero vector. The vectors of relative length 0.5 are termed pivot vectors. The three nearest voltage vectors are usually used to synthesize the reference vector. In most continuous pulsewidth-modulation (PWM) schemes, the switching sequence begins from a pivot vector and ends with the same pivot vector. Thus, the pivot vector is applied twice in a subcycle or half-carrier cycle. This paper proposes and investigates alternative switching sequences, which use the pivot vector only once but employ one of the other two vectors twice within the subcycle. The total harmonic distortion (THD) in the fundamental line current pertaining to these novel sequences is studied theoretically as well as experimentally over the whole range of modulation. Compared with centered space vector PWM, two of the proposed sequences lead to reduced THD at high modulation indices at a given average switching frequency.
Abstract:
Superscalar processors currently have the potential to fetch multiple basic blocks per cycle by employing one of several recently proposed instruction fetch mechanisms. However, this increased fetch bandwidth cannot be exploited unless pipeline stages further downstream correspondingly improve. In particular, register renaming a large number of instructions per cycle is difficult. A large instruction window, needed to receive multiple basic blocks per cycle, will slow down dependence resolution and instruction issue. This paper addresses these and related issues by proposing (i) partitioning of the instruction window into multiple blocks, each holding a dynamic code sequence; (ii) logical partitioning of the register file into a global file and several local files, the latter holding registers local to a dynamic code sequence; (iii) the dynamic recording and reuse of register renaming information for registers local to a dynamic code sequence. Performance studies show these mechanisms improve performance over traditional superscalar processors by factors ranging from 1.5 to a little over 3 for the SPEC Integer programs. Next, it is observed that several of the loops in the benchmarks display vector-like behavior during execution, even if the static loop bodies are likely complex for compile-time vectorization. A dynamic loop vectorization mechanism that builds on top of the above mechanisms is briefly outlined. The mechanism vectorizes up to 60% of the dynamic instructions for some programs, albeit the average number of iterations per loop is quite small.
Abstract:
In this paper we consider the process of discovering frequent episodes in event sequences. The most computationally intensive part of this process is that of counting the frequencies of a set of candidate episodes. We present two new frequency counting algorithms for speeding up this part. These, referred to as non-overlapping and non-interleaved frequency counts, are based on directly counting suitable subsets of the occurrences of an episode. Hence they differ from the frequency counts of Mannila et al. [1], which count the number of windows in which the episode occurs. Our new frequency counts offer a speed-up factor of 7 or more on real and synthetic datasets. We also show how the new frequency counts can be used when the events in episodes have time durations as well.
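To make the notion of a non-overlapping count concrete, here is a minimal Python sketch for a single serial episode. It is not the authors' algorithm: it assumes the episode is a totally ordered list of event types and uses a simple greedy scan that restarts after each completed occurrence. The function name and the toy sequence are hypothetical.

```python
def count_non_overlapping(events, episode):
    """Greedily count occurrences of a serial episode that do not overlap.

    `events` is the event sequence in time order; `episode` lists the
    event types that must occur in the given order.
    """
    if not episode:
        raise ValueError("episode must contain at least one event type")
    count = 0
    pos = 0  # index of the episode event we are currently waiting for
    for event in events:
        if event == episode[pos]:
            pos += 1
            if pos == len(episode):
                # A full occurrence is complete; start looking for the next one.
                count += 1
                pos = 0
    return count


# Toy event sequence: the serial episode A -> B -> C completes twice.
occurrences = count_non_overlapping(list("ABACBCABC"), list("ABC"))
```

Each event is examined once per episode, so the scan is linear in the length of the event sequence. A window-based count would instead slide a fixed-width window over the sequence and test each window for the episode.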
Abstract:
Discovering patterns in temporal data is an important task in data mining. A successful method for this was proposed by Mannila et al. [1] in 1997. In their framework, mining for temporal patterns in a database of sequences of events is done by discovering the so-called frequent episodes. These episodes characterize interesting collections of events occurring relatively close to each other in some partial order. However, in this framework (and in many others for finding patterns in event sequences), the ordering of events in an event sequence is the only allowed temporal information. But there are many applications where the events are not instantaneous; they have time durations. Interesting episodes that we want to discover may need to contain information regarding event durations, etc. In this paper we extend Mannila et al.'s framework to tackle such issues. In our generalized formulation, episodes are defined so that much more temporal information about events can be incorporated into the structure of an episode. This significantly enhances the expressive capability of the rules that can be discovered in the frequent episode framework. We also present algorithms for discovering such generalized frequent episodes.
Abstract:
Over the past two decades, many ingenious efforts have been made in protein remote homology detection. Because homologous proteins often diversify extensively in sequence, it is challenging to demonstrate such relatedness through entirely sequence-driven searches. Here, we describe a computational method for the generation of 'protein-like' sequences that serves to bridge gaps in protein sequence space. Sequence profile information, as embodied in a position-specific scoring matrix of multiply aligned sequences of bona fide family members, serves as the starting point in this algorithm. The observed amino acid propensity and the selection of a random number dictate the selection of a residue for each position in the sequence. In a systematic manner, and by applying a 'roulette-wheel' selection approach at each position, we generate parent family-like sequences and thus facilitate an enlargement of sequence space around the family. When generated for a large number of families, we demonstrate that they expand the utility of natural intermediately related sequences in linking distant proteins. In 91% of the assessed examples, inclusion of designed sequences improved fold coverage by 5-10% over searches made in their absence. Furthermore, with several examples from proteins adopting folds such as TIM, globin, lipocalin and others, we demonstrate that the success of including designed sequences in a database positively sensitized methods such as PSI-BLAST and Cascade PSI-BLAST and is a promising opportunity for enormously improved remote homology recognition using sequence information alone.
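The roulette-wheel step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the published method: the profile is assumed to be a list of per-position dictionaries mapping residues to non-negative propensities, and the function names, the toy profile, and the seeding scheme are hypothetical.

```python
import random


def roulette_pick(propensities, rng):
    """Pick one residue with probability proportional to its propensity."""
    total = sum(propensities.values())
    spin = rng.random() * total  # where the wheel stops, in [0, total)
    cumulative = 0.0
    for residue, weight in propensities.items():
        cumulative += weight
        if spin < cumulative:
            return residue
    return residue  # guard against floating-point round-off on the last slice


def generate_sequence(profile, seed=None):
    """Draw one residue per profile position to form a family-like sequence."""
    rng = random.Random(seed)
    return "".join(roulette_pick(column, rng) for column in profile)


# Toy three-position profile with made-up propensities.
profile = [
    {"A": 0.7, "G": 0.3},
    {"L": 0.5, "I": 0.4, "V": 0.1},
    {"K": 1.0},
]
designed = generate_sequence(profile, seed=1)
```

Calling `generate_sequence` repeatedly with different seeds yields a set of sequences that follow the per-position propensities while differing from any single family member.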
Abstract:
Learning your αβγ's: The diversity of hydrogen-bonding patterns in backbone-expanded hybrid helices is shown by crystal-structure determination of several oligomeric peptides (see scheme; C=gray; H=white; O=red; N=blue). C12 helices were observed in the αγ peptide series for n=2-8. In comparison, the αα peptide and αβ peptide sequences show C10 and mixed C14/C15 helices, respectively. Copyright © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Abstract:
Receive antenna selection (AS) has been shown to maintain the diversity benefits of multiple antennas while potentially reducing hardware costs. However, the promised diversity gains of receive AS depend on the assumptions of perfect channel knowledge at the receiver and slowly time-varying fading. By explicitly accounting for practical constraints imposed by the next-generation wireless standards such as training, packetization and antenna switching time, we propose a single receive AS method for time-varying fading channels. The method exploits the low training overhead and accuracy possible from the use of discrete prolate spheroidal (DPS) sequences based reduced rank subspace projection techniques. It only requires knowledge of the Doppler bandwidth, and does not require detailed correlation knowledge. Closed-form expressions for the channel prediction and estimation error as well as symbol error probability (SEP) of M-ary phase-shift keying (MPSK) for symbol-by-symbol receive AS are also derived. It is shown that the proposed AS scheme, after accounting for the practical limitations mentioned above, outperforms the ideal conventional single-input single-output (SISO) system with perfect CSI and no AS at the receiver and AS with conventional estimation based on complex exponential basis functions.
Suite of tools for statistical N-gram language modeling for pattern mining in whole genome sequences
Abstract:
Genome sequences contain a number of patterns that have biomedical significance. Repetitive sequences of various kinds are a primary component of most genomic sequence patterns. We extended the suffix-array based Biological Language Modeling Toolkit to compute n-gram frequencies as well as n-gram language-model based perplexity in windows over the whole genome sequence to find biologically relevant patterns. We present the suite of tools and their application to the analysis of the whole human genome sequence.
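A small Python sketch can show what windowed n-gram perplexity looks like in practice. It does not use the toolkit's suffix-array implementation: it swaps in a plain `Counter`, assumes a four-letter DNA alphabet with add-one smoothing, and uses hypothetical function names and a made-up toy sequence.

```python
import math
from collections import Counter


def ngram_counts(seq, n):
    """Count all overlapping n-grams in a sequence."""
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def window_perplexity(window, ngrams, contexts, n, alphabet_size=4):
    """Perplexity of one window under an add-one smoothed n-gram model."""
    log_prob = 0.0
    steps = 0
    for i in range(len(window) - n + 1):
        gram = window[i:i + n]
        # P(last symbol | preceding n-1 symbols), with add-one smoothing.
        p = (ngrams[gram] + 1) / (contexts[gram[:-1]] + alphabet_size)
        log_prob += math.log(p)
        steps += 1
    return math.exp(-log_prob / steps) if steps else float("nan")


def scan_windows(seq, n, window, step):
    """Return (start, perplexity) for each window along the sequence."""
    ngrams = ngram_counts(seq, n)
    contexts = Counter()
    for gram, count in ngrams.items():
        contexts[gram[:-1]] += count  # how often each (n-1)-gram context is extended
    return [
        (start, window_perplexity(seq[start:start + window], ngrams, contexts, n))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence: a long AT repeat followed by a short GC-rich tail.
toy = "AT" * 50 + "GGCCGCGGCCCGGGCGCGCC"
profile = scan_windows(toy, n=2, window=20, step=20)
```

In this toy example the windows inside the AT repeat have lower perplexity than the final window, which is the kind of contrast a windowed scan is meant to surface.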