2 resultados para Match analysis

em Illinois Digital Environment for Access to Learning and Scholarship Repository


Relevância:

30.00% 30.00%

Publicador:

Resumo:

Microsecond long Molecular Dynamics (MD) trajectories of biomolecular processes are now possible due to advances in computer technology. Soon, trajectories long enough to probe dynamics over many milliseconds will become available. Since these timescales match the physiological timescales over which many small proteins fold, all atom MD simulations of protein folding are now becoming popular. To distill features of such large folding trajectories, we must develop methods that can both compress trajectory data to enable visualization, and that can yield themselves to further analysis, such as the finding of collective coordinates and reduction of the dynamics. Conventionally, clustering has been the most popular MD trajectory analysis technique, followed by principal component analysis (PCA). Simple clustering used in MD trajectory analysis suffers from various serious drawbacks, namely, (i) it is not data driven, (ii) it is unstable to noise and change in cutoff parameters, and (iii) since it does not take into account interrelationships amongst data points, the separation of data into clusters can often be artificial. Usually, partitions generated by clustering techniques are validated visually, but such validation is not possible for MD trajectories of protein folding, as the underlying structural transitions are not well understood. Rigorous cluster validation techniques may be adapted, but it is more crucial to reduce the dimensions in which MD trajectories reside, while still preserving their salient features. PCA has often been used for dimension reduction and while it is computationally inexpensive, being a linear method, it does not achieve good data compression. In this thesis, I propose a different method, a nonmetric multidimensional scaling (nMDS) technique, which achieves superior data compression by virtue of being nonlinear, and also provides a clear insight into the structural processes underlying MD trajectories. I illustrate the capabilities of nMDS by analyzing three complete villin headpiece folding and six norleucine mutant (NLE) folding trajectories simulated by Freddolino and Schulten [1]. Using these trajectories, I make comparisons between nMDS, PCA and clustering to demonstrate the superiority of nMDS. The three villin headpiece trajectories showed great structural heterogeneity. Apart from a few trivial features like early formation of secondary structure, no commonalities between trajectories were found. There were no units of residues or atoms found moving in concert across the trajectories. A flipping transition, corresponding to the flipping of helix 1 relative to the plane formed by helices 2 and 3 was observed towards the end of the folding process in all trajectories, when nearly all native contacts had been formed. However, the transition occurred through a different series of steps in all trajectories, indicating that it may not be a common transition in villin folding. The trajectories showed competition between local structure formation/hydrophobic collapse and global structure formation in all trajectories. Our analysis on the NLE trajectories confirms the notion that a tight hydrophobic core inhibits correct 3-D rearrangement. Only one of the six NLE trajectories folded, and it showed no flipping transition. All the other trajectories get trapped in hydrophobically collapsed states. The NLE residues were found to be buried deeply into the core, compared to the corresponding lysines in the villin headpiece, thereby making the core tighter and harder to undo for 3-D rearrangement. Our results suggest that the NLE may not be a fast folder as experiments suggest. The tightness of the hydrophobic core may be a very important factor in the folding of larger proteins. It is likely that chaperones like GroEL act to undo the tight hydrophobic core of proteins, after most secondary structure elements have been formed, so that global rearrangement is easier. I conclude by presenting facts about chaperone-protein complexes and propose further directions for the study of protein folding.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Many existing encrypted Internet protocols leak information through packet sizes and timing. Though seemingly innocuous, prior work has shown that such leakage can be used to recover part or all of the plaintext being encrypted. The prevalence of encrypted protocols as the underpinning of such critical services as e-commerce, remote login, and anonymity networks and the increasing feasibility of attacks on these services represent a considerable risk to communications security. Existing mechanisms for preventing traffic analysis focus on re-routing and padding. These prevention techniques have considerable resource and overhead requirements. Furthermore, padding is easily detectable and, in some cases, can introduce its own vulnerabilities. To address these shortcomings, we propose embedding real traffic in synthetically generated encrypted cover traffic. Novel to our approach is our use of realistic network protocol behavior models to generate cover traffic. The observable traffic we generate also has the benefit of being indistinguishable from other real encrypted traffic further thwarting an adversary's ability to target attacks. In this dissertation, we introduce the design of a proxy system called TrafficMimic that implements realistic cover traffic tunneling and can be used alone or integrated with the Tor anonymity system. We describe the cover traffic generation process including the subtleties of implementing a secure traffic generator. We show that TrafficMimic cover traffic can fool a complex protocol classification attack with 91% of the accuracy of real traffic. TrafficMimic cover traffic is also not detected by a binary classification attack specifically designed to detect TrafficMimic. We evaluate the performance of tunneling with independent cover traffic models and find that they are comparable, and, in some cases, more efficient than generic constant-rate defenses. We then use simulation and analytic modeling to understand the performance of cover traffic tunneling more deeply. We find that we can take measurements from real or simulated traffic with no tunneling and use them to estimate parameters for an accurate analytic model of the performance impact of cover traffic tunneling. Once validated, we use this model to better understand how delay, bandwidth, tunnel slowdown, and stability affect cover traffic tunneling. Finally, we take the insights from our simulation study and develop several biasing techniques that we can use to match the cover traffic to the real traffic while simultaneously bounding external information leakage. We study these bias methods using simulation and evaluate their security using a Bayesian inference attack. We find that we can safely improve performance with biasing while preventing both traffic analysis and defense detection attacks. We then apply these biasing methods to the real TrafficMimic implementation and evaluate it on the Internet. We find that biasing can provide 3-5x improvement in bandwidth for bulk transfers and 2.5-9.5x speedup for Web browsing over tunneling without biasing.