16 results for Data Systems

in CaltechTHESIS


Relevance: 40.00%

Abstract:

The LIGO and Virgo gravitational-wave observatories are complex and extremely sensitive strain detectors that can be used to search for a wide variety of gravitational waves from astrophysical and cosmological sources. In this thesis, I motivate the search for gravitational-wave signals from coalescing black hole binary systems with total mass between 25 and 100 solar masses. The mechanisms for the formation of such systems are not well understood, and we have few observational constraints on the parameters that guide the formation scenarios. Detection of gravitational waves from such systems (or, in the absence of detection, tighter upper limits on the rate of such coalescences) will provide valuable information that can inform the astrophysics of their formation. I review the search for these systems and place upper limits on the rate of black hole binary coalescences with total mass between 25 and 100 solar masses. I then show how the sensitivity of this search can be improved by up to 40% by the application of a multivariate statistical classifier, a random forest of bagged decision trees, to discriminate more effectively between signal and non-Gaussian instrumental noise. I also discuss the use of this classifier in the search for the ringdown signal from the merger of two black holes with total mass between 50 and 450 solar masses, and present upper limits. I also apply multivariate statistical classifiers to the problem of quantifying the non-Gaussianity of LIGO data. Despite these improvements, no gravitational-wave signals have been detected in LIGO data so far. However, the use of multivariate statistical classification can significantly improve the sensitivity of the Advanced LIGO detectors to such signals.
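
This kind of classifier is straightforward to prototype with standard tools. The sketch below trains a random forest of bagged decision trees to separate two classes of triggers; the three features and all numbers are invented stand-ins for the search's actual trigger statistics, not the thesis pipeline.

```python
# Minimal sketch: random forest of bagged decision trees ranking
# candidate triggers. Features (SNR-like statistics) are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
noise = rng.normal(loc=[8.0, 2.0, 1.0], scale=1.0, size=(5000, 3))   # "glitch" triggers
signal = rng.normal(loc=[10.0, 1.0, 1.5], scale=1.0, size=(5000, 3)) # injected signals
X = np.vstack([noise, signal])
y = np.concatenate([np.zeros(5000), np.ones(5000)])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=200, bootstrap=True)  # bagging
clf.fit(X_tr, y_tr)
scores = clf.predict_proba(X_te)[:, 1]   # rank events by signal probability
print("held-out accuracy:", clf.score(X_te, y_te))
```

Ranking events by the forest's score, rather than cutting on a single statistic, is what allows the detection threshold to be lowered without admitting more non-Gaussian noise.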

Relevance: 30.00%

Abstract:

This dissertation is concerned with the problem of determining the dynamic characteristics of complicated engineering systems and structures from measurements made during dynamic tests or natural excitations. Particular attention is given to the identification and modeling of the behavior of structural dynamic systems in the nonlinear hysteretic response regime. Once a model for the system has been identified, it can be used to assess the condition of the system and to predict its response to future excitations.

A new identification methodology based upon a generalization of the method of modal identification for multi-degree-of-freedom dynamical systems subjected to base motion is developed. The situation considered herein is that in which only the base input and the response of a small number of degrees-of-freedom of the system are measured. In this method, called the generalized modal identification method, the response is separated into "modes" which are analogous to those of a linear system. Both parametric and nonparametric models can be employed to extract the unknown nature, hysteretic or nonhysteretic, of the generalized restoring force for each mode.

In this study, a simple four-term nonparametric model is used first to provide a nonhysteretic estimate of the nonlinear stiffness and energy dissipation behavior. To extract the hysteretic nature of nonlinear systems, a two-parameter distributed element model is then employed. This model exploits the results of the nonparametric identification as an initial estimate for the model parameters. This approach greatly improves the convergence of the subsequent optimization process.
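
As a concrete illustration of the nonparametric first step, a basis-expansion fit of the restoring force can be done by ordinary least squares. The four basis terms below are illustrative choices under the assumption of polynomial stiffness and damping terms; the thesis's actual four terms may differ.

```python
# Hedged sketch: least-squares fit of a four-term restoring-force model
# r(x, v) ~ a1*x + a2*x**3 + a3*v + a4*v|v| from simulated measurements.
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 2000)                     # displacement samples
v = rng.uniform(-1, 1, 2000)                     # velocity samples
r = 2.0*x + 0.5*x**3 + 0.3*v + 0.1*v*np.abs(v)   # "measured" restoring force
r += 0.01 * rng.normal(size=2000)                # measurement noise

Phi = np.column_stack([x, x**3, v, v * np.abs(v)])   # design matrix
coef, *_ = np.linalg.lstsq(Phi, r, rcond=None)
print("identified coefficients:", np.round(coef, 3))
```

The fitted coefficients would then seed the two-parameter distributed element model, which is the role the nonparametric estimate plays in improving convergence of the subsequent optimization.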

The capability of the new method is verified using simulated response data from a three-degree-of-freedom system. The new method is also applied to the analysis of response data obtained from the U.S.-Japan cooperative pseudo-dynamic test of a full-scale six-story steel-frame structure.

The new system identification method described has been found to be both accurate and computationally efficient. It is believed that it will provide a useful tool for the analysis of structural response data.

Relevance: 30.00%

Abstract:

This thesis discusses various methods for learning and optimization in adaptive systems. Overall, it emphasizes the relationship between optimization, learning, and adaptive systems; and it illustrates the influence of underlying hardware upon the construction of efficient algorithms for learning and optimization. Chapter 1 provides a summary and an overview.

Chapter 2 discusses a method for using feed-forward neural networks to filter the noise out of noise-corrupted signals. The networks use back-propagation learning, but they use it in a way that qualifies as unsupervised learning. The networks adapt based only on the raw input data; there are no external teachers providing information on correct operation during training. The chapter contains an analysis of the learning and develops a simple expression that, based only on the geometry of the network, predicts performance.

Chapter 3 explains a simple model of the piriform cortex, an area in the brain involved in the processing of olfactory information. The model was used to explore the possible effect of acetylcholine on learning and on odor classification. According to the model, the piriform cortex can classify odors better when acetylcholine is present during learning but not present during recall. This is interesting since it suggests that learning and recall might be separate neurochemical modes (corresponding to whether or not acetylcholine is present). When acetylcholine is turned off at all times, even during learning, the model exhibits behavior somewhat similar to Alzheimer's disease, a disease associated with the degeneration of cells that distribute acetylcholine.

Chapters 4, 5, and 6 discuss algorithms appropriate for adaptive systems implemented entirely in analog hardware. The algorithms inject noise into the systems and correlate the noise with the outputs of the systems. This allows them to estimate gradients and to implement noisy versions of gradient descent, without having to calculate gradients explicitly. The methods require only noise generators, adders, multipliers, integrators, and differentiators; and the number of devices needed scales linearly with the number of adjustable parameters in the adaptive systems. With the exception of one global signal, the algorithms require only local information exchange.
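
In software, the noise-injection idea reduces to a simple loop: perturb the parameters with zero-mean noise, correlate the noise with the resulting change in a scalar performance signal, and step against the estimated gradient. The sketch below is a discrete-time analogue of the analog scheme, with an invented quadratic loss standing in for the adaptive system's error signal.

```python
# Discrete-time sketch of gradient estimation by noise correlation:
# no explicit gradient calculation, only one global scalar signal.
import numpy as np

rng = np.random.default_rng(2)
target = np.array([1.0, -2.0, 0.5])

def loss(w):
    return np.sum((w - target) ** 2)   # stand-in for the system's error

w = np.zeros(3)
sigma, lr = 0.1, 0.02
for _ in range(5000):
    xi = rng.normal(scale=sigma, size=3)    # injected noise
    delta = loss(w + xi) - loss(w)          # change in the global signal
    g_hat = (delta / sigma**2) * xi         # correlate noise with output
    w -= lr * g_hat                         # noisy gradient descent
print("recovered parameters:", np.round(w, 2))
```

Note how the per-parameter cost is constant (one noise source and one correlator each), matching the linear hardware scaling claimed above.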

Relevance: 30.00%

Abstract:

Vortex rings constitute the main structure in the wakes of a wide class of swimming and flying animals, as well as in cardiac flows and in the jets generated by some mosses and fungi. However, there is a physical limit, determined by an energy maximization principle called the Kelvin-Benjamin principle, to the size that axisymmetric vortex rings can achieve. The existence of this limit is known to lead to the separation of a growing vortex ring from the shear layer feeding it, a process known as "vortex pinch-off" and characterized by the dimensionless vortex formation number. The goal of this thesis is to improve our understanding of vortex pinch-off as it relates to biological propulsion, and to provide future researchers with tools to assist in identifying and predicting pinch-off in biological flows.

To this end, we introduce a method for identifying pinch-off in starting jets using the Lagrangian coherent structures in the flow, and apply this criterion to an experimentally generated starting jet. Since most naturally occurring vortex rings are not circular, we extend the definition of the vortex formation number to include non-axisymmetric vortex rings, and find that the formation number for moderately non-axisymmetric vortices is similar to that of circular vortex rings. This suggests that naturally occurring vortex rings may be modeled as axisymmetric vortex rings. We therefore consider the perturbation response of the Norbury family of axisymmetric vortex rings. This family is chosen to model vortex rings of increasing thickness and circulation, and their response to prolate shape perturbations is simulated using contour dynamics. Finally, contour dynamics is used to simulate the response of more realistic vortex ring models, constructed from experimental data using nested contours, to perturbations that more closely resemble those encountered by forming vortices. In both families of models, a change in response analogous to pinch-off is found as members of the family with progressively thicker cores are considered. We posit that this analogy may be exploited to understand and predict pinch-off in complex biological flows, where current methods are not applicable in practice and criteria based on the properties of vortex rings alone are necessary.
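
For orientation, the formation number is a dimensionless formation time: for a piston-cylinder starting jet with constant piston speed it reduces to the stroke ratio L/D, and pinch-off for circular rings is typically observed near a value of about 4. The numbers below are invented for illustration.

```python
# Illustrative formation-time calculation for a starting jet.
D = 0.025           # nozzle diameter [m] (made-up value)
U_p = 0.1           # constant piston velocity [m/s]
t_eject = 1.0       # ejection duration [s]

L = U_p * t_eject   # length of the ejected fluid slug
T_star = L / D      # dimensionless formation time (stroke ratio)
print(f"T* = {T_star:.1f}")   # 4.0 here: at the nominal pinch-off threshold
```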

Relevance: 30.00%

Abstract:

The AM CVn systems are a rare class of ultra-compact astrophysical binaries. With orbital periods under an hour, and as short as five minutes, they are among the most compact binary star systems known, and their evolution has direct relevance to the type Ia supernova rate and the white dwarf binary population. However, their faint and rare nature has made population studies of these systems difficult, and several studies have found conflicting results.

I undertook a survey for AM CVn systems using the Palomar Transient Factory (PTF) astrophysical synoptic survey by exploiting the "outbursts" these systems undergo. Such events result in an increase in luminosity by a factor of up to two hundred (roughly 5.8 magnitudes) and are detectable in time-domain photometric data of AM CVn systems. My search resulted in the discovery of eight new systems, over 20% of the currently known population. More importantly, this search was done in a systematic fashion, which allows for a population study that properly accounts for biases.

Apart from the discovery of new systems, I used the time-domain data from the PTF and other synoptic surveys to better understand the long-term behavior of these systems. This analysis of the photometric behavior of the majority of known AM CVn systems has revealed changes in their behavior on longer timescales than previously observed. This has allowed me to find relationships between the outburst properties of an individual system and its orbital period.

Even more importantly, the systematically selected sample, together with these properties, has allowed me to conduct a population study of the AM CVn systems. I have shown that the latest published estimates of the AM CVn population, a factor of fifty below theoretical estimates, are consistent with the sample of systems presented here. This is particularly noteworthy since my population study is most sensitive to a different orbital period regime than earlier surveys. This confirmation of the population density will allow the AM CVn population to be used in the study of other areas of astrophysics.

Relevance: 30.00%

Abstract:

A central objective in signal processing is to infer meaningful information from a set of measurements or data. While most signal models have an overdetermined structure (fewer unknowns than equations), traditionally very few statistical estimation problems have considered a data model which is underdetermined (more unknowns than equations). In recent times, however, a wealth of theoretical and computational methods has been developed primarily to study underdetermined systems by imposing sparsity on the unknown variables. This is motivated by the observation that despite the huge volume of data that arises in sensor networks, genomics, imaging, particle physics, web search, etc., the information content is often much smaller than the number of raw measurements. This has given rise to the possibility of reducing the number of measurements by downsampling the data, which automatically gives rise to underdetermined systems.

In this thesis, we provide new directions for estimation in an underdetermined system, both for a class of parameter estimation problems and also for the problem of sparse recovery in compressive sensing. There are two main contributions of the thesis: design of new sampling and statistical estimation algorithms for array processing, and development of improved guarantees for sparse reconstruction by introducing a statistical framework to the recovery problem.

We consider underdetermined observation models in array processing where the number of unknown sources simultaneously received by the array can be considerably larger than the number of physical sensors. We study new sparse spatial sampling schemes (array geometries) as well as propose new recovery algorithms that can exploit priors on the unknown signals and unambiguously identify all the sources. The proposed sampling structure is generic enough to be extended to multiple dimensions as well as to exploit different kinds of priors in the model such as correlation, higher order moments, etc.

Recognizing the role of correlation priors and suitable sampling schemes for underdetermined estimation in array processing, we introduce a correlation-aware framework for recovering sparse support in compressive sensing. We show that it is possible to strictly increase the size of the recoverable sparse support using this framework provided the measurement matrix is suitably designed. The proposed nested and coprime arrays are shown to be appropriate candidates in this regard. We also provide new guarantees for convex and greedy formulations of the support recovery problem and demonstrate that it is possible to strictly improve upon existing guarantees.
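
The benefit of such geometries can be seen in a few lines: with a coprime pair (M, N), on the order of MN distinct correlation lags are available from only O(M + N) sensors, which is what lets correlation-aware recovery identify more sources than physical sensors. The construction below is one common coprime-array variant, used purely for illustration; the exact geometries proposed in the thesis may differ.

```python
# Difference co-array of a coprime array: few sensors, many lags.
M, N = 3, 5                              # coprime integers
sub1 = {n * M for n in range(N)}         # sensors at 0, M, ..., (N-1)M
sub2 = {m * N for m in range(2 * M)}     # sensors at 0, N, ..., (2M-1)N
positions = sorted(sub1 | sub2)
lags = {p - q for p in positions for q in positions}
print(len(positions), "sensors ->", len(lags), "distinct correlation lags")
```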

This new paradigm of underdetermined estimation, which explicitly establishes the fundamental interplay between sampling, statistical priors, and the underlying sparsity, leads to exciting future research directions in a variety of application areas, and also gives rise to new questions that can lead to stand-alone theoretical results in their own right.

Relevance: 30.00%

Abstract:

Quasi Delay-Insensitive (QDI) systems must be reset into a valid initial state before normal operation can start; otherwise, deadlock may occur due to incorrect handshake communication between processes. This thesis first reviews the traditional Global Reset Scheme (GRS). It then proposes a new Wave Reset Scheme (WRS). By utilizing the third possible value of QDI data codes, the reset value, WRS propagates data carrying the reset value and triggers Local Reset (LR) sequentially. The global reset network required by GRS can be removed, and all reset signals are generated locally for each process. Circuit templates as well as some special blocks are modified to accommodate the reset value in WRS. An algorithm is proposed to choose the proper Local Reset Input (LRI) in order to shorten the reset time. WRS is then applied to an iterative multiplier, which is shown to work correctly under different operating conditions.

Relevance: 30.00%

Abstract:

The applicability of the white-noise method to the identification of a nonlinear system is investigated. Subsequently, the method is applied to certain vertebrate retinal neuronal systems, and nonlinear, dynamic transfer functions are derived which describe quantitatively the information transformations starting with the light-pattern stimulus and culminating in the ganglion response, which constitutes the visually derived input to the brain. The retina of the catfish, Ictalurus punctatus, is used for the experiments.

The Wiener formulation of the white-noise theory is shown to be impractical and difficult to apply to a physical system. A different formulation based on cross-correlation techniques is shown to be applicable to a wide range of physical systems provided certain considerations are taken into account. These considerations include the time-invariance of the system, an optimum choice of the white-noise input bandwidth, nonlinearities that allow a representation in terms of a small number of characterizing kernels, the memory of the system, and the temporal length of the characterizing experiment. Error analysis of the kernel estimates is made taking into account various sources of error, such as noise at the input and output, the bandwidth of the white-noise input, and the truncation of the Gaussian by the apparatus.
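
The cross-correlation formulation is compact enough to sketch directly. For a zero-mean Gaussian white-noise input x with power level P, the first-order kernel follows from h1(tau) ~ E[y(t) x(t - tau)] / P (the Lee-Schetzen approach); the system below is an invented linear filter followed by a mild static nonlinearity, not the retinal data.

```python
# First-order kernel estimation by cross-correlation against a
# Gaussian white-noise input (toy system for illustration).
import numpy as np

rng = np.random.default_rng(3)
n, P = 200_000, 1.0
x = rng.normal(scale=np.sqrt(P), size=n)         # white-noise stimulus

true_h1 = np.exp(-np.arange(50) / 10.0)          # hidden linear kernel
lin = np.convolve(x, true_h1)[:n]
y = lin + 0.2 * lin**2                           # mild static nonlinearity

h1_est = np.array([np.mean(y[tau:] * x[:n - tau])
                   for tau in range(50)]) / P
print("max kernel error:", np.max(np.abs(h1_est - true_h1)))
```

The same considerations listed above show up here: the record length n controls the estimation error, and the assumed memory (50 lags) must cover the true kernel.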

Nonlinear transfer functions are obtained, as sets of kernels, for several neuronal systems: Light → Receptors, Light → Horizontal, Horizontal → Ganglion, Light → Ganglion and Light → ERG. The derived models can predict, with reasonable accuracy, the system response to any input. Comparison of model and physical system performance showed close agreement for a great number of tests, the most stringent of which is comparison of their responses to a white-noise input. Other tests include step and sine responses and power spectra.

Many functional traits are revealed by these models. Some are: (a) the receptor and horizontal cell systems are nearly linear (small signal) with certain "small" nonlinearities, and become faster (latency-wise and frequency-response-wise) at higher intensity levels, (b) all ganglion systems are nonlinear (half-wave rectification), (c) the receptive field center to ganglion system is slower (latency-wise and frequency-response-wise) than the periphery to ganglion system, (d) the lateral (eccentric) ganglion systems are just as fast (latency and frequency response) as the concentric ones, (e) (bipolar response) = (input from receptors) - (input from horizontal cell), (f) receptive field center and periphery exert an antagonistic influence on the ganglion response, (g) implications about the origin of ERG, and many others.

An analytical solution is obtained for the spatial distribution of potential in the S-space, which fits experimental data very well. Different synaptic mechanisms of excitation for the external and internal horizontal cells are implied.

Relevance: 30.00%

Abstract:

The proliferation of smartphones and other internet-enabled, sensor-equipped consumer devices enables us to sense and act upon the physical environment in unprecedented ways. This thesis considers Community Sense-and-Response (CSR) systems, a new class of web application for acting on sensory data gathered from participants' personal smart devices. The thesis describes how rare events can be reliably detected using a decentralized anomaly detection architecture that performs client-side anomaly detection and server-side event detection. After analyzing this decentralized anomaly detection approach, the thesis describes how weak but spatially structured events can be detected, despite significant noise, when the events have a sparse representation in an alternative basis. Finally, the thesis describes how the statistical models needed for client-side anomaly detection may be learned efficiently, using limited space, via coresets.
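
A toy rendition of that split, with invented thresholds and counts, might look like the following: each client flags samples that are anomalous relative to its own baseline and sends only those picks, and the server declares an event when picks from many clients coincide.

```python
# Hedged sketch of decentralized detection: client-side anomaly
# detection, server-side event detection. All parameters are invented.
import numpy as np

rng = np.random.default_rng(4)

def client_picks(sig, mu, sigma, k=5.0):
    # Client side: flag samples > k sigma from this device's baseline.
    return np.nonzero(np.abs(sig - mu) > k * sigma)[0]

n_clients, T = 1000, 500
data = rng.normal(0.0, 1.0, (n_clients, T))   # per-device sensor noise
data[:, 300:320] += 8.0                       # a shared transient (the "event")

picks = np.zeros(T)
for c in range(n_clients):
    picks[client_picks(data[c], 0.0, 1.0)] += 1

event = picks > 0.1 * n_clients               # server side: >10% of clients agree
print("event windows:", np.nonzero(event)[0])
```

The point of the architecture is bandwidth and robustness: clients transmit rare picks rather than raw waveforms, and the server's coincidence test keeps any single noisy device from triggering a network-wide alert.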

The Caltech Community Seismic Network (CSN) is a prototypical example of a CSR system that harnesses accelerometers in volunteers' smartphones and consumer electronics. Using CSN, this thesis presents the systems and algorithmic techniques to design, build and evaluate a scalable network for real-time awareness of spatial phenomena such as dangerous earthquakes.

Relevance: 30.00%

Abstract:

The assembly history of massive galaxies is one of the most important aspects of galaxy formation and evolution. Although we have a broad idea of what physical processes govern the early phases of galaxy evolution, there are still many open questions. In this thesis I demonstrate the crucial role that spectroscopy can play in a physical understanding of galaxy evolution. I present deep near-infrared spectroscopy for a sample of high-redshift galaxies, from which I derive important physical properties and their evolution with cosmic time. I take advantage of the recent arrival of efficient near-infrared detectors to target the rest-frame optical spectra of z > 1 galaxies, from which many physical quantities can be derived. After illustrating the applications of near-infrared deep spectroscopy with a study of star-forming galaxies, I focus on the evolution of massive quiescent systems.

Most of this thesis is based on two samples collected at the W. M. Keck Observatory that represent a significant step forward in the spectroscopic study of z > 1 quiescent galaxies. All previous spectroscopic samples at this redshift were either limited to a few objects or much shallower. Our first sample is composed of 56 quiescent galaxies at 1 < z < 1.6 collected using the upgraded red arm of the Low Resolution Imaging Spectrometer (LRIS). The second consists of 24 deep spectra of 1.5 < z < 2.5 quiescent objects observed with the Multi-Object Spectrometer For Infra-Red Exploration (MOSFIRE). Together, these spectra span the critical epoch 1 < z < 2.5, where most of the red sequence is formed and where the sizes of quiescent systems are observed to increase significantly.

We measure stellar velocity dispersions and dynamical masses for the largest number of z > 1 quiescent galaxies to date. By assuming that the velocity dispersion of a massive galaxy does not change throughout its lifetime, as suggested by theoretical studies, we match galaxies in the local universe with their high-redshift progenitors. This allows us to derive the physical growth in mass and size experienced by individual systems, which represents a substantial advance over photometric inferences based on the overall galaxy population. We find significant physical growth among quiescent galaxies over 0 < z < 2.5 and, by comparing the slope of growth in the mass-size plane, dlogRe/dlogM, with the results of numerical simulations, we can constrain the physical process responsible for the evolution. Our results show that the slope of growth becomes steeper at higher redshifts, yet is broadly consistent with minor mergers being the main process by which individual objects evolve in mass and size.
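
The matched-progenitor logic reduces to simple arithmetic once pairs are identified. The sketch below computes the growth slope from one invented (progenitor, descendant) pair; a slope near 2 is the signature commonly attributed to minor mergers, versus roughly 1 for major mergers.

```python
# Illustrative growth-slope computation for a matched pair of galaxies
# (numbers invented, not the thesis sample).
import numpy as np

Re = np.array([2.0, 5.0])        # effective radii [kpc]: high-z, z~0
M = np.array([1.0e11, 1.5e11])   # stellar masses [Msun]

slope = np.diff(np.log10(Re))[0] / np.diff(np.log10(M))[0]
print(f"dlogRe/dlogM = {slope:.1f}")   # ~2.3 here: minor-merger-like growth
```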

By fitting stellar population models to the observed spectroscopy and photometry we derive reliable ages and other stellar population properties. We show that the addition of the spectroscopic data helps break the degeneracy between age and dust extinction, and yields significantly more robust results compared to fitting models to the photometry alone. We detect a clear relation between size and age, where larger galaxies are younger. Therefore, over time the average size of the quiescent population will increase because of the contribution of large galaxies that have recently arrived on the red sequence. This effect, called progenitor bias, is different from the physical size growth discussed above, but represents another contribution to the observed difference between the typical sizes of low- and high-redshift quiescent galaxies. By reconstructing the evolution of the red sequence starting at z ∼ 1.25 and using our stellar population histories to infer the past behavior to z ∼ 2, we demonstrate that progenitor bias accounts for only half of the observed growth of the population. The remaining size evolution must be due to physical growth of individual systems, in agreement with our dynamical study.

Finally, we use the stellar population properties to explore the earliest periods which led to the formation of massive quiescent galaxies. We find tentative evidence for two channels of star formation quenching, which suggests the existence of two independent physical mechanisms. We also detect a mass downsizing, where more massive galaxies form at higher redshift, and then evolve passively. By analyzing in depth the star formation history of the brightest object at z > 2 in our sample, we are able to put constraints on the quenching timescale and on the properties of its progenitor.

A consistent picture emerges from our analyses: massive galaxies form at very early epochs, are quenched on short timescales, and then evolve passively. The evolution is passive in the sense that no new stars are formed, but significant mass and size growth is achieved by accreting smaller, gas-poor systems. At the same time the population of quiescent galaxies grows in number due to the quenching of larger star-forming galaxies. This picture is in agreement with other observational studies, such as measurements of the merger rate and analyses of galaxy evolution at fixed number density.

Relevance: 30.00%

Abstract:

We are at the cusp of a historic transformation of both communication systems and electricity systems. This creates challenges as well as opportunities for the study of networked systems. Problems in these systems typically involve a huge number of endpoints that require intelligent coordination in a distributed manner. In this thesis, we develop models, theories, and scalable distributed optimization and control algorithms to overcome these challenges.

This thesis focuses on two specific areas: multi-path TCP (Transmission Control Protocol) and electricity distribution system operation and control. Multi-path TCP (MP-TCP) is a TCP extension that allows a single data stream to be split across multiple paths. MP-TCP has the potential to greatly improve the reliability as well as the efficiency of communication devices. We propose a fluid model for a large class of MP-TCP algorithms and identify design criteria that guarantee the existence, uniqueness, and stability of system equilibrium. We clarify how algorithm parameters impact TCP-friendliness, responsiveness, and window oscillation, and demonstrate an inevitable tradeoff among these properties. We discuss the implications of these properties for the behavior of existing algorithms and motivate a new algorithm, Balia (balanced linked adaptation), which generalizes existing algorithms and strikes a good balance among TCP-friendliness, responsiveness, and window oscillation. We have implemented Balia in the Linux kernel and use our prototype to compare it with existing MP-TCP algorithms.
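
The coupling that such fluid models describe can be mimicked in a few lines. The sketch below is a toy, event-driven rendition of a linked-increases style update (increase shared across subflows, multiplicative decrease per loss); it illustrates the coupled dynamics being modeled, not Balia's actual update rule.

```python
# Toy coupled MP-TCP window dynamics over two paths with different
# loss rates (all constants invented).
import numpy as np

rng = np.random.default_rng(6)
w = np.array([10.0, 10.0])        # per-subflow congestion windows
loss_p = np.array([0.01, 0.02])   # per-packet loss probability per path

for _ in range(50_000):
    r = rng.integers(2)                         # event on subflow r
    if rng.random() < loss_p[r]:
        w[r] *= 0.5                             # multiplicative decrease
    else:
        w[r] += min(1.0 / w.sum(), 1.0 / w[r])  # coupled increase
print("final windows:", np.round(w, 1))
```

In equilibrium the coupled increase shifts traffic toward the less lossy path, which is the load-balancing behavior the design criteria above are meant to guarantee without sacrificing TCP-friendliness.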

Our second focus is on designing computationally efficient algorithms for electricity distribution system operation and control. First, we develop efficient algorithms for feeder reconfiguration in distribution networks. The feeder reconfiguration problem chooses the on/off status of the switches in a distribution network in order to minimize a certain cost, such as power loss. It is a mixed integer nonlinear program and hence hard to solve. We propose a heuristic algorithm that is based on the recently developed convex relaxation of the optimal power flow problem. The algorithm is efficient and successfully computes an optimal configuration on all networks that we have tested. Moreover, we prove that the algorithm solves the feeder reconfiguration problem optimally under certain conditions. We also propose an even faster algorithm that incurs an optimality loss of less than 3% on the test networks.

Second, we develop efficient distributed algorithms that solve the optimal power flow (OPF) problem on distribution networks. The OPF problem determines a network operating point that minimizes a certain objective, such as generation cost or power loss. Traditionally, OPF is solved in a centralized manner. With increasing penetration of volatile renewable energy resources in distribution systems, we need faster and distributed solutions for real-time feedback control. This is difficult because power flow equations are nonlinear and Kirchhoff's law is global. We propose solutions for both balanced and unbalanced radial distribution networks. They exploit recent results showing that a globally optimal solution of OPF over a radial network can be obtained through a second-order cone program (SOCP) or semidefinite program (SDP) relaxation. Our distributed algorithms are based on the alternating direction method of multipliers (ADMM), but unlike standard ADMM-based distributed OPF algorithms that require solving optimization subproblems using iterative methods, the proposed solutions exploit the problem structure to greatly reduce the computation time. Specifically, for balanced networks, our decomposition allows us to derive closed-form solutions for these subproblems, which speeds up convergence by a factor of 1000 in simulations. For unbalanced networks, the subproblems reduce to either closed-form solutions or eigenvalue problems whose size remains constant as the network scales up, and computation time is reduced by a factor of 100 compared with iterative methods.
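
The computational point, that ADMM becomes fast when each subproblem has a closed form, is easy to demonstrate on a toy problem. The sketch below solves a box-constrained least-squares problem by ADMM, where the x-update is a pre-factored linear solve and the z-update is a projection; this is a generic illustration, not the OPF decomposition itself.

```python
# ADMM with closed-form subproblem updates (toy box-constrained
# least-squares problem, all data random).
import numpy as np

rng = np.random.default_rng(5)
A, b = rng.normal(size=(30, 5)), rng.normal(size=30)
rho, z, u = 1.0, np.zeros(5), np.zeros(5)

AtA, Atb = A.T @ A, A.T @ b
C = np.linalg.cholesky(AtA + rho * np.eye(5))   # factor once, reuse each iteration

for _ in range(100):
    rhs = Atb + rho * (z - u)
    x = np.linalg.solve(C.T, np.linalg.solve(C, rhs))  # closed-form x-update
    z = np.clip(x + u, -0.5, 0.5)                      # projection: closed-form z-update
    u += x - z                                         # scaled dual update
print("solution:", np.round(z, 3))
```

Replacing an inner iterative solver with updates like these is exactly the kind of structural exploitation behind the reported speedups.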

Relevance: 30.00%

Abstract:

Systems-level studies of biological systems rely on observations taken at a resolution lower than the essential unit of biology, the cell. Recent technical advances in DNA sequencing have enabled measurements of the transcriptomes in single cells excised from their environment, but it remains a daunting technical problem to reconstruct in situ gene expression patterns from sequencing data. In this thesis I develop methods for the routine, quantitative in situ measurement of gene expression using fluorescence microscopy.

The number of molecular species that can be measured simultaneously by fluorescence microscopy is limited by the palette of spectrally distinct fluorophores. Thus, fluorescence microscopy is traditionally limited to the simultaneous measurement of only five labeled biomolecules at a time. The two methods described in this thesis, super-resolution barcoding and sequential barcoding, represent strategies for overcoming this limitation to monitor expression of many genes in a single cell. Super-resolution barcoding employs optical super-resolution microscopy (SRM) and combinatorial labeling via smFISH (single molecule fluorescence in situ hybridization) to uniquely label individual mRNA species with distinct barcodes resolvable at nanometer resolution. This method dramatically increases the optical space in a cell, allowing a large number of barcodes to be visualized simultaneously. As a proof of principle, this technology was used to study the S. cerevisiae calcium stress response. The second method, sequential barcoding, reads out a temporal barcode through multiple rounds of oligonucleotide hybridization to the same mRNA. The multiplexing capacity of sequential barcoding increases exponentially with the number of rounds of hybridization, allowing over a hundred genes to be profiled in only a few rounds of hybridization.
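
The multiplexing arithmetic is worth making explicit: with F spectrally distinct fluorophores and N hybridization rounds, each transcript is assigned one of F**N color sequences. The numbers below are illustrative choices, not the exact experimental parameters.

```python
# Barcode capacity of sequential hybridization rounds.
F = 4   # resolvable colors per round (illustrative)
for N in range(1, 5):
    print(f"{N} round(s): {F**N} distinguishable transcripts")
# With 4 colors, 4 rounds already give 256 barcodes: over a hundred
# genes from "only a few rounds of hybridization", as stated above.
```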

The utility of sequential barcoding was further demonstrated by adapting this method to study gene expression in mammalian tissues. Mammalian tissues suffer from both a large amount of autofluorescence and light scattering, making detection of smFISH probes on mRNA difficult. An amplified single molecule detection technology, smHCR (single molecule hybridization chain reaction), was developed to allow for the quantification of mRNA in tissue. This technology is demonstrated in combination with light sheet microscopy and background-reducing tissue clearing technology, enabling whole-organ sequential barcoding to monitor in situ gene expression directly in intact mammalian tissue.

The methods presented in this thesis, specifically sequential barcoding and smHCR, enable multiplexed transcriptional observations in any tissue of interest. These technologies will serve as a general platform for future transcriptomic studies of complex tissues.

Relevance: 30.00%

Abstract:

A mathematical model is proposed in this thesis for the control mechanism of free fatty acid-glucose metabolism in healthy individuals under resting conditions. The objective is to explain in a consistent manner some clinical laboratory observations, such as glucose, insulin and free fatty acid responses to intravenous injection of glucose, insulin, etc. Responses up to only about two hours from the beginning of infusion are considered. The model is an extension of the one for glucose homeostasis proposed by Charette, Kadish and Sridhar (Modeling and Control Aspects of Glucose Homeostasis. Mathematical Biosciences, 1969). It is based upon a systems approach and agrees with the current theories of glucose and free fatty acid metabolism. The description is in terms of ordinary differential equations. Validation of the model is based on clinical laboratory data available at the present time. Finally, procedures are suggested for systematically identifying the parameters associated with the free fatty acid portion of the model.
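
A minimal sketch of this modeling style, with invented rate constants and coupling terms rather than the thesis's identified parameters, is a small system of coupled ODEs integrated over the two-hour window:

```python
# Toy glucose (G), insulin (I), free fatty acid (F) dynamics after an
# intravenous glucose bolus. All constants are illustrative.
import numpy as np
from scipy.integrate import solve_ivp

def rhs(t, y):
    G, I, F = y
    dG = -0.05 * G - 0.001 * G * I   # clearance + insulin-mediated uptake
    dI = 0.005 * G - 0.05 * I        # glucose-stimulated secretion, decay
    dF = -0.002 * I * F              # insulin suppresses FFA release
    return [dG, dI, dF]

sol = solve_ivp(rhs, (0.0, 120.0), [300.0, 10.0, 0.6], max_step=1.0)
print("state at t = 120 min:", np.round(sol.y[:, -1], 3))
```

Parameter identification of the kind the thesis proposes would then amount to fitting the rate constants so that simulated responses match the clinical infusion data.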

Relevance: 30.00%

Abstract:

I. The binding of the intercalating dye ethidium bromide to closed circular SV 40 DNA causes an unwinding of the duplex structure and a simultaneous and quantitatively equivalent unwinding of the superhelices. The buoyant densities and sedimentation velocities of both intact (I) and singly nicked (II) SV 40 DNAs were measured as a function of free dye concentration. The buoyant density data were used to determine the binding isotherms over a dye concentration range extending from 0 to 600 µg/ml in 5.8 M CsCl. At high dye concentrations all of the binding sites in II, but not in I, are saturated. At free dye concentrations less than 5.4 µg/ml, I has a greater affinity for dye than II. At a critical amount of dye bound, I and II have equal affinities, and at higher dye concentrations I has a lower affinity than II. The number of superhelical turns, τ, present in I is calculated at each dye concentration using Fuller and Waring's (1964) estimate of the angle of duplex unwinding per intercalation. The results reveal that SV 40 DNA I contains about -13 superhelical turns in concentrated salt solutions.
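
The turn count follows from simple bookkeeping: each intercalated dye unwinds the duplex by a fixed angle phi (Fuller and Waring's estimate is 12 degrees), so binding n dye molecules removes n*phi/360 superhelical turns, and the critical binding level at which the superhelices vanish fixes the native τ. The bound-dye number below is an illustrative value chosen to reproduce the quoted result.

```python
# Superhelical turns from the unwinding-per-intercalation estimate.
phi = 12.0          # degrees of duplex unwinding per bound dye
n_critical = 390    # bound dyes at the critical point (illustrative)

tau_native = -(phi / 360.0) * n_critical
print(f"native superhelical turns: {tau_native:.0f}")   # about -13
```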

The free energy of superhelix formation is calculated as a function of τ from a consideration of the effect of the superhelical turns upon the binding isotherm of ethidium bromide to SV 40 DNA I. The value of the free energy is about 100 kcal/mole DNA in the native molecule. The free energy estimates are used to calculate the pitch and radius of the superhelix as a function of the number of superhelical turns. The pitch and radius of the native I superhelix are 430 Å and 135 Å, respectively.

A buoyant density method for the isolation and detection of closed circular DNA is described. The method is based upon the reduced binding of the intercalating dye, ethidium bromide, by closed circular DNA. In an application of this method it is found that HeLa cells contain in addition to closed circular mitochondrial DNA of mean length 4.81 microns, a heterogeneous group of smaller DNA molecules which vary in size from 0.2 to 3.5 microns and a paucidisperse group of multiples of the mitochondrial length.

II. The general theory is presented for the sedimentation equilibrium of a macromolecule in a concentrated binary solvent in the presence of an additional reacting small molecule. Equations are derived for the calculation of the buoyant density of the complex and for the determination of the binding isotherm of the reagent to the macrospecies. The standard buoyant density, a thermodynamic function, is defined, and the density gradients which characterize the four-component system are derived. The theory is applied to the specific cases of the binding of ethidium bromide to SV 40 DNA and of the binding of mercury and silver to DNA.

Relevance: 30.00%

Abstract:

This thesis is an investigation into the nature of data analysis and computer software systems which support this activity.

The first chapter develops the notion of data analysis as an experimental science which has two major components: data-gathering and theory-building. The basic role of language in determining the meaningfulness of theory is stressed, and the informativeness of a language and data base pair is studied. The static and dynamic aspects of data analysis are then considered from this conceptual vantage point. The second chapter surveys the available types of computer systems which may be useful for data analysis. Particular attention is paid to the questions raised in the first chapter about the language restrictions imposed by the computer system and its dynamic properties.

The third chapter discusses the REL data analysis system, which was designed to satisfy the needs of the data analyzer in an operational relational data system. The major limitation on the use of such systems is the amount of access to data stored on a relatively slow secondary memory. This problem of the paging of data is investigated and two classes of data structure representations are found, each of which has desirable paging characteristics for certain types of queries. One representation is used by most of the generalized data base management systems in existence today, but the other is clearly preferred in the data analysis environment, as conceptualized in Chapter I.

This data representation has strong implications for a fundamental process of data analysis -- the quantification of variables. Since quantification is one of the few means of summarizing and abstracting, data analysis systems are under strong pressure to facilitate the process. Two implementations of quantification are studied: one analogous to the form of the lower predicate calculus and another more closely attuned to the data representation. A comparison of these indicates that the use of the "label class" method results in orders of magnitude improvement over the lower predicate calculus technique.
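
The source of that improvement can be seen in a toy comparison: the lower-predicate-calculus style rescans the whole relation once per quantified value, while the label class style buckets records by label in a single pass, which is decisive when each scan means paging data off slow secondary memory. The relation and statistics below are invented for illustration.

```python
# Two quantification strategies over a toy relation of (label, value).
from collections import defaultdict

records = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)] * 1000

# Lower-predicate-calculus style: one full scan per distinct label.
labels = {lab for lab, _ in records}
means_lpc = {lab: sum(v for l, v in records if l == lab)
                  / sum(1 for l, _ in records if l == lab)
             for lab in labels}

# Label-class style: a single pass that groups values by label.
classes = defaultdict(list)
for lab, v in records:
    classes[lab].append(v)
means_lc = {lab: sum(vs) / len(vs) for lab, vs in classes.items()}

assert means_lpc == means_lc   # same answer, one scan instead of many
```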