9 resultados para Methods: data analysis
em CaltechTHESIS
Resumo:
In this work, we further extend the recently developed adaptive data analysis method, the Sparse Time-Frequency Representation (STFR) method. This method is based on the assumption that many physical signals inherently contain AM-FM representations. We propose a sparse optimization method to extract the AM-FM representations of such signals. We prove the convergence of the method for periodic signals under certain assumptions and provide practical algorithms specifically for the non-periodic STFR, which extends the method to tackle problems that former STFR methods could not handle, including stability to noise and non-periodic data analysis. This is a significant improvement since many adaptive and non-adaptive signal processing methods are not fully capable of handling non-periodic signals. Moreover, we propose a new STFR algorithm to study intrawave signals with strong frequency modulation and analyze the convergence of this new algorithm for periodic signals. Such signals have previously remained a bottleneck for all signal processing methods. Furthermore, we propose a modified version of STFR that facilitates the extraction of intrawaves that have overlaping frequency content. We show that the STFR methods can be applied to the realm of dynamical systems and cardiovascular signals. In particular, we present a simplified and modified version of the STFR algorithm that is potentially useful for the diagnosis of some cardiovascular diseases. We further explain some preliminary work on the nature of Intrinsic Mode Functions (IMFs) and how they can have different representations in different phase coordinates. This analysis shows that the uncertainty principle is fundamental to all oscillating signals.
Resumo:
The brain is perhaps the most complex system to have ever been subjected to rigorous scientific investigation. The scale is staggering: over 10^11 neurons, each making an average of 10^3 synapses, with computation occurring on scales ranging from a single dendritic spine, to an entire cortical area. Slowly, we are beginning to acquire experimental tools that can gather the massive amounts of data needed to characterize this system. However, to understand and interpret these data will also require substantial strides in inferential and statistical techniques. This dissertation attempts to meet this need, extending and applying the modern tools of latent variable modeling to problems in neural data analysis.
It is divided into two parts. The first begins with an exposition of the general techniques of latent variable modeling. A new, extremely general, optimization algorithm is proposed - called Relaxation Expectation Maximization (REM) - that may be used to learn the optimal parameter values of arbitrary latent variable models. This algorithm appears to alleviate the common problem of convergence to local, sub-optimal, likelihood maxima. REM leads to a natural framework for model size selection; in combination with standard model selection techniques the quality of fits may be further improved, while the appropriate model size is automatically and efficiently determined. Next, a new latent variable model, the mixture of sparse hidden Markov models, is introduced, and approximate inference and learning algorithms are derived for it. This model is applied in the second part of the thesis.
The second part brings the technology of part I to bear on two important problems in experimental neuroscience. The first is known as spike sorting; this is the problem of separating the spikes from different neurons embedded within an extracellular recording. The dissertation offers the first thorough statistical analysis of this problem, which then yields the first powerful probabilistic solution. The second problem addressed is that of characterizing the distribution of spike trains recorded from the same neuron under identical experimental conditions. A latent variable model is proposed. Inference and learning in this model leads to new principled algorithms for smoothing and clustering of spike data.
Resumo:
This thesis is an investigation into the nature of data analysis and computer software systems which support this activity.
The first chapter develops the notion of data analysis as an experimental science which has two major components: data-gathering and theory-building. The basic role of language in determining the meaningfulness of theory is stressed, and the informativeness of a language and data base pair is studied. The static and dynamic aspects of data analysis are then considered from this conceptual vantage point. The second chapter surveys the available types of computer systems which may be useful for data analysis. Particular attention is paid to the questions raised in the first chapter about the language restrictions imposed by the computer system and its dynamic properties.
The third chapter discusses the REL data analysis system, which was designed to satisfy the needs of the data analyzer in an operational relational data system. The major limitation on the use of such systems is the amount of access to data stored on a relatively slow secondary memory. This problem of the paging of data is investigated and two classes of data structure representations are found, each of which has desirable paging characteristics for certain types of queries. One representation is used by most of the generalized data base management systems in existence today, but the other is clearly preferred in the data analysis environment, as conceptualized in Chapter I.
This data representation has strong implications for a fundamental process of data analysis -- the quantification of variables. Since quantification is one of the few means of summarizing and abstracting, data analysis systems are under strong pressure to facilitate the process. Two implementations of quantification are studied: one analagous to the form of the lower predicate calculus and another more closely attuned to the data representation. A comparison of these indicates that the use of the "label class" method results in orders of magnitude improvement over the lower predicate calculus technique.
Resumo:
Laser interferometer gravitational wave observatory (LIGO) consists of two complex large-scale laser interferometers designed for direct detection of gravitational waves from distant astrophysical sources in the frequency range 10Hz - 5kHz. Direct detection of space-time ripples will support Einstein's general theory of relativity and provide invaluable information and new insight into physics of the Universe.
Initial phase of LIGO started in 2002, and since then data was collected during six science runs. Instrument sensitivity was improving from run to run due to the effort of commissioning team. Initial LIGO has reached designed sensitivity during the last science run, which ended in October 2010.
In parallel with commissioning and data analysis with the initial detector, LIGO group worked on research and development of the next generation detectors. Major instrument upgrade from initial to advanced LIGO started in 2010 and lasted till 2014.
This thesis describes results of commissioning work done at LIGO Livingston site from 2013 until 2015 in parallel with and after the installation of the instrument. This thesis also discusses new techniques and tools developed at the 40m prototype including adaptive filtering, estimation of quantization noise in digital filters and design of isolation kits for ground seismometers.
The first part of this thesis is devoted to the description of methods for bringing interferometer to the linear regime when collection of data becomes possible. States of longitudinal and angular controls of interferometer degrees of freedom during lock acquisition process and in low noise configuration are discussed in details.
Once interferometer is locked and transitioned to low noise regime, instrument produces astrophysics data that should be calibrated to units of meters or strain. The second part of this thesis describes online calibration technique set up in both observatories to monitor the quality of the collected data in real time. Sensitivity analysis was done to understand and eliminate noise sources of the instrument.
Coupling of noise sources to gravitational wave channel can be reduced if robust feedforward and optimal feedback control loops are implemented. The last part of this thesis describes static and adaptive feedforward noise cancellation techniques applied to Advanced LIGO interferometers and tested at the 40m prototype. Applications of optimal time domain feedback control techniques and estimators to aLIGO control loops are also discussed.
Commissioning work is still ongoing at the sites. First science run of advanced LIGO is planned for September 2015 and will last for 3-4 months. This run will be followed by a set of small instrument upgrades that will be installed on a time scale of few months. Second science run will start in spring 2016 and last for about 6 months. Since current sensitivity of advanced LIGO is already more than factor of 3 higher compared to initial detectors and keeps improving on a monthly basis, upcoming science runs have a good chance for the first direct detection of gravitational waves.
Resumo:
Recent observations of the temperature anisotropies of the cosmic microwave background (CMB) favor an inflationary paradigm in which the scale factor of the universe inflated by many orders of magnitude at some very early time. Such a scenario would produce the observed large-scale isotropy and homogeneity of the universe, as well as the scale-invariant perturbations responsible for the observed (10 parts per million) anisotropies in the CMB. An inflationary epoch is also theorized to produce a background of gravitational waves (or tensor perturbations), the effects of which can be observed in the polarization of the CMB. The E-mode (or parity even) polarization of the CMB, which is produced by scalar perturbations, has now been measured with high significance. Con- trastingly, today the B-mode (or parity odd) polarization, which is sourced by tensor perturbations, has yet to be observed. A detection of the B-mode polarization of the CMB would provide strong evidence for an inflationary epoch early in the universe’s history.
In this work, we explore experimental techniques and analysis methods used to probe the B- mode polarization of the CMB. These experimental techniques have been used to build the Bicep2 telescope, which was deployed to the South Pole in 2009. After three years of observations, Bicep2 has acquired one of the deepest observations of the degree-scale polarization of the CMB to date. Similarly, this work describes analysis methods developed for the Bicep1 three-year data analysis, which includes the full data set acquired by Bicep1. This analysis has produced the tightest constraint on the B-mode polarization of the CMB to date, corresponding to a tensor-to-scalar ratio estimate of r = 0.04±0.32, or a Bayesian 95% credible interval of r < 0.70. These analysis methods, in addition to producing this new constraint, are directly applicable to future analyses of Bicep2 data. Taken together, the experimental techniques and analysis methods described herein promise to open a new observational window into the inflationary epoch and the initial conditions of our universe.
Resumo:
Seismic reflection methods have been extensively used to probe the Earth's crust and suggest the nature of its formative processes. The analysis of multi-offset seismic reflection data extends the technique from a reconnaissance method to a powerful scientific tool that can be applied to test specific hypotheses. The treatment of reflections at multiple offsets becomes tractable if the assumptions of high-frequency rays are valid for the problem being considered. Their validity can be tested by applying the methods of analysis to full wave synthetics.
Three studies illustrate the application of these principles to investigations of the nature of the crust in southern California. A survey shot by the COCORP consortium in 1977 across the San Andreas fault near Parkfield revealed events in the record sections whose arrival time decreased with offset. The reflectors generating these events are imaged using a multi-offset three-dimensional Kirchhoff migration. Migrations of full wave acoustic synthetics having the same limitations in geometric coverage as the field survey demonstrate the utility of this back projection process for imaging. The migrated depth sections show the locations of the major physical boundaries of the San Andreas fault zone. The zone is bounded on the southwest by a near-vertical fault juxtaposing a Tertiary sedimentary section against uplifted crystalline rocks of the fault zone block. On the northeast, the fault zone is bounded by a fault dipping into the San Andreas, which includes slices of serpentinized ultramafics, intersecting it at 3 km depth. These interpretations can be made despite complications introduced by lateral heterogeneities.
In 1985 the Calcrust consortium designed a survey in the eastern Mojave desert to image structures in both the shallow and the deep crust. Preliminary field experiments showed that the major geophysical acquisition problem to be solved was the poor penetration of seismic energy through a low-velocity surface layer. Its effects could be mitigated through special acquisition and processing techniques. Data obtained from industry showed that quality data could be obtained from areas having a deeper, older sedimentary cover, causing a re-definition of the geologic objectives. Long offset stationary arrays were designed to provide reversed, wider angle coverage of the deep crust over parts of the survey. The preliminary field tests and constant monitoring of data quality and parameter adjustment allowed 108 km of excellent crustal data to be obtained.
This dataset, along with two others from the central and western Mojave, was used to constrain rock properties and the physical condition of the crust. The multi-offset analysis proceeded in two steps. First, an increase in reflection peak frequency with offset is indicative of a thinly layered reflector. The thickness and velocity contrast of the layering can be calculated from the spectral dispersion, to discriminate between structures resulting from broad scale or local effects. Second, the amplitude effects at different offsets of P-P scattering from weak elastic heterogeneities indicate whether the signs of the changes in density, rigidity, and Lame's parameter at the reflector agree or are opposed. The effects of reflection generation and propagation in a heterogeneous, anisotropic crust were contained by the design of the experiment and the simplicity of the observed amplitude and frequency trends. Multi-offset spectra and amplitude trend stacks of the three Mojave Desert datasets suggest that the most reflective structures in the middle crust are strong Poisson's ratio (σ) contrasts. Porous zones or the juxtaposition of units of mutually distant origin are indicated. Heterogeneities in σ increase towards the top of a basal crustal zone at ~22 km depth. The transition to the basal zone and to the mantle include increases in σ. The Moho itself includes ~400 m layering having a velocity higher than that of the uppermost mantle. The Moho maintains the same configuration across the Mojave despite 5 km of crustal thinning near the Colorado River. This indicates that Miocene extension there either thinned just the basal zone, or that the basal zone developed regionally after the extensional event.
Resumo:
The cytochromes P450 (P450s) are a remarkable class of heme enzymes that catalyze the metabolism of xenobiotics and the biosynthesis of signaling molecules. Controlled electron flow into the thiolate-ligated heme active site allows P450s to activate molecular oxygen and hydroxylate aliphatic C–H bonds via the formation of high-valent metal-oxo intermediates (compounds I and II). Due to the reactive nature and short lifetimes of these intermediates, many of the fundamental steps in catalysis have not been observed directly. The Gray group and others have developed photochemical methods, known as “flash-quench,” for triggering electron transfer (ET) and generating redox intermediates in proteins in the absence of native ET partners. Photo-triggering affords a high degree of temporal precision for the gating of an ET event; the initial ET and subsequent reactions can be monitored on the nanosecond-to-second timescale using transient absorption (TA) spectroscopies. Chapter 1 catalogues critical aspects of P450 structure and mechanism, including the native pathway for formation of compound I, and outlines the development of photochemical processes that can be used to artificially trigger ET in proteins. Chapters 2 and 3 describe the development of these photochemical methods to establish electronic communication between a photosensitizer and the buried P450 heme. Chapter 2 describes the design and characterization of a Ru-P450-BM3 conjugate containing a ruthenium photosensitizer covalently tethered to the P450 surface, and nanosecond-to-second kinetics of the photo-triggered ET event are presented. By analyzing data at multiple wavelengths, we have identified the formation of multiple ET intermediates, including the catalytically relevant compound II; this intermediate is generated by oxidation of a bound water molecule in the ferric resting state enzyme. The work in Chapter 3 probes the role of a tryptophan residue situated between the photosensitizer and heme in the aforementioned Ru-P450 BM3 conjugate. Replacement of this tryptophan with histidine does not perturb the P450 structure, yet it completely eliminates the ET reactivity described in Chapter 2. The presence of an analogous tryptophan in Ru-P450 CYP119 conjugates also is necessary for observing oxidative ET, but the yield of heme oxidation is lower. Chapter 4 offers a basic description of the theoretical underpinnings required to analyze ET. Single-step ET theory is first presented, followed by extensions to multistep ET: electron “hopping.” The generation of “hopping maps” and use of a hopping map program to analyze the rate advantage of hopping over single-step ET is described, beginning with an established rhenium-tryptophan-azurin hopping system. This ET analysis is then applied to the Ru-tryptophan-P450 systems described in Chapter 2; this strongly supports the presence of hopping in Ru-P450 conjugates. Chapter 5 explores the implementation of flash-quench and other phototriggered methods to examine the native reductive ET and gas binding events that activate molecular oxygen. In particular, TA kinetics that demonstrate heme reduction on the microsecond timescale for four Ru-P450 conjugates are presented. In addition, we implement laser flash-photolysis of P450 ferrous–CO to study the rates of CO rebinding in the thermophilic P450 CYP119 at variable temperature. Chapter 6 describes the development and implementation of air-sensitive potentiometric redox titrations to determine the solution reduction potentials of a series of P450 BM3 mutants, which were designed for non-native cyclopropanation of styrene in vivo. An important conclusion from this work is that substitution of the axial cysteine for serine shifts the wild type reduction potential positive by 130 mV, facilitating reduction by biological redox cofactors in the presence of poorly-bound substrates. While this mutation abolishes oxygenation activity, these mutants are capable of catalyzing the cyclopropanation of styrene, even within the confines of an E. coli cell. Four appendices are also provided, including photochemical heme oxidation in ruthenium-modified nitric oxide synthase (Appendix A), general protocols (Appendix B), Chapter-specific notes (Appendix C) and Matlab scripts used for data analysis (Appendix D).
Resumo:
Planetary atmospheres exist in a seemingly endless variety of physical and chemical environments. There are an equally diverse number of methods by which we can study and characterize atmospheric composition. In order to better understand the fundamental chemistry and physical processes underlying all planetary atmospheres, my research of the past four years has focused on two distinct topics. First, I focused on the data analysis and spectral retrieval of observations obtained by the Ultraviolet Imaging Spectrograph (UVIS) instrument onboard the Cassini spacecraft while in orbit around Saturn. These observations consisted of stellar occultation measurements of Titan's upper atmosphere, probing the chemical composition in the region 300 to 1500 km above Titan's surface. I examined the relative abundances of Titan's two most prevalent chemical species, nitrogen and methane. I also focused on the aerosols that are formed through chemistry involving these two major species, and determined the vertical profiles of aerosol particles as a function of time and latitude. Moving beyond our own solar system, my second topic of investigation involved analysis of infra-red light curves from the Spitzer space telescope, obtained as it measured the light from stars hosting planets of their own. I focused on both transit and eclipse modeling during Spitzer data reduction and analysis. In my initial work, I utilized the data to search for transits of planets a few Earth masses in size. In more recent research, I analyzed secondary eclipses of three exoplanets and constrained the range of possible temperatures and compositions of their atmospheres.
Resumo:
This work deals with two related areas: processing of visual information in the central nervous system, and the application of computer systems to research in neurophysiology.
Certain classes of interneurons in the brain and optic lobes of the blowfly Calliphora phaenicia were previously shown to be sensitive to the direction of motion of visual stimuli. These units were identified by visual field, preferred direction of motion, and anatomical location from which recorded. The present work is addressed to the questions: (1) is there interaction between pairs of these units, and (2) if such relationships can be found, what is their nature. To answer these questions, it is essential to record from two or more units simultaneously, and to use more than a single recording electrode if recording points are to be chosen independently. Accordingly, such techniques were developed and are described.
One must also have practical, convenient means for analyzing the large volumes of data so obtained. It is shown that use of an appropriately designed computer system is a profitable approach to this problem. Both hardware and software requirements for a suitable system are discussed and an approach to computer-aided data analysis developed. A description is given of members of a collection of application programs developed for analysis of neuro-physiological data and operated in the environment of and with support from an appropriate computer system. In particular, techniques developed for classification of multiple units recorded on the same electrode are illustrated as are methods for convenient graphical manipulation of data via a computer-driven display.
By means of multiple electrode techniques and the computer-aided data acquisition and analysis system, the path followed by one of the motion detection units was traced from open optic lobe through the brain and into the opposite lobe. It is further shown that this unit and its mirror image in the opposite lobe have a mutually inhibitory relationship. This relationship is investigated. The existence of interaction between other pairs of units is also shown. For pairs of units responding to motion in the same direction, the relationship is of an excitatory nature; for those responding to motion in opposed directions, it is inhibitory.
Experience gained from use of the computer system is discussed and a critical review of the current system is given. The most useful features of the system were found to be the fast response, the ability to go from one analysis technique to another rapidly and conveniently, and the interactive nature of the display system. The shortcomings of the system were problems in real-time use and the programming barrier—the fact that building new analysis techniques requires a high degree of programming knowledge and skill. It is concluded that computer system of the kind discussed will play an increasingly important role in studies of the central nervous system.