7 results for Secondary data analysis
in CaltechTHESIS
Abstract:
This thesis is an investigation into the nature of data analysis and computer software systems which support this activity.
The first chapter develops the notion of data analysis as an experimental science which has two major components: data-gathering and theory-building. The basic role of language in determining the meaningfulness of theory is stressed, and the informativeness of a language and data base pair is studied. The static and dynamic aspects of data analysis are then considered from this conceptual vantage point. The second chapter surveys the available types of computer systems which may be useful for data analysis. Particular attention is paid to the questions raised in the first chapter about the language restrictions imposed by the computer system and its dynamic properties.
The third chapter discusses the REL data analysis system, which was designed to satisfy the needs of the data analyzer in an operational relational data system. The major limitation on the use of such systems is the amount of access to data stored on a relatively slow secondary memory. This problem of the paging of data is investigated and two classes of data structure representations are found, each of which has desirable paging characteristics for certain types of queries. One representation is used by most of the generalized data base management systems in existence today, but the other is clearly preferred in the data analysis environment, as conceptualized in Chapter I.
This data representation has strong implications for a fundamental process of data analysis -- the quantification of variables. Since quantification is one of the few means of summarizing and abstracting, data analysis systems are under strong pressure to facilitate the process. Two implementations of quantification are studied: one analogous to the form of the lower predicate calculus and another more closely attuned to the data representation. A comparison of these indicates that the use of the "label class" method results in an orders-of-magnitude improvement over the lower predicate calculus technique.
Abstract:
The brain is perhaps the most complex system to have ever been subjected to rigorous scientific investigation. The scale is staggering: over 10^11 neurons, each making an average of 10^3 synapses, with computation occurring on scales ranging from a single dendritic spine to an entire cortical area. Slowly, we are beginning to acquire experimental tools that can gather the massive amounts of data needed to characterize this system. However, understanding and interpreting these data will also require substantial strides in inferential and statistical techniques. This dissertation attempts to meet this need, extending and applying the modern tools of latent variable modeling to problems in neural data analysis.
It is divided into two parts. The first begins with an exposition of the general techniques of latent variable modeling. A new, extremely general optimization algorithm is proposed, called Relaxation Expectation Maximization (REM), which may be used to learn the optimal parameter values of arbitrary latent variable models. This algorithm appears to alleviate the common problem of convergence to local, sub-optimal likelihood maxima. REM leads to a natural framework for model size selection; in combination with standard model selection techniques, the quality of fits may be further improved, while the appropriate model size is automatically and efficiently determined. Next, a new latent variable model, the mixture of sparse hidden Markov models, is introduced, and approximate inference and learning algorithms are derived for it. This model is applied in the second part of the thesis.
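The REM algorithm itself is not spelled out in the abstract. As background, here is a minimal sketch of the standard Expectation Maximization loop that, per the abstract, REM extends, fit to a two-component one-dimensional Gaussian mixture; all data and parameter values are illustrative, not taken from the thesis.

```python
import numpy as np

# Minimal EM for a two-component 1-D Gaussian mixture -- the standard
# algorithm that REM builds on.  Synthetic data, illustrative values.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 0.5, 500), rng.normal(3.0, 0.8, 500)])

# Initial parameter guesses: means, standard deviations, mixing weights.
mu = np.array([-1.0, 1.0])
sigma = np.array([1.0, 1.0])
pi = np.array([0.5, 0.5])

for _ in range(100):
    # E-step: posterior responsibility of each component for each point.
    dens = (pi / (sigma * np.sqrt(2 * np.pi))
            * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2))
    resp = dens / dens.sum(axis=1, keepdims=True)

    # M-step: re-estimate parameters from the responsibilities.
    nk = resp.sum(axis=0)
    mu = (resp * x[:, None]).sum(axis=0) / nk
    sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    pi = nk / len(x)
```

With well-separated clusters this converges to component means near the true -2.0 and 3.0; REM's contribution, as the abstract describes it, is avoiding the sub-optimal local maxima that this plain loop can fall into with harder initializations.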
The second part brings the technology of part I to bear on two important problems in experimental neuroscience. The first is known as spike sorting; this is the problem of separating the spikes from different neurons embedded within an extracellular recording. The dissertation offers the first thorough statistical analysis of this problem, which then yields the first powerful probabilistic solution. The second problem addressed is that of characterizing the distribution of spike trains recorded from the same neuron under identical experimental conditions. A latent variable model is proposed. Inference and learning in this model leads to new principled algorithms for smoothing and clustering of spike data.
Abstract:
In this work, we further extend the recently developed adaptive data analysis method, the Sparse Time-Frequency Representation (STFR) method. This method is based on the assumption that many physical signals inherently contain AM-FM representations. We propose a sparse optimization method to extract the AM-FM representations of such signals. We prove the convergence of the method for periodic signals under certain assumptions and provide practical algorithms specifically for the non-periodic STFR, which extends the method to tackle problems that earlier STFR methods could not handle, including stability to noise and non-periodic data analysis. This is a significant improvement since many adaptive and non-adaptive signal processing methods are not fully capable of handling non-periodic signals. Moreover, we propose a new STFR algorithm to study intrawave signals with strong frequency modulation and analyze the convergence of this new algorithm for periodic signals. Such signals have previously remained a bottleneck for all signal processing methods. Furthermore, we propose a modified version of STFR that facilitates the extraction of intrawaves that have overlapping frequency content. We show that the STFR methods can be applied to the realm of dynamical systems and cardiovascular signals. In particular, we present a simplified and modified version of the STFR algorithm that is potentially useful for the diagnosis of some cardiovascular diseases. We further explain some preliminary work on the nature of Intrinsic Mode Functions (IMFs) and how they can have different representations in different phase coordinates. This analysis shows that the uncertainty principle is fundamental to all oscillating signals.
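The AM-FM signal model the abstract refers to can be made concrete with a toy example: a signal of the form a(t)·cos(θ(t)) whose envelope is recovered here with a plain FFT-based analytic signal. This is not the thesis's sparse-optimization method, only an illustration of the signal model; all waveform parameters are made up.

```python
import numpy as np

# Toy AM-FM signal a(t)*cos(theta(t)): slowly varying amplitude and a
# frequency-modulated phase, of the kind STFR assumes.
fs = 4096                                   # samples over a 1 s window
t = np.arange(fs) / fs
a = 1.0 + 0.3 * np.cos(2 * np.pi * 2 * t)   # amplitude envelope (2 Hz)
f = 50 + 10 * np.sin(2 * np.pi * 3 * t)     # instantaneous frequency (Hz)
theta = 2 * np.pi * np.cumsum(f) / fs       # phase = integral of frequency
x = a * np.cos(theta)

# Analytic signal: zero the negative frequencies, double the positive ones.
X = np.fft.fft(x)
X[fs // 2 + 1:] = 0.0
X[1:fs // 2] *= 2.0
z = np.fft.ifft(X)

envelope = np.abs(z)                        # recovers a(t) away from edges
```

Because the carrier stays well above the envelope bandwidth, the analytic-signal envelope matches a(t) closely away from the window edges; the intrawave and overlapping-frequency cases the thesis addresses are precisely those where this simple construction breaks down.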
Abstract:
Chlorine oxide species have received considerable attention in recent years due to their central role in the balance of stratospheric ozone. Many questions pertaining to the behavior of such species still remain unanswered and plague the ability of researchers to develop accurate chemical models of the stratosphere. Presented in this thesis are three experiments that study various properties of some specific chlorine oxide species.
In the first chapter, the reaction between ClONO_2 and protonated water clusters is investigated to elucidate a possible reaction mechanism for the heterogeneous reaction of chlorine nitrate on ice. The ionic products were various forms of protonated nitric acid, NO_2^+(H_2O)_m, m = 0, 1, 2. These products are analogous to products previously reported in the literature for the neutral reaction occurring on ice surfaces. Our results support the hypothesis that the heterogeneous reaction is acid-catalyzed.
In the second chapter, the photochemistry of ClONO_2 was investigated at two wavelengths, 193 and 248 nm, using the technique of photofragmentation translational spectroscopy. At both wavelengths, the predominant dissociation pathways were Cl + NO_3 and ClO + NO_2. Channel assignments were confirmed by momentum matching the counterfragments from each channel. A one-dimensional stratospheric model using the new 248 nm branching ratio determined how our results would affect the predicted Cl_x and NO_x partitioning in the stratosphere.
Chapter three explores the photodissociation dynamics of Cl_2O at 193, 248 and 308 nm. At 193 nm, we found evidence for the concerted reaction channel, Cl_2 + O. The ClO + Cl channel was also accessed; however, the majority of the ClO fragments were formed with sufficient internal energy for spontaneous secondary dissociation to occur. At 248 and 308 nm, we observed only the ClO + Cl channel. Some of the ClO formed at 248 nm was internally hot and spontaneously dissociated. Bimodal translational energy distributions of the ClO and Cl products indicate that two pathways leading to the same products exist.
Appendices A, B, and C discuss the details of the data analysis techniques used in Chapters 1 and 2. The development of a molecular beam source of the ClO dimer is presented in Appendix D.
Abstract:
Planetary atmospheres exist in a seemingly endless variety of physical and chemical environments. There are an equally diverse number of methods by which we can study and characterize atmospheric composition. In order to better understand the fundamental chemistry and physical processes underlying all planetary atmospheres, my research of the past four years has focused on two distinct topics. First, I focused on the data analysis and spectral retrieval of observations obtained by the Ultraviolet Imaging Spectrograph (UVIS) instrument onboard the Cassini spacecraft while in orbit around Saturn. These observations consisted of stellar occultation measurements of Titan's upper atmosphere, probing the chemical composition in the region 300 to 1500 km above Titan's surface. I examined the relative abundances of Titan's two most prevalent chemical species, nitrogen and methane. I also focused on the aerosols that are formed through chemistry involving these two major species, and determined the vertical profiles of aerosol particles as a function of time and latitude. Moving beyond our own solar system, my second topic of investigation involved analysis of infrared light curves from the Spitzer space telescope, obtained as it measured the light from stars hosting planets of their own. I focused on both transit and eclipse modeling during Spitzer data reduction and analysis. In my initial work, I utilized the data to search for transits of planets a few Earth masses in size. In more recent research, I analyzed secondary eclipses of three exoplanets and constrained the range of possible temperatures and compositions of their atmospheres.
Abstract:
The differential energy spectra of cosmic-ray protons and He nuclei have been measured at energies up to 315 MeV/nucleon using balloon- and satellite-borne instruments. These spectra are presented for solar quiet times for the years 1966 through 1970. The data analysis is verified by extensive accelerator calibrations of the detector systems and by calculations and measurements of the production of secondary protons in the atmosphere.
The spectra of protons and He nuclei in this energy range are dominated by the solar modulation of the local interstellar spectra. The transport equation governing this process includes as parameters the solar-wind velocity, V, and a diffusion coefficient, K(r,R), which is assumed to be a scalar function of heliocentric radius, r, and magnetic rigidity, R. The interstellar spectra, jD, enter as boundary conditions on the solutions to the transport equation. Solutions to the transport equation have been calculated for a broad range of assumed values for K(r,R) and jD and have been compared with the measured spectra.
It is found that the solutions may be characterized in terms of a dimensionless parameter, ψ(r,R) = ∫_r^∞ V dr'/K(r',R). The amount of modulation is roughly proportional to ψ. At high energies or far from the Sun, where the modulation is weak, the solution is determined primarily by the value of ψ (and the interstellar spectrum) and is not sensitive to the radial dependence of the diffusion coefficient. At low energies and for small r, where the effects of adiabatic deceleration are found to be large, the spectra are largely determined by the radial dependence of the diffusion coefficient and are not very sensitive to the magnitude of ψ or to the interstellar spectra. This lack of sensitivity to jD implies that the shape of the spectra at Earth cannot be used to determine the interstellar intensities at low energies.
Values of ψ determined from electron data were used to calculate the spectra of protons and He nuclei near Earth. Interstellar spectra of the form jD ∝ (W - 0.25m)^-2.65 for both protons and He nuclei were found to yield the best fits to the measured spectra for these values of ψ, where W is the total energy and m is the rest energy. A simple model for the diffusion coefficient was used in which the radial and rigidity dependence are separable and K is independent of radius inside a modulation region which has a boundary at a distance D. Good agreement was found between the measured and calculated spectra for the years 1965 through 1968, using typical boundary distances of 2.7 and 6.1 A.U. The proton spectra observed in 1969 and 1970 were flatter than in previous years. This flattening could be explained in part by an increase in D, but also seemed to require that a noticeable fraction of the observed protons at energies as high as 50 to 100 MeV be attributed to quiet-time solar emission. The turnup in the spectra at low energies observed in all years was also attributed to solar emission. The diffusion coefficient used to fit the 1965 spectra is in reasonable agreement with that determined from the power spectra of the interplanetary magnetic field (Jokipii and Coleman, 1968). We find a factor of roughly 3 increase in ψ from 1965 to 1970, corresponding to the roughly order of magnitude decrease in the proton intensity at 250 MeV. The change in ψ might be attributed to a decrease in the diffusion coefficient, or, if the diffusion coefficient is essentially unchanged over that period (Mathews et al., 1971), might be attributed to an increase in the boundary distance, D.
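A quick numerical sketch makes the modulation parameter concrete for the simple model the abstract describes, in which K is independent of radius inside a boundary at distance D (so the integral runs from r to D). The values of V, K, and D below are illustrative round numbers, not the thesis's fitted values.

```python
import numpy as np

# psi(r,R) = integral from r to D of V dr'/K(r',R), for radius-independent K.
AU = 1.496e13            # astronomical unit, cm
V = 4.0e7                # solar-wind speed, cm/s (~400 km/s, illustrative)
K = 2.0e21               # diffusion coefficient, cm^2/s (assumed constant)
D = 2.7 * AU             # modulation boundary, one of the quoted distances
r = 1.0 * AU             # observation point: Earth

rp = np.linspace(r, D, 10001)                  # radial grid
integrand = V / (K * np.ones_like(rp))         # V/K at each radius
psi = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(rp))

# For constant K the integral reduces to the closed form V*(D - r)/K.
closed_form = V * (D - r) / K
```

With these round numbers ψ comes out of order unity, consistent with the abstract's statement that the modulation is roughly proportional to ψ and changed by about a factor of 3 over 1965-1970.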
Abstract:
The Laser Interferometer Gravitational-Wave Observatory (LIGO) consists of two complex large-scale laser interferometers designed for the direct detection of gravitational waves from distant astrophysical sources in the frequency range 10 Hz - 5 kHz. Direct detection of these space-time ripples will support Einstein's general theory of relativity and provide invaluable information and new insight into the physics of the Universe.
The initial phase of LIGO started in 2002, and since then data have been collected during six science runs. The instrument sensitivity improved from run to run through the effort of the commissioning team. Initial LIGO reached its design sensitivity during the last science run, which ended in October 2010.
In parallel with commissioning and data analysis on the initial detector, the LIGO group worked on research and development of the next generation of detectors. The major instrument upgrade from Initial to Advanced LIGO started in 2010 and lasted until 2014.
This thesis describes the results of commissioning work done at the LIGO Livingston site from 2013 until 2015, in parallel with and after the installation of the instrument. It also discusses new techniques and tools developed at the 40m prototype, including adaptive filtering, estimation of quantization noise in digital filters, and the design of isolation kits for ground seismometers.
The first part of this thesis is devoted to the description of methods for bringing the interferometer into the linear regime, in which the collection of data becomes possible. The states of the longitudinal and angular controls of the interferometer degrees of freedom during the lock acquisition process and in the low-noise configuration are discussed in detail.
Once the interferometer is locked and transitioned to the low-noise regime, the instrument produces astrophysical data that must be calibrated in units of meters or strain. The second part of this thesis describes the online calibration technique set up at both observatories to monitor the quality of the collected data in real time. A sensitivity analysis was performed to understand and eliminate the noise sources of the instrument.
The coupling of noise sources to the gravitational-wave channel can be reduced if robust feedforward and optimal feedback control loops are implemented. The last part of this thesis describes static and adaptive feedforward noise cancellation techniques applied to the Advanced LIGO interferometers and tested at the 40m prototype. Applications of optimal time-domain feedback control techniques and estimators to aLIGO control loops are also discussed.
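The idea of adaptive feedforward cancellation can be illustrated with a toy least-mean-squares (LMS) filter: a witness sensor measures the disturbance, an FIR filter learns the unknown coupling path, and the predicted noise is subtracted from the target channel. All signals and coefficients below are synthetic; this is a sketch of the general technique, not the aLIGO implementation.

```python
import numpy as np

# Toy LMS adaptive feedforward cancellation on synthetic data.
rng = np.random.default_rng(1)
n = 20000
witness = rng.normal(size=n)                   # witness (reference) channel
coupling = np.array([0.5, -0.3, 0.2])          # unknown coupling path
noise = np.convolve(witness, coupling)[:n]     # disturbance in the target
signal = 0.1 * np.sin(2 * np.pi * 0.01 * np.arange(n))
target = signal + noise                        # what the detector records

taps, mu = 5, 0.01                             # filter length, step size
w = np.zeros(taps)
residual = np.zeros(n)
for i in range(taps, n):
    x = witness[i - taps + 1:i + 1][::-1]      # most recent witness samples
    e = target[i] - w @ x                      # target minus predicted noise
    w += 2 * mu * e * x                        # LMS weight update
    residual[i] = e
```

Because the sinusoidal "signal" is uncorrelated with the witness channel, the filter converges toward the coupling coefficients and the residual is dominated by the signal alone, which is the essential property a feedforward scheme needs so as not to cancel gravitational-wave content.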
Commissioning work is still ongoing at the sites. The first science run of Advanced LIGO is planned for September 2015 and will last three to four months. This run will be followed by a set of small instrument upgrades installed on a time scale of a few months. The second science run will start in spring 2016 and last about six months. Since the current sensitivity of Advanced LIGO is already more than a factor of 3 higher than that of the initial detectors, and keeps improving on a monthly basis, the upcoming science runs have a good chance of making the first direct detection of gravitational waves.